Existing Citations

  • data anonymization : In 2006, Netflix released data pertaining to how 500,000 of its users rated movies over a six-year period. Netflix “anonymized” the data before releasing it by removing usernames. Still, Netflix assigned unique identification numbers to users in order to allow for continuous tracking of user ratings and trends. Researchers used this information to uniquely identify individual Netflix users. According to the study, if a person has information about when and how a user rated six movies, that person can identify 99% of people in the Netflix database. (†1556)
  • de-anonymization : In each of the above cases [Netflix study and AOL release of user data], data was re-identified by combining two datasets with different types of information about an individual. One of the datasets contained anonymized information; the other contained outside information – generally available to the public – collected on a daily or routine basis (such as voter registration information), and which includes identifying information (e.g., name). The two datasets will usually have at least one type of information that is the same (e.g., birthdate), which links the anonymized information to an individual. By combining information from each of these datasets, researchers can uniquely identify individuals in the population. (†1557)
  • reidentification : Re-identification is the process by which anonymized personal data is matched with its true owner. In order to protect the privacy interests of consumers, personal identifiers, such as name and social security number, are often removed from databases containing sensitive information. . . . Recently, however, computer scientists have revealed that this "anonymized" data can easily be re-identified, such that the sensitive information may be linked back to an individual. The re-identification process implicates privacy rights, because organizations will say that privacy obligations do not apply to information that is anonymized, but if the data is in fact personally identifiable, then privacy obligations should apply. (†1617)
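
The linkage described in the citations above (combining an anonymized dataset with a public one that shares at least one field, such as birthdate) can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the records, the field names (`birthdate`, `zip`, `user_id`), and the helper `link_records` are assumptions, not drawn from the Netflix or AOL data or from the method used in the cited studies.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: names removed, but each record keeps
# a unique ID and some quasi-identifiers (birthdate, ZIP code).
anonymized = [
    {"user_id": 101, "birthdate": "1975-03-02", "zip": "02139", "sensitive": "rated film A"},
    {"user_id": 102, "birthdate": "1982-11-17", "zip": "02139", "sensitive": "rated film B"},
    {"user_id": 103, "birthdate": "1990-07-08", "zip": "94110", "sensitive": "rated film C"},
]

# Hypothetical public dataset (voter-registration style) that includes names.
public = [
    {"name": "Alice Example", "birthdate": "1975-03-02", "zip": "02139"},
    {"name": "Bob Example", "birthdate": "1982-11-17", "zip": "02139"},
    {"name": "Carol Example", "birthdate": "1990-07-08", "zip": "94110"},
    {"name": "Dan Example", "birthdate": "1990-07-08", "zip": "94110"},
]


def link_records(anonymized, public, keys=("birthdate", "zip")):
    """Return {user_id: name} for anonymized rows that match exactly one public row."""
    # Index the public records by the fields both datasets share.
    index = defaultdict(list)
    for row in public:
        index[tuple(row[k] for k in keys)].append(row["name"])

    matches = {}
    for row in anonymized:
        candidates = index.get(tuple(row[k] for k in keys), [])
        # Only a unique match re-identifies the individual;
        # zero or several candidates leave the record ambiguous.
        if len(candidates) == 1:
            matches[row["user_id"]] = candidates[0]
    return matches


reidentified = link_records(anonymized, public)
print(reidentified)
```

In this toy data, users 101 and 102 are re-identified because their birthdate and ZIP combination is unique in the public dataset, while user 103 stays ambiguous because two public records share the same values. The more shared fields the two datasets have, the more likely each combination is to be unique.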