June 26, 2024 | Jere Lehtimäki


Personal Data or Anonymized Data?


Personal data means any information relating to an identified or identifiable natural person, as defined in Article 4(1) of the European General Data Protection Regulation (EU) 2016/679 (hereinafter "GDPR"). The same article refers to this natural person, i.e., a human being, as a data subject. It defines an identifiable natural person as someone who can be identified, directly or indirectly, by reference to various identifiers. The GDPR lists identifiers such as a name, an identification number, and factors specific to the data subject's identity, such as genetic or economic identity.

However, personal data may need to be used in ways that require it to be anonymized, meaning the data is transformed so that it becomes anonymous. The reason may be the intended use of the data, such as research or data modelling, or the fact that the data relates to a customer relationship that has ended and for which personal data processing is no longer allowed. This brief article summarizes the conditions and key considerations for assessing whether certain human-related information has been anonymized, that is, whether it is still personal data or not.

Anonymous or Anonymized Data

The GDPR's data protection principles do not apply to anonymous data, meaning data that does not relate to an identified or identifiable natural person. GDPR also does not apply to information that was originally personal data but has been anonymized in such a way that the data subject can no longer be identified. Therefore, the data can either be originally anonymous or anonymized.

However, implementing anonymization measures that result in truly anonymized data can be very challenging. Continuously developing technology in particular makes it difficult to ensure that data remains anonymous after anonymization. There are also various risks associated with anonymizing personal data that must be considered when evaluating anonymization methods, such as data ending up in the wrong hands through technology providers. When assessing the level of anonymization of personal data, that is, whether it is de facto anonymous data or still personal data, the GDPR (Recital 26) requires taking into account all the means reasonably likely to be used to identify the data subject, directly or indirectly.

Anonymizing Personal Data

The process of anonymizing personal data can be succinctly described as the technical-legal processing of personal data, resulting in the removal of identifiability. There is no single correct procedure for anonymizing personal data; rather, anonymization must always be designed and implemented on a case-by-case basis, taking into account the characteristics of the dataset, the usage environment, and usability. This is a typical and general requirement in data protection regulations, where compliance is often demonstrated only through case-by-case assessments.

Personal data anonymization must be carefully planned and executed to ensure that it is not possible to discover the actual identifying and characteristic details of the persons behind the data, even with significant computational power or by combining data. Due to the continuously developing so-called re-identification methods, it is also reasonable to restrict the use of data considered anonymized to the specific purpose for which the data was originally collected.

Checklist – Has the Personal Data Really Been Anonymized?

In practice, the following questions can generally be used to assess the level of anonymity or anonymization of certain data:

  1. Distinguishability: Is the data subject still identifiable from the dataset after anonymization?
  2. Linkability: Is the data subject identifiable by combining certain information about them with another dataset?
  3. Inference: Can it be reasonably inferred from the available personal data that the data pertains to a particular data subject?

If the answers to questions one (1) and two (2) are negative, and the probability of inference in question three (3) is very low, the dataset can be considered to have a good level of anonymity. The above questions are based on themes highlighted by the European Data Protection Board, which are considered key risks in data anonymization.
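The three questions can also be given a rough technical first pass. The following is a minimal Python sketch under stated assumptions, not a legal test: the toy records, the external register, the choice of postcode and birth year as indirect identifiers, and all function names are hypothetical and serve only to illustrate the checklist. A clean result from checks like these does not in itself establish that a dataset is anonymous.

```python
from collections import Counter, defaultdict

# Hypothetical toy dataset with direct identifiers (names etc.) already removed.
# Column names and values are illustrative assumptions only.
RECORDS = [
    {"postcode": "00100", "birth_year": 1980, "diagnosis": "asthma"},
    {"postcode": "00100", "birth_year": 1980, "diagnosis": "diabetes"},
    {"postcode": "00200", "birth_year": 1975, "diagnosis": "asthma"},
    {"postcode": "00300", "birth_year": 1990, "diagnosis": "flu"},
    {"postcode": "00300", "birth_year": 1990, "diagnosis": "flu"},
]

# Hypothetical external dataset that another party might hold.
EXTERNAL = [
    {"name": "Person A", "postcode": "00200", "birth_year": 1975},
    {"name": "Person B", "postcode": "00100", "birth_year": 1980},
]

# Assumed indirect identifiers; choosing them is a case-by-case decision.
QUASI_IDENTIFIERS = ("postcode", "birth_year")


def key(record, columns=QUASI_IDENTIFIERS):
    """Return the combination of indirect identifiers for one record."""
    return tuple(record[c] for c in columns)


def unique_records(records):
    """Question 1: records whose identifier combination occurs only once."""
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] == 1]


def linkable_records(records, external):
    """Question 2: external entries that match exactly one record."""
    counts = Counter(key(r) for r in records)
    return [e for e in external if counts[key(e)] == 1]


def inferable_groups(records, sensitive="diagnosis"):
    """Question 3: groups of two or more records sharing one sensitive value."""
    values = defaultdict(set)
    sizes = Counter()
    for r in records:
        values[key(r)].add(r[sensitive])
        sizes[key(r)] += 1
    return [k for k, v in values.items() if len(v) == 1 and sizes[k] > 1]


if __name__ == "__main__":
    print("Distinguishable:", unique_records(RECORDS))
    print("Linkable:", linkable_records(RECORDS, EXTERNAL))
    print("Inferable groups:", inferable_groups(RECORDS))
```

In this toy data, the single record from postcode 00200 stands out on its own and can be linked to "Person A" in the external register, while both records in the 00300 group share the same diagnosis, so the diagnosis can be inferred for anyone known to belong to that group even though no individual record is unique.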


Anonymization is a useful risk management measure when handling personal data. It suits archiving and research purposes, as well as general business operations where large amounts of personal data are processed and there is a desire to model data that was originally personal. One example is creating graphs from large datasets that reveal certain human patterns or tendencies, such as shopping habits. In other words, anonymizing data can reduce risks to data subjects and help data controllers and processors comply with data protection regulations, while also opening opportunities to use the data responsibly for multiple purposes at a later stage. Anonymization is also a good way to get rid of personal data while retaining the anonymized data for future anonymous modelling, such as in data processing businesses or surveys. If this approach is taken, however, it is crucial to ensure that the data is de facto anonymized and no longer considered personal data.

Nevertheless, even if the party anonymizing personal data takes into account the GDPR, domestic and European regulatory guidelines, and rules and standards from outside the EU, and also applies various technical anonymization methods, one conclusion remains: it is often impossible to completely eliminate a certain minimal residual risk of re-identification.

At Nordic Law, we have strong experience in data protection and privacy. Please feel free to contact us with any questions related to the subject matter.

Nordic Law | Pioneer in Web3 and Fintech law