Introduction to classification: definition
Data (or content) classification involves assigning a level of sensitivity or confidentiality to information, according to its value, criticality or potential impact in the event of disclosure or loss. This classification makes it possible to define security and access rules adapted to each level, to protect information against internal or external threats and guarantee its integrity, availability and confidentiality.
Beyond classification itself, it is important to define access controls (who can see what), audit and detect data breaches, reduce risks and verify regulatory compliance (for example, when choosing an outsourcing solution).
Regulatory importance
Beyond its value for security and confidentiality, classification is often a legal requirement. Certain laws (GDPR, French Law no. 68-678 known as the “blocking law”, ...) and regulations (the “Cloud au centre” circular no. 6282-SG of July 5, 2021 and the SecNumCloud qualification), along with national defense rules (IG 1300/SGDSN/PSE/PSD “Protection du secret de la défense nationale” in France, Executive Order 12356 and its successors in the USA) and health regulations, impose control over certain data such as personal data, sensitive data, ...
At European level:
In the GDPR [Article 5.1(f)], it is stated regarding personal data:
(f) processed in a manner that ensures appropriate security of the personal data, including protection against unauthorised or unlawful processing and against accidental loss, destruction or damage, using appropriate technical or organisational measures (‘integrity and confidentiality’).
In the Data Act, Regulation (EU) 2023/2854 [Article 4(6) and Recital 101]:
§6 ... The data holder or, where they are not the same person, the trade secret holder shall identify the data which are protected as trade secrets, including in the relevant metadata, and shall agree with the user proportionate technical and organisational measures necessary to preserve the confidentiality of the shared data ...
(101) ... In other cases, situations may arise where a request to transfer or provide access to non-personal data arising from a third country law conflicts with an obligation to protect such data under Union law or under the national law of the relevant Member State ... related to national security or defence, as well as the protection of commercially sensitive data, including the protection of trade secrets, and the protection of intellectual property rights ...
In the Council Decision for European bodies “Security rules for the protection of EU classified information” (2013/488/EU) [Article 3.1] “Classification management”:
1. The competent authorities shall ensure that EUCI is appropriately classified, clearly identified as classified information and retains its classification level for only as long as necessary.
In France:
In the circular to administrations “Cloud au Centre” n° 6282-SG of July 5, 2021 [R9]:
If the IT system or application handles particularly sensitive data, such as the personal data of French citizens, economic data relating to French companies, or business applications relating to public-sector employees: the commercial cloud offering selected must comply with SecNumCloud qualification (or a European qualification of at least equivalent level), and be immune to all non-EU regulations.
In the "blocking law" [Art.1] which applies to French private companies:
no communication of sensitive information held by a French company to a requesting foreign public authority may harm the interests of the Nation
In the United States:
In NIST SP 800-53 (for US federal agencies), AC-3 (11) “Restrict access to specific information types” & (13) “Attribute-based access control”:
... that restricts system access to authorized users based on specified organizational attributes (e.g., job function, identity), action attributes (e.g., read, write, delete), environmental attributes (e.g., time of day, location), and resource attributes (e.g., classification of a document). ...
HIPAA (Health Insurance Portability and Accountability Act) also governs health data; it is not covered here.
At the standards level:
In ISO 27001:2022 Annex A 5.12 “Classification of information”:
Information should be classified according to the information security needs of the organisation based on confidentiality, integrity, availability and relevant interested party requirements.
Of course, there are many other texts and standards, such as the PCI-DSS (Payment Card Industry Data Security Standard) in the banking and financial sector, which we won't go into here.
As you can see, whether in the private or public sector, there are a great many regulations and/or standards to follow.
The key point is that, whatever the underlying text, sensitive data must be classified (personal, health, business secrets, banking, etc.).
A brief overview of classification models
Numerous classification models exist, generally segmented into civilian and military.
In the private sector, standards and regulations rarely impose classification levels. In France, it should be noted that the "blocking law" refers to a specific classification (such as “Sovereign Sensitive”).
Nevertheless, the following levels are commonly used:
- C0 - Public or Unclassified (NC)
- C1 - Internal
- C2 - Confidential
- C3 - Restricted / Secret
- C4 - Top Secret (rarely used)
Note that level C3 may require additional protection, such as document encryption.
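These levels form an ordered scale, which is what makes rules such as "C2 or higher" expressible. A minimal Python sketch, assuming the five labels listed above; the names `Classification` and `parse_level` are illustrative and do not come from any standard:

```python
from enum import IntEnum


class Classification(IntEnum):
    """Illustrative ordered scale; member names follow the list above."""
    C0_PUBLIC = 0
    C1_INTERNAL = 1
    C2_CONFIDENTIAL = 2
    C3_RESTRICTED = 3
    C4_TOP_SECRET = 4


def parse_level(label: str) -> Classification:
    """Turn a label such as 'C2' or ' c2 ' into a Classification member."""
    label = label.strip().upper()
    if len(label) != 2 or label[0] != "C" or not label[1].isdigit():
        raise ValueError(f"not a classification label: {label!r}")
    # Classification(n) raises ValueError for numbers outside 0..4.
    return Classification(int(label[1]))
```

Because `IntEnum` members compare as integers, a check like `parse_level("C3") > parse_level("C2")` works without extra code.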
Also worth noting is the Traffic Light Protocol (TLP), proposed by FIRST and adopted by CERT-EU (the computer emergency response team of the EU institutions) and by the American CISA (Cybersecurity and Infrastructure Security Agency). It is more of a sharing protocol between security players, and applies to emails, documents and even instant messaging conversations:
- TLP:CLEAR (no internal or external restrictions)
- TLP:GREEN (distribution limited to a “community”)
- TLP:AMBER (limited distribution within the organization and/or its “customers”)
- TLP:AMBER+STRICT (equivalent to internal distribution)
- TLP:RED (defined and limited list)
On the military side, the NATO classification is one example:
- COSMIC Top Secret (CTS)
- NATO Secret (NS)
- NATO Confidential (NC)
- NATO Restricted (NR)
Example of classification by type of content
HIGH | MODERATE | LOW |
Health data | Application, CV, Interviews, Employment contracts, Salary, ... | List of company contacts |
Personal data including identification numbers (Social Security, Passport, Driving License, ...) | Management notes and reports | Internal memos without critical information |
Intellectual Property | Customer information, commercial contracts, etc. | Procedures |
Information on executive salaries and compensation | Financial data, budgets, ... | |
Business secrets (business plans, annual reports before publication, mergers and acquisitions, etc.) | Partnership contracts, suppliers | |
Bank details and company account information | | |
Critical research data | | |
Authentication data (password, PIN, private keys, etc.) | | |
Vulnerability report | | |
'Highly' sensitive data may be classified as 'C3 - Restricted / Secret', while 'moderately' sensitive data may be classified as 'C2 - Confidential', for example.
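This correspondence can be written down as a simple lookup. The sketch below is only illustrative: the HIGH and MODERATE entries follow the sentence above, while the LOW entry is an assumption, since the text does not say which level low-sensitivity content receives. The names `SENSITIVITY_TO_LEVEL`, `CONTENT_SENSITIVITY` and `level_for` are hypothetical.

```python
# Sensitivity column of the table -> classification label.
SENSITIVITY_TO_LEVEL = {
    "HIGH": "C3",      # from the text: 'C3 - Restricted / Secret'
    "MODERATE": "C2",  # from the text: 'C2 - Confidential'
    "LOW": "C1",       # assumption: not stated in the text
}

# A few content types taken from the table, with their sensitivity column.
CONTENT_SENSITIVITY = {
    "Health data": "HIGH",
    "Authentication data": "HIGH",
    "Management notes and reports": "MODERATE",
    "Financial data, budgets": "MODERATE",
    "Procedures": "LOW",
}


def level_for(content_type: str) -> str:
    """Return the classification label for a known content type.

    Raises KeyError for a content type that has not been catalogued,
    so that unknown content is never classified silently.
    """
    sensitivity = CONTENT_SENSITIVITY[content_type]
    return SENSITIVITY_TO_LEVEL[sensitivity]
```

Failing on unknown content types is a deliberate choice here: a missing entry should trigger a review rather than default to a low level.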
Putting it into practice
Practical implementation will of course depend on the information system used to manage content. A file server will be very limited, whereas a DMS (document management system) will offer many more possibilities.
In a DMS, there is first of all the 'document type' (sometimes called 'Category') which, in theory, could help with classification. This document type could be 'Contract', 'Salary slip', ...
However, it does not always determine sensitivity. For example, a 'Report' will not have the same sensitivity if it's a management report or a department report, or if it's a report on a current acquisition project.
Other metadata will therefore be required, such as the type of information contained ('Personal data', 'Financial data', ...), and/or directly the classification level ('C0', 'C1', ...).
This classification level can be assigned manually, semi-automatically and, potentially in the future, automatically, notably by AI (though human review will remain necessary, given the risks in the event of error).
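A semi-automatic workflow could look like the following sketch: the system suggests a level from the declared information types, and the level is recorded only once a person has confirmed or overridden it. Everything here is an assumption for illustration (the `Document` fields, the `SUGGESTED_LEVEL` mapping and the fallback level); a real DMS would expose its own metadata model.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from information type to a suggested level.
SUGGESTED_LEVEL = {
    "Personal data": "C3",
    "Financial data": "C2",
}
DEFAULT_LEVEL = "C1"  # assumption: fall back to 'Internal'


@dataclass
class Document:
    title: str
    doc_type: str                # e.g. 'Contract', 'Report'
    info_types: list[str]        # e.g. ['Personal data']
    level: Optional[str] = None  # set only after human validation


def suggest_level(doc: Document) -> str:
    """Suggest the highest level implied by the declared information types."""
    candidates = [SUGGESTED_LEVEL.get(t, DEFAULT_LEVEL) for t in doc.info_types]
    # Labels 'C0'..'C4' sort correctly as strings (single digit).
    return max(candidates, default=DEFAULT_LEVEL)


def confirm_level(doc: Document, reviewer_choice: str) -> Document:
    """Record the level chosen by a human reviewer, who may override the suggestion."""
    doc.level = reviewer_choice
    return doc
```

For instance, a report declaring both 'Financial data' and 'Personal data' would be suggested as 'C3', but `doc.level` stays empty until `confirm_level` is called with the reviewer's decision.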
Once the data has been classified, access rules need to be defined. This means that certain users will have access rights to certain classification levels.
It is also possible to define 'spaces' (or 'folders' in DMS terms) with a classification level. A 'C2' space accepts only content classified 'C2' or lower, and only users cleared for 'C2' or higher.
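The two conditions of this space rule point in opposite directions, which is easy to get wrong in practice. A minimal sketch, assuming labels of the form 'C0' to 'C4'; the function names are hypothetical:

```python
def level_rank(label: str) -> int:
    """'C0' -> 0, 'C2' -> 2; assumes labels of the form 'C<digit>'."""
    return int(label[1:])


def space_accepts_content(space_level: str, content_level: str) -> bool:
    """A space only holds content classified at its own level or lower."""
    return level_rank(content_level) <= level_rank(space_level)


def user_can_enter_space(user_clearance: str, space_level: str) -> bool:
    """A user needs a clearance at the space's level or higher."""
    return level_rank(user_clearance) >= level_rank(space_level)
```

With a 'C2' space, `space_accepts_content("C2", "C3")` is false (the content is too sensitive for the space), while `user_can_enter_space("C3", "C2")` is true (the user is cleared above the space's level).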
Note that certain high classification levels require document encryption and certified hosting, such as SecNumCloud in France or C2S (USA), or the future EUCS scheme (European Union Cybersecurity Certification Scheme on Cloud Services).
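Such requirements can be checked automatically before a document is stored. The sketch below assumes that the protection threshold is 'C3' and that the accepted hosting labels are known in advance; both values, and the names `StoredDocument` and `policy_violations`, are illustrative and would come from the organisation's own policy.

```python
from dataclasses import dataclass

# Assumptions for illustration only.
PROTECTED_FROM = 3                        # 'C3' and above
CERTIFIED_HOSTING = {"SecNumCloud", "C2S"}


@dataclass
class StoredDocument:
    level: int       # 0 for C0 ... 4 for C4
    encrypted: bool
    hosting: str     # label of the hosting offer


def policy_violations(doc: StoredDocument) -> list[str]:
    """List the reasons why a stored document breaks the policy (empty if none)."""
    problems = []
    if doc.level >= PROTECTED_FROM:
        if not doc.encrypted:
            problems.append("document must be encrypted")
        if doc.hosting not in CERTIFIED_HOSTING:
            problems.append("hosting must be certified")
    return problems
```

Returning the list of problems, rather than a single boolean, makes it possible to tell the user exactly what to fix.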
References:
- Regulation (EU) 2023/2854 of the European Parliament and of the Council of 13 December 2023 on harmonised rules on fair access to and use of data and amending Regulation (EU) 2017/2394 and Directive (EU) 2020/1828 (Data Act)
- Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance)
- Council Decision of 23 September 2013 on the security rules for protecting EU classified information (2013/488/EU)
- Actualisation de la doctrine d'utilisation de l'informatique en nuage par l'État (« cloud au centre») (FRA)
- La loi « de blocage » : réforme et publication d'un guide (FRA)
- "Guide à usage des entreprises d’identification des données sensibles" (FRA)
- NIST SP 800-53 Rev. 5 "Security and Privacy Controls for Information Systems and Organizations" (ENG)
- Public Consultation on the draft Candidate EUCC Scheme