What Do We Mean by the Type of Data Potentially in Scope for GDPR?
Since the General Data Protection Regulation (GDPR) was formally announced, there has been much discussion about the types of data that may be in scope for a GDPR initiative. In this blog, I’ll explore some of the potential definitions of type and some of the challenges they may bring. To make this easier, I try to think about two definitions of type that could apply to the data in a GDPR initiative.
- The first is around what we’d probably call an Entity, or something that describes the contents of potentially in-scope GDPR data moving around an organisation
- The second is more about the technology options that surround this Entity data.
Let’s look at these options and see what challenges they might bring.
The Entity type
This type will reference the content of the data moving around and across the technology of an organisation. Given that GDPR is aimed at personal data, this type is likely to reference content that is about individuals although the scope of potential sources may not just be limited to this set. Let’s explore some potential GDPR initiative Entity types:
- Customer
- Personal information about Customers that could be attributed to B2C oriented businesses. Examples could include the personal details about the owner of a Bank account or a Loan
- Client
- Personal information about a Client that could be attributed to B2B oriented businesses. Although this definition is often associated with the name of a business, many businesses have the owner’s name as part of the company name. An example could include ‘Andrew Joss Consulting’ which, although a fictional company, might include the actual name of the owner in the title
- Policyholder
- Personal information about an Insurance Policyholder for a wide range of potential products or services. Based upon the above two definitions, this could be related to either B2C or B2B oriented businesses
- Beneficiary
- Personal information that isn’t about a defined ‘Customer’ or ‘Policyholder’ but about somebody attached to, or identified against, a specific Financial Services product. An example could be a spouse identified to receive a survivor’s pension after the death of a ‘Policyholder’
- Contact
- Personal information often associated with a Client (in the B2B world) which, although not specifically about the company in question, identifies individuals who might be used as contact points within that company and could be seen as individuals associated with a company
- This definition can also be used for data collected about individuals or groups that are prospects of an organisation
- Employee
- Personal information on employees within an organisation. Given the structure of many modern businesses, the definition of what an ‘employee’ is could vary considerably
- Contractor
- Personal information on individuals or businesses that have a contract of some form. This could include agency or temporary staff
- Volunteer
- Personal information on individuals or groups that provide products or services where they are classed neither as employees nor as being on contract. This group may not feature in an HR or contract system and therefore could be outside any data management environment
- Visitor
- Personal information on individuals or groups that may have no formal linkage to the organisation but whose personal data may be collected. An example could include a visitor from a third party organisation to the institution
- ‘Other’
- This is a catch-all definition for personal data that may get collected as part of any business process and that doesn’t fit in any previous category.
Given that GDPR is a principles-based regulation, organisations will need to consider which Entity types may or may not be in scope for them. The complex nature of many modern business models means organisations will need to clearly identify which Entity types they consider will be in scope and then develop capability to manage this data as part of a wider GDPR initiative. It should be noted that organisations will need to consider how they will collect, manage and deploy their definition of the ‘Consent’ attributed to this data.
The Technology type
This refers to the many technology options used to manage and move Entity data, both around an organisation and across a possible ecosystem of organisations. Here are some of the more commonly identified types:
- Structure
- Options include structured, semi-structured and unstructured
- Entity data could be held in a variety of structures, and that can make the difficulty of managing it vary widely. Structured data is well known and understood so tends to be considered the easier option. Unstructured data is often considered the most challenging due to the high degree of processing required to extract personal data from it.
- Closeness
- Options include online, near-line, offline and backup.
- Entity data could range from data held with very quick and easy access all the way through to data that requires significant amounts of processing to retrieve before it can be managed as part of a GDPR initiative.
- Medium
- Options include digital, physical or a combination.
- Digital data is often considered the easiest as it’s what many organisations collect and manage by default. Physical data could include information held within a paper document or in physical form. A combination is where there are aspects of both digital and physical involved. An example of this would be the physical letter from a policyholder writing to formally notify a change of address. This involves scanning the original document and creating metadata about the scanned version, as well as electronically updating the Policy system
- Explicit
- Options include explicit, implicit or a combination.
- Explicit is probably self-explanatory. Implicit relates to data that hasn’t been explicitly defined as personal but contains personal information. An example would be an account number that contains the holder’s date of birth.
- Location
- Options include internal or external
- With the advent of cloud-based solutions and the complexity of modern business models, personal data might sit inside or outside the organisation’s physical or electronic firewalls. If this is the case, then consideration needs to be given to any specific challenges or risks associated with the location of the data
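One way to make the five Technology attributes above concrete is to record them against each data source in a simple catalogue. The following is a minimal Python sketch rather than a prescribed model: the class names, field names and the example record are all hypothetical, chosen only to mirror the options listed above.

```python
from dataclasses import dataclass
from enum import Enum


class Structure(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


class Closeness(Enum):
    ONLINE = "online"
    NEAR_LINE = "near-line"
    OFFLINE = "offline"
    BACKUP = "backup"


class Medium(Enum):
    DIGITAL = "digital"
    PHYSICAL = "physical"
    COMBINATION = "combination"


class Explicitness(Enum):
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"
    COMBINATION = "combination"


class Location(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


@dataclass(frozen=True)
class DataSource:
    """One catalogue entry (hypothetical shape, for illustration only)."""
    name: str
    entity_type: str  # one of the Entity types, e.g. "Policyholder"
    structure: Structure
    closeness: Closeness
    medium: Medium
    explicitness: Explicitness
    location: Location


# The change-of-address letter from the Medium example. Only the
# "combination" medium comes from the text; the other values are assumed.
letter = DataSource(
    name="Change-of-address letter",
    entity_type="Policyholder",
    structure=Structure.UNSTRUCTURED,
    closeness=Closeness.ONLINE,
    medium=Medium.COMBINATION,
    explicitness=Explicitness.EXPLICIT,
    location=Location.INTERNAL,
)

print(letter.entity_type, letter.medium.value)
```

Using enumerations rather than free text keeps the recorded options limited to the ones an organisation has agreed on, which may make later comparison across sources easier.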
So why might this be important?
One of the reasons for writing about this subject is that I think it helps inform the decision-making process organisations may be going through about how to prioritise the tackling of potential data in scope.
Given the potential complexity of the subject, organisations will probably want to consider what approaches can be taken to either simplify the situation or at least help bring some clarity to the sequence of possible activities. The picture above has been a useful discussion point to help organisations think about some of the attributes of potential in-scope data. The principles-based nature of GDPR means organisations will need to consider a number of potential factors associated with in-scope GDPR data and how to use these factors to inform the order in which to tackle that data.
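To illustrate how such attributes might inform an order of work, here is a small Python sketch that scores a few data sources and sorts them. The weights, the source names and the idea that harder-to-manage attributes score higher are purely assumptions for discussion. Whether an organisation starts with the easiest or the hardest sources is a decision this post does not make.

```python
# Hypothetical difficulty weights per attribute value (assumptions, not guidance).
DIFFICULTY = {
    "structure": {"structured": 1, "semi-structured": 2, "unstructured": 3},
    "closeness": {"online": 1, "near-line": 2, "offline": 3, "backup": 4},
    "medium": {"digital": 1, "combination": 2, "physical": 3},
    "explicitness": {"explicit": 1, "combination": 2, "implicit": 3},
    "location": {"internal": 1, "external": 2},
}


def difficulty_score(source: dict) -> int:
    """Sum the hypothetical weights across all five attributes of a source."""
    return sum(DIFFICULTY[attr][source[attr]] for attr in DIFFICULTY)


# Made-up data sources for illustration.
sources = [
    {"name": "Archived paper claims", "structure": "unstructured",
     "closeness": "offline", "medium": "physical",
     "explicitness": "implicit", "location": "external"},
    {"name": "CRM customer table", "structure": "structured",
     "closeness": "online", "medium": "digital",
     "explicitness": "explicit", "location": "internal"},
    {"name": "Scanned policyholder letters", "structure": "unstructured",
     "closeness": "online", "medium": "combination",
     "explicitness": "explicit", "location": "internal"},
]

# Lowest score first, i.e. the sources assumed to be easiest to tackle.
ranked = sorted(sources, key=difficulty_score)
for source in ranked:
    print(difficulty_score(source), source["name"])
```

Reversing the sort (`reverse=True`) would put the assumed hardest sources first instead, which some organisations may prefer if those carry the most risk.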
Note
It should also be noted that GDPR being principles-based means organisations will need to consider what obligations they may, or may not, need to meet. This document is not intended to be any form of advice or guidance, rather a discussion point about topics that might inform the development and undertaking of a wider GDPR initiative.