TL;DR: Abiding by the law doesn’t have to be seen as a cost; it can be an investment in your data platform. Create a data lake if you don’t have one, and optimize your existing one if you do. Deliver better services and stay compliant.
The California Consumer Privacy Act (CCPA): a sister to GDPR and a nuisance to almost every corporation in sunny California.
As of January 1st, 2020, the law made it possible for consumers to request access to, and the deletion of, all data a company holds about them. And as of July 1st, the Attorney General can finally start dishing out fines to give the law some backbone.
The law is a huge win for the consumer, no doubt, but it is also a monumental task for companies to abide by.
First of all, tracking down a consumer's data across a company's many silos is a difficult task. But ask anybody close to the data and they’ll tell you that getting it into a readable format is also a nightmare.
To make matters more difficult, we’ll also need to create a system for two-way communication. After all, the consumer will need to be able to request the deletion of their data and receive confirmation of successful deletion, which is another complicated task if the infrastructure hasn’t been progressively updated over the years.
Well, I found myself on such a project. Throughout the rest of this post I’ll cover my experience with the project management side of setting up a system to do this, and highlight how the artifacts the process creates can be used to optimize our data platform.
CCPA can serve as an opportunity for businesses to re-configure their data platform.
As a byproduct, the law forces us to crack open those business units in order to IDENTIFY our data and build processes to ACCESS it, which can create some very valuable metadata that database managers would kill for.
Simply put, create a system for customers to make access or delete requests across the whole company. Every database, CSV and text file will need to be searched, and the matching records retrieved or purged, with every request.
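To make that concrete, here is a minimal sketch of the fan-out, assuming each data source can be wrapped behind the same two operations. The `DataStore` class, the `handle_request` function and the sample records are all hypothetical stand-ins for illustration, not the system we actually built.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    """Hypothetical wrapper around one BU's database, CSV or text file."""
    name: str
    records: dict = field(default_factory=dict)  # consumer_id -> stored row

    def fetch(self, consumer_id):
        # Return the consumer's row, or None if this store holds nothing.
        return self.records.get(consumer_id)

    def purge(self, consumer_id):
        # Delete the consumer's row; True means something was removed.
        return self.records.pop(consumer_id, None) is not None


def handle_request(stores, consumer_id, kind):
    """Fan a single consumer request out to every registered store."""
    if kind == "access":
        found = {}
        for store in stores:
            row = store.fetch(consumer_id)
            if row is not None:
                found[store.name] = row
        return found
    if kind == "delete":
        # One confirmation flag per store, so the consumer can be told
        # exactly where data was removed.
        return {store.name: store.purge(consumer_id) for store in stores}
    raise ValueError(f"unknown request kind: {kind}")


# Made-up sample data for three business units.
stores = [
    DataStore("billing", {"c42": {"email": "a@example.com", "plan": "pro"}}),
    DataStore("marketing", {"c42": {"email": "a@example.com"},
                            "c7": {"email": "b@example.com"}}),
    DataStore("support", {}),
]

report = handle_request(stores, "c42", "access")
confirmation = handle_request(stores, "c42", "delete")
```

The delete path returns a per-store result rather than a single yes/no, because the confirmation sent back to the consumer is only trustworthy if every silo has reported in.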
In order to organize a project of this scale you need a core team responsible for project management that not only has a communication process set up across every business unit but also ensures a cadence is followed and updates are delivered.
Most companies opt for a team of project management consultants because these roles have long hours and require a high level of movement across the business units... which can get politically charged.
A weekly stand-up with all business units (BUs) to go over project progress, blockers and successes, and to share each unit's progress for all to see.
A weekly stand-up with each individual BU for direct support and problem solving. These focus mainly on identifying the data the BU uses or stores in order to function, but also on spotting obfuscation opportunities.
The last type of meeting is the ad hoc, which lets you bring the stakeholders into one room and frame the problem in a way that lets everyone contribute to solving cross-business-unit problems. In my opinion this is where the magic really happens, and it can be extremely fun to be a part of.
The types of meetings above are really used to drive the following tasks in each BU:
- Locating databases
- Identifying sensitive data points
- Creating / Updating the output schema
- Data obfuscation opportunities
- Assisting with the purge process
For a business / data nerd like myself, the exploration and business understanding required is exactly why this type of project attracts my attention. That being said, the most valuable point here is the output schema each business unit generates.
By taking all the outputs and creating a set of non-duplicate columns, we can create a master schema that can then be used as the data source to serve a nicely formatted output to the user.
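As a rough sketch of that merge step, assuming each BU hands over its output schema as a plain list of column names: the `build_master_schema` function, the BU names and the columns below are hypothetical, and matching columns by lower-cased name is a simplifying assumption (real schemas usually need a manual mapping for columns that mean the same thing under different names).

```python
def build_master_schema(bu_schemas):
    """Merge per-BU output schemas into one set of non-duplicate columns.

    Returns a dict mapping each normalized column name to the list of BUs
    that supply it, in the order the columns were first seen.
    """
    master = {}
    for bu, columns in bu_schemas.items():
        for column in columns:
            # Assumption: names differing only in case or padding are the same column.
            key = column.strip().lower()
            providers = master.setdefault(key, [])
            if bu not in providers:
                providers.append(bu)
    return master


# Made-up output schemas from three business units.
bu_schemas = {
    "billing": ["customer_id", "Email", "plan"],
    "marketing": ["customer_id", "email", "campaign"],
    "support": ["customer_id", "ticket_count"],
}

master = build_master_schema(bu_schemas)
master_columns = list(master)
```

Keeping track of which BUs supply each column costs nothing extra here, and it is exactly the kind of metadata that tells you which columns are shared keys and which are unique to one silo.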
Databases are Important
That one schema will account for every single BU's data needs. The opportunities this creates are endless, but one of the most interesting is the chance to use this data to create more accurate master tables and design better lookup tables.
Why does this matter? Well, experience tells us that, in most enterprises, applications come and go, but databases usually stand for a long time. It therefore makes good sense to make the effort to develop a good design based on the rules that are specific to the business segment in context (Teorey, 1994).
Most of the challenges around a project like this come from organizational structure and historical technology impeding the progress of discovery.
Essentially what you’re doing is taking an x-ray of a company’s data infrastructure while it is in motion. Resources that have the knowledge you need are likely to be stretched and occasionally burnt out -- understanding how to minimize the impact your project has on their backlog will serve you well.
From a technical standpoint, abiding by internal security protocols when refreshing the data platform will require time and patience, especially in cases where first-party PII (personally identifiable information) is stored.
The best opportunities show up when you least expect them. For a red brick organization, CCPA sounds like a nightmare, and for the most part it is, but with a little forethought and planning it can be turned into an opportunity.
With CCPA, the opportunity is simply to create a single view of your customer across siloed BUs, enabling you to better serve your customers and potentially uncover new revenue streams and cut costs.