We need to redefine what privacy actually means when we store data in the Cloud. Why should we re-engineer data privacy? What makes the Cloud different to on-premises data storage? Surely data security and privacy are largely synonymous?
Well, yes… and no.
Data Security and Privacy in historical context
When organisations managed their IT and data within their physical domains, data security focused on the technological safeguarding of data against unauthorised access. That meant protecting sensitive, valuable data against outsiders getting access to its plaintext. The data security industry evolved, and we experience this most publicly via AES encryption today.
On the other hand, data privacy focused on the safeguarding of authorised access. That’s all about us users. Some of us get access, but most don’t. From this, the Identity and Access Management (IAM) industry emerged, giving us a myriad of passwords and identity methods we now need to remember.
Protection means more than data security
Somewhere in between, the need to protect data against loss, corruption, things going bang, and people stealing or mishandling it came to the fore, and backing it up was the answer.
Nasty viruses, or “data infections”, emerged in addition, and of course data could be lost in error, mislaid or be subject to some other inexplicable event that brought about its demise.
I personally can vouch for the power of caffeine when I knocked my cup of coffee off the shelf (I mean disk array) in my company’s computer room (that’s what they were called before we needed data centres). A lesson learned the hard way that introduced me to another aspect of protecting data. The backup industry became more prominent…
As the human appetite for all things computing escalated, hardware and systems innovations enabled redundant storage developments such as RAID and mirroring. Equally crucial, we all became familiar with the grandfather-father-son scheme and, latterly, the 3-2-1 rule for data backups.
By the time the Internet spawned the Cloud as we know it today, the organisational world had developed tech divisions, departments and staff paid to look after data security, identity and access management and resilience of data and systems.
In the data context, the idea was we secure it by encrypting it, restrict access to it so it’s private to those with legitimate access, and protect it against loss by copying it and storing it elsewhere…
All good then?
It’s never that simple. The Cloud was born into an epidemic of Internet-enabled data hacks and identity breaches that has spawned the hacker. The Dark Web is now a breeding ground for hackers and data laundering. State-sponsored hacks are rife, as is the reputed harvesting of enormous data lakes by ‘rogue states’ waiting for the day when computing power is sufficient to break today’s encryption in the time it takes to drink a cup of tea.
Then there’s ransomware. A kick in the teeth for the good guys, whose standardised security is turned by opportunistic hackers into corrupt commerce … the modern-day equivalent of a ‘data bank’ hold-up without stealing any data.
Then there are the whistle-blowers who find it acceptable in principle to abuse their access rights. Whether they are morally right or wrong, it’s difficult with traditional means to prevent determined, rogue actors abusing rights they have legitimately been given.
Let’s not forget about political moves to enable government agencies to exercise a legal right to access data that we might otherwise regard as private. Society needs to be protected against threats that may destabilise our way of life, but we don’t want an Orwellian backdrop to our online freedom.
Latterly, data privacy regulations, such as GDPR in Europe, have caused a stir. It’s important to note these follow efforts in the US, specifically for healthcare, such as HIPAA and RTEC, which have proved complex to drive home. Japan devised the APPI in 2003, and it finally came to pass in its current form in 2017, after major data breaches raised the need for tougher regulation. Most recently, California introduced the CCPA, which will be enforceable from July 2020.
This body of regulation, which indicates regulatory collaboration around the world, has created greater public awareness of the threat to data privacy online. This, in turn, has created public concern for data privacy amid calls that ‘data privacy is a human right’.
I certainly agree the public needs to trust that its data remains private. If it can’t, I for one believe the vast potential of Cloud computing will never crystallise.
Let’s get back to the Cloud and its implications for data privacy.
Data privacy regulation such as GDPR brings tech security problems arising in the corporate basement to the lofty height of the top floor boardroom… where data breaches are not counted in records but in organisational fines, public disclosure and brand impact.
The benefits of the Cloud are immense but not to be taken lightly. The ‘wild west’ of Internet expansion needs to be tamed. Regulatory fines, mandatory disclosure and public humiliation are starting to bite. The point is, regulators and their political masters are now concerned enough to act. They can’t fix the problem, but they can mete out pain where consumer privacy is not respected. That has a direct impact on the technology industry and all who depend on it.
The need to evolve Data Privacy, Protection and Recovery for the Cloud
What happened to data privacy you may ask? How can data get nicked to the tune of 8 billion records a year?
AES still provides immense data security. Unfortunately, it doesn’t stop data being stolen. That means it can’t stop mandatory GDPR disclosures where data is stolen, whether encrypted or not. If it’s identifiably yours, irrespective of what you do to mask, tokenise or otherwise de-identify it, it’s your organisation’s regulatory obligation to declare the breach.
As for IAM, how can we maintain privacy when hackers can bypass organisational IAM systems and plunder Cloud object stores, third party copies or other copies held wherever, and under whoever’s control and surveillance?
In short, the Cloud brings a massive and still misunderstood challenge when it comes to data privacy. Sure, today’s encryption is solid. But extending data resilience to the Cloud means more third parties that need to ‘protect’ data from loss. That means more copies of your data in more places, so the attack surface of your data grows. Organisational data, once protected by the organisation’s physical firewall and IAM systems, is now spread across the Cloud.
Let’s be clear. Cloud providers have sophisticated infrastructure security, high availability systems and backup against failure, to the extent that they publicise data SLAs quoted to many nines. The problem is the inevitable spread of data, responsibility and surveillance necessitated by applying traditional security, privacy and resilience protocols to the Cloud.
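To put those nines in perspective, here is a rough back-of-envelope sketch. It assumes a durability figure is read as the probability that a single object survives a year, so that eleven nines means a one in 10^11 chance of loss per object per year. The function name and the object count are purely illustrative and not taken from any provider’s documentation.

```python
from decimal import Decimal


def expected_annual_losses(object_count: int, nines: int) -> Decimal:
    """Expected number of objects lost per year at `nines` nines of durability.

    Assumption: durability is a per-object, per-year survival probability,
    so the chance of losing any one object in a year is 10 ** -nines.
    """
    loss_probability = Decimal(10) ** -nines
    return Decimal(object_count) * loss_probability


# Ten million objects stored at eleven nines of durability (illustrative figures).
losses = expected_annual_losses(10_000_000, 11)
years_per_lost_object = 1 / losses
print(f"expected losses per year: {losses}")
print(f"roughly one object lost every {years_per_lost_object:,.0f} years")
```

On these assumptions, loss is a vanishingly rare event. Note what the figure does not measure: it describes the chance of data disappearing, and says nothing about the chance of a copy being read by someone who shouldn’t see it.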
Whilst data privacy regulations have evolved in light of painful events, I’m not sure our thinking on what data privacy means in the Cloud, and how to re-engineer it, has kept pace.
For their part, Cloud vendors have a clear Shared Responsibility Model that defines what they are accountable for and what remains the end customer’s responsibility. I don’t believe that’s as well understood by organisations as it should be. Multiple breaches of Cloud object stores due to misconfiguration and lack of clarity over who owns what, amongst other vulnerabilities, prove this.
The practical manifestation of the growth of Cloud is that, for all its massive benefits, it’s pretty near impossible to safeguard all of the data all of the time. It’s absolutely impossible to expect end users to do so, especially for data held at third parties in the Cloud.
The world of data security has revolved around the NIST AES standard for over 15 years now. Data privacy, historically focused on access to data, has developed apace, but biometrics, behavioural identity management and even semi-conscious and dynamic authentication are in their infancy. Password managers remain the choice of the beleaguered end user but have an “all or nothing” characteristic when it comes to user authentication.
Bottom line, if we are to learn the lessons from our past, we must evolve and get stronger, Darwinian style. That means building on the best of foundations from the past and innovating to protect our future.
Data privacy for the Cloud as a platform for Edge and IoT
The Cloud introduces many societal, commercial and geopolitical benefits. It connects us all. The commercial race to exploit the Cloud has driven its exponential growth, and it is now recognised as a stepping stone to Edge and IoT delivery.
These are the two computing developments designed to bring data closer to the devices and applications we will use. It means more data flying around more places and more complexity with reduced latency. We critically need a new way to keep our data private.
Importantly, we also need to reduce its attack surface given its “always on, available everywhere, anytime” nature. That means keeping fewer copies of it.
Equally important, if data is needed in the always on, work anywhere, reachable world of the Edge to come, it needs to be resilient. But also secure… and oh yes, private.
Without getting into the threat of quantum computing, which ultimately will make some computing processes go faster, but may be a bit of a damp squib for Joe Public, we need to apply the concept of “quantum” to data.
We need to think of data privacy as just that. Private at the data object level to the individual, especially where data is out of sight, and beyond the access management system that would otherwise give privacy its context. After all, there’s no silver bullet, as we all know. There is simply making life tougher for the bad guy than the bad guy makes it for us.
Can data privacy that’s environmentally, economically and socially acceptable truly be possible?
Data needs to be stored in ways that mean it’s virtually impossible for anyone other than its owner to get any value from it. The current epidemic of data theft, hacks and state-sponsored data exfiltration means we need to think hard about how we protect data in the Cloud. Especially when it’s out of sight at rest in Cloud stores, under the control of third parties or copied for resilience across Multi-Cloud and offline environments.
That means it can’t be stored or passed around in a form that would trigger a GDPR-notifiable event. Static data is a sitting target. That’s not good enough anymore. The concept of data movement must evolve.
Oh, it also needs to be resilient to one or more Cloud or data store failures. That means evolving the concept of data resilience to data recovery at the data object level if we are to embrace Edge and IoT. No point in waiting for full backups to recover a file in the always on, work anywhere world…
So, data needs to be agile enough to be moved around, whether simply to keep it private or to move it to where you may next need it. This evolves a field known as moving target defence. Not particularly glamorous, but a good building block.
Data needs to be recoverable at the object level wherever you are and need it, irrespective of whether one or more Cloud disruptions, technology failures, corruptions or data thefts have occurred.
That means overheads are needed, of course. However, let’s throw in the need for improving our environmental footprint.
Let’s cut down the number of copies and restrict all data resilience to less than doubling the size of any data object, while still delivering object-level recovery where one or more data stores are disrupted… We can do this by building on data resilience and sharding techniques used for over 30 years, now a theme in many Cloud solutions.
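To make the “less than doubling” idea concrete, here is a minimal sketch using simple XOR parity, one of the oldest resilience techniques and the principle behind parity RAID. It is an illustration under stated assumptions, not a description of any particular product: the function names are hypothetical, it covers resilience only (no encryption or de-identification), and plain XOR parity survives the loss of just one shard. Surviving several lost stores would call for a stronger erasure code such as Reed-Solomon.

```python
from functools import reduce
from typing import List, Optional


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-sized shards plus one XOR parity shard."""
    shard_len = -(-len(data) // k)              # ceiling division
    padded = data.ljust(shard_len * k, b"\0")   # pad so all shards match in size
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(_xor, shards)
    return shards + [parity]


def recover(shards: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the object when at most one shard (data or parity) is missing."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair a single lost shard")
    repaired = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards reproduces the missing one.
        repaired[missing[0]] = reduce(_xor, present)
    return b"".join(repaired[:-1])[:original_len]


# Illustrative use: five shards, each imagined to live in a different store.
obj = b"customer record: example payload"
stored = split_with_parity(obj, k=4)
overhead = sum(len(s) for s in stored) / len(obj)

stored[2] = None                                # one store becomes unavailable
assert recover(stored, len(obj)) == obj
assert overhead < 2                             # well under a full second copy
```

With four data shards and one parity shard, the stored footprint is one and a quarter times the object rather than the two or three times that full copies would need, and no single store holds the whole object.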
Let’s then help organisations customise exactly how they protect the data they are accountable for. Of course, they need to do this in a way that is regulatory compliant, affordable, carbon efficient and customer friendly. Now that adds real novelty…
That way, organisations can rebuild trust with customers, who can take comfort in ‘customer unique’ data privacy.
It also means organisations have a unique defence against hackers…
Think of being able to say ‘prove it’ when a hacker has but shards of encrypted, obfuscated, de-identified and contextless data with no link to your organisation. Such data will be equally meaningless to third parties tasked with looking after it.
Building on the foundations of innovation – Green, clean and mean data privacy
In short, data privacy in the Cloud is significantly more onerous than applying access and identity management to it. The Cloud and its future manifestations bring challenges that today’s separate security, resilience and privacy technologies, whether in isolation or combined, can’t meet. The statistics prove this.
However, long-standing building blocks for security, resilience and privacy are available to us all. We need to innovate to leverage them, as others did before us.
Re-engineering Cloud solutions that combine all three, whilst balancing economic, environmental and social demands, is one way to keep hackers and the threat of future technologies at bay. We can build a more inclusive, trusted, carbon-positive Cloud ecosystem.
It may not be the only way, but it’s our way.