Does Your Organization Need a Data Diet?

The scenario is all too familiar: There’s a security breach, and afterward the affected organization asks what it must do to better protect its data.

But what if that organization never collected and stored that sensitive information in the first place? Often, the best defense against an embarrassing and costly breach is to collect only data that is essential to an entity’s mission.

Some who work in computer privacy circles refer to this as “going on a data diet.” They know the temptation is great for organizations to bulk up on data of all kinds. After all, storage costs are low. With the move to the cloud, an organization doesn’t even need to invest in hardware and upkeep to store the information it collects. So why not grab whatever data a customer or client is willing to provide?

Because the only sensible approach to computer security today is to ask when your organization will be breached, not if. That’s why it pays not only to go on a data diet but also to adopt a regimen for keeping your organization’s databases lean and, as a result, your customer relationships healthy.

Too often, companies collect data just because there’s a chance it might be useful or because they think it’s harmless. I’ve seen this play out firsthand. Sales departments seek any small edge they can find; who knows what shard of data might prove useful in the pursuit of potential customers? The same is often true of an organization’s marketing unit, whose people are working hard to reach a broader audience for their product or message. If information flows freely from potential and current customers, these teams collect it in pursuit of any advantage in the competition for people’s attention.

But there are costs to an organization beyond the price tag of data storage. They include reputational risk, exposure to lawsuits, and the distractions that these and other problems bring. More information might seem better, until hackers have broken into a system and the head of an organization is forced to explain why that data was gathered in the first place.

Weaponizing Seemingly Harmless Data

Not that long ago, there seemed to be two distinct types of data. In one bucket was personally identifiable information (PII), such as a name, address, phone number and email address, along with critical financial information like bank account and credit card numbers. Throw in protected health information (PHI) and other obviously sensitive information. Everyone knew this sort of data must be zealously safeguarded.

The second bucket was everything else: the seemingly extraneous bits we all leave behind while living in a digital world, such as what time you visited a website, which language you chose for it and what products you viewed. Think of it as a mop bucket: messy if spilled, but not vital to corporate security.

But those distinctions have blurred in recent years. An individual piece of data might seem innocuous on its own, yet become revealing when combined with other data sets, even when none of them falls under the legal definition of PII. What makes Big Data so promising to an organization’s sales and marketing teams is also what makes it potentially dangerous to collect: the connections that a clever algorithm and today’s computing power can uncover while crawling through vast data stores. Information that is personally sensitive (a person’s tastes, habits or activities) but not technically PII can, once linked, be weaponized to harm individuals.

As an example, think of the tracking data offered by people’s cell phones. If a person leaves the same street address at around 8:30 every morning and makes the same drive back and forth, it quickly becomes obvious where that person lives and works. Given other open records, a hacker with dark motives can find the commuter’s name and other information. It’s not a big jump from there to crimes like theft, stalking or blackmail.
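To show how little effort that inference takes, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the pings are synthetic, the location labels (`cell_A` and so on) are invented, and the hour windows chosen for "night" and "workday" are arbitrary. It simply counts which place appears most often in each window.

```python
from collections import Counter
from datetime import datetime

# Synthetic, hypothetical pings: (ISO timestamp, coarse location label).
PINGS = [
    ("2024-03-04T02:10", "cell_A"),
    ("2024-03-04T08:25", "cell_A"),
    ("2024-03-04T11:00", "cell_B"),
    ("2024-03-04T15:30", "cell_B"),
    ("2024-03-04T23:40", "cell_A"),
    ("2024-03-05T03:00", "cell_A"),
    ("2024-03-05T10:15", "cell_B"),
    ("2024-03-05T12:45", "cell_C"),
    ("2024-03-05T14:00", "cell_B"),
]

# Assumed time windows; real habits vary widely.
NIGHT_HOURS = set(range(0, 6)) | set(range(22, 24))
WORK_HOURS = set(range(10, 17))


def most_common_place(pings, hours):
    """Return the most frequent location among pings whose hour falls in `hours`."""
    counts = Counter(
        place
        for timestamp, place in pings
        if datetime.fromisoformat(timestamp).hour in hours
    )
    return counts.most_common(1)[0][0] if counts else None


likely_home = most_common_place(PINGS, NIGHT_HOURS)
likely_work = most_common_place(PINGS, WORK_HOURS)
print("likely home:", likely_home)
print("likely work:", likely_work)
```

None of the inputs is PII in the traditional sense, yet two frequency counts are enough to guess a home and a workplace. That is the kind of linkage a data diet is meant to make impossible.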

Adopting a New Data Discipline

Given the risks, forward-looking organizations are now adopting privacy programs that force their teams to ask the hard questions before collecting information. For these privacy practitioners, any time someone internally proposes that a new piece of data be grabbed or used in a new way, a conversation ensues about its usefulness and the validity of collecting it.

Maybe the review is conducted by an internal privacy committee that meets to debate whether a proposed collection is appropriate. Maybe it’s a rigorous process with checks and balances throughout the organization. But whatever the method, only data that can be justified as fundamental to the organization’s operation is collected.
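One way to picture that rule in software is an allowlist: a field is stored only if someone has recorded a purpose for it. The sketch below is a hypothetical illustration, not a description of any particular organization’s process. The field names, the purposes and the `minimize` helper are all invented for the example.

```python
# Hypothetical registry of approved fields, each with its documented purpose.
APPROVED_FIELDS = {
    "email": "account login and receipts",
    "shipping_address": "order delivery",
}


def minimize(submitted):
    """Split a submitted form into fields to keep and field names to discard.

    Only fields with an entry in APPROVED_FIELDS are kept; anything else is
    dropped before it ever reaches storage.
    """
    kept = {name: value for name, value in submitted.items() if name in APPROVED_FIELDS}
    dropped = sorted(name for name in submitted if name not in APPROVED_FIELDS)
    return kept, dropped


form = {
    "email": "pat@example.com",
    "shipping_address": "1 Main St",
    "birthday": "1990-01-01",
    "favorite_color": "green",
}
kept, dropped = minimize(form)
print("stored:", sorted(kept))
print("discarded:", dropped)
```

The design choice is that the default answer is "no." Adding a field means adding a line to the registry, which is exactly the moment the privacy conversation should happen.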

Another idea is gaining popularity among those focused on privacy: a process for jettisoning old data. (An example: Does an organization need to know a person’s last several addresses, or just the current one where they can be reached?) Under such a data diet, all information that an organization continues to store beyond its immediate use requires a specific justification.
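A retention rule like this can be sketched as a small purge routine. The example below is an assumption-laden illustration: the categories, the retention periods and the record layout are all hypothetical. A record survives only if its category has a documented retention period and the record is still within it.

```python
from datetime import date, timedelta

# Hypothetical retention periods, in days, for categories with a documented need.
RETENTION_DAYS = {
    "current_address": 365 * 2,
    "order_history": 365,
}


def is_expired(record, today):
    """A record is expired if its category has no policy or its age exceeds the limit."""
    limit = RETENTION_DAYS.get(record["category"])
    if limit is None:
        return True  # no documented justification, so do not keep it
    return today - record["stored_on"] > timedelta(days=limit)


def purge(records, today):
    """Return only the records that are still justified on the given day."""
    return [record for record in records if not is_expired(record, today)]


records = [
    {"category": "current_address", "stored_on": date(2024, 1, 1)},
    {"category": "previous_address", "stored_on": date(2024, 1, 1)},
    {"category": "order_history", "stored_on": date(2022, 1, 1)},
]
remaining = purge(records, today=date(2024, 6, 1))
print([record["category"] for record in remaining])
```

In this sketch the previous address is dropped because nobody has justified keeping it, and the old order history is dropped because it has outlived its period. Only the current address remains.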

After all, the information you don’t have is the easiest to protect.

Pilar Garcia

Pilar Garcia is 1Password's Privacy Officer, looking after all things privacy and compliance. She has deep knowledge of data privacy regulations and processes, and is instrumental in ensuring that 1Password customer data is protected. Pilar also leads our security audit and assessment operations, with a particular focus on SOC 2 certification. Pilar completed her bachelor's degree in physics at the Universidad de las Américas Puebla (Mexico) and received her master's degree in Pure and Applied Logic from the University of Barcelona.
