Redaction, Masking, or Tokenization? How to Choose (Before You Get Sued)
In 2019, a well-known law firm learned a million-dollar lesson the hard way. They were tasked with submitting documents for a high-profile celebrity case. Sensitive information, including private text messages, needed to be obscured. Thinking they were being diligent, they applied what they thought was a foolproof method: masking. Instead of permanently removing the confidential content, they simply covered it up with black bars, the digital equivalent of putting sunglasses on a PDF.
That was a huge mistake!
It turned out the masking was easily reversible. The bars were a simple overlay, and anyone with the right tools and a curious mind could lift them off and read the texts underneath. The fallout was immediate: a $1 million lawsuit for breach of privacy and a very public apology that echoed through the legal industry. The firm’s reputation took a huge hit, and high-profile clients became harder to land. All this from a seemingly small technical oversight.
This problem isn’t exclusive to law firms. Whether you’re a doctor protecting patient records, a banker entrusted with people’s finances, or a freelancer handling different clients’ data, picking the wrong data protection method can backfire in spectacular fashion: heavy fines, lawsuits, and lasting reputational damage.
In this article, we’ll look at the three main ways of protecting sensitive information and explain where each one fits. Keep reading if you’d like to avoid your own $1 million mishap.
The Data Protection Trio
There are three main methods for protecting sensitive data, each with a different approach and ideal use case:
Masking
You can think of masking as putting on a disguise. It obscures the original data but keeps some of its identifying characteristics and format intact. For example, the law firm that lost $1 million used black bars to mask sensitive text in its documents, but the bars could be removed with the right software to reveal the original content.
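To make the idea concrete, here’s a minimal Python sketch of format-preserving masking (the function and its behavior are illustrative assumptions, not any particular library’s API). Notice that the shape of the value survives while the data itself is hidden, and that masking typically happens at the presentation layer, so the original value still exists in the underlying system, just as the texts still existed under the law firm’s black bars:

```python
def mask_value(value: str, visible: int = 4) -> str:
    """Mask all but the last `visible` data characters, preserving
    formatting characters (dashes, spaces) so the shape survives."""
    chars = list(value)
    # Positions of characters that actually carry information
    data_positions = [i for i, c in enumerate(chars) if c.isalnum()]
    to_mask = data_positions[:-visible] if visible else data_positions
    for i in to_mask:
        chars[i] = "*"
    return "".join(chars)

print(mask_value("4111-1111-1111-1234"))   # ****-****-****-1234
print(mask_value("Jane Doe", visible=1))   # **** **e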
Redaction
Redaction, one of the most commonly used data protection techniques, is like shredding sensitive information. Imagine you were writing your diary and realized that you just wrote a secret that you’d never want anyone to find out. You wouldn’t just scribble over it lightly, hoping no one peeks through. You’d take a thick, permanent marker and black it out completely, making it utterly unreadable.
That’s redaction explained simply. It permanently removes the sensitive information you want obscured, making it impossible to recover.
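For text pipelines, redaction boils down to destructive substitution. Here’s a minimal Python sketch (the patterns and marker format are invented for illustration); note that properly redacting a PDF means deleting the underlying content, not just drawing a box over it:

```python
import re

# Patterns for the kinds of identifiers we want gone (illustrative only)
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    """Destructively replace sensitive matches with a fixed marker.
    The original characters are gone; nothing in the output can be
    reversed to recover them."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

print(redact("Reach me at jane@example.com, SSN 123-45-6789."))
# Reach me at [REDACTED EMAIL], SSN [REDACTED SSN].
```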
Tokenization
Tokenization takes a different approach entirely. Instead of destroying or hiding the data, it replaces it with random, non-sensitive stand-ins called tokens, while the mapping between tokens and real values lives in a separate, tightly guarded store. Think of it like a spy using an alias on a mission: headquarters knows their true identity, but to the rest of the world, they’re “Agent X.”
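Here’s a toy vault-style tokenizer in Python to show the moving parts (the class and method names are illustrative, not a real payment API). Because each token is random, it reveals nothing about the original value; the mapping itself becomes the crown jewel to protect:

```python
import secrets

class TokenVault:
    """Toy vault-based tokenizer. Tokens are random, so they can't be
    reversed mathematically; only the vault can map them back."""
    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        if value in self._value_to_token:        # reuse an existing token
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)    # random, non-derivable
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]

vault = TokenVault()
t = vault.tokenize("4111-1111-1111-1234")
print(t)                    # e.g. tok_9f2c41d6a8b3e07d
print(vault.detokenize(t))  # 4111-1111-1111-1234
```

Real tokenization services keep that vault in hardened, separately secured infrastructure, which is exactly what goes wrong in Case 2 below.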
Real-World Face-Off
Understanding the definitions is one thing, but seeing these methods in action (and when they go wrong) really drives the point home.
Case 1: The Healthcare Blunder
What happened: A research clinic was conducting a study on a particular medical condition. To protect patient anonymity, they masked patient names in the study reports as “Jane D**,” “Robert S**,” and so on, believing this was a responsible approach.
However, they forgot to mask one vital piece of information: the birthdate. An inquisitive (and perhaps indiscreet) researcher matched the masked names and complete birthdates against public records. Before long, several study participants had been identified.
Lesson: Masking is not an ironclad route to anonymity. It can obscure explicit identifiers, but when masked data are joined with other available data, re-identification becomes possible. In this instance, redacting the birthdates or the names (or both) was required to effectively safeguard patient privacy.
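A toy version of that linkage attack, with entirely invented data, shows how little it takes:

```python
# Masked study data alongside hypothetical public records (all invented)
masked_study = [
    {"name": "Jane D**",   "dob": "1984-03-12", "condition": "X"},
    {"name": "Robert S**", "dob": "1979-11-02", "condition": "X"},
]
public_records = [
    {"name": "Jane Doe",     "dob": "1984-03-12"},
    {"name": "Robert Smith", "dob": "1979-11-02"},
    {"name": "Jane Dawson",  "dob": "1990-06-30"},
]

def matches(masked: str, full: str) -> bool:
    """Does a full name fit a masked pattern like 'Jane D**'?"""
    return full.startswith(masked.rstrip("*"))

for row in masked_study:
    hits = [p["name"] for p in public_records
            if p["dob"] == row["dob"] and matches(row["name"], p["name"])]
    if len(hits) == 1:   # a unique match means re-identification
        print(f"{row['name']} is likely {hits[0]}")
```

A unique match on just two quasi-identifiers is enough; no “unmasking” software required.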
Case 2: The Payment Glitch
What went wrong: A young e-commerce startup used tokenization to handle the customer credit card data behind recurring payments. They outsourced the tokenization to a third-party provider so that sensitive card numbers never resided on their own servers. It all seemed like a secure and efficient system, until the third-party provider suffered a major data breach.
Though the actual credit card numbers were never published, the key used to map tokens back to the real card data was compromised. This left the startup in a very risky position, facing the threat of regulatory penalties and an immense loss of customer confidence.
Lesson: Tokenization can be an incredibly effective way to safeguard sensitive data, particularly in payment processing. At the same time, it adds complexity, and its security rests on how well the tokenization environment and the mapping key are protected. If that mapping is compromised, the whole scheme unravels.
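To see why the mapping is the single point of failure, here’s the failure mode in miniature (all data invented): once the vault’s contents leak, every token in circulation resolves instantly, no cryptanalysis required:

```python
# An attacker's copy of a leaked token vault (invented data)
leaked_vault = {
    "tok_9f2c41d6a8b3e07d": "4111-1111-1111-1234",
    "tok_5a77be10c3d2f981": "5500-0000-0000-0004",
}

# Tokens the startup had considered safe to store and log freely
stolen_tokens = ["tok_9f2c41d6a8b3e07d", "tok_5a77be10c3d2f981"]

for token in stolen_tokens:
    print(f"{token} -> {leaked_vault[token]}")  # every card recovered
```

This is why serious tokenization providers isolate the vault (or the derivation keys, in vaultless schemes) behind separate infrastructure, strict access controls, and regular audits.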