Guide

Data Masking Methods & Techniques: The Complete Guide

Security & Compliance,

Data Management

Welcome to the complete guide to data masking methods and techniques!

We hear from enterprises all the time that are struggling to meet their data compliance commitments in a cost-effective and timely manner across their broad data estates. These estates typically feature multiple cloud vendors, multiple data technologies dating back to the mainframe, and a complex geographic customer environment.

In this resource, you’ll find a comprehensive overview of masking, as well as links to additional resources to help you solve your challenges.

Read along or jump ahead to the section that interests you most:

Data Masking Methods Overview
Data Masking Techniques
Techniques for Masking Different Data Sources
Why Use Data Masking Methods & Techniques?
Best Tools to Support Masking
Industry Examples
Get Started

Data Masking Methods Overview

Using data masking methods is the best way to safeguard sensitive data. But what’s the real value of masked vs. unmasked data? Masking preserves the data’s business value and ultimately enables faster testing and better insights for the enterprise. Unmasked data puts your enterprise at risk.

Data Masking vs. Other Methods

Data masking is just one method of protecting sensitive data; there are plenty of other methods you could use alongside or instead of masking personally identifiable information (PII).

Let’s compare data masking vs. other methods. We’ll also share data from the latest State of Data Compliance and Security Report on how frequently enterprises use each method. Most enterprises use data masking along with other security methods.

Data Anonymization

Data anonymization is considered a broad category that includes data masking. 44% of organizations use data anonymization. Compare the differences between data anonymization vs. data masking.

Data Encryption

Data encryption is a method that transforms data with a key or code, turning it into scrambled data. 61% of enterprises use data encryption. Compare the differences between data masking vs. data encryption.

Data Masking

Data masking is considered the best way to ensure compliance with GDPR, HIPAA, and CCPA. 66% of organizations use static data masking.

Uncover the essentials of data masking, from use cases to masking techniques, in this in-depth resource. Get the essential guide.

Or watch an on-demand webinar to learn requirements of data privacy regulations, and how you can best protect your data through masking.

Synthetic Test Data

Synthetic test data is an alternative method to data masking. It’s used to create artificial data that testing teams can use in place of real data. Compare synthetic test data vs. test data masking.

Common Data Masking Techniques

By leveraging the data masking techniques discussed here, you can safeguard sensitive information across your enterprise. Each technique offers different pros and cons, depending on your use cases.

Dynamic Masking

Dynamic data masking replaces or hides sensitive information in real time as it is accessed or queried without altering the original data in the database. Unauthorized users only see masked or redacted values. The original data remains stored securely.

Dynamic data masking is best suited for read-only applications.

In-Place Masking

In-place masking is a technique used to read from a target and then update it with masked data. This overwrites any sensitive information.

This masking technique is best used for traditional RDBMS lower environments such as Oracle, where the customer has significant storage constraints — for example, PaaS database environments.

Nulling Out or Deletion

Nulling out or deletion changes data characteristics and takes out any usefulness in data.

This technique is best used for data that is unnecessary for the testing and development use cases, for example large quantities historical data or journaling data that is not used for testing or development.

Obfuscation

Data obfuscation is a technique used to transform data into a non-recognizable and meaningless format while preserving the data structure. For instance, an account number `3456789456` could be converted to `abcd123xyz`.

Obfuscated data is best used for test cases where the data does not need to be meaningful or correspond to business logic rules. This type of obfuscation is often used for fields that are irrelevant to the core test scenario.

On-the-Fly Masking

On-the-fly masking refers to reading from a source (say production) and writing masked data into a target (usually non-production).

On-the-fly masking is best used for:

Populating a non-production data lake.
Sub-setting a much larger dataset.
When they want to establish an airgap separating production and non-production zones.

Redaction

Redaction is the removal or replacement of sensitive elements of structured or unstructured data with placeholders. Sensitive information such as names, addresses, or full credit card numbers are hidden, typically using characters like “X” or “*”. For example, an email `[email protected]` might be redacted to appear as `******@gmail.com`.

Redaction is a technique often used for end-user applications or document sharing. One common example is when redaction is used on large quantities of comment or text data that is not required for a particular test scenario.

Scrambling

Data scrambling is a masking technique that rearranges or shuffles the characters, digits, or other attributes of sensitive data to produce a non-usable but structurally similar output. Unlike obfuscation, scrambled data retains its original value length and data type, ensuring compatibility during software testing processes.

For example, a password `Pass1234` could be scrambled to `1sP2as34`. The scrambled data prevents unauthorized access but can retain simulated user context for testing purposes.

This technique is essentially another form of obfuscation, used for the same purpose.

Script-Based Masking

Often companies will build their own data masking scripts to mask small amounts of data for testing applications. This approach can work in a pinch but doesn’t scale to testing multiple applications or large databases.

Scripts are difficult and expensive to build and maintain and don’t offer the benefits of referential integrity across databases. And if the original script builder leaves the company, that knowledge leaves with them, which ultimately puts the organization back at square one.

Script-based masking is best used for small, localized tests for a single developer or unit test. It is highly unsuitable for full application testing due to the associated maintenance costs.

Shuffling

Shuffling reorders data within the same dataset or column to ensure the confidentiality of sensitive information. This method maintains realistic-looking data but mismatches the original relationships.

For example, employee IDs `101, 102, 103` and employee names `Alice, Bob, Claire` might be shuffled to match as `101-Claire, 102-Alice, 103-Bob`. Because relationships may be essential in testing environments, shuffling allows realistic data testing without exposing true values.

Shuffling is best used for cases where the merely breaking the relationships between data elements is sufficient. It is generally not recommended as the original data is still present.

Static Masking

Static data masking involves replacing sensitive data with fictitious yet realistic data directly at the source. This approach permanently alters the original values, making it impossible to revert to the unmasked data.

Static masking is best for use in non-production environments, such as development, testing, or analytics. It is ideal for creating full-sized, secure databases for performance testing.

To learn more, compare static data masking vs. dynamic data masking.

Static Data Masking vs. Dynamic Data Masking diagram.

Substitution

Substitution replaces sensitive data with fabricated data elements while maintaining the context and integrity of the dataset. For example, in a health record database, the medical history of “John Doe” might be substituted with fabricated yet plausible entries for “Jane Smith.”

This is one technique commonly used by static data masking software to create realistic but fictional data elements.

Tokenization

Tokenization involves replacing sensitive data elements with a non-sensitive equivalent, known as a token, that acts as a reference to the original data. The original data is stored securely in a separate location (e.g., a secure token vault).

For example, a customer’s credit card number `1234-5678-9012-3456` might be tokenized into `abcd-efgh-ijkl-mnop`. Unlike encryption, tokenization does not rely on mathematical algorithms, making it less vulnerable to sophisticated attacks.

This technique is commonly used to transfer data across geographic boundaries or to share data with third parties. The third party can perform whatever analysis is required, ship the data back, and the owner canre-identifythe tokenized elements in the safety of their own environment.

Variance

The data is changed based on the ranges defined. It can be useful in certain situations, e.g., where transactional data that is non-sensitive needs to be protected for aggregations or analytical purposes.

This technique is best used for scenarios where a range satisfies the business requirements. It also satisfies the particular regulation, for example, for HIPAA compliance where protecting key dates such as birth, death, admit, or discharge dates is required to maintain patient anonymity.

Techniques for Masking Data Sources

Your approach to data masking may vary based on the data source(s) you need to mask. Here are some examples of masking data sources.

Most enterprises have both Oracle and SQL databases. There are built-in options for Oracle data masking and SQL Server data masking. However, the built-in options will only work for those data sources. We recommend implementing a solution like Perforce Delphix data masking that can support all data sources, including Oracle and SQL.

In addition to databases, there are other sources to consider.

AI and analytics sources create unique challenges in masking. These can include Snowflake, Databricks, Microsoft Azure, and Microsoft Fabric, among others. A solution like Delphix Compliance Services can be used to mask these sources. Watch an on-demand webinar to learn more about how Delphix can help you accelerate data masking at scale for Microsoft Azure.

For business applications like Salesforce, it’s important to choose the right tool. Salesforce data masking is easier through Delphix’s Salesforce connector.

What other sources do you need to mask? Learn more about Delphix integrations for your key sources.

Five Approaches to Data Masking

Dive deep with our experts on five approaches to data masking: dynamic, script-based, encryption, static, and synthetic.

Why Use Data Masking Methods & Techniques?

Using data masking methods and techniques is critical to:

Eliminate data risks by delivering compliant data.
Remove data bottlenecks and speed up velocity.
Improve software quality through the use of realistic data.

Enterprise leaders need masking insights to keep pace with their peers. In a recent on-demand webinar, our team of experts shared insights from 250 enterprises on how they’re masking data today. Watch the webinar to keep pace.

For DevOps and CI/CD pipelines, the best way to eliminate data exposure is by automating the discovery of sensitive data and masking it. Hear from our masking experts on how to get started.

There are also unique cases by industry. For example, in the healthcare industry, there’s a need to de-identify Cognizant TriZetto healthcare data and healthcare EDI files in order to protect sensitive data. Dive deeper into the complexities of data masking in healthcare with help from our experts.

Developers experience the benefits of data masking, too. They get what they need — quality, masked data — quickly, so they can get back to work faster.

Best Tools to Support Masking

Looking for the best tools to enhance your data masking methods? You can start by referencing the Gorilla Guide to Data Masking. This guide breaks down why it’s important and what to look for in a solution.

We may be biased, but you don’t need to look very far. The best tool to support data masking in the enterprise is Delphix. And we’re not the only ones who say so.

In an excerpt from a 2022 announcement:

We’re proud to announce that, based entirely on 18 months’ worth of customer reviews on Gartner Peer Insights, Delphix achieved the highest Overall Rating of 4.5/5, based on 33 reviews as of 30 April 2022, among 14 vendors in the data masking market segment. Delphix is peer-recognized as a Customers’ Choice based on our Overall Rating and User Interest and Adoption. 94% of our customers would recommend Delphix, with one $10-30 billion banking customer calling our Continuous Compliance data masking solution the “Ultimate Solution For All Data Masking Challenges.

What makes Delphix unique?

One feature is Delphix’s data masking algorithms, like Secure Lookup. This is designed to irreversibly mask sensitive data while maintaining consistency across datasets. This approach ensures that masked data remains functional and valid for essential operations like application testing or analytics.

Another is the availability of Delphix APIs to allow IT teams to configure and execute masking procedures programmatically, which ensures accuracy while saving countless hours of manual effort. This can even be used to make file masking simple!

Ready to learn more? Discover:

Ways to identify confidential information and provide an enterprise-wide view of risk.
Approaches for deploying and customizing masking algorithm frameworks.
How to integrate data masking into critical business workflows via APIs.

Get the Delphix Masking Guide

Industry Examples

Financial Services

Masking is critical in the financial services industry, where you can’t afford to let your customers’ personally identifiable information (PII) get into the wrong hands.

That’s why some of the world’s largest financial services organizations choose Delphix to mask data to reduce risk, move faster, and improve quality. Here are just a few examples.

The Boeing Employee Credit Union (BECU) chose Delphix to solve their challenges in discovering and masking data efficiently. By using Delphix, they were able to mask 680 million data rows in just 15 hours. As a result, over 200 developers get self-serviced data. Read BECU’s story >>

Like BECU, an anonymous financial services institution needed to mask data faster. By using Delphix, they reduced data processing time from 30 days to 10 minutes. Read the anonymous story >>

Virgin Money needed to deliver technological changes quickly to market, but it was expensive and time-consuming to refresh environments for masking. By using Delphix, they can get a fully functional environment up within hours or days. On top of that, they can provision a large number of databases in minutes. Watch Virgin Money’s testimonial >>

Like Virgin Money, Worldpay by FIS found it was 7x faster to refresh test environments by using Delphix. On top of that, they reduced test data storage by 75-80%. Watch Worldpay’s testimonial >>

Insurance

In the insurance industry, it’s critical that protected health information (PHI) stays protected. But this can’t come at the cost of development speed. That’s why masking with virtualization from Delphix is widely adopted.

For Delta Dental, it used to take 8 weeks to extract data. On top of that, it was difficult to protect data and ensure compliance. By using Delphix, they can mask data and deliver virtual data copies to a team of 200 developers in minutes. Read Delta Dental’s story >>

Tokio Marine needed a solution for data masking that would also allow them to optimize their environments. By choosing Delphix, they were able to reduce non-production storage by 85%. Watch Tokio Marine’s testimonial >>

Telecom

Telecom runs 24/7. For Proximus, their testers need to be able to do their job without waiting for data. With Delphix, data masking time was reduced by 97%! Watch Proximus’s testimonial >>

Hospitality

In the hospitality industry, delivering a positive customer experience is critical. A big part of that is protecting consumer data. With Delphix, they can!

Choice Hotels found that by using Delphix, they can rest easy that data is masked and PII is protected. On top of that, Delphix helped them virtualize databases in minutes and speed up development cycles. Watch Choice Hotels’s testimonial >>

Education

In the education industry, there can be rigorous security requirements. That’s why Cal State University worked with Unisys to select Delphix.

“We deployed Delphix data masking that helped the entire data to be masked and protected. And with Delphix and its APIs, we were able to integrate that into our core pipelines and deliver to our developers much more agility and scalability and give the users a better experience."
– Raj Singh, Chief Architect of Strategic Accounts, Unisys

Read Cal State’s Story

Get Started

About Perforce Delphix

With Perforce Delphix, you don’t need to choose between compliance, quality, and speed. You can mask data for compliance, improve the quality of software with realistic test data, and deliver that compliant data to the teams who need it fast.

That’s the power of the Delphix DevOps Data Platform.

Delphix for Masking

Delphix data masking makes it easy to discover sensitive data automatically.

After that sensitive data is discovered, Delphix masks it in compliance with regulations such as GDPR, CCPA, HIPAA, and PCI DSS. And because masking transforms sensitive information, Delphix neutralizes the risk of breach in non-production environments that contain vast amounts of data that must be protected from cyberthreats. 

See it in action for yourself in the following recorded demos of Delphix's approach to data discovery and masking.

Need to mask specific data sources? Start with these quick demo videos featuring Delphix:

Want a custom demo for your organization? Get in touch with our team of experts.

Request Masking Demo

By Need

By Industry

Featured Product

Own Your Creative Workflows

2025 State of Data Compliance and Security