Enterprise-wide data security is challenging, but policy-driven data obfuscation can help you meet those challenges.
So, What Is Data Obfuscation?
Data obfuscation is the process of replacing production data with realistic, yet fictitious values. This helps you protect sensitive data.
Why Is Data Obfuscation Important?
One of the most common security issues that enterprise software teams overlook is securing their non-production environments, which is where the majority of sensitive data lies within an enterprise.
Non-production environments are where software teams develop and test application changes using copies of real customer and company data, and these lower environments far outnumber the highly visible production environments. For every production instance of an application, there are at least 10 non-production copies. Put simply, it’s the larger part of IT that is not visible to the rest of the world.
As a result, the risk of data loss increases with every piece of sensitive data that makes it outside the production zone.
That's why data obfuscation is important.
What's the Best Approach to Data Obfuscation?
The best approach to data obfuscation is a policy-driven one, implemented through data masking. Data masking is typically done while provisioning non-production environments, so the copies of data created to support test and development do not expose sensitive information.
Unlike encryption, homegrown scripts or even synthetic data, an advanced, powerful PII data masking technology can do the following:
- Automatically identify sensitive data.
- Irreversibly transform the data so it cannot be restored to its original, sensitive state.
- Make testing feasible with realistic, but fictitious data while providing zero value to thieves and hackers.
- Offer extensibility and flexibility so businesses can customize the solution for the wide variety of data sources they depend on.
- Preserve referential integrity for important data relationships.
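Two of the properties above, realistic but fictitious values and preserved referential integrity, can be illustrated with a minimal Python sketch. It is not how any particular masking product works: the key, the lookup list, and the table layouts are all hypothetical. The idea is that a keyed hash picks a replacement from a list of fake names, so the same real name maps to the same fake name in every table.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be kept outside the masked environment.
SECRET_KEY = b"example-masking-key"

# Small illustrative lookup list of fictitious replacement values.
FAKE_NAMES = ["Alex Morgan", "Jamie Chen", "Riley Patel", "Taylor Brooks", "Casey Nguyen"]


def mask_name(real_name: str) -> str:
    """Map a real name to a fictitious one; the same input always gives the same output."""
    digest = hmac.new(SECRET_KEY, real_name.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]


# Two hypothetical tables that are related through the customer name.
customers = [{"id": 1, "name": "Maria Lopez"}, {"id": 2, "name": "John Smith"}]
orders = [
    {"order_id": 100, "customer_name": "Maria Lopez"},
    {"order_id": 101, "customer_name": "John Smith"},
    {"order_id": 102, "customer_name": "Maria Lopez"},
]

# Mask each table independently; the relationship between them survives.
masked_customers = [{**c, "name": mask_name(c["name"])} for c in customers]
masked_orders = [{**o, "customer_name": mask_name(o["customer_name"])} for o in orders]
```

Because the masked value is drawn from a fixed list rather than derived by a reversible transformation, the output on its own does not carry the original name, while joins between the two masked tables still line up.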
3 Aspects of Policy-Driven Data Obfuscation
Here are three aspects of an optimal policy-driven approach to data obfuscation to protect sensitive data.
1. Prioritized List of Sensitive Data Types
The first step in securing an organization’s most sensitive data is to understand what that data is and where it lies across the enterprise. From a program perspective, automating data discovery (which we refer to as profiling) provides a consistent, algorithm-driven method of identifying sensitive data across the organization.
Unlike many other data platforms in the market, Perforce Delphix uses what we call profile sets to define what type of data you might consider sensitive and would like to identify throughout all of the various data sources and environments. It’s specifically designed to locate and identify where sensitive data resides within complex tables and fields to help save time and effort, ultimately speeding up implementation.
2. Standard Method That Can Be Used Across the Enterprise
The next question to be answered is how. You can achieve a reliable and consistent set of masked values by standardizing a method that transforms the identified sensitive data types. If data integration between applications is important, choose a deterministic method, meaning the same input always produces the same masked output, so that both content and data format stay consistent.
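As a rough illustration of a deterministic, format-preserving method, the Python sketch below replaces each digit of a value with a digit derived from a keyed hash and leaves punctuation in place. The key and function name are hypothetical, and a real masking algorithm would be more careful (for example, about the slight bias in the digit selection used here).

```python
import hashlib
import hmac

# Hypothetical shared key; every application using the standard method would use the same one.
SECRET_KEY = b"example-masking-key"


def mask_digits(value: str) -> str:
    """Replace each digit with a keyed pseudo-random digit, keeping length and punctuation."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    used = 0  # how many digest bytes have been consumed so far
    for ch in value:
        if ch in "0123456789":
            masked.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            masked.append(ch)  # separators such as "-" are left untouched
    return "".join(masked)


masked_ssn = mask_digits("123-45-6789")
```

Two applications that mask the same value with the same key independently arrive at the same result, which is what keeps integrated data sets consistent after masking.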
3. Sufficient Resources to Build, Execute, and Support the Plan
The infrastructure team will play a key role in implementing the tools and processes that will be used. Establishing this resource is critical to the overall success of the project, as it will support all applications onboarded to the masking platform. The team will also need deep knowledge of obfuscation tools and techniques and of data storage technologies.
The other option is a distributed data obfuscation program. This model is the simplest to implement: each business line is given the directive to obfuscate data in all lower environments, but with little guidance and few standard tools. This method results in faster implementation but forges siloed processes that rarely support integration. Organizations can decide to take this approach as a short-term solution and retool later to meet long-term goals.
The 2024 State of Data Compliance and Security Report
75% of organizations say that sensitive data volume in non-production has increased. 91% are concerned about the expanded exposure footprint. Find out what you can do about it when you get the insights from 250 global leaders in this report.
How Data Obfuscation Works with Delphix
Delphix makes it easier to adopt a policy-driven data obfuscation approach. Here's how it works with the Delphix DevOps Data Platform.
Deciding what sensitive data you will mask and how you will mask it is the most important step in kicking off the program. The next step is to build your profile set, which is a grouping of search expressions used to search your databases to identify columns and fields containing sensitive data. The search can be performed using metadata or by sampling the table or file data.
Column-level metadata search should be performed as the primary search method since the result will be returned much faster. If your schemas employ uncommon column names, it will be necessary to perform a data level scan. It’s not uncommon to employ both techniques.
When using the Delphix profiler, a successful scan results in assignment of a domain (PII type) and algorithm (method). The profiler employs Java regular expressions to find sensitive data by scanning column names and, if necessary, the column data contained in the database. Delphix data masking provides over 50 out-of-the-box search expressions ready to use for profiling. For each expression, you will see a domain (data type), expression name, expression level, and expression text. The expression level determines whether the profiler will search through column names in the schema or data within the tables.
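The two-level search described above can be sketched in a few lines. This is a Python illustration of the general technique, not the Delphix profiler or its API (which uses Java regular expressions); the expressions, field names, and sample table are all made up for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchExpression:
    domain: str   # data type the expression identifies, e.g. "EMAIL"
    name: str     # human-readable expression name
    level: str    # "column" searches column names, "data" searches sampled values
    pattern: str  # regular expression text


# Hypothetical expressions; a real profile set would contain many more.
EXPRESSIONS = [
    SearchExpression("EMAIL", "Email column name", "column", r"e[-_ ]?mail"),
    SearchExpression("SSN", "SSN column name", "column", r"ssn|social"),
    SearchExpression("EMAIL", "Email data", "data", r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    SearchExpression("SSN", "SSN data", "data", r"^\d{3}-\d{2}-\d{4}$"),
]


def profile_column(column_name, sample_values):
    """Return the domain matched for a column, or None if nothing matches."""
    # Pass 1: column-level (metadata) search, which is the faster check.
    for expr in EXPRESSIONS:
        if expr.level == "column" and re.search(expr.pattern, column_name, re.IGNORECASE):
            return expr.domain
    # Pass 2: data-level search over sampled values, for uncommon column names.
    values = [v for v in sample_values if v]
    for expr in EXPRESSIONS:
        if expr.level == "data" and values and all(re.search(expr.pattern, v) for v in values):
            return expr.domain
    return None


# Hypothetical table: column name -> sampled values.
table = {
    "email_address": ["a@example.com"],
    "col_17": ["123-45-6789", "987-65-4321"],
    "notes": ["call back Tuesday"],
}
inventory = {col: profile_column(col, vals) for col, vals in table.items()}
print(inventory)  # {'email_address': 'EMAIL', 'col_17': 'SSN', 'notes': None}
```

The column named `col_17` is only caught by the data-level pass, which is the situation where an uncommon schema naming convention makes sampling necessary.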
From there, the domain ties the search expression to an algorithm. The masking security policy is implemented by grouping the search expressions into the profiler set, and these constructs that implement the security policy within Delphix masking are managed in the settings tab of the UI.
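One way to picture the relationship between these constructs is as a small data model. The Python sketch below is only an illustration: the class names, expression names, and algorithm names are hypothetical and do not correspond to Delphix objects or settings.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Domain:
    name: str       # sensitive data type, e.g. "EMAIL"
    algorithm: str  # masking method assigned to that data type


@dataclass
class ProfileSet:
    name: str
    # Maps each search expression name to the domain it identifies.
    expressions: dict = field(default_factory=dict)


# Hypothetical policy: each domain is tied to exactly one algorithm.
DOMAINS = {
    "EMAIL": Domain("EMAIL", "email_lookup"),
    "SSN": Domain("SSN", "ssn_segment_map"),
}

policy = ProfileSet(
    name="core-pii",
    expressions={"Email column name": "EMAIL", "SSN data": "SSN"},
)


def algorithm_for(profile_set: ProfileSet, expression_name: str) -> str:
    """Follow expression -> domain -> algorithm for a matched expression."""
    domain_name = profile_set.expressions[expression_name]
    return DOMAINS[domain_name].algorithm


assigned = algorithm_for(policy, "SSN data")
```

Keeping the algorithm on the domain rather than on each expression means a policy change, such as switching how email addresses are masked, is made in one place.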
The Delphix Masking Engine is pre-configured with two common use case profiler sets: financial and healthcare (HIPAA). These profiler sets contain a superset of common domains for each use case. I recommend that you review the expressions and domains included in these profiler sets as examples of common data types to include in your masking security policy.
It’s best to start with a smaller set (10 or fewer) of your most sensitive domains and evaluate the inventories produced. The profiler can be run again and again with additional expressions to rebuild the masking inventory.
Once the profiler has run, the inventory can be evaluated to make sure sensitive columns have been identified and the correct algorithm assigned. This evaluation is best performed by exporting the inventory to a CSV file (export button in inventory tab) and seeking feedback from the application team.
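For teams that assemble a comparable review sheet themselves, the sketch below writes an inventory to CSV using Python's standard `csv` module. The column headings and rows are hypothetical and are not the format of the actual Delphix export.

```python
import csv
import io

# Hypothetical inventory rows; an empty domain marks a column the profiler did not flag.
inventory = [
    {"table": "CUSTOMERS", "column": "EMAIL_ADDR", "domain": "EMAIL", "algorithm": "email_lookup"},
    {"table": "CUSTOMERS", "column": "TAX_ID", "domain": "SSN", "algorithm": "ssn_segment_map"},
    {"table": "CUSTOMERS", "column": "NOTES", "domain": "", "algorithm": ""},
]

FIELDNAMES = ["table", "column", "domain", "algorithm"]


def export_inventory(rows, stream):
    """Write the inventory as CSV so the application team can review it in a spreadsheet."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# An in-memory buffer is used here; a file opened with newline="" works the same way.
buffer = io.StringIO()
export_inventory(inventory, buffer)
csv_text = buffer.getvalue()
```

Including unflagged columns in the sheet gives the application team a chance to point out sensitive fields the profiler missed, not just to confirm the ones it found.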
This spreadsheet (below) along with the masking policy spreadsheet above will provide the application SME with the “what” and “how.” An initial check by the application team and resulting feedback can save time during the masking process.
Sample Inventory Export
Make Data Obfuscation Easier
Automating the discovery of sensitive data, masking that data, and distributing it quickly and securely to internal and external stakeholders is key to mitigating risk within your enterprise. It is a more workable alternative to simply locking the data down to protect it from unauthorized access. A policy-driven masking program can also speed up internal adoption.
Delphix delivers data masking capabilities that enable businesses to mitigate risk and eliminate barriers to fast innovation. Delphix automatically discovers sensitive data values including names, email addresses, and payment information. Then, it transforms sensitive values into realistic, yet fictitious ones — while retaining referential integrity.
A recent IDC study found that 77.2% more data and data environments were masked and protected by using Delphix. That's data obfuscation made easier.
Get Started with Data Masking
Try Delphix data masking and see how Delphix enables fast, automated compliance. Request a no-pressure compliance demo today. You’ll find out why industry leaders choose Delphix for policy-driven data obfuscation.