Administration
Datasource Management
Adding a Connection
Click the Add Data Source Connection button. This will present a popup form allowing you to choose your Data Source Type and fill in associated details.
The example below is for MSSQL but will differ depending on the data source type you choose.
Type | Description |
---|---|
Source Type | Database or File type (e.g., Oracle, MSSQL, Delimited File, etc). |
Connection Name | The name you want to provide the data source connection. |
Labels | One-word tags attached to the object for easier search and filtering. |
Status | Current Status of the Data Connection. |
Assigned To | User Group which can edit this data connection. |
Notification | List of users who will be notified of execution results. |
Agent | The worker agent ("Worker Bee") that will be used for the data source (requires separate setup). |
Server | The database IP Address or Hostname. |
Username | The database username. |
Password | The database password. |
Port | The port the database is running on. |
Authentication Mechanism | Either "Username and password" or "SSL". Supported for MSSQL, MySQL and PostgreSQL. |
Note: Please ensure that details are entered correctly as they are case sensitive.
You can then optionally click Check Connection to validate the details are correct, and/or click Save.
Editing a Connection
Click the Edit button to edit the Data Source. This will popup a similar form to the Add form but with existing details available for viewing/editing.
Cloning a Connection
Click the Clone Icon to create a copy of the Data Source. A popup will be presented with the copied information. Press Save to save the new cloned data source. Please note that the password must be entered manually; it is not copied as part of the clone feature.
Deleting a Connection
Click the Delete button to delete the Data Source. A popup will be presented to confirm deletion.
Check/Test Connection
Click the Check Connection button to test that the database or file can be reached and accessed using the details provided.
A green success message will be displayed if the connection was successful, or a red error message if it was not.
Note: If the connection fails, the details entered in the form may still be correct; the database itself may be shut down or otherwise inaccessible.
Data Library
Overview of Data Configuration Library
In Enov8's test data management suite, patterns are the data structures by which personally identifiable information (PII) is categorised. For example, a pattern could be a first name, credit card information or a tax file number.
These patterns are crucial for profiling the data, as they help in distinguishing sensitive information from non-sensitive elements. Once these patterns are identified, masking techniques come into play. Masking involves replicating the identified patterns using functions that transform the original PII data into dummy or anonymised equivalents. For example, a function might replace actual credit card numbers with randomly generated, but structurally valid, dummy credit card numbers. Similarly, first names could be replaced with fictional names while preserving the original format.
Note: Out of the box data library objects are view only.
Pattern Table Mapping
Pattern Name | Description | Key Information |
---|---|---|
Many To One | Produces a value from a lookup file, using a hash to generate the value. | - Not unique and utilises hashing |
Fixed Value | A value will be provided by the user, and which will replace all source values for that column. | - User defined variable and replaces all values with the parameter |
Unique ID | A unique value will be generated for each source value found. The value will keep the same length and case as the source value. | - Unique data; preserves data length and case |
Hash last 4 digits | Hashes the last 4 digits and pads 0s to the front if the hashed value is less than 1000. | - Utilises hashing |
Date Many to One | Only days are masked, and the rest of the ‘date’ is preserved. | - Many to one |
Multi PII | Masks each value in the specific field with the required category and defined parameter. Best for when multiple values are placed in a single field. | - For multiple values |
Replace with empty | Replaces all values with an empty string. | - Clears whole column |
Replace with null | Replaces all values with “null”. | - Nulls whole column |
Name char replace | Replaces the source value with the user-specified character. It also removes all non-letters. For example, Jake -> Xxxx. | - Replaces source value with specified character |
Hash one to one | Converts the source value to a byte sequence and calculates the checksum twice. It essentially performs a two-step CRC calculation. | - CRC Calculation |
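To make the "Hash last 4 digits" pattern above concrete, here is a minimal Python sketch. This is an illustration only, not the product's actual implementation; the hash algorithm TDM uses is not documented here, so SHA-256 is an assumed stand-in.

```python
import hashlib

def hash_last_4_digits(value: str) -> str:
    """Illustrative sketch of the 'Hash last 4 digits' pattern:
    hash the last 4 digits and zero-pad so the result stays 4 digits.
    (SHA-256 is an assumed stand-in for TDM's actual hash.)"""
    head, last4 = value[:-4], value[-4:]
    # Reduce the hash to a number in the range 0-9999
    hashed = int(hashlib.sha256(last4.encode()).hexdigest(), 16) % 10000
    # zfill pads leading zeros when the hashed number is less than 1000
    return head + str(hashed).zfill(4)
```

For example, `hash_last_4_digits("4111111111111234")` keeps the first 12 characters unchanged and replaces the last 4 with a consistently hashed 4-digit value.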
Keywords
Keyword | Definition |
---|---|
Lookup File | A file that contains a large amount of data for a specific category (for example 10,000 first names) |
Function | A Python script that generates the value algorithmically |
Custom Function | User-defined Python script that can manipulate the source value |
Profiling
Profiling pattern objects are utilised to identify data content (database records) based on three different pattern types:

- Keyword: Uses a list of keywords separated by a pipe ("|") to identify data records that match the keywords. (e.g. SMITH|WILLIAMS|MOORE|ZHANG|MACDONALD)
- RegEx: Identifies data rows that match a regular expression.
- Function: Identifies data rows that match a Python function.

In addition to database content profiling, the platform also allows scanning for database column names (Meta) that match a keyword. This is generally suggested as a means of identifying tables to exclude from profiling (such as system tables or temp tables) for very large databases.

- Meta: Identifies database column names that match a string of keywords.
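To make the content pattern types concrete, the sketch below shows how Keyword and RegEx matching might flag sensitive rows. This is an illustrative Python sketch, not the platform's internal logic, and the tax-file-number regex is an assumed example format.

```python
import re

# Keyword pattern: a pipe-separated keyword list, as in the example above
keywords = set("SMITH|WILLIAMS|MOORE|ZHANG|MACDONALD".split("|"))

# RegEx pattern: an assumed nnn-nnn-nnn number format, for illustration only
tfn_pattern = re.compile(r"\b\d{3}-\d{3}-\d{3}\b")

def profile_row(value: str) -> list:
    """Return which pattern types flag this value as potentially sensitive."""
    hits = []
    # Keyword match: any word in the row appears in the keyword list
    if any(word in keywords for word in value.upper().split()):
        hits.append("Keyword")
    # RegEx match: the row contains a substring matching the expression
    if tfn_pattern.search(value):
        hits.append("RegEx")
    return hits
```

For example, `profile_row("John SMITH")` returns `["Keyword"]`, while a row containing `123-456-789` is flagged by the RegEx pattern.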
Important:

- Profiling by Meta only is not recommended, as database column naming conventions can vary and are not a reliable indication of whether the data content in the column is sensitive PII.
- Columns that match based on Meta only are not automatically added to the Masking or Validation configuration.
Masking
Masking data library objects are made up of two different types.

- Function: A masking function for replacing the source data. A masking function can operate in many different ways, for example:
  1. Replacing the original production value with a lookup file value;
  2. Generating a replacement value on the fly based on the original production value;
  3. Replacing all values in the column with a fixed value;
  4. Nullifying the column if the database column supports a null value.

  Note: Methods 1 & 2 utilise a hashing algorithm; this ensures the replacement value is consistent across tables and databases as long as the original production value is the same.

- Lookup: A lookup file which contains replacement values used by a corresponding masking Function (e.g. Function method 1).
Note: Lookup tables can also be created to add synthetic test data into the database such as real test email addresses.
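The hashing behaviour described above can be sketched as follows: a stable hash of the source value selects an index into the lookup file, so the same production value always maps to the same replacement, wherever it appears. This is a minimal illustration; MD5 is an assumed stand-in for whatever hash TDM actually uses, and the lookup list is a stand-in for a real lookup file.

```python
import hashlib

def many_to_one(source_val: str, lookup: list) -> str:
    """Map a source value to a lookup replacement via a stable hash,
    so identical source values always receive identical replacements."""
    # A stable (non-randomised) hash keeps results consistent across runs
    digest = int(hashlib.md5(source_val.encode()).hexdigest(), 16)
    return lookup[digest % len(lookup)]

# Stand-in for a first-name lookup file
first_names = ["ALEX", "SAM", "JORDAN"]
```

Because the index is derived from the source value itself, masking "JOHN" in one table and "JOHN" in another database yields the same replacement, preserving referential consistency.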
Validation
Validation data library objects are made up of patterns only. These are used to confirm, after masking has completed, whether any production patterns still exist in the masked data, which would indicate that real production data remains.

- Validate: A pattern made up of a distribution of data values between most common and least common.
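A distribution pattern of this kind can be pictured as a frequency profile of a column's values. The sketch below computes such a profile, ordered from most to least common; it is illustrative only and not TDM's internal representation.

```python
from collections import Counter

def value_distribution(column_values):
    """Return (value, count) pairs ordered from most common to least
    common -- the shape a validation distribution pattern captures."""
    counts = Counter(column_values)
    return counts.most_common()
```

Comparing the distribution of a masked column against a known production distribution can then flag leftover production data after masking.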
Custom Masking Functions
TDM now allows users to utilise their own Python functions to generate replacement (masked) data. This functionality supports lookups, parameters, or a simple source-value-only transformation.
To create a custom function, navigate to Config Library Management. This can be done by going to Data Management ➨ Administration ➨ Config Library Management.
Next, click on the “add” button. Then, change the Type to “Mask” to create a masking pattern. The following window should appear:
Configurable Fields
Field | Definition |
---|---|
Keyword | The name of the pattern. This should also be the name of the function. |
Category | The category of the pattern, e.g. Firstname, Surname, etc. |
Country | The country the pattern should belong to. |
Language | The language of the pattern respective to the country. |
Type | The scan types the pattern will be used for. |
Sub Type | The type of the pattern. This should be “Function” to utilise custom functions. |
Value (the right text editor) | The editor where the function should be written or pasted. |
Note: The function name should be the same as the Keyword.
Templates
Additionally, we recommend copying content from the template and modifying it as needed. The templates offer examples for simple source value changes, lookup utilisation, and parameter usage.
Note: It is best practice to load these templates and then modify the code.
Parameter vs Lookup
Parameters and lookups can be enabled by selecting them from the checklist. If both options are selected, you can use both features simultaneously.
Parameter masking involves using an argument in your function to manipulate the input data. This will also need to be added in the "Parameter" field in your masking config. Conversely, lookups refer to files that the function can utilise to mask the data. Both parameters and lookups can be used together within a single function.
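Based on the template examples later in this section, the function signature changes with the options selected. The sketch below shows the four shapes; the argument names source_val, params and lookup follow those templates, but treat the exact signatures (and argument order for the combined case) as assumptions to confirm against the loaded template.

```python
def source_only(source_val):
    # Neither option ticked: only the source value is passed in
    return source_val[::-1]

def with_param(source_val, params):
    # "Parameter" ticked: the Parameter field of the masking config arrives as params
    return str(source_val) + str(params)

def with_lookup(source_val, lookup):
    # "Lookup" ticked: the lookup file contents arrive as a list
    return lookup[len(str(source_val)) % len(lookup)]

def with_both(source_val, params, lookup):
    # Both ticked (assumed argument order): parameter and lookup both available
    return str(params) + lookup[0] + str(source_val)
```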
Test
Next, to test whether the function is working as required, use the Test button, providing an example source value. This will return the masked value produced by your algorithm.
Standard Setup Example
Let's create an example custom function that shuffles the source value.
Source Value: Source value to be masked
Return Value: Return the shuffled value
Step 1. Change the Type to "Mask"
Step 2. Import the template
Step 3. Let's write some code to shuffle the source value.
Note: the function name must be the same as the keyword
```python
# Import supported modules here
import random

def shuffle_function(source_val):
    # Change source_val into a list
    source_list = list(source_val)
    # Shuffle the list of characters
    random.shuffle(source_list)
    # Join the shuffled list back into a string
    masked_val = ''.join(source_list)
    # Return the new masked value
    return masked_val
```
Step 4. Let's test the function using the source value 12345
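The same check can be reproduced locally before pasting the function into TDM. The sketch below calls shuffle_function with the source value 12345; since shuffling is random, the exact output varies, but the result is always a permutation of the input.

```python
import random

def shuffle_function(source_val):
    # Shuffle the characters of the source value into a random order
    source_list = list(source_val)
    random.shuffle(source_list)
    return ''.join(source_list)

# The masked value keeps the same characters and length, in a random order
masked = shuffle_function("12345")
```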
Parameter Setup Example
Let's create an example parameter-based custom masking function. For this example, let's generate a value based on a regex and then append it to the source value.
Parameter/Input: Regex
Return Value: Masked value built from the regex
Step 1. Change the Type to "Mask"
Step 2. Import the template
Step 3. Let's write some code to generate a value based on a regex.
Note: the function name must be the same as the keyword
```python
# Import supported modules here
import rstr

def regex_func(source_val, params):
    # Perform custom masking on source_val using the additional input parameter
    # Set the regex variable to the parameter
    regex = params
    # Generate a random string matching the regex
    regex_value = rstr.xeger(regex)
    # Append the generated regex value to the source value
    masked_val = str(source_val) + regex_value
    # Return new masked value
    return masked_val
```
Step 4. Let's test the function using a Visa regex: ^4[0-9]{12}(?:[0-9]{3})?$
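If you want to sanity-check the pattern locally, Python's standard re module can confirm what a value generated from this regex should look like. The check below uses a hand-built sample value rather than rstr (which is a third-party package), so it runs with the standard library only.

```python
import re

# The Visa regex from the test step above
visa_regex = r"^4[0-9]{12}(?:[0-9]{3})?$"

# A hand-built sample in Visa format: a leading 4 followed by 12 digits
sample = "4123456789012"

# Values generated by rstr.xeger(visa_regex) should satisfy this same check
is_visa_format = re.match(visa_regex, sample) is not None
```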
Lookup Setup Example
Let's create an example lookup-based custom masking function. For this example, let's take a balance value and add a random lookup-based amount to it.
Source Value: Source balance value
Lookup Value: Lookup list of random balance amounts
Return Value: Masked balance value
Step 1. Change the Type to "Mask"
Step 2. Import the template
Step 3. Let's write some code to add a lookup amount to a balance
Note: the function name must be the same as the keyword
```python
# Import supported modules here
from math import fmod

def balance_add(source_val, lookup):
    # Use custom logic to calculate a lookup index based on the source value
    hash_index = abs(hash(source_val))
    # Index into the 0-based lookup list
    lookup_index = int(fmod(hash_index, len(lookup)))
    # Obtain the masked value from the lookup using the index
    lookup_val = lookup[lookup_index]
    masked_val = int(lookup_val) + int(source_val)
    # Return new masked value
    return masked_val
```
Step 4. Let's test the function using a sample source value and a balance amount lookup (this is a custom lookup)
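This test can also be reproduced locally with a small stand-in lookup list. Note the index arithmetic here assumes a 0-based Python list, and the integer source value keeps the example deterministic because CPython hashes small integers to themselves.

```python
from math import fmod

def balance_add(source_val, lookup):
    # hash(n) == n for small ints in CPython, so this is stable for integer input
    hash_index = abs(hash(source_val))
    # Index into the 0-based lookup list
    lookup_index = int(fmod(hash_index, len(lookup)))
    return int(lookup[lookup_index]) + int(source_val)

# Stand-in for a custom lookup file of balance amounts
balance_lookup = ["100", "250", "500"]

# 500 hashes to 500; 500 mod 3 = 2, so lookup entry "500" is added: 500 + 500
masked = balance_add(500, balance_lookup)
```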
Validation Pattern Management
Validation patterns for the data library can be created by connecting TDM to a copy of sensitive production data and running the Administration > Validation Pattern Management feature within TDM.
From Administration > Validation Pattern Management click on Execute Pattern Search.
Select the Data Source, Table Name and Column Name you want to perform a Pattern Search on.
Once the search has completed, indicated by the "Passed" status, click on the View button.
This will show the pattern search results. A validation distribution pattern can be created by clicking "Save As Pattern".
This will open up a form to enter a name for the pattern and category.
Once saved, navigate to Administration > Config Library Management and filter the grid by typing "Validate" in the Config Library search box.
You should be able to View, Edit and Delete the new validation pattern created.