Validation
Overview of Validation
The last step of the process is to Validate (Audit) your Data Source to see if production patterns are still present (Indicating whether unmasked production data may still be present in the data or not). After successfully profiling the data a draft configuration file is automatically created for review prior to execution.
Validation Configuration
The rulesets and patterns scanned as part of the validation process are defined in Validation Configurations.
NOTE: Support for validation configuration will be discontinued in future releases. Users are highly encouraged to use masking configurations for performing validation scans.
For TDM version 3.5:
Navigate to Data Management Hub > Manage Configurations > Validation Configurations. Here you can Add, Edit, Clone or Delete a Validation Configuration.
For TDM versions 3.4 and earlier:
Navigate to Validation > Validate Configurations here you can Add, Edit, Clone or Delete a Validation Configuration.
Editing a Validation Configuration
As the masking process creates a draft Validation Configuration, click the Edit button on the validation configuration you want to edit. This will present the Validation Configuration form for editing the following information.
For TDM version 3.5:
For TDM versions 3.4 and earlier:
On the form you can edit the following configuration information.
Type | Description |
---|---|
Configuration Name | The validation configuration name. |
DB Source Connection | The data source the validation configuration is configured for. |
Default Threshold | Percentage or threshold data must meet to be flagged as PII found. |
Description | The validation configuration description or long name. |
Country | The country or language to which the respective configuration may apply. (This field can be null.) |
Notification | List of users who will be notified. |
Status | Set the validation profile to Active for use or Draft/Inactive to hide it from the Execute Validation area. |
Add/Edit Field(s) section: -
Type | Description |
---|---|
Category | The field masking category. (This filters the results visible in the Pattern drop list). |
Table Name | The table the column/field is on for masking. |
Column Name | The column/field for masking. |
PII Level | For viewing/setting the masking pattern as Primary PII, Secondary PII, or Other. |
Pattern | The pattern type (Masking function) as defined in the Data Library Administration area. |
Parameter | Some pattern types/masking functions may allow a parameter to be provided to customize the masking behavior. |
Once complete, scroll to the bottom of the main window and click Save to Save the configuration or Close to Cancel out without saving.
Note: Please ensure the status is set to Active for the Mask Configuration to be selectable in the Execute Masking screen.
Adding a Validation Configuration
On the Validation Configurations screen, click the Add Validate Configuration button. This will present a new Validation Configuration form
Once complete, scroll to the bottom of the main window and click Save to Save the configuration or Close to Cancel out without saving.
Cloning a Validation Configuration
On the Validation Configurations screen, click the Clone button on the Validation Configuration you want to duplicate. This will present the Validation Configuration form with a duplication of all data for saving as a new Configuration.
Once complete, scroll to the bottom of the main window and click Save to Save the configuration or Close to Cancel out without saving.
Deleting a Validation Configuration
On the Validation Configurations screen, click the Delete button on the Validation Configuration you want to remove. This will present a confirmation window confirming that you want to delete the Configuration or not.
Comparing Validation Configurations
Tick on the check boxes in front of the two configurations which you want to compare and then press the Config Comparison button to view the comparison report. The text highlighted in red marks the differences in the two configurations.
Execute Validation - Single Data Source
For TDM version 3.5+:
To execute a new validation request, navigate to Data Management Hub > Execution Console.
Here, you can select your data source then press View. Scroll down to the Execution Details section, select Validation in the first dropdown menu, and your chosen validation/mask configuration in the second dropdown menu. The Execute button can be used to then run a validation scan or View Results to simply view the results from the previously ran validation jobs.
IMPORTANT: Validation scans can now be performed on file type data connections. However, file validation ONLY supports masking configs. Users are also highly encouraged to use masking configs for a validation execution. More information on masking configurations can be found in Masking.
For TDM versions 3.4 and earlier:
To Execute a new validation request, Navigate to Validation > Execute Validation
Here you can select a data source connection and validation Configuration for execution. Press Run button to run the selected validation configuration. On Run button press, the table list down will be refreshed with past executions and will also show the status of current execution request.
Execute Validation - Connection Groups
For TDM version 3.6:
In order to run a validate scan on a connection group, firstly ensure TDM Queue Manager is turned on. Then, navigate to Data Management Hub > Execution Console > Connection Group.
Here, select your Connection Group from the first dropdown menu, then select Validate for your scan type. After selecting all the fields, click on View.
You can then select the mask and validate configurations from the Data Configuration column for the specific Data Sources.
In order to run a bulk validate scan, select all the data sources that you want to execute the scan on and click on Execute.
Then, scroll to the bottom of the page to the Execution Details section to look at the scan status, and access the logs and reports.
Validation Log
In the Details column, the Log button can be used to view logs for the execution.
Note: The Log window refreshes every 15 seconds if the validation is currently executing.
A currently executing validation scan can also be cancelled by clicking the Cancel button.
Validation Report - Single Data Source
Successfully completed Validations will generate a Validation Report which can be viewed by clicking the Report button.
This can also be searched on or exported out to Excel or CSV.
Validation Report Table
Type | Description |
---|---|
Connection Name | The name of the data connection. |
Schema | The schema under which the table resides. |
Table Name | The name of the table containing the data. |
Column | The specific column within the table. |
Data Type | The type of data stored in the column (e.g., string, integer, date). |
Data Length | The maximum length of the data in the column. |
Masking Pattern Used | The pattern applied to mask the data in the column. |
Masking Lookup Used | Lookup table or source used during the masking process. |
Masking Parameters Used | Parameters applied during the masking process. |
Result | The outcome of the validation operation. |
Validation Score | The score indicating the effectiveness of the validation process. (%) |
Scan Duration (Seconds) | The duration taken to complete the scan, measured in seconds. |
Error Details | Details of any errors encountered for the respective column. |
Validation Report - Connection Groups
For grouped connections, each data source in the group will generate a Validation Report if successful, which can be viewed by clicking the Report button.
In order export the reports in bulk, navigate to the Execution Console section in the Connection Group tab, and select the data sources to include in the bulk report.
Click on the Generate Report button, then download the report by clicking on the popup window.
Important: Databases and files have different report structures, therefore their reports cannot be generated together in bulk. When selecting the data sources to include in the bulk report, make sure that files and database connections are not selected together.
Masking Configuration
Starting from TDM3.3+, validation can also be conducted using masking configurations, allowing comparison between the source values and the lookup tables used by each masking function.
However, some masking operations, such as those involving hashed or encrypted values, do not rely on lookup files and are not easily reversible. Consequently, these masking patterns are not supported for validation using masking configurations. The following tables outline the supported and unsupported patterns.
Masking Configuration
Starting from TDM3.3+, validation can also be conducted using masking configurations, allowing comparison between the source values and the lookup tables used by each masking function.
Some masking operations, like unique hashes or encryption, cannot be validated by our tool because it can't determine if those values were originally set by our masking function. For 'ManyToOne' functions, our tool can check the masked values against lookup tables, but for unique hashes, there's no way to confirm if they were applied by us. The tables below outline which masking patterns are supported and which are not.
Masking Pattern Support Matrix
Pattern Name | Category | Supported? | Reason Not Supported |
---|---|---|---|
TFN_ManyToOne | TFN | ✅ | |
ACN_ManyToOne | ACN | ✅ | |
ABN_ManyToOne | ABN | ✅ | |
Telephone_ManyToOne | Tel No. | ✅ | |
Telephone_FixedValue | Tel No. | ✅ | |
Telephone_HashLast4Digits | Tel No. | ❌ | Last 4 digits masked using a secure hash; verification not possible as values are not linked to lookup tables. |
Address_ManyToOne | Street Address | ✅ | |
Address_FixedValue | Street Address | ✅ | |
Email_ManyToOne | ✅ | ||
Email_FixedValue | ✅ | ||
Email_UniqueID | ❌ | The part before the '@' is replaced with a unique ID, making it impossible to match with the original value for validation. | |
Organisation_ManyToOne | Org Name | ✅ | |
Organisation_FixedValue | Org Name | ✅ | |
Surname_ManyToOne | Surname | ✅ | |
Surname_FixedValue | Surname | ✅ | |
Firstname_ManyToOne | Firstname | ✅ | |
Firstname_FixedValue | Firstname | ✅ | |
Fullname_All_ManyToOne | Fullname | ✅ | |
Fullname_FixedValue | Fullname | ✅ | |
NameCharReplace | Other | ✅ | |
Replace_With_Null | Other | ✅ | |
Replace_With_Empty | Other | ✅ | |
DOB_FixedValue | DOB | ✅ | |
DOB_DaysOnly_ManyToOne | DOB | ❌ | Day value is recalculated using a hash, making it impossible to verify against the original date. |
DOB_ManyToOne | DOB | ❌ | Date is modified using a lookup value to shift the date, which prevents matching it back to the original value. |
Others_FixedValue | Other | ✅ | |
Creditcard_FixedValue | CC Number | ✅ | |
Creditcard_ManyToOne | CC Number | ❌ | Middle digits are masked using a hash and recalculated with Luhn, making it impossible to verify the original value. |
Others_MultiPII | Other | ❌ | Each value is masked using multiple functions, making it impossible to determine the original value due to the complex and varying transformations applied. |
Encrypt_OneToOne | Other | ❌ | Values are encrypted using a unique ID function, making it impossible to verify the original value. |
Hash_OneToOne | Other | ❌ | Values are hashed using multiple hashing algorithms, making it impossible to match the hashed value with the original input. |
Others_UniqueID | Other | ❌ | Values are replaced with unique identifiers, making it impossible to match them with the original value for validation. |
Others_HashLast4Digits | Other | ❌ | The last 4 digits are hashed, making it impossible to verify if the transformation matches the original value. |
Feature Support Table
The table below outlines the validation features supported by various data source connection types:
Connection Type | Conditional Validation | Like(%) Query Validation |
---|---|---|
Amazon Redshift | ✅ | ✅ |
CouchDB | ❌ | ❌ |
DB2 | ✅ | ✅ |
MariaDB | ✅ | ✅ |
MongoDB | ❌ | ❌ |
MSSQL | ✅ | ✅ |
MySQL | ✅ | ✅ |
Oracle | ✅ | ✅ |
PostgreSQL | ✅ | ✅ |
Sybase | ❌ | ❌ |
Teradata | ✅ | ✅ |
File - AVRO | ❌ | ❌ |
File - Delimited | ✅ | ❌ |
File - Fixed Width | ✅ | ❌ |
File - JSON | ❌ | ❌ |
File - ORC | ❌ | ❌ |
File - Parquet | ❌ | ❌ |
File - XML | ❌ | ❌ |
Note: Normal validation is supported across ALL data source connection types.