Data Quality & Obfuscation

Data Quality: The usefulness of data is limited by its quality. In geographically distributed enterprises, quality remains a challenge after years of data silos across systems and applications that have grown out of mergers, acquisitions, and organizational changes under changing legal and regulatory guidelines. The data from such systems remains inconsistent and often fails the litmus test of being "fit for use" and able to "represent reality accurately and in time". Data quality is closely tied to the return on investment a company gets from data integration tools and related technologies such as databases, archival tools, and business intelligence/analytics tools. Data quality work helps subject matter experts and business and data analysts understand the characteristics of the data, build a picture of data gaps, apply remediation rules, and reduce the rate of application failures caused by data errors. This capability is best leveraged via the repository during the concept and high-level solution phases of a project, and the resulting decisions should be baked into the design to ensure a robust product. Hoonar takes a profiling-driven approach to collecting and publishing data quality metrics such as accuracy, completeness, timeliness, uniqueness, conformance, referential integrity, and consistency in the repository, both inline and offline with the data integration process.
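
As an illustration of what profiling-driven metrics can look like, here is a minimal sketch in Python that computes completeness, uniqueness, and conformance for a single column. It is not Hoonar's implementation: the function name `profile_column`, the sample `customers` records, and the simplified email pattern are all assumptions made for this example.

```python
import re


def profile_column(rows, column, pattern=None):
    """Profile one column of a list of dict records (hypothetical helper).

    completeness: share of rows with a non-empty value
    uniqueness:   share of non-empty values that are distinct
    conformance:  share of non-empty values fully matching `pattern`
    """
    values = [row.get(column) for row in rows]
    total = len(values)
    present = [v for v in values if v not in (None, "")]
    metrics = {
        "completeness": len(present) / total if total else 0.0,
        "uniqueness": len(set(present)) / len(present) if present else 0.0,
    }
    if pattern is not None:
        regex = re.compile(pattern)
        matching = [v for v in present if regex.fullmatch(str(v))]
        metrics["conformance"] = len(matching) / len(present) if present else 0.0
    return metrics


# Made-up sample data: one missing email, one duplicate, one malformed value.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@example.com"},
    {"id": 4, "email": "not-an-email"},
]

# Deliberately simplified email pattern, for illustration only.
email_metrics = profile_column(customers, "email", r"[^@\s]+@[^@\s]+\.[^@\s]+")
print(email_metrics)
```

Metrics like these could be computed inline during an integration run or offline against a snapshot, then published to a repository where analysts can compare them across systems.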

Data Obfuscation: Data obfuscation, also referred to as anonymization, masking, or hiding, is used to protect sensitive customer and personal information and to reduce misuse of data, whether intentional or inadvertent. Related but distinct are synthetic data generation and test data generation. All of these aim to provision a consistent data bed for use in quality assurance and testing (integration / component / system) while operating within the various legal and regulatory guidelines. Obfuscated data serves as a test data bed for offsite and multishore development, for testing, and increasingly for training. The related discipline of encryption benefits companies in production environments. It reduces risk for companies by ensuring that only the minimal information needed to complete the task successfully is shared. A key requirement of obfuscated or synthetic data is that it accurately represents the real-life characteristics of the original data. Understanding data characteristics through data profiling and analysis is also what enables enterprises to undertake data obfuscation. Hoonar uses an integrated, profiling-driven approach to collect data characteristics and then model and execute the obfuscation suite, producing data that represents the actual data while keeping the real data out of reach.
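
To make the idea of masking while preserving data characteristics concrete, here is a minimal sketch in Python. It is not Hoonar's obfuscation suite: the helpers `pseudonymize` and `mask_digits`, the demo key, and the sample record are assumptions for illustration. The first helper replaces a value with a keyed hash so that the same input always maps to the same token, which keeps joins across tables consistent; the second masks digits but keeps length and separators, so format checks in downstream tests still pass.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would load this from a secret store.
SECRET_KEY = b"demo-key-change-me"


def pseudonymize(value, key=SECRET_KEY, length=12):
    """Replace a value with a deterministic keyed token (HMAC-SHA256, truncated)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def mask_digits(value, keep_last=4):
    """Replace all but the last `keep_last` digits with 'X'.

    Non-digit characters (separators) and the overall length are preserved.
    """
    total_digits = sum(ch.isdigit() for ch in value)
    to_mask = max(total_digits - keep_last, 0)
    out = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            out.append("X")
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


# Made-up sample record.
record = {"name": "Jane Doe", "card": "4111-1111-1111-1234"}
masked = {
    "name": pseudonymize(record["name"]),
    "card": mask_digits(record["card"]),
}
print(masked["card"])  # XXXX-XXXX-XXXX-1234
```

Truncating the hash shortens the token at the cost of a higher collision chance, so the `length` parameter would need to be sized against the number of distinct values being masked.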