The data quality solution provided by Data Quality Services (DQS) of SQL Server is used to maintain the quality of data and ensure that the data is suited for business usage. DQS is a knowledge-driven solution that provides both computer-assisted and interactive ways to manage the integrity and quality of data sources. DQS enables to discover, build, and manage knowledge about data. We can use that knowledge to perform data cleansing, matching, and profiling.
Why DQS?
In general, we have incorrect data in our application which can result from user entry errors, corruption in transmission or storage, mismatched data dictionary definitions, and other data quality and process issues. Aggregating data from different sources that use different data standards can result in inconsistent data. Incorrect data affects the ability of a business to perform its business functions and to provide services to its customers, resulting in a loss of credibility and revenue, customer dissatisfaction, and compliance issues. Automated systems often do not work with incorrect data, and bad data wastes the time and energy of people performing manual processes. Incorrect data can wreak havoc with data analysis, reporting, data mining, and warehousing.
High-quality data is critical to the efficiency of businesses. An organization of any size can use DQS to improve the information value of its data, making the data more suitable for its intended use. A data quality solution can make data more reliable, accessible, and reusable. It can improve the completeness, accuracy, conformity, and consistency of data, resolving problems caused by bad data in business intelligence or data warehouse workloads.
DQS Features
DQS provides the following features to resolve data quality issues.
• Data Cleansing: In this activity, DQS modify, remove, or standardized of data that is incorrect or incomplete, using both computer-assisted and interactive processes.
• Matching: In this activity, DQS identify of duplicates records in a rules-based process that enable to determine what constitutes a match and perform de-duplication.
• Reference Data Services: This will verify the quality of data using the services of a reference data provider. We can use reference data services from Windows Azure Marketplace DataMarket to easily cleanse, validate, match, and enrich data.
• Profiling: This enables to analyze a data source to provide insight into the quality of the data at every stage in the knowledge discovery, domain management, matching, and data cleansing processes. Profiling is a powerful tool in a DQS data quality solution. We can create a data quality solution in which profiling is just as important as knowledge management, matching, or data cleansing.
• Monitoring: Monitoring activity provides the ability to verify that data quality solution is doing what it was designed to do.
• Knowledge Base: Data Quality Services is a knowledge-driven solution that analyzes data based upon knowledge that builds with DQS. This enables to create data quality processes that continually enhances the knowledge about data and in so doing, continually improves the quality of data.
DQS Components
Data Quality Services consists of Data Quality Server and Data Quality Client. These components provide ability to perform data quality services separately from other SQL Server operations. Both are installed from within the SQL Server setup program.
Data Quality Server is implemented as three SQL Server catalogs that can manage and monitor in the SQL Server Management Studio (DQS_MAIN, DQS_PROJECTS, and DQS_STAGING_DATA). DQS_MAIN includes DQS stored procedures, the DQS engine, and published knowledge bases. DQS_PROJECTS includes data that is required for knowledge base management and DQS project activities. DQS_STAGING_DATA provides an intermediate staging database where we can copy source data to perform DQS operations and then export processed data.
Data Quality Client is a standalone application that enables to perform knowledge management, data quality projects, and administration in one user interface. It is a stand-alone executable file that performs knowledge discovery, domain management, matching policy creation, data cleansing, matching, profiling, monitoring, and server administration. Data Quality Client can be installed and run on the same computer as Data Quality Server or remotely on a separate computer. There are wizard-driven operations available to perform operations in Data Quality Client.
This is a great intro to DQS and has inspired me to dig deeper. What would be really interesting is how you might use this to resolve some of the DQ challenges in Dynamics CRM.
Unfortunately Microsoft don’t seem very joined up in this area, they provide Master Data Management and DQS services but don’t directly support their use in CRM, the same is generally true for SSIS too. Keep hoping someone at Microsoft might do the joined up thinking.
Thanks, Andrew for liking the post. I have used SSIS package for data quality cleansing and matching activity and then consumed CRM service to write clean data back to CRM. I will write another post for the SSIS approach.
I am trying to use DQS for data matching with a voluminous dataset. Did you ever write the post for SSIS approach for this?
I am trying to use DQS for data matching with a voluminous dataset. Did you ever write the post for SSIS approach for this? Thanks!
Thanks for the wonderful post. Is there a downloadable install file available for DQS Client 2016 or the only way to install it using the Base SQL Server 2016 installable and selecting the client feature in it ?
Pingback: Introduction to Data Quality Services (DQS) of SQL Server - Microsoft Dynamics CRM Community