The Data Quality Market - Q1 2010
The data quality market for the calendar year 2009 was worth around $803 million, of which software sales and maintenance accounted for around $637 million. This figure includes the professional services arms of data quality vendors, but excludes the (substantial) revenues of systems integrators and consultancies involved with data quality initiatives. This represents 9% growth over 2008, despite the difficult economic climate. The last year or so clearly saw dramatic economic events, especially in certain industries such as banking, and data quality companies saw fluctuating fortunes. Some experienced significant revenue declines, but others prospered, with Datanomic, for example, doubling its revenues.
Certain common themes ran through the industry. There is a clear emergence of software-as-a-service data quality offerings, with existing vendors seeing significant interest in their fledgling offerings, and some newer entrants such as Active Prime basing their whole business model on this approach. While some customers clearly have concerns about allowing their data outside the firewall, the advantages of avoiding potentially complex installation and upgrades on their own premises are beginning to outweigh these worries. Some vendors have been at this for longer than others – Uniserv has had such an offering for eight years now, and has seen it grow into a significant business line.
Data volumes continue to grow, and some vendors have responded by re-architecting their products for 64-bit architectures, and in some cases to allow massively parallel processing (MPP) approaches. Such approaches allow profiling, for example, to be carried out on entire large datasets rather than relying on sampling. The data quality industry is hardly new: Innovative Systems has been providing data quality software for 32 years, yet new uses and demands continue to be placed on the technology.
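As a rough illustration of what profiling a full column (rather than a sample) involves, the sketch below makes a single streaming pass over every value and collects a few basic statistics. It is a minimal example under stated assumptions, not any vendor's implementation: the function name `profile_column`, the choice of statistics and the rule that blank strings count as nulls are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def profile_column(values: Iterable[Optional[str]]) -> dict:
    """Profile one column in a single pass over all rows (no sampling).

    Assumption: None and blank strings are both treated as nulls.
    """
    rows = 0
    nulls = 0
    min_length = None
    max_length = None
    counts = Counter()

    for value in values:
        rows += 1
        if value is None or value.strip() == "":
            nulls += 1
            continue
        counts[value] += 1
        length = len(value)
        min_length = length if min_length is None else min(min_length, length)
        max_length = length if max_length is None else max(max_length, length)

    return {
        "rows": rows,
        "nulls": nulls,
        "distinct": len(counts),
        "min_length": min_length,
        "max_length": max_length,
        "most_common": counts.most_common(3),
    }


# Illustrative data only: a handful of made-up postcode values.
profile = profile_column(["SW1A 1AA", "", None, "SW1A 1AA", "M1 1AE"])
```

Because the function consumes any iterable, it can be fed from a file or database cursor without loading the whole dataset into memory, although the distinct-value counter itself grows with the number of distinct values.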
One theme is an increasing interest in identity resolution, beyond just name and address verification, reflecting heightened concerns about both terrorism and online fraud. Infoglide built its business on this in the US (and Infoshare specialises in this area, mainly within the public sector in the UK), while DQ Global is another vendor that emphasises this capability.
As companies expand globally and Asia’s economies continue to develop, support for more exotic character sets becomes important: many vendors now support Unicode, and some have gone further. Omikron in particular provides software that tackles culture-specific name validation in various Asian character sets. Even established areas of data quality such as data profiling are seeing new entrants, in the form of X88.
A recurring theme of our client research is that customer name and address data is not the only priority for customers: product data in particular is viewed as a major problem for enterprises, and is much more complex and less structured than name and address data. Companies such as Silver Creek (now bought by Oracle), Datactics and Inquera specialise in this area, while some other vendors are data domain-agnostic and have been adding functionality relevant to handling other data types.
Consolidation was a theme in 2009, seeing several significant acquisitions. Oracle purchased Silver Creek, matching specialist Netrics was bought by Tibco, and profiling vendor Exeros by IBM. In general, vendors continue to build out their offerings to encompass a full range of data quality functionality, sometimes through the OEM of specialist technologies. Some larger vendors (such as Informatica, IBM and SAP) now offer a broad suite of software encompassing data integration, master data management and data quality software. Microsoft is likely to follow this path in due course, given its prior acquisitions of Zoomix and Stratature.
The emerging area of data governance is beginning to have a distinct influence on the data quality industry. As the popularity of enterprise master data management continues, data governance initiatives have been set up, and the scope of these usually includes data quality. Hence data quality vendors are seeing demand for support for data governance initiatives, and many have added functionality in this area, such as support for the work done by data stewards. There are increasing connections between master data management and data quality, with many MDM vendors using OEMs of data quality vendors to round out their offerings, or in some cases (as with Tibco and Netrics) buying data quality vendors outright. One interesting development has been the other direction, with Ataccama for example starting with a data quality product and then releasing an MDM platform, a route that open-source vendor Talend has also followed; having started with data integration, Talend now has both data quality and MDM offerings.
Data quality remains a fragmented market. Many vendors specialise in name and address verification, and companies with deep local knowledge (such as Satori Software) have built up offerings that track people and companies that change address, in order to avoid badly targeted mailing and ensure optimal bulk rates for mass mailing. Significant enrichment of postal address data is possible via geocoding. This technique allows customers not just to check the postal code of an address, but to see it displayed on a map, and to display such things as its distance from nearby stores, the demographics of the area, its political constituency, or even whether the address lies within a flood plain. Pitney Bowes Business Insight (with its MapInfo technology) is one vendor that majors in this area, with Capscan being another example. While most vendors support web services, some vendors, such as Intelligent Search, Melissa Data and Netrics, specifically design their products to be embedded in other applications.
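To make the "distance from nearby stores" enrichment concrete, here is a minimal sketch of the calculation that becomes possible once an address has been geocoded to a latitude and longitude. It assumes the coordinates have already been obtained from some geocoding service and uses the standard haversine great-circle formula; the function names, store names and coordinates are hypothetical and do not reflect any particular vendor's product.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_store(address: tuple, stores: dict) -> tuple:
    """Return (store_name, distance_km) for the store closest to the address."""
    return min(
        ((name, haversine_km(address[0], address[1], lat, lon))
         for name, (lat, lon) in stores.items()),
        key=lambda pair: pair[1],
    )


# Made-up coordinates for illustration only.
stores = {"Store A": (51.50, -0.12), "Store B": (53.48, -2.24)}
name, distance_km = nearest_store((51.52, -0.10), stores)
```

Straight-line distance is only a first approximation; drive-time or catchment analysis would need road network data on top of the geocoded coordinates.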
The diagram that follows shows the major data quality vendors, displayed on three dimensions. See later for definitions of these. The largest vendors of data quality in terms of revenue are Experian QAS, SAP, Informatica, IBM, Trillium and DataFlux.
It is important to understand that this is a high-level representation of the market, with vendors represented on the chart specialising in different areas and at very different price-points (HelpIT is an example of a vendor with a quite complete data quality toolset at a low price-point; Datras is another). If considering data quality software, it is important to tailor your selection process to the particular needs that you have rather than relying on high-level diagrams such as this. The Information Difference has various detailed models that can assist you in vendor selection and evaluation.
As part of the landscape process, each vendor was asked to provide at least eight reference customers (some provided over 20 references), which were surveyed to determine their satisfaction with the data quality software of the vendor (if insufficient references were provided then a neutral score was assigned). The happiest customers based on this survey were those of Melissa Data, followed by Active Prime, Datanomic, Pitney Bowes Business Insight, Satori Software, DataFlux, Datactics and Infoshare.
Main Vendors
Below is a list of the main data quality vendors.
| Vendor | Brief Description | Website |
| --- | --- | --- |
| Address Doctor | Vendor that specializes in providing wide coverage of name and address information; now owned by Informatica. | |
| Ataccama | Prague-based start-up with a modern data quality suite. | |
| Active Prime | California-based vendor of data quality for CRM systems. | |
| Business Data Quality | UK-based data profiling vendor. | |
| Capscan | London-based provider of address management and data integrity services. | |
| Citrus Technology | UK-based vendor of data profiling and data quality tools. | |
| Datactics | UK-based vendor specializing in product data quality. | |
| Datanomic | Cambridge-based vendor of data quality solutions. | |
| DataFlux | Part of SAS, one of the leading players in data quality. | |
| DataQualityFirst | US start-up whose application lives on top of IBM Quality Stage. | |
| Datiris | Colorado vendor of data profiling technology. | |
| Datras | Munich-based vendor with wide-ranging data quality functionality. | |
| DQ Global | UK vendor of data quality and address verification software. | |
| Exeros | California-based vendor specializing in data discovery. Now owned by IBM. | |
| Experian QAS | Global provider specializing in contact data management. | |
| Help IT Systems | UK vendor of data cleansing technology. | |
| Human Inference | Dutch data quality vendor. | |
| IBM | Data quality software from the industry giant. | |
| Identity Systems | Identity resolution vendor, now part of Informatica. Its technology has a wide customer base in government and financial services, and is used via OEM agreements in MDM and fraud and compliance products. | |
| Informatica | California-based vendor, a major player in data quality. | |
| Infogix | Illinois-based vendor specializing in controls and compliance. | |
| Infoglide | Austin-based vendor of data quality tools based on its identity resolution matching software. | |
| Infoshare | UK data quality vendor specialising in the public sector market. | |
| Inquera | Israeli company with an innovative approach to product data quality using machine-learning technology based on subject domain experts’ knowledge. | |
| Innovative Systems | Long-established Pittsburgh-based vendor whose software uses an extensive knowledge base. | |
| Intelligent Search | Identity management company now with a more general data quality capability. | |
| Melissa Data | US data quality vendor with a focus on the Microsoft software environment. | |
| Netrics | New Jersey vendor of impressively accurate matching software. Now owned by Tibco. | |
| Omikron | German data quality vendor. | |
| Pitney Bowes Business Insight | The data quality vendor formerly known as Group 1, part of the Pitney Bowes group. | |
| Postcode Anywhere | UK vendor of web-based addressing software. | |
| SAP | The software giant is a major data quality player. | |
| Satori Software | Seattle-based provider of address management solutions. | |
| Silver Creek Systems | Colorado-based vendor of product data mastering software. Now owned by Oracle. | |
| Talend | Paris-based open source data quality software vendor. | |
| Trillium | Part of Harte Hanks, one of the leading data quality vendors. | |
| Uniserv | Large German data quality vendor. | |
| X88 | Recent UK market entrant specializing in data discovery. | |
Other vendors of data quality software include:
Research Methodology
The Information Difference Landscape diagram shows three dimensions of a vendor:
“Market strength” is made up of a weighted set of five factors: revenues, growth, financial strength, geographic scope and partner network. Each of these individual elements is scored, the total producing the “market strength” figure. Similarly “technology” is made up of four factors: “technology breadth” (the coverage of the vendors in various data quality areas as illustrated below), the longevity of the software in the market, analyst perception of the product via briefings, and customer feedback from reference customers (this has a high weighting), which we surveyed. In each case the scoring is on a scale of 0 (worst) to 6 (best).
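The scoring described above can be sketched as a weighted combination of factor scores. The actual weights are not published in this report, so the weights below are purely hypothetical, as are the function and variable names; the sketch also assumes the weighted total is normalised by the sum of the weights so that the result stays on the same 0 to 6 scale as the individual factors.

```python
from typing import Mapping

# Hypothetical weights: the report states that the factors are weighted
# but does not say what the weights are.
MARKET_STRENGTH_WEIGHTS = {
    "revenues": 0.30,
    "growth": 0.20,
    "financial_strength": 0.20,
    "geographic_scope": 0.15,
    "partner_network": 0.15,
}


def weighted_score(scores: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Combine per-factor scores (each 0 to 6) into one weighted figure."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same factors")
    for factor, value in scores.items():
        if not 0 <= value <= 6:
            raise ValueError(f"{factor} score {value} is outside the 0-6 scale")
    total_weight = sum(weights.values())
    # Normalising by total weight keeps the result on the 0-6 scale
    # (an assumption; the report only says the scores are totalled).
    return sum(scores[f] * weights[f] for f in weights) / total_weight


# Made-up factor scores for an imaginary vendor.
example_vendor = {
    "revenues": 4,
    "growth": 5,
    "financial_strength": 3,
    "geographic_scope": 2,
    "partner_network": 4,
}
market_strength = weighted_score(example_vendor, MARKET_STRENGTH_WEIGHTS)
```

The same function could be applied to the four "technology" factors with a different weight table, giving customer feedback the high weighting the methodology mentions.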
Vendors were asked to submit answers to various questions via a questionnaire. Vendors were interviewed directly by an analyst, and their software was demonstrated and assessed. Reference customers were surveyed about their experience of each vendor's software. The technology functions that vendors were asked about are as follows. These are drawn from the Information Difference vendor functionality model; if you are interested in more detail on this then please contact The Information Difference.
Functional Areas