DQ Landscape - The Information Difference Company Limited


The Data Quality Market - Q1 2011

The data quality market for the calendar year 2010 was worth around $873 million, of which software sales and maintenance accounted for around $724 million. The overall figure includes the professional services arms of data quality vendors, but excludes the (substantial) revenues of systems integrators and consultancies involved with data quality initiatives.  This represents 9% growth over 2009, reflecting some recovery in the economy. Financial services in particular has shown significant growth in 2010, presumably reflecting the increased focus on regulation in banking, and new regulatory initiatives such as Solvency II in insurance.

Perhaps the most significant reshaping of the industry over the last year has been the clearer distinction between companies offering a broad platform capability, of which data quality is an integral part, and the pure-play data quality vendors.  The platform vendors, such as IBM, Informatica, SAP, Talend, Pitney Bowes Business Insight (PBBI) and to an extent DataFlux, argue that the combination of data integration and data quality, often with master data management (MDM) added too, is what customers really want.  The pure-play vendors naturally retort that customers do not want to be tied into one platform, and so will prefer to choose best-of-breed data quality tools.  Oracle also appears to have changed direction here with its recent acquisitions of Silver Creek and now Datanomic.

There is a growing interest in providing data quality capabilities in real time, as a "data quality firewall", often via web services, with data quality capabilities being called up from within other applications.  Some vendors can provide data quality through the cloud rather than requiring on-premise software installation, with Postcode Anywhere and Active Prime operating solely in this manner.  
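To make the "data quality firewall" idea concrete, the sketch below shows an application checking a record before accepting it. It is illustrative only: the function names, the two rules and the simplified UK postcode pattern are our own assumptions and do not represent any vendor's product or API. In practice the check would typically be a call to a remote web service rather than a local function.

```python
import re

# Simplified, hypothetical UK postcode pattern used only for illustration;
# real address validation relies on reference data, not a regular expression.
UK_POSTCODE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$")


def check_record(record):
    """Return a list of data quality problems found in a customer record."""
    problems = []
    if not record.get("name", "").strip():
        problems.append("name is missing")
    postcode = record.get("postcode", "").strip().upper()
    if not UK_POSTCODE.match(postcode):
        problems.append("postcode is not in a recognised format")
    return problems


def firewall_insert(record, store):
    """Append the record to the store only if it passes every check."""
    problems = check_record(record)
    if problems:
        return False, problems
    store.append(record)
    return True, []


customers = []
accepted, _ = firewall_insert({"name": "Ann Smith", "postcode": "SW1A 1AA"}, customers)
rejected, reasons = firewall_insert({"name": "", "postcode": "12345"}, customers)
```

The point of the design is that bad records are stopped at the point of entry, with the reasons returned to the calling application, rather than being cleaned up later in batch.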

As companies expand globally and Asia's economies continue to develop, support for non-European character sets becomes important: more vendors now support Unicode, and some have gone further, with Omikron providing software that tackles culture-specific name validation in various Asian character sets.  

Data volumes continue to grow, and more vendors have responded by re-architecting their products for 64-bit architectures, in some cases allowing massively parallel processing (MPP) approaches.  Such approaches allow profiling, for example, to be carried out on large datasets rather than relying on sampling.  It is likely that further performance demands will be made on data quality tools in the future, making performance and scalability a more important differentiator for vendors than it has been.
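As a minimal illustration of why profiling the full dataset matters, the sketch below computes a few basic profile statistics for one column and compares the result with a profile of a small sample. The function name, the choice of statistics and the sample data are hypothetical and are not taken from any vendor's tool.

```python
from collections import Counter


def profile_column(values):
    """Return basic profile statistics for one column of raw values."""
    total = len(values)
    # Treat None and blank strings as missing values.
    non_null = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(non_null)
    return {
        "rows": total,
        "nulls": total - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical postcode column containing blanks and inconsistent casing.
postcodes = ["SW1A 1AA", "sw1a 1aa", "", None, "EC1A 1BB", "SW1A 1AA"]

full_profile = profile_column(postcodes)
sample_profile = profile_column(postcodes[:2])  # a sample that misses the nulls
```

Here the sample reports no missing values at all, while the full profile finds two, which is the kind of problem that sampling can hide.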

The industry has traditionally focused on customer name and address data, yet a continuing theme of our client research is that customer name and address data is not the only priority for customers.  Product data in particular is viewed as a major problem for enterprises, and is much more complex and less structured than name and address data.  Companies such as Silver Creek (bought by Oracle), Datactics and Inquera specialize in this area, while some other vendors are data domain-agnostic, have been adding functionality relevant to handling data types other than customer, and have customer references to prove it.

Data governance is rapidly entering the mainstream, and beginning to have an influence on the data quality industry.  As master data management continues to grow, data governance initiatives have been set up to complement these MDM projects, and the scope of these initiatives usually includes data quality.  Data quality vendors are therefore seeing demand to support data governance initiatives, and many have added some functionality in this area, such as support for data stewards.  Increasingly the link between master data management and data quality is being recognised, with many MDM vendors using OEMs of data quality vendors to round out their offerings, or in some cases buying data quality vendors outright.

Data quality remains a fragmented market.  Since many vendors specialize in name and address verification, companies with local knowledge have built up offerings that track people and companies that move address in order to avoid badly targeted mailing and ensure optimal bulk rates for mass mailing (such as Satori Software).  Others have built up deep local knowledge of particular markets, such as Uniserv in continental Europe.  

Significant enrichment of postal address data is possible via geocoding.  This technique allows customers not just to check the postal code of an address, but to see it displayed on a map, and to display such things as its distance from nearby stores, the demographics of the area, its political constituency, or even whether the address lies within a flood plain.  Pitney Bowes Business Insight (with its MapInfo technology) is one vendor that offers particularly comprehensive enrichment capabilities, but many others are now doing so too.
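Once an address has been geocoded to a latitude and longitude, enrichment such as "distance from nearby stores" becomes simple arithmetic. The sketch below uses the standard haversine great-circle formula. The store names and coordinates are made up for illustration, and the geocoding step itself (turning an address into coordinates) is assumed to have been done already by whatever tool is in use.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_store(address_coords, stores):
    """Return (store name, distance in km) for the store closest to the address."""
    lat, lon = address_coords
    name = min(stores, key=lambda s: haversine_km(lat, lon, *stores[s]))
    return name, haversine_km(lat, lon, *stores[name])


# Hypothetical geocoded address and store locations (latitude, longitude).
address = (51.50, -0.12)
stores = {"Store A": (51.51, -0.13), "Store B": (52.00, 0.00)}

closest_name, closest_km = nearest_store(address, stores)
```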

One intriguing development was the entrance of Google into the market with its Google Refine desktop (open source) product in late 2010.  Although existing vendors will doubtless deride it as lightweight, it has profiling, de-duplication and geocoding functionality, and Google's brand is one that will give a lot of people pause for thought, given that the product is free.  

The diagram that follows shows the major data quality vendors, displayed on three dimensions.  See later for definitions of these.  The largest vendors of data quality in terms of revenue are Experian QAS, SAP, Informatica, IBM, Trillium and DataFlux.

 
 

It is important to understand that this is a high-level representation of the market, with vendors represented on the chart specializing in different areas and at very different price-points (HelpIT is an example of a vendor with a quite complete data quality toolset at a low price-point).  If you are considering data quality software, it is important to tailor your selection process to the particular needs that you have rather than relying on high-level diagrams such as this.  The Information Difference has various detailed models that can assist you in vendor selection and evaluation.

As part of the landscape process, each vendor was asked to provide at least eight reference customers (some provided over 25 references) which were surveyed to determine their satisfaction with the data quality software of the vendor.  The happiest customers based on this survey were those of X88, IBM, DataFlux, Talend, Informatica and Datactics, followed closely by the customers of Trillium, Active Prime, and HelpIT.

Main Vendors
Below is a list of the main data quality vendors.

 

Vendor | Brief Description | Website
Address Doctor | Vendor that specializes in providing wide coverage of name and address information; now owned by Informatica. | www.addressdoctor.com
Ataccama | Prague-based start-up with a modern data quality suite. | www.ataccama.com
Active Prime | California-based vendor of data quality for CRM systems. | www.activeprime.com
Business Data Quality | UK-based data profiling vendor. | www.businessdataquality.com
Capscan | London-based provider of address management and data integrity services. | www.capscan.com
Datactics | UK-based vendor specializing in product data quality. | www.datactics.com
Datanomic | Cambridge-based vendor of data quality solutions. Now likely to be acquired by Oracle. | www.datanomic.com
DataFlux | Part of SAS, one of the leading players in data quality. | www.dataflux.com
DataQualityFirst | US start-up whose application lives on top of IBM QualityStage. | www.dataqualityfirst.com
Datiris | Colorado vendor of data profiling technology. | www.datiris.com
Datras | Munich-based vendor with wide-ranging data quality functionality. | www.datras.de
DQ Global | UK data quality and address verification software. | www.dqglobal.com
Exeros | California-based vendor specializing in data discovery. Now owned by IBM. | www.exeros.com
Experian QAS | UK-based vendor specializing in customer name and address. | www.qas.co.uk
Google | The search engine giant now does data quality. | code.google.com/p/google-refine
HelpIT | UK/US-based vendor of data cleansing technology. | www.helpit.com
Human Inference | Dutch data quality vendor. | www.humaninference.com
IBM | Data quality software from the industry giant. | www.ibm.com
Informatica | California-based vendor, a major player in data quality. | www.informatica.com
Infogix | Illinois-based vendor specializing in controls and compliance. | www.infogix.com
Infoglide | US vendor specializing in identity resolution. | www.infoglide.com
Infoshare | UK data quality vendor specialising in the public sector market. | www.infoshare-is.com
Inquera | Israeli company with innovative approach to product data quality using machine-learning technology based on subject domain experts’ knowledge. | www.inquera.com
Innovative Systems | Long-established Pittsburgh-based vendor whose software uses an extensive knowledge base. | www.innovativesystems.com
Intelligent Search | Identity management company now with a more general data quality capability. | www.intelligentsearch.com
Melissa Data | California-based data quality vendor with multiplatform tools and web services. | www.melissadata.com
Netrics | New Jersey vendor of impressively accurate matching software. Now owned by Tibco. | www.netrics.com
Omikron | German data quality vendor with strong Asian language capabilities. | www.omikron.com
Pitney Bowes Business Insight | The data quality vendor formerly known as Group 1 Software, part of Pitney Bowes Inc. | www.pbinsight.com
Postcode Anywhere | UK vendor of web-based addressing software. | www.postcodeanywhere.co.uk
SAP | The software giant is a major data quality player. | www.sap.com
Satori Software | Seattle-based provider of address management solutions. | www.satorisoftware.com
Silver Creek Systems | Colorado-based vendor of product data mastering software. Now owned by Oracle. | www.silvercreeksystems.com
Talend | Paris- and California-based vendor of open source data management and integration software. | www.talend.com
Trillium | Part of Harte Hanks, one of the leading data quality vendors. | www.trilliumsoftware.com
Uniserv | Large German data quality vendor. | www.uniserv.com
X88 | Recent UK market entrant specializing in data discovery. | www.x88software.com

 

Other vendors of data quality software include:



Research Methodology
The Information Difference Landscape diagram shows three dimensions of a vendor:

  • Market strength

  • Technology

  • Customer base.


"Market strength" is made up of a weighted set of five factors: revenues, growth, financial strength, geographic scope and partner network.  Each of these individual elements is scored, the total producing the "market strength" figure.  Similarly "technology" is made up of four factors: "technology breadth" (the coverage of the vendors in various data quality areas as illustrated below), the longevity of the software in the market, analyst perception of the product via briefings, and customer feedback from reference customers (this has a high weighting), which we surveyed.  In each case the scoring is on a scale of 0 (worst) to 6 (best).  
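The arithmetic behind a weighted dimension score can be sketched as follows. The actual weights used in the Landscape are not stated here, so the weights below are purely hypothetical, as are the example scores and the function name. Dividing by the total weight, so that the result stays on the same 0 to 6 scale as the individual factors, is also an assumption of this sketch.

```python
def weighted_score(scores, weights):
    """Combine factor scores (each 0 to 6) into one weighted figure on the same scale."""
    for name, score in scores.items():
        if not 0 <= score <= 6:
            raise ValueError(f"{name} score {score} is outside the 0 to 6 scale")
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical weights for the five "market strength" factors.
market_weights = {
    "revenues": 2,
    "growth": 1,
    "financial_strength": 1,
    "geographic_scope": 1,
    "partner_network": 1,
}

# Hypothetical factor scores for an imaginary vendor.
example_scores = {
    "revenues": 4,
    "growth": 2,
    "financial_strength": 5,
    "geographic_scope": 3,
    "partner_network": 1,
}

market_strength = weighted_score(example_scores, market_weights)
```

The same function could be applied to the four "technology" factors, with a higher weight given to customer feedback.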

Vendors were asked to submit answers to various questions via a questionnaire. Vendors were interviewed directly by an analyst and their software demonstrated and assessed.  Reference customers were surveyed to give their experience of the software of each vendor. The technology functions which the vendors were asked about are as shown below.  These are drawn from the Information Difference vendor functionality model; if you are interested in more detail on this then please contact The Information Difference.

Functional Areas

 
 
 