Data Quality

Matt K's picture
Joined: 07/17/2009
Graces: 111

Does anyone else work in the Data Quality/Data Profiling area? If so, what do you think of the Talend suite? What are the best open source (OS) programs out there right now?

catholicservant's picture
Joined: 06/29/2009
Graces: 157
I have no idea what that is!

Is it database software?

mdhoerr's picture
Joined: 08/05/2009
Graces: 19
It is Extract, Transform, Load (ETL) software.

I've never used formal software to do that, unless you count MS SQL Server's DTS routines. I've had to consolidate different databases several times, but I'm not a formal data warehouse / data mining person.

The Talend Suite looks interesting as an OS alternative. What commercial software (if any) are you using now?

Matt K's picture
Joined: 07/17/2009
Graces: 111
Shame on me for not replying sooner! Well, yes, it can be used for database software, but also for web transactions. For example, when you buy something on BestBuy.com, the form where you enter your address runs our product (I work for SAP Business Objects), and it returns the USPS-certified address that is guaranteed mailable (in case of a typo, error, etc.). So it can be used in the ETL environment, but not exclusively. Companies that understand the need for Data Quality generally run both batch and transactional implementations (applying it to the whole enterprise is called EAI).

Data Quality is basically the question: why am I saving someone's address in my database if that person has moved? Or why do I have John Doe in Timbucktoo, WI in my table 5 times when I know there's only one John Doe living in Timbucktoo?
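To make the duplicate John Doe case concrete, here is a rough Python sketch. It is only an illustration, not how any commercial product does it: the field names (name, city, state) and the exact-match-after-normalization rule are assumptions, and real matching tools typically go further with fuzzy comparison and address standardization.

```python
from collections import OrderedDict


def normalize(value):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(value.lower().split())


def dedupe(records):
    """Keep the first record seen for each normalized (name, city, state) key."""
    seen = OrderedDict()
    for rec in records:
        key = (
            normalize(rec["name"]),
            normalize(rec["city"]),
            normalize(rec["state"]),
        )
        # setdefault only stores the record if the key is new
        seen.setdefault(key, rec)
    return list(seen.values())


# Hypothetical sample data: three spellings of the same person, plus one other
customers = [
    {"name": "John Doe", "city": "Timbucktoo", "state": "WI"},
    {"name": "john  doe", "city": "Timbucktoo", "state": "wi"},
    {"name": "JOHN DOE ", "city": "timbucktoo", "state": "WI"},
    {"name": "Jane Roe", "city": "Timbucktoo", "state": "WI"},
]

unique = dedupe(customers)
print(len(unique), "unique records out of", len(customers))
```

Keeping the first record per key is the simplest survivorship rule; a real cleanup would probably pick the most complete or most recently updated record instead.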

The software we develop is called Data Services, but it was previously named Data Quality XI, IQ8, and originally Firstlogic. I have also used Informatica PowerCenter and our Data Integrator. Talend is open source but not free. So basically you can fix their bugs and send them the code? I'm not sure, but it is probably much cheaper than a closed source program like the ones we sell.

oscatholic's picture
Joined: 06/25/2009
Graces: 485
Woah cool! Hmm... we're looking for a solution like this for the Saint Louis Review. We were looking into Experian QAS for a while, but I don't know what other options are out there.

Advancing the faith.

Matt K's picture
Joined: 07/17/2009
Graces: 111
There are no free open source solutions that I've found, but this is probably because the USPS does not give its data away for free to anyone either. There has to be a license agreement and a whole lot of legal paperwork making sure the information is not misused.

Yes, Experian is one of the options out there. I think Experian only does the address part; there's also the Name/Firm part and record matching (maybe it's not needed). I've got a chart that shows the DQ software market and who the large players are. Realistically, your best bet might be to find a smaller company to get better price and service (like Experian).

Actually, there are some companies to which you can send a list of customers; they will do the DQ part for you and then send you back the cleansed data. Usually, companies that plan on running their entire mailing list maybe once a quarter go this route. Check out Lorton Data out of the Twin Cities. They use our (SAP) software, I think. http://www.lortondata.com/ I like when pricing is easily available on the main page of a website!
