No Extract, Transform and Load? Really?
Some new data acquisition tools help reduce Extract, Transform and Load (ETL) and related support costs drastically. Kalido claims in its sales pitch, "Free Yourself From ETL," that its information engine eliminates the need for ETL. I agree to some extent for specific business situations, but marketing these tools as if they would eliminate the need for ETL is quite a stretch. I am a big advocate of auto-generated code replacing custom coding in data integration. I am also a big advocate of building reusable objects and transformations within the ETL realm. These best practices help save costs and manage resources better. However, I do not believe every single ETL challenge is solvable by tools.
I can understand why an average project sponsor gets enticed by claims like "no ETL" and by small prototypes. Still, I would like to highlight the following facts about the nature of data acquisition work in an enterprise setting:
- The ETL developer needs solid skills in design, architecture, performance tuning, general programming and writing complex SQL. Even if the code is generated by the tools, the developer should understand how to make the tool do the right things the right way. Given the role requirements, good ETL developers do not come cheap.
- Quick and dirty work, to be replaced later, hurts data programs the most. It is quite costly not to do it right on the first pass.
- Typically, the ETL engines need to accommodate changes in any of the source systems or the target systems.
- Enterprise governance standards related to reference and master data use, data integrity, data quality and information security are all enforced by the ETL engines.
Therefore, eliminating ETL with a drag-and-drop tool, without knowing the adverse impacts on enterprise data enablement, can land the average project sponsor in serious trouble.
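To make the governance point concrete, here is a minimal sketch of the kind of checks a transform step typically carries. It is an illustration only, not how Kalido or any particular ETL engine works. The field names (`customer_id`, `country`, `ssn`), the `VALID_COUNTRY_CODES` reference set and the `transform` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical reference data: the country codes the enterprise accepts.
VALID_COUNTRY_CODES = {"US", "CA", "GB"}


@dataclass
class TransformResult:
    loaded: list = field(default_factory=list)    # rows that passed every rule
    rejected: list = field(default_factory=list)  # rows held back, with reasons


def transform(rows):
    """Apply integrity, reference-data and masking rules to extracted rows."""
    result = TransformResult()
    for row in rows:
        errors = []

        # Data integrity: every row needs a non-empty business key.
        customer_id = str(row.get("customer_id") or "").strip()
        if not customer_id:
            errors.append("missing customer_id")

        # Reference data: the country must be one of the approved codes.
        country = str(row.get("country") or "").strip().upper()
        if country not in VALID_COUNTRY_CODES:
            errors.append(f"unknown country code: {country!r}")

        if errors:
            result.rejected.append({"row": row, "errors": errors})
            continue

        # Information security: keep only the last four digits of the SSN.
        ssn = str(row.get("ssn") or "")
        cleaned = dict(row, customer_id=customer_id, country=country)
        cleaned["ssn"] = "***-**-" + ssn[-4:] if ssn else ""
        result.loaded.append(cleaned)
    return result
```

If a tool removes this layer, these rules still have to be enforced somewhere, by someone who understands them.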
To take the best advantage of the data acquisition tools that claim to eliminate or reduce ETL, make sure the business situation is one where this can safely be experimented with. The following are some such business scenarios:
- Temporary data acquisition work for the semi-ad hoc or ad hoc needs of a few selected user champions. This may be throwaway work.
- Exploratory endeavors on a data source that is not yet clearly understood. Let us say the organization just acquired a new company and needs to bring in and integrate the new company's data with the old company's data. To accomplish this task fast, one strategy may be to give power users of both organizations access to the relevant data of the other company. In scenarios such as this, tools can provide first-cut access to the new enterprise data in a somewhat raw form.
- There is only one source system feeding the data mart or enterprise data warehouse, or there is really no need to match up master data between different source systems. This is a low-risk business scenario in which to eliminate complex ETL processes.
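The last scenario is easier to see with a small sketch of what matching master data across two sources involves. This is a simplified illustration under my own assumptions: customers are matched on a normalized email address, and the record layout and the `normalize_key` and `match_master_records` names are hypothetical. Real master data matching is usually far fuzzier than an exact key lookup.

```python
def normalize_key(record):
    """Build a comparable match key from a hypothetical 'email' field."""
    return record["email"].strip().lower()


def match_master_records(source_a, source_b):
    """Pair records from two source systems that share the same match key.

    Returns (matched, unmatched): matched holds (id_in_a, id_in_b) pairs,
    unmatched holds ids from source_a with no counterpart in source_b.
    """
    index_b = {normalize_key(rec): rec for rec in source_b}
    matched, unmatched = [], []
    for rec in source_a:
        other = index_b.get(normalize_key(rec))
        if other is not None:
            matched.append((rec["id"], other["id"]))
        else:
            unmatched.append(rec["id"])
    return matched, unmatched
```

With a single source system, none of this logic is needed, which is why that scenario carries so little risk.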
In summary, while "No ETL" is a bit of a stretch, there is some merit in considering tools like Kalido for some specific business scenarios to reap the benefits of lower ETL costs as well as faster delivery.
Raju is a data acquisition developer at Navy Federal Credit Union. He has over 20 years of diverse experience in project/program management, quality management, and data management. He holds many industry certifications, including CDMP, CBIP, CCP, PMP and CSQA. He can be reached at firstname.lastname@example.org
Other Posts by Venkata Bodapati