With all the hype and interest in Big Data lately, open source ETL tools seem to have taken a back seat. MapReduce, YARN, Spark, and Storm are gaining significant attention, but it should also be noted that Talend’s ETL business and our thousands of ETL customers are thriving. In fact, the data integration market has a healthy growth rate, with Gartner recently reporting that this market is forecast to grow 10.3% in 2014 to $3.6 billion!
Open source ETL tools appear to be going through their own technology adoption lifecycle and are running mission-critical operations for Global 2000 corporations, which would suggest they are at least in the “early majority” adoption stage. Also, thanks to their strong community, open standards and more affordable pricing model, open source ETL tools are a viable solution for small to midsize companies.
I would think the SMB data integration market, which has been underserved for many years, is growing the fastest. Teams of two or three developers can get up to speed very quickly and see a faster ROI than they would by hand-coding ETL processes. Many Talend customers are reporting huge savings on their data integration projects over hand-coding, e.g. Allianz Global Investors states that Talend is “proving to be 3 times faster than developing the same ETLs by hand and the ability to reuse Jobs, instead of rewriting them each time, is extremely valuable.”
A key component of open source is its vibrant community and the benefits it provides, including sharing information, experiences, best practices and code. Companies can innovate faster through this model. For example, RTBF, one of over 100,000 Talend community users, states, “A major consideration was that Talend is open source and its community of active users ensures that the tools are rapidly updated and that user concerns are taken into account. Such forums make information easily accessible. As the community grows, more and more topics are covered which, of course, saves users a lot of time.”
And the good news is that the open source ETL tools category has matured to meet changing demands. What started as basic ETL and ELT capabilities has grown into an open source integration platform. As firms break down their internal silos, data integration developers are being asked to integrate big data, to improve data quality and master data, to move from batch to real-time processing, and to create reusable services.
With increasing data integration requests, companies are looking for more and more pre-built components and connectors: from databases (traditional and NoSQL) and data warehouses, to applications like SAP and Oracle, to big data platforms like Cloudera and Hortonworks, to Cloud/SaaS applications like Salesforce and Marketo. Finally, you need not only to connect to the Cloud, but also to run in the Cloud.
Almerys is an example of a company that started with data integration and batch processing, then moved to real-time data services: “Early on, significant real-time integration needs convinced us to adopt Talend Data Services, the only platform on the market offering the combination of a data integration solution and an ESB (Enterprise Service Bus).”
Big data may be getting all the attention and open source ETL tools may not be in the spotlight, but looking across the industry and at what Talend customers are doing, these tools have certainly matured into an indispensable part of IT’s toolbox.
(Gartner: The State of Data Integration: Current Practices and Evolving Trends, April 3, 2014)