Wednesday, 5 August 2015

Plug-and-Play ETL Tools Inside the Hadoop Ecosystem

Now that the Hadoop ecosystem has matured, with many tools that make data analytics and data science possible and deliver value to companies, is it time to start another project that gives users a plug-and-play tool like Informatica, SAP, and others inside this ecosystem? Many financial and other companies are deeply attached to these types of tools because they make development and maintenance easier. With the current way of doing things, companies are struggling to find the investment needed to integrate these tools with the Hadoop ecosystem. I agree that a lot of legacy data warehouse code was developed using these tools, and that it will be a steep challenge to replace them or to share a portion of their place in an organization's technology stack.

Currently, if I understand correctly, Informatica uses Apache Hive to integrate with Hadoop. So at their core, these plug-and-play tools already rely on the core of Hadoop.

This is a huge ask, but if you look back, Hadoop started small and grew big, and there is a force that can act as a stimulant for working in this direction. I know some vendors have relationships with these tool providers, but then again, Hadoop also set out to challenge the Teradatas, Oracles, and Netezzas in order to gain a share of the data warehouse technology inside organizations.

There are many tools inside the Hadoop ecosystem that can work together to provide an excellent plug-and-play interface for real-time and batch analytics.

I would appreciate any feedback, including pros and cons, on this topic.

Thanks

R



