Posted on Aug 03, 2021

Designed and implemented a Hadoop architecture after analyzing the client's various data sources

Challenges

  • The client wanted to scale up its legacy data warehouse, built on an RDBMS-based MPP system, and perform real-time analytics on marketing data to generate useful business insights
  • The client was receiving data from many sources, including SAP ERP, JDA, IRI, and Acousta, and was struggling to integrate that data and run real-time analytics on it

Solutions

  • Analyzed the various data sources and targets, then designed and implemented a Hadoop architecture to suit them
  • Designed a near real-time ingestion system using Attunity and Kafka to feed both real-time and batch processing
  • Designed a data warehouse on Hive and HBase for all transactional data processing
  • Used the Talend ETL tool to design and develop Spark Streaming and batch jobs
  • Designed OLAP cubes for reporting
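
The ingestion flow described above can be illustrated with a minimal sketch. It is not the project's actual code: it uses plain Python in place of Attunity, Kafka, and HBase, and the event shape, the source names used as table names, and the functions micro_batches, apply_batch, and run are all hypothetical. It only shows the general idea of grouping change events into small batches and applying them as upserts to a keyed store.

```python
from collections import defaultdict
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical change event: (source system, record key, operation, payload).
ChangeEvent = Tuple[str, str, str, dict]

# Illustrative events only; real change records would arrive from a message queue.
EVENTS: List[ChangeEvent] = [
    ("sap_erp", "order-1", "insert", {"amount": 100}),
    ("jda", "sku-9", "insert", {"stock": 5}),
    ("sap_erp", "order-1", "update", {"amount": 120}),
    ("jda", "sku-9", "delete", {}),
]


def micro_batches(events: Iterable, size: int) -> Iterator[list]:
    """Group a stream of events into fixed-size micro-batches."""
    it = iter(events)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def apply_batch(store: Dict[str, Dict[str, dict]], batch: List[ChangeEvent]) -> None:
    """Apply one micro-batch to a keyed store; the last write per key wins."""
    for source, key, op, payload in batch:
        table = store[source]
        if op == "delete":
            table.pop(key, None)
        else:
            # Inserts and updates are treated alike, as an upsert by key.
            table[key] = payload


def run(events: Iterable[ChangeEvent], size: int = 2) -> Dict[str, Dict[str, dict]]:
    """Consume all events in micro-batches and return the final store contents."""
    store: Dict[str, Dict[str, dict]] = defaultdict(dict)
    for batch in micro_batches(events, size):
        apply_batch(store, batch)
    return {source: dict(rows) for source, rows in store.items()}


if __name__ == "__main__":
    print(run(EVENTS, size=2))
```

Keying each table by record identifier means a replayed or late update simply overwrites the earlier value, which is one common way to keep such a pipeline tolerant of redelivered messages.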