Posted on Aug 03, 2021

Significant reduction in storage cost and data processing time

Challenges

  • A SAS database is used to store all client data. The volume of data is huge and growing fast, and SAS licensing cost is high for each GB of data stored. A solution was needed to reduce the operating costs of data storage.
  • Mathematical operations take longer to process than they would on a distributed system.
  • Unstructured datasets need to be processed and converted into structured ones.

Solutions

  • Implemented Hadoop (HDP 2.4 distribution), an open-source framework for storing big data and processing it in a distributed way.
  • Migrated to the Hadoop ecosystem to improve query performance.
  • Created a data lake so that business logic can be applied upfront.
  • Executed an ETL process to extract data from the source systems and load it into the data warehouse.
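
To make the ETL step and the unstructured-to-structured conversion more concrete, here is a minimal sketch in plain Python. It is not the project's actual pipeline: the raw line format, the field names (`date`, `client_id`, `amount`), and the function names are all hypothetical, and an in-memory CSV stands in for the data warehouse. On a Hadoop cluster the same three stages would typically run as distributed jobs rather than in a single process.

```python
import csv
import io
import re

# Hypothetical raw format (an assumption for illustration only):
#   "<YYYY-MM-DD> <client_id> <free text containing amount=NN.NN>"
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<client_id>\w+)\s+.*?amount=(?P<amount>\d+(?:\.\d+)?)"
)


def extract(raw_text):
    """Extract: yield each non-empty line of the raw source text."""
    for line in raw_text.splitlines():
        line = line.strip()
        if line:
            yield line


def transform(lines):
    """Transform: turn each matching line into a structured record.

    Lines that do not match the assumed format are skipped.
    """
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            continue
        yield {
            "date": match.group("date"),
            "client_id": match.group("client_id"),
            "amount": float(match.group("amount")),
        }


def load(records, out):
    """Load: write records as CSV to a file-like target and return the row count."""
    writer = csv.DictWriter(out, fieldnames=["date", "client_id", "amount"])
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)
        count += 1
    return count


def run_etl(raw_text):
    """Run extract, transform, and load in sequence against an in-memory target."""
    out = io.StringIO()
    count = load(transform(extract(raw_text)), out)
    return count, out.getvalue()
```

Calling `run_etl` on a few raw lines returns the number of rows loaded together with the resulting CSV text. Keeping the three stages as separate generator functions means each one can be swapped out independently, for example replacing `load` with a writer that targets a real warehouse table.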