
Data process optimization with Databricks for accurate customer churn prediction

Data integration, warehousing, and analytics with Azure Databricks

Our client, a subscription-based meat delivery company, wanted to optimize the data processes that integrated multiple data sources and powered analytics and AI modeling, so that they could extract customer behavioral patterns and reduce customer churn. They had 26 data sources, including financial and marketing data, with an initial Azure Data Lake in place and Azure Data Factory handling ETL processes.

Challenges

The client had 26 data sources, including their e-commerce platform and ERP. With rising competition, providing personalized customer experiences was essential for growth, but they were facing challenges in ensuring data quality and reliable reporting.

The client was using Azure Data Lake and Azure Data Factory for data storage and processing. The major issue with this setup was the lack of data observability: there was no visibility into the data pipelines and no system in place to track metrics, metadata, data lineage, or logs.

    • The lack of data observability made it difficult for data engineers to understand actions performed against DBFS, Clusters, Accounts, Jobs, Notebooks, SSH, Workspace, Secrets, SQL Permissions, and Instance Pools.
    • Data was sometimes used by dependent processes before all upstream jobs were complete, introducing errors in the process. The lack of visibility degraded data quality, and erroneous data increased the risk of errors in decision-making.
    • Monitoring and troubleshooting were also challenging due to a lack of visibility.

Solution

We proposed a solution using the Databricks platform for data integration, warehousing, and analytics, with Azure Data Lake for storage.

    • Simplified the data architecture by integrating Azure Data Lake with Databricks for data warehousing, and used Fivetran to connect data from Shopify, Infor M3, sensors, and 23 other data sources. The unified data feeds ML models that provide insights to optimize customer engagement and drive growth.
    • Implemented Unity Catalog for data governance to manage data assets and metadata. Automated data lineage tracking with Unity Catalog helps ensure data quality across all workloads, supports impact analysis and change management for data changes across the lakehouse, and enables root cause analysis of errors in the data pipelines.
    • Built an alerting system using Azure Logic Apps to stop dependent jobs if something goes wrong with a process or if files are not available on time.
    • Streamlined data flow through the bronze, silver, and gold layers of the medallion architecture, reducing the manual effort required to fix duplicate data and reporting errors.
    • Created 3+ ML models, including Customer Segmentation, Lifetime Value Prediction, and Churn Prediction, to surface insights on customer segments and high-value customers. These insights support a focused marketing approach and proactive measures, such as offering targeted incentives to retain customers.
    • Monitored and maintained the existing infrastructure and ML models on a weekly basis to track accuracy, reliability, and performance, check for data or model drift, and decide on model retraining, optimization, or retirement.
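
The alerting rule described above (hold dependent jobs when an upstream process fails or its files are late) can be illustrated with a minimal Python sketch. This is not the client's Azure Logic Apps workflow; the `UpstreamStatus` type, the job names, and the deadline are all hypothetical and exist only to show the decision logic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class UpstreamStatus:
    """Hypothetical snapshot of one upstream job (not a Databricks or Azure type)."""
    name: str
    succeeded: bool
    file_arrived_at: Optional[datetime]  # None if the expected file never landed


def blocking_reasons(upstreams: List[UpstreamStatus], deadline: datetime) -> List[str]:
    """Return the reasons to hold dependent jobs; an empty list means safe to run."""
    reasons = []
    for job in upstreams:
        if not job.succeeded:
            reasons.append(f"{job.name}: job failed or has not completed")
        elif job.file_arrived_at is None:
            reasons.append(f"{job.name}: expected file is missing")
        elif job.file_arrived_at > deadline:
            reasons.append(f"{job.name}: file arrived after the deadline")
    return reasons


def should_run_dependents(upstreams: List[UpstreamStatus], deadline: datetime) -> bool:
    """Dependent jobs run only when no upstream job reports a blocking reason."""
    return not blocking_reasons(upstreams, deadline)


if __name__ == "__main__":
    # Illustrative values only; real statuses would come from the pipeline's own logs.
    deadline = datetime(2024, 1, 1, 6, 0)
    upstreams = [
        UpstreamStatus("orders_ingest", True, deadline - timedelta(minutes=30)),
        UpstreamStatus("erp_ingest", True, None),
    ]
    for reason in blocking_reasons(upstreams, deadline):
        print("ALERT:", reason)
    print("Run dependents:", should_run_dependents(upstreams, deadline))
```

Returning the list of reasons, rather than only a yes/no answer, means the same check can drive both the alert message and the decision to stop downstream jobs.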

Outcomes

    • Reported 7x better performance after moving to Databricks, attributed to auto-scaling, parallel processing, caching, and memory optimization, along with faster development and deployment of data-driven solutions.
    • Ensured the quality of data accessed by various business users across the organization.
    • 31% reduction in the manual effort required to fix data errors.
    • 20% reduction in overall cloud computing costs.
    • Enabled faster data-driven decision-making for customer retention through accurate churn prediction.
    • Improved overall employee productivity and operational efficiency by using a single platform for data engineering, data science and modeling, and MLOps.

Reach out to our team

Schedule a conversation with our technology experts

Share your requirements with us and our team will contact you within one business day to schedule a personalized consultation.

 

Once you connect with our technology leaders, they will evaluate your specific business case and share a proof of concept with estimates of costs, the effort required in terms of technologies and developers, and the timeline for the process.

Request free consultation