Unchain your (Data-)Heart

Dr. Jan Scharnetzky


D ONE & Pax: Our Journey from SAP BW to Databricks

Recently, we had the opportunity to present Pax's migration from SAP Business Warehouse (BW) to Databricks at the Data + AI Summit in San Francisco. The presentation was met with great interest, given the widespread use of legacy SAP BW systems and the end of life for BW on HANA in 2028. The talk was completely booked out.

In retrospect, Pax's decision to build the PaxDataHub on Databricks was a strategic business decision to advance the goal of becoming a data-driven company, with technology being only one part of it. The first part of this post addresses the technical feasibility of the migration. The second part covers the business decision-making process that led to this move, which was based on D ONE's foundation for data-driven value creation.

Image 1: The principal architecture of the PaxDataHub

Lakehouse approach
The objective of establishing a unified data analytics system as a single source of truth is widely acknowledged; however, consolidating data from multiple operational systems into one analytics system poses significant challenges. Our lakehouse approach revolves around Databricks for transformation and relies on Azure Data Factory for ingestion. Reports are generated using Power BI, as depicted in Image 1. The transformation section of our lakehouse is divided into two Databricks workspaces: a factory for production workflows with no end-user access, and a laboratory for self-service purposes. The laboratory has read-only access to all data products of the factory.

Image 2: Zoom-in on the data ingestion of SAP ERP sources

Azure Data Factory orchestrates ingestion, with native, Azure-supported connectors to all our relevant data sources, here categorized into SAP and non-SAP data sources. The SAP CDC connector enables direct access to the ODP framework of the SAP ERP system. Setting up the virtual machine with a self-hosted integration runtime is well documented, and the change-data-capture feature enables incremental data loading into the Azure storage account, as indicated in Image 2.
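To illustrate what incremental loading means in practice, here is a minimal sketch of how a batch of change records could be applied to an existing snapshot. It is plain Python for illustration only and is not the SAP CDC connector's implementation: the field names `id` and `op` and the operation codes `I`, `U`, and `D` are assumptions made for this example.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_changes(
    snapshot: Dict[Any, Row],
    changes: Iterable[Row],
    key: str = "id",        # hypothetical key column
    op_field: str = "op",   # hypothetical operation column
) -> Dict[Any, Row]:
    """Apply ordered change records to a keyed snapshot and return a new snapshot."""
    result = dict(snapshot)  # copy so the input snapshot is not mutated
    for change in changes:
        record_key = change[key]
        op = change[op_field]
        # Keep all fields except the operation marker itself.
        payload = {name: value for name, value in change.items() if name != op_field}
        if op == "D":
            # Delete: drop the row if it exists.
            result.pop(record_key, None)
        elif op in ("I", "U"):
            # Insert or update: the latest record for a key wins.
            result[record_key] = payload
        else:
            raise ValueError(f"Unknown operation code: {op!r}")
    return result


# Example with made-up data: one update, one delete, one insert.
current = {
    1: {"id": 1, "premium": 100},
    2: {"id": 2, "premium": 250},
}
delta = [
    {"id": 1, "premium": 120, "op": "U"},
    {"id": 2, "op": "D"},
    {"id": 3, "premium": 90, "op": "I"},
]
updated = apply_changes(current, delta)
```

Only the changed rows travel from the source to the storage account, which is what keeps the regular loads small compared with a full extract.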

Concentrate on creating business value

With the ingestion and transformation parts established, our architecture was ready for its first use case, and we could concentrate on creating business value. As highlighted in the introduction, this was not only a technical transformation but also a significant business transformation. We addressed the setup by making use of D ONE's foundation for data-driven value creation (Image 3): together we established a culture of co-creation, conducted interviews within the organization, deep-dived into data modeling and visualization, and of course also tackled the technological challenges. All of this led to Pax's decision to migrate to Databricks to advance the company's goal of becoming a data-driven company.

Success through collaboration
The success of the project depends not only on the technicalities of our architecture, but also on the collaboration between skilled SAP BW experts with years of experience in the life insurance business and experienced Databricks and cloud engineers. Together, we decided to tackle the most complex use case implemented in the BW to demonstrate the value of the PaxDataHub. Although it was not an easy start, we were able to deliver within the set timeline and budget while also enjoying a productive work environment.

Furthermore, we gained many technical insights from the initial use case. For instance, we could calculate the total costs of Databricks because we knew how long a computationally expensive job took in both SAP BW and Databricks. The total cost of ownership for the new PaxDataHub was calculated to be around one third of the previous setup, with an 80% decrease in runtime for the job tested.
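As a rough sketch of how such an estimate can be built from a measured runtime: on Azure, a Databricks job is billed through Databricks Units (DBUs) plus the underlying virtual machines, both proportional to runtime. The function names and all input values below are placeholders chosen for illustration; they are not figures from the Pax project.

```python
def runtime_after_reduction(baseline_hours: float, reduction: float) -> float:
    """Runtime after a relative reduction, e.g. reduction=0.8 for an 80% decrease."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_hours * (1.0 - reduction)


def job_cost(
    runtime_hours: float,
    dbu_per_hour: float,       # DBUs the cluster consumes per hour
    price_per_dbu: float,      # price of one DBU
    vm_price_per_hour: float,  # cloud VM price for the whole cluster per hour
) -> float:
    """Cost of one job run: DBU charge plus VM charge, both scaled by runtime."""
    if runtime_hours < 0:
        raise ValueError("runtime_hours must not be negative")
    return runtime_hours * (dbu_per_hour * price_per_dbu + vm_price_per_hour)


# Placeholder inputs, not project data.
new_runtime = runtime_after_reduction(baseline_hours=10.0, reduction=0.8)
cost_per_run = job_cost(
    runtime_hours=new_runtime,
    dbu_per_hour=4.0,
    price_per_dbu=0.5,
    vm_price_per_hour=1.0,
)
```

Multiplying the cost per run by the number of scheduled runs gives the compute share of the total cost of ownership; storage, licenses, and operations have to be added separately.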

Image 3: D ONE's foundation for data-driven value creation

Proof of value drives the project

The completion of the proof of value instilled confidence in the entire company that the PaxDataHub could serve as the company-wide analytics platform. In addition, we conducted interviews across all business units to understand their requirements for a modern analytics platform. The outcome was summarized into four key features: a unified data lake, a business-ready data model, a company-wide self-service platform, and, as a result, a common understanding of the data across the entire company. Analyzing the requirements revealed that the proposed setup fulfilled most of the demands. The remaining needs were platform-independent, and addressing them requires considerable effort, particularly in data modeling and in establishing a shared business understanding of the data. We are actively working on these challenges, and despite their complexity, we are progressing well and already enjoying the benefits.

Call to action

If you are considering migrating from SAP BW to Databricks, know that you are not alone. Many companies are facing similar situations and are taking large steps towards modern analytics platforms. If I can support you on that journey, please reach out to me.

Jan Scharnetzky
