Kiran Kumar

About Me

I'm a Data Engineer specializing in building scalable, efficient data platforms that drive business value. With deep expertise in Apache Spark, Delta Lake, and Databricks on Azure, I architect lakehouse solutions that prioritize data governance, cost optimization, and performance.

My approach combines technical excellence with business acumen, ensuring that data infrastructure not only scales but also delivers measurable ROI through intelligent design and implementation.

Professional Philosophy

Data engineering is not just about moving data — it's about creating robust, maintainable systems that enable organizations to make data-driven decisions at scale. I believe in:

  • Architecture-first thinking: Building foundations that scale
  • Cost-conscious engineering: Optimizing for performance and budget
  • Data governance by design: Security and quality from day one
  • Continuous learning: Staying ahead in a rapidly evolving field

Career Journey

Aug 2024 – Present

Staff Developer

CEDES · Berlin, Germany (Remote)

Responsible for data engineering, ML operations, and the Databricks platform on Azure.

Aug 2022 – Jul 2024

Technical Lead

relayr · Berlin, Germany

Led the development of an Equipment-as-a-Service (EaaS) IIoT solution, connecting OEMs, customers, IoT platforms, and the IFS FinOps ERP system from inception to implementation.

Jan 2021 – Oct 2022

Senior Data Engineer

relayr · Berlin, Germany

Managed terabytes of data ingestion from Kafka into distributed PostgreSQL (Citus) and Azure Data Lake for analytics, building the pipeline from scratch along with Grafana monitoring. Implemented recovery mechanisms between the hot and cold storage layers.
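As a minimal sketch of the data-lake side of such a pipeline — assuming Spark Structured Streaming (listed under Technical Expertise below); the topic name, broker address, paths, and schema here are illustrative, not taken from the actual system:

```scala
// Hedged sketch: one common Kafka -> Azure Data Lake ingestion pattern.
// All names (topic, broker, storage account, paths) are illustrative.
import org.apache.spark.sql.SparkSession

object KafkaToLakeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-to-lake").getOrCreate()

    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // illustrative
      .option("subscribe", "device-events")             // illustrative topic
      .option("startingOffsets", "earliest")
      .load()

    // Keep the payload plus Kafka metadata (partition/offset) so records
    // can be replayed or reconciled between hot and cold layers.
    val events = raw.selectExpr(
      "CAST(value AS STRING) AS json",
      "topic", "partition", "offset", "timestamp")

    events.writeStream
      .format("delta") // cold layer on Azure Data Lake
      .option("checkpointLocation",
        "abfss://lake@account.dfs.core.windows.net/_chk/events")
      .start("abfss://lake@account.dfs.core.windows.net/bronze/events")
      .awaitTermination()
  }
}
```

The checkpoint location is what makes the stream restartable: on failure, Spark resumes from the last committed Kafka offsets rather than re-ingesting from scratch.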

Aug 2018 – Jan 2021

Scala IoT Developer

relayr · Berlin, Germany

Engineered a robust solution to eliminate frequent downtime in critical IoT ingestion points, achieving 99.999% availability. Led end-to-end development of an enterprise HiveMQ MQTT-based system supporting 30,000 IoT devices and processing 15,000 messages per second.

Jul 2017 – Jul 2018

Software Developer

Springer Nature · Pune, India

Migrated research-content data from a 25-year-old SQL database to MongoDB using Kafka Connect-based Change Data Capture, implementing Kafka Connect and Akka Streams pipelines for the transition.
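To illustrate the Kafka Connect side of such a migration — a hedged sketch using the standard Confluent JDBC source connector; the connection URL, table columns, and topic prefix below are invented for the example, not details of the actual migration:

```json
{
  "name": "legacy-sql-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:sqlserver://legacy-db:1433;databaseName=research",
    "mode": "timestamp+incrementing",
    "timestamp.column.name": "updated_at",
    "incrementing.column.name": "id",
    "topic.prefix": "legacy."
  }
}
```

The `timestamp+incrementing` mode captures both updates and inserts without requiring transaction-log access on the legacy database; a matching sink (here, a MongoDB sink connector or an Akka Streams consumer) then writes the `legacy.*` topics into target collections.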

Feb 2016 – Jun 2017

System Analyst

Bitwise Inc · Pune, India

Led creation of the Hydrograph ETL tool from scratch using the Eclipse SDK, RCP plugin framework, and GEF, integrated with a Scala and Hadoop backend. Delivered the PoC, design, development, and distribution.

Jul 2015 – Feb 2016

Program Analyst

Bitwise Inc · Pune, India

Key contributor to a Test Data Management web app built from scratch on Apache Felix (OSGi). Developed client-specific applications using IBM BPM.

Jul 2014 – Jul 2015

Software Engineer

Tech Mahindra · Pune, India

Convinced both the team and the department head to adopt AngularJS for the ActiveVOS BPM tool's frontend, delivering a feature-rich UI.

Jun 2012 – Jun 2014

Associate Software Engineer

Tech Mahindra · Pune, India

Independently learned the ActiveVOS BPM tool, quickly ramped up on the project, and became a key contributor to the overall solution.

Technical Expertise

Data Platform

Apache Spark · Spark Structured Streaming · Delta Lake · Databricks · Unity Catalog · Azure Data Lake · MLOps

Streaming & Messaging

Apache Kafka · Kafka Connect · HiveMQ MQTT · Akka Streams

Cloud & Infrastructure

Azure · Azure Databricks · Azure Dashboards · Grafana

Languages

Scala · Python · SQL

Databases

PostgreSQL · MongoDB · Elasticsearch · Azure Data Explorer (ADX)

Architecture

Lakehouse · Medallion Architecture · IIoT · Data Governance