I build scalable data pipelines, ML systems, and analytics platforms, blending data engineering and data science to turn raw data into actionable insights.
End-to-end data systems built for scale, from real-time pipelines to ML-powered applications.
projects
A deep dive into e-commerce data using K-Means clustering to uncover the hidden relationship between delivery performance and customer retention.
data engineering
Building a scalable recommendation engine with Lambda Architecture, Kafka, Flink, and Airflow
software engineering
Building an opportunistic data offload system using ESP32 and Raspberry Pi Pico to extract diagnostic telemetry from trains on the 740 km Konkan route, where LTE is absent for hours at a time
Deep dives into algorithms, statistics, and ML theory, implemented from first principles.
Statistics
Building a complete experimentation framework, from hypothesis testing to Bayesian inference
ML
Understanding the mathematics of reducing high-dimensional data, from eigendecomposition to manifold learning
ML
Implementing core ML algorithms from first principles using only NumPy
Building data solutions across industries, from insurtech analytics to edge inference systems.
2021 - 2023
Siemens Ltd.
ETL • Python • Data Pipelines
2024
Habitat
Edge Inference • CI/CD
2025 - Present
RiskPe
PySpark • SQL • Power BI