I build scalable data pipelines, ML systems, and analytics platforms, blending data engineering and data science to turn raw data into actionable insights.
End-to-end data systems built for scale, from real-time pipelines to ML-powered applications.
projects
A deep dive into e-commerce data using K-Means clustering to uncover the hidden relationship between delivery performance and customer retention.
data engineering
Building a scalable recommendation engine with Lambda Architecture, Kafka, Flink, and Airflow
data engineering
Building an enterprise-grade Retrieval-Augmented Generation system with dbt, Qdrant, and LangChain
Deep dives into algorithms, statistics, and ML theory, implemented from first principles.
Statistics
Building a complete experimentation framework, from hypothesis testing to Bayesian inference
ML
Understanding the mathematics of dimensionality reduction for high-dimensional data, from eigendecomposition to manifold learning
ML
Implementing core ML algorithms from first principles using only NumPy
Building data solutions across industries, from insurtech analytics to edge inference systems.
2021 - 2023
Siemens Ltd.
ETL • Python • Data Pipelines
2024
Habitat
Edge Inference • CI/CD
2025 - Present
RiskPe
PySpark • SQL • Power BI