MSc Data Science, University of Bath (Dean's Award for Academic Excellence, 2025)
SQL · Power BI · Python · Tableau · Azure (In Progress)
London, UK | Graduate Visa — Full Work Rights Until February 2027
I build data infrastructure that makes reporting reliable and decisions defensible.
My background spans enterprise data quality at TCS (5M+ row validation pipelines), analytics automation at Predictea Digital, EV telemetry analysis at Tata Motors, and commercial KPI operations in UK retail. My MSc dissertation applied transformer-based NLP (DistilBERT) to sentiment classification, raising macro-F1 from 0.66 with SVM baselines to 0.78.
I focus on three things: clean data foundations, business-ready reporting, and pipelines that work in production — not just in notebooks.
SQL · Python · Power BI · Star Schema · DAX · Row-Level Security
End-to-end retail analytics solution processing 200,000+ transactional records. SQL ETL pipeline with Bronze-Silver-Gold architecture, star schema data model, and a multi-page Power BI dashboard with DAX time intelligence, Z-score anomaly detection on shrinkage, row-level security (RLS), and deployment pipeline configuration. Reduces reporting time from 45 minutes to under 5 minutes.
Business value: Enables operations directors to monitor sales, labour cost %, shrinkage, and availability KPIs in real time rather than reviewing lagged weekly reports.
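A minimal sketch of z-score anomaly flagging on shrinkage, written in plain Python for illustration. The function name, the sample figures, and the threshold of 2.0 are assumptions for this example, not values taken from the project, and the dashboard's own implementation may differ.

```python
from statistics import mean, pstdev


def flag_shrinkage_anomalies(shrinkage_pct, threshold=2.0):
    """Return the indices of values whose absolute z-score exceeds the threshold."""
    mu = mean(shrinkage_pct)
    sigma = pstdev(shrinkage_pct)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [
        i for i, value in enumerate(shrinkage_pct)
        if abs((value - mu) / sigma) > threshold
    ]


# Hypothetical weekly shrinkage percentages for one store.
weekly_shrinkage = [1.1, 1.2, 1.0, 1.3, 1.1, 1.2, 4.8, 1.0]
anomalies = flag_shrinkage_anomalies(weekly_shrinkage)
print(anomalies)  # index of the week that looks unusual
```

With only a handful of observations the largest possible z-score is bounded, so a threshold of 3.0 would never fire on a sample this small; the threshold should be chosen with the window size in mind.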
Azure Data Factory · Azure Blob Storage · Azure SQL · Power BI · Python · HM Land Registry API
Production-pattern cloud data pipeline: HM Land Registry API → Azure Blob Storage (Bronze) → Azure Data Factory transformation → Azure SQL Database (star schema, Gold) → Power BI with scheduled monthly refresh. 4.5M+ property transaction records from 2020–2025. Full architecture documentation with ADF pipeline exports, SQL DDL, and Python ingestion scripts.
Business value: Demonstrates full modern data stack — ingestion, orchestration, transformation, storage, visualisation — on real government open data.
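To illustrate the Bronze-to-Gold step, here is a rough sketch of splitting flat records into a dimension and a fact table with surrogate keys. The field names (`district`, `sale_date`, `price`) and the sample rows are hypothetical and do not reflect the actual HM Land Registry schema or the project's SQL DDL.

```python
def build_star(records):
    """Split flat rows into a district dimension and a fact table.

    Returns (dim_district, fact_sales), where dim_district maps each
    district name to a surrogate integer key.
    """
    dim_district = {}
    fact_sales = []
    for row in records:
        # Reuse the existing key, or assign the next one for a new district.
        key = dim_district.setdefault(row["district"], len(dim_district) + 1)
        fact_sales.append(
            {"district_key": key, "sale_date": row["sale_date"], "price": row["price"]}
        )
    return dim_district, fact_sales


# Hypothetical raw (Bronze) rows.
bronze_rows = [
    {"district": "BATH", "sale_date": "2021-03-01", "price": 350000},
    {"district": "CAMDEN", "sale_date": "2022-07-15", "price": 720000},
    {"district": "BATH", "sale_date": "2023-01-09", "price": 410000},
]
dim_district, fact_sales = build_star(bronze_rows)
print(dim_district)
print(fact_sales)
```

Keeping descriptive attributes in the dimension and only keys and measures in the fact table is what lets the reporting layer filter and aggregate cheaply.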
Power BI · Advanced DAX · SQL · Companies House Data · CFO Reporting
Multi-company financial intelligence dashboard using UK Companies House filing data. Covers 5-year P&L trend analysis, actual vs budget variance reporting, peer benchmarking, and EBITDA margin decomposition. Built for a CFO/finance director audience with role-appropriate bookmark views, documented assumptions, and a star schema across 5 tables.
Business value: Reduces monthly board pack preparation from days to hours while enabling real-time peer benchmarking.
Python · HuggingFace · DistilBERT · Scikit-learn · NLP · transformers-interpret
Comparative study on sentiment classification of customer feedback using traditional models (SVM, Logistic Regression) versus fine-tuned transformer models. Domain-specific fine-tuning of DistilBERT with custom tokens improved macro-F1 from 0.66 → 0.78, particularly for neutral and negative sentiment categories where traditional ML consistently underperforms. Model decisions validated using transformers-interpret for explainability.
University of Bath, MSc Data Science, 2025
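For readers unfamiliar with the metric, macro-F1 is the unweighted mean of per-class F1 scores, so minority classes such as neutral and negative count as much as the majority class. The sketch below computes it by hand on made-up labels; the data is hypothetical and unrelated to the dissertation results. In practice scikit-learn's `f1_score(y_true, y_pred, average="macro")` does the same job.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical gold labels and predictions.
y_true = ["pos", "pos", "neu", "neg", "neg", "neu"]
y_pred = ["pos", "pos", "pos", "neg", "neu", "neu"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Here the positive class scores well while neutral and negative drag the average down, which is the behaviour that makes macro-F1 a stricter measure than accuracy on imbalanced feedback data.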
| Category | Skills |
|---|---|
| Languages | Python (Pandas, NumPy, Scikit-learn, SQLAlchemy), SQL (CTEs, Window Functions, Stored Procedures), R (basic) |
| BI & Visualisation | Power BI (DAX, RLS, Deployment Pipelines, Incremental Refresh), Tableau, Advanced Excel & Power Query |
| Data Engineering | ETL Pipeline Design, Star Schema Modelling, Azure Data Factory, Azure Blob Storage, Azure SQL |
| Cloud | Azure (ADF, Blob Storage, Azure SQL — in progress toward AZ-900/DP-203) |
| Other | Git/GitHub, Jupyter Notebooks, HuggingFace Transformers, Data Quality & Governance |
MSc Data Science — University of Bath, UK (Oct 2023 – Jan 2025)
Dean's Award for Academic Excellence — presented to top-performing postgraduate students for distinction-level achievement
Dissertation: Computational Analysis of Sentiment in Customer Feedback using Machine Learning
BEng Electronics & Telecommunications — Savitribai Phule Pune University, India (2018–2022)
Grade: 9.35/10 | First Class Honours equivalent
| Certification | Status | Target |
|---|---|---|
| Microsoft PL-300 — Power BI Data Analyst Associate | In Progress | March 2026 |
| Microsoft Applied Skills: Power BI Reporting | In Progress | February 2026 |
| Databricks Fundamentals | Planned | March 2026 |
| AZ-900 Azure Fundamentals | Planned | April 2026 |