anandi-mahure/README.md

Anandi Mahure — Data Analyst

MSc Data Science, University of Bath (Dean's Award for Academic Excellence, 2025)
SQL · Power BI · Python · Tableau · Azure (In Progress)
London, UK | Graduate Visa — Full Work Rights Until February 2027


About

I build data infrastructure that makes reporting reliable and decisions defensible.

My background spans enterprise data quality at TCS (validation pipelines over 5M+ rows), analytics automation at Predictea Digital, EV telemetry analysis at Tata Motors, and commercial KPI operations in UK retail. My MSc dissertation applied transformer-based NLP (DistilBERT) to sentiment classification, raising macro-F1 from 0.66 with SVM baselines to 0.78.

I focus on three things: clean data foundations, business-ready reporting, and pipelines that work in production — not just in notebooks.


Portfolio Projects

SQL · Python · Power BI · Star Schema · DAX · Row-Level Security

End-to-end retail analytics solution processing 200,000+ transactional records: a SQL ETL pipeline with Bronze-Silver-Gold architecture, a star schema data model, and a multi-page Power BI dashboard with DAX time intelligence, Z-score anomaly detection on shrinkage, row-level security (RLS), and deployment pipeline configuration. Cuts reporting time from 45 minutes to under 5 minutes.

Business value: Enables operations directors to monitor sales, labour cost %, shrinkage, and availability KPIs in real time rather than reviewing lagged weekly reports.
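
To illustrate the Z-score anomaly detection mentioned above, here is a minimal Python sketch. It is not the project's own implementation (which this README does not show and which may live in DAX or SQL). The function name, the threshold of 2.0, and the sample figures are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def flag_shrinkage_anomalies(shrinkage_pct, threshold=2.0):
    """Return the indices of values whose absolute Z-score exceeds the threshold.

    `threshold` is a hypothetical default; the real dashboard may use another cut-off.
    """
    mu = mean(shrinkage_pct)
    sigma = pstdev(shrinkage_pct)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [
        i
        for i, value in enumerate(shrinkage_pct)
        if abs((value - mu) / sigma) > threshold
    ]


# Hypothetical weekly shrinkage percentages for one store (made-up data).
weekly_shrinkage = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 5.0]
anomalies = flag_shrinkage_anomalies(weekly_shrinkage)
```

With the made-up series above, only the last week sits far enough from the mean to be flagged.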


Azure Data Factory · Azure Blob Storage · Azure SQL · Power BI · Python · HM Land Registry API

Production-pattern cloud data pipeline: HM Land Registry API → Azure Blob Storage (Bronze) → Azure Data Factory transformation → Azure SQL Database (star schema, Gold) → Power BI with scheduled monthly refresh. 4.5M+ property transaction records from 2020–2025. Full architecture documentation with ADF pipeline exports, SQL DDL, and Python ingestion scripts.

Business value: Demonstrates a full modern data stack (ingestion, orchestration, transformation, storage, and visualisation) on real government open data.
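
As a rough picture of the Bronze-to-Gold step, the sketch below splits flat transaction records into a dimension table and a fact table, which is the core move behind a star schema. It is an illustrative assumption only: the field names (`district`, `date`, `price`) and the function are hypothetical and are not taken from the project's ADF pipeline or SQL DDL.

```python
def build_star_schema(raw_rows):
    """Split flat records into a district dimension and a sales fact table.

    Returns (dim_district, fact_sales), where dim_district maps a cleaned
    district name to a surrogate integer key.
    """
    dim_district = {}
    fact_sales = []
    for row in raw_rows:
        # Standardise the natural key so "Bath " and "bath" share one dimension row.
        district = row["district"].strip().upper()
        # Reuse the existing surrogate key, or assign the next one.
        key = dim_district.setdefault(district, len(dim_district) + 1)
        fact_sales.append(
            {
                "district_key": key,
                "sale_date": row["date"],
                "price": int(row["price"]),
            }
        )
    return dim_district, fact_sales


# Made-up sample rows standing in for raw (Bronze) records.
raw = [
    {"district": "Bath ", "date": "2021-03-01", "price": "250000"},
    {"district": "LEEDS", "date": "2021-03-02", "price": "180000"},
    {"district": "bath", "date": "2021-04-10", "price": "310000"},
]
dim_district, fact_sales = build_star_schema(raw)
```

Keeping descriptive attributes in the dimension and only keys and measures in the fact table is what lets a BI tool filter and aggregate efficiently.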


Power BI · Advanced DAX · SQL · Companies House Data · CFO Reporting

Multi-company financial intelligence dashboard using UK Companies House filing data. Covers 5-year P&L trend analysis, actual vs budget variance reporting, peer benchmarking, and EBITDA margin decomposition. Built for a CFO/finance director audience with role-appropriate bookmark views, documented assumptions, and fully normalised star schema across 5 tables.

Business value: Reduces monthly board pack preparation from days to hours while enabling real-time peer benchmarking.
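
The arithmetic behind actual vs budget variance and EBITDA margin is simple enough to sketch in a few lines of Python. This is an illustration under assumed conventions (variance as actual minus budget, margin as EBITDA over revenue); the dashboard itself presumably computes these in DAX, and the function names and figures here are hypothetical.

```python
def budget_variance(actual, budget):
    """Return (absolute variance, variance as a fraction of budget).

    The fractional variance is None when the budget is zero.
    """
    variance = actual - budget
    pct = variance / budget if budget else None
    return variance, pct


def ebitda_margin(ebitda, revenue):
    """Return EBITDA as a fraction of revenue, or None when revenue is zero."""
    return ebitda / revenue if revenue else None


# Made-up figures (in thousands of pounds) for illustration only.
variance, variance_pct = budget_variance(actual=110.0, budget=100.0)
margin = ebitda_margin(ebitda=25.0, revenue=100.0)
```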


Python · HuggingFace · DistilBERT · Scikit-learn · NLP · transformers-interpret

Comparative study on sentiment classification of customer feedback using traditional models (SVM, Logistic Regression) versus fine-tuned transformer models. Domain-specific fine-tuning of DistilBERT with custom tokens improved macro-F1 from 0.66 → 0.78, particularly for neutral and negative sentiment categories where traditional ML consistently underperforms. Model decisions validated using transformers-interpret for explainability.

University of Bath, MSc Data Science, 2025
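
For readers unfamiliar with the metric, macro-F1 is the unweighted mean of per-class F1 scores, so minority classes such as neutral or negative count as much as the majority class. The pure-Python sketch below shows the calculation on made-up labels. It is not the dissertation's evaluation code, which more likely relied on scikit-learn's `f1_score` with `average="macro"`.

```python
def macro_f1(y_true, y_pred):
    """Compute macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); every label here appears in y_true or
        # y_pred, so denom is never zero, but the guard keeps the formula safe.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels, not results from the study.
y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally, a model that ignores a rare class is penalised even if its overall accuracy looks high.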


Technical Skills

| Category | Skills |
| --- | --- |
| Languages | Python (Pandas, NumPy, Scikit-learn, SQLAlchemy), SQL (CTEs, Window Functions, Stored Procedures), R (basic) |
| BI & Visualisation | Power BI (DAX, RLS, Deployment Pipelines, Incremental Refresh), Tableau, Advanced Excel & Power Query |
| Data Engineering | ETL Pipeline Design, Star Schema Modelling, Azure Data Factory, Azure Blob Storage, Azure SQL |
| Cloud | Azure (ADF, Blob Storage, Azure SQL; in progress toward AZ-900/DP-203) |
| Other | Git/GitHub, Jupyter Notebooks, HuggingFace Transformers, Data Quality & Governance |

Education

MSc Data Science — University of Bath, UK (Oct 2023 – Jan 2025)
Dean's Award for Academic Excellence — presented to top-performing postgraduate students for distinction-level achievement
Dissertation: Computational Analysis of Sentiment in Customer Feedback using Machine Learning

BEng Electronics & Telecommunications — Savitribai Phule Pune University, India (2018–2022)
Grade: 9.35/10 | First Class Honours equivalent


Certifications (In Progress)

| Certification | Status | Target |
| --- | --- | --- |
| Microsoft PL-300: Power BI Data Analyst Associate | In Progress | March 2026 |
| Microsoft Applied Skills: Power BI Reporting | In Progress | February 2026 |
| Databricks Fundamentals | Planned | March 2026 |
| AZ-900 Azure Fundamentals | Planned | April 2026 |

Connect

LinkedIn Email

Popular repositories

  1. data-analysis-foundations (Jupyter Notebook)

  2. anandi-mahure: Config files for my GitHub profile.

  3. machine-learning-university-of-bath (Jupyter Notebook)

  4. retail-kpi-analytics (Jupyter Notebook): End-to-end retail analytics system with a SQL ETL pipeline, star schema, and Power BI KPI dashboard with DAX, RLS, and deployment pipeline. Processes 200K+ transactional records.