A unified modeling and dashboard project that quantifies how player movement across teams, divisions, and conferences affects team performance, covering:
- In-season moves (trades)
- Off-season moves (free agency)
Build a decision-ready analytics dashboard that answers:
- Which roster moves changed expected team outcomes the most?
- How much of team performance change is attributable to incoming/outgoing players?
- Do cross-division and cross-conference moves have systematically different impact profiles?
Scope:
- Seasons: the last 5 completed NFL seasons
- Move types: in-season trades, off-season free agent signings
- Outcomes:
- Team win percentage
- Point differential per game
- Offensive EPA per play
- Geography dimensions:
- Team, division, conference
- Outputs:
- Team impact scorecards
- Movement timeline
- Scenario simulation card
- Uncertainty intervals for all impact estimates
Out of scope for the MVP (future extensions):
- Draft pick value propagation
- Contract value efficiency overlays
- Injury shock decomposition
- Playoff probability calibration model
Proposed monorepo layout:
```
nflanalysis/
  README.md
  docs/
    metric-spec.md
    modeling-notes.md
    data-dictionary.md
    adr/
  data/
    raw/
    external/
    processed/
  pipelines/
    ingestion/
    features/
    validation/
  models/
    baseline/
    hierarchical/
    simulation/
    artifacts/
  api/
    app/
    schemas/
    tests/
  dashboard/
    src/
    public/
    tests/
  .github/
    ISSUE_TEMPLATE/
    PULL_REQUEST_TEMPLATE.md
    workflows/
```
Required data inputs:
- Player movement events
- trade date, effective week/date
- signing date, contract start
- source team, destination team
- Player context
- position, age, experience
- snap share, usage role
- injury availability signal
- Weekly team performance metrics (offense/defense/special teams)
- Opponent strength and schedule features
- Coaching/scheme continuity flags
- Division/conference indicators
Validation rules:
- Every movement event must have both a source and a destination entity
- Event dates must align to the NFL week calendar mapping
- Missingness checks on key model features must be tracked per run
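The three validation rules above can be sketched as a batch check. Column names (`source_team`, `dest_team`, `event_date`, and so on) are placeholders for whatever the canonical movement schema settles on:

```python
import pandas as pd

def validate_movement_events(events: pd.DataFrame, calendar: pd.DataFrame) -> dict:
    """Return per-rule failure counts for one batch of movement events."""
    failures = {}

    # Rule 1: every event needs both a source and a destination entity.
    failures["missing_endpoint"] = int(
        events["source_team"].isna().sum() + events["dest_team"].isna().sum()
    )

    # Rule 2: event dates must map onto the NFL week calendar table.
    merged = events.merge(
        calendar, left_on="event_date", right_on="date", how="left", indicator=True
    )
    failures["unmapped_date"] = int((merged["_merge"] == "left_only").sum())

    # Rule 3: track missingness on key model features per run.
    for col in ("player_id", "position", "event_type"):
        failures[f"missing_{col}"] = int(events[col].isna().sum())
    return failures
```

A run would persist these counts alongside the run ID so missingness trends are auditable over time.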
Modeling goal: estimate the marginal team-performance impact of player movement, not just correlations.
- Baseline interpretable model
- Regularized regression on team outcome deltas
- Hierarchical impact model
- Player-position-team random effects
- Partial pooling to stabilize sparse players/roles
- Counterfactual simulation layer
- Compute no-move vs observed-move team outcomes
- Difference-in-differences framing around event windows
- Pre-trend checks prior to movement event
- Controls for schedule difficulty, injuries, and coaching changes
- Report 50% and 90% intervals on impact estimates
- Flag low-confidence estimates in UI
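The interval-reporting rules above might look like the following helper, assuming impact estimates arrive as posterior or bootstrap draws; the low-confidence width threshold is an illustrative placeholder for the configurable value:

```python
import numpy as np

def summarize_impact(draws: np.ndarray, low_conf_width: float = 1.5) -> dict:
    """Summarize sampled impact estimates into a median, 50% and 90%
    intervals, and a low-confidence flag based on 90%-interval width."""
    q05, q25, med, q75, q95 = np.percentile(draws, [5, 25, 50, 75, 95])
    return {
        "median": med,
        "interval_50": (q25, q75),
        "interval_90": (q05, q95),
        "low_confidence": bool((q95 - q05) > low_conf_width),
    }
```

The UI would read `low_confidence` directly to decide when to flag an estimate.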
Dashboard pages (live from the main branch):
- Overview
- league-wide movement impact ranking
- Team page
- inbound/outbound movement cards
- pre/post trend charts
- Player movement explorer
- filter by season, position, team, division, conference
- Scenario sandbox
- remove/add move and recompute expected team delta
UI principles:
- Always show uncertainty next to the point estimate
- Distinguish observed outcomes from modeled counterfactuals
- Avoid causal-claim language on low-confidence estimates
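One way these principles could surface in an API payload; the field names below are illustrative, not the project's settled schema:

```python
from dataclasses import dataclass

@dataclass
class ImpactCard:
    """Hypothetical payload for one movement-impact card."""
    team: str
    observed_outcome: float         # what actually happened
    counterfactual_estimate: float  # modeled no-move outcome (kept separate)
    impact_median: float
    interval_50: tuple
    interval_90: tuple
    low_confidence: bool

    def headline(self) -> str:
        # Hedge the language when the estimate is low-confidence.
        verb = "may have changed" if self.low_confidence else "changed"
        return (
            f"{self.team}: moves {verb} expected outcome by "
            f"{self.impact_median:+.2f} "
            f"(90% interval {self.interval_90[0]:+.2f} to {self.interval_90[1]:+.2f})"
        )
```

Keeping the observed and counterfactual values as separate fields forces the front end to label them separately rather than blending them into one number.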
Milestones:
- Build event schema and ingestion jobs
- Create canonical movement table
- Publish data dictionary and quality checks
- Implement feature pipeline and baseline model
- Backtest on historical seasons
- Define baseline dashboard API payloads
- Train hierarchical model
- Add simulation service
- Compare baseline vs hierarchical calibration
- Implement dashboard pages and filters
- Add model cards + assumptions panel
- Validate end-to-end reproducibility and CI
Copy this directly into your issue backlog.
- Define canonical player movement schema (trade + FA)
- Build NFL week/date calendar mapping table
- Implement movement event ingestion pipeline
- Create player metadata normalization job
- Build team-week outcome aggregation table
- Add data quality checks for missing key fields
- Document data dictionary for all MVP tables
- Implement roster churn feature set by team-week
- Implement position-group value delta features
- Add schedule strength and opponent adjustments
- Build baseline regularized regression model
- Add time-based backtest split framework
- Implement pre-trend and placebo validation tests
- Build hierarchical player-position-team model
- Implement counterfactual simulation endpoint
- Define API schemas for dashboard cards/charts
- Build Overview dashboard page
- Build Team detail page with movement timeline
- Build Scenario sandbox with uncertainty output
- Add CI workflow for data validation + model regression tests
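The week/date calendar mapping item above can be sketched minimally, assuming each season's week boundaries are supplied as sorted week-start dates from an authoritative schedule source (the boundary dates below are placeholders):

```python
import bisect
from datetime import date

def build_week_lookup(week_starts):
    """week_starts: list of (season, week, first_day_of_week) tuples,
    sorted by date. Returns a function mapping a date to (season, week)."""
    starts = [d for _, _, d in week_starts]

    def lookup(d):
        # Find the latest week whose start date is on or before d.
        i = bisect.bisect_right(starts, d) - 1
        if i < 0:
            return None  # date precedes the first mapped week
        season, week, _ = week_starts[i]
        return season, week

    return lookup
```

Note that a date after the last mapped week silently falls into that week, so the real table should carry explicit end-of-season boundaries.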
Use this as the first model contract for analytics + product.
Movement Impact Score (MIS) for team t in period p:

$$\mathrm{MIS}_{t,p} = \hat{Y}^{\text{moves}}_{t,p} - \hat{Y}^{\text{no-move}}_{t,p}$$

Where:
- $\hat{Y}$ is predicted team performance under a fixed model
- Performance can be win%, point differential/game, or EPA/play
- The no-move term is the modeled counterfactual with the team's movement events removed

To compare across outcomes and seasons, standardize within each outcome and period:

$$\mathrm{MIS}^{z}_{t,p} = \frac{\mathrm{MIS}_{t,p} - \mu_{p}}{\sigma_{p}}$$

where $\mu_p$ and $\sigma_p$ are the league-wide mean and standard deviation of $\mathrm{MIS}$ in period $p$.

Total team movement impact decomposition:

$$\mathrm{MIS}_{t,p} = \sum_{e \in E_{t,p}} \mathrm{MIS}_{t,p}(e)$$

where $E_{t,p}$ is the set of movement events (trades and signings) involving team $t$ in period $p$, and $\mathrm{MIS}_{t,p}(e)$ is the marginal contribution of event $e$.
- Display median estimate
- Display 50% and 90% intervals
- Flag "low confidence" when interval width exceeds configurable threshold
Impact bands for the standardized score:
- High positive impact: MIS^z >= 1.0
- Moderate positive impact: 0.3 <= MIS^z < 1.0
- Neutral: -0.3 < MIS^z < 0.3
- Moderate negative impact: -1.0 < MIS^z <= -0.3
- High negative impact: MIS^z <= -1.0
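The banding reduces to a threshold lookup; a direct sketch of the table above:

```python
def classify_impact(mis_z: float) -> str:
    """Map a standardized Movement Impact Score (MIS^z) to its band,
    using the thresholds defined in the model contract."""
    if mis_z >= 1.0:
        return "high positive"
    if mis_z >= 0.3:
        return "moderate positive"
    if mis_z > -0.3:
        return "neutral"
    if mis_z > -1.0:
        return "moderate negative"
    return "high negative"
```

The boundary values (0.3 and 1.0 on each side) fall into the adjacent non-neutral band, matching the inclusive/exclusive bounds in the table.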
For the single-season offseason snapshot workflow, use the dedicated pipeline:

- Place raw inputs under data/raw/offseason/:
  - transactions_raw.csv
  - players_metadata.csv
  - team_spending_otc.csv
  - win_totals.csv
- Build canonical tables:

```
/usr/bin/python3 pipelines/offseason/ingest_offseason_snapshot.py \
  --transactions data/raw/offseason/transactions_raw.csv \
  --players data/raw/offseason/players_metadata.csv \
  --win-totals data/raw/offseason/win_totals.csv \
  --season 2026 \
  --week 1
```

- Build features (including spending and win totals integration):

```
/usr/bin/python3 pipelines/offseason/build_offseason_team_features.py \
  --movement data/processed/movement_events.csv \
  --players data/processed/player_dimension.csv \
  --outcomes data/processed/team_week_outcomes.csv \
  --team-spending data/raw/offseason/team_spending_otc.csv \
  --win-totals data/raw/offseason/win_totals.csv \
  --output data/processed/team_week_features.csv
```

- Train baseline + hierarchical models locally (no deployment):

```
/usr/bin/python3 models/baseline/train_baseline_model.py \
  --features data/processed/team_week_features.csv \
  --outcomes data/processed/team_week_outcomes.csv \
  --output data/processed/model_outputs.csv \
  --coefficients-output models/artifacts/baseline_coefficients.csv \
  --model-version baseline-ridge-v0.2.0-offseason

/usr/bin/python3 models/hierarchical/train_hierarchical_model.py \
  --features data/processed/team_week_features.csv \
  --outcomes data/processed/team_week_outcomes.csv \
  --movement data/processed/movement_events.csv \
  --players data/processed/player_dimension.csv \
  --output data/processed/model_outputs_hierarchical.csv \
  --effects-output models/artifacts/hierarchical_effects.csv \
  --model-version hierarchical-eb-v0.2.0-offseason
```

See pipelines/offseason/README.md for details.
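The hierarchical model's partial pooling (the `-eb` suffix suggests an empirical-Bayes flavor) can be illustrated with a minimal shrinkage sketch; the prior strength `k` here is an illustrative placeholder, not a value estimated by the trained model:

```python
import numpy as np

def pooled_effect(player_obs: np.ndarray, league_mean: float, k: float = 8.0) -> float:
    """Shrink a player's raw average effect toward the league mean.
    With few observations the estimate leans on the league mean; with
    many, it approaches the player's own average (partial pooling)."""
    n = len(player_obs)
    if n == 0:
        return league_mean  # fully pooled when a player has no data
    weight = n / (n + k)
    return weight * float(np.mean(player_obs)) + (1 - weight) * league_mean
```

This is exactly the stabilization called for in the modeling plan: sparse players and roles borrow strength from the league-level estimate instead of reporting noisy raw deltas.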
The MVP is complete when all conditions are met:
- Data pipelines run end-to-end for last 5 seasons
- Baseline and hierarchical models are trained and versioned
- Counterfactual endpoint returns scenario deltas with intervals
- Dashboard exposes required filters and core pages
- CI validates data quality checks and model regression checks
- Model assumptions and caveats are documented in docs/
- Create milestone labels and project board columns:
- data-foundation, feature-engineering, modeling, dashboard, validation
- Open the first 20 issues and assign milestone + owner
- Start Milestone 1 by implementing canonical schema + ingestion