A curated list of awesome marketing science resources including geo incrementality testing, media mix models, multi-touch attribution, causal inference, and more from shakostats.com. Star ⭐ the repo if it helps you, and feel free to contribute your own favorite resources.

shakostats/Awesome-Marketing-Science

Awesome Marketing Science

A curated list of awesome machine learning libraries for marketing, including media mix models, multi-touch attribution, causal inference, and more, from shakostats.com.

Star ⭐ the repo if it helps you, and feel free to contribute your own favorite resources

Open Source Libraries

A collection of open source repositories and libraries.

Attribution

Marketing Mix Models (MMM)

  • BayesianMMM - Bayesian Media Mix Modeling with Python and PyMC3.
  • dammmdatagen - (R) Media Mix Modeling Data Generator.
  • lightweight-mmm - A lightweight Bayesian Marketing Mix Modeling library by Google.
  • mamimo - Small Media Mix Models designed to be used in conjunction with ML libraries (e.g. SKL).
  • mmm-stan - Marketing Mix Modeling with Stan.
  • pymc-marketing - Bayesian marketing mix modeling and customer lifetime value in Python.
  • Robyn - Facebook's automated Marketing Mix Modeling (MMM) code.
  • Meridian - Google's new open-source Bayesian MMM framework (successor to LightweightMMM).
  • Ecommerce Marketing Spend Optimization - Machine Learning model for optimizing marketing budget.
  • MMM Prior Elicitation - Tools for prior elicitation in MMM.

Geo Experimentation & Lift Testing

Causal Inference & Bayesian Analysis

Customer Analytics (CLV, Segmentation, Uplift)

  • causalml - Uplift modeling and causal inference with machine learning.
  • btyd - Buy Till You Die and CLV statistical models in Python.
  • lifetimes - Measure customer lifetime value in Python.
  • lucius-ltv - CLV for subscriptions.
  • amazon-denseclus - Python module for clustering both categorical and numerical data using UMAP and HDBSCAN by Amazon.
  • rfm - RFM Analysis and Customer Segmentation.
  • retentioneering-tools - Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization...
  • ecommercetools - Data science toolkit for those working in technical ecommerce, marketing science, and technical seo and includes a wide range of features to aid analysis and mod...
  • lifelines - Survival analysis in Python.
  • pysurvival - An open source python package for Survival Analysis modeling.
  • scikit-survival - Survival analysis built on top of scikit-learn.
  • EconML - Automated Learning and Intelligence for Causation and Economics.
  • arules - Association Rules (apriori, eclat) in R.
  • BTYDplus - Extended BTYD models (R).
  • mr-uplift - Uplift Modeling with Multiple Treatments/Responses.
  • BTYD - Buy Till You Die - Probability Models for Customer-Base Analysis (R).

Customer Response Modeling

Forecasting

  • NeuralProphet - A hybrid forecasting framework based on PyTorch and trained with standard deep learning methods.
  • pmdarima - A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
  • prophet - Additive time series modelling by Facebook.
  • sktime - A unified framework for ML with Time Series.
  • StatsForecast - Lightning ⚡️ fast forecasting with statistical and econometric models.
  • stumpy - STUMPY computes something called the matrix profile, which is just an academic way of saying "for every subsequence automatically identify its corresponding nea...
  • temporian - Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖.
  • tbats - BATS and TBATS time series forecasting.
  • tslearn - The machine learning toolkit for time series analysis in Python.
  • NeuralForecast - Scalable and user friendly neural forecasting algorithms.
  • Nixtla - TimeGPT-1: production ready pre-trained Time Series Foundation Model for forecasting and anomaly detection.
  • MLForecast - Scalable machine 🤖 learning for time series forecasting.
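
The stumpy entry above refers to the matrix profile. As a rough illustration of the idea only, here is a brute-force sketch in plain Python. It is not how STUMPY computes it (STUMPY uses much faster algorithms), and the function names `znorm` and `naive_matrix_profile`, as well as the exclusion zone of `ceil(m / 4)`, are assumptions made for this example.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series, m):
    """Brute-force matrix profile (hypothetical helper, O(n^2 * m)).

    For every length-m subsequence, return the z-normalized Euclidean
    distance to its nearest non-trivial neighbor and that neighbor's index.
    """
    n_sub = len(series) - m + 1
    if m < 2 or n_sub < 2:
        raise ValueError("need m >= 2 and at least two subsequences")

    windows = [znorm(series[i:i + m]) for i in range(n_sub)]
    # Assumed exclusion zone: skip neighbors that overlap too much with
    # the query window, so a subsequence does not trivially match itself.
    exclusion = math.ceil(m / 4)

    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(windows[i], windows[j]))
            )
            if dist < profile[i]:
                profile[i] = dist
                index[i] = j
    return profile, index


# Toy series: the same 5-point shape appears at the start and at the end.
series = [0, 1, 3, 1, 0, 7, 2, 9, 4, 0, 1, 3, 1, 0]
profile, index = naive_matrix_profile(series, m=5)
```

Low profile values mark repeated shapes (motifs), and high values mark subsequences with no close match (candidate anomalies). In the toy series the first and last windows are exact repeats, so each is the other's nearest neighbor at distance zero.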

Product Affinity/Association

Recommender Systems

  • lightfm - A Python implementation of a number of popular recommendation algorithms.
  • openrec - A Modular Framework for Extensible and Adaptable Recommendation Algorithms.
  • recmetrics - Library of metrics for evaluating recommender systems.
  • recommenders - Best Practices on Recommendation Systems (by Microsoft).
  • Surprise - Scikit for building and analyzing recommender systems that deal with explicit rating data.

Data & Utilities

  • gapandas4 - Python package for querying the Google Analytics Data API for GA4 and displaying the results in a Pandas dataframe.
  • Decoy - Synthetic Data Generator using DuckDB at its core.
  • SDV - Python library designed to be your one-stop shop for creating tabular synthetic data.

Papers, Blogs, & Resources

Articles, papers, and other resources organized by topic.

Geo Experimentation & Lift Testing

Causal Inference & Bayesian Analysis

  • External Resource - How to measure the incremental Return On Ad Spend (iROAS) is a fundamental problem for the online advertising industry. A standard modern tool is to run randomized geo experiments, where experiment...
  • External Resource - Gaussian processes are powerful non-parametric probabilistic models for stochastic functions. However, the direct implementation entails a complexity that is computationally intractable when the nu...
  • Bayesian - The challenges posed by high-dimensional data and use of the simplex constraint are two major concerns in the empirical application of the synthetic control method (SCM) in econometric studies. To ...
  • Inference for simplex weights - In many applications, the parameter of interest involves a simplex-valued weight which is identified as a solution to an optimization problem. Examples include synthetic control methods with group-...
  • synthetic business cycles - This paper investigates the use of synthetic control methods for causal inference in macroeconomic settings when dealing with possibly nonstationary data. While the synthetic control approach has g...
  • Synthetic Control Method (Vanilla SCM)
  • Augmented Difference-in-Differences
  • Forward Difference-in-Differences
  • Two Step Synthetic Control
  • Synthetic Control Method with Nonlinear Outcomes - The synthetic control estimator (Abadie et al., 2010) is asymptotically unbiased assuming that the outcome is a linear function of the underlying predictors and that the treated unit can be well ap...
  • Proximal Causal Inference for SCM (Surrogates) - The synthetic control method (SCM) has become a popular tool for estimating causal effects in policy evaluation, where a single treated unit is observed, and a heterogeneous set of untreated units ...
  • Proximal SCM Framework - Synthetic control (SC) methods are commonly used to estimate the treatment effect on a single treated unit in panel data settings. An SC is a weighted average of control units built to match the tr...
  • Relaxed Balanced Synthetic Control - The synthetic control method (SCM) is widely used for constructing the counterfactual of a treated unit based on data from control units in a donor pool. Allowing the donor pool contains more contr...
  • L1-INF Synthetic Control - This paper reinterprets the Synthetic Control (SC) framework through the lens of weighting philosophy, arguing that the contrast between traditional SC and Difference-in-Differences (DID) reflects ...
  • Synthetic Control with Multiple Outcomes (TLP and SBMF) - We generalize the synthetic control (SC) method to a multiple-outcome framework, where the conventional pre-treatment time dimension is supplemented with the extra dimension of related outcomes in ...
  • Synthetic Controls for Experimental Design - This article studies experimental design in settings where the experimental units are large aggregate entities (e.g., markets), and only one or a small number of units can be exposed to the treatme...
  • DeepTCN paper - We present a probabilistic forecasting framework based on convolutional neural network for multiple related time series forecasting. The framework can be applied to estimate probability density und...
  • Chronos-2 report - Pretrained time series models have enabled inference-only forecasting systems that produce accurate predictions without task-specific training. However, existing approaches largely focus on univari...
  • Conformalized Prediction - Conformal prediction is a technique for constructing prediction intervals that attain valid coverage in finite samples, without making distributional assumptions. Despite this appeal, existing conf...
  • Matched Markets paper - Although randomized controlled trials are regarded as the "gold standard" for causal inference, advertisers have been hesitant to embrace them as their primary method of experimental desi...
  • TBR paper - Two previously published papers (Vaver and Koehler, 2011, 2012) describe a model for analyzing geo experiments. This model was designed to measure advertising effectiveness using the rigor of...
  • GeoX paper - Advertisers have a fundamental need to quantify the effectiveness of their advertising. For search ad spend, this information provides a basis for formulating strategies related to bidding, budgeti...
  • Benidis et al. - Deep learning based forecasting methods have become the methods of choice in many applications of time series prediction or forecasting often outperforming other approaches. Consequently, over the ...
  • Orbit: Probabilistic Forecast with Exponential Smoothing - Time series forecasting is an active research topic in academia as well as industry. Although we see an increasing amount of adoptions of machine learning methods in solving some of those forecasti...
  • Treatment Effects with Instruments paper
  • Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS) - We consider the estimation of heterogeneous treatment effects with arbitrary machine learning methods in the presence of unobserved confounders with the aid of a valid instrument. Such settings ari...
  • Arxiv preprint arxiv:1806.04823 - This paper proposes a Lasso-type estimator for a high-dimensional sparse parameter identified by a single index conditional moment restriction (CMR). In addition to this parameter, the moment funct...
  • ArXiv preprint arXiv:1608.00060 - Most modern supervised statistical/machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically ...
  • CausalML: Python package for causal machine learning - CausalML is a Python implementation of algorithms related to causal inference and machine learning. Algorithms combining causal inference and machine learning have been a trending topic in recent y...
  • ArXiv Paper - We introduce Gluon Time Series (GluonTS, available at https://gluon-ts.mxnet.io), a library for deep-learning-based time series modeling. GluonTS simplifies the development of and experimentation w...
  • Measuring Ad Effectiveness Using Geo Experiments
  • Estimating Ad Effectiveness Using Geo Experiments in a Time-Based Regression Framework - Two previously published papers (Vaver and Koehler, 2011, 2012) describe a model for analyzing geo experiments. This model was designed to measure advertising effectiveness using the rigor of...
  • NeuralProphet - We introduce NeuralProphet, a successor to Facebook Prophet, which set an industry standard for explainable, scalable, and user-friendly forecasting frameworks. With the proliferation of time serie...
  • Be Careful When Interpreting Predictive Models in Search of Causal Insights
  • 2021 Conference on Digital Experimentation @ MIT (CODE@MIT)
  • The Kernel Cookbook: Advice on Covariance functions
  • Gaussian Processes: HSGP Reference & First Steps
  • Gaussian Processes: HSGP Advanced Usage
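
The Conformalized Prediction entry above describes conformal prediction as a way to build prediction intervals with finite-sample coverage and no distributional assumptions. A minimal sketch of the basic split conformal variant follows, assuming exchangeable data and a held-out calibration set. It is not the method from the linked paper, and the function name `split_conformal_interval` and the toy numbers are made up for illustration.

```python
import math


def split_conformal_interval(cal_predictions, cal_actuals, new_prediction, alpha=0.1):
    """Split conformal interval around a point prediction (hypothetical helper).

    Uses absolute residuals on a calibration set as conformity scores and
    targets marginal coverage of at least 1 - alpha under exchangeability.
    """
    if not cal_predictions or len(cal_predictions) != len(cal_actuals):
        raise ValueError("calibration inputs must be non-empty and equal length")

    scores = sorted(abs(y - p) for p, y in zip(cal_predictions, cal_actuals))
    n = len(scores)
    # Finite-sample correction: take the ceil((n + 1) * (1 - alpha))-th
    # smallest score rather than the plain empirical quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (new_prediction - q, new_prediction + q)


# Toy calibration set: a model that always predicts 10, with residuals 1..9.
cal_predictions = [10.0] * 9
cal_actuals = [10.0 + i for i in range(1, 10)]

interval = split_conformal_interval(cal_predictions, cal_actuals, 20.0, alpha=0.5)
too_strict = split_conformal_interval(cal_predictions, cal_actuals, 20.0, alpha=0.05)
```

With nine calibration points, `alpha=0.05` cannot be honored, so the sketch returns an unbounded interval instead of silently under-covering. The interval width is the same for every new prediction, which is one limitation that more refined conformal methods aim to address.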

Forecasting

  • DeepAR paper - Probabilistic forecasting with autoregressive recurrent networks (Amazon).
  • N-BEATS paper - Neural basis expansion analysis for interpretable time series forecasting.
  • N-HiTS paper - Neural Hierarchical Interpolation for Time Series Forecasting.
  • TCN paper - An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling.

Multi Armed Bandits

Recommender Systems

  • xDeepFM - eXtreme Deep Factorization Machine: Combining Explicit and Implicit Feature Interactions for Recommender Systems.
  • Google Primer - Introduction to Recommendation Systems.

Key Researchers

  • Susan Athey - The Economics of Technology Professor at Stanford Graduate School of Business. Leading researcher in the intersection of machine learning and causal inference.
  • Guido Imbens - Applied Econometrics Professor and Professor of Economics at Stanford Graduate School of Business. Nobel Laureate (2021) for methodological contributions to the analysis of causal relationships.
  • Peter Fader - Frances and Pei-Yuan Chia Professor of Marketing at The Wharton School. Author of Customer Centricity.
  • Byron Sharp - Professor of Marketing Science and Director of the Ehrenberg-Bass Institute. Author of How Brands Grow.
  • Ron Berman - Associate Professor of Marketing at The Wharton School. Focuses on online marketing, marketing analytics, and game theory.
  • Randall Lewis - Economic Research Scientist at Netflix. Known for work on "Ghost Ads" and measuring advertising effectiveness.
  • Stefan Wager - Associate Professor of Operations, Information & Technology at Stanford GSB. Research on causal inference and statistical learning.
  • Catherine Tucker - Sloan Distinguished Professor of Management at MIT Sloan. Expert in digital marketing, privacy, and online advertising.
  • Dominique Hanssens - Distinguished Research Professor of Marketing at UCLA Anderson. Known for Long-Term Impact of Marketing.
  • Garrett Johnson - Associate Professor of Marketing at Boston University. Co-author of "Ghost Ads" and research on privacy/GDPR.

Books & Courses

Blogs

Resources

About

Feel free to submit an issue or pull request with any suggestions!

This list is maintained by Shako Stats.

Connect with me on LinkedIn.
