SurvivalPredict

A Python package centered on statistical learning for survival analysis, for predicting survival across time.

WIP. A PyPI release is planned soon. In the meantime, the code in this repo can be installed via `pip install git+https://github.com/pr38/survivalpredict`. Ideally, before the first PyPI release, some left-censoring support, docstrings, example notebooks, and an implementation of the 'multi-task logistic regression' model should be added. Further down the line, the goal is to add sparse data support as well as tree-based, ensemble, and exotic neural network models.


Estimators

Below are the estimators implemented in the survivalpredict.estimators sub-module.

Estimators	Description	Stratifiable
CoxProportionalHazard	The Cox proportional hazards model is a linear, semi-parametric relative risk model and a staple of survival analysis. Fast and efficient to train. Survivalpredict's implementation has many optimizations and is up to 10x to 20x faster than other implementations available in Python. Both Breslow and Efron tie handling are supported. Currently only the Breslow baseline hazard is available.	Yes
ParametricDiscreteTimePH	A fully parametric linear hazards model. The chen, weibull, log_normal, log_logistic, gompertz, gamma and additive_chen_weibull baseline hazards are available as hyperparameters. Parameters are fit by maximum likelihood, using a discrete-time survival likelihood that accounts for censoring. Implemented with PyMC/PyTensor, with either a JAX or Numba backend.	Yes
KaplanMeierSurvivalEstimator	Univariate non-parametric survival curve. Useful as a baseline/dummy estimator.	Accepts strata, but builds a separate survival curve for each stratum.
KNeighborsSurvival	K nearest neighbors for survival. An in-memory non-parametric model that builds a Kaplan-Meier survival curve based on neighbors.	No
CoxNNetPH	A neural network model for estimating relative risk. The Cox proportional hazards model's negative log likelihood for Breslow ties is used as the loss function. Breslow's baseline hazard for relative risk is used to estimate survival across time. Implemented using JAX.	Yes
AalenAdditiveHazard	Linear multivariate non-parametric estimation of hazard. Allows for each interval of time and feature to have an associated coefficient, allowing for the effects of features to change over time.	No
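
CoxProportionalHazard and CoxNNetPH are both described in terms of the Cox negative log partial likelihood with Breslow ties. As a rough illustration of that quantity only, and not of survivalpredict's own optimized implementation, a minimal NumPy sketch might look like the following. The function name and argument layout are assumptions made for this example.

```python
import numpy as np


def breslow_neg_log_partial_likelihood(beta, X, times, events):
    """Negative log partial likelihood of a Cox model with Breslow ties.

    Hypothetical helper for illustration only. It builds an n-by-n
    risk-set matrix, so it is only suitable for small inputs.
    """
    eta = X @ beta  # linear predictor (log relative risk) for each row
    # at_risk[i, j] is True when row j is still under observation at the
    # time recorded for row i; tied times share one risk set (Breslow).
    at_risk = times[None, :] >= times[:, None]
    log_risk_sum = np.log((at_risk * np.exp(eta)[None, :]).sum(axis=1))
    # Only rows with an observed event contribute a term.
    return -np.sum(events * (eta - log_risk_sum))


# Toy data: three rows, one feature, every row has an observed event.
X = np.array([[0.5], [-1.0], [2.0]])
times = np.array([1.0, 2.0, 3.0])
events = np.array([1, 1, 1])
nll_at_zero = breslow_neg_log_partial_likelihood(np.zeros(1), X, times, events)
```

With all coefficients at zero every row has the same relative risk, so the value reduces to the sum of the log risk-set sizes (here log 3 + log 2 + log 1).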

Metrics

Survivalpredict focuses on metrics that directly measure prediction performance. Hence, the survivalpredict.metrics module intentionally excludes metrics based on ranking relative risk (e.g., the c-index).


Metrics	Description
brier_scores_administrative	Squared error between the true survival and the prediction for each time of interest. Censored intervals are ignored. Averaged over the number of rows not censored at a given interval of time. Intended for cases of 'administrative' censorship, where 'survival time' is measured from each individual's time in the experiment rather than in calendar time. This metric is well suited to cases of churn, conversion and operational failure. See here
integrated_brier_score_administrative	Integral of administrative brier scores, to allow for a singular metric of performance.
integrated_brier_score_administrative_sklearn_metric	scikit-learn metric wrapper around the `integrated_brier_score_administrative` function, for accessing this metric when using the SklearnSurvivalPipeline wrapper class to interface with scikit-learn.
integrated_brier_score_administrative_sklearn_scorer	scikit-learn scorer wrapper around the `integrated_brier_score_administrative` function, for accessing this metric when using the SklearnSurvivalPipeline wrapper class to interface with scikit-learn.
brier_scores_ipcw	Brier scores with inverse probability of censoring weights. The squared error between the true survival and the prediction is weighted using a Kaplan-Meier curve with inverted events, depending on censoring and failure at different points in time. This is a common metric within the field of biostatistics and is used in clinical trials. See here
integrated_brier_score_ipcw	Integral of Brier scores with inverse probability of censoring weights, to allow for a singular metric of performance.
integrated_brier_score_ipcw_sklearn_metric	scikit-learn metric wrapper around the `integrated_brier_score_ipcw` function.
integrated_brier_score_ipcw_sklearn_scorer	scikit-learn scorer wrapper around the `integrated_brier_score_ipcw` function.
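
To make the 'administrative' Brier score description above concrete, here is a minimal NumPy sketch of the idea: rows whose status is unknown at an evaluation time (censored at or before it) are dropped from that time's average. This is an illustration under stated assumptions, not survivalpredict's implementation. The function names, the argument layout, the handling of times exactly equal to an evaluation time, and the normalization of the integral by the time range are all assumptions.

```python
import numpy as np


def brier_scores_administrative_sketch(surv_pred, times, events, eval_times):
    """Brier score per evaluation time, skipping rows censored by then.

    surv_pred has shape (n_rows, n_eval_times) and holds predicted
    survival probabilities. Hypothetical helper, not the library API.
    """
    T = np.asarray(times, dtype=float)[:, None]
    E = np.asarray(events).astype(bool)[:, None]
    t = np.asarray(eval_times, dtype=float)[None, :]
    alive = T > t       # true survival status at each evaluation time
    known = alive | E   # rows censored at or before t have unknown status
    sq_err = (alive.astype(float) - surv_pred) ** 2
    # Average over the rows whose status is known at each time.
    # Assumes at least one such row per evaluation time.
    return (sq_err * known).sum(axis=0) / known.sum(axis=0)


def integrated_brier_sketch(scores, eval_times):
    """Trapezoidal integral of the scores, divided by the time range."""
    t = np.asarray(eval_times, dtype=float)
    area = np.sum((scores[1:] + scores[:-1]) / 2.0 * np.diff(t))
    return area / (t[-1] - t[0])


times = np.array([2.0, 4.0, 5.0])
events = np.array([1, 0, 1])           # the second row is censored at t=4
eval_times = np.array([1.0, 3.0, 6.0])
surv_pred = np.full((3, 3), 0.5)       # an uninformative constant prediction

scores = brier_scores_administrative_sketch(surv_pred, times, events, eval_times)
ibs = integrated_brier_sketch(scores, eval_times)
```

In this toy example the censored row contributes at the first two evaluation times but is excluded at t=6, where only the two rows with observed events are averaged.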

Strata Preprocessing

The survivalpredict.strata_preprocessing module allows for the creation of strata to be used by various estimators.

Class	Description
StrataBuilderDiscretizer	Builds strata keys from numeric data. Allows various splitting strategies.
StrataBuilderEncoder	Builds strata keys from categorical data.
StrataColumnTransformer	Allows various StrataBuilders to be stacked and run simultaneously on different columns to build the strata. Modeled after scikit-learn's ColumnTransformer.
make_strata_column_transformer	Generates a StrataColumnTransformer without having to name each transformation directly, like scikit-learn's make_column_transformer.
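
The general idea of building strata keys from numeric and categorical columns, then combining them, can be sketched in plain NumPy. This is only an illustration of the concept: the helper names below are hypothetical, quantile binning is just one possible splitting strategy, and none of this reflects the actual signatures of the classes listed above.

```python
import numpy as np


def discretize_to_strata(values, n_bins):
    """Quantile-bin a numeric column into integer keys 0..n_bins-1."""
    values = np.asarray(values, dtype=float)
    # Interior quantile edges; n_bins=2 uses the median as the only edge.
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(values, edges)


def encode_to_strata(categories):
    """Map each distinct category to an integer key."""
    _, codes = np.unique(np.asarray(categories), return_inverse=True)
    return codes


def combine_strata(*keys):
    """Merge key columns so each distinct combination is one stratum."""
    stacked = np.column_stack(keys)
    _, combined = np.unique(stacked, axis=0, return_inverse=True)
    return combined.ravel()


ages = [20, 30, 40, 50]
plans = ["a", "b", "a", "b"]
age_keys = discretize_to_strata(ages, n_bins=2)
plan_keys = encode_to_strata(plans)
strata = combine_strata(age_keys, plan_keys)
```

Here each of the four rows falls into its own stratum, because every combination of age bin and plan is distinct.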

Pipeline

For various reasons, survivalpredict intentionally breaks with scikit-learn's API in several ways. The survivalpredict.pipeline module provides wrappers around various survivalpredict classes so that survivalpredict can interoperate with the greater scikit-learn ecosystem (e.g., for feature selection or hyperparameter tuning), in addition to offering the usual utility of a conventional scikit-learn pipeline.

Class	Description
build_sklearn_pipeline_target	Builds a singular target array from the times and events arrays. Used as the 'y' (observed target) within the scikit-learn ecosystem.
SklearnSurvivalPipeline	Stacks various sklearn transformers and survivalpredict strata_builders and estimators into a single class. It assumes the output of the `build_sklearn_pipeline_target` function as the 'y' (observed target).
make_sklearn_survival_pipeline	Generates a SklearnSurvivalPipeline class without having to directly name all the steps.
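
Scikit-learn utilities pass a single `y` to an estimator, whereas survival data has two components (times and events), which is why a combined target is needed. The layout actually produced by `build_sklearn_pipeline_target` is not described here, so the two-column array below is purely an assumption used to illustrate the idea, and both helper names are hypothetical.

```python
import numpy as np


def pack_target(times, events):
    """Pack times and event flags into one (n_rows, 2) float array.

    Assumed layout for illustration: column 0 is time, column 1 is the
    event flag (1.0 = event observed, 0.0 = censored).
    """
    return np.column_stack(
        [np.asarray(times, dtype=float), np.asarray(events, dtype=float)]
    )


def unpack_target(y):
    """Split the packed array back into times and boolean event flags."""
    return y[:, 0], y[:, 1].astype(bool)


y = pack_target([5.0, 3.0, 8.0], [1, 0, 1])
times_back, events_back = unpack_target(y)
```

Because `y` is a single array with one row per sample, it can be sliced by scikit-learn's cross-validation splitters in the same way as the feature matrix.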

Validation

survivalpredict comes with some native model validation capabilities in survivalpredict.validation.

Function	Description
sur_cross_val_score	survivalpredict's equivalent to scikit-learn's cross_val_score.
sur_cross_validate	survivalpredict's equivalent to scikit-learn's cross_validate.
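
For readers unfamiliar with cross-validation on censored data, the sketch below shows the general shape of the procedure in plain NumPy: fit a Kaplan-Meier baseline on each training fold, then score it on the held-out fold while skipping rows whose status is unknown. It does not use or reproduce `sur_cross_val_score`; every function name here is hypothetical, and the score (a squared error pooled over all known row/time cells) is a simplification chosen for brevity.

```python
import numpy as np


def kaplan_meier(times, events):
    """Return distinct event times and the Kaplan-Meier survival at each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events).astype(bool)
    event_times = np.unique(times[events])
    at_risk = np.array([(times >= t).sum() for t in event_times])
    deaths = np.array([((times == t) & events).sum() for t in event_times])
    return event_times, np.cumprod(1.0 - deaths / at_risk)


def km_predict(event_times, survival, eval_times):
    """Evaluate the right-continuous KM step function at eval_times."""
    idx = np.searchsorted(event_times, eval_times, side="right")
    return np.concatenate([[1.0], survival])[idx]


def kfold_scores(times, events, eval_times, n_splits=3, seed=0):
    """Pooled squared error of a KM baseline on each held-out fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(times)), n_splits)
    scores = []
    for k in range(n_splits):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_splits) if j != k])
        et, surv = kaplan_meier(times[train], events[train])
        pred = km_predict(et, surv, eval_times)  # one curve for all test rows
        alive = times[test][:, None] > eval_times[None, :]
        # Rows censored at or before an evaluation time are skipped there.
        known = alive | events[test].astype(bool)[:, None]
        sq_err = (alive.astype(float) - pred[None, :]) ** 2
        scores.append((sq_err * known).sum() / known.sum())
    return np.array(scores)


times = np.array([2.0, 3.0, 3.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
events = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1])
eval_times = np.array([2.5, 5.5, 8.5])
fold_scores = kfold_scores(times, events, eval_times)
```

Swapping the Kaplan-Meier baseline for a covariate-based estimator would change only the fit and predict lines inside the loop.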

Model Selection

Scikit-learn's model_selection module is also mimicked within survivalpredict.model_selection.

Class	Description
Sur_GridSearchCV	survivalpredict's equivalent to scikit-learn's GridSearchCV
Sur_RandomizedSearchCV	survivalpredict's equivalent to scikit-learn's RandomizedSearchCV
