## Reproducing the Rothman Index

29 November 2016
## Outline

1. Why model patient deterioration
2. What is the Rothman Index
3. Reproducing it on our data
4. Next steps for modeling deterioration
## Why model patient deterioration

- With many possible diagnoses, it is difficult to identify the correct one
- It might be easier to look for signs of things generally going badly
- Helps physicians identify 'where to look first' when coming on call or making resource allocation decisions
## Goals of the Rothman Index

- Should be easy to calculate and interpret
- Should be applicable to a large majority of patients
## What is the Rothman Index

- A number between -91 and 100, where lower is more 'deteriorated'
- Broadly speaking, calculated as: $RI = 100 - \alpha \sum_{features} ExcessRisk(feature_i)$
- Takeaway: an extremely simple additive model
## Learning the Index: Feature Selection

- Criteria for feature selection:
  1. related to patient condition
  2. regularly collected on almost all patients
  3. susceptible to change (i.e. not age, sex, or diagnosis)
## Learning the Index: Feature Selection

- Further narrowed down on the basis of:
  - parsimony: excluding the less frequently observed of correlated features
  - relevance: p-value < 0.05 for the coefficient in a forward stepwise logistic regression of the features on 1-year mortality
- The regression coefficients are not otherwise used; the RI is not a logistic regression model!
## Learning the Index: Selected Features

![features used in rothman index](images/rothman_features.png)
## Learning the Index: Excess Risk

- Definition:

> Percent increase in 1-year all-cause mortality associated with any value of a clinical variable, relative to the minimum 1-year mortality identified for that variable.

- It is not entirely clear from the paper how this is calculated
- Possibly a linear model with polynomial terms fit to binned data, normalized to the risk of death at the lowest-risk bin? (One reading is sketched below.)
- Not critical for us; we train on a continuous outcome anyway
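## Learning the Index: Excess Risk (sketch)

The paper does not spell out this computation, so the following is only a guess at one plausible reading: bin the vital sign, estimate per-bin 1-year mortality, normalize to the safest bin, and smooth with a low-order polynomial. The bin count, polynomial degree, and all names here are invented for illustration.

```python
import numpy as np

def excess_risk_curve(values, died_within_1yr, n_bins=20, degree=4):
    """Guessed discrete excess-risk estimator; not the paper's exact method."""
    values = np.asarray(values, dtype=float)
    died_within_1yr = np.asarray(died_within_1yr, dtype=float)
    # quantile bin edges so every bin holds roughly the same number of patients
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1))
    bin_idx = np.clip(np.searchsorted(edges, values) - 1, 0, n_bins - 1)
    # per-bin 1-year mortality, normalized to the lowest-risk bin
    mortality = np.array([died_within_1yr[bin_idx == b].mean()
                          for b in range(n_bins)])
    excess = mortality / max(mortality.min(), 1e-9) - 1.0
    # smooth the binned estimates with a polynomial fit over the bin centers
    centers = (edges[:-1] + edges[1:]) / 2
    return np.poly1d(np.polyfit(centers, excess, degree))
```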
## Learning the Index: Excess Risk

- Discrete Case: Respiratory Rate

![respiratory excess risk graph](images/resp_excess_risk.png)
## Learning the Index: Excess Risk

- Binary Case: Nursing Assessments
- Simple ratio of percentages between pass and fail cases

![nursing assessment excess risk graph](images/nursing_excess_risk.png)
## Building the model

- Remember $RI = 100 - \alpha \sum_{features} ExcessRisk(feature_i)$
- Details:
  - If a feature is unobserved (within some time tolerance), $ExcessRisk(feature_i) = 0$
    - i.e. we pretend that feature is at its minimum-risk value (which largely sidesteps the missing-data problem)
  - $\alpha$ scales the sum so most patients fall between 0 and 100
- A minimal sketch follows
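## Building the model (sketch)

A minimal sketch of the additive index, assuming `excess_risk_fns` maps each feature name to its fitted excess-risk function and that `alpha` has already been calibrated; all names here are illustrative, not the paper's.

```python
def rothman_index(observations, excess_risk_fns, alpha):
    """Additive RI: 100 minus alpha times the summed excess risks."""
    total = 0.0
    for feature, risk_fn in excess_risk_fns.items():
        value = observations.get(feature)  # None if unobserved in the window
        if value is not None:
            total += risk_fn(value)
        # unobserved features contribute 0, i.e. their minimum-risk value
    return 100 - alpha * total
```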
## Building the model

- Laboratory tests are collected infrequently
- The final Rothman Index combines two indices as described so far: $RI = RI_{noLab}\,\gamma + smoothingFunction(RI_{lab}(1 - \gamma))$, where $\gamma = \frac{TimeSinceLabs}{48}$
- The two indices use overlapping subsets of the features listed previously
## Building the model

- Assumes other measures are collected more often than every 48 hours
- Assumes the relevance of lab scores decays linearly over time
- Assumes labs come back at the same time
- The smoothing function prevents large jumps when labs come back
- A sketch of the blend follows
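## Building the model (sketch)

A sketch of the lab/no-lab blend, assuming $\gamma$ is capped at 1 once labs are more than 48 hours old. The paper's smoothing function is not reproduced here; `ri_lab_smoothed` stands in for the already-smoothed lab index, and all names are illustrative.

```python
def combined_index(ri_no_lab, ri_lab_smoothed, hours_since_labs):
    """Blend the two indices, shifting weight away from stale labs."""
    # gamma grows linearly from 0 (fresh labs) to 1 (labs >= 48h old);
    # the cap at 1 is our assumption, not stated in the paper
    gamma = min(hours_since_labs / 48.0, 1.0)
    return gamma * ri_no_lab + (1.0 - gamma) * ri_lab_smoothed
```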
## Reproducing the model

- Train the same model on our data
- We don't have 1-year mortality, so we calculate excess risk from 'time to discharge' data
## Cox proportional hazard models

$$ \lambda(t|X_i) = \lambda_0(t)\exp(X_i \beta) $$

- $X_i$ is the feature vector for individual $i$
- $\lambda_0(t)$ is the baseline hazard function (when $\beta = 0$, $\exp(X_i \beta) = 1$)
## Cox proportional hazard models

$$ \lambda(t|X_i) = \lambda_0(t)\exp(X_i \beta) $$

- The parameter vector can only change the scale of the hazard function, i.e. all individuals have proportional hazard functions
- This is a significant assumption, but convenient in our case: we care exactly about how different feature values affect relative risk, not about how the hazard varies over time
## Cox proportional hazard models

- Because the Rothman Index uses only one (the last) observation per patient, we also take only one random observation per feature per patient, and record how long before discharge that observation was taken
- We then fit a proportional hazards model
## Cox proportional hazard models

- The natural output of a proportional hazards model is the _hazard ratio_:

$$ \frac{\lambda_0(t)\exp(X_i \beta)}{\lambda_0(t)} = \exp(X_i \beta) $$

- This is how many times more at risk of the 'event' subject $i$ is than someone with an all-zero feature vector
- Because we assume the hazard functions for all subjects are proportional, this does not vary with $t$
- A fitting sketch follows
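## Cox proportional hazard models (sketch)

A sketch of the fit using the `lifelines` library, assuming a DataFrame with one row per observation; the column names (`patient_id`, `hours_to_discharge`, `discharged`) are illustrative, not from the paper or our pipeline.

```python
import pandas as pd
from lifelines import CoxPHFitter

def fit_discharge_hazards(df: pd.DataFrame, feature_cols):
    """Fit a Cox model on one randomly sampled observation per patient."""
    one_per_patient = (df.groupby("patient_id", group_keys=False)
                         .apply(lambda g: g.sample(1)))
    cph = CoxPHFitter()
    cph.fit(one_per_patient[feature_cols + ["hours_to_discharge", "discharged"]],
            duration_col="hours_to_discharge",
            event_col="discharged")  # 0 would mark a censored stay
    return cph

# hazard ratio for subject i relative to the all-zero feature vector:
#   exp(x_i @ cph.params_.values)
```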
## Cox proportional hazard models

- Because in our case the event is 'discharge from the hospital', a higher ratio indicates a better outcome for the patient, which is the opposite of what we want for a Rothman Index substitute
- We might instead ask how many times more likely a patient with features $X_i$ is to still be in the hospital:

$$ \frac{S(t)}{S_0(t)} = \frac{\exp(-\int_0^t \lambda_0(u) \exp(X_i \beta)\,du)}{\exp(-\int_0^t \lambda_0(u)\,du)} = \exp\left((1 - \exp(X_i \beta)) \int_0^t \lambda_0(u)\,du\right) $$
## Cox proportional hazard models

$$ \frac{S(t)}{S_0(t)} = \frac{\exp(-\int_0^t \lambda_0(u) \exp(X_i \beta)\,du)}{\exp(-\int_0^t \lambda_0(u)\,du)} = \exp\left((1 - \exp(X_i \beta)) \int_0^t \lambda_0(u)\,du\right) $$

- But of course this _does_ vary with time, so it doesn't plug nicely into the Rothman Index formula
- Conceptually, it doesn't make sense to have a 'proportional survival model'
## Excess risk function

- Instead, we normalize hazard ratios by the maximum ratio, where $X_j = \arg\max_i \exp(X_i \beta)$:

$$ \frac{\frac{\lambda_0(t)\exp(X_i \beta)}{\lambda_0(t)}}{\frac{\lambda_0(t)\exp(X_j \beta)}{\lambda_0(t)}} = \exp((X_i - X_j)\beta) $$

- For $i = j$, the normalized ratio is $\exp(0 \cdot \beta) = 1$
## Excess risk function: remix

- Look at how much lower each normalized ratio is than the maximal hazard ratio, which by construction has been normalized to 1:

$$ ER' = 1 - \exp((X_i - X_j)\beta) $$

- Interpretation:

> Percent decrease in instantaneous hospital discharge rate associated with any value of a clinical variable, relative to the maximum hospital discharge rate identified for that variable.

- A computational sketch follows
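## Excess risk function: remix (sketch)

A minimal sketch of $ER'$ with NumPy, assuming `X` is an $n \times d$ matrix of observed feature vectors and `beta` the fitted Cox coefficients; taking $X_j$ as the row with the largest fitted score follows the normalization above.

```python
import numpy as np

def excess_risk_prime(X, beta):
    """ER' = 1 - exp((X_i - X_j) beta), with X_j the maximal-hazard row."""
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    scores = X @ beta               # X_i beta for every subject
    x_max = X[np.argmax(scores)]    # X_j, the maximal-hazard feature vector
    return 1.0 - np.exp((X - x_max) @ beta)  # 0 at the maximum, >0 elsewhere
```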
## Excess risk function: Rothman

> Percent increase in 1-year all-cause mortality associated with any value of a clinical variable, relative to the minimum 1-year mortality identified for that variable.
## Excess risk function prime

> Percent decrease in instantaneous hospital discharge rate associated with any value of a clinical variable, relative to the maximum hospital discharge rate identified for that variable.
## Excess risk comparison

| Rothman | Reproduction |
| ------- | ------------ |
| ![](images/resp_excess_risk.png) | ![](images/resp_excess_risk_prime.png) |
## Excess risk comparison

| Rothman | Reproduction |
| ------- | ------------ |
| ![](images/spo2_excess_risk.png) | ![](images/spo2_excess_risk_prime.png) |
## Weighting

- Not explicitly specified in the paper
- I adapt their weighting by fixing the weights with the constraints:
  1. $\sum_f w_f = \alpha$
  2. $\frac{w_f}{w_{f'}} = \frac{48 - t_f}{48 - t_{f'}}$ for $0 \le t_f, t_{f'} \le 48$
  3. $w_f = 0$ for $t_f > 48$
## Weighting

i.e. solve for $\pmb w$:

$$ \begin{bmatrix} 1 & 48 - t_2 & 48 - t_3 & \cdots & 48 - t_n \\ 1 & -(48 - t_1) & 0 & \cdots & 0 \\ 1 & 0 & -(48 - t_1) & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & -(48 - t_1) \end{bmatrix}^T \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix} = \begin{bmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{bmatrix} $$

where $t_f$ is the minimum of the hours since feature $f$ was measured and 48.
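## Weighting (sketch)

A sketch of the solve with NumPy, building the constraint matrix exactly as above; `hours_since` and `alpha` are illustrative names. (The system is singular if every feature is 48 hours stale; a real implementation would need to handle that case.)

```python
import numpy as np

def solve_weights(hours_since, alpha):
    """Solve M^T w = (alpha, 0, ..., 0) for the feature weights."""
    t = np.minimum(np.asarray(hours_since, dtype=float), 48.0)
    n = len(t)
    M = np.zeros((n, n))
    M[0, 0] = 1.0
    M[0, 1:] = 48.0 - t[1:]                      # first row of the matrix
    M[1:, 0] = 1.0                               # leading 1s of the other rows
    M[1:, 1:] = -(48.0 - t[0]) * np.eye(n - 1)   # -(48 - t_1) on the diagonal
    rhs = np.zeros(n)
    rhs[0] = alpha
    return np.linalg.solve(M.T, rhs)
```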
## Weighting

- Note that this weighting scheme implicitly assumes unknown excess risks sit at the weighted average of the known excess risks
- This differs from Rothman, which assumes they sit at their minimum-risk (optimal) values
- The weights change constantly, but solving small matrix equations is fast
## Evaluation

Rothman evaluates the method in four ways:

- Discharge Disposition
- 24 Hour Mortality
- 30 Day Readmission
- Comparison to existing indices (APACHE III and MEWS)

We do the first two.
## Evaluation: Discharge Disposition Summaries

| Rothman | Reproduction |
| ------- | ------------ |
| ![](images/disch_disp_summary_roth.png) | ![](images/disch_disp_summary.png) |
## Evaluation: Discharge Disposition Tukey HSD

![](images/discharge_disp_comparison.png)

Rothman finds all means statistically different; we find home care, nursing, and rehab all _not_ statistically different.

Note: we have substantially less test data, which contributes to the larger confidence intervals.
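## Evaluation: Tukey HSD (sketch)

A sketch of the pairwise comparison using `statsmodels`, assuming a DataFrame with one final RI per stay; the column names are illustrative.

```python
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def tukey_by_disposition(scores: pd.DataFrame):
    """Compare mean final RI across discharge disposition groups."""
    result = pairwise_tukeyhsd(endog=scores["rothman_index"],
                               groups=scores["disposition"],
                               alpha=0.05)
    print(result.summary())
```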
## Evaluation: Discharge Disposition Box Plots

| Rothman | Reproduction |
| ------- | ------------ |
| ![](images/disch_disp_boxplot_roth.png) | ![](images/disch_disp_boxplot.png) |
## Evaluation: Discharge Disposition AUC ROC

- Treating the RI as a binary classifier between:
  1. Death and Hospice
  2. Remaining discharge dispositions
- Rothman reports results between .915 and .965 on three datasets
- Our result: .865
## Evaluation: 24 Hour Mortality AUC ROC

- Treating the RI as a binary classifier between:
  1. Patient dying within 24 hours
  2. Patient not dying within 24 hours
- Rothman reports results between .929 and .948 on three datasets
- Our result: .898
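## Evaluation: AUC ROC (sketch)

A sketch of both AUC computations with `scikit-learn`. Lower RI is supposed to mean higher risk, so the score is negated before computing the AUC; argument names are illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def ri_auc(ri_scores, positive_outcome):
    """AUC of the RI against a binary outcome (e.g. death within 24 hours)."""
    return roc_auc_score(positive_outcome, -np.asarray(ri_scores, dtype=float))
```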
## Evaluation: 24 Hour Mortality AUC ROC

| Rothman | Reproduction |
| ------- | ------------ |
| ![](images/24fr_mortality_roc_roth.png) | ![](images/24hr_mortality_roc.png) |
## Evaluation: 24 Hour Mortality against RI

| Rothman | Reproduction |
| ------- | ------------ |
| ![](images/24hr_mortality_v_rothman_roth.png) | ![](images/24hr_mortality_v_rothman.png) |
## Evaluation: Takeaways

- Rothman does slightly better; they have more data
- The evaluation is primarily based on intuition and arbitrary classification tasks
- p-value hacking?
## Additional Evaluation (not in paper)
## Ongoing work

- Censoring: currently 'ignored'
  - We are potentially interested in multiple events, some of which censor others; this gets complicated fast
- Better estimates of missing values and robustness to faulty data
  - Gaussian processes; measure robustness to missing data with simulations
## Ongoing work

- Rothman uses only a single observation per patient to train
- Make principled use of longitudinal data in training
- Include interaction terms, constant features, medical history
- Eventually: create a robust generative model for all events of interest
## Sources

Rothman, M. J., Rothman, S. I., & Beals, J. (2013). _Development and validation of a continuous measure of patient condition using the Electronic Medical Record_. Journal of Biomedical Informatics.