## Reproducing the Rothman Index
29 November 2016
## Outline:
1. Why model patient deterioration
2. What is the Rothman Index
3. Reproducing it on our data
4. Next steps for modeling deterioration
## Why model patient deterioration
- There are many possible diagnoses, making it difficult to identify the correct one
- It might be easier to just look for signs that things are generally going badly
- Help physicians identify 'where to look first' when coming on call or making
resource allocation decisions
## Goals of the Rothman Index
- Should be easy to calculate and interpret
- Should be applicable to a large majority of patients
## What is the Rothman Index
- A number between -91 and 100, where lower is more 'deteriorated'
- Broadly speaking, calculated as:
$RI = 100 - \alpha \sum_i ExcessRisk(feature_i)$
- Take away: extremely simple additive model
## Learning the Index: Feature Selection
- Criteria for feature selection:
1. related to patient condition
2. regularly collected on almost all patients
3. susceptible to change (e.g. not age, sex, or diagnosis)
## Learning the Index: Feature Selection
- Further narrowed down on the basis of
  - parsimony - excluding the less frequently collected of correlated features
  - relevance - coefficient p-value < 0.05 in a forward stepwise
    logistic regression of features on 1-year mortality
  - the coefficients are not otherwise used - this is not a logistic regression model!
## Learning the Index: Selected Features
![features used in rothman index](images/rothman_features.png)
## Learning the Index: Excess Risk
- Definition:
> Percent increase in 1-year all-cause mortality associated with any
value of a clinical variable, relative to the minimum 1-year
mortality identified for that variable.
- It is not entirely clear from the paper how they calculate this
- Fit linear model with polynomial terms to binned data normalized to
risk of death at lowest bin?
- Not relevant to us; we're training on a continuous outcome anyway
## Learning the Index: Excess Risk
- Discrete Case: Respiratory Rate
![respiratory excess risk graph](images/resp_excess_risk.png)
## Learning the Index: Excess Risk
- Binary Case: Nursing Assessments
- A simple ratio of mortality percentages between pass and fail cases
![nursing assessment excess risk graph](images/nursing_excess_risk.png)
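- One plausible reading of that ratio, as a minimal sketch; `mort_fail` and `mort_pass` (names are ours) are the observed 1-year mortality rates for failing and passing patients:

```python
def binary_excess_risk(mort_fail: float, mort_pass: float) -> float:
    # Percent increase in mortality for a failed assessment, relative
    # to the minimum (passing) mortality -- one reading of the paper.
    return mort_fail / mort_pass - 1.0

binary_excess_risk(0.12, 0.04)  # -> 2.0, i.e. a 200% increase
```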
## Building the model
- Remember
$RI = 100 - \alpha \sum_i ExcessRisk(feature_i)$
- Details:
  - If a feature is unobserved (within some time tolerance),
    $ExcessRisk(feature_i) = 0$
  - i.e. pretend that feature is at its minimum-risk value
    (which sidesteps the missing-data problem)
  - $\alpha$ scales the sum so that most patients fall between 0 and 100
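- A minimal sketch of the additive model, assuming `excess_risk` maps feature names to fitted excess-risk functions and `obs` holds the latest value per feature (`None` when unobserved); `alpha=1.0` is a placeholder, not the paper's constant:

```python
def rothman_index(obs, excess_risk, alpha=1.0):
    total = 0.0
    for name, er in excess_risk.items():
        value = obs.get(name)
        if value is None:
            continue  # unobserved feature contributes zero excess risk
        total += er(value)
    return 100 - alpha * total
```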
## Building the model
- Laboratory tests are collected infrequently.
- The final Rothman Index combines two Rothman Indices, each built as described so far
$RI = RI_{noLab}\,\gamma + smoothingFunction(RI_{lab}(1 - \gamma))$
- Where $\gamma = \frac{TimeSinceLabs}{48}$
- These contain overlapping subsets of the features listed previously
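- A sketch of the blend, assuming the smoothing has already been applied to the lab index (the paper does not fully specify the smoothing function) and that $\gamma$ is clamped to $[0, 1]$:

```python
def combined_index(ri_nolab, ri_lab_smoothed, hours_since_labs):
    # Clamp gamma: labs older than 48 h contribute nothing (our assumption).
    gamma = min(hours_since_labs / 48.0, 1.0)
    return ri_nolab * gamma + ri_lab_smoothed * (1.0 - gamma)
```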
## Building the model
- Assumes other measures are collected more often than once every 48 hours.
- Assumes linear decay of the relevance of lab scores over time.
- Assumes all labs come back at the same time.
- Smoothing function prevents large jumps when labs come back.
## Reproducing the model
- Train the same model on our data.
- We don't have 1-year mortality, so we calculate excess risk using
'time to discharge' data.
## Cox proportional hazard models
$ \lambda(t \mid X_i) = \lambda_0(t)\exp(X_i \beta) $
- $X_i$ is the feature vector for individual $i$
- $\lambda_0(t)$ is the baseline hazard function (when $\beta = 0$, $\exp(X_i \beta) = 1$)
## Cox proportional hazard models
$ \lambda(t \mid X_i) = \lambda_0(t)\exp(X_i \beta) $
- The parameter vector can only change the scale of the hazard function,
i.e. all individuals have proportional hazard functions.
- A significant assumption, but a convenient one in our case: we care exactly about how
different feature values affect relative risk, not about how the hazard
varies over time.
## Cox proportional hazard models
- Because the Rothman Index only uses one (the last) observation per patient,
we also take only one random observation per feature per patient, and look
at how long before discharge that observation was taken.
- We then train a proportional hazards model, as sketched below.
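- A sketch of the fit with the `lifelines` package, assuming `df` holds one sampled observation per patient: feature columns plus hypothetical `hours_to_discharge` and `discharged` (event indicator, 0 if censored) columns:

```python
from lifelines import CoxPHFitter

cph = CoxPHFitter()
cph.fit(df, duration_col='hours_to_discharge', event_col='discharged')
cph.print_summary()  # fitted coefficients, hazard ratios, p-values
beta = cph.params_   # coefficient vector beta (recent lifelines versions)
```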
## Cox proportional hazard models
- The natural output for a proportional hazard model is the _hazard ratio_
$$ \frac{\lambda_0(t)\exp(X_i \beta)}{\lambda_0(t)} = \exp(X_i \beta) $$
- This is how many times more at risk of 'event' subject $i$ is than someone
with all zero-valued features.
- Note that because we assume the hazard functions for all subjects are
proportional, this does not vary in $t$
## Cox proportional hazard models
- Because in our case the event is 'discharge from the hospital', a higher ratio
indicates a better outcome for the patient, which is not what we want for a Rothman
Index substitute.
- We might instead want to know how many times more likely a patient with $X_i$
is to still be in the hospital at time $t$:
$$ \frac{S(t)}{S_0(t)} =
\frac{\exp(-\int_0^t \lambda_0(s) \exp(X_i \beta)\,ds)}
{\exp(-\int_0^t \lambda_0(s)\,ds)} $$
$$ = \exp\left((1 - \exp(X_i \beta)) \int_0^t \lambda_0(s)\,ds\right) $$
## Cox proportional hazard models
$$ \frac{S(t)}{S_0(t)} =
\frac{\exp(-\int_0^t \lambda_0(s) \exp(X_i \beta)\,ds)}
{\exp(-\int_0^t \lambda_0(s)\,ds)} $$
$$ = \exp\left((1 - \exp(X_i \beta)) \int_0^t \lambda_0(s)\,ds\right) $$
- But of course this _does_ vary with time, so it doesn't plug in nicely to the
Rothman Index formula.
- Conceptually, it doesn't make sense to have a 'proportional survival' model
## Excess risk function
- Instead, we normalize hazard ratios to the maximum ratio, attained at
$X_j = \arg\max_{X_i} \exp(X_i \beta)$:
$$ \frac{ \frac{\lambda_0(t) \exp(X_i \beta)}{\lambda_0(t)} }{\frac{\lambda_0(t) \exp(X_j \beta)}{\lambda_0(t)} } = \exp((X_i - X_j) \beta) $$
- For $i = j$, $ER_i = \exp(0 \cdot \beta) = 1$
## Excess risk function: remix
- Look at how much lower the normalized ratios are than the maximal
hazard ratio, which by construction has been normalized to 1.
$$ ER' = 1 - \exp((X_i - X_j) \beta) $$
- Interpretation:
> Percent decrease in instantaneous hospital discharge rate associated
with any value of a clinical variable, relative to the maximum hospital
discharge rate identified for that variable.
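- A sketch of $ER'$ for a single feature, assuming `beta` is its fitted Cox coefficient and `grid` (names are ours) a vector of candidate feature values:

```python
import numpy as np

def excess_risk_prime(grid, beta):
    ratios = np.exp(grid * beta)     # hazard ratio at each feature value
    x_max = grid[np.argmax(ratios)]  # value with maximal discharge rate
    return 1.0 - np.exp((grid - x_max) * beta)
```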
## Excess risk function: Rothman
> Percent increase in 1-year all-cause mortality associated with any
value of a clinical variable, relative to the minimum 1-year
mortality identified for that variable.
## Excess risk function prime
> Percent decrease in instantaneous hospital discharge rate associated
with any value of a clinical variable, relative to the maximum hospital
discharge rate identified for that variable.
## Excess risk comparison
| Rothman | Reproduction |
| ---------- | ------------ |
| ![](images/resp_excess_risk.png) | ![](images/resp_excess_risk_prime.png) |
## Excess risk comparison
| Rothman | Reproduction |
| ---------- | ------------ |
| ![](images/spo2_excess_risk.png) | ![](images/spo2_excess_risk_prime.png) |
## Weighting
- Not explicitly specified in paper
- I adapt their weighting by solving for weights under the constraints:
1. $\sum w_f = \alpha$
2. $\frac{w\_f}{w\_{f'}} = \frac{48 - t\_f}{48 - t\_{f'}}$ for $0 \le t\_f, t\_{f'} \le 48$
3. $w_f = 0$ for $t_f > 48$
## Weighting
i.e. solve for $\pmb w$
$$
\begin{bmatrix}
1 & 48 - t\_2 & 48 - t\_3& \cdots & 48 - t\_n \\\\
1 & -(48 - t\_1) & 0 & \cdots & 0 \\\\
1 & 0 & -(48 - t\_1) & \cdots & 0 \\\\
\vdots & \vdots & \vdots & \ddots & \vdots \\\\
1 & 0 & 0 & \cdots & -(48 - t\_1)
\end{bmatrix}^T
\begin{bmatrix}
w\_1 \\\\
\vdots \\\\
w\_n
\end{bmatrix}
= [\alpha \;\; 0 \;\; \cdots \;\; 0]^T
$$
where each $t\_i$ is the hours since feature $i$ was last measured, capped at 48.
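- A sketch of the solve, assuming `hours_since` holds the time since each feature was last measured and that the first feature is fresh ($48 - t\_1 > 0$, so the system is nonsingular):

```python
import numpy as np

def solve_weights(hours_since, alpha):
    t = np.minimum(np.asarray(hours_since, dtype=float), 48.0)
    r = 48.0 - t                 # 'relevance'; zero for stale features
    n = len(r)
    A = np.zeros((n, n))
    A[0, :] = 1.0                # constraint 1: weights sum to alpha
    for k in range(1, n):        # constraint 2: r_k * w_1 = r_1 * w_k
        A[k, 0] = r[k]
        A[k, k] = -r[0]
    b = np.zeros(n)
    b[0] = alpha
    return np.linalg.solve(A, b)
```

- Equivalently, the constraints give the closed form $w\_f = \alpha \frac{48 - t\_f}{\sum\_{f'} (48 - t\_{f'})}$, so the solve just normalizes the relevances.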
## Weighting
- Note that this weighting scheme implicitly assumes that unknown
excess-risks are at the weighted average of known excess risks
- Different from Rothman, which assumes they're at optimal values
- Weights constantly change, but solving small matrix equations is fast.
## Evaluation
Rothman evaluates the method in four ways:
- Discharge Disposition
- 24-Hour Mortality
- 30-Day Readmission
- Comparison to existing indices (APACHE III and MEWS)
We do the first two.
## Evaluation: Discharge Disposition Summaries
| Rothman | Reproduction |
| ---------- | ------------ |
| ![](images/disch_disp_summary_roth.png) | ![](images/disch_disp_summary.png) |
## Evaluation: Discharge Disposition Tukey HSD
![](images/discharge_disp_comparison.png)
Rothman has all means statistically different; we have home care, nursing and
rehab all _not_ statistically different.
Note: we have substantially less test data, which contributes to larger confidence
intervals (TODO: WRITE MORE ABOUT TRAINING AND TESTING SAMPLE SIZE)
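A sketch of the test with `statsmodels`, assuming `df` has one row per discharge with a final index value and a `disposition` label (column names are ours):

```python
from statsmodels.stats.multicomp import pairwise_tukeyhsd

result = pairwise_tukeyhsd(df['rothman_index'], df['disposition'])
print(result.summary())  # pairwise mean differences with confidence intervals
```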
## Evaluation: Discharge Disposition Box Plots
| Rothman | Reproduction |
| ---------- | ------------ |
| ![](images/disch_disp_boxplot_roth.png) | ![](images/disch_disp_boxplot.png) |
## Evaluation: Discharge Disposition AUC ROC
- Treating the RI as a binary classifier between:
1. Death and Hospice
2. Remaining discharge dispositions
- Rothman reports AUCs between .915 and .965 across three datasets
- Our result: .865
## Evaluation: 24 Hour Mortality AUC ROC
- Treating the RI as a binary classifier between:
1. Patient dying within 24 hours
2. Patient not dying within 24 hours
- Rothman reports AUCs between .929 and .948 across three datasets
- Our result: .898
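- A sketch of the AUC computation with `scikit-learn`, assuming arrays `ri` (index values) and `died_24h` (24-hour mortality labels; names are ours):

```python
from sklearn.metrics import roc_auc_score

# Lower RI means higher risk, so negate the score before computing AUC.
auc = roc_auc_score(died_24h, -ri)
```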
## Evaluation: 24 Hour Mortality AUC ROC
| Rothman | Reproduction |
| ---------- | ------------ |
| ![](images/24fr_mortality_roc_roth.png) | ![](images/24hr_mortality_roc.png)|
## Evaluation: 24 Hour Mortality against RI
| Rothman | Reproduction |
| ---------- | ------------ |
| ![](images/24hr_mortality_v_rothman_roth.png) | ![](images/24hr_mortality_v_rothman.png)|
## Evaluation: Takeaways
- Rothman does slightly better; they have more data
- Evaluation is primarily based on intuition and arbitrary classification
- p-value hacking?
## Additional Evaluation (not in paper)
## Ongoing work
- Censoring - currently being 'ignored'
- We are potentially interested in multiple events, some of which censor
others - gets complicated fast
- Better estimates of missing values and robustness to faulty data
- Gaussian processes, measure robustness to missing data with simulations
## Ongoing work
- Rothman uses only a single observation per patient to train
- Make principled use of longitudinal data in training
- Include interaction terms, constant features, medical history
- Eventually: create robust generative model for all events of interest
## Sources
Rothman, M. J., Rothman, S. I., & Beals, J. (2013).
_Development and validation of a continuous measure of patient condition using the Electronic Medical Record_, Journal of Biomedical Informatics, 46(5), 837-848.