
Post-HCT Survival Prediction for Clinical Risk Stratification

Client

Academic Research Team

Industry

Healthcare

Timeline

4 weeks

Type

ML & Predictive Analytics

Overview

A university research team collaborated with us to build a practical machine learning workflow for post-HCT (hematopoietic cell transplantation) survival prediction. The objective was to move from exploratory analysis to a reproducible modeling process that technical reviewers could trust. In a 4-week sprint, we delivered a full pipeline for data cleaning, feature engineering, model benchmarking, and validation, with a best holdout result of 0.761 ROC-AUC.

Challenge

The project started with a real healthcare machine learning problem: useful clinical features existed, but the modeling workflow was not stable enough for credible comparison.

  • Clinical and transplant variables had missing values and mixed data types.
  • Handling of categorical features had a direct impact on downstream model quality.
  • Single-model testing was not enough for a defensible survival prediction benchmark.
  • The team needed interpretable outputs for technical stakeholders, not only a final score.

Without a structured benchmark, the work risked becoming another one-off notebook result rather than a reusable clinical predictive analytics artifact.

Solution

We implemented a reproducible post-HCT risk modeling pipeline with clear decisions at each stage.

  • Architecture pattern: Sequential workflow across preprocessing, feature engineering, model training, validation, and interpretation.
  • Data strategy: Removed non-core features to keep the modeling scope tight, then standardized missing-value handling for both numerical and categorical signals.
  • Feature strategy: Discretized select numerical features into bins, then encoded categorical features so every model could train on consistent inputs.
  • Model strategy: Benchmarked LightGBM, CatBoost, AdaBoost, Random Forest, XGBoost, and Naive Bayes under consistent train/test and cross-validation conditions (sketched after this list).
  • Evaluation strategy: Used standard classification metrics plus a domain-focused custom metric combining class-1 recall, class-0 precision, and ROC-AUC.
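
To make these conditions concrete, here is a minimal sketch of the benchmarking loop. The synthetic dataset, split sizes, and near-default hyperparameters are stand-ins, not the engagement's actual configuration.

```python
# Benchmarking sketch: six model families, one shared stratified split.
# The synthetic dataset is a stand-in for the real post-HCT features.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier
from catboost import CatBoostClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42  # one fixed seed throughout
)

models = {
    "LightGBM": LGBMClassifier(random_state=42),
    "CatBoost": CatBoostClassifier(random_state=42, verbose=0),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "XGBoost": XGBClassifier(random_state=42, eval_metric="logloss"),
    "Naive Bayes": GaussianNB(),
}

# Every model sees exactly the same split and is scored with the same metric.
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    print(f"{name:>13}: holdout ROC-AUC = {roc_auc_score(y_test, proba):.4f}")
```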

Tradeoff: we prioritized model reliability and interpretability for technical review over building a production API in this phase.

Key Features

  1. End-to-end post-HCT survival prediction pipeline
  2. Missing-value handling for mixed clinical feature types
  3. Categorical-aware model benchmarking across six algorithms
  4. Holdout and 5-fold cross-validation evaluation workflow
  5. Custom metric design aligned with asymmetric classification risk (sketched below)
  6. Explainability layer using feature importance and SHAP outputs
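
A sketch of the custom metric from feature 5. The source does not specify the component weighting, so an unweighted mean is assumed here, and composite_risk_metric is an illustrative name.

```python
# Composite metric sketch: class-1 recall, class-0 precision, ROC-AUC.
# The unweighted mean is an assumption; the project's weighting may differ.
import numpy as np
from sklearn.metrics import precision_score, recall_score, roc_auc_score

def composite_risk_metric(y_true, y_proba, threshold=0.5):
    """Reward catching high-risk cases (class-1 recall), trustworthy
    low-risk calls (class-0 precision), and overall ranking (ROC-AUC)."""
    y_pred = (np.asarray(y_proba) >= threshold).astype(int)
    recall_1 = recall_score(y_true, y_pred, pos_label=1)
    precision_0 = precision_score(y_true, y_pred, pos_label=0)
    auc = roc_auc_score(y_true, y_proba)
    return (recall_1 + precision_0 + auc) / 3.0

# Toy example: a perfectly separated case scores 1.0.
print(composite_risk_metric([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))
```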

Technical Implementation

Backend & Infrastructure

The implementation was built in Jupyter notebooks with an artifact-first workflow across raw, interim, and processed datasets, which made the experimentation process reproducible and easier to audit. The final training and evaluation flow was deterministic, with a fixed random-state configuration.
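
A minimal sketch of that artifact-first convention, with illustrative directory and file names: each stage reads the previous stage's output from disk and writes its own, so any step can be re-run and audited in isolation.

```python
# Artifact-first layout sketch; paths and file names are illustrative.
from pathlib import Path
import pandas as pd

RANDOM_STATE = 42  # single fixed seed shared by all splits and models
RAW = Path("data/raw")
INTERIM = Path("data/interim")
PROCESSED = Path("data/processed")

def clean_stage() -> None:
    """Stage 1: read raw data, write a cleaned interim artifact."""
    df = pd.read_csv(RAW / "train.csv")
    df = df.drop_duplicates()  # example cleaning step
    INTERIM.mkdir(parents=True, exist_ok=True)
    df.to_csv(INTERIM / "train_clean.csv", index=False)

def feature_stage() -> None:
    """Stage 2: read the interim artifact, write model-ready features."""
    df = pd.read_csv(INTERIM / "train_clean.csv")
    # ... feature engineering happens here ...
    PROCESSED.mkdir(parents=True, exist_ok=True)
    df.to_csv(PROCESSED / "train_features.csv", index=False)
```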

Data & AI Components

The pipeline used approximately 28,800 labeled training records after preprocessing.

Core implementation details (condensed in the sketch after this list):

  • Removed non-modeling columns early to reduce noise.
  • Filled and encoded missing values with explicit strategy instead of silent drops.
  • Converted age features into bins for stronger categorical signal handling.
  • Trained and compared six model families using aligned data splits and metrics.
  • Added cross-validation summaries for stability checks.
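
Those details condense into a preprocessing function along these lines; column names, bin edges, and fill strategies are placeholders rather than the dataset's actual schema.

```python
# Preprocessing sketch; column names and bin edges are placeholders.
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    # Drop non-modeling columns early to reduce noise.
    df = df.drop(columns=["ID"], errors="ignore")

    # Explicit missing-value strategy: median for numeric columns,
    # an explicit "Missing" category for categoricals (no silent drops).
    num_cols = df.select_dtypes(include="number").columns
    cat_cols = df.select_dtypes(exclude="number").columns
    df[num_cols] = df[num_cols].fillna(df[num_cols].median())
    df[cat_cols] = df[cat_cols].astype("object").fillna("Missing")

    # Discretize age into bins for stronger categorical signal handling.
    if "age" in df.columns:
        df["age_bin"] = pd.cut(df["age"], bins=[0, 18, 40, 60, 120],
                               labels=["0-18", "18-40", "40-60", "60+"])
        df = df.drop(columns=["age"])

    # One-hot encode whatever is non-numeric at this point.
    final_cats = df.select_dtypes(exclude="number").columns
    return pd.get_dummies(df, columns=list(final_cats))
```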

Frontend & User Experience

This was an analytics-first engagement, so the primary UX value was for technical users reviewing model behavior. Outputs were organized for fast comparison, metric traceability, and interpretation rather than visual dashboard polish.
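
The interpretation outputs themselves are a few lines of SHAP on top of a fitted model. This sketch assumes a tree-based model and a held-out feature frame (model, X_test) like those in the benchmarking snippet above.

```python
# Explainability sketch: SHAP values for a fitted tree-based model.
# `model` and `X_test` are assumed from the benchmarking step above.
import shap

explainer = shap.TreeExplainer(model)        # supports CatBoost/LightGBM/XGBoost
shap_values = explainer.shap_values(X_test)  # per-feature contribution per row
shap.summary_plot(shap_values, X_test)       # global view for technical review
```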

Security & Reliability

Reliability came from consistent splits, repeatable preprocessing artifacts, and explicit metric definitions across all models.
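
The stability side of that claim is a standard 5-fold check, sketched here under the same synthetic-data assumptions as the earlier snippets.

```python
# 5-fold cross-validation stability check with a fixed seed.
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from catboost import CatBoostClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(
    CatBoostClassifier(random_state=42, verbose=0), X, y, cv=cv, scoring="roc_auc"
)
print(f"5-fold ROC-AUC: mean={scores.mean():.4f}, std={scores.std():.4f}")
```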

Results

  • 0.761 ROC-AUC achieved on holdout evaluation with CatBoost.
  • 0.7507 ROC-AUC (5-fold mean) demonstrated stable validation performance.
  • ~5.2 percentage points of ROC-AUC improvement over lower-performing benchmark models in the same workflow.
  • Delivered a credible baseline for clinical risk stratification and future healthcare ML iterations.

Technology Stack

  • AI/ML: CatBoost, LightGBM, XGBoost, AdaBoost, SHAP, scikit-learn
  • Backend: Python, pandas, NumPy, mlxtend
  • Frontend: Plotly, Matplotlib, Seaborn
  • Infrastructure: Jupyter notebooks

Interested in Similar Results?

Let's discuss how we can craft a custom solution for your business challenges.

Chat on WhatsApp

Quick response guaranteed