Introduction

The event-based model constructs a discrete picture of disease progression from cross-sectional data sets, with each event corresponding to a new biomarker becoming abnormal. However, it relies on the assumption that all subjects follow a single event sequence. This is a major simplification for sporadic disease data sets, which are highly heterogeneous, include distinct subgroups, and contain significant proportions of outliers.

In this work, we relax this assumption by considering two extensions to the event-based model:

  1. A generalized Mallows model, which allows subjects to deviate from the main event sequence;
  2. A Dirichlet process mixture of generalized Mallows models, which models clusters of subjects that follow different event sequences, each with its own variance.
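
To make the first extension concrete, the following is a minimal sketch of how a subject-specific event sequence could be drawn around a central sequence under a generalized Mallows model. It assumes one common parameterization, which the text above does not specify: the Kendall tau distance to the central sequence is decomposed into independent inversion counts V_j, with P(V_j = r) proportional to exp(-theta_j * r). The function name `sample_generalized_mallows`, the biomarker labels, and the dispersion values are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_generalized_mallows(central, thetas, rng):
    """Draw one ordering around `central` (illustrative sketch only).

    central: list of events in the central (main) sequence.
    thetas:  len(central) - 1 non-negative dispersion parameters
             (assumed parameterization); larger values keep the
             sample closer to the central sequence.
    rng:     a random.Random instance.
    """
    n = len(central)
    remaining = list(central)  # events not yet placed, in central order
    ordering = []
    for j in range(n - 1):
        # V_j counts how many remaining events the chosen one jumps ahead of.
        # It takes values 0 .. n-1-j with weight exp(-theta_j * r).
        weights = [math.exp(-thetas[j] * r) for r in range(n - j)]
        v = rng.choices(range(n - j), weights=weights)[0]
        ordering.append(remaining.pop(v))
    ordering.append(remaining.pop())  # the last event is forced
    return ordering


if __name__ == "__main__":
    rng = random.Random(0)
    central = ["amyloid", "tau", "atrophy", "cognition"]  # hypothetical events
    # Small thetas give orderings that deviate more from the central sequence.
    print(sample_generalized_mallows(central, [0.5, 0.5, 0.5], rng))
```

With all dispersion parameters large, the sampled ordering coincides with the central sequence, which corresponds to the single-sequence assumption of the original event-based model; with all parameters equal to zero, every ordering is equally likely.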