Diagnostics Development and Validation

Seamless approaches for development and validation of diagnostic devices using machine learning can increase the chance of market authorization.

So you have conducted a clinical study and used the data to train a machine learning classifier, perhaps embedded in a device, to predict the presence or absence of a disease, or of a risk factor for a disease, in future subjects, and you are now planning a validation (confirmatory) clinical trial for registrational purposes?

In our 10+ years of experience in this area, we have identified the following major risks that a validation trial fails to show the desired level of prediction accuracy for new subjects:

To mitigate such risks, we propose an adaptive approach to augmented training + validation, and we have supported several sponsors in applying it. An example of such an approach is depicted in the figure below:

Figure: An example of an adaptive augmented-training + validation trial design. The flowchart shows the following steps.

Augmented Training

1. Start pilot validation (say with N=50): check accuracy (Acc.) and threshold optimization (Th. Opt.).
2. Th. Opt. and Acc. adequate? If YES, go to step 5. If NO, add the pilot data to the training set and re-train.
3. Next data batch (say with N=20): check accuracy and threshold optimization.
4. Th. Opt. and Acc. adequate? If YES, go to step 5. If NO, add the last data batch to the training set, re-train, and return to step 3.
5. Start the validation trial with pre-planned interim looks for early stopping and/or sample size re-estimation.
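The augmented-training loop in the figure can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not our production workflow: the "classifier" is a single cut-off on one simulated biomarker score, "re-training" means re-optimizing that cut-off, and adequacy is reduced to one accuracy target. The helpers `simulate_batch`, `accuracy`, `optimize_threshold` and `augmented_training`, and the parameters `target_acc` and `max_batches`, are hypothetical names introduced here.

```python
import random

def simulate_batch(n, rng, shift=1.5):
    """Hypothetical data: (score, has_disease) pairs; diseased subjects score higher on average."""
    batch = []
    for _ in range(n):
        has_disease = rng.random() < 0.5
        score = rng.gauss(shift if has_disease else 0.0, 1.0)
        batch.append((score, has_disease))
    return batch

def accuracy(data, threshold):
    """Fraction of subjects classified correctly by the rule 'score >= threshold'."""
    return sum((score >= threshold) == label for score, label in data) / len(data)

def optimize_threshold(data):
    """Stand-in for (re-)training: pick the observed score that maximizes accuracy."""
    return max((score for score, _ in data), key=lambda t: accuracy(data, t))

def augmented_training(training, rng, target_acc=0.75,
                       pilot_n=50, batch_n=20, max_batches=10):
    """Return (threshold, training set size, ready_for_validation)."""
    threshold = optimize_threshold(training)
    batch_size = pilot_n                      # the first look is the pilot
    for _ in range(max_batches + 1):
        batch = simulate_batch(batch_size, rng)
        # Accuracy is checked only on data the current threshold has not seen.
        if accuracy(batch, threshold) >= target_acc:
            return threshold, len(training), True
        training = training + batch           # fold the failed batch into training
        threshold = optimize_threshold(training)
        batch_size = batch_n                  # later looks use smaller batches
    return threshold, len(training), False

rng = random.Random(2024)
initial_training = simulate_batch(100, rng)
threshold, n_train, ready = augmented_training(initial_training, rng)
print(f"threshold={threshold:.2f}, training size={n_train}, ready={ready}")
```

The key design point is that each batch is scored with the threshold fixed beforehand and is added to the training data only after it has failed the check, so no batch is used to judge a model that was trained on it.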

The above adaptive strategy can also be generalized into a seamless training + validation trial, in which a learning curve (accuracy vs. training set size) is followed to decide when to start the validation phase.
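One simple way to turn a learning curve into a decision rule is a plateau check: start validation once the best accuracy over the most recent looks no longer improves on the earlier best by a pre-specified margin. The sketch below assumes such a rule. The functions `plateau_reached` and `validation_start_size`, the parameters `window` and `min_gain`, and the curve values are all hypothetical illustrations, and in a real trial the rule would be pre-specified in the protocol.

```python
def plateau_reached(curve, window=3, min_gain=0.01):
    """curve: list of (training_size, accuracy) pairs, sorted by training size.

    True when the best accuracy in the last `window` looks exceeds the best
    accuracy of all earlier looks by less than `min_gain`.
    """
    if len(curve) <= window:
        return False                      # not enough looks to judge a plateau
    accs = [acc for _, acc in curve]
    return max(accs[-window:]) - max(accs[:-window]) < min_gain

def validation_start_size(curve, window=3, min_gain=0.01):
    """First training size at which the plateau rule fires, or None."""
    for i in range(1, len(curve) + 1):
        if plateau_reached(curve[:i], window, min_gain):
            return curve[i - 1][0]
    return None

# Hypothetical learning curve: accuracy estimated at growing training sizes.
curve = [(50, 0.70), (70, 0.76), (90, 0.80),
         (110, 0.805), (130, 0.803), (150, 0.806)]
print(validation_start_size(curve))
```

Comparing the best recent value against the best earlier value, rather than only the last two points, makes the rule less sensitive to noise in a single accuracy estimate.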

Our statisticians and data scientists have experience both in setting up efficient machine learning workflows and in designing risk-mitigated validation studies. We can also support the regulatory approval of such adaptive study designs and, more generally, all statistical aspects related to the training and validation of machine learning classifiers.