Diagnostics Development and Validation

Seamless approaches for development and validation of diagnostic devices using machine learning can increase the chance of market authorization.

E: monalisa.roy@musigmas.com

Suppose you have conducted a clinical study and used the data to train a machine learning classifier, perhaps embedded in a device, to predict the presence or absence of a disease, or of a risk factor for a disease, in future subjects. Are you now planning a validation (confirmatory) clinical trial for registrational purposes?

In our 10+ years of experience in this area, we have identified the major risks that cause a validation trial to fall short of the desired level of accuracy when predicting for new subjects. They are as follows:

To mitigate such risks, we propose an adaptive approach to augmented training + validation, and we have already supported several sponsors in applying it. An example of such an approach is depicted in the figure below:

An example of an adaptive augmented-training + validation trial design

flowchart LR
  A["Start Pilot-Validation<br/>(say with N=50)<br/>- Check Acc.<br/>- Th. Opt."] --> B{"Th. Opt. &<br/>Acc. Adequate?"}
  B -->|YES| C["Freeze Algorithm and threshold<br/>Re-calc. validation sample size<br/>based on final CV acc. from<br/>augmented training"]
  B -->|NO| D["Add pilot to<br/>training and<br/>re-train"]
  C --> H["Start Validation trial<br/>with pre-planned interim<br/>looks for early stopping<br/>and/or sample size<br/>re-estimation"]
  D --> E["Next Data Batch<br/>(say with N=20)<br/>- Check Acc.<br/>- Th. opt."]
  E -.-> F{"Th. Opt. &<br/>Acc. Adequate?"}
  F -->|YES| C
  F -.->|NO| G["Add last<br/>data batch<br/>to training &<br/>re-train"]
  subgraph AG["Augmented Training"]
    G -.-> E
    D
    E
    F
    G
  end
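
The decision loop in the figure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any sponsor's actual workflow: the names `fit`, `min_acc`, and `thr_tol` are hypothetical, the classifier is reduced to a single score per subject, and reading "Th. Opt. adequate" as "the threshold re-optimized on the new batch stays within a tolerance of the current threshold" is an assumption made for illustration. The data in the example are made up.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, true label 0/1); hypothetical data layout
FitFn = Callable[[List[Sample]], Tuple[Callable[[float], float], float]]


def accuracy_at(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Fraction of subjects classified correctly when score >= threshold means 'positive'."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def best_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Threshold optimization: the observed score that maximizes accuracy (lowest one on ties)."""
    return max(sorted(set(scores)), key=lambda t: accuracy_at(scores, labels, t))


def adaptive_augmented_training(train: Sequence[Sample], batches: Sequence[Sequence[Sample]],
                                fit: FitFn, min_acc: float = 0.85, thr_tol: float = 0.05):
    """Check each new batch; freeze if adequate, otherwise add it to training and re-train.

    Returns (score_fn, threshold, n_train, frozen).
    """
    data = list(train)
    score_fn, threshold = fit(data)
    for batch in batches:  # first batch = pilot, later ones = "next data batch"
        scores = [score_fn(x) for x, _ in batch]
        labels = [y for _, y in batch]
        acc = accuracy_at(scores, labels, threshold)     # "Check Acc."
        thr_batch = best_threshold(scores, labels)       # "Th. Opt."
        if acc >= min_acc and abs(thr_batch - threshold) <= thr_tol:
            return score_fn, threshold, len(data), True  # freeze algorithm and threshold
        data.extend(batch)                               # augment the training set
        score_fn, threshold = fit(data)                  # re-train
    return score_fn, threshold, len(data), False         # batches exhausted, nothing frozen


def toy_fit(data: List[Sample]):
    """Toy 'training': the score is the feature itself; only the threshold is learned."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    return (lambda x: x), best_threshold(xs, ys)


# Made-up illustrative data, far smaller than a real pilot (N=50) or batch (N=20).
train = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
pilot = [(0.3, 0), (0.7, 1), (0.75, 1), (0.25, 0)]
batch2 = [(0.2, 0), (0.72, 1), (0.85, 1), (0.4, 0)]

score_fn, threshold, n_train, frozen = adaptive_augmented_training(train, [pilot, batch2], toy_fit)
print(threshold, n_train, frozen)
```

In this toy run the pilot fails the accuracy check, is added to the training set, and the second batch passes, at which point the threshold is frozen. In a real design the adequacy criteria and batch sizes would be pre-specified in the protocol.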

The above adaptive strategy can also be generalized into a seamless training + validation trial, in which a learning curve (accuracy vs. training set size) is followed to decide when to start the validation phase.
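
One way to turn a learning curve into a start rule is sketched below. It is a hedged example only: the function name `validation_start_size`, the parameters `target_acc`, `min_gain`, and `window`, and the plateau criterion itself are assumptions made for illustration, and the curve values are invented rather than taken from any study.

```python
from typing import List, Optional, Tuple


def validation_start_size(curve: List[Tuple[int, float]], target_acc: float = 0.85,
                          min_gain: float = 0.01, window: int = 2) -> Optional[int]:
    """Return the first training-set size at which the learning curve has plateaued.

    curve holds (training set size, cross-validated accuracy) pairs in increasing size.
    The curve counts as plateaued (an assumed rule) when accuracy is at or above
    target_acc and has gained less than min_gain over the last `window` steps.
    Returns None if that never happens.
    """
    for i in range(window, len(curve)):
        n, acc = curve[i]
        gain = acc - curve[i - window][1]
        if acc >= target_acc and gain < min_gain:
            return n
    return None


# Invented learning curve: accuracy rises quickly, then flattens.
curve = [(50, 0.70), (70, 0.78), (90, 0.84), (110, 0.87), (130, 0.875), (150, 0.878)]
start_n = validation_start_size(curve)
print(start_n)
```

With these invented numbers the rule triggers at a training set size of 150, where accuracy exceeds the target and has improved by less than one percentage point over the previous two steps. A stricter `min_gain` or a longer `window` delays the start of validation in exchange for a more stable frozen algorithm.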

Our statisticians and data scientists have experience both in setting up efficient machine learning workflows and in designing risk-mitigated validation studies. We can also support the regulatory approval of such adaptive study designs and, more generally, all statistical aspects of training and validating machine learning classifiers.