Analysis of 2211 samples analyzed in 94 assays

Below is an analysis of a hypothetical pathogen screen of 1122 patient samples carried out in 4 different ways:


The metrics used to describe each experimental design performance are as follows:


To create these diagrams, we randomly generated 500 experiments of 1122 samples each for every combination of prevalence and single-assay accuracy rate. We then ran each assay in silico and decoded the results. For each experiment we could then compute the true positive, true negative, false positive, and false negative counts. Each virtual run also allowed us to record the number of assays that would need to be performed.
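
The simulation loop described above can be sketched for the simplest design, one assay per sample. This is a minimal illustration, not the code used for the figures: the function names are hypothetical, and it assumes that each sample is positive independently with probability equal to the prevalence and that an assay reports the truth with probability equal to the accuracy rate, with the same error rate for positives and negatives.

```python
import random


def simulate_individual_testing(n_samples, prevalence, accuracy, rng):
    """Simulate one experiment in which every sample gets its own assay.

    Assumptions (not taken from the original analysis): samples are
    positive independently with probability `prevalence`, and each assay
    reports the true status with probability `accuracy`.
    """
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for _ in range(n_samples):
        truth = rng.random() < prevalence      # true status of the sample
        correct = rng.random() < accuracy      # did the assay read correctly?
        call = truth if correct else not truth
        if truth:
            counts["tp" if call else "fn"] += 1
        else:
            counts["fp" if call else "tn"] += 1
    counts["assays"] = n_samples               # one assay per sample
    return counts


def run_condition(prevalence, accuracy, n_experiments=500, n_samples=1122, seed=0):
    """Repeat the experiment for one (prevalence, accuracy) condition."""
    rng = random.Random(seed)
    return [
        simulate_individual_testing(n_samples, prevalence, accuracy, rng)
        for _ in range(n_experiments)
    ]
```

Calling `run_condition(0.02, 0.95)` returns 500 dictionaries of confusion counts plus the assay count. A pooled design would replace the per-sample loop with its own pooling and decoding steps while producing the same kind of output.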

The MCC (Matthews correlation coefficient) values and the number of assays required are medians over the full list of 500 experiments. For each condition, we also have a high-resolution description of the error distributions, as shown in Figure 2 at the bottom for two conditions: prevalence 0.001 with assay accuracy rate 0.99, and prevalence 0.02 with assay accuracy rate 0.95.
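
For reference, the MCC of a single experiment follows directly from its four confusion counts, and the per-condition summary is the median over experiments. The sketch below uses the standard MCC formula; the helper names are hypothetical, and returning 0 when the denominator vanishes is a common convention assumed here, not something stated in the analysis.

```python
import math
from statistics import median


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: MCC is undefined here, so report 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


def median_mcc(confusion_counts):
    """Median MCC over an iterable of (tp, tn, fp, fn) tuples."""
    return median(mcc(*counts) for counts in confusion_counts)
```

The zero-denominator case matters at very low prevalence, where an experiment of 1122 samples can contain no positives at all.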

Figure 1: Matthews correlation coefficient (MCC) and assay size requirements for four different models. The phase diagram shows which models work best under what conditions. Hover over a cell to display the underlying values.

Figure 2: Detailed comparison of statistics for two cases: (a) a high-accuracy, low-prevalence case (prevalence=0.001, assay accuracy rate 0.99), and (b) a moderate-accuracy, moderate-prevalence case (prevalence=0.02, assay accuracy rate 0.95).

Discussion

Overall, these results show that the pure multiplex assay (XL3) performs well only at low prevalence (0.1%, or just over 1 positive sample in the group of 1122). The MCC falls off at higher prevalence because the XL3 assay begins to produce more false positives (as shown in Figure 2).

The second stage of retesting yields significantly better results across the board, with the aXL3 design showing the greatest robustness to assay error.

Traditional single-assay-per-sample designs become appropriate only when the prevalence is high (>12%).

Not surprisingly, the single-assay accuracy rate is important. At accuracies below 0.98, none of the designs can achieve an MCC score of >0.95. I suspect that a model with a greater number of internal replicates could overcome this assay error, but it would do so at the cost of the design's compression, which is possibly a worthwhile tradeoff.
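
To get a rough sense of the replicate tradeoff, one can compute the effective accuracy of a majority vote over independent replicates of the same assay. This is only a back-of-the-envelope sketch under stated assumptions: replicate errors are independent, the result is decoded by simple majority, and the function name is hypothetical. It does not model how replicates would actually be embedded in the XL3 or aXL3 designs.

```python
from math import comb


def majority_vote_accuracy(accuracy, replicates):
    """Probability that a majority of independent replicates is correct.

    Assumes an odd number of replicates, each correct independently
    with probability `accuracy`.
    """
    if replicates % 2 == 0:
        raise ValueError("use an odd number of replicates to avoid ties")
    needed = replicates // 2 + 1  # smallest number of correct reads that wins
    return sum(
        comb(replicates, k) * accuracy**k * (1 - accuracy) ** (replicates - k)
        for k in range(needed, replicates + 1)
    )
```

Under these assumptions, three replicates at a single-assay accuracy of 0.95 give an effective accuracy of about 0.9928, at the price of three times as many assays for each replicated measurement.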