Analysis of 270 samples

Below is an analysis of a hypothetical pathogen screen of 270 patient samples, carried out with six different experimental designs.

The performance of each experimental design is described by two metrics: the Matthews correlation coefficient (MCC) of its calls and the number of assays it requires.

To create these diagrams, we randomly generated 500 experiments of 270 samples each for every combination of prevalence and single-assay accuracy. We then ran each assay in silico and decoded the results, which gave us the true positive, true negative, false positive, and false negative counts for every experiment. Each virtual run also let us record the number of assays that would need to be performed.
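The per-experiment bookkeeping described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function names (`simulate_assay`, `run_experiment`) are hypothetical, and it models the simplest case of one assay per sample.

```python
import random

def simulate_assay(true_status, accuracy, rng):
    """Return the assay call for one sample, flipping the true status
    with probability (1 - accuracy) in either direction."""
    return true_status if rng.random() < accuracy else not true_status

def run_experiment(n_samples, prevalence, accuracy, seed=0):
    """One in-silico experiment with individual testing of n_samples.
    Returns (TP, TN, FP, FN) counts for the run."""
    rng = random.Random(seed)
    tp = tn = fp = fn = 0
    for _ in range(n_samples):
        truth = rng.random() < prevalence      # sample is positive with prob. = prevalence
        call = simulate_assay(truth, accuracy, rng)
        if truth and call:
            tp += 1
        elif not truth and not call:
            tn += 1
        elif call:          # called positive, truly negative
            fp += 1
        else:               # called negative, truly positive
            fn += 1
    return tp, tn, fp, fn
```

Repeating `run_experiment` 500 times per (prevalence, accuracy) cell yields the distributions summarized in the figures.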

The MCC values and assay counts reported are medians over the full list of 500 runs. For each condition we also have a high-resolution description of the error distributions, as shown in Figure 2 below for three cases.
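For reference, the MCC is computed from the four confusion counts, and the median over runs can be taken directly. The helper names here are illustrative assumptions, but the MCC formula itself is standard:

```python
import math
import statistics

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when any
    marginal count is zero (degenerate denominator)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def median_mcc(confusions):
    """Median MCC over a list of (tp, tn, fp, fn) tuples,
    one tuple per simulated experiment."""
    return statistics.median(mcc(*c) for c in confusions)
```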


Figure 1: Matthews correlation coefficient and assay size requirements for four different models. The phase diagram shows which models work best under which conditions. The actual values in the figure can be displayed by hovering over a cell to reveal the underlying data.


Figure 2: Detailed comparison of statistics for three cases. (a) High accuracy, low prevalence; (b) medium accuracy, medium prevalence; (c) low accuracy, high prevalence.


Overall, these results show that the pure multiplex assay (L3) performs well at lower prevalence (0.4%, or just over 1 positive sample in the group of 270), while the larger multiplex design (XL5) works well up to 2% prevalence (over 5 positives out of 270). The MCC falls off at higher prevalence because the multiplex assays begin to produce more false positives (as shown in Figure 2).

Interestingly, the XL5 shows very robust error correction, particularly at low prevalence. For example, at the lowest prevalence of 0.1%, the XL5 design has a near-perfect MCC score even if the underlying assay is only 85% accurate. Note too that our simulated assay accuracy includes both positive and negative error: an 85% accurate assay will call a positive a negative 15% of the time, and a negative a positive 15% of the time. An assay with an 85% accuracy rate would be only marginally useful in standard designs, but becomes useful when multiple internal replicates are included.
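The intuition for why internal replicates rescue a mediocre assay can be illustrated with a simplified majority-vote model (this is not necessarily the XL5 decoding scheme, just the textbook binomial calculation for k independent replicates each correct with probability p):

```python
from math import comb

def majority_correct(p, k):
    """Probability that the majority of k independent replicates
    (k odd), each correct with probability p, gives the right call."""
    need = k // 2 + 1
    return sum(comb(k, i) * p**i * (1 - p)**(k - i) for i in range(need, k + 1))

# A single 85%-accurate measurement is correct 85% of the time,
# but a 5-replicate majority vote is correct ~97.3% of the time:
single = majority_correct(0.85, 1)   # 0.85
voted = majority_correct(0.85, 5)    # ~0.973
```

Real multiplex decoding shares information across overlapping pools rather than voting over identical replicates, but the same redundancy principle drives the error correction.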

The second stage of retesting yields significantly better results across the board, with the aXL5 design showing the greatest robustness to assay error.

Traditional one-assay-per-sample designs only become appropriate when the prevalence is high (>17%).

Examining the phase diagram, the Dorfman pool (pool_18) performs well across many prevalence values; however, this design is significantly more sensitive to assay errors. Its lack of robustness to error is likely due to two factors:
  1. The Dorfman pool design tests each pool once, and pools that report negative are assumed negative based on that single measurement. This makes Dorfman designs particularly sensitive to false negatives, as they have no means to identify or correct these errors.
  2. The Dorfman pool has few internal replicates. At most, a sample will be tested twice, if it is in a positive pool, while most samples are tested only once.
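Both failure modes above are visible in a minimal sketch of two-stage Dorfman decoding. This is an illustrative assumption of how pool_18 might be simulated (the function name and structure are hypothetical), with pool size 18 for 270 samples:

```python
import random

def dorfman_decode(truth, pool_size, accuracy, rng):
    """Two-stage Dorfman testing: test each pool once; retest
    individuals only in pools that test positive. A false-negative
    pool result hides every positive sample in that pool."""
    calls = [False] * len(truth)
    n_assays = 0
    for start in range(0, len(truth), pool_size):
        pool = truth[start:start + pool_size]
        pool_truth = any(pool)                       # pool is positive if any member is
        n_assays += 1
        pool_call = pool_truth if rng.random() < accuracy else not pool_truth
        if pool_call:
            # Stage 2: each member of a positive pool is tested individually,
            # so these samples get at most two measurements in total.
            for i, t in enumerate(pool, start):
                n_assays += 1
                calls[i] = t if rng.random() < accuracy else not t
        # Negative pools: all members assumed negative from one measurement.
    return calls, n_assays
```

With a perfect assay and no positives, 270 samples in pools of 18 cost only 15 assays; each positive pool adds 18 follow-up tests, and a single false-negative pool result silently misses every positive it contains.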