A small example: 6-person company testing
Imagine that you are part of a company of 6 people; we will call them A, B, C, D, E, and F. These six people work together and need to be tested every week for COVID19. There is a testing service that accepts saliva samples mailed in to them. Some notes:
- The testing service charges a moderate fee for each sample submitted.
- Sample mailing and processing takes 3-4 days, so only one round of tests per week can be run.
- The COVID19 test is good, but not perfect. There are news reports that the test rarely yields false positive results, but occasionally yields false negative results. More quantitative estimates are not available.
- Sick employees are told to stay home, so we expect to only rarely get positive results back.
How should the company proceed?
Below I outline 9 different ways the company could proceed.
(1) 1:1
Most people's first answer would be to simply test everyone, each with their own test. This testing design is illustrated below:

| Assay | Samples |
| --- | --- |
| 1 | A1 |
| 2 | B1 |
| 3 | C1 |
| 4 | D1 |
| 5 | E1 |
| 6 | F1 |
While this design is conceptually simple to run and interpret, it has the following problems:
- No error detection: If an assay were to fail for some reason, we would never know with this design. For example, if assay #5 were to return an incorrect negative result by random error, then we would miss the infection in person E.
- Inefficient: We assume that positive results are rare, so the vast majority of the time we expect to get back all negative results.
(2) Duplicate runs
If we wanted to check for errors, a simple design tweak would be to take two samples from each employee and run those duplicates too.
| Assay | Samples |
| --- | --- |
| 1 | A1 |
| 2 | B1 |
| 3 | C1 |
| 4 | D1 |
| 5 | E1 |
| 6 | F1 |
| 7 | A2 |
| 8 | B2 |
| 9 | C2 |
| 10 | D2 |
| 11 | E2 |
| 12 | F2 |
This design doubles the number of tests to 12, but it does introduce some level of error detection. If a report comes back with assays 1 and 7 positive, then we have strong evidence that employee A is positive. However, if only assay 1 is positive and assay 7 is negative, then we are less sure of our diagnosis.
(3) Triplicate runs
An even more robust strategy would be to take three samples from each employee and run all of those samples. This design is below:
| Assay | Samples |
| --- | --- |
| 1 | A1 |
| 2 | B1 |
| 3 | C1 |
| 4 | D1 |
| 5 | E1 |
| 6 | F1 |
| 7 | A2 |
| 8 | B2 |
| 9 | C2 |
| 10 | D2 |
| 11 | E2 |
| 12 | F2 |
| 13 | A3 |
| 14 | B3 |
| 15 | C3 |
| 16 | D3 |
| 17 | E3 |
| 18 | F3 |
By gathering three samples from each employee, we gain extra robustness to assay errors, but we now require 18 tests.
One advantage of triplicates over duplicates is that it is clearer what to do when the results are inconsistent. For example, if assays 1, 7, and 13 are all positive, then employee A is very likely positive. However, if employee A is positive in only 1 of 3 tests, then we can be more confident that the lone positive result was an error.
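The majority-vote logic over replicates can be sketched in a few lines of Python. This is a minimal illustration, not part of any real assay workflow, and `call_replicates` is a name I made up:

```python
from collections import Counter

def call_replicates(results):
    """Majority vote over an odd number of replicate results (True = positive).

    Returns the consensus call and how many replicates agreed with it."""
    counts = Counter(results)
    call, agreement = counts.most_common(1)[0]
    return call, agreement

# Three positives: a confident positive call.
print(call_replicates([True, True, True]))    # (True, 3)
# One positive out of three: the consensus is negative, and the lone
# positive looks like an assay error.
print(call_replicates([True, False, False]))  # (False, 2)
```

With duplicates, a 1-of-2 split gives no majority; with triplicates, the odd replicate count always breaks the tie.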
(4) Extreme pooling
We can exploit the rareness of positive results by simply pooling everyone into a single sample and mailing that sample in. This vastly simplified design looks like:

| Assay | Samples |
| --- | --- |
| 1 | A1, B1, C1, D1, E1, F1 |
This extreme pooling design requires only 1 test per week. However, the design has the following shortcomings:
- No error detection: Even more than the 1:1 design, the extreme pooling design has no ability to detect errors. A single assay error could mask very bad situations: if every employee happened to be infected but the assay returned a false negative, we would miss all six cases.
- Ambiguous: Assuming no errors, the assay only returns a clear answer when it is negative. If the assay returns a positive result, it indicates that *someone* is positive, but it doesn't specify who in particular is infected.
- Dilution: Depending on the sensitivity of the assay, there is a risk of diluting a single positive employee's sample with the negative samples from the other employees. Most COVID19 assays are very sensitive, but there are always edge cases and dilution will always reduce signal (COVID19 assays do not have homeopathic scaling).
(5) Paired duplicates
Somewhere between duplicates and extreme pooling is a paired duplicate design. Here two samples are taken from each employee, and the samples are pooled in overlapping pairs, such as the following:
| Assay | Samples |
| --- | --- |
| 1 | A1, B1 |
| 2 | B2, C1 |
| 3 | C2, D1 |
| 4 | D2, E1 |
| 5 | E2, F1 |
| 6 | F2, A2 |
This design forms a single stepped ring, from A-F and back to A. There is a similar design with two small rings such as the following:
| Assay | Samples |
| --- | --- |
| 1 | A1, B1 |
| 2 | B2, C1 |
| 3 | C2, A2 |
| 4 | D1, E1 |
| 5 | E2, F1 |
| 6 | F2, D2 |
In both of these designs we are able to test all 6 employees in duplicate with only 6 assays.
The paired duplicate design also has error detection. For example, if only assay 1 is positive, this implies some sort of error, as there is no employee infection status that maps to a single positive assay result. The design does not correct the error: we can't know whether assay 1 was a false positive, whether assay 2 or 6 was a false negative, or whether there were multiple errors! All we know is that something isn't right.
Assuming no errors, this design can always identify up to 1 positive employee, but it struggles when more employees are positive. As an example, in either design above, if employees A and C are positive, then the results "hide" the state of B, because B only appears in pools with A or C. Some pairs can still be resolved, however: positive results for employees C and D, for example, produce a uniquely identifiable assay pattern.
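This ambiguity is easy to check by brute force. The Python sketch below (helper names are my own) enumerates every infection state consistent with an observed set of positive assays for the single-ring design:

```python
from itertools import combinations

EMPLOYEES = "ABCDEF"
# Paired duplicate "single ring" design from the text: assay -> pooled employees
POOLS = {1: {"A", "B"}, 2: {"B", "C"}, 3: {"C", "D"},
         4: {"D", "E"}, 5: {"E", "F"}, 6: {"F", "A"}}

def expected_positives(infected):
    """Assays that come back positive for an infected set (assuming no errors)."""
    return {a for a, pool in POOLS.items() if pool & set(infected)}

def consistent_states(observed):
    """All infection sets that exactly reproduce the observed positive assays."""
    return [set(s) for r in range(len(EMPLOYEES) + 1)
            for s in combinations(EMPLOYEES, r)
            if expected_positives(s) == observed]

# If A and C are truly positive, assays 1, 2, 3, and 6 light up...
observed = expected_positives({"A", "C"})
print(sorted(observed))  # [1, 2, 3, 6]
# ...but two infection states explain that pattern, so B's status is hidden.
for state in consistent_states(observed):
    print(sorted(state))  # ['A', 'C'] then ['A', 'B', 'C']
```

The same enumeration confirms that positives for C and D alone yield exactly one consistent state.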
(6) Compressed paired duplicates
It is possible to compress the paired duplicate design above by making larger pools:
| Assay | Samples |
| --- | --- |
| 1 | A1, B1, C1 |
| 2 | B2, D1, F1 |
| 3 | C2, D2, E1 |
| 4 | A2, E2, F2 |
This compressed design tests all six employees in duplicate with only four tests.
The design can also uniquely identify up to one positive employee, just like the larger paired duplicate design in #5. And like that design, it can't uniquely call out more than one positive result. The compression comes at the small cost of larger pools (3 samples per pool vs. 2 in the paired duplicates).
The design also has error detection: if only one assay is positive, that indicates a false positive or a false negative somewhere.
The compression achieved by this design is remarkable in that it provides:
- Internal duplicate sampling
- Compression
- Error detection
The strength of non-adaptive designs is that they can provide all three of these benefits at once. In fact, these benefits become larger for larger designs, as I will discuss later.
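The unique-decoding property of this design can be checked directly: each employee occupies a distinct pair of the four assays, so a lone positive maps back to exactly one person. A minimal Python sketch (the helper names are my own):

```python
# Compressed paired duplicate design from the text: assay -> pooled employees
POOLS = {1: {"A", "B", "C"}, 2: {"B", "D", "F"},
         3: {"C", "D", "E"}, 4: {"A", "E", "F"}}

def signature(employee):
    """The set of assays that an employee's two samples appear in."""
    return frozenset(a for a, pool in POOLS.items() if employee in pool)

# Six employees and C(4, 2) = 6 distinct pairs of assays: each employee
# occupies a different pair, so a lone positive has a unique pattern.
SIGNATURES = {e: signature(e) for e in "ABCDEF"}
assert len(set(SIGNATURES.values())) == 6

def decode_single(observed):
    """Map an observed positive-assay set back to the lone positive employee.

    Returns None when no single employee explains the pattern (all negative,
    several positives, or an assay error)."""
    for employee, sig in SIGNATURES.items():
        if sig == observed:
            return employee
    return None

print(decode_single(frozenset({1, 3})))  # C (only C is in assays 1 and 3)
```

Note that the design fills the pairs exactly: four assays offer six possible pairs, and six employees use them all.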
(7) Compressed unbalanced
If we relax the requirement that each employee must be tested the same number of times, we can get even better compression:
| Assay | Samples |
| --- | --- |
| 1 | A1, D1, F1 |
| 2 | B1, D2, E1 |
| 3 | C1, E2, F2 |
This compressed design tests all six employees with only three tests.
The design can also uniquely identify up to one positive employee, and it can't uniquely call out more than one positive result. Note that this design could even fit a 7th employee (call them employee "G") if their sample were placed in all three assays. The design creates a distinct assay pattern for each case of 0 or 1 positive samples, but does not produce unique results with more than one positive sample.
The compression has both benefits and costs:
- Partial replication: Some samples are replicated and some are not. D, E, and F are tested twice, while A, B, and C are tested only once. This imbalance matters if our assay is error prone, because samples with less replication carry a greater risk of an undetected error.
- Larger pool size: The pool size varies more here and extends up to 3 samples per assay.
- Some error detection: For example, if all three assays came back positive, we would suspect either more than one positive sample or an error. Because there is more compression, however, there is inherently less room for error detection and correction.
(8) Compressed triplicates
Similar to the compressed duplicates, we can also devise a compressed triplicate design:
| Assay | Samples |
| --- | --- |
| 1 | A1, E1, F1 |
| 2 | A2, C1, D1 |
| 3 | B1, C2, E2 |
| 4 | B2, D2, F2 |
| 5 | A3, B3 |
| 6 | D3, E3 |
| 7 | C3, F3 |
In contrast to the 3*6 = 18 assays used in the standard triplicate runs shown in #3, this compressed design uses only 7 assays.
This larger design is also more robust than the compressed duplicates in #6, in that it can uniquely identify 0, 1, or 2 positive employees.
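This robustness claim can be checked by brute force: enumerate every infection state with at most two positives and confirm that each produces a distinct positive-assay pattern. A short Python sketch (names are illustrative):

```python
from itertools import combinations

# Compressed triplicate design from the text: assay -> pooled employees
POOLS = {1: {"A", "E", "F"}, 2: {"A", "C", "D"}, 3: {"B", "C", "E"},
         4: {"B", "D", "F"}, 5: {"A", "B"}, 6: {"D", "E"}, 7: {"C", "F"}}

def pattern(infected):
    """Positive assays produced by an infected set (assuming no assay errors)."""
    return frozenset(a for a, pool in POOLS.items() if pool & set(infected))

# Every infection state with 0, 1, or 2 positives...
states = [s for r in range(3) for s in combinations("ABCDEF", r)]
# ...produces a distinct positive-assay pattern, so each is uniquely decodable.
patterns = {pattern(s) for s in states}
print(len(states), len(patterns))  # 22 22
```

Since the 1 + 6 + 15 = 22 states map to 22 distinct patterns, any result with at most two true positives can be decoded unambiguously.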
(9) Compressed quadruplicate
Taking one more step, we can devise a compressed quadruplicate design like the following:
| Assay | Samples |
| --- | --- |
| 1 | B1, D1, F1 |
| 2 | A1, F2 |
| 3 | E1, F3 |
| 4 | C1, D2 |
| 5 | B2, E2 |
| 6 | C2, F4 |
| 7 | A2, C3, E3 |
| 8 | D3, E4 |
| 9 | B3, C4 |
| 10 | A3, B4 |
| 11 | A4, D4 |
This design distributes 6*4 = 24 samples over 11 pooled assays to test each employee in quadruplicate. The larger design is also more robust, in that it can uniquely identify 0, 1, 2, or 3 positive employees.
Note too that for this larger design, the maximum pool size begins to decrease. As the number of replicates increases, these designs tend to become:
- Less compact (less compression)
- More robust
- Smaller pool size
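A design's guaranteed detection limit can also be found by brute force. The sketch below (Python; `max_unique_positives` is a name I made up) searches for the largest number of positives that the quadruplicate design can always uniquely identify:

```python
from itertools import combinations

# Compressed quadruplicate design from the text: assay -> pooled employees
POOLS = {1: {"B", "D", "F"}, 2: {"A", "F"}, 3: {"E", "F"}, 4: {"C", "D"},
         5: {"B", "E"}, 6: {"C", "F"}, 7: {"A", "C", "E"}, 8: {"D", "E"},
         9: {"B", "C"}, 10: {"A", "B"}, 11: {"A", "D"}}

def pattern(pools, infected):
    """Positive assays produced by an infected set (assuming no assay errors)."""
    return frozenset(a for a, pool in pools.items() if pool & set(infected))

def max_unique_positives(pools, employees="ABCDEF"):
    """Largest d such that every infection set with at most d positives
    yields a distinct positive-assay pattern (i.e. is uniquely decodable)."""
    seen = {pattern(pools, ())}
    for d in range(1, len(employees) + 1):
        for state in combinations(employees, d):
            p = pattern(pools, state)
            if p in seen:           # two states are indistinguishable
                return d - 1
            seen.add(p)
    return len(employees)

print(max_unique_positives(POOLS))  # 3
```

The same function applied to the earlier designs reproduces the "max positive" column of the summary table.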
Summary
| Scenario | Replication | Assays | Compression | Max positive | Error detection | Max pool size |
| --- | --- | --- | --- | --- | --- | --- |
| (1) 1:1 | 1 | 6 | 1x | 6 | No | 1 |
| (2) Duplicate runs | 2 | 12 | 0.5x | 6 | Yes | 1 |
| (3) Triplicate runs | 3 | 18 | 0.33x | 6 | Yes | 1 |
| (4) Extreme pooling | 1 | 1 | 6x | 0 | No | 6 |
| (5) Paired duplicates | 2 | 6 | 1x | 1 | Yes | 2 |
| (6) Compressed paired duplicates | 2 | 4 | 1.5x | 1 | Yes | 3 |
| (7) Compressed unbalanced | 1-2 | 3 | 2x | 1 | Little | 3 |
| (8) Compressed triplicates | 3 | 7 | 0.86x | 2 | Yes | 3 |
| (9) Compressed quadruplicate | 4 | 11 | 0.55x | 3 | Yes | 3 |
600-person company testing
Now imagine that we had a larger company of 600 people with the same conditions. We could construct similar designs and get the following results:
| Scenario | Replication | Assays | Compression | Max positive | Error detection | Max pool size |
| --- | --- | --- | --- | --- | --- | --- |
| (1) 1:1 | 1 | 600 | 1x | 600 | No | 1 |
| (2) Duplicate runs | 2 | 1200 | 0.5x | 600 | Yes | 1 |
| (3) Triplicate runs | 3 | 1800 | 0.33x | 600 | Yes | 1 |
| (4) Extreme pooling | 1 | 1 | 600x | 0 | No | 600 |
| (5) Paired duplicates | 2 | 600 | 1x | 1* | Yes | 2 |
| (6) Compressed paired duplicates | 2 | 36 | 16.7x | 1* | Yes | 35 |
| (7) Compressed unbalanced | 1-10 | 10 | 60x | 1 | Little | 512 |
| (8) Compressed triplicates | 3 | 61 | 9.84x | 2* | Yes | 30 |
| (9) Compressed quadruplicate | 4 | 100 | 6x | 3* | Yes | 29 |
(*) Starred values indicate the guaranteed number of positives that can be uniquely identified. In practice, more positives are often detectable, just not 100% of the time.
These larger designs highlight some interesting features of nonadaptive designs:
- Larger designs can be more compressed. This increased compression is particularly clear for the compressed paired duplicates, where all 600 employees are tested in duplicate with only 36 assays.
- More compression means larger pool sizes. This is most clearly seen with extreme pooling, where all 600 employee samples are pooled into a single assay. While this assay is compact, if only one employee were positive, the signal from that employee's sample would be diluted by a factor of 600x.
- Larger designs are non-trivial to design. While the smaller designs for 6 employees can be worked out with pen and paper, the larger compact designs for 600 employees require specialized software to construct. Devising efficient ways to generate these designs is an active area of operations research and theoretical computer science.
- Implementing larger designs is complicated. With a design in hand, the physical process of pooling samples correctly is a challenge. For example, the compressed quadruplicate design (#9) for 600 employees involves collecting 4*600 = 2,400 samples and arraying them in a very particular pattern across 100 sample bottles. This mixing process is best handled by robotic liquid handlers.
- The maximum number of positives stays the same. A weakness of these designs is that the guaranteed number of positive samples they can uniquely identify remains the same small values. In practice, however, these designs can often uniquely identify many more positive samples, just not in 100% of cases. Furthermore, if disease prevalence is low, then we expect the number of positive cases to be low.
A note on design failure: Failing forward
When a nonadaptive design is used in a situation where there are too many positive results, how does it fail? Examination of designs such as the compressed paired duplicates above shows that "failure" here means the design will:
- Return negative results only for truly negative samples (no false negatives)
- Identify some samples as likely positives if they are the lone sample in a pooled assay that could explain its positive state
- Identify some samples as ambiguous if they are "hidden" by other positive results or if there are multiple consistent explanations
This failure mode is a desirable feature of nonadaptive designs: assuming no assay error, the design never generates a "false" result by labeling a positive sample as negative or vice versa. Instead, it produces three classes of results: positive, negative, and ambiguous.
Rather than forcing the ambiguous calls into positive or negative, they are better understood as samples that were effectively omitted from the assay altogether. Depending on the policy situation, safety may dictate treating ambiguous calls as their worst case just to be sure, but that policy decision should not be conflated with what the assay is doing: simply shrugging its shoulders and saying "I can't say."
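The three-class readout described above can be sketched as a small decoder. This is my paraphrase of the bullets as Python (the `classify` helper and its rules are illustrative, not a standard algorithm), applied to the single-ring paired duplicate design with A and C truly positive:

```python
# Paired duplicate ring design from scenario (5): assay -> pooled employees
POOLS = {1: {"A", "B"}, 2: {"B", "C"}, 3: {"C", "D"},
         4: {"D", "E"}, 5: {"E", "F"}, 6: {"F", "A"}}

def classify(pools, positive_assays):
    """Three-class readout: anyone in a negative pool is cleared, the lone
    uncleared member of a positive pool is a likely positive, and whatever
    else remains in a positive pool is ambiguous."""
    everyone = set().union(*pools.values())
    cleared = {s for assay, pool in pools.items()
               if assay not in positive_assays for s in pool}
    likely = {s for assay in positive_assays
              if len(pools[assay] - cleared) == 1
              for s in pools[assay] - cleared}
    ambiguous = everyone - cleared - likely
    return cleared, likely, ambiguous

# Observed positives {1, 2, 3, 6}: the pattern produced when A and C are sick.
negative, positive, ambiguous = classify(POOLS, {1, 2, 3, 6})
print(sorted(negative), sorted(positive), sorted(ambiguous))
# ['D', 'E', 'F'] ['A', 'C'] ['B']
```

A and C are called positive, D, E, and F are cleared, and B, hidden between the two positives, is reported as ambiguous rather than forced into either class.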