Problem 352
Blood tests
Each one of the 25 sheep in a flock must be tested for a rare virus, known to affect 2% of the sheep population. An accurate and extremely sensitive PCR test exists for blood samples, producing a clear positive / negative result, but it is very time-consuming and expensive.
Because of the high cost, the vet-in-charge suggests that instead of performing 25 separate tests, the following procedure can be used instead:
The sheep are split into 5 groups of 5 sheep in each group. For each group, the 5 samples are mixed together and a single test is performed. Then,
- If the result is negative, all the sheep in that group are deemed to be virus-free.
- If the result is positive, 5 additional tests will be performed (a separate test for each animal) to determine the affected individual(s).
Since the probability of infection for any specific animal is only 0.02, the first test (on the pooled samples) for each group will be:
- Negative (and no more tests needed) with probability 0.98^5 = 0.9039207968.
- Positive (5 additional tests needed) with probability 1 - 0.9039207968 = 0.0960792032.
Thus, the expected number of tests for each group is 1 + 0.0960792032 × 5 = 1.480396016.
Consequently, all 5 groups can be screened using an average of only 1.480396016 × 5 = 7.40198008 tests, which represents a huge saving of more than 70% compared with the 25 individual tests!
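The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the simple two-stage scheme; the function name `two_stage_expected_tests` and its parameters are illustrative choices, not part of the problem statement.

```python
def two_stage_expected_tests(groups, group_size, p):
    """Expected tests when each group is pooled once and, if the pool is
    positive, every animal in that group is then tested individually."""
    p_negative = (1 - p) ** group_size          # pooled test comes back negative
    per_group = 1 + (1 - p_negative) * group_size
    return groups * per_group


if __name__ == "__main__":
    total = two_stage_expected_tests(groups=5, group_size=5, p=0.02)
    print(f"expected tests: {total:.8f}")
    print(f"saving vs. 25 individual tests: {1 - total / 25:.2%}")
```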
Although the scheme we have just described seems to be very efficient, it can still be improved considerably (always assuming that the test is sufficiently sensitive and that there are no adverse effects caused by mixing different samples). For example:
- We may start by running a test on a mixture of all the 25 samples. It can be verified that in about 60.35% of the cases this test will be negative, thus no more tests will be needed. Further testing will only be required for the remaining 39.65% of the cases.
- If we know that at least one animal in a group of 5 is infected and the first 4 individual tests come out negative, there is no need to run a test on the fifth animal (we know that it must be infected).
- We can try a different number of groups / different number of animals in each group, adjusting those numbers at each level so that the total expected number of tests will be minimised.
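The first two improvements can be checked numerically. The sketch below uses hypothetical helper names (`pooled_negative_probability`, `group_tests_skipping_last`) and assumes that, within a positive group, animals are tested one at a time in a fixed order, with the last test skipped when all earlier ones were negative.

```python
def pooled_negative_probability(n, p):
    """Probability that a pooled sample of n animals tests negative."""
    return (1 - p) ** n


def group_tests_skipping_last(size, p):
    """Expected tests for one group: one pooled test, then individual tests
    if the pool is positive, skipping the last animal when every earlier
    individual test was negative (it must then be the infected one)."""
    q = 1 - p
    p_positive = 1 - q ** size
    # The last test is saved exactly when the first size-1 animals are
    # healthy and the last one is infected.
    p_saved = q ** (size - 1) * p
    return 1 + p_positive * size - p_saved


if __name__ == "__main__":
    print(f"P(all 25 negative) = {pooled_negative_probability(25, 0.02):.4f}")
    print(f"5 groups of 5, skipping the last test: "
          f"{5 * group_tests_skipping_last(5, 0.02):.6f}")
```

Under these assumptions the skipped test alone lowers the five-group total from 7.40198008 to roughly 7.31, so most of the remaining gain has to come from choosing group sizes more carefully.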
To simplify the very wide range of possibilities, there is one restriction we place when devising the most cost-efficient testing scheme: whenever we start with a mixed sample, all the sheep contributing to that sample must be fully screened (i.e. a verdict of infected / virus-free must be reached for all of them) before we start examining any other animals.
For the current example, it turns out that the most cost-efficient testing scheme (we’ll call it the optimal strategy) requires an average of just 4.155452 tests!
Using the optimal strategy, let T(s,p) represent the average number of tests needed to screen a flock of s sheep for a virus that is present in any individual animal with probability p.
Thus, rounded to six decimal places, T(25, 0.02) = 4.155452 and T(25, 0.10) = 12.702124.
Find ΣT(10000, p) for p = 0.01, 0.02, 0.03, …, 0.50.
Give your answer rounded to six decimal places.
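One possible way to model the optimal strategy is a pair of dynamic-programming recurrences. The sketch below is an assumption about how the restriction can be encoded, not an official solution: `f[n]` is the expected number of tests for n sheep about which nothing is known, and `g[n]` is the expected number for n sheep known to contain at least one infected animal (so `g[1] = 0`). The names `expected_tests`, `f` and `g` are illustrative, and the code assumes 0 < p < 1.

```python
def expected_tests(s, p):
    """Expected number of tests for s sheep under the assumed recurrences.

    f[n]: n sheep, no prior information.
    g[n]: n sheep, at least one known to be infected.
    """
    q = 1.0 - p
    qpow = [q ** n for n in range(s + 1)]
    f = [0.0] * (s + 1)
    g = [0.0] * (s + 1)          # g[1] = 0: a lone animal in a positive pool is infected
    for n in range(1, s + 1):
        if n >= 2:
            at_least_one = 1.0 - qpow[n]
            best = float("inf")
            for k in range(1, n):
                # Test a sub-pool of k animals taken from the positive pool of n.
                pos = (1.0 - qpow[k]) / at_least_one
                cost = (1.0
                        + pos * (g[k] + f[n - k])      # sub-pool positive
                        + (1.0 - pos) * g[n - k])      # sub-pool negative
                best = min(best, cost)
            g[n] = best
        # Start with a pool of k animals, finish it, then handle the rest.
        f[n] = min(1.0 + (1.0 - qpow[k]) * g[k] + f[n - k]
                   for k in range(1, n + 1))
    return f[s]


if __name__ == "__main__":
    print(f"T(25, 0.02) ~ {expected_tests(25, 0.02):.6f}")
    print(f"T(25, 0.10) ~ {expected_tests(25, 0.10):.6f}")
```

The double loop is quadratic in s, so this direct form is practical for the 25-sheep example but would need some bound on the pool size, or a faster language, before being applied to s = 10000 for all fifty values of p.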