Interpreting Test Results

Due 10/2/2014

For background, read "The accuracy of yes/no classification". We'll look at two cases: breast cancer screening via mammography, and prostate cancer screening via PSA testing. Turn in your answer as a text file -- you can use any method that you like to do the required arithmetic. As usual, it's helpful to explain the steps that lead to your answers.

Breast Cancer

According to Emily Banks et al., "Influence of personal characteristics of individual women on sensitivity and specificity of mammography in the Million Women Study", BMJ 2004:

A sample of 122,355 women aged 50-64 years were followed for outcome of screening and incident breast cancer in the next 12 months. [...] Overall sensitivity was 86.6% and specificity was 96.8%. 

The incidence of breast cancer in that sample was 0.6%. Take that to represent the prevalence of the condition in the relevant population (women aged 50-64).

The cited paper includes this sentence (with numbers removed):

Breast cancer was diagnosed in [N1] women, [N2] in screen positive and [N3] in screen negative women; [N4] were screen positive but had no subsequent diagnosis of breast cancer.

Based on the sensitivity, specificity, and prevalence, can you determine the values of N1, N2, N3, and N4, without looking at the paper? Compare your results to the numbers in the paper.
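As a starting point for the arithmetic, here is a minimal Python sketch of one way to turn the three stated rates into expected counts. The function name `expected_counts` and its dictionary keys are illustrative choices, not anything taken from the paper. It assumes the rounded rates quoted above apply exactly to the whole sample, so the results are estimates that may differ from the paper's actual counts.

```python
def expected_counts(n, sensitivity, specificity, prevalence):
    """Expected confusion-matrix counts for n screened people (estimates only)."""
    diseased = n * prevalence                  # N1: all diagnosed cases
    true_pos = sensitivity * diseased          # N2: cases among screen positives
    false_neg = diseased - true_pos            # N3: cases among screen negatives
    healthy = n - diseased
    false_pos = (1 - specificity) * healthy    # N4: screen positive, no diagnosis
    return {"N1": diseased, "N2": true_pos, "N3": false_neg, "N4": false_pos}


# Rates as quoted in the assignment text; prevalence is the rounded 0.6% figure.
counts = expected_counts(122_355, 0.866, 0.968, 0.006)
for name, value in counts.items():
    print(f"{name}: {value:.0f}")
```

Because the inputs are rounded percentages, small discrepancies from the published counts are to be expected.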

Based on those numbers, what are the odds that a woman in that age range who gets a positive screening result will subsequently be diagnosed with breast cancer?
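"Odds" can be read either as a probability (cases among all screen positives) or as a ratio (cases to non-cases among screen positives), so it may help to report both. The sketch below is one hedged way to do that. The helper name `ppv_and_odds` is hypothetical, and the inputs 636 and 3,892 are rounded expected counts implied by the stated rates (0.866 × 0.006 × 122,355 and 0.032 × 0.994 × 122,355), not the figures printed in the paper.

```python
def ppv_and_odds(true_pos, false_pos):
    """Return (probability, odds) that a positive screen is a real case."""
    ppv = true_pos / (true_pos + false_pos)   # share of positives that are cases
    odds = true_pos / false_pos               # cases per false alarm
    return ppv, odds


# Rounded expected counts derived from the stated rates (an assumption).
ppv, odds = ppv_and_odds(true_pos=636, false_pos=3892)
print(f"probability = {ppv:.3f}, odds = {odds:.3f} (about 1 to {1 / odds:.1f})")
```

Substituting the paper's own counts for the two arguments gives the corresponding figures for the published data.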

Below is Figure 1 from Per-Henrik Zahl et al., "Incidence of breast cancer in Norway and Sweden during introduction of nationwide screening", BMJ 2004:

The graph shows incidence rates (for different age ranges, different locations, and different collection years) of between about 50 and 350 per 100,000. Recalculate the odds of a positive mammogram meaning cancer, for these estimated rates, assuming that the sensitivity and specificity would remain at 86.6% and 96.8%.
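When only rates are available, the same quantity follows directly from Bayes' rule, with no sample size needed. The following is a minimal sketch under the assumption stated above that sensitivity and specificity stay fixed at 86.6% and 96.8%. The function name `positive_predictive_value` and the two rates chosen (the approximate ends of the range read off the graph) are illustrative.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive test) by Bayes' rule."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1 - specificity) * (1 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


SENSITIVITY, SPECIFICITY = 0.866, 0.968

# Approximate low and high incidence rates per 100,000, as described in the text.
results = {}
for per_100k in (50, 350):
    prevalence = per_100k / 100_000
    results[per_100k] = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"{per_100k:>4} per 100,000: P(cancer | positive) = {results[per_100k]:.4f}")
```

Intermediate rates can be added to the tuple to see how quickly the answer changes with prevalence.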

Prostate Cancer

According to Flip H. Jansen et al., "Prostate-Specific Antigen (PSA) Isoform p2PSA in Combination with Total PSA and Free PSA Improves Diagnostic Accuracy in Prostate Cancer Detection", European Urology 2010, the traditional %fPSA test has (estimated) 5.7% specificity at 95% sensitivity, and 22.2% specificity at 90% sensitivity.

That paper reports on a proposed new test -- a statistical combination of a larger set of indicators -- that has (estimated) 23.2% specificity at 95% sensitivity, and 36.2% specificity at 90% sensitivity.

As for prostate cancer prevalence, here's what the National Cancer Institute's Surveillance, Epidemiology, and End Results Program has to say about it:

So the overall incidence of new cases is 147.8 per 100,000 per year; but the rates are highly dependent on age. This figure from Cancer Research UK suggests that the rate for men 50-54 is about 50 per 100,000, while the rate for men 75-79 is about 800 per 100,000:

Assume the new-and-improved test, with 23.2% specificity at 95% sensitivity, and 36.2% specificity at 90% sensitivity. If a 50-year-old male gets a positive test result, what are the chances that he actually has prostate cancer? What about a 75-year-old male?
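One way to organize this calculation is to loop over both operating points and both age groups. The sketch below does that in Python. It makes a loud assumption: the annual incidence rates quoted above (about 50 and 800 per 100,000) are used as stand-ins for prevalence, which is a simplification. The names `OPERATING_POINTS`, `RATES_PER_100K` and `positive_predictive_value` are hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive test) by Bayes' rule."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1 - specificity) * (1 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# (sensitivity, specificity) pairs quoted for the proposed new test.
OPERATING_POINTS = [(0.95, 0.232), (0.90, 0.362)]

# Assumption: annual incidence per 100,000 used in place of prevalence.
RATES_PER_100K = {"age 50": 50, "age 75": 800}

ppv_table = {}
for label, per_100k in RATES_PER_100K.items():
    for sensitivity, specificity in OPERATING_POINTS:
        ppv = positive_predictive_value(sensitivity, specificity, per_100k / 100_000)
        ppv_table[(label, sensitivity)] = ppv
        print(f"{label}, sensitivity {sensitivity:.0%}, "
              f"specificity {specificity:.1%}: P(cancer | positive) = {ppv:.4%}")
```

Swapping in the %fPSA specificities (5.7% and 22.2%) shows how much the proposed test changes the answer.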