Back to the Lancet

Apparently, we need to go back and review a bit more extensively the article on Iraqi civilian casualties which was published in the 29 October 2004 edition of The Lancet.

This is an extraordinary article in several respects. The first of these is clearly the speed with which the article was submitted, peer-reviewed, and published. The publishers of The Lancet have acknowledged this.

Roberts and his colleagues submitted their work to us at the beginning of October. Their paper has been extensively peer-reviewed, revised, edited, and fast-tracked to publication because of its importance to the evolving security situation in Iraq.

The only thing happening around October involving the “evolving security situation in Iraq” that would justify fast-tracking anything onto a publication schedule of four weeks or less would have been the 2 November United States presidential election.

The editors go on to add:

And therefore certain limitations were inevitable and need to be acknowledged right away. The number of population clusters chosen for sampling is small; the confidence intervals around the point estimates of mortality are wide; the Falluja cluster has an especially high mortality and so is atypical of the rest of the sample; and there is clearly the potential for recall bias among those interviewed.

My criticism of the paper centers on those confidence intervals. They are not only wide, they are unacceptably wide. From the low-end estimate of 8,000, they spread over 186,000 possible casualties up to 194,000 to obtain a 95% CI. When I was taking political science courses back in the mid-1990s, we were required to use a 98% CI.

Now, let’s talk about what a “confidence interval” is. The Slate article I mentioned makes it a bit simpler than it actually is, or than most people care about, but since I’m going to take the time to talk about this again, I’ll do it right. A confidence interval is an interval in which a measurement or trial falls corresponding to a given probability (source: Eric W. Weisstein. “Confidence Interval.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/ConfidenceInterval.html).

When we apply this to the statement of Roberts et al., we find that they are saying “we are 95% confident that the actual number of casualties is somewhere between 8,000 and 194,000 and that repeated observations over time will show that it is between 8,000 and 194,000.”
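To make the “95%” concrete, here is a minimal simulation sketch in Python. It is my own illustration, not anything from the paper: the population mean, the spread, the sample size and the function name `coverage` are all made-up assumptions. It draws many samples from a population whose true mean is known, builds a 95% interval from each, and counts how often the interval contains the true value.

```python
import random
import statistics

def coverage(true_mean=10.0, sd=3.0, n=30, trials=2000, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean.

    All parameters are hypothetical illustration values.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        m = statistics.mean(sample)
        # Standard error of the sample mean.
        se = statistics.stdev(sample) / n ** 0.5
        # Normal-approximation 95% interval around the sample mean.
        lo, hi = m - 1.96 * se, m + 1.96 * se
        if lo <= true_mean <= hi:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should land close to 0.95. The 95% describes how often the procedure captures the true value across repeated samples, and it says nothing about how wide any one interval will be.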

This statistic works well in epidemiological circles when extrapolated to entire nations, and this would be acceptable in the context of the entire nation of Iraq if you assume a somewhat constant and consistent level of insurgency. However, as the Falluja cluster demonstrated, some areas are not pacified and others are. When you narrow the universe from which the samples are taken, you expect the range of your 95% CI to narrow as well. What is an acceptable range at 25 million is not acceptable at 10 million.

This fact is demonstrated by the paper itself:

After the invasion, 142 deaths were reported in 138439 person-months of residency. Before the invasion, respondent households reported 46 deaths during 110538 person-months of residency. As mentioned above, the Falluja cluster is an obvious outlier and might not belong with the others. When included, we estimate that the rate of death increased 2.5-fold after the invasion (relative risk 2.5 [95% CI 1.6-4.2]) compared with before the war. When Falluja was excluded, we estimated the relative risk of death for the rest of the country was 1.5 (95% CI 1.1-2.3).
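The headline 2.5 in the passage above can be roughly reproduced from the raw counts it quotes. The sketch below is my own back-of-the-envelope check, not the authors' method: it computes a crude ratio of death rates per person-month and a naive interval on the log scale that treats the deaths as independent counts. The function name `rate_ratio` is mine, and the calculation deliberately ignores the cluster sampling design.

```python
import math

def rate_ratio(deaths_after, pm_after, deaths_before, pm_before, z=1.96):
    """Crude rate ratio with a naive log-scale interval.

    Assumes independent deaths (no adjustment for clustering),
    which is a simplification of what the paper had to deal with.
    """
    rr = (deaths_after / pm_after) / (deaths_before / pm_before)
    # Standard error of log(rate ratio) for two independent counts.
    se_log = math.sqrt(1 / deaths_after + 1 / deaths_before)
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi

# Counts and person-months as quoted from the paper (Falluja included).
rr, lo, hi = rate_ratio(142, 138439, 46, 110538)
print(f"crude rate ratio {rr:.2f}, naive 95% CI {lo:.2f} to {hi:.2f}")
```

The crude ratio comes out at about 2.5, in line with the published figure. The naive interval is narrower than the published 1.6-4.2, which is what I would expect if the published interval accounts for the clustering and this one does not.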

Now, consider what Number Watch has to say about “relative risk numbers”:

In observational studies, they [publishers - MLC] will not normally accept an RR of less than 3 as significant and never an RR of less than 2. Likewise, for a putative beneficial effect, they never accept an RR of greater than 0.5. Sometimes epidemiologists choose to dismiss such caution as an invention of destructive sceptics, but this is not the case. For example:

In epidemiologic research, [increases in risk of less than 100 percent] are considered small and are usually difficult to interpret. Such increases may be due to chance, statistical bias, or the effects of confounding factors that are sometimes not evident. [Source: National Cancer Institute, Press Release, October 26, 1994.]

This strict view of RRs may be relaxed somewhat in special circumstances; for example in a fully randomised double blind trial, as opposed to an observational study, which produces a result with a high level of significance. (emphasis in original)

And take note again of the circumstances here. This is not a double-blind trial, but a collation of anecdotal stories. The relative risk ranges anywhere from 1.6 (not significant) to 4.2 (which would, in fact, be pretty significant), and their split-the-difference number is 2.5, which is considered only very marginally significant. But that includes Falluja. If you exclude Falluja, then the relative risk falls to a range of 1.1 (insignificant) to 2.3 (extremely marginally significant), with their middle-of-the-road number being 1.5 (also insignificant).

Since the paper’s authors admit that Falluja is an outlier that “might not belong with the others,” we see that the only way the authors can get their findings to bear any publishable risk significance at all is to include the outlier. If you drop that outlier, then there is no call to publish the paper at all unless you’re writing only about the worst-case scenario (something professional journals are loath to do on such matters). You only get this range if you include the outlier and then extrapolate that data to all Iraqi provinces, both pacified and unpacified.

Further, you have to deal with truthfulness on the part of the Iraqis, who may not be telling the whole truth about how family members died. The editors of The Lancet mention this generously as “recall bias.” While not calling them liars, the paper’s authors do state, “Many of the Iraqis reportedly killed by US forces could have been combatants.”

So, what’s the answer? How do you get better precision?

That’s simple and obvious: you get more clusters and increase the number of samples. This is an approach rejected by the authors. As the editors of The Lancet stated: “To have included more clusters would have improved the precision of their findings, but at an enormous and unacceptable risk to the team of interviewers who gathered the primary data.” I don’t blame the study’s authors for taking safety into consideration, but they should not blame me for calling them on the “limited precision” of the findings that come about as the result of their abundance of caution.
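How much precision would more clusters buy? As a rough rule of thumb, the half-width of a confidence interval shrinks with the square root of the number of independent clusters. The sketch below is an illustration under that simple assumption only. The cluster counts and the between-cluster spread are hypothetical, not figures from the paper, and `ci_half_width` is a name I made up.

```python
import math

def ci_half_width(sd_between_clusters, n_clusters, z=1.96):
    """Half-width of a normal-approximation interval for a mean of
    cluster-level estimates, assuming independent clusters."""
    return z * sd_between_clusters / math.sqrt(n_clusters)

# Hypothetical cluster counts; the spread is set to 1.0 so that only
# the relative change in width matters.
base = ci_half_width(1.0, 30)
for k in (30, 60, 120):
    print(k, round(ci_half_width(1.0, k) / base, 2))
```

Under this assumption, doubling the number of clusters narrows the interval by only about 30 percent, and you need four times as many clusters to cut the width in half. That is the trade-off the editors are describing when they weigh precision against the risk to the interviewers.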

MickC @ November 21, 2004
