The Major Flaws With The Lancet Reports On Iraqi Deaths, Part I
In 2004 and 2006, the British medical journal The Lancet published two reports by an American-Iraqi survey team that estimated the number of deaths that occurred after the 2003 U.S. invasion of Iraq. Together, they popularly became known as the Lancet Reports. They immediately gained notoriety, because their numbers were far above any others. The first estimated 98,000 excess deaths from March 2003 to September 2004. The second had a figure of 654,965 killed from March 2003 to July 2006. The two studies had major flaws that undermined their findings. Four of them were the timing of their release, the conduct of the survey teams that interviewed Iraqis, the fact that their methodology and protocol were not always followed, and the writers’ refusal to share their data. While the two reports were well received by the public, that reception ignored the major flaws in their work.
The Lancet studies used a standard cluster sample survey in which people were randomly selected to be interviewed. The research and writing were done by a team from Johns Hopkins University, Columbia University, and Baghdad’s al-Mustansiriya University. The point of the study was to estimate how many excess deaths had occurred after the 2003 overthrow of Saddam Hussein. To do that, the first Lancet report compared the 14.5 months before the U.S. invasion to the 17.8 months afterward, covering the period from January 1, 2002 to September 20, 2004. The teams randomly selected districts within provinces using random numbers, then neighborhoods within those districts, then randomly selected a street, then a house on that street, and then interviewed 30 consecutive houses. In total, the teams conducted surveys in 33 clusters, comprising 988 households and 7,868 people. Each cluster represented 1/33 of the Iraqi population, or about 739,000 people. Because not every province contained that many people, some smaller ones were paired together, and the teams randomly selected which of the two would be visited. In the end, Baghdad, Ninewa, Sulaymaniya, Salahaddin, Babil, Karbala, Dhi Qar, Maysan, Diyala, Wasit, and Anbar ended up being surveyed. Each household was asked how many people lived there, their ages and sexes, the movement of people in and out of the household, how many births there had been, and how many deaths since January 2002. Names were not supposed to be recorded for security reasons. Only 5 of the 988 households refused to participate. When a death was reported, the team asked for a death certificate to ensure that fatalities were not being faked. 80% of the time, a certificate was presented.
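The multistage selection procedure described above can be sketched in code. This is a minimal illustration of the general technique, not the authors' actual software; the place names, population figures, and street counts below are invented for demonstration.

```python
import random

# Hypothetical province populations and neighborhoods (illustrative only)
provinces = {"A": 6_000_000, "B": 2_500_000, "C": 1_500_000}
neighborhoods = {"A": ["A1", "A2", "A3"], "B": ["B1", "B2"], "C": ["C1", "C2"]}

random.seed(42)  # fixed seed so the demo is reproducible

# Stage 1: pick a province with probability proportional to its population
province = random.choices(list(provinces), weights=provinces.values())[0]
# Stage 2: pick a neighborhood within that province at random
neighborhood = random.choice(neighborhoods[province])
# Stage 3: pick a street, then a starting house, then take 30 consecutive houses
street = random.randint(1, 20)
start_house = random.randint(1, 50)
cluster = [start_house + i for i in range(30)]

print(province, neighborhood, street, len(cluster))
```

Weighting the first stage by population is what makes each cluster stand in for an equal share of the country, which is why the report could treat every cluster as 1/33 of Iraq.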
The first Lancet report raised eyebrows because of its high estimate of how many Iraqis died after the U.S. invasion. When all the results of the survey were tabulated, the death rate from the 14.5 months before the March 2003 invasion was subtracted from the rate in the 17.8 months afterward to determine the number of excess deaths. Because 71% of the deaths recorded occurred in Fallujah, that cluster was considered an outlier and was not included in the totals. The estimated pre-war mortality rate was 5 deaths per 1,000 people per year. After the war, not including Fallujah, it increased to 7.9 deaths. That led to a range of possible post-invasion deaths of 8,000-194,000, with 98,000 being the most probable figure. If Fallujah had been included, that figure would have surpassed 100,000. The leading cause of death was violence, most of which came from Coalition forces, especially air strikes. When the report was published it caused shockwaves that reverberated for years afterwards. No other study or organization had come close to 98,000 Iraqis killed in the period immediately after the invasion. That was only outdone by the second Lancet paper.
The 2006 Lancet was put together by the same American-Iraqi team, and followed the same methodology. The second survey was conducted from May 20 to July 10, 2006, and included 16 of Iraq’s 18 provinces: Baghdad, Ninewa, Basra, Sulaymaniya, Dhi Qar, Babil, Irbil, Diyala, Anbar, Salahaddin, Najaf, Wasit, Qadisiyah, Tamim, Maysan, and Karbala. Survey teams went to 50 randomly selected clusters to interview 40 households each, for a total of 1,849 households with 12,801 people. Three clusters were thrown out for problems, leaving 47 to be included in the findings. The questions asked were the same as in the first Lancet: the age and sex of each household member, how many people lived there, who moved in and out, the number of births, and the number of deaths, followed by a request for a death certificate if a fatality was mentioned. This time, a certificate was presented in 92% of the cases where one was asked for. Again, no names were supposed to be included, for security purposes.
The second Lancet had an even higher number of estimated deaths. The report covered the time period from January 2002 to July 2006. It estimated a pre-invasion mortality rate of 5.5 deaths per 1,000 people each year, compared to 13.2 afterward. That yielded an excess rate of 7.8 deaths per 1,000 people per year after March 2003, and a possible range of 392,979-942,636 fatalities, with the highest probability being 654,965 killed after the U.S. invasion. 601,027 of those deaths were from violence. The authors claimed that because the 2006 results were very close to the 2004 ones, the two studies validated each other. The second Lancet received even more coverage than the first, forcing world leaders like President George Bush and Prime Minister Tony Blair to respond.
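The arithmetic behind an excess-death estimate of this kind can be reproduced as a rough back-of-the-envelope check. The population figure and the 40-month window below are my own approximations, not the study's exact person-year calculations, so the result only lands in the neighborhood of the published 654,965 rather than matching it.

```python
pre_rate = 5.5    # deaths per 1,000 people per year, before the invasion
post_rate = 13.2  # deaths per 1,000 people per year, after the invasion
population = 26_000_000  # rough Iraqi population (assumption, not from the study)
years = 40 / 12          # March 2003 to July 2006, about 40 months

excess_rate = post_rate - pre_rate  # 7.7 per 1,000 per year (report rounded to 7.8)
excess_deaths = excess_rate / 1000 * population * years
print(round(excess_deaths))  # roughly in the 650,000-700,000 range
```

The point of the check is only that the published rates, multiplied out over the whole population and period, do produce a figure of the same order as 654,965.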
Initially, both Lancet reports were received positively. Bradley Woodruff of the U.S. Centers for Disease Control and Prevention, for instance, believed that the methodology of the first Lancet was sound. Similar comments were made about the second Lancet. Some of the Lancet authors also had experience conducting surveys in war zones such as Bosnia, Congo, and Rwanda, adding to their credibility. To this day, various news reports and books mention the second Lancet figure for how many Iraqis might have died during the Iraq War, showing that the reports are still widely accepted.
At the same time, there were those who were more skeptical. The main reason was the magnitude of the estimate of excess deaths caused by the war. No other figures came close to either of the Lancet studies. Iraq’s Health Ministry, for instance, reported approximately 50,000 deaths by 2006, compared to the 654,965 of the second Lancet. Iraq Body Count had 53,916 deaths up to that point. To understand exactly how many fatalities the second Lancet was implying, Mark Van der Laan and Leon de Winter, in a paper for the University of California, Berkeley, looked at one specific period covered in the study. The second Lancet broke up its results into three time periods, the last of which was June 2005 to June 2006. During that time, the teams recorded 165 deaths. When extrapolated to the entire Iraqi population, that would equal 330,000 people dying across all of Iraq in that one-year period. That would break down to 27,500 deaths per month, 6,875 per week, and 982 per day. Of those deaths, 40,000 would have been caused by air strikes, 60,000 by car bombs, 40,000 by other types of bombs, and 174,000 by shootings. In comparison, Iraq Body Count recorded 21,593 deaths for that same period. Iraq Body Count does not rely upon survey work for its figures, but rather uses press reports, which obviously do not capture all of the violence across Iraq, but are still a good starting point for a comparison. If the second Lancet was correct, that would mean the media missed 93% of the deaths in Iraq from June 2005 to June 2006. Some questioned whether the press could really be doing that bad a job. Iraq Body Count and Iraq’s Health Ministry also found around three Iraqis wounded in the war for every one killed. If the second Lancet’s death figure was correct, that would mean 1.8 million people were also wounded, approximately 1 in every 15 Iraqis in the entire country.
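Van der Laan and de Winter's extrapolation can be checked with simple arithmetic. The scaling step below assumes the survey's 12,801 people stand in for a national population of roughly 26 million (my approximation); the per-week and per-day figures follow the four-week-month convention the breakdown above implies.

```python
sample_deaths = 165       # deaths the teams recorded, June 2005 - June 2006
sample_size = 12_801      # people in the surveyed households
population = 26_000_000   # rough national population (assumption)

# Scale the sample death count up to the whole country
national = sample_deaths * population / sample_size  # ~335,000; cited as ~330,000
per_month = 330_000 / 12   # 27,500 deaths per month
per_week = per_month / 4   # 6,875 per week, using a 4-week month
per_day = per_month / 28   # ~982 per day, using a 28-day month

print(round(national), per_month, per_week, round(per_day))
```

Dividing by 4 weeks and 28 days rather than 52 weeks and 365 days is what reproduces the 6,875 and 982 figures exactly; a calendar-year division gives slightly lower numbers.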
Hospitals only recorded a tenth of that number, which would mean 90% of those injured, around 1.6 million, never went to be treated for their wounds. Not only that, but there were other surveys done in Iraq that came up with lower estimates. One was conducted by the United Nations’ Development Program (UNDP) from April to May 2004, four months before the first Lancet, which estimated a range of 18,000-29,000 Iraqi deaths. In October 2004, a Norwegian team did the Iraqi Living Conditions Survey for the U.N., which estimated 23,743 civilians killed in the 13 months after the March 2003 invasion. Finally, in 2008 the U.N.’s World Health Organization published a report in the New England Journal of Medicine that estimated 151,000 deaths from March 2003 to June 2006. All of those surveys covered a much larger proportion of the population, and were more rigorous than the two Lancet reports. Given all those other estimates, it was easy to understand why some would not take the Lancet reports at face value. That led to a healthy debate about the validity of the two reports. The problem was that most of that debate took place in academia, and was never heard by the general public.
One major problem with the Lancet reports was the timing of their publication. One of the Lancet authors, Dr. Les Roberts of Columbia University’s Mailman School of Public Health, pushed The Lancet journal to publish the first survey just before the 2004 U.S. presidential election. The fieldwork was finished in mid-September 2004. Roberts went through the data, analyzed it, and wrote up the paper just before the end of the month. He claimed that in that short period it was peer reviewed and edited. He then emailed the results to The Lancet on September 30 on the condition that it would be published before the vote in America. Roberts later said that he did not want to influence the election, but rather wanted the candidates to “pledge to protect civilian lives in Iraq.” Roberts was also openly anti-war. Lancet editor Richard Horton agreed to fast-track the paper, saying that it should be made public as quickly as possible because of what it revealed about the effects of the war. Horton was anti-war as well. The paper was finally published on October 29, 2004, the Friday before the balloting in the United States. In a similar pattern, the second Lancet was published right before Congressional elections. Roberts would later give different reasons for why he rushed the publishing of the Lancet reports. At one time, he claimed that he had no political motive for the timing of the first Lancet, and that if it did not appear immediately it would look like the authors were trying to cover up their results. Roberts also told people that Riyahd Lafta of al-Mustansiriya University, who organized and managed the survey teams in Iraq, and all the people who were interviewed could have been killed if the first Lancet did not come out when it did. The fact that Roberts and Lancet editor Horton were both against the Iraq War, and were likely motivated to get the first survey out as quickly as possible despite Roberts’s denial of a political agenda, was not the issue.
His statements about wanting to protect the Iraqi survey teams and those interviewed rang hollow, as Lafta told the Chronicle of Higher Education that he was never in danger, and Roberts later revealed that when the surveys were conducted everyone in the neighborhood was informed about them by word of mouth, so who was involved was no secret back in Iraq. The Lancet paper also included no details about where the survey clusters took place, so how publishing the paper was supposed to protect anyone was a mystery. The real issue was Roberts’s claim that the first Lancet was reviewed and edited in just a few days after he finished writing it up. Most papers take months to go through that process. That raised the question of whether the survey was ever properly checked, because he wanted it out as soon as possible to shape the elections. Peer review is a basic tenet of any published research, so the veracity of the first Lancet had to be questioned.
A second major issue was how the survey teams could interview so many households in one day. Roberts said that in 2004, the teams took a long time to do their work. One rural cluster took 6 hours to complete. Later on, the authors changed their story, saying that on average one team could interview its allotment of 30 houses in just three hours. That would average out to only six minutes per house, with no breaks. In 2006, the authors said that the teams could visit all 40 households in just two hours. That would mean it took only three minutes to do each interview. This raised important questions. In order to do their work, a team would have to map out an area to be surveyed, conduct a random selection process for which street to start from, then number all the houses in the area, and randomly pick a house to start with. After they knocked on the door of the first house they would have to introduce themselves, explain the survey, ask for permission to carry it out, and go inside the house and meet the rest of the household before even conducting the poll. That would include recording the number of people in the house, writing down their ages, genders, births, any moves, and deaths. If a death was reported they asked for a death certificate, which the family would then have to find and show to the team, and almost every single family had these papers according to the two Lancet reports. The team would then have to say goodbye, and go to the next house. All of that would be very difficult to accomplish in just 6 minutes, let alone 3 minutes. The survey teams were also smaller in 2006 than in 2004, meaning there were fewer people to do more work. The responses by the Lancet authors to these criticisms only added to the controversy. Gilbert Burnham of the Johns Hopkins Bloomberg School of Public Health commented that the 2006 teams were able to accomplish 40 interviews in one day because they split up into two groups when they got to a neighborhood.
That was not mentioned in the Lancet article. It was also contradicted by an Iraqi survey member, who said they never broke up their teams, and by Roberts, who stated that the teams only split up in 2004, not 2006. At another time, Roberts stated that the interviews were sped up by having children spread the word through the neighborhood about the survey. Again, this was not included in the Lancet piece. One possible explanation was that the teams spent much more time in the field, but didn’t tell the authors. Still, even if they worked 10-12 hours straight in 2006, that would only give them 15-18 minutes per household, still a difficult task. A more troubling possibility was that the teams did not go to their allotted number of houses each day, and faked their results. This could have been resolved if the authors had released the start and end times for the survey work. That proved impossible, because according to them that information was destroyed to protect the interviewees’ safety. Burnham was either confused about the teams breaking up in 2006 or lied about it, and then he and the others conveniently destroyed all the records that would show what really happened. What the teams actually did, and what the authors knew about it, was thus an open question, and pointed to a lack of oversight of the survey work.
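The per-household timing claims above reduce to simple division. A small sketch makes the comparison explicit; the hours-in-the-field values are the ones the various accounts gave.

```python
def minutes_per_house(hours_in_field: float, houses: int) -> float:
    """Average minutes available per interview, assuming no breaks or travel."""
    return hours_in_field * 60 / houses

print(minutes_per_house(3, 30))   # 2004 claim: 3 hours for 30 houses -> 6.0 minutes
print(minutes_per_house(2, 40))   # 2006 claim: 2 hours for 40 houses -> 3.0 minutes
print(minutes_per_house(12, 40))  # even a 12-hour day in 2006 -> 18.0 minutes
```

Since these averages ignore travel between houses, mapping and numbering streets, and introductions, the real time per interview would have been even shorter than the figures shown.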
To add to this, the authors admitted that more things happened in the field than what they wrote about. The survey teams were supposed to use random numbers each time they picked a district, neighborhood, street, and house. Instead, Roberts later said that the teams would either number the houses, write them on pieces of paper, and pick them out of a hat, or assign numbers to the houses and use serial numbers off of paper money to pick which one to start with. Both are random processes. The issue was that this was just another example of the published report differing from what actually happened. If the teams didn’t use random numbers, spent more time in the field than they claimed, or didn’t even go to all the houses they said they did, what else might be wrong with the Lancet reports?
Of even greater importance, the survey forms included a spot to record people’s names, a violation of protocol, and the authors lied about it. In February 2009, Johns Hopkins University released its findings on the methodology used in the second Lancet report. It went through the survey papers, and found that those used in the field were different from the ones created by the authors, because there was a spot to record the names of those interviewed. Burnham and company knew about this, and repeatedly lied about it. Burnham and co-author Shannon Doocy said that they met with Lafta in Jordan as soon as the fieldwork was done to verify all the data collected. They claimed that the forms did not include names. At a speech and in an interview with Johns Hopkins Magazine, Burnham stated the same thing. Burnham also did an interview with the National Journal claiming that the survey was conducted just as planned. After the University’s investigation came out, Burnham changed his story to say that the first names of people were recorded to make sure there was no repetition, but not their last names. That was contradicted by Doocy, who was asked about the matter and said that some of the survey papers had people’s last names on them, meaning that she was aware of the violations as well, and had lied about it before. Johns Hopkins found that the surveys included both first and last names. As a result of this revelation, Burnham was censured by the school, and barred from participating in any further fieldwork. He repeatedly said he was deeply worried about the safety of all the Iraqis involved in the survey, but then he violated one of his own security measures meant to protect them. Not only that, but he and Doocy lied about the names being recorded. Again, this raised serious concerns about the authors’ credibility and the conduct of the surveys, as more information came out about what actually happened in the field and the lack of quality control and adherence to protocol.
A final issue was that the authors refused to share their data. The Lancet authors have only released a limited amount of their material, and given it directly to select individuals rather than circulating it to everyone. Not only that, but Lafta refused to talk with reporters or answer any questions about how he conducted the fieldwork in Iraq. Likewise, Roberts said that he couldn’t release any information because of confidentiality requirements at Johns Hopkins, and because it could endanger the lives of those interviewed back in Iraq. He would later give a speech in 2007 in which he stated that he didn’t want to share the Lancet statistics with anyone, and that if it were up to him, they would never be made public. Instead of looking at the Lancet data, Burnham suggested that researchers could verify the work by going to graveyards in Iraq and counting the number of dead, or by going to morgues. He would later say that morgues were unreliable. Those sources would not have supported him either, as the Health Ministry and the Baghdad morgue recorded only around 50,000 deaths by June 2006. It is common for researchers to make their data public so that others can repeat their work, check it, and critique it. That was impossible with the Lancet studies, because the authors did not release their fieldwork except in limited form to specific people. Burnham ended up being censured by the American Association for Public Opinion Research in 2009 for not sharing the survey results. This was just another cloud hanging over the Lancet work. What did the authors have to hide that made them unwilling to share their results, as is common practice? Could there be anomalies in the statistics that pointed to fraud? No one can know for sure, and the critics cannot be answered, because of the authors' stubbornness on this issue.
The two Lancet studies, taken at face value, seemed like legitimate estimates of the number of Iraqis who might have been killed after the 2003 invasion. Many researchers have used cluster sample surveys, and the published methodology seemed sound. When people began digging into their findings, however, there were myriad serious problems. One of those was the possibility that the first Lancet was never peer reviewed before it was published, because one author wanted to influence the 2004 presidential election in America. Others were the questions about what the survey teams actually did in the field, the fact that the authors continuously changed their story about what happened and the methods used, and even lied about it, and the general lack of oversight of the fieldwork. Finally, no one could thoroughly verify the two reports, because the authors only released parts of their data to specific people, a violation of basic research protocol in academia. There were many more anomalies as well, which brought into question whether the Lancet reports were believable. The authors’ responses to these issues never resolved anything, but rather dug them deeper into a hole. That left them little ground to stand on, and provided more reasons to doubt their work. Ultimately, there were far more questions about the two Lancet reports than answers.
*With an MA in International Relations, Joel Wing has been researching and writing about Iraq since 2002. His acclaimed blog, Musings on Iraq, is currently listed by the New York Times and the World Politics Review. In addition, Mr. Wing's work has been cited by the Center for Strategic and International Studies, the Guardian and the Washington Independent.
16 August 2012