Medical Kaizen, Cont'd

Name: Virginia Postrel
Address: AL, US

Soxblogger James Dwight responds to the email from Professor Postrel, which I posted below. The discussion is long, but well worth reading.

Thank you for a thoughtful post and letter. Obviously I disagree with some of your sentiments, but I appreciate the cordial tone of the conversation. As you know, we bloggers receive worse at times.

By way of response, I have several points:

1) As I mentioned in one of my final posts on the subject, I regret even addressing Dr. Gawande's theory about physicians being distributed on a bell curve. While I disagree with the theory, I do find it provocative and indeed obvious that there are a bunch of below-average physicians out there (roughly 50% by my math). But my sole beef (and the cause of my vehemence) was with Dr. Gawande's methodology concerning the CF community.

As Steven's letter acknowledges, Dr. Gawande reaches some dramatic conclusions based on a crude univariate study. Looking at two different CF treatment centers, Dr. Gawande found the patients' life expectancy at one to be 50% higher than at the other. He attributed this disparity completely and entirely to the differing skills of the physicians at the centers, but never addressed or considered other possible contributors to this outcome. Is it too much to ask that such a study be a careful multi-variate study?
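The concern about a univariate comparison can be made concrete with a minimal sketch. Every number below is invented for illustration and is not CF data: two hypothetical centers achieve identical outcomes within each severity group, yet their raw averages differ because their patient mixes differ. It shows only that such a gap is possible without any difference in physician skill, not that this is what happened in the centers discussed.

```python
# Hypothetical illustration: all figures are made up, not real CF data.
# Each center maps a severity group to (number of patients, mean lung function %).
center_a = {"mild": (80, 90.0), "severe": (20, 60.0)}
center_b = {"mild": (20, 90.0), "severe": (80, 60.0)}


def raw_mean(groups):
    """Patient-weighted average outcome, ignoring severity (the univariate view)."""
    total_patients = sum(n for n, _ in groups.values())
    weighted_sum = sum(n * outcome for n, outcome in groups.values())
    return weighted_sum / total_patients


# Univariate comparison: one number per center.
raw_gap = raw_mean(center_a) - raw_mean(center_b)

# Stratified comparison: compare like with like, one gap per severity group.
stratified_gap = {g: center_a[g][1] - center_b[g][1] for g in center_a}

print("raw gap:", raw_gap)
print("gap within each severity group:", stratified_gap)
```

In this made-up case the raw gap is large while the gap within each severity group is zero, which is why a comparison that controls for case mix is needed before attributing a difference to the physicians.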

I will, however, acknowledge that there is nothing in the doctor's article that disproves his thesis. Once again, my approach was flawed insofar as I addressed his theory where I should have limited my critiques to his sloppy analysis of the CF centers' data. My sole interest here was to reassure members of the CF community that Dr. Gawande's conclusions regarding the Cincinnati doctors and the implications those conclusions might have for other physicians were not necessarily justified and would most certainly not be justified by a simple glance at a CF center's mortality rates and its patients' lung functions.

My misguided tack in not limiting myself to the doctor's CF center analysis has caused several observers to overlook my principal complaint and focus on defending the doctor's theory. Moreover, I would agree that discarding the doctor's theory because of his suspect CF center analysis would indeed be throwing the baby out with the bath water. His thesis most certainly merits further study.

2) All of the numbers I cite are easily available at CFF.org except for the average number of patients per accredited CFF center and that was based on a simple calculation (going off the top of my head, there are roughly 115 centers and roughly 23,000 patients at those centers - that's where the number of 200 per center came up). If you truly doubt my contention that a CF patient's genetic variant is by far the largest determinant of his fate, ten minutes of research will dispel the doubt. There is nothing in any of my pieces that is not completely accepted by the entire CF medical community and the medical community at large.

3) The Minneapolis center is a great center, but it does nothing differently from the other 115 centers in terms of wildly different treatments or anything else that could be labeled innovative. The doctors at Minneapolis would tell you as much. Ironically given the terrain of this conversation, the greatest benefit of the CFF's Quality Improvement Initiative has been the rapid diffusion of information across the network. Any successful innovation at a CF center literally anywhere in the country becomes common practice almost overnight.

What the Minneapolis center might well do is get its patients to comply with a grueling CF treatment regimen marginally better than the physicians at other programs. That is a factor that I addressed in one of my posts. I believe The Foundation has done studies on the difference made by patient compliance. Sadly, this is another area of inquiry that Dr. Gawande completely ignores.

4) To address a few specific points from your letter (I offer your quotes first and then my response in italics):

"Unless patient turnover is negligible, that militates strongly against the suggestion that it is all the luck of the genetic draw."

To the best of my knowledge, patient turnover is indeed negligible. Again, this is the kind of thing Dr. Gawande clearly should have considered and quantified before reaching his conclusions. I'd be quite surprised if there isn't solid data available on this issue.

"If patient turnover is negligible, then at best one could argue that there are genetic types that are not only easier to treat but that have initially hidden pathways to improvement not present in other types."

Before addressing that point, I should mention my credentials. I'm 37 years old and have CF. I'm also a Harvard grad and a lawyer. Mastering the body of knowledge that exists regarding CF is an achievable task for any well motivated and reasonably intelligent person. It is the good side of the sad fact that the body of knowledge regarding CF is so slim.

So, respectfully, your last quote indicates a lack of awareness regarding CF. As a rule, CF patients don't improve, they only get worse. CF has no cure and more relevantly, no control either. Those with the milder genetic defects decline at a slower rate. Their superior fate comes about because of this factor, not because they "get better." Sadly, we seldom get better. Arresting the disease's progress and on occasion regaining some incremental lost ground is the best we can hope for.

I would also guess that it is not the "bad" cases that skew a center's numbers, but the "good" cases. A cluster of mild cases in Minneapolis several decades ago would have skewed their numbers from literally the 1950's into the present day. Again, a more thorough analysis than Dr. Gawande's would be instructive in this regard. Which brings me back to my original beef - Dr. Gawande made dramatic conclusions without performing the necessary research.

5) Finally, regarding the CF Foundation's lauding of the New Yorker article, the Foundation's statement in fact lauds itself, not the New Yorker article. The Foundation has most certainly not issued a statement that even in the slightest way praises the article's methodology, although the Foundation does appear to be quite satisfied that Dr. Gawande compliments their QII. The QII is indeed a visionary initiative, which makes it all the sadder that it has become imperiled by Dr. Gawande's article. Although I don't have any inside information regarding the Foundation's thinking regarding this piece, I'm quite confident they earnestly hope that this article disappears before at least 50% of the CF community begins screaming holy murder regarding their sub-par or mediocre physicians based on the flimsiest of data.

In conclusion, Dr. Gawande's analysis regarding the CF centers was gravely flawed and remarkably irresponsible. As a physician and a man of science, he should have known better.

I appreciate the opportunity you've given me to forward this conversation. Best wishes to both of you for a happy and healthy New Year,

James Dwight

Here's Steve's reply:

Thank you for your gracious response to my email. I understand why you believe it so important to avoid spreading erroneous information about CF treatment. I share your concern. What I do not share is your conviction that inter-clinic differences in treatment are likely to be unimportant factors in explaining differences in patient outcomes. Such a conviction contradicts the evidence in many other fields, from software to manufacturing, where large performance differences across superficially similar units are pervasive. The sources of these differences and the factors explaining their persistence are an important topic in modern business strategy research.

It is also clear that we agree on the need for serious econometric analysis of the CF QII data. There are plenty of experts in economics or health policy qualified to perform such work--I wonder if or when they will have an opportunity to do it. In Bayesian terms, our priors about the results of such analyses differ; you expect that most inter-clinic variations in lung function are due to the luck of the draw, while I suspect that systematic organizational factors are at work. But I am glad that you are not ready to dismiss the thesis of inter-clinic performance differences ("the baby" in your terms) simply because of a lack of rigor in Gawande's discussion ("the bath water").

We also disagree about the merits of Gawande's article. You point out that he should have mentioned the influence of genetic factors, and I can hardly disagree, but without quantitative evidence I am not prepared to say that these differences play any significant role in explaining the differences across clinics. Unless you believe Gawande deliberately suppressed information about genetics, I find it hard to believe that either the doctors in Cincinnati or those in Minneapolis would have failed to mention this if it were an important factor in their success rates. Especially in Cincinnati, where they were trying very hard to improve and racking their brains to close the gap, it would surprise me greatly if no one said "Hey, maybe we're doing everything right but just got stuck with tougher patients." In my experience, that type of self-justification tends to appear quite frequently even when there is no evidence to support it; if your theory of random differences were the accepted explanation, the Cincinnati staff surely would have mentioned it to Gawande.

Furthermore, you write as if Gawande gave short shrift to the influence of patient compliance. To the contrary. His long illustrative anecdote about why Minneapolis is different and better highlighted the unusual and determined effort the lead physician made to ferret out hidden acts (and causes) of noncompliance. The anecdote also showed how cabining off the effects into "patient compliance" is misleading, because the way in which noncompliance was detected was through a much more meticulous tracking and accounting for small changes in lung function and weight. It is highly reminiscent of the quality revolution in manufacturing, where statistical analysis tells you when your process is drifting and you then deploy an exhaustive diagnostic process (fishbone charts, "ask why five times," etc.) to hunt down and eradicate the source of the problem. Minneapolis was doing this and Cincinnati was not, at least in Gawande's account.

At the risk of taxing your patience, I'll briefly comment on your responses to my seven points:

1) On the truncation of genetic variation in the snapshot clinic population due to rapid attrition of the hard cases, we appear to agree. You argue that the real issue is a long tail of mild cases. I would like to see some data on the true variation in means across genetic types with and without the truncation effect. I suspect that the actual range of severity found with truncation is quite a bit smaller than the theoretical range possible.

2) From your response, it appears you do not disagree that the number of genetic variants is not particularly crucial; what matters is the range of severity among those variants. We do not know those numbers. You say that "a CF patient's genetic variant is by far the largest determinant of his fate," but the precise empirical question brought up by the QII data is how much "by far" really means in quantitative terms compared to clinic performance. And of course, even if genetic variant explains most of the variance across the entire population, there could still be huge differences in clinic performance conditional on a specific genetic variant.

3) It seems we now agree that it is possible for physicians (or clinics) to follow a bell curve in effectiveness even if they come from a rigorously selected population. A possibly picayune point here is that the fraction of physicians falling below the mean could be much greater or less than 50% if the distribution were NOT symmetrical like a bell curve. In a world with a few superstars and the rest clumped fairly closely together, well over half would be below average, while in a world with a few superduds and the rest clumped together well over half would be above average.
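The point about asymmetric distributions can be checked with a small sketch. The scores below are hypothetical, chosen only to match the three shapes described (symmetric, a few superstars, a few superduds); they are not measurements of any physicians or clinics.

```python
from statistics import mean


def fraction_below_mean(scores):
    """Share of scores that fall strictly below the arithmetic mean."""
    m = mean(scores)
    return sum(1 for s in scores if s < m) / len(scores)


# Hypothetical effectiveness scores, invented for illustration only.
symmetric = [40, 45, 55, 60]           # evenly spread around the mean of 50
few_superstars = [50] * 9 + [100]      # one outlier pulls the mean up to 55
few_superduds = [50] * 9 + [0]         # one outlier pulls the mean down to 45

for name, scores in [
    ("symmetric", symmetric),
    ("few superstars", few_superstars),
    ("few superduds", few_superduds),
]:
    print(name, fraction_below_mean(scores))
```

With these invented numbers, half of the symmetric group is below average, nine in ten are below average when there is a single superstar, and only one in ten is below average when there is a single superdud.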

4) You assert that the Minneapolis clinic "does nothing differently from the other 115 centers in terms of wildly different treatments or anything else that could be labeled innovative." This claim captures, in a nutshell, why total quality programs are so hard to implement. People simply have trouble accepting that many small, subtle differences in approach can cumulatively yield drastically different performance levels. The patient compliance methods used at Minneapolis are, you concede, different and possibly superior, as Gawande reports in detail (not "completely ignores," as you state in your response). As I mentioned above, these techniques appear to be part and parcel of a different approach to monitoring and intervention. And I continue to find it unlikely that the clinic that pushes these techniques with such unusual vigor would also just happen to get the best genetic draw from its patient pool.

5) You present an interesting theory that the persistent and IMPROVING performance gap between the "best" clinics and the others is due to fluke clusters of mild cases taken in forty years ago. Because these mild cases deteriorate more slowly than the others, the clinics that have them will show relative improvements over time; and since these mild cases live longer, they will keep skewing the statistics over a longer period of time. This is a theory that could be tested if the genetic data are broken out in the QII program. Note that your theory is just about the only way to rationalize the long-term and increasing superior performance of the leading clinics.

I am skeptical of this explanation for a few reasons. First, there has to be plenty of patient turnover, simply due to deaths, relocations, new cases, etc. This would tend to wash out lucky genetic draws unless certain geographical areas have distinct genetic profiles (not out of the question, I suppose--lots of Scandinavians in Minnesota, lots of Germans in Cincinnati, etc.--but I'd really need strong evidence to buy into that). Second, Gawande reports that some CF patients do improve when their treatment regime changes--the Minnesota clinic, for example, believes that it can get lung function to go up from 90% to 100% or better by improving patient compliance.

6) I still think it is relevant that huge performance differences across hospitals have been found on items as simple as giving pneumonia vaccines to elderly patients as indicated by best-practice guidelines. This suggests that it is not far-fetched that CF clinics would also show significant performance differences.

7) We will have to agree to disagree about the CF Foundation's website press release on the Gawande article. You are correct that they are not "laudatory" to the article itself, as I initially wrote, but they seem very happy about it and express no qualms of any kind about how it is interpreted or whether it will scare off clinics from participating. If they had such concerns, I can't imagine them not putting up some kind of caveat. If they "earnestly hope that this article disappears" then putting out a positive press release seems quixotic.

My conclusion is that Gawande's article, while falling short of the standards of social science, provides a useful journalistic window on the issue of performance differences among medical practitioners. If it turns out that this story is indeed a statistical chimera produced entirely by natural variations (like the many bogus "cancer cluster" scares), then either Gawande's sources were unaware of it or Gawande is grossly dishonest or incompetent. But if, as is more likely, these performance differences are real and not artifacts, then Gawande has performed a service to the public understanding.

I am going to cc this missive to Virginia--I don't know what she'll do with it. I understand if you are tired of this subject. I do not intend to continue this colloquy any further, so you may have the last word or ignore it as you choose.

Best of wishes for the New Year.

--Steven Postrel

By mutual agreement, that ends their public discussion.

Posted by Virginia Postrel on January 05, 2005