Observed coincidences occur far more often than chance would suggest because we look for them and define our list of what would constitute an interesting coincidence based on what actually occurred (we have no intuition for just how huge the denominator is). I know that. Still, I find it pretty remarkable that for the last couple of days I have been trying to nail down exactly how to explain what is missing from the behavioral modeling by FDA CTP, and then discovered in my morning econoblogosphere reading that the answer I needed seems to be the topic of the hour.
Yesterday I posted some advice to FDA CTP about the need to understand social science (mainly economics) in their modeling of behavior. There is relatively little economics that needs to be considered in traditional FDA missions, and that which is needed is relatively simple. But regulations that are intended to affect preferences about a freely-chosen consumer product where preferences vary across the population (such as tobacco products, in contrast with medicines or food safety) are all about economics. The failure to include explicit economic analysis in the recent report on the possibility of banning menthol cigarettes illustrated the problems, both scientific and ethical.
But if you were to suggest to the people working in the FDA orbit that they do not really have a model of people's choices about tobacco products (as I have argued), they would probably reply that they do have models. Several of them. (For those who are familiar with this field I am, of course, talking about Levy, Mendez, Environ. For those not familiar, that should present no obstacle to understanding this post -- just know that I am talking about a handful of well-known specifics.)
Ok, there are models. But there is something fundamentally wrong with them. They do not offer us any reason to believe in what they say will occur at the micro level (that is, why the individual people whose actions collectively produce the outcome would do what the models suggest they will do). What I mean is that they tell us things like "if X% fewer people start smoking each year", say, due to a menthol ban, "then this graph shows the number of smokers in the future, which is lower than current trends by Y" (if they fill in other information and make a bunch of other assumptions about what is happening, of course). But as for why X% fewer people would start smoking, there is nothing at all. There is just the number.
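To make the "just the number" point concrete, here is a toy sketch of that kind of calculation. It is not any of the named models; the function, its parameters, and every number in it are invented for illustration. The thing to notice is that `initiation_reduction` (the X%) is simply handed to the function, and nothing inside explains why it would take any particular value.

```python
def project_smokers(initial_smokers, annual_initiates, annual_quit_rate,
                    years, initiation_reduction=0.0):
    """Project the number of smokers year by year (toy illustration only).

    initiation_reduction is the "X%" from the text, expressed as a fraction.
    It is a bare input: no behavior is modeled that would produce it.
    """
    smokers = float(initial_smokers)
    path = [smokers]
    for _ in range(years):
        # New smokers this year, scaled down by the assumed reduction.
        new_smokers = annual_initiates * (1.0 - initiation_reduction)
        # Some current smokers quit; the new ones are added.
        smokers = smokers * (1.0 - annual_quit_rate) + new_smokers
        path.append(smokers)
    return path


# All inputs below are hypothetical, chosen only to make the arithmetic visible.
baseline = project_smokers(1000, 50, 0.05, years=20)
with_ban = project_smokers(1000, 50, 0.05, years=20, initiation_reduction=0.2)

# The "Y" in the text: how far below the baseline trend the projection ends up.
gap_y = baseline[-1] - with_ban[-1]
print(round(gap_y, 1))
```

Change the 0.2 to anything else and the graph changes accordingly. The arithmetic is fine, but it answers only "if X, then Y".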
I have criticized these as being more like calculators than models. If a population starts at 1 and doubles every period, then the number after n periods is 2^n. But it is hard to call the equation "2^n" a model. Similarly, I argued, the preferred "models" used by the tobacco policy inner circle are just more complicated equations.
I learned, however, through sociological empiricism that this point was not widely convincing -- i.e., I tried to make the point to people and did not have much success. My blog reading this morning made me realize what the reason might be: I was using the wrong words to make my point. It is not that these are not models; any calculation, no matter how simple, can be called a model if it represents a worldly phenomenon in some useful way (even that lowly 2^n). The problem with these models, and the reason they fail as legitimate models, is the lack of mechanism.
A model uses numbers and equations to show how one variable/construct/point-in-time/etc. affects others. But the model may not capture why a particular effect occurs (the mechanism), as with that X% reduction in smoking initiation. In such cases, it is really just answering a hypothetical question ("if X were true, then Y") rather than making real predictions ("X appears to be true, therefore Y"). But the models that would be useful for FDA purposes are ones that tell us "therefore" not just "if...then". Of course, every model is going to have some simplified or hypothetical elements (if there are no simplifications it is not a model, it is reality) and, once again, there are no bright lines since the mechanisms generally are abstractions (i.e., models) in themselves. But for a model to offer predictions that do not just result from hypothetical inputs, there has to be some "why" built into it.
It seems that the problem is that these models have been developed in a world where the only familiar social science is epidemiology. Epidemiology usually fails as a social science, and as a science more generally, because there is very little attention paid to mechanisms. That it fails as a social science is fairly easy to explain: most people doing it do not realize they are engaging in social science, and most people teaching it have no background in social science. They think they are just doing medical trials. Sometimes this is literally true, of course, and sometimes the observational epidemiology is legitimately an attempt to substitute for medical trials. But as soon as what is being studied is not purely biological, and involves people as people, not just as organisms, it is social science. Medical trials are easy because either the mechanism is obvious or it does not matter -- e.g., this drug makes cancer go away, and we probably have a guess about why, but that guess does not matter because the mechanism does not matter to the epidemiology (though obviously it does for the drug development process).
The failure to be good science at all is less easy to explain or defend. For almost 15 years, mechanism-oriented methods have been developed and taught (in the few good epidemiology departments). These tend to be pretty simple, just boxes and arrows that show what is causing what, but that is most of what you need. Unfortunately these are (a) seldom used at all and (b) when used, almost exclusively used just for identifying confounders. The latter is useful, of course, and doing it is far better than not doing it. But what is missing is use of these mechanistic models to address questions like "if X is really causing Y by affecting Z, then I should be able to observe not just an association between X and Y, but also..." Such scientific hypothesis testing is close to completely absent from epidemiology. Instead, mechanisms in epidemiology exist almost entirely in the untested conclusion statements. You have seen it: An association is observed and there is a discussion of how X must be causing Y as a result of Z, or whatever, but whether that really seems to be true is never addressed scientifically. It is worth reiterating: In epidemiology, mechanisms live almost entirely in the conclusions and not in the science.
So circling back to the question of tobacco behavior modeling, when the models are developed in the tradition of epidemiology, it is little wonder that there is no mechanism. The "why" of what happens when a variable changes is not part of epidemiology, and so not part of the models. It is just assumed that if the effects of a particular variable changing were observed in the past -- or more likely, merely if there was just some association observed in the past, with no effects of changes observed -- then that same association will still occur if an intervention is imposed (e.g., menthol is banned). But there is usually no reason to believe that, and indeed, often a lot of reason to not believe it. To take an extreme case, one of the popular models assumes that without menthol, the rate of smoking initiation would drop by the rate at which smoking is initiated with menthol cigarettes. Put a little more simply, this is basically the assumption that everyone who would have initiated with menthol will therefore never smoke (it is even a bit worse than that because it is based on past associations which might themselves change). I suspect I do not need to explain why the implicit mechanism about people choosing to initiate smoking menthol cigarettes is rather absurd. The absurdity of that seems unfathomable unless you recognize the mechanism-free mindset: "all we know [the mindset goes] is the association we observed before, so we just have to assume that association will always exist".
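The arithmetic behind that extreme case can be sketched in a few lines. This is my own illustration, not code or notation from any of the models mentioned: the function name, the `switch_fraction` parameter, and all numbers are hypothetical. `switch_fraction` stands for the share of would-be menthol initiators who would start with a non-menthol cigarette instead. Setting it to zero reproduces the assumption described above; any other value is exactly the kind of behavioral question a mechanism would have to answer.

```python
def post_ban_initiation_rate(initiation_rate, menthol_share, switch_fraction=0.0):
    """Initiation rate after a menthol ban (toy illustration only).

    initiation_rate : current share of the population that starts smoking
    menthol_share   : share of initiators who start with menthol
    switch_fraction : hypothetical share of would-be menthol initiators who
                      start with non-menthol instead (0.0 = nobody switches)
    """
    # Initiators lost to the ban: menthol starters who do not switch.
    lost = initiation_rate * menthol_share * (1.0 - switch_fraction)
    return initiation_rate - lost


# Hypothetical inputs: 10% initiation rate, 40% of initiators start on menthol.
no_switching = post_ban_initiation_rate(0.10, 0.40)           # the extreme assumption
all_switch = post_ban_initiation_rate(0.10, 0.40, 1.0)        # the opposite extreme
half_switch = post_ban_initiation_rate(0.10, 0.40, 0.5)       # somewhere in between
print(no_switching, half_switch, all_switch)
```

The point is not that any particular value of `switch_fraction` is right. It is that a model without a mechanism has silently fixed it at zero.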