Significance testing and the real tasks of social science

30 March 2019 at 09:56 | Posted in Statistics & Econometrics | Leave a comment

After having mastered all the technicalities of regression analysis and econometrics, students often feel as though they are masters of the universe. I usually cool them down with required reading of Christopher Achen’s modern classic Interpreting and Using Regression. That gets them back on track again, and they understand that

no increase in methodological sophistication … [will] alter the fundamental nature of the subject. It remains a wondrous mixture of rigorous theory, experienced judgment, and inspired guesswork. And that, finally, is its charm.

And in case they get too excited about having learned to master the intricacies of proper significance tests and p-values, I ask them to also ponder on Achen’s warning:

Significance testing as a search for specification errors substitutes calculations for substantive thinking. Worse, it channels energy toward the hopeless search for functionally correct specifications and diverts attention from the real tasks, which are to formulate a manageable description of the data and to exclude competing ones.

Econometric beasts of bias

8 March 2019 at 15:16 | Posted in Statistics & Econometrics | 2 comments

In an article posted earlier on this blog — What are the key assumptions of linear regression models? — yours truly tried to argue that since econometrics doesn’t content itself with only making ‘optimal’ predictions but also aspires to explain things in terms of causes and effects, econometricians need loads of assumptions — and that the most important of these are additivity and linearity.

Let me take the opportunity to elaborate a little more on why I find these assumptions to be of such paramount importance, and why they ought to be argued for much more thoroughly — on both epistemological and ontological grounds — if they are to be used at all.

Limiting model assumptions in economic science always have to be closely examined. If we are to show that the mechanisms or causes we isolate and handle in our models are stable, in the sense that they do not change when we ‘export’ them to our ‘target systems,’ we have to be able to show that they do not hold only under ceteris paribus conditions, for then they are, a fortiori, of limited value for our understanding, explanation, and prediction of real economic systems.

Econometrics may be an informative tool for research. But if its practitioners do not investigate and provide a justification for the credibility of the assumptions on which they erect their building, it will not fulfil its tasks. There is a gap between its aspirations and its accomplishments, and without more supportive evidence to substantiate its claims, critics will continue to consider its ultimate argument a mixture of rather unhelpful metaphors and metaphysics. Maintaining that economics is a science in the ‘true knowledge’ business, yours truly remains a sceptic of the pretences and aspirations of econometrics. So far, I cannot really see that it has yielded very much in terms of relevant, interesting economic knowledge.

The marginal return on its ever-higher technical sophistication in no way makes up for the lack of serious under-labouring of its deeper philosophical and methodological foundations that Keynes complained about long ago. The rather one-sided emphasis on usefulness, and its concomitant instrumentalist justification, cannot hide the fact that neither Haavelmo nor the legions of probabilistic econometricians following in his footsteps give supportive evidence for considering it “fruitful to believe” in the possibility of treating unique economic data as the observable results of random drawings from an imaginary sampling of an imaginary population. After having analyzed some of its ontological and epistemological foundations, I cannot but conclude that econometrics, on the whole, has not delivered ‘truth.’ And I doubt that this has ever been the intention of its main protagonists.

Our admiration for technical virtuosity should not blind us to the fact that we have to maintain a cautious attitude towards probabilistic inferences in economic contexts. Science, as Keynes said, should help us penetrate to “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts.” We should look out for causal relations, but econometrics can never be more than a starting point in that endeavour, since econometric (statistical) explanations are not explanations in terms of mechanisms, powers, capacities or causes. Firmly stuck in an empiricist tradition, econometrics is only concerned with the measurable aspects of reality. But there is always the possibility that there are other variables – of vital importance and, although perhaps unobservable and non-additive, not necessarily epistemologically inaccessible – that were not considered for the model. The variables that were included can hence never be guaranteed to be more than potential causes, not real causes.

A rigorous application of econometric methods in economics really presupposes that the phenomena of our real-world economies are ruled by stable causal relations between variables. A perusal of the leading econom(etr)ic journals shows that most econometricians still concentrate on fixed-parameter models, and that parameter values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption one has, however, to convincingly establish that the targeted acting causes are stable and invariant, so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.

Real-world social systems are not governed by stable causal mechanisms or capacities. As Keynes wrote as early as the 1920s in his critique of econometrics and inferential statistics (emphasis added):

The atomic hypothesis which has worked so splendidly in Physics breaks down in Psychics. We are faced at every turn with the problems of Organic Unity, of Discreteness, of Discontinuity – the whole is not equal to the sum of the parts, comparisons of quantity fail us, small changes produce large effects, the assumptions of a uniform and homogeneous continuum are not satisfied. Thus the results of Mathematical Psychics turn out to be derivative, not fundamental, indexes, not measurements, first approximations at the best; and fallible indexes, dubious approximations at that, with much doubt added as to what, if anything, they are indexes or approximations of.

The kinds of ‘laws’ and relations that econometrics has established are laws and relations about entities in models that presuppose causal mechanisms to be atomistic and additive. When causal mechanisms operate in real-world social target systems, they only do so in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain, they do so (as a rule) only because we engineered them for that purpose. Outside man-made ‘nomological machines’ they are rare, or even non-existent. Unfortunately, that also makes most of the achievements of econometrics – like most of the contemporary endeavours of mainstream economic theoretical modelling – rather useless.

Econometrics — the path from cause to effect

7 March 2019 at 18:58 | Posted in Statistics & Econometrics | 2 comments


In their book — Mastering ‘Metrics: The Path from Cause to Effect — Joshua D. Angrist and Jörn-Steffen Pischke write:

Our first line of attack on the causality problem is a randomized experiment, often called a randomized trial. In a randomized trial, researchers change the causal variables of interest … for a group selected using something like a coin toss. By changing circumstances randomly, we make it highly likely that the variable of interest is unrelated to the many other factors determining the outcomes we want to study. Random assignment isn’t the same as holding everything else fixed, but it has the same effect. Random manipulation makes other things equal hold on average across the groups that did and did not experience manipulation. As we explain … ‘on average’ is usually good enough.

Angrist and Pischke may ”dream of the trials we’d like to do” and consider ”the notion of an ideal experiment” something that ”disciplines our approach to econometric research,” but to maintain that ‘on average’ is ”usually good enough” is, in my view, rather unwarranted, and for many reasons.

First of all, it amounts to nothing but hand-waving to assume simpliciter, without argument, that it is tenable to treat social agents and relations as homogeneous and interchangeable entities.

Randomization is used to basically allow the econometrician to treat the population as consisting of interchangeable and homogeneous groups (‘treatment’ and ‘control’). The regression models one arrives at by using randomized trials tell us the average effect that variations in variable X have on the outcome variable Y, without having to explicitly control for the effects of other explanatory variables R, S, T, etc. Everything is assumed to be essentially equal except the values taken by variable X.

In a standard regression context, one would apply an ordinary least squares (OLS) estimator to try to obtain an unbiased and consistent estimate:

Y = α + βX + ε,

where α is a constant intercept, β a constant ”structural” causal effect and ε an error term.

The problem here is that although we may get an estimate of the ”true” average causal effect, this may “mask” important heterogeneous effects of a causal nature. Although we get the right answer of the average causal effect being 0, those who are “treated” (X = 1) may have causal effects equal to −100 and those “not treated” (X = 0) may have causal effects equal to 100. Contemplating being treated or not, most people would probably be interested in knowing about this underlying heterogeneity and would not consider the OLS average effect particularly enlightening.
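To see how this masking works in practice, here is a small simulation (my own toy numbers, assuming NumPy) in which individual causal effects of +100 and −100 in equal shares produce an OLS estimate near zero:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Two equally large sub-populations with opposite individual causal effects.
tau = np.where(rng.random(n) < 0.5, 100.0, -100.0)

X = rng.integers(0, 2, n)        # randomized 'treatment' assignment
eps = rng.normal(0.0, 1.0, n)    # idiosyncratic noise
Y = 10.0 + tau * X + eps         # outcome with heterogeneous effects

# With a binary regressor, the OLS slope equals the difference in group means.
beta_hat = Y[X == 1].mean() - Y[X == 0].mean()
print(f"estimated average effect: {beta_hat:.2f}")              # close to 0
print(f"underlying individual effects: {tau.min():.0f}, {tau.max():.0f}")
```

The ‘right’ average answer of roughly zero is duly printed, while every single individual effect is huge.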

Limiting model assumptions in economic science always have to be closely examined. If we are to show that the mechanisms or causes we isolate and handle in our models are stable, in the sense that they do not change when we ‘export’ them to our ‘target systems,’ we have to be able to show that they do not hold only under ceteris paribus conditions, for then they are, a fortiori, of limited value for our understanding, explanation, and prediction of real economic systems.

Real-world social systems are not governed by stable causal mechanisms or capacities. The kinds of ”laws” and relations that econometrics has established are laws and relations about entities in models that presuppose causal mechanisms to be atomistic and additive. When causal mechanisms operate in real-world social target systems, they only do so in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain, they do so (as a rule) only because we engineered them for that purpose. Outside man-made “nomological machines” they are rare, or even non-existent. Unfortunately, that also makes most of the achievements of econometrics – like most of the contemporary endeavours of mainstream economic theoretical modelling – rather useless.

Remember that a model is not the truth. It is a lie to help you get your point across. And in the case of modeling economic risk, your model is a lie about others, who are probably lying themselves. And what’s worse than a simple lie? A complicated lie.

Sam L. Savage, The Flaw of Averages

When Joshua Angrist and Jörn-Steffen Pischke in an earlier article of theirs [”The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con out of Econometrics,” Journal of Economic Perspectives, 2010] say that

anyone who makes a living out of data analysis probably believes that heterogeneity is limited enough that the well-understood past can be informative about the future

I really think they underestimate the heterogeneity problem. It does not just turn up as an external-validity problem when trying to “export” regression results to different times or different target populations. It is also often internal to the millions of regression estimates that economists produce every year.

But when the randomization is purposeful, a whole new set of issues arises — experimental contamination — which is much more serious with human subjects in a social system than with chemicals mixed in beakers … Anyone who designs an experiment in economics would do well to anticipate the inevitable barrage of questions regarding the valid transference of things learned in the lab (one value of z) into the real world (a different value of z) …

Absent observation of the interactive compounding effects z, what is estimated is some kind of average treatment effect which is called by Imbens and Angrist (1994) a “Local Average Treatment Effect,” which is a little like the lawyer who explained that when he was a young man he lost many cases he should have won but as he grew older he won many that he should have lost, so that on the average justice was done. In other words, if you act as if the treatment effect is a random variable by substituting βₜ for β₀ + β′zₜ, the notation inappropriately relieves you of the heavy burden of considering what are the interactive confounders and finding some way to measure them …

If little thought has gone into identifying these possible confounders, it seems probable that little thought will be given to the limited applicability of the results in other settings.

Ed Leamer

Evidence-based theories and policies are highly valued nowadays. Randomization is supposed to control for bias from unknown confounders. The received opinion is that evidence based on randomized experiments therefore is the best.

More and more economists have also lately come to advocate randomization as the principal method for ensuring valid causal inferences.

I would, however, rather argue that randomization, just like econometrics, promises more than it can deliver, basically because it requires assumptions that in practice are not possible to maintain.

Especially when it comes to questions of causality, randomization is nowadays considered some kind of ”gold standard”. Everything has to be evidence-based, and the evidence has to come from randomized experiments.

But just like econometrics, randomization is basically a deductive method. Given the assumptions (such as manipulability, transitivity, separability, additivity, linearity, etc.), these methods deliver deductive inferences. The problem, of course, is that we will never completely know when the assumptions are right. And although randomization may contribute to controlling for confounding, it does not guarantee it, since genuine randomness presupposes infinite experimentation and we know all real experimentation is finite. And even if randomization may help to establish average causal effects, it says nothing of individual effects unless homogeneity is added to the list of assumptions. Real target systems are seldom epistemically isomorphic to our axiomatic-deductive models/systems, and even if they were, we would still have to argue for the external validity of the conclusions reached from within these epistemically convenient models/systems. Causal evidence generated by randomization procedures may be valid in ”closed” models, but what we usually are interested in is causal evidence about the real target system we happen to live in.

When does a conclusion established in population X hold for target population Y? Only under very restrictive conditions!

Angrist and Pischke’s ”ideally controlled experiments” tell us with certainty what causes what effects — but only given the right ”closures”. Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems is not easy. ”It works there” is no evidence for ”it will work here”. Causes deduced in an experimental setting still have to show that they come with an export warrant to the target population/system. The causal background assumptions made have to be justified, and without licences to export, the value of ”rigorous” and ”precise” methods — and ‘on-average knowledge’ — is despairingly small.

Random walks (student stuff)

7 March 2019 at 00:07 | Posted in Statistics & Econometrics | 2 comments
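For students who want to experiment on their own, a minimal sketch (assuming NumPy) of a symmetric random walk:

```python
import numpy as np

rng = np.random.default_rng(7)
steps = rng.choice([-1, 1], size=1_000)  # a fair coin decides each step
walk = np.cumsum(steps)                  # position after each step
print(walk[-1], np.abs(walk).max())      # endpoint and furthest excursion
```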

 

Pólya urn models mathematics

27 February 2019 at 09:11 | Posted in Statistics & Econometrics | Comments Off on Pólya urn models mathematics
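A minimal simulation of the classic scheme (draw a ball, return it together with one extra ball of the same colour) shows the characteristic path dependence; run it a few times and the long-run share of white balls settles down, but at a different level each run:

```python
import random

def polya_urn(n_draws, white=1, black=1):
    """Classic Pólya urn: each drawn ball is returned with one extra
    ball of the same colour, so early draws get reinforced."""
    for _ in range(n_draws):
        if random.random() < white / (white + black):
            white += 1
        else:
            black += 1
    return white / (white + black)

# Five independent runs, five different limiting fractions:
print([round(polya_urn(10_000), 3) for _ in range(5)])
```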

 

The limits of probabilistic reasoning

23 February 2019 at 08:45 | Posted in Statistics & Econometrics | 2 comments

Almost a hundred years after John Maynard Keynes wrote his seminal A Treatise on Probability (1921), it is still very difficult to find statistics books that seriously try to incorporate his far-reaching and incisive analysis of induction and evidential weight.

The standard view in statistics — and the axiomatic probability theory underlying it — is to a large extent based on the rather simplistic idea that more is better. But as Keynes argues, more of the same is not what is important when making inductive inferences. It’s rather a question of “more but different.”

Variation, not replication, is at the core of induction. Finding that p(x|y) = p(x|y & w) doesn’t make w irrelevant. Knowing that the probability is unchanged when w is present gives p(x|y & w) a different evidential weight. Running 10 replicative experiments does not make you as sure of your inductions as running 10,000 varied experiments — even if the probability values happen to be the same.
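Keynes’s ‘weight’ is not itself a probability, but a crude Bayesian analogue of the point can be computed: with flat Beta(1, 1) priors, 5 successes in 10 trials and 5,000 in 10,000 both give a posterior probability of 0.5, yet resting on very different amounts of evidence (a sketch using the standard Beta-distribution formula):

```python
def beta_sd(a, b):
    """Standard deviation of a Beta(a, b) distribution."""
    return (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5

# Same posterior mean of 0.5, very different spread behind it:
print(beta_sd(1 + 5, 1 + 5))        # ~0.139 after 10 trials
print(beta_sd(1 + 5000, 1 + 5000))  # ~0.005 after 10,000 trials
```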

According to Keynes we live in a world permeated by unmeasurable uncertainty — not quantifiable stochastic risk — which often forces us to make decisions based on anything but ‘rational expectations.’ Keynes rather thinks that we base our expectations on the confidence or ‘weight’ we put on different events and alternatives. To Keynes, expectations are a question of weighing probabilities by ‘degrees of belief,’ beliefs that often have preciously little to do with the kind of stochastic probabilistic calculations made by the rational agents as modelled by modern social sciences. And often we “simply do not know.”

Science according to Keynes should help us penetrate to “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts.” Models can never be more than a starting point in that endeavour. He further argued that it was inadmissible to project history on to the future. Consequently, we cannot presuppose that what has worked before, will continue to do so in the future. That statistical models can get hold of correlations between different variables is not enough. If they cannot get at the causal structure that generated the data, they are not really ‘identified.’

How strange that economists and other social scientists, as a rule, do not even touch upon these aspects of scientific methodology, which seem so fundamental and important for anyone trying to understand how we learn and orient ourselves in an uncertain world. An educated guess as to why would be that Keynes’s concepts cannot be squeezed into a single calculable numerical probability. In the quest for quantities, one turns a blind eye to qualities and looks the other way — but Keynes’s ideas keep creeping out from under the statistics carpet.

The validity of the inferential models we as scientists use ultimately depends on the assumptions we make about the entities to which we apply them. Applying the traditional calculus of probability rests on far-reaching ontological presuppositions. If we are prepared to assume that societies and economies are like urns filled with coloured balls in fixed proportions, then fine. But — really — who could earnestly believe in such an utterly ridiculous analogy?

In a real world full of ‘unknown unknowns’ and genuine non-ergodic uncertainty, urns are of little avail.

Human decisions affecting the future, whether personal or political or economic, cannot depend on strict mathematical expectation, since the basis for making such calculations does not exist; and that it is our innate urge to activity which makes the wheels go round, our rational selves choosing between the alternatives as best we are able, calculating where we can, but often falling back for our motive on whim or sentiment or chance.

J M Keynes

Added: Tom Hickey — as always — has some interesting comments on this post here.

Bayesian statistics — an introduction

19 February 2019 at 16:26 | Posted in Statistics & Econometrics | Comments Off on Bayesian statistics — an introduction


If you are looking for a Bayesian statistics textbook that is accessible to students with limited knowledge of statistics and math — this is the book!

Machine learning — getting results that are completely wrong

18 February 2019 at 14:31 | Posted in Statistics & Econometrics | 7 comments

Machine-learning techniques used by thousands of scientists to analyse data are producing results that are misleading and often completely wrong.

Dr Genevera Allen from Rice University in Houston said that the increased use of such systems was contributing to a “crisis in science” …

The data sets are very large and expensive. But, according to Dr Allen, the answers they come up with are likely to be inaccurate or wrong because the software is identifying patterns that exist only in that data set and not the real world …

Machine learning systems and the use of big data sets has accelerated the crisis, according to Dr Allen. That is because machine learning algorithms have been developed specifically to find interesting things in datasets and so when they search through huge amounts of data they will inevitably find a pattern.

“The challenge is can we really trust those findings?” she told BBC News.

“Are those really true discoveries that really represent science? Are they reproducible? If we had an additional dataset would we see the same scientific discovery or principle on the same dataset? And unfortunately the answer is often probably not.”

BBC News

The central problem with the present ‘machine learning’ and ‘big data’ hype is that so many think that they can get away with analysing real-world phenomena without any (commitment to) theory. But — data never speaks for itself. Without a prior statistical set-up, there actually are no data at all to process. And — using a machine learning algorithm will only produce what you are looking for.

Machine learning algorithms always express a view of what constitutes a pattern or regularity. They are never theory-neutral.

Clever data-mining tricks are not enough to answer important scientific questions. Theory matters.
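Dr Allen’s point is easy to reproduce. Search enough pure-noise ‘features’ and an algorithm will always find an impressive-looking pattern (a sketch, assuming NumPy; the numbers are arbitrary):

```python
import numpy as np

# 10,000 candidate 'features' of pure noise, none related to the outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10_000))
y = rng.normal(size=100)

# Exhaustive search still turns up a strongly 'correlated' feature.
cov = (X * (y - y.mean())[:, None]).mean(axis=0)
corrs = cov / (X.std(axis=0) * y.std())
print(f"best |correlation| found in pure noise: {np.abs(corrs).max():.2f}")
```

A second, independent data set would of course show no such ‘discovery.’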

Fat arms science …

15 February 2019 at 18:47 | Posted in Statistics & Econometrics | 4 comments

Over human evolutionary history, upper-body strength has been a major component of fighting ability. Evolutionary models of animal conflict predict that actors with greater fighting ability will more actively attempt to acquire or defend resources than less formidable contestants will. Here, we applied these models to political decision making about redistribution of income and wealth among modern humans. In studies conducted in Argentina, Denmark, and the United States, men with greater upper-body strength more strongly endorsed the self-beneficial position: Among men of lower socioeconomic status (SES), strength predicted increased support for redistribution; among men of higher SES, strength predicted increased opposition to redistribution. Because personal upper-body strength is irrelevant to payoffs from economic policies in modern mass democracies, the continuing role of strength suggests that modern political decision making is shaped by an evolved psychology designed for small-scale groups.

Michael Bang Petersen, Daniel Sznycer, Aaron Sell

Aren’t we just überjoyed that research funding goes into performing this kind of immensely interesting and important study …

The statistical crisis in science

15 February 2019 at 18:11 | Posted in Statistics & Econometrics | Comments Off on The statistical crisis in science

 

Such a great guy. If only more academics could be like you, Andrew!

Gretl — econometrics made easy

14 February 2019 at 08:17 | Posted in Statistics & Econometrics | 1 comment

 

Thanks to Allin Cottrell and Riccardo Lucchetti, we now have access to a high-quality tool for doing and teaching econometrics — Gretl. And, best of all, it is totally free!

Gretl is up to the tasks you may have, so why spend money on expensive commercial programs?

The latest snapshot version of Gretl can be downloaded here.

[And yes, I do know there’s another fabulously good and free program — R. But R hasn’t got as nifty a GUI as Gretl — and at least for students, it’s more difficult to learn to handle and program. I do think it’s preferable when students are going to learn some basic econometrics to use Gretl so that they can concentrate more on ‘content’ rather than ‘technique.’]

Bayesian moons made of green cheese

12 February 2019 at 17:58 | Posted in Statistics & Econometrics | Comments Off on Bayesian moons made of green cheese

In other words, if a decision-maker thinks something cannot be true and interprets this to mean it has zero probability, he will never be influenced by any data, which is surely absurd. So leave a little probability for the moon being made of green cheese; it can be as small as 1 in a million, but have it there since otherwise an army of astronauts returning with samples of the said cheese will leave you unmoved.

To get the Bayesian probability calculus going you sometimes have to assume strange things — so strange that you actually should rather start wondering if maybe there is something wrong with your theory …
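Lindley’s point can be put in a few lines of code (hypothetical numbers): feed Bayes’ theorem a prior of exactly zero and no likelihood ratio, however astronomical, will ever move it:

```python
def posterior(prior, lik_h, lik_not_h):
    """Bayes' theorem for a hypothesis H against its complement."""
    num = prior * lik_h
    return num / (num + (1 - prior) * lik_not_h)

# A likelihood ratio of a billion to one in favour of H:
print(posterior(0.0, 1e9, 1.0))    # 0.0 -- the zero prior never budges
print(posterior(1e-6, 1e9, 1.0))   # ~0.999 -- a sliver of prior is enough
```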

Machine learning — puzzling Big Data nonsense

10 February 2019 at 14:09 | Posted in Statistics & Econometrics | Comments Off on Machine learning — puzzling Big Data nonsense

If we wanted highly probable claims, scientists would stick to low-level observables and not seek generalizations, much less theories with high explanatory content. In this day of fascination with Big data’s ability to predict what book I’ll buy next, a healthy Popperian reminder is due: humans also want to understand and to explain. We want bold ‘improbable’ theories. I’m a little puzzled when I hear leading machine learners praise Popper, a realist, while proclaiming themselves fervid instrumentalists. That is, they hold the view that theories, rather than aiming at truth, are just instruments for organizing and predicting observable facts. It follows from the success of machine learning, Vladimir Cherkassky avers, that ”realism is not possible.” This is very quick philosophy!

Quick indeed!

The central problem with the present ‘machine learning’ and ‘big data’ hype is that so many — falsely — think that they can get away with analysing real-world phenomena without any (commitment to) theory. But — data never speaks for itself. Without a prior statistical set-up, there actually are no data at all to process. And — using a machine learning algorithm will only produce what you are looking for.

Machine learning algorithms always express a view of what constitutes a pattern or regularity. They are never theory-neutral.

Clever data-mining tricks are not enough to answer important scientific questions. Theory matters.

Gibbs sampling (student stuff)

9 February 2019 at 10:40 | Posted in Statistics & Econometrics | Comments Off on Gibbs sampling (student stuff)
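For the students: a minimal Gibbs sampler (assuming NumPy) for a bivariate normal with correlation ρ, alternating draws from the two full conditionals:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n_iter = 0.8, 10_000
x = y = 0.0
samples = np.empty((n_iter, 2))
for i in range(n_iter):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))  # x | y ~ N(rho*y, 1-rho^2)
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))  # y | x ~ N(rho*x, 1-rho^2)
    samples[i] = (x, y)

# After burn-in, the draws reproduce the target correlation:
print(np.corrcoef(samples[1_000:].T)[0, 1])       # close to 0.8
```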

 

Informational entropy (student stuff)

4 February 2019 at 10:51 | Posted in Statistics & Econometrics | Comments Off on Informational entropy (student stuff)
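The basic object is easy to compute; a few lines give Shannon’s entropy (in bits) of a discrete distribution:

```python
from math import log2

def entropy(p):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))   # 1 bit: a fair coin is maximally uncertain
print(entropy([0.9, 0.1]))   # ~0.47 bits: a biased coin surprises us less
print(entropy([0.25] * 4))   # 2 bits: four equally likely outcomes
```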

 

Statistics — the religion of our time

31 January 2019 at 17:19 | Posted in Statistics & Econometrics | Comments Off on Statistics — the religion of our time

There is no doubt that an entirely different subject has taken control of the teaching of scientific method across almost the whole field, namely statistics … The value of the statistical rule system should of course not be questioned, but it should not be forgotten that other forms of reflection are also cultivated in the land of science. No single subject can lay claim to hegemony …

John Maynard Keynes … points to something that might be called ‘causal spread.’ To gain knowledge of human beings, the young person needs to meet people of many different kinds … This seems completely obvious, but the point has hardly been admitted into any statistics textbook. There it is not qualitative richness that counts, only quantitative. Keynes, by contrast, says quite unabashedly that the number of cases examined is of no great importance … Someone assessing credibility may want to regard the probability figure as decisive. But that figure sometimes rests on a flimsy foundation, and then it is not worth much. When the assessor gets hold of additional material, the conclusions may be turned upside down. That risk must also be taken into account.

Sören Halldén

When yours truly studied philosophy and mathematical logic in Lund in the 1980s, Sören Halldén was a great source of inspiration. He still is.

Hidden Markov Models and Bayes Theorem for dummies

22 January 2019 at 15:57 | Posted in Statistics & Econometrics | Comments Off on Hidden Markov Models and Bayes Theorem for dummies
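A minimal sketch (my own toy numbers, assuming NumPy) of the forward algorithm, which is just Bayes’ theorem applied recursively to a two-state hidden Markov model:

```python
import numpy as np

A = np.array([[0.7, 0.3],    # transition probabilities between hidden states
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],    # emission probabilities P(observation | state)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # initial state distribution

def forward(obs):
    """Likelihood of an observation sequence under the HMM."""
    alpha = pi * B[:, obs[0]]            # prior times first likelihood
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate, then update on new datum
    return alpha.sum()

print(forward([0, 1, 1, 0]))
```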

 

Beyond probabilism

20 January 2019 at 17:22 | Posted in Statistics & Econometrics | Comments Off on Beyond probabilism

“Getting philosophical” is not about articulating rarified concepts divorced from statistical practice. It is to provide tools to avoid obfuscating the terms and issues being bandied about …

Do I hear a protest? “There is nothing philosophical about our criticism of statistical significance tests (someone might say). The problem is that a small P-value is invariably, and erroneously, interpreted as giving a small probability to the null hypothesis.” Really? P-values are not intended to be used this way; presupposing they ought to be so interpreted grows out of a specific conception of the role of probability in statistical inference. That conception is philosophical. Methods characterized through the lens of over-simple epistemological orthodoxies are methods misapplied and mischaracterized. This may lead one to lie, however unwittingly, about the nature and goals of statistical inference, when what we want is to tell what’s true about them …

One does not have evidence for a claim if nothing has been done to rule out ways the claim may be false. If data x agree with a claim C but the method used is practically guaranteed to find such agreement, and had little or no capability of finding flaws with C even if they exist, then we have bad evidence, no test …

Statistical inference uses data to reach claims about aspects of processes and mechanisms producing them, accompanied by an assessment of the properties of the inference methods: their capabilities to control and alert us to erroneous interpretations. We need to report if the method has satisfied the most minimal requirement for solving such a problem. Has anything been tested with a modicum of severity, or not? The severe tester also requires reporting of what has been poorly probed … Informal statistical testing, the crude dichotomy of “pass/fail” or “significant or not” will scarcely do. We must determine the magnitudes (and directions) of any statistical discrepancies warranted, and the limits to any substantive claims you may be entitled to infer from the statistical ones.

Deborah Mayo’s book underlines more than anything else the importance of not equating science with statistical calculation or applied probability theory.

The ‘frequentist’ long-run perspective in itself says nothing about how ‘severely’ tested are hypotheses and claims. It doesn’t give us the evidence we seek.

And ‘Bayesian’ consistency and coherence are equally silent. All science entails human judgement, and using statistical models doesn’t relieve us of that necessity. Choosing between theories and hypotheses can never be a question of inner coherence and consistency.

Probabilism — in whatever form it takes — says absolutely nothing about reality.
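To make ‘magnitudes of discrepancies’ concrete, here is a sketch of Mayo and Spanos’ post-data severity for a one-sided normal test (my own toy numbers: H0: μ ≤ 0, σ = 1, n = 100, observed mean 0.4):

```python
from math import sqrt
from statistics import NormalDist

def severity(xbar, mu1, sigma, n):
    """SEV(mu > mu1): the probability of a less extreme result than the
    one observed, were mu exactly mu1 (post-data severity)."""
    return NormalDist().cdf((xbar - mu1) / (sigma / sqrt(n)))

for mu1 in (0.0, 0.2, 0.3, 0.5):
    print(f"SEV(mu > {mu1}) = {severity(0.4, mu1, 1.0, 100):.3f}")
```

The claim μ > 0.2 passes with high severity here, while μ > 0.5 does not: exactly the graded report that a bare ‘significant or not’ verdict withholds.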

On the emptiness of Bayesian probabilism

15 January 2019 at 17:49 | Posted in Statistics & Econometrics | 1 comment

A major attraction of the personalistic [Bayesian] view is that it aims to address uncertainty that is not directly based on statistical data, in the narrow sense of that term. Clearly much uncertainty is of this broader kind. Yet when we come to specific issues I believe that a snag in the theory emerges. To take an example that concerns me at the moment: what is the evidence that the signals from mobile telephones or transmission base stations are a major health hazard? Because such telephones are relatively new and the latency period for the development of, say, brain tumours is long the direct epidemiological evidence is slender; we rely largely on the interpretation of animal and cellular studies and to some extent on theoretical calculations about the energy levels that are needed to induce certain changes. What is the probability that conclusions drawn from such indirect studies have relevance for human health? Now I can elicit what my personal probability actually is at the moment, at least approximately. But that is not the issue. I want to know what my personal probability ought to be, partly because I want to behave sensibly and much more importantly because I am involved in the writing of a report which wants to be generally convincing. I come to the conclusion that my personal probability is of little interest to me and of no interest whatever to anyone else unless it is based on serious and so far as feasible explicit information. For example, how often have very broadly comparable laboratory studies been misleading as regards human health? How distant are the laboratory studies from a direct process affecting health? The issue is not to elicit how much weight I actually put on such considerations but how much I ought to put. Now of course in the personalistic approach having (good) information is better than having none but the point is that in my view the personalistic probability is virtually worthless for reasoned discussion unless it is based on information, often directly or indirectly of a broadly frequentist kind. The personalistic approach as usually presented is in danger of putting the cart before the horse.

David Cox

The nodal point here is that although Bayes’ theorem is mathematically unquestionable, that doesn’t qualify it as indisputably applicable to scientific questions. Science is not reducible to betting, and scientific inference is not a branch of probability theory. It always transcends mathematics. The unfulfilled dream of constructing an inductive logic of probabilism — the Bayesian Holy Grail — will always remain unfulfilled.

Bayesian probability calculus is far from the automatic inference engine that its protagonists maintain it is. That probabilities may work for expressing uncertainty when we pick balls from an urn does not automatically make them relevant for making inferences in science. Where do the priors come from? Wouldn’t it be better in science, when we are uncertain, to do some scientific experimentation and observation, rather than to start making calculations based on often vague and subjective personal beliefs? People have a lot of beliefs, and when they are plainly wrong we should not do any calculations whatsoever on them. We simply reject them. Is it, from an epistemological point of view, really credible to think that the Bayesian probability calculus makes it possible to somehow fully assess people’s subjective beliefs? And are — as many Bayesians maintain — all scientific controversies and disagreements really explicable in terms of differences in prior probabilities? I’ll be dipped!

Keynes on the limits​ of econometric methods

14 January 2019 at 20:20 | Posted in Statistics & Econometrics | 5 comments

Am I right in thinking that the method of multiple correlation analysis essentially depends on the economist having furnished, not merely a list of the significant causes, which is correct so far as it goes, but a complete list? For example, suppose three factors are taken into account, it is not enough that these should be in fact verae causae; there must be no other significant factor. If there is a further factor, not taken account of, then the method is not able to discover the relative quantitative importance of the first three. If so, this means that the method is only applicable where the economist is able to provide beforehand a correct and indubitably complete analysis of the significant factors. The method is one neither of discovery nor of criticism. It is a means of giving quantitative precision to what, in qualitative terms, we know already as the result of a complete theoretical analysis.

John Maynard Keynes

