## Fat arms science …

15 February, 2019 at 18:47 | Posted in Statistics & Econometrics | 4 Comments

Over human evolutionary history, upper-body strength has been a major component of fighting ability. Evolutionary models of animal conflict predict that actors with greater fighting ability will more actively attempt to acquire or defend resources than less formidable contestants will. Here, we applied these models to political decision making about redistribution of income and wealth among modern humans. In studies conducted in Argentina, Denmark, and the United States, men with greater upper-body strength more strongly endorsed the self-beneficial position: Among men of lower socioeconomic status (SES), strength predicted increased support for redistribution; among men of higher SES, strength predicted increased opposition to redistribution. Because personal upper-body strength is irrelevant to payoffs from economic policies in modern mass democracies, the continuing role of strength suggests that modern political decision making is shaped by an evolved psychology designed for small-scale groups.

Aren’t we just überjoyed that research funding goes into performing these kinds of immensely interesting and important studies …

## The statistical crisis in science

15 February, 2019 at 18:11 | Posted in Statistics & Econometrics | Leave a comment

Such a great guy. If only more academics could be like you, Andrew!

## Gretl — econometrics made easy

14 February, 2019 at 08:17 | Posted in Statistics & Econometrics | 1 Comment

Thanks to Allin Cottrell and Riccardo Lucchetti we today have access to a high-quality tool for doing and teaching econometrics — **Gretl**. And, best of all, it is totally *free*!

Gretl is up to the tasks you may have, so why spend money on expensive commercial programs?

The latest snapshot version of Gretl can be downloaded here.

[And yes, I do know there’s another fabulously good and free program — **R**. But R hasn’t got as nifty a GUI as Gretl — and at least for students, it’s more difficult to learn to handle and program. I do think it’s preferable when students are going to learn some basic econometrics to use Gretl so that they can concentrate more on ‘content’ rather than ‘technique.’]

## Bayesian moons made of green cheese

12 February, 2019 at 17:58 | Posted in Statistics & Econometrics | Leave a comment

In other words, if a decision-maker thinks something cannot be true and interprets this to mean it has zero probability, he will never be influenced by any data, which is surely absurd. So leave a little probability for the moon being made of green cheese; it can be as small as 1 in a million, but have it there since otherwise an army of astronauts returning with samples of the said cheese will leave you unmoved.

To get the Bayesian probability calculus going you sometimes have to assume strange things — so strange that you actually should rather start wondering if maybe there is something wrong with your theory …
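The arithmetic behind the green cheese point is easy to check. Here is a minimal sketch of repeated Bayesian updating for a binary hypothesis. The function names and the likelihood values (0.99 and 0.01) are illustrative assumptions, not anything taken from the quoted text.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | E) from Bayes' theorem for a binary hypothesis H."""
    evidence = prior * likelihood_if_true + (1.0 - prior) * likelihood_if_false
    if evidence == 0.0:
        raise ValueError("the observation has zero probability under both hypotheses")
    return prior * likelihood_if_true / evidence


def update_repeatedly(prior, n_observations,
                      likelihood_if_true=0.99, likelihood_if_false=0.01):
    """Apply the same piece of supporting evidence n_observations times."""
    belief = prior
    for _ in range(n_observations):
        belief = posterior(belief, likelihood_if_true, likelihood_if_false)
    return belief


# Ten astronauts return with cheese samples (assumed, made-up likelihoods).
dogmatic = update_repeatedly(0.0, 10)       # prior of exactly zero
open_minded = update_repeatedly(1e-6, 10)   # prior of one in a million

print(f"dogmatic posterior:    {dogmatic}")
print(f"open-minded posterior: {open_minded:.6f}")
```

A prior of exactly zero stays at zero however many samples come back, while a prior of one in a million is overwhelmed after a handful of observations.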

## Machine learning — puzzling Big Data nonsense

10 February, 2019 at 14:09 | Posted in Statistics & Econometrics | Leave a comment

If we wanted highly probable claims, scientists would stick to low-level observables and not seek generalizations, much less theories with high explanatory content. In this day of fascination with Big data’s ability to predict what book I’ll buy next, a healthy Popperian reminder is due: humans also want to understand and to explain. We want bold ‘improbable’ theories. I’m a little puzzled when I hear leading machine learners praise Popper, a realist, while proclaiming themselves fervid instrumentalists. That is, they hold the view that theories, rather than aiming at truth, are just instruments for organizing and predicting observable facts. It follows from the success of machine learning, Vladimir Cherkassky avers, that “realism is not possible.” This is very quick philosophy!

Quick indeed!

The central problem with the present ‘machine learning’ and ‘big data’ hype is that so many — falsely — think that they can get away with analysing real-world phenomena without any (commitment to) theory. But — data never speaks for itself. Without a prior statistical set-up, there actually are no data at all to process. And — using a machine learning algorithm will only produce what you are looking for.

Machine learning algorithms *always* express a view of what constitutes a pattern or regularity. They are *never* theory-neutral.

Clever data-mining tricks are not enough to answer important scientific questions. Theory matters.

## Gibbs sampling (student stuff)

9 February, 2019 at 10:40 | Posted in Statistics & Econometrics | Leave a comment

## Informational entropy (student stuff)

4 February, 2019 at 10:51 | Posted in Statistics & Econometrics | Leave a comment

## Statistics — the religion of our time

31 January, 2019 at 17:19 | Posted in Statistics & Econometrics | Leave a comment

There is no doubt that an entirely different subject has taken control of the teaching of scientific method across almost the whole field, namely statistics … The value of the statistical system of rules should of course not be questioned, but it should not be forgotten that other forms of reflection are also cultivated in the land of science. No single subject can lay claim to hegemony …

John Maynard Keynes … points to something that may be called ‘causal spread.’ To gain knowledge of human nature, the young person needs to meet people of different kinds … This seems entirely obvious, but the point has hardly been admitted into any statistics textbook. There, it is not qualitative richness that counts, only quantitative. Keynes, by contrast, says quite unabashedly that the *number* of cases examined is not of any great importance … Whoever assesses credibility may want to regard the probability figure as decisive. But the figure sometimes rests on a deficient basis, and then it is not worth much. When the assessor gets hold of additional material, the conclusions may be turned upside down. The risk of this must also be taken into account.

When yours truly studied philosophy and mathematical logic in Lund in the 1980s, Sören Halldén was a great source of inspiration. He still is.

## Hidden Markov Models and Bayes Theorem for dummies

22 January, 2019 at 15:57 | Posted in Statistics & Econometrics | Comments Off on Hidden Markov Models and Bayes Theorem for dummies

## Beyond probabilism

20 January, 2019 at 17:22 | Posted in Statistics & Econometrics | Comments Off on Beyond probabilism

“Getting philosophical” is not about articulating rarified concepts divorced from statistical practice. It is to provide tools to avoid obfuscating the terms and issues being bandied about …

Do I hear a protest? “There is nothing philosophical about our criticism of statistical significance tests (someone might say). The problem is that a small P-value is invariably, and erroneously, interpreted as giving a small probability to the null hypothesis.” Really? P-values are not intended to be used this way; presupposing they ought to be so interpreted grows out of a specific conception of the role of probability in statistical inference. That conception is philosophical. Methods characterized through the lens of over-simple epistemological orthodoxies are methods misapplied and mischaracterized. This may lead one to lie, however unwittingly, about the nature and goals of statistical inference, when what we want is to tell what’s true about them …

One does not have evidence for a claim if nothing has been done to rule out ways the claim may be false. If data x agree with a claim C but the method used is practically guaranteed to find such agreement, and had little or no capability of finding flaws with C even if they exist, then we have bad evidence, no test …

Statistical inference uses data to reach claims about aspects of processes and mechanisms producing them, accompanied by an assessment of the properties of the inference methods: their capabilities to control and alert us to erroneous interpretations. We need to report if the method has satisfied the most minimal requirement for solving such a problem. Has anything been tested with a modicum of severity, or not? The severe tester also requires reporting of what has been poorly probed … Informal statistical testing, the crude dichotomy of “pass/fail” or “significant or not” will scarcely do. We must determine the magnitudes (and directions) of any statistical discrepancies warranted, and the limits to any substantive claims you may be entitled to infer from the statistical ones.

Deborah Mayo’s book underlines more than anything else the importance of not equating science with statistical calculation or applied probability theory.

The ‘frequentist’ long-run perspective in itself says nothing about how ‘severely’ tested are hypotheses and claims. It doesn’t give us the evidence we seek.

And ‘Bayesian’ consistency and coherence are as silent. All science entails human judgement, and using statistical models doesn’t relieve us of that necessity. Choosing between theories and hypotheses can never be a question of inner coherence and consistency.

Probabilism — in whatever form it takes — says absolutely nothing about reality.

## On the emptiness of Bayesian probabilism

15 January, 2019 at 17:49 | Posted in Statistics & Econometrics | 1 Comment

A major attraction of the personalistic [Bayesian] view is that it aims to address uncertainty that is not directly based on statistical data, in the narrow sense of that term. Clearly much uncertainty is of this broader kind. Yet when we come to specific issues I believe that a snag in the theory emerges. To take an example that concerns me at the moment: what is the evidence that the signals from mobile telephones or transmission base stations are a major health hazard? Because such telephones are relatively new and the latency period for the development of, say, brain tumours is long the direct epidemiological evidence is slender; we rely largely on the interpretation of animal and cellular studies and to some extent on theoretical calculations about the energy levels that are needed to induce certain changes. What is the probability that conclusions drawn from such indirect studies have relevance for human health? Now I can elicit what my personal probability actually is at the moment, at least approximately. But that is not the issue. I want to know what my personal probability ought to be, partly because I want to behave sensibly and much more importantly because I am involved in the writing of a report which wants to be generally convincing. I come to the conclusion that my personal probability is of little interest to me and of no interest whatever to anyone else unless it is based on serious and so far as feasible explicit information. For example, how often have very broadly comparable laboratory studies been misleading as regards human health? How distant are the laboratory studies from a direct process affecting health? The issue is not to elicit how much weight I actually put on such considerations but how much I ought to put.
Now of course in the personalistic approach having (good) information is better than having none but the point is that in my view the personalistic probability is virtually worthless for reasoned discussion unless it is based on information, often directly or indirectly of a broadly frequentist kind. The personalistic approach as usually presented is in danger of putting the cart before the horse.

The nodal point here is that although Bayes’ theorem is *mathematically* unquestionable, that doesn’t qualify it as indisputably applicable to *scientific* questions. Science is not reducible to betting, and scientific inference is not a branch of probability theory. It always transcends mathematics. The unfulfilled dream of constructing an inductive logic of probabilism — the Bayesian Holy Grail — will always remain unfulfilled.

Bayesian probability calculus is far from the automatic inference engine that its protagonists maintain it is. That probabilities may work for expressing uncertainty when we pick balls from an urn does not automatically make them relevant for making inferences in science. Where do the priors come from? Wouldn’t it be better in science if we did some scientific experimentation and observation if we are uncertain, rather than starting to make calculations based on often vague and subjective personal beliefs? People have a lot of beliefs, and when they are plainly wrong, we shall not do any calculations whatsoever on them. We simply reject them. Is it, from an epistemological point of view, really credible to think that the Bayesian probability calculus makes it possible to somehow fully assess people’s subjective beliefs? And are — as many Bayesians maintain — all scientific controversies and disagreements really possible to explain in terms of differences in prior probabilities? I’ll be dipped!

## Keynes on the limits of econometric methods

14 January, 2019 at 20:20 | Posted in Statistics & Econometrics | 5 Comments

Am I right in thinking that the method of multiple correlation analysis essentially depends on the economist having furnished, not merely a list of the significant causes, which is correct so far as it goes, but a *complete* list? For example, suppose three factors are taken into account, it is not enough that these should be in fact vera causa; there must be no other significant factor. If there is a further factor, not taken account of, then the method is not able to discover the relative quantitative importance of the first three. If so, this means that the method is only applicable where the economist is able to provide beforehand a correct and indubitably complete analysis of the significant factors. The method is one neither of discovery nor of criticism. It is a means of giving quantitative precision to what, in qualitative terms, we know already as the result of a complete theoretical analysis.
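Keynes’s worry about the incomplete list is what textbooks now call omitted variable bias, and a small simulation makes it concrete. The sketch below is only an illustration under assumed numbers: the true coefficients (1.0 and 2.0), the strength of the link between the two factors (0.8) and all function names are made up for the example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equally long lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def ols_one(y, x):
    """Slope from regressing y on a single regressor x (with intercept)."""
    return cov(x, y) / cov(x, x)


def ols_two(y, x1, x2):
    """Slopes from regressing y on x1 and x2 (with intercept), via normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(0)
n = 20000
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
# The second factor is correlated with the first (assumed link of 0.8).
x2 = [0.8 * a + 0.6 * rng.gauss(0.0, 1.0) for a in x1]
# Assumed true model: y = 1.0 * x1 + 2.0 * x2 + noise.
y = [1.0 * a + 2.0 * b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]

b1_full, b2_full = ols_two(y, x1, x2)   # complete list of factors
b1_short = ols_one(y, x1)               # second factor left out

print(f"x1 coefficient, complete list: {b1_full:.2f}")
print(f"x1 coefficient, x2 omitted:    {b1_short:.2f}")
```

With the complete list the estimate lands near the true 1.0. Leave out the second factor and the first one absorbs its effect, landing near 1.0 + 2.0 × 0.8 = 2.6, with nothing in the output to warn you.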

## Insignificant ‘statistical significance’

11 January, 2019 at 10:03 | Posted in Statistics & Econometrics | Comments Off on Insignificant ‘statistical significance’

We recommend dropping the NHST [null hypothesis significance testing] paradigm — and the p-value thresholds associated with it — as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, rather than allowing statistical significance as determined by p < 0.05 (or some other statistical threshold) to serve as a lexicographic decision rule in scientific publication and statistical decision making more broadly as per the status quo, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with the neglected factors [such factors as prior and related evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain] as just one among many pieces of evidence.

We make this recommendation for three broad reasons. First, in the biomedical and social sciences, the sharp point null hypothesis of zero effect and zero systematic error used in the overwhelming majority of applications is generally not of interest because it is generally implausible. Second, the standard use of NHST — to take the rejection of this straw man sharp point null hypothesis as positive or even definitive evidence in favor of some preferred alternative hypothesis — is a logical fallacy that routinely results in erroneous scientific reasoning even by experienced scientists and statisticians. Third, p-value and other statistical thresholds encourage researchers to study and report single comparisons rather than focusing on the totality of their data and results.

As shown over and over again when significance tests are applied, people have a tendency to read ‘not disconfirmed’ as ‘probably confirmed.’ Standard scientific methodology tells us that when there is only say a 10 % probability that pure sampling error could account for the observed difference between the data and the null hypothesis, it would be more ‘reasonable’ to conclude that we have a case of disconfirmation. Especially if we perform many independent tests of our hypothesis and they all give about the same 10 % result as our reported one, I guess most researchers would count the hypothesis as even more disconfirmed.
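One way to put a number on the ‘many independent tests’ intuition is Fisher’s method for combining p-values. That is a technique swapped in here for illustration, not something the argument above names. The sketch assumes the tests are independent and that each p-value is uniformly distributed under the null; the function name is made up.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, and for even degrees of freedom
    the upper tail has the closed form used below.
    """
    k = len(p_values)
    half_statistic = -sum(math.log(p) for p in p_values)  # statistic / 2
    tail_sum = sum(half_statistic ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_statistic) * tail_sum


single = fisher_combined_p([0.10])
five_tests = fisher_combined_p([0.10] * 5)

print(f"one test at p = 0.10:   combined p = {single:.4f}")
print(f"five tests at p = 0.10: combined p = {five_tests:.4f}")
```

A single test returns its own p-value unchanged, while five independent results of 10 % each combine to roughly 1 %, none of which would have passed a 5 % threshold on its own.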

We should never forget that the underlying parameters we use when performing significance tests are *model constructions*. Our p-values mean nothing if the model is wrong. And most importantly — statistical significance tests DO NOT validate models!

In journal articles a typical regression equation will have an intercept and several explanatory variables. The regression output will usually include an F-test, with p – 1 degrees of freedom in the numerator and n – p in the denominator. The null hypothesis will not be stated. The missing null hypothesis is that all the coefficients vanish, except the intercept.

If F is significant, that is often thought to validate the model. Mistake. The F-test takes the model as given. Significance only means this: *if* the model is right *and* the coefficients are 0, it is very unlikely to get such a big F-statistic. Logically, there are three possibilities on the table:

i) An unlikely event occurred.

ii) Or the model is right and some of the coefficients differ from 0.

iii) Or the model is wrong.

So?
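Possibility iii) is easy to produce on purpose. The sketch below fits a straight line to data that were actually generated from a quadratic relation; the data-generating process, the noise level and the helper names are assumptions made up for the illustration.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def f_statistic(y, fitted, n_params):
    """Overall F with n_params - 1 and n - n_params degrees of freedom."""
    n = len(y)
    mean_y = sum(y) / n
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_model = sum((fi - mean_y) ** 2 for fi in fitted)
    return (ss_model / (n_params - 1)) / (ss_resid / (n - n_params))


rng = random.Random(1)
n = 100
x = [3.0 * i / (n - 1) for i in range(n)]
# Assumed truth: a quadratic relation, which the linear model gets wrong.
y = [xi ** 2 + rng.gauss(0.0, 0.3) for xi in x]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
f_stat = f_statistic(y, fitted, n_params=2)


def mean_residual(low, high):
    """Average residual for observations with low <= x < high."""
    chosen = [r for xi, r in zip(x, residuals) if low <= xi < high]
    return sum(chosen) / len(chosen)


left, middle, right = mean_residual(0, 1), mean_residual(1, 2), mean_residual(2, 3.01)

print(f"F-statistic: {f_stat:.0f}")
print(f"mean residual by third of x: {left:.2f}, {middle:.2f}, {right:.2f}")
```

The F-statistic comes out far above the conventional 5 % critical value (roughly 4 for these degrees of freedom), yet the residuals are systematically positive at both ends and negative in the middle. The test rejected the null; it said nothing about whether the straight line was the right model.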

## Handy missing data methodologies

10 January, 2019 at 19:16 | Posted in Statistics & Econometrics | 2 Comments

On October 13, 2012, Manny Fernandez reported in The New York Times that former El Paso schools superintendent Lorenzo Garcia was sentenced to prison for his role in orchestrating a testing scandal. The Texas Assessment of Knowledge and Skills (TAKS) is a state-mandated test for high-school sophomores. The TAKS missing data algorithm was to treat missing data as missing-at-random, and hence the score for the entire school was based solely on those who showed up. Such a methodology is so easy to game that it was clearly a disaster waiting to happen. And it did. The missing data algorithm used by Texas was obviously understood by school administrators; all aspects of their scheme were to keep potentially low-scoring students out of the classroom so they would not take the test and possibly drag scores down. Students identified as likely low performing “were transferred to charter schools, discouraged from enrolling in school, or were visited at home by truant officers and told not to go to school on test day.”

But it didn’t stop there. Some students had credits deleted from transcripts or grades changed from passing to failing so they could be reclassified as freshmen and avoid testing. Sometimes, students who were intentionally held back were allowed to catch up before graduation with “turbo-mesters,” in which a student could acquire the necessary credits for graduation in a few hours in front of a computer.
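How much a ‘score only those who showed up’ rule can be gamed is easy to simulate. The numbers below (1,000 students, scores with mean 60 and standard deviation 15, 20 % absent) are made-up assumptions for illustration and have nothing to do with the actual TAKS data.

```python
import random


def mean(values):
    return sum(values) / len(values)


rng = random.Random(42)
# Hypothetical school: the score each student would get if tested.
scores = [rng.gauss(60.0, 15.0) for _ in range(1000)]
true_mean = mean(scores)

# Benign case: about 20 % of students are absent for reasons unrelated to ability.
present_at_random = [s for s in scores if rng.random() >= 0.2]
random_mean = mean(present_at_random)

# Gamed case: the 200 likely lowest scorers are kept away on test day.
gamed_mean = mean(sorted(scores)[200:])

print(f"all students:            {true_mean:.1f}")
print(f"random absences:         {random_mean:.1f}")
print(f"low scorers kept at home: {gamed_mean:.1f}")
```

When absences really are unrelated to ability, the school average barely moves. When the absentees are selected, the same scoring rule hands the school several extra points for nothing.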

## Groundbreaking study shows parachutes do not reduce death when jumping from aircraft

10 January, 2019 at 16:00 | Posted in Statistics & Econometrics | 1 Comment

Parachute use compared with a backpack control did not reduce death or major traumatic injury when used by participants jumping from aircraft in this first randomized evaluation of the intervention. This largely resulted from our ability to only recruit participants jumping from stationary aircraft on the ground. When beliefs regarding the effectiveness of an intervention exist in the community, randomized trials evaluating their effectiveness could selectively enroll individuals with a lower likelihood of benefit, thereby diminishing the applicability of trial results to routine practice. Therefore, although we can confidently recommend that individuals jumping from small stationary aircraft on the ground do not require parachutes, individual judgment should be exercised when applying these findings at higher altitudes.

Yep — background knowledge sure is important when experimenting …

‘Ideally controlled experiments’ tell us with certainty what causes what effects — but only given the right ‘closures.’ Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems, is not easy. ‘It works there’ is no evidence for ‘it will work here.’ Causes deduced in an experimental setting still have to show that they come with an export-warrant to the target population/system. The causal background assumptions made have to be justified, and without licenses to export, the value of ‘rigorous’ and ‘precise’ methods — and ‘on-average-knowledge’ — is despairingly small.

RCTs have very little reach beyond giving descriptions of what has happened in the past. From the perspective of the future and for policy purposes they are as a rule of limited value since they cannot tell us what background factors were held constant when the trial intervention was being made.

RCTs usually do not provide evidence that the results are exportable to other target systems. RCTs cannot be taken for granted to give generalizable results. That something works somewhere for someone is no warranty for us to believe it to work for us here or even that it works generally.
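The export problem can be mimicked with a toy trial. In the sketch below the risk function is entirely hypothetical (no risk below 2 metres, certain death without a parachute from 52 metres up, a parachute cutting the risk to 5 % of that) and the function names are made up; the only point is that a perfectly randomized trial estimates the effect for the conditions it sampled.

```python
import random


def death_probability(altitude_m, has_parachute):
    """Hypothetical risk curve, purely illustrative."""
    risk_without = max(0.0, min(1.0, (altitude_m - 2.0) / 50.0))
    return risk_without * (0.05 if has_parachute else 1.0)


def run_trial(altitudes, rng):
    """Randomize each jumper to parachute or backpack; return the risk difference."""
    deaths = {True: [], False: []}
    for altitude in altitudes:
        has_parachute = rng.random() < 0.5
        died = rng.random() < death_probability(altitude, has_parachute)
        deaths[has_parachute].append(died)
    risk_control = sum(deaths[False]) / len(deaths[False])
    risk_treated = sum(deaths[True]) / len(deaths[True])
    return risk_control - risk_treated


rng = random.Random(7)
ground_effect = run_trial([0.6] * 2000, rng)     # stationary aircraft on the ground
flight_effect = run_trial([1000.0] * 2000, rng)  # the setting we actually care about

print(f"risk reduction, jumps from 0.6 m:  {ground_effect:.2f}")
print(f"risk reduction, jumps from 1000 m: {flight_effect:.2f}")
```

Both trials are equally ‘rigorous.’ Only background knowledge about altitude tells you that the first result carries no export warrant to the second setting.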

## The replicability crisis

3 January, 2019 at 16:07 | Posted in Statistics & Econometrics | Comments Off on The replicability crisis

## Statistics is no substitute for thinking

2 January, 2019 at 14:44 | Posted in Statistics & Econometrics | Comments Off on Statistics is no substitute for thinking

The cost of computing has dropped exponentially, but the cost of thinking is what it always was. That is why we see *so many* articles with *so many regressions* and *so little thought*.

Zvi Griliches

## Confessions of scientific fraud

29 December, 2018 at 18:29 | Posted in Statistics & Econometrics | 1 Comment

Even with my various “grey” methods for “improving” the data, I wasn’t able to get the results the way I wanted them. I couldn’t resist the temptation to go a step further. I wanted it so badly …

I opened the file with the data that I had entered and changed an unexpected 2 into a 4; then, a little further along, I changed a 3 into a 5. It didn’t feel right.

I looked around me nervously. The data danced in front of my eyes. When the results are just not quite what you’d so badly hoped for … when you know that there are other people doing similar research elsewhere who are getting good results; then, surely, you’re entitled to adjust the results just a little?

No. I clicked on “Undo Typing.” And again. I felt very alone. I didn’t want this. I’d worked so hard. I’d done everything I could and it just hadn’t quite worked out the way I’d expected. It just wasn’t quite how everyone could see that it logically had to be … I looked at the array of data and made a few mouse clicks to tell the computer to run the statistical analyses. When I saw the results, the world had become logical again. I saw what I’d imagined.

## Why your friends are more popular than you are

27 December, 2018 at 19:54 | Posted in Statistics & Econometrics | Comments Off on Why your friends are more popular than you are

## The skill-luck equation

21 December, 2018 at 16:58 | Posted in Statistics & Econometrics | Comments Off on The skill-luck equation
