Machine learning — puzzling ‘big data’ nonsense

14 Mar, 2020 at 09:00 | Posted in Statistics & Econometrics | 4 Comments

If we wanted highly probable claims, scientists would stick to low-level observables and not seek generalizations, much less theories with high explanatory content. In this day of fascination with Big Data’s ability to predict what book I’ll buy next, a healthy Popperian reminder is due: humans also want to understand and to explain. We want bold ‘improbable’ theories. I’m a little puzzled when I hear leading machine learners praise Popper, a realist, while proclaiming themselves fervid instrumentalists. That is, they hold the view that theories, rather than aiming at truth, are just instruments for organizing and predicting observable facts. It follows from the success of machine learning, Vladimir Cherkassky avers, that “realism is not possible.” This is very quick philosophy!

Quick indeed!

The central problem with the present ‘machine learning’ and ‘big data’ hype is that so many — falsely — think that they can get away with analysing real-world phenomena without any (commitment to) theory. But — data never speaks for itself. Without a prior statistical set-up, there actually are no data at all to process. And — using a machine learning algorithm will only produce what you are looking for.

Machine learning algorithms always express a view of what constitutes a pattern or regularity. They are never theory-neutral.

Clever data-mining tricks are not enough to answer important scientific questions. Theory matters.


  1. In a scientific frame of mind, people presume, even without knowing any specifics, that observable phenomena are related to each other by being produced by a common systemic mechanism. They want to discover that system and understand the logic of its mechanisms.
    Social science faces the quandary that the logic of social mechanisms is our logic; we create the logic of institutions out of the rules we make and enforce, and some of those rules concern the definition and integrity of concepts and social facts. The logic of accounting rules creates the facts of financial accounting. The logic of relational databases has come to govern the double-entry bookkeeping that underlies accounting. The fictions of fiat money, governed by myth-shrouded banking procedure, supply the unit of account, used by bookkeepers and accountants and business managers and investors and journalists.
    There is nothing real in the welter of financial facts: sales, asset values, profit, income, prices. It is all instrumental. Unemployment rate? Instrumental. GDP growth in the fourth quarter? Instrumental.
    Econometrics is drowning in instrumental data — that is all there is. Arguing too strict a realism risks arguing against the genuine reality of society and social relations: that social mechanisms use fictions to create a socially constructed (and strategically contested) virtual reality in which people and social organizations and entities are defined and operate.
    The philosophical realist must confront the reality that the social mechanisms that she studies as a would-be social scientist are constructed from fictions. Moreover, the strongest organizing principle for a theory of such contested fictions may well be prescriptive before it can be descriptive: the best theory will be a social architecture, a design philosophy, if you will, that becomes deeply embedded in the deep structures of the very institutions it seeks to understand.
    In a context of systems constructed of fictions, it is not the case that truth does not matter. Truth matters more to social systems than physical ones. Gravity and momentum will keep the earth revolving around the sun no matter what false stories are told by humans. Price formation overwhelmed by fraud, on the other hand, can induce dysfunctional cooperation or financial collapse.
    The most important function of a social theory of economics or politics may well be to help us recognize truth and distinguish truth from fraud.

    • The problem is that something other than gravity is keeping stars orbiting in galaxies, so our story about forces in this solar system will probably become as “fraudulent” as epicycle theory in another thousand years or so. In the same way, financial truth is only fraud if you tell a particular story about it. How do you know your story is the best? How can you prove it, without simply relying on fickle, arbitrary social consensus?

  2. Excellent questions! There are some profound issues involved in developing a conceptual apparatus to distinguish fraud among fictions.
    Uncertainty implies all knowledge is inaccurate and contingent and subject to contextual qualification. In measuring the physical world, elaborate arguments can be advanced for accuracy of some measurements, properly qualified as to context (even if the context specified is imaginary!).
    For the social and economic world, contingency looms much larger than measurement error in the reliable truth of common fictions. The price of widgets is fixed to some precise number, but the precision is an artifact of administrative procedure and property law, the equivalent of “error” contained in implied warranties and the like. The price of McDonald’s hamburgers on Tuesday is known in advance and is reliable rather than accurate. I do not even know what it would mean to argue that such a price was “accurate”. The price is not measuring anything. But, when you go there on Tuesday, that is what you pay. And, what you get is a meal of consistent quality.
    A person can navigate the economy by means of such fictions as the price of McDonald’s hamburgers and get a meal without personally engaging in any aspect of production or distribution beyond obtaining a money income with which to pay.
    A fraud in this context is a deception regarding the contingencies. If you showed up at McDonald’s and what you had to pay was much more than the advertised price, it might be fair to attribute that to fraud. Similarly, if the meal was poorly cooked or contaminated. We expect the restaurant to spare no expense in remedying certain kinds of deficiency or accident. Fraud is something other than honest error.
    But not everything can be a fraud just because every social fact is a fiction. We have to be able to distinguish fraud from truth in a world of fictions.

  3. Some things have to be true in a temporary sense, e.g. a social system is observable, it has borders, it has organisation, and it has interconnections to other systems. A model may be a more or less good description of reality, but what is the criterion that makes the difference in human realisation of how the world works? It is biologically given that persons need to act; every animal needs to act to get feedback on the world and to verify its thoughts about it through that feedback. Knowledge may be indirect or direct, but most knowledge is created and given socially, in interaction with others. The dictum cannot be “I think, so I exist” but rather “The world must exist in order for me to exist.”
