The dangers of data-based certainty
April 17 2021 12:15 AM

People who work in the arts and humanities have started doing something unusual, at least for them: poring over data. This is due to the pandemic, of course. Every day, they check Covid-19 case numbers, how slowly or quickly the R factor is declining, and how many people in their area got vaccinated the day before.
Meanwhile, social media are full of claims and counterclaims about all manner of other data. Is global poverty declining or increasing? What is the real level of US unemployment? The scrutiny, sometimes leading to tetchy arguments, results from people’s desire to cite – or challenge – the authority of data to support their position or worldview.
But in other areas where data are used, there is remarkably little focus on their reliability or interpretation. One striking recent example concerns the “CAPTCHA” tests designed to protect websites against bots, which ask you to prove your humanity by identifying images containing common features such as boats, bicycles, or traffic lights. If your choice – even if correct – differs from that of the machine system using your selection to train an image-recognition algorithm, you will be deemed inhuman.
In this example, the machine’s error is obvious, although there is no appeal against it if you want to access the website it is guarding. But in other cases, it may not be possible to identify what conclusions either machine-learning systems or human analysts are drawing when they put more weight on data than the data can bear.
Economists are rushing to embrace the use of big data in their research, while many policymakers think artificial intelligence offers scope for greater cost-effectiveness and better policy outcomes. But before we entrust more decisions to data-based machine-learning and AI systems, we must be clear about the limitations of the data.
Already, too little attention is paid to the uncertainties inherent in economic data. Although policymakers generally appreciate that even something as basic as GDP growth is subject to large uncertainties and revisions, it seems impossible to stop people from building narratives on weak foundations.
For example, cross-country comparisons of the pandemic’s impact on national GDP are fraught with difficulty, owing to differences in economic structure and statistical methodology. But that does not stop claims about which economies are weathering the crisis better or worse.
Or consider the “true” rate of inflation. Seemingly technical disputes about how best to construct a price index mask profound distributional conflicts, such as those between borrowers and bond holders, or workers and employers.
The data we use shape our view of a complex, changing world. But data represent reality from a particular perspective. Data of the kind deployed in policy debates are rarely completely unanchored from the world they describe, but the lens they provide can be sharp or blurry – and there is no escaping the perspective they offer.
One possible reason for the current distrust of economic “expertise” is the growing gap between top-down, technical economic assessments based on familiar data series, and an alternative world of more granular data presenting the bottom-up picture. Standard economic statistics capture average experience, which ceases to be typical when people’s fortunes diverge.
In general, advocates of evidence-based policy are aware of the inherent uncertainty of available data. Researchers take great care regarding sampling, the scope for error, and the limitations of the data-collection method used. But the degree of false certainty tends to increase with proximity to policy and political decision-making. Former US President Harry S Truman is far from the only politician to have expressed impatience with economists who say, “‘On the one hand...,’ then, ‘but on the other.’” — Project Syndicate
