Does this mean that the quality of journalism is about to deteriorate? Are newspapers less likely to produce investigative stories that readers depend on to expose corruption, uncover wrongdoing, and hold influential people and institutions accountable?
“The worry is, who else will investigate a scandal in a small regional city?” says Gregory J. Martin, an assistant professor of political economy at Stanford Graduate School of Business. The existence of such coverage is crucial for the accountability of elected officials, and it has positive effects on representative democracy.
Martin’s concerns about the future of investigative journalism were shared by two Stanford GSB researchers: Shoshana Vasserman, an assistant professor of economics, and Eray Turkel, a Ph.D. student studying applied microeconomics. Because it was hard to track the output of investigative journalism using existing tools, they trained a machine-learning classifier to identify investigative newspaper articles. They then applied their model to millions of articles to determine whether the decade-long shrinkage of newsrooms had led to a decrease in such content.
The findings, published in the Proceedings of the National Academy of Sciences, were mixed. On the one hand, Martin says, things aren’t quite as bad as they seemed. “On the other hand, the trend seems to be quite downward,” he says, noting that this seems to coincide with significant layoffs and downsizing. “It’s mixed news.”
Investigating “Investigativeness”
The researchers examined more than 5.9 million articles from 50 U.S. newspapers with a history of publishing investigative reports. To train the algorithm, the team used the full text of the articles and their metadata to create descriptive features that indicate “investigativeness.”
It wasn’t easy to sift through millions of articles and find the most important investigations, even with a sensitive tool. Turkel says that only a “very, very tiny fraction” of the data was investigative. Martin says that since only one or two percent of newspaper reporting can be classified as investigative, the task is “inherently difficult” for “any kind of machine-learning method.” Because investigative articles are so diverse, a computer cannot simply look for a fixed set of keywords. Martin says that the context is more important.
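To see why such a small share of investigative articles makes the problem hard, consider a rough illustration. The sketch below is not the researchers’ code; the corpus size and the 2 percent share are made-up numbers chosen to match the range Martin describes. It shows that a model which never flags anything still looks highly accurate while finding no investigations at all.

```python
def accuracy(labels, predictions):
    """Share of articles whose predicted label matches the true label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def recall(labels, predictions):
    """Share of truly investigative articles (label 1) that were flagged."""
    true_positives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    positives = sum(labels)
    return true_positives / positives if positives else 0.0


# Hypothetical corpus: 1,000 articles, 2 percent of them investigative.
labels = [1] * 20 + [0] * 980

# A useless "classifier" that labels every article as not investigative.
always_no = [0] * len(labels)

baseline_accuracy = accuracy(labels, always_no)
baseline_recall = recall(labels, always_no)

print(f"accuracy: {baseline_accuracy:.2f}, recall: {baseline_recall:.2f}")
```

Because accuracy alone rewards ignoring the rare class, a measure such as recall is a more telling check on whether a classifier actually finds investigative work.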
The approach included examining the impact of an article on future coverage. Turkel explains that if an article exposes a scandal involving an important institution or agency, other pieces will likely mention this story in the future or use the same words. The classifier was trained on labels based on articles that had won investigative journalism awards: the researchers took about 1,000 award-winning investigative pieces and measured how closely an article compares to them in style or subject.
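One simple way to picture a comparison against award-winning pieces is a word-overlap score. The sketch below is an assumption for illustration only, not the method used in the study: it represents each text as a bag of words and uses cosine similarity, and the function names and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word occurrences in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_to_references(article, references):
    """Highest similarity between an article and any reference piece."""
    article_vec = bag_of_words(article)
    return max(
        (cosine_similarity(article_vec, bag_of_words(ref)) for ref in references),
        default=0.0,
    )


# Hypothetical stand-ins for award-winning investigative pieces.
references = [
    "Court records show the agency hid payments to contractors",
    "Documents reveal officials ignored safety inspections for years",
]

investigative_like = "Records obtained by the paper show officials hid payments"
sports_story = "The home team won the championship game on Saturday"

score_investigative = similarity_to_references(investigative_like, references)
score_sports = similarity_to_references(sports_story, references)
print(score_investigative > score_sports)
```

A score like this would be only one input among many; as Martin notes, context matters more than any single word match.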
Keywords such as “corruption” and references to documents such as court records or Freedom of Information Act releases are also signs of investigation. Whether an article appears to be part of a series was another predictive feature. Martin says that because an investigation is expensive, a newspaper that invests money in one will usually produce several pieces from it.
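Signals of this kind can be pictured as simple yes-or-no flags computed from an article’s text. The sketch below is a hypothetical illustration: the patterns and feature names are assumptions and are not taken from the study, which combined many features rather than relying on keyword matches.

```python
import re

# Hypothetical patterns; the feature names are illustrative only.
SIGNAL_PATTERNS = {
    "mentions_corruption": r"\bcorrupt(?:ion)?\b",
    "cites_court_records": r"\bcourt (?:records|documents|filings)\b",
    "cites_foia": r"\bfreedom of information act\b|\bfoia\b",
    "part_of_series": r"\bpart (?:one|two|three|\d+)\b|\bin this series\b",
}


def extract_signals(text):
    """Return a dict mapping each signal name to True if its pattern appears."""
    lowered = text.lower()
    return {
        name: bool(re.search(pattern, lowered))
        for name, pattern in SIGNAL_PATTERNS.items()
    }


sample = (
    "Part two of an investigation: court records obtained under the "
    "Freedom of Information Act point to corruption."
)
signals = extract_signals(sample)
print(signals)
```

Flags like these could be fed to a classifier alongside other features; on their own they would miss investigations that use none of the listed phrases.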
The algorithm developed by the team was effective at identifying quality investigative journalism. Despite being trained on a small set of award-winning articles deemed investigative, the classifier accurately identified investigative reports, as well as highly productive investigative journalists and outlets that specialize in investigative reporting.
The Good News
The number of investigative stories published remained relatively constant over the last decade. “I expected this to be a dramatic decline,” Martin says. “What we find is actually pretty stable.”
These findings suggest that the acquisition of newsrooms by investment groups has not led to a sustained drop in investigative articles. The authors point out that their results indicate that downsizing and reorganization are slow processes. We may not yet see the full impact of these changes.
Turkel points out that the study only includes 50 newspapers that have survived over the last decade. A second caveat is the significant decline in investigative content that began in 2019 with recent rounds of layoffs. After the Austin American-Statesman was purchased by a major publishing group in 2018, it went from publishing 15 investigative articles a month to just two. Martin says that the metric showed a dramatic drop.
The dataset of millions of articles, along with their predicted investigative scores, is available to the public. Turkel says other researchers are interested in using the dataset to measure the impact of newspaper closings, acquisitions, and takeovers.
The Stanford GSB researchers plan to use data and algorithms to better understand how readers react to investigative reporting. In partnership with Mozilla, they are launching a platform that allows web users to donate their browsing data. “We are interested in the extent to which quality influences reading and consumption,” Martin says. “We’d love to say something about what investigative journalism means to readers.”