New format leads to enhanced interpretability and consistency of financial data.

Financial firms, investors, and researchers all rely on accurate financial reports from companies to understand the economy and to make investment and trading decisions.

A team of researchers from the Penn State Smeal College of Business found that a new, richer method of structuring data in these reports offers important advantages. The researchers suggest that this new method, which the Securities and Exchange Commission now requires businesses to use, leads to more accurate research and improved financial decisions.

“What the study found was that there are big discrepancies between data sold by third-party, commercial data aggregators, such as Compustat, and what companies actually file with the U.S. Securities and Exchange Commission,” says Steve Huddart, professor of accounting and senior associate dean for Penn State Smeal. “A lot of the things that accounting researchers look at, such as the profitability of particular stock trading strategies, are of great interest to practitioners. Many stock trading strategies are based on information from the corporate filings of all the companies that you’re thinking about trading. Getting timely, accurate, and consistent information on hundreds or thousands of companies is crucial to portfolio formation.”

According to the researchers, the SEC recently mandated that businesses disclose financial data in a reporting language called the eXtensible Business Reporting Language (XBRL). This language links each value in a financial report to additional information, called metadata. The metadata provide additional context and specificity for business activity and transactions, thereby enhancing the interpretability and consistency of the financial data.
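
To make the idea of metadata concrete, here is a minimal Python sketch of a tagged value. It is an illustration only, not the XBRL specification or the researchers' tooling: the class `TaggedFact`, its field names, the filer "EXAMPLE-CO", and the revenue figure are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedFact:
    """A reported value plus the context needed to interpret it (hypothetical model)."""
    concept: str       # taxonomy element the value is tagged with
    entity: str        # who reported it (made-up filer name)
    period_start: str  # start of the reporting period, ISO date
    period_end: str    # end of the reporting period, ISO date
    unit: str          # unit of measure
    value: float       # the reported number itself


def describe(fact: TaggedFact) -> str:
    """Render a fact so that its meaning is unambiguous without the original report."""
    return (
        f"{fact.concept} for {fact.entity}, "
        f"{fact.period_start} to {fact.period_end}: "
        f"{fact.value:,.0f} {fact.unit}"
    )


# A bare number says nothing about what it measures or when.
bare_value = 1_250_000.0

# The same number with metadata attached (all values invented for illustration).
fact = TaggedFact(
    concept="us-gaap:Revenues",
    entity="EXAMPLE-CO",
    period_start="2019-01-01",
    period_end="2019-12-31",
    unit="USD",
    value=bare_value,
)

print(describe(fact))
```

The bare number could be revenue, profit, or debt for any period. The tagged version carries enough context to be compared consistently across companies and years.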

Currently, most analysts rely on information that aggregators gather and sell to their customers. The aggregators’ data omit metadata, so they are not as rich as the XBRL format and, according to the researchers, contain significant discrepancies from the XBRL data.

The researchers say they found more than 90% of the data items provided by third-party aggregators contain discrepancies in financial factors compared to the XBRL data. They add that these discrepancies are frequent and large enough to cloud the interpretations drawn from some prior academic research studies that were based upon Compustat data.

“For certain kinds of academic research, the conclusion drawn depends on whether the data source is XBRL or Compustat. To us, this strongly suggests that the way we do research going forward should be based more on XBRL data and less on these data aggregators’ products,” Huddart says.

According to the researchers, discrepancies arise for a few reasons, but occur most frequently in businesses with complex financial reports and in certain industries. They add that because data aggregators do not describe the procedures and normalizations used in creating their data products, identifying the cause of the discrepancies in their products can be a big challenge. An advantage of XBRL data is that all values, including aggregated or computed values, can be traced to the associated corporate report.

The researchers also say that as-filed data are more granular than aggregators’ data and, where aggregators may take days or even weeks to make financial statement data available to investors, as-filed data are available as soon as the XBRL filings are submitted to the SEC.

Democratization of Data

The researchers suggest that the move toward XBRL is a step toward making financial data not just more accurate, but also more accessible. Third-party aggregators typically charge high fees to access their data.

However, they add that even though the XBRL data are essentially free, there are some technological hurdles to overcome in accessing XBRL records.

“There is still an inertia in people’s adoption of the data aggregation product,” says Kai Du, associate professor of accounting, who was an academic fellow at the SEC’s Office of the Chief Accountant. “Adoption won’t be automatic because it is free, more timely, and more granular. It’s hard to imagine that an individual researcher or company that has been using third-party data for the past two decades would automatically give them up. It takes a lot of effort and determination to make that switch.”

For the study, the researchers gathered as-filed financial data from 2012 to 2019. Altogether, they compared as-filed data covering 20,410 company-years to information provided by third-party data aggregators for the same companies in the same time period. This comparison allowed the researchers to identify substantial discrepancies between the XBRL data and the data provided by third-party aggregators.
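
A comparison of this kind can be sketched in a few lines of Python. This is not the researchers' procedure, which the article does not describe in detail. The function `discrepancy_rate`, the 1% tolerance, the company labels, and every number below are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Key: (company, fiscal year, data item). All keys and values are made up.
Key = Tuple[str, int, str]


def discrepancy_rate(
    as_filed: Dict[Key, float],
    aggregated: Dict[Key, float],
    rel_tol: float = 0.0,
) -> Tuple[float, List[Key]]:
    """Return the share of shared observations whose values differ, and which ones.

    Two values count as different when their absolute gap exceeds
    rel_tol times the magnitude of the as-filed value.
    """
    shared = as_filed.keys() & aggregated.keys()
    if not shared:
        return 0.0, []
    mismatched = sorted(
        key
        for key in shared
        if abs(as_filed[key] - aggregated[key]) > rel_tol * abs(as_filed[key])
    )
    return len(mismatched) / len(shared), mismatched


# Hypothetical as-filed values and the aggregator's values for the same items.
as_filed = {
    ("A", 2018, "Revenues"): 100.0,
    ("A", 2018, "NetIncomeLoss"): 10.0,
    ("B", 2018, "Revenues"): 250.0,
    ("B", 2018, "NetIncomeLoss"): 20.0,
}
aggregated = {
    ("A", 2018, "Revenues"): 100.0,
    ("A", 2018, "NetIncomeLoss"): 12.0,
    ("B", 2018, "Revenues"): 245.0,
    ("B", 2018, "NetIncomeLoss"): 20.0,
}

rate, keys = discrepancy_rate(as_filed, aggregated, rel_tol=0.01)
print(f"{rate:.0%} of shared observations differ: {keys}")
```

In this toy data, two of the four shared observations differ by more than 1%, so the rate is 50%. A real comparison would also need to map each aggregator variable to the corresponding XBRL tag, which is where much of the difficulty lies.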

To assess how discrepancies affect financial judgments and decisions, the researchers focused on certain kinds of financial statement analysis, such as computing the accrual component of earnings and estimating real earnings management.

Future Work

The researchers said that future research should investigate how analysts can use XBRL information.

“I think there are so many features of XBRL that we can explore,” said Xin Daniel Jiang, assistant professor of managerial accounting at the University of Waterloo and a graduate of Penn State Smeal’s Ph.D. program. “To me, I think, the logical, hierarchical structure of XBRL is very important and underused.”

The researchers published their findings in the Journal of Accounting and Economics.