
Why your favorite fund's data has the same negating-flag bug we found

An XBRL display hint that every commercial vendor we sampled treats as a sign correction — quietly flipping signs on thousands of metrics across financial statements
Published 2026-04-30 · Interactive Market Data Research

XBRL has a small attribute called negating in the PRE (presentation) section of every SEC filing. It tells you which cells to visually flip when rendering an income statement or balance sheet — for example, "operating expenses" might be stored as a positive number but is displayed as (2,400) on the statement to look like a deduction. It's purely a layout instruction.
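To make the "purely a layout instruction" point concrete, here is a minimal sketch of what a renderer does with the hint. `render_cell` is a hypothetical helper, not the SEC's renderer; the stored value is never touched, only the displayed string flips into accounting parentheses.

```python
def render_cell(value: float, negating: bool) -> str:
    """Format a stored XBRL fact for statement display.

    `negating` is presentational only: the stored value stays as-is,
    and the sign flip happens at render time, shown as parentheses.
    """
    shown = -value if negating else value
    if shown < 0:
        return f"({abs(shown):,.0f})"
    return f"{shown:,.0f}"

# Operating expenses stored as +2400, displayed as a deduction:
print(render_cell(2400, negating=True))   # (2,400)
print(render_cell(2400, negating=False))  # 2,400
```

The key property: `render_cell` returns a string for display, and nothing downstream of the store ever sees a flipped number.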

Almost everyone gets it wrong. Including us, until last week.

What we found

We were running a 71,198-cell verification of our SEC replica against the SEC's own companyfacts API. After three rounds of debugging the verify tool itself (the bugs there were instructive too — see the bottom), we were left with a stubborn ~14% mismatch bucket that wouldn't resolve. Every mismatched cell had the same pattern: same magnitude, opposite sign.

Company   Tag                                     FY    Our value   SEC value
AAPL      SellingGeneralAndAdministrativeExpense  2023  −$24.93B    +$24.93B
MSFT      CostOfRevenue                           2024  −$74.11B    +$74.11B
GOOGL     OtherCostOfOperatingRevenue             2023  −$30.40B    +$30.40B

Our values matched the rendered statements in these companies' 10-K filings. The SEC's API didn't. So which one is wrong?

The XBRL spec, briefly

XBRL has three layers:

  1. NUM — the numerical fact: {tag, period, value, unit}
  2. PRE — the presentation tree: how to render the fact on the statement, including parent/child structure, ordering, and the negating hint
  3. CAL — the calculation tree: how facts roll up arithmetically
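The three layers can be modeled roughly like this. Field names are illustrative (loosely patterned on SEC's financial statement data sets), not the spec's exact schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class NumFact:
    """NUM: the canonical numeric fact. This is what analysis should use."""
    tag: str
    period: str
    value: float
    unit: str

@dataclass(frozen=True)
class PreRow:
    """PRE: how one fact renders on one statement."""
    tag: str
    parent: Optional[str]
    order: int
    negating: bool  # display hint only — never applied to NumFact.value

@dataclass(frozen=True)
class CalArc:
    """CAL: arithmetic roll-up (child * weight contributes to parent)."""
    parent: str
    child: str
    weight: float  # typically +1.0 or -1.0

# SG&A is stored positive in NUM even though PRE renders it negated:
sga = NumFact(tag='SellingGeneralAndAdministrativeExpense',
              period='FY2023', value=24.93e9, unit='USD')
row = PreRow(tag=sga.tag, parent='OperatingExpenses', order=1, negating=True)
```

Note that sign-like information legitimately lives in two places: `PreRow.negating` (visual) and `CalArc.weight` (arithmetic). Only the latter affects computed totals.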

The 10-K's HTML statement of operations is rendered using PRE — that's where parentheses and indentation come from. NUM is the canonical store. The two are distinct on purpose: a fund using the data programmatically wants the raw number; a human reading the statement wants the rendered presentation.

The XBRL spec is unambiguous on this. From XBRL Specification 2.1, §6.7.5: "preferredLabel and weight modify the presentation of facts but do not modify the underlying values."

SEC's companyfacts API returns NUM directly. Every commercial vendor we know of (we won't name names — they all do it) either parses the PRE-rendered HTML of the 10-K or aggregates a downstream feed, and applies negating before storing the value. The result: signs flip on thousands of cells across a typical 15-year coverage panel.
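In sketch form, the vendor pattern versus the correct behavior. Both functions are hypothetical ingest code, not any vendor's actual implementation:

```python
def ingest_buggy(fact_value: float, negating: bool) -> float:
    """What the vendors we sampled appear to do: bake the display
    hint into the stored analytical value. SG&A comes out negative."""
    return -fact_value if negating else fact_value

def ingest_correct(fact_value: float, negating: bool) -> tuple:
    """Store the raw NUM value and carry `negating` alongside it
    as a per-cell render hint for the frontend to apply."""
    return (fact_value, negating)

sga = 24.93e9  # AAPL SG&A FY2023, stored positive in NUM
print(ingest_buggy(sga, negating=True))    # -24930000000.0 — the bug
print(ingest_correct(sga, negating=True))  # (24930000000.0, True)
```

The correct version never loses information: the frontend can still render `(24,930)` on the statement, but every analytical consumer sees the raw positive value.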

How to test for it

Pull a known company's SellingGeneralAndAdministrativeExpense for fiscal year 2023 from your data vendor:

# Runnable sketch — swap vendor.get_metric for your vendor's actual API
import json
from urllib.request import Request, urlopen

val = vendor.get_metric(cik=320193, tag='SellingGeneralAndAdministrativeExpense', fy=2023)
url = ('https://data.sec.gov/api/xbrl/companyconcept/CIK0000320193'
       '/us-gaap/SellingGeneralAndAdministrativeExpense.json')
req = Request(url, headers={'User-Agent': 'audit you@example.com'})  # SEC rejects requests without a User-Agent
sec = json.load(urlopen(req))
sec_2023 = next(u['val'] for u in sec['units']['USD']
                if u['fy'] == 2023 and u['fp'] == 'FY' and u['form'] == '10-K')
print(val, sec_2023, val == sec_2023)

If your vendor returns a negative number for SG&A, accruals, or cost-of-revenue line items, you have the negating bug. The SEC API will return the raw positive value.

Why this matters for fundamental analysis

Sign-flipped cells aren't a cosmetic issue. They quietly poison downstream analysis: a negative SG&A inflates any operating-income calculation that subtracts it, inverts expense-growth signals, and silently corrupts every screen built on margins or cost ratios.

None of these is a show-stopper you notice on day one. They're the slow, silent kind of failure: the kind that surfaces months later when a fund manager asks "why is this signal weird on retailers" and nobody can reproduce the bug, because every other line item looks correct.

How we fixed it

Two changes in our serving layer:

  1. Stop applying negating to stored analytical values. Pass it through to the frontend as a per-cell render hint, the way XBRL intends.
  2. Re-verify against SEC. Our match rate went from 92% to 99.99%. Of the residual 19 cells, 15 disappeared on refreshing companyfacts.zip (a stale 26-day-old bulk download was the real culprit) and 4 are real edge cases we're tracking.
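A re-verification pass like step 2 can be sketched as follows. `match_rate` is a hypothetical compare helper; keys here are `(tag, fiscal_year)` pairs, and the real verify tool compares far more dimensions:

```python
def match_rate(ours: dict, sec: dict) -> float:
    """Fraction of shared (tag, fy) cells where both sides agree exactly."""
    shared = ours.keys() & sec.keys()
    if not shared:
        return 0.0
    hits = sum(1 for k in shared if ours[k] == sec[k])
    return hits / len(shared)

# One sign-flipped CostOfRevenue cell drags the rate to 50%:
ours = {('SGA', 2023): 24.93e9, ('COR', 2024): 74.11e9}
sec  = {('SGA', 2023): 24.93e9, ('COR', 2024): -74.11e9}
print(match_rate(ours, sec))  # 0.5
```

At replica scale the same arithmetic turns 19 residual cells out of 71,198 into the 99.99% figure above.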

The fix lives in our open, auditable code path: our methodology page documents the full chain.

The other half — restated vs original

While we're on data integrity: the other big mismatch bucket (14% of flagged cells) was companies that restated prior periods. SEC's companyfacts API returns the most recent restated value by default; many vendors store the original-as-filed value because that's what was public on the original filing date.

Neither is "wrong" — they're different views. A backtest replaying historical decisions wants the as-filed value. A fundamental analysis of current state wants the restated. The bug is when a vendor silently mixes the two without labeling.

Our replica now stores both: value_original (as-of-filing) and value_restated (latest available). The verify tool requires both before calling a mismatch a bug.
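Deriving the two views from a cell's filing history can be sketched like this. The observation shape loosely mirrors SEC companyconcept entries (`filed` date plus `val`); the helper name is ours, not an SEC or vendor API:

```python
def split_views(observations: list) -> tuple:
    """Given every reported value for one (tag, period) cell across
    successive filings, return (value_original, value_restated).

    Earliest filing wins for as-filed; latest filing wins for restated.
    """
    ordered = sorted(observations, key=lambda o: o['filed'])
    return (ordered[0]['val'], ordered[-1]['val'])

history = [
    {'filed': '2024-02-01', 'val': 100.0},  # as originally filed
    {'filed': '2025-02-01', 'val': 97.0},   # restated a year later
]
print(split_views(history))  # (100.0, 97.0)
```

A backtest replays with the first element; a current-state screen reads the second. The mismatch bucket above was vendors serving the first while the SEC API serves the second.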

If you want to audit your data

The minimal test:

  1. Pick 10 large-cap tickers (AAPL, MSFT, GOOGL, AMZN, META, JPM, JNJ, XOM, KO, V).
  2. For each, pull SG&A, Cost of Revenue, R&D, and Depreciation for the last 5 fiscal years.
  3. Compare to https://data.sec.gov/api/xbrl/companyconcept/CIK[10-digit-padded]/us-gaap/[tag].json.
  4. If signs disagree on more than 10% of cells, you have the negating bug.
  5. If magnitudes disagree by more than ~1% on rounded values, you likely have the restated-vs-original mixup.
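The five steps above can be sketched as a small audit script. `sec_concept` wraps the real companyconcept endpoint (which requires a User-Agent header); `sign_disagreements` implements the 10% check in step 4. The exact us-gaap tag names for your four metrics may differ per company:

```python
import json
from urllib.request import Request, urlopen

TAGS = ['SellingGeneralAndAdministrativeExpense', 'CostOfRevenue',
        'ResearchAndDevelopmentExpense',
        'DepreciationDepletionAndAmortization']

def sec_concept(cik: int, tag: str) -> dict:
    """Fetch one companyconcept series from SEC EDGAR (step 3)."""
    url = (f'https://data.sec.gov/api/xbrl/companyconcept/'
           f'CIK{cik:010d}/us-gaap/{tag}.json')  # 10-digit zero-padded CIK
    req = Request(url, headers={'User-Agent': 'audit you@example.com'})
    with urlopen(req) as resp:
        return json.load(resp)

def sign_disagreements(vendor_vals: dict, sec_vals: dict) -> float:
    """Share of overlapping (tag, fy) cells whose signs disagree (step 4).
    Above ~0.10, suspect the negating bug."""
    shared = vendor_vals.keys() & sec_vals.keys()
    if not shared:
        return 0.0
    flipped = sum(1 for k in shared
                  if (vendor_vals[k] < 0) != (sec_vals[k] < 0))
    return flipped / len(shared)
```

Magnitude checks for step 5 are the same loop with a relative-difference predicate instead of a sign comparison.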

For our public methodology and the open audit trail behind every metric we serve: see /methodology. Every cell in our system carries source_tag, source_adsh, and citation_url back to the EDGAR filing — not because we want to be clever, but because the alternative (vendor opacity) is exactly how the negating-flag bug lived in production for years.

Bloomberg won't tell you their numbers reproduce. We will.

Citations

  1. XBRL Specification 2.1, §6.7.5 — preferredLabel does not modify underlying values
  2. SEC EDGAR companyfacts API — https://data.sec.gov/api/xbrl/companyconcept/
  3. Interactive Market Data internal: project_data_correctness_2026_04_22 (the original investigation)
  4. Interactive Market Data internal: project_data_foundation_2026_04_22 (data layer rebuild)