Methodology & Data Sources
PlainEmissions reconciles four independent measurements of global greenhouse-gas emissions into one comparable dataset. This page documents every step of how raw upstream data becomes the figures you see on country and sector pages — sources, taxonomy harmonization, unit conversions, vintage tracking, and limitations.
Data sources
1. EU EDGAR (Joint Research Centre)
The Emissions Database for Global Atmospheric Research, maintained by the European Commission's Joint Research Centre, is a bottom-up model that estimates greenhouse-gas emissions for every country and major sector from 1970 to the most recent reporting year (typically two years behind real time). License: CC BY 4.0. Source: edgar.jrc.ec.europa.eu. We treat EDGAR as the canonical cross-country comparability layer because the same methodology is applied to every country, so differences in national politics and reporting capacity do not enter the estimates.
2. World Bank Climate Change Knowledge Portal
The World Bank's Climate Change Knowledge Portal aggregates country-level climate and emissions indicators with broad historical coverage and high update frequency. License: CC BY 4.0. Source: climateknowledgeportal.worldbank.org. We use the World Bank layer for macro indicators (national totals, per-capita normalization, GDP-intensity) where its data lineage runs through the same upstream feeds international ESG analysts already trust.
3. Climate TRACE
Climate TRACE is an independent coalition that produces emissions estimates using satellite observations and machine learning at facility-level resolution. License: CC BY 4.0. Source: climatetrace.org. Climate TRACE is the only source that does not rely on self-reported inputs, making it the strongest independent check on national inventories.
4. UNFCCC National Inventory Submissions
Annex I and (since 2024) all parties to the UN Framework Convention on Climate Change submit national greenhouse-gas inventories using IPCC reporting guidelines. License: public. Source: unfccc.int. UNFCCC inventories are the official legal record under international climate-treaty obligations, but they are self-reported and reporting capacity varies dramatically across countries.
Supporting references
For cross-checks and complementary indicators we also consult the U.S. EPA Climate Change Indicators (US-specific sectoral validation), the World Bank total greenhouse-gas emissions indicator, OECD environmental indicators, IEA energy and emissions statistics, and Wikipedia's list of countries by greenhouse-gas emissions (as a cross-source sanity check, not a primary source).
Harmonization steps
Each upstream source uses different country codes, sector taxonomies, gas categorizations, and units. The ETL pipeline performs the following normalization steps in order:
- Country codes: every record is mapped to ISO 3166-1 alpha-3. Historical entities (USSR, Yugoslavia, etc.) are mapped to a controlled successor-state set documented in the database countries table.
- Sector taxonomy: upstream categories are mapped to a common IPCC-aligned hierarchy: Energy, Transport, Buildings, Industry, Agriculture, Land use & forestry (LULUCF), Waste, Fugitive emissions. Where upstream sub-sectors do not cleanly map, the record is retained at the lowest unambiguous parent.
- Gas normalization: gases are tracked individually (CO2, CH4, N2O, HFC, PFC, SF6, NF3) and also expressed as CO2-equivalent using IPCC AR6 100-year global-warming potential multipliers. The native unit value is always retained alongside the CO2e value for transparency.
- Provenance: every fact-table row records its source_code (EDGAR, WB_CLIMATE, CLIMATE_TRACE, UNFCCC) so multi-source comparisons are queryable without joins.
- No interpolation: when a country-year-sector-gas combination is missing from an upstream source, it is left missing. We do not impute and we do not back-fill from other sources.
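As a rough illustration of the steps above, the sketch below normalizes one upstream record. It is not the pipeline's actual code: the lookup tables, the FactRow fields other than source_code, and the harmonize function are hypothetical names. The multipliers are illustrative placeholders to be checked against the IPCC AR6 tables. The sector is assumed to be already mapped, and the parent-fallback rule is not modelled.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup from upstream country labels to ISO 3166-1 alpha-3.
COUNTRY_TO_ISO3 = {"DE": "DEU", "Germany": "DEU", "US": "USA", "United States": "USA"}

# Illustrative 100-year GWP multipliers (assumption: verify against the
# IPCC AR6 tables, which also distinguish fossil from non-fossil methane).
GWP100 = {"CO2": 1.0, "CH4": 27.0, "N2O": 273.0}


@dataclass(frozen=True)
class FactRow:
    iso3: str
    year: int
    sector: str
    gas: str
    native_value: float  # quantity of the gas itself, kept for transparency
    co2e_value: float    # native_value multiplied by the GWP factor
    source_code: str     # provenance, e.g. "EDGAR" or "UNFCCC"


def harmonize(raw: dict, source_code: str) -> Optional[FactRow]:
    """Normalize one upstream record, or return None if the value is missing."""
    value = raw.get("value")
    if value is None:
        # No interpolation: a missing upstream value stays missing.
        return None
    iso3 = COUNTRY_TO_ISO3[raw["country"]]  # KeyError flags an unmapped code
    gwp = GWP100[raw["gas"]]                # KeyError flags an unknown gas
    return FactRow(
        iso3=iso3,
        year=raw["year"],
        sector=raw["sector"],
        gas=raw["gas"],
        native_value=value,
        co2e_value=value * gwp,
        source_code=source_code,
    )
```

Returning None for a missing value, instead of a zero or an estimate, keeps gaps visible downstream, which is the behaviour the no-interpolation rule describes.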
Vintage tracking
Every record carries the upstream release vintage. The data_sources table records the current vintage in use for each source. When a new release lands, only that source's rows are updated; historic vintages are not retroactively rewritten in published research pages.
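One way to picture the per-source update is the SQLite sketch below. Only the data_sources table name and the source_code column come from this page. The emissions table, its other columns, and the make_db and apply_release helpers are assumptions made for illustration, and the real schema may differ.

```python
import sqlite3
from typing import Iterable, Tuple

Row = Tuple[str, int, str, str, float]  # iso3, year, sector, gas, co2e_value


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE data_sources (
            source_code TEXT PRIMARY KEY,
            current_vintage TEXT NOT NULL
        );
        CREATE TABLE emissions (
            iso3 TEXT, year INTEGER, sector TEXT, gas TEXT,
            co2e_value REAL, source_code TEXT, vintage TEXT
        );
        """
    )
    return conn


def apply_release(conn: sqlite3.Connection, source_code: str,
                  vintage: str, rows: Iterable[Row]) -> None:
    """Replace one source's rows with a new vintage, leaving other sources alone."""
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM emissions WHERE source_code = ?", (source_code,))
        conn.executemany(
            "INSERT INTO emissions "
            "(iso3, year, sector, gas, co2e_value, source_code, vintage) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            [(*row, source_code, vintage) for row in rows],
        )
        conn.execute(
            "INSERT OR REPLACE INTO data_sources (source_code, current_vintage) "
            "VALUES (?, ?)",
            (source_code, vintage),
        )
```

Running the delete, insert, and vintage update in one transaction means a failed load cannot leave a source half-updated or mislabelled.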
Update cadence
- EDGAR: annual, typically September-November
- World Bank: monthly indicator refresh
- Climate TRACE: quarterly
- UNFCCC: rolling, country-by-country
PlainEmissions refreshes within four weeks of a major upstream release. Minor revisions (single-country corrections) propagate via the corrections-overlay framework documented on our About page.
Limitations
- Greenhouse-gas measurement is inherently uncertain. Even the best satellite estimates have error bands of single-digit to double-digit percent at the country level for some sectors. We surface this uncertainty as multi-source spread rather than hiding it inside a single number.
- LULUCF (land use, land-use change, and forestry) is the most-disputed sector across all four sources. Country pages render LULUCF separately and note when source disagreement exceeds 50%.
- UNFCCC inventory coverage is incomplete for some developing countries; for those countries the comparable EDGAR or Climate TRACE figure is the most useful reference.
- Climate TRACE's facility-level estimates are most accurate for large point sources (power plants, cement, steel, refineries) and less accurate for diffuse sources (agriculture, transport).
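The multi-source spread and the 50% LULUCF disagreement note can be sketched as follows. This page does not define how disagreement is computed, so the formula here (range divided by the absolute midpoint of the available estimates) is an assumption. The relative_spread and exceeds_threshold functions are hypothetical.

```python
from typing import Mapping, Optional


def relative_spread(estimates: Mapping[str, Optional[float]]) -> Optional[float]:
    """Spread across sources as (max - min) / |midpoint|, or None if undefined.

    Missing estimates (None) are skipped, never imputed. The formula is an
    assumed definition, chosen so that negative LULUCF values (net sinks)
    are handled.
    """
    present = [v for v in estimates.values() if v is not None]
    if len(present) < 2:
        return None  # one source cannot disagree with itself
    lo, hi = min(present), max(present)
    midpoint = (lo + hi) / 2
    if midpoint == 0:
        return None  # relative spread is undefined around zero
    return (hi - lo) / abs(midpoint)


def exceeds_threshold(estimates: Mapping[str, Optional[float]],
                      threshold: float = 0.5) -> bool:
    """True when the sources disagree by more than the threshold (default 50%)."""
    spread = relative_spread(estimates)
    return spread is not None and spread > threshold
```

Other definitions, such as the range divided by the mean or by one reference source, would flag a different set of country-years, so the threshold only has meaning alongside the formula used.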
Editorial pipeline
Country and sector data pages are loaded directly from the database and rendered server-side — numbers are never modified between source row and page. Editorial research pages and methodology notes are researched and drafted with computational assistance and reviewed by Kiznis Studio before publication. Every research page cites the specific upstream figures it references so claims remain verifiable.
Last reviewed: