Methodology & Data Sources

PlainEmissions presents the EU EDGAR v8.0 greenhouse-gas dataset as plain, comparable per-country and per-sector pages. This page documents how the raw EDGAR workbooks become the figures you see: the source, sector-taxonomy harmonization, unit conversions, vintage tracking, and limitations. It also explains how three other authoritative datasets (World Bank, Climate TRACE, UNFCCC) differ from EDGAR.

Data sources

PlainEmissions currently ingests one primary dataset, EU EDGAR, into its database. Every number rendered on a country, sector, or ranking page is an EDGAR figure. Three other authoritative datasets (World Bank Climate Data, Climate TRACE, and UNFCCC national inventories) are documented below and referenced throughout our research and guide pages for methodological context and contrast. Their record-level data is not currently loaded into our database, so no page on this site displays a World Bank, Climate TRACE, or UNFCCC number directly.

1. EU EDGAR (Joint Research Centre) — ingested, the source of every figure on this site

The Emissions Database for Global Atmospheric Research, maintained by the European Commission's Joint Research Centre, is a bottom-up model that estimates greenhouse-gas emissions for every country and major sector from 1970 to the most recent reporting year (typically two years behind real time). License: CC BY 4.0. Source: edgar.jrc.ec.europa.eu. We treat EDGAR as the canonical cross-country comparability layer because the same methodology is applied to every country, so politics and reporting capacity do not enter the estimates.

2. World Bank Climate Knowledge Portal — referenced, not currently ingested

The World Bank's Climate Change Knowledge Portal aggregates country-level climate and emissions indicators with broad historical coverage and high update frequency. License: CC BY 4.0. Source: climateknowledgeportal.worldbank.org. We describe the World Bank layer here for context: it is the dataset researchers would use for macro indicators such as national totals, per-capita normalization, and GDP intensity. Its records are not loaded into our database and no figure on this site is drawn from it.

3. Climate TRACE — referenced, not currently ingested

Climate TRACE is an independent coalition that produces emissions estimates using satellite observations and machine learning at facility-level resolution. License: CC BY 4.0. Source: climatetrace.org. Climate TRACE is the only one of these datasets that does not rely on self-reported inputs, which makes it, in principle, the strongest independent check on national inventories. Its records are not loaded into our database and no figure on this site is drawn from it.

4. UNFCCC National Inventory Submissions — referenced, not currently ingested

Annex I parties and, since 2024, all parties to the UN Framework Convention on Climate Change submit national greenhouse-gas inventories using IPCC reporting guidelines. License: public. Source: unfccc.int. UNFCCC inventories are the official legal record under international climate-treaty obligations, but they are self-reported and reporting capacity varies dramatically across countries. Their records are not loaded into our database and no figure on this site is drawn from them.

Supporting references

For cross-checks and complementary indicators we also consult the U.S. EPA Climate Change Indicators (US-specific sectoral validation), the World Bank total greenhouse-gas emissions indicator, OECD environmental indicators, IEA energy and emissions statistics, and Wikipedia's list of countries by greenhouse-gas emissions (as a cross-source sanity check, not a primary source).

Harmonization steps

EDGAR's raw workbooks use their own country codes, sector taxonomies, gas categorizations, and units. The ETL pipeline performs the following normalization steps in order:

Country codes - every record is mapped to ISO 3166-1 alpha-3. Historical entities (USSR, Yugoslavia, etc.) are mapped to a controlled successor-state set documented in the database countries table.
Sector taxonomy - upstream categories are mapped to a common IPCC-aligned hierarchy: Energy, Transport, Buildings, Industry, Agriculture, Land use & forestry (LULUCF), Waste, Fugitive emissions. Where upstream sub-sectors do not cleanly map, the record is retained at the lowest unambiguous parent.
Gas normalization - gases are tracked individually (CO2, CH4, N2O, HFC, PFC, SF6, NF3) and also expressed as CO2-equivalent using IPCC AR6 100-year global-warming potential multipliers. The native unit value is always retained alongside the CO2e value for transparency.
Provenance - every fact-table row records a source_code, and the schema is designed to hold WB_CLIMATE, CLIMATE_TRACE, and UNFCCC rows alongside EDGAR's for future multi-source comparison. Today only EDGAR rows are populated in the live database.
No interpolation - when a country-year-sector-gas combination is missing from EDGAR, it is left missing. We do not impute and we do not back-fill from other sources.
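
The sector, gas, provenance, and no-interpolation steps above can be sketched roughly as follows. This is a minimal illustration, not the production ETL: the function names, the row layout, and the category-code mapping are hypothetical, and the multiplier table is a small illustrative subset of AR6 100-year values (the real table may differ, for example in how fossil and non-fossil methane are handled).

```python
from typing import Optional

# Illustrative AR6 100-year GWP multipliers. Assumption: the production table
# may differ, e.g. in how fossil and non-fossil methane are treated.
GWP100 = {"CO2": 1.0, "CH4": 29.8, "N2O": 273.0}

# Hypothetical mapping from IPCC-style category codes to the common hierarchy.
SECTOR_MAP = {
    "1.A.1": "Energy",
    "1.A.3": "Transport",
    "1.A.4": "Buildings",
    "1.B": "Fugitive emissions",
    "2": "Industry",
    "4": "Waste",
}


def resolve_sector(code: str) -> Optional[str]:
    """Walk up the category code until a mapped parent is found."""
    parts = code.split(".")
    while parts:
        sector = SECTOR_MAP.get(".".join(parts))
        if sector is not None:
            return sector
        parts.pop()  # drop the most specific component and retry
    return None  # no mapped parent at any level


def to_co2e(gas: str, native_value: Optional[float]) -> Optional[float]:
    """Convert a native gas quantity to CO2e; missing values stay missing."""
    if native_value is None:
        return None  # no imputation, no back-fill
    return native_value * GWP100[gas]


def normalize(row: dict) -> dict:
    """Build a fact row that keeps the native value next to the CO2e value."""
    return {
        "source_code": "EDGAR",
        "sector": resolve_sector(row["category"]),
        "gas": row["gas"],
        "native_value": row["value"],
        "co2e_value": to_co2e(row["gas"], row["value"]),
    }
```

For example, a hypothetical upstream row with category "1.A.3.b" has no exact entry in the mapping, so it is kept at its mapped parent "1.A.3" (Transport), and a row whose value is missing yields a missing CO2e value rather than an estimate.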

Vintage tracking

Every EDGAR record carries its upstream release vintage. The data_sources table registers the other three datasets for future reference, but only EDGAR's row currently reflects an in-database vintage. When EDGAR issues a new release, only its rows are updated; figures already cited in published research pages keep the vintage they were drawn from and are not retroactively rewritten.
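
One way to picture this behavior is a citation that pins the vintage it was taken from, so a later refresh of the fact table leaves it unchanged. The sketch below is an assumption-laden illustration: the Fact and Citation classes, the in-memory table, the "v9.0" vintage label, and all numeric values are hypothetical placeholders, not the site's schema or real EDGAR figures.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Fact:
    source_code: str
    vintage: str
    iso3: str
    year: int
    co2e_value: float  # placeholder values only, not real data


@dataclass(frozen=True)
class Citation:
    """A figure as quoted on a research page, pinned to its vintage."""
    source_code: str
    vintage: str
    value: float


FactKey = Tuple[str, str, int]


def refresh(table: Dict[FactKey, Fact], fact: Fact) -> None:
    """Load or overwrite the live row for this source, country, and year."""
    table[(fact.source_code, fact.iso3, fact.year)] = fact


def cite(fact: Fact) -> Citation:
    """Copy the value and vintage so later refreshes cannot alter the citation."""
    return Citation(fact.source_code, fact.vintage, fact.co2e_value)


table: Dict[FactKey, Fact] = {}
refresh(table, Fact("EDGAR", "v8.0", "FRA", 2020, 100.0))
quoted = cite(table[("EDGAR", "FRA", 2020)])

# A hypothetical later release replaces the live row but not the citation.
refresh(table, Fact("EDGAR", "v9.0", "FRA", 2020, 98.0))
```

After the second refresh, the live row reports the newer vintage while the quoted citation still carries "v8.0" and its original value.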

Update cadence

Only EDGAR is currently loaded into our database and refreshed. For context, should we load the other three datasets in the future, the upstream publication cadences of all four are:

EDGAR (ingested): annual, typically September-November
World Bank (referenced only): monthly indicator refresh
Climate TRACE (referenced only): quarterly
UNFCCC (referenced only): rolling, country-by-country

PlainEmissions refreshes within four weeks of a major EDGAR release. Minor revisions (single-country corrections) propagate via the corrections-overlay framework documented in our about page.

Limitations

Greenhouse-gas measurement is inherently uncertain. Even the best satellite estimates have error bands of single-digit to double-digit percent at the country level for some sectors. We surface this uncertainty as multi-source spread rather than hiding it inside a single number.
LULUCF (land use, land-use change, and forestry) is the most-disputed sector across all four sources. Country pages render LULUCF separately and note when source disagreement exceeds 50%.
UNFCCC inventory coverage is incomplete for some developing countries; for those countries the comparable EDGAR or Climate TRACE figure is the most useful reference.
Climate TRACE's facility-level estimates are most accurate for large point sources (power plants, cement, steel, refineries) and less accurate for diffuse sources (agriculture, transport).
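
The source does not define how source disagreement is computed, so the following is only one plausible reading of a 50% threshold: the range of the available estimates divided by the largest absolute estimate. The function names and the sample values are hypothetical, and dividing by the largest absolute value is a choice made here so that net-negative sectors such as LULUCF do not break the ratio.

```python
from typing import Mapping, Optional


def relative_spread(estimates: Mapping[str, float]) -> Optional[float]:
    """Range of estimates divided by the largest absolute estimate.

    Returns None when fewer than two sources are available, since a
    single source cannot disagree with itself.
    """
    values = list(estimates.values())
    if len(values) < 2:
        return None
    scale = max(abs(v) for v in values)
    if scale == 0:
        return 0.0  # all estimates are zero, so there is no disagreement
    return (max(values) - min(values)) / scale


def exceeds_threshold(estimates: Mapping[str, float], threshold: float = 0.5) -> bool:
    """True when the spread is defined and strictly above the threshold."""
    spread = relative_spread(estimates)
    return spread is not None and spread > threshold
```

With only one source loaded, relative_spread returns None and nothing is flagged; with two made-up estimates of 100 and 40, the spread is 0.6 and the note would be shown.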

How figures are sourced

Country and sector data pages are loaded directly from the database and rendered server-side - numbers are never modified between source row and page. Editorial research pages and methodology notes are grounded in the same upstream datasets: EU EDGAR, the World Bank, Climate TRACE, and UNFCCC national inventory submissions. Every research page cites the specific upstream figures it references so claims remain verifiable.

Last updated: 2026-05-20