Methodology

How this dashboard is built — pipeline, sources, freshness, and known gaps.

The principle

Every chart on this dashboard cites a primary government or authoritative source. No Wikipedia, no blog aggregators, no Statista or StatisticsTimes. When something cannot be traced to a primary source, it is either flagged as an explicit data gap or omitted entirely. The goal is research-grade traceability — a journalist or academic should be able to follow any chart back to the table number it came from.

Source-tier hierarchy

Sources are evaluated in this order of preference:

  1. Primary government data: Census, NFHS, SRS, MoSPI, RBI, ministry portals (MoEFCC, MoHFW, MoA&FW, MoT, MoRTH), state-government statistical handbooks. Used wherever available.
  2. Quasi-government / academic-commissioned: NITI Aayog reports, PRS India budget analyses, NHSRC dossiers, British Council’s 2019 Durga Puja study, UNESCO ICH listings. Cited when primary sources are unavailable; clearly attributed.
  3. Industry / aggregator: IBEF, FAI, Tea Board. Cross-checked against primary sources where possible.
  4. Reconstructed / derived: NFHS-4 values for post-2017 split districts (Alipurduar, Kalimpong, Jhargram, Paschim/Purba Bardhaman, Paschim/Purba Medinipur) reconstructed from parent-district values. Always flagged in the chart and tooltip.
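
The order of preference above can be sketched in a few lines of Python. This is a minimal illustration, not the dashboard's actual code: the names Tier, Candidate, pick_source, and needs_derived_flag are hypothetical, and it assumes each candidate source has already been assigned one of the four tiers.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    """Lower value = more preferred, mirroring the numbered list (hypothetical names)."""
    PRIMARY_GOVERNMENT = 1
    QUASI_GOVERNMENT = 2
    INDUSTRY_AGGREGATOR = 3
    RECONSTRUCTED = 4


@dataclass(frozen=True)
class Candidate:
    name: str   # label of the source, e.g. "NFHS-5"
    tier: Tier  # tier assigned to this source


def pick_source(candidates: list[Candidate]) -> Optional[Candidate]:
    """Return the most-preferred candidate, or None to signal a data gap."""
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.tier)


def needs_derived_flag(source: Candidate) -> bool:
    """Tier 4 (reconstructed) values are always flagged in the chart and tooltip."""
    return source.tier == Tier.RECONSTRUCTED
```

Returning None for an empty candidate list keeps the "flag as a data gap or omit" decision explicit for the caller instead of silently falling back to a weaker source.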

Freshness window

Wherever possible, every chart uses data published within two years of the latest available release. Older data is shown only when it is the most recent published value (e.g., Census 2011 mode-share to work, since Census 2021 has not been conducted; NSS 76th-round housing conditions, since NSS does not refresh that survey annually). Vintage caveats are surfaced on every chart that uses pre-2020 data.
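
As a rough sketch, the two rules in this paragraph reduce to two small checks. The function and constant names are hypothetical, and the comparison by calendar year is an assumption; the actual pipeline may compare full publication dates.

```python
FRESHNESS_WINDOW_YEARS = 2  # "within two years of the latest available release"
CAVEAT_CUTOFF_YEAR = 2020   # charts using pre-2020 data carry a vintage caveat


def within_freshness_window(data_year: int, latest_release_year: int) -> bool:
    """True if the data is at most two years older than the latest release."""
    return latest_release_year - data_year <= FRESHNESS_WINDOW_YEARS


def needs_vintage_caveat(data_year: int) -> bool:
    """True if the chart should surface a vintage caveat."""
    return data_year < CAVEAT_CUTOFF_YEAR
```

Under these assumptions, Census 2011 data passes the freshness check (it is itself the latest available release) but still carries a vintage caveat because it predates 2020.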

The pipeline

Data refresh is fully automated via GitHub Actions. Three cron schedules match how often each group of sources publishes:

  • Daily (02:00 UTC) — Open-Meteo weather and AQI for the climate and environment pages.
  • Monthly (1st of month, 03:00 UTC) — light fetchers that re-pull data.gov.in agriculture, monthly GST settlements, and similar fast-API sources.
  • Quarterly (1st of Jan / Apr / Jul / Oct, 04:00 UTC) — heavy PDF fetchers for sources that publish slow-cycle reports: NFHS, UDISE+, MoSPI Statistical Appendices, RBI Handbook, NCRB, ISFR, CPCB.
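
For reference, the three schedules can be written as standard five-field cron expressions, which GitHub Actions evaluates in UTC. The expressions below are derived from the times listed above and are an assumption; the repository's workflow files may phrase them differently. The helper due_cadences is hypothetical and only illustrates which schedule fires at a given moment.

```python
from datetime import datetime, timezone

# Cron fields: minute hour day-of-month month day-of-week (all times UTC).
# Assumed expressions, derived from the schedule described in the text.
CRON = {
    "daily": "0 2 * * *",
    "monthly": "0 3 1 * *",
    "quarterly": "0 4 1 1,4,7,10 *",
}


def due_cadences(now: datetime) -> list[str]:
    """Return the cadences scheduled for the minute `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    due: list[str] = []
    if now.minute != 0:
        return due
    if now.hour == 2:
        due.append("daily")
    if now.day == 1 and now.hour == 3:
        due.append("monthly")
    if now.day == 1 and now.month in (1, 4, 7, 10) and now.hour == 4:
        due.append("quarterly")
    return due
```

Staggering the three jobs by an hour means that on a quarter boundary such as 1 January the daily, monthly, and quarterly runs start at 02:00, 03:00, and 04:00 UTC rather than all at once.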

Each fetcher writes a manifest to data/raw/<source>/_manifest.json capturing HTTP status, SHA-256, and retrieval date for every URL. Transform scripts then derive curated JSON in data/processed/, which the dashboard loads as static assets. Provenance for any datapoint can be traced back through the manifest.
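
A minimal sketch of what one manifest record could look like, using only the Python standard library. The field names (url, http_status, sha256, retrieved) and the helpers manifest_entry and write_manifest are assumptions for illustration; only the three captured facts (HTTP status, SHA-256, retrieval date) and the _manifest.json filename come from the description above.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def manifest_entry(url: str, status: int, body: bytes, retrieved: date) -> dict:
    """Build one manifest record for a fetched URL (field names are assumed)."""
    return {
        "url": url,
        "http_status": status,
        "sha256": hashlib.sha256(body).hexdigest(),  # digest of the raw bytes
        "retrieved": retrieved.isoformat(),          # e.g. "2025-01-01"
    }


def write_manifest(source_dir: Path, entries: list[dict]) -> Path:
    """Write the entries to <source_dir>/_manifest.json and return the path."""
    source_dir.mkdir(parents=True, exist_ok=True)
    path = source_dir / "_manifest.json"
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return path
```

Recording the digest of the raw bytes makes it possible to tell later whether a source file changed between two retrievals, even when the URL and HTTP status stayed the same.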

Known data gaps

  • Census 2021 has not been conducted. Mode-share, commute distance, religion, mother tongue, and SC/ST distributions are anchored to Census 2011 and labelled accordingly.
  • District-level data is unavailable for many economic indicators (e.g., MSME counts and GVA by district). State-level totals are shown with a note where district splits don’t exist.
  • Air-quality monitoring outside Kolkata is sparse; the non-Kolkata gap panel on /ecology illustrates festival-survey data only.
  • Crime statistics are hidden in production until SLL undercounting and reporting heterogeneity are reconciled with NCRB’s methodology notes.

Source list (44)

Organised by domain. Each source is linked to its URL where one is available; otherwise the source is referenced inline on the relevant chart.

Demographics, Health & Population

  • NFHS-5 (2019–21): Fertility, mortality, anthropometry, anaemia, immunisation, institutional delivery (district level).
  • NFHS-4 (2015–16): Pre-NFHS-5 baseline; reconstructed for post-2017 split districts.
  • SRS (Sample Registration System, RGI): IMR, MMR, NNMR, life expectancy.
  • NHSRC India Health Dossier (2021): HDI, hospital infrastructure, DALYs, OOPE.
  • NHA 2021–22: National Health Accounts; out-of-pocket health expenditure.
  • NCVBDC, IDSP, Swasthya Sathi dashboards: Vector-borne diseases, surveillance, scheme coverage.

Economy, Work & Industry

Agriculture, Land & Water

Education

  • UDISE+ 2023–24 (Ministry of Education): Schools, enrolment, dropout rates.
  • NIRF 2024: Higher-education rankings.
  • WBBSE / WBCHSE: Madhyamik and HS results.
  • Kanyashree, SVMCM, Aikyashree dashboards: Scholarship disbursement.
  • PM POSHAN portal: Mid-day-meal coverage and supplementation.

Climate, Environment & Mobility

Tourism, Culture & Investment

  • MoT India Tourism Statistics (ITS): Domestic and foreign tourist arrivals.
  • HVS India Hotel Performance reports: Branded-hotel occupancy and ADR.
  • British Council, Durga Puja Economic Impact (2019): Sector-wise economic impact of Durga Puja.
  • UNESCO Intangible Cultural Heritage: ICH inscriptions.
  • BGBS official summaries (WB Govt): Investment commitments, sector-wise.

Housing, Diversity & Crime

  • HCES 2022–23 (NSO): MPCE, food share, Engel curves, Gini.
  • NSS 76 (Housing Conditions): Pucca houses, piped water, sanitation, LPG.
  • e-Shram dashboard: Unorganised-worker registrations.
  • TUS 2019 (Time Use Survey): Unpaid domestic and care-work hours.
  • NCRB Crime in India: IPC and SLL offences (data shown only outside production until verified).
  • Census 2011 C-01 / C-17 / A-10: Religion, mother tongue, SC/ST distribution.

Reproducibility: the data pipeline is open source at github.com/canindya/State-WestBengal. Every fetcher, transform, and chart is auditable. Spot a source that needs cross-checking? File an issue or email.