1. Overview

This dataset provides records of public procurement payments made to civil society organisations — including registered charities, community interest companies, co-operatives, and other third sector bodies — across the United Kingdom. It was produced by the UK Third and Civil Society Sector Database project, which collects, processes, and links public administrative data on civil society organisations throughout the UK.

The data is drawn from three open data sources: central government transparency spending data, NHS payment records, and Contracts Finder awarded contract notices. Each payment record links a public sector funder to a civil society supplier, providing a detailed picture of how public money flows to the third sector through procurement and commissioning.

The dataset contains 944,449 payment records covering 11,862 organisations and 1,558 public funders, spanning the period from 2010 to 2025. Together, these records offer a comprehensive view of the scale, distribution, and evolution of public procurement relationships with civil society across the UK.

944,449 Payment Records
11,862 Organisations
1,558 Public Funders
2010–2025 Years Covered

This guide covers two related procurement datasets published by the project:

  • Procurement payments (cso-procurement-dataset.csv) — the payment-level dataset described above and in Sections 3–4: 944,449 payments made to civil society organisations, drawn from central government, NHS and Contracts Finder sources.
  • Contracts Finder contracts (contractsfinder-enriched.csv) — a complementary, contract-level dataset of 406,113 awarded contracts published on Contracts Finder, enriched with the project’s supplier classification. Unlike the payments dataset, this file covers contracts awarded to all suppliers (not only civil society), with civil society suppliers identified by a register match, and adds rich contract metadata (descriptions, sector codes, values, dates and the buyer’s voluntary-sector suitability flags). It is documented in Section 3B.

2. What are Procurement Records?

UK public bodies are required to publish details of their spending over certain thresholds as part of government transparency commitments. This dataset draws from three open data initiatives that collect and standardise these spending records, filtered to include only payments made to civil society organisations identified in the TCSS Organisation Register.

Central Government Spending

UK government departments publish monthly CSV files of all transactions exceeding £25,000 as part of HM Government’s spending transparency commitments. These files are collected and harmonised by the centgovspend project, which aggregates thousands of files from ministerial and non-ministerial departments and cleans them for consistency and quality. Central government spending accounts for 91.0% of the records in this dataset.

NHS Spending

Payment records from NHS Trusts and Clinical Commissioning Groups (CCGs) are collected by the NHSSpend project. This covers payments exceeding £25,000 made by NHS institutions across England, spanning approximately 2010 to April 2020, when data collection concluded. NHS spending accounts for 7.0% of records.

Contracts Finder

Contracts Finder is the UK government’s online portal for public sector procurement opportunities and awarded contracts. Awarded contract notices are scraped by the TCSS project’s own Contracts Finder collector. Unlike the payment-level data from the other two sources, Contracts Finder provides contract-level information (awarded amounts and dates). These records account for 2.0% of the dataset.

3. Dataset Contents

This section documents the procurement payments dataset (cso-procurement-dataset.csv); the Contracts Finder contracts dataset is documented in Section 3B. The payments dataset contains 944,449 records across 18 fields. Each row represents a single payment from a public sector funder to a civil society organisation. The fields are organised into four groups: identifiers, payment details, funder information, and summary statistics.

Identifiers & Organisation Info

Field Description Type Coverage
uid Unique organisation identifier from the TCSS Organisation Register (e.g., GB-COH-12345678, GB-CHC-1234567, GB-SC-SC012345) Text 93.4%
organisation_name Organisation name as recorded in the source data Text 85.6%
supplier_id Internal supplier identifier assigned during pipeline processing (e.g., S192664) Text 100%
supplier_type Organisation type: Charity, CIC, Co-operative/Mutual, or Other CSO Text 100%
supplier_type_detail A more granular supplier classification, refining supplier_type Text 100%
is_multi_supplier Indicates whether the source record named more than one supplier (True/False) Boolean 100%

Payment Details

Field Description Type Coverage
payment_year Calendar year of the payment Numeric 100%
payment_date Date of payment (YYYY-MM-DD format) Date 100%
amount Payment amount in pounds sterling (£); negative values indicate reversals or corrections Numeric 100%
data_source Source dataset: Central Government, NHSSpend, Contracts Finder, or contractsfinder Text 100%

Funder Information

Field Description Type Coverage
funder_name Canonical funder name (uppercase) Text 100%
funder_name_alt Alternative funder name, where applicable Text 1.0%
funder_id Unique funder identifier (e.g., F02683) Text 100%
funder_type Funder classification: UK Government, NHS, Local Government, Education Institution, CSO, Police, Fire and Rescue, Private Sector, Other, or Junk/Invalid Text 100%
funder_type_detail A more granular funder classification, refining funder_type (for example, distinguishing types of body within UK Government) Text 100%
funder_type_alt Classification of the alternative funder name Text 1.4%

Note: The funder_name_alt and funder_type_alt fields are populated only where an alternative funder name was assigned during the classification pipeline. This primarily applies to Contracts Finder funders that were manually refined to their parent Central Government department (e.g., DVSA mapped to DFTRANSPORT).

Summary Statistics

Field Description Type Coverage
total_value_payments_to_org Total value of all payments to this organisation across the full dataset (£) Numeric 100%
total_number_payments_to_org Total number of payments to this organisation across the full dataset Numeric 100%
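
The field types above can be applied when loading the file. The following is a minimal sketch using only Python's standard library. It assumes the CSV header matches the field names listed above, that booleans are serialised as True/False, and that the file is UTF-8 encoded; the two sample rows are invented for illustration and carry only a subset of the 18 fields.

```python
import csv
import io
from datetime import date


def parse_payment(row):
    """Convert one raw CSV row (all strings) into typed values."""
    typed = dict(row)
    typed["payment_year"] = int(row["payment_year"])
    typed["payment_date"] = date.fromisoformat(row["payment_date"])  # YYYY-MM-DD
    typed["amount"] = float(row["amount"])  # negative = reversal or correction
    typed["is_multi_supplier"] = row["is_multi_supplier"] == "True"
    typed["uid"] = row["uid"] or None  # blank means no register match
    return typed


def read_payments(handle):
    """Read payment rows from any open text handle."""
    return [parse_payment(row) for row in csv.DictReader(handle)]


# Invented two-row extract. For the real file, pass
# open("cso-procurement-dataset.csv", newline="", encoding="utf-8") instead.
SAMPLE = io.StringIO(
    "uid,supplier_id,payment_year,payment_date,amount,is_multi_supplier\n"
    "GB-CHC-1234567,S000001,2019,2019-03-31,30000.00,False\n"
    ",S000002,2019,2019-04-30,-26500.50,False\n"
)
payments = read_payments(SAMPLE)
```

Keeping a blank uid as None makes the unmatched records easy to separate from register-matched ones in later steps.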

3B. The Contracts Finder Enriched Dataset

Alongside the payment-level dataset, the project publishes a second, complementary file: contractsfinder-enriched.csv. This is a contract-level dataset of awarded contract notices published on Contracts Finder, enriched with the project’s supplier classification and register linkage.

Two features distinguish it from the payments dataset. First, it is contract-level rather than payment-level: each row is a single awarded contract notice, carrying the contract title, description, sector codes, estimated and awarded values, key dates, and the buyer’s own suitability and award flags. Second, it covers contracts awarded to all suppliers, not only civil society organisations — civil society suppliers are identified by a register match (the uid field), which is present for 4.7% of records. Retaining the non-civil-society awards is what makes it possible to measure civil society’s share of the contract market and to test the reliability of the buyer’s voluntary-sector flags.

406,113 Awarded Contracts
96,361 Suppliers
4,687 Buyers
19,204 Awards to CSOs

The dataset contains 406,113 records across 52 fields, organised into the groups below.

Notice & Contract Details

Field Description Type Coverage
id Internal contract/notice identifier Text 100%
noticeIdentifier Contracts Finder notice reference number Text 100%
parentId Identifier of the parent notice (for call-offs under a framework or related notices) Text 10.9%
isSubNotice Whether the record is a sub-notice of a parent notice (True/False) Boolean 100%
noticeType Type of notice (e.g., award notice) Text 100%
noticeStatus Status of the notice (e.g., Awarded) Text 100%
dept Publishing department/buyer label from the source portal Text 100%
organisationName Buyer organisation name as published on the notice Text 100%
title Contract title Text 100%
description Contract description Text 100%
sector Sector tag, where provided by the buyer Text 8.2%
cpvDescription Primary Common Procurement Vocabulary (CPV) category description Text 100%
cpvDescriptionExpanded Expanded CPV category description(s) Text 100%
cpvCodes CPV code(s) for the contract Text 100%
cpvCodesExtended Extended CPV code list Text 100%

Buyer (Funder)

Field Description Type Coverage
funder_name Canonical funder name Text 99.9%
funder_name_alt Alternative funder name, where applicable Text 9.7%
funder_id Unique funder identifier (e.g., F02683) Text 99.9%
funder_type Funder classification: UK Government, NHS, Local Government, Education Institution, Police, Fire and Rescue, Private Sector, CSO, Other, or Junk/Invalid Text 99.9%
funder_type_alt Classification of the alternative funder name Text 15.2%
funder_type_detail A more granular funder classification, refining funder_type Text 99.9%

Supplier & Civil-Society Classification

Field Description Type Coverage
awardedSupplier Supplier name as recorded on the award notice Text 99.7%
normalized_supplier Normalised supplier name used for register matching Text 99.8%
supplier_id Internal supplier identifier Text 99.9%
supplier_type Supplier classification: Charity, CIC, Co-operative/Mutual, Other CSO, or Non-CSO Text 99.9%
supplier_type_detail A more granular supplier classification, refining supplier_type Text 99.8%
uid TCSS Organisation Register identifier; present only where the supplier matched a register entry (i.e., is a civil society organisation) Text 4.7%
organisation_name Matched organisation name from the register Text 4.1%
is_multi_supplier Whether the award named more than one supplier (True/False) Boolean 100%

Values & Dates

Field Description Type Coverage
awardedValue Awarded contract value (£) Numeric 100%
valueLow Lower bound of the estimated contract value (£) Numeric 100%
valueHigh Upper bound of the estimated contract value (£) Numeric 100%
publishedDate Date the notice was published Date 100%
approachMarketDate Date the buyer approached the market, where recorded Date 0.6%
deadlineDate Submission deadline date Date 100%
awardedDate Date the contract was awarded Date 100%
start Contract start date Date 100%
end Contract end date Date 100%
lastNotifableUpdate Date of the last notifiable update to the notice Date 100%

Voluntary-Sector & SME Flags

Field Description Type Coverage
isSuitableForVco Buyer flag: the contract was advertised as suitable for voluntary and community organisations (True/False) Boolean 100%
awardedToVcse Buyer flag: the contract was recorded as awarded to a voluntary, community or social enterprise (True/False) Boolean 100%
isSuitableForSme Buyer flag: the contract was advertised as suitable for small and medium-sized enterprises (True/False) Boolean 100%
awardedToSme Buyer flag: the contract was recorded as awarded to an SME (True/False) Boolean 100%

Note: The four flags above are self-reported by the buyer and are not validated against any register. Analysis of these records shows they are an unreliable guide to which contracts actually went to civil society organisations: among ministerial-department contracts, only about one in six contracts flagged “suitable for” the voluntary sector, and about two in five flagged “awarded to” it, were placed with a register-confirmed civil society organisation. The register-matched supplier_type and uid fields provide the validated classification.
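
The reliability check described in the note can be reproduced by treating a populated uid as the validated signal and asking what share of flagged contracts carry one. The sketch below is illustrative only: it assumes the flags are serialised as the strings True/False and that an unmatched supplier has a blank uid, and the four sample rows are invented.

```python
def flag_precision(contracts, flag):
    """Share of contracts with `flag` set whose supplier has a register uid.

    Returns None when no contract carries the flag.
    """
    flagged = [c for c in contracts if c[flag] == "True"]
    if not flagged:
        return None
    confirmed = sum(1 for c in flagged if c["uid"])
    return confirmed / len(flagged)


# Invented rows; real input would come from contractsfinder-enriched.csv.
sample_contracts = [
    {"isSuitableForVco": "True", "awardedToVcse": "True", "uid": "GB-CHC-1234567"},
    {"isSuitableForVco": "True", "awardedToVcse": "True", "uid": ""},
    {"isSuitableForVco": "True", "awardedToVcse": "False", "uid": ""},
    {"isSuitableForVco": "False", "awardedToVcse": "False", "uid": "GB-COH-12345678"},
]
awarded_precision = flag_precision(sample_contracts, "awardedToVcse")
suitable_precision = flag_precision(sample_contracts, "isSuitableForVco")
```

The last sample row shows the opposite failure: a register-confirmed supplier with neither flag set, which this precision measure does not count.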

Geography

Field Description Type Coverage
postcode Supplier postcode, where provided Text 32.4%
coordinates Geographic coordinates associated with the record Text 100%
region Region code Text 87.5%
regionText Region name Text 87.5%

Linkage Fields

Field Description Type Coverage
is_gap Indicates the record was added to fill a coverage gap during processing (True/False) Boolean 100%
payment_year Year field carried over for cross-dataset linkage (see note) Numeric 100%
payment_date Date field carried over for cross-dataset linkage (see note) Text 100%
total_value_payments_to_org Total value of payments to the matched organisation across the payments dataset (£) Numeric 100%
total_number_payments_to_org Total number of payments to the matched organisation across the payments dataset Numeric 100%

Note: For contract-level analysis, use the contract dates — awardedDate, publishedDate, start and end. The payment_year and payment_date fields are carried from the linkage process to align with the payments dataset and should not be treated as contract dates. A small number of records carry implausible payment_year values; filter on awardedDate instead. Records can be linked to the payments dataset and the TCSS Organisation Register using supplier_id and (for matched civil society organisations) uid.
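
As a sketch of the advice above, the filter below selects contracts by the year of awardedDate and ignores payment_year entirely. It assumes awardedDate is stored in an ISO 8601 form that Python's datetime.fromisoformat accepts (for example 2021-05-04 or 2021-05-04T00:00:00); the actual serialisation in the file may differ, and the sample rows are invented.

```python
from datetime import datetime


def award_year(contract):
    """Return the year of awardedDate, or None if it is missing or unparseable."""
    try:
        return datetime.fromisoformat(contract["awardedDate"]).year
    except (KeyError, TypeError, ValueError):
        return None


def awarded_between(contracts, first_year, last_year):
    """Keep contracts whose award year lies in [first_year, last_year]."""
    kept = []
    for contract in contracts:
        year = award_year(contract)
        if year is not None and first_year <= year <= last_year:
            kept.append(contract)
    return kept


# Invented rows: "b" has an implausible payment_year but a valid awardedDate.
contracts = [
    {"id": "a", "awardedDate": "2021-05-04", "payment_year": "2021"},
    {"id": "b", "awardedDate": "2019-11-20T00:00:00", "payment_year": "1900"},
    {"id": "c", "awardedDate": "", "payment_year": "2020"},
]
recent = awarded_between(contracts, 2019, 2021)
```

Rows with an unparseable date are dropped rather than guessed at; a stricter analysis might log them for inspection instead.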

4. Coverage & Completeness

Core Field Coverage

The core payment fields — payment_date, amount, funder_id and data_source — are present in every record. The uid register identifier is present for 93.4% of records and organisation_name for 85.6%; the remaining records could not be matched to a register entry but are still valid payments linked via supplier_id.

uid
93.4%
payment_date
100%
amount
100%
organisation_name
85.6%

Data Source Breakdown

The data_source field takes four values, two of which refer to Contracts Finder. Central government spending dominates, accounting for over 90% of all records.

Source Records Unique Organisations Share
Central Government 859,100 8,583 91.0%
NHSSpend 66,105 2,202 7.0%
Contracts Finder 15,320 4,523 1.6%
contractsfinder 3,924 1,879 0.4%

Note: The data_source field distinguishes two Contracts Finder collection batches — Contracts Finder and contractsfinder — both drawn from the same portal. The “Unique Organisations” column counts distinct suppliers within each source; because a supplier may appear in more than one source, these figures do not sum to the dataset total.
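
For most analyses the two Contracts Finder tags can be treated as one source. The sketch below merges them and recomputes the record shares from the counts in the table above; the alias mapping and the function names are illustrative choices, not part of the published pipeline.

```python
from collections import Counter

# Map the second collection-batch tag onto the main label (illustrative choice).
SOURCE_ALIASES = {"contractsfinder": "Contracts Finder"}


def merge_source_counts(counts):
    """Combine record counts for tags that refer to the same source."""
    merged = Counter()
    for tag, n in counts.items():
        merged[SOURCE_ALIASES.get(tag, tag)] += n
    return merged


def shares(counts):
    """Percentage share of each source, rounded to one decimal place."""
    total = sum(counts.values())
    return {tag: round(100 * n / total, 1) for tag, n in counts.items()}


# Record counts copied from the Data Source Breakdown table.
published = {
    "Central Government": 859_100,
    "NHSSpend": 66_105,
    "Contracts Finder": 15_320,
    "contractsfinder": 3_924,
}
merged = merge_source_counts(published)
source_share = shares(merged)
```

Merged this way, Contracts Finder accounts for 19,244 records. Unique organisation counts cannot be merged by simple addition, because the same supplier may appear in both batches.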

Funder Type Distribution

Funders are classified into nine types using a multi-layer cascade (see Appendix A2). UK Government departments account for the vast majority of funding.

Funder Type Records Share
UK Government 864,917 91.6%
NHS 67,925 7.2%
Local Government 9,005 1.0%
Education Institution 984 0.1%
CSO 471 <0.1%
Police, Fire and Rescue 403 <0.1%
Private Sector 332 <0.1%
Other 237 <0.1%
Junk/Invalid 175 <0.1%

Supplier Type Distribution

Civil society organisations in the dataset are classified by legal form. “Other CSO” includes companies limited by guarantee and other third sector bodies that do not fall into the three specific categories.

Supplier Type Records Share
Other CSO 654,143 69.3%
Charity 237,269 25.1%
Co-operative/Mutual 34,074 3.6%
CIC 18,963 2.0%

Year-by-Year Record Counts

Record counts vary substantially by year, reflecting the availability of source data over time. Coverage is strongest between 2012 and 2023.

Year Records
2010 5,480
2011 8,365
2012 55,142
2013 123,636
2014 175,834
2015 56,626
2016 31,580
2017 69,060
2018 67,705
2019 70,979
2020 64,894
2021 62,042
2022 67,501
2023 71,664
2024 12,314
2025 1,591

Note: Thirty-six records fall outside the 2010–2025 range shown above. These include a small number with implausible payment years (e.g., 1900, 2027) due to source data errors, as well as seven records from 2005–2009 that predate the main collection period and 26 records dated 2026 from ongoing data collection. All are retained in the dataset for transparency; users conducting temporal analysis may wish to filter to the core 2010–2025 range. The 2024 and 2025 figures are also incomplete as source data collection is ongoing.
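
A minimal sketch of the filtering suggested above. The second range, which also drops the incomplete 2024 and 2025 counts, is an optional stricter choice rather than a project recommendation; the sample rows are invented.

```python
CORE_YEARS = range(2010, 2026)      # 2010 to 2025 inclusive
COMPLETE_YEARS = range(2010, 2024)  # also drops the partial 2024 and 2025 counts


def in_years(rows, years):
    """Keep rows whose payment_year falls within the given range."""
    return [row for row in rows if int(row["payment_year"]) in years]


# Invented rows covering the kinds of outlier described in the note.
rows = [{"payment_year": y} for y in ("1900", "2009", "2015", "2024", "2026", "2027")]
core = in_years(rows, CORE_YEARS)
complete = in_years(rows, COMPLETE_YEARS)
```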

Organisation Register Composition

The uid prefix indicates which source register each organisation originates from in the TCSS Organisation Register. Companies House registrations dominate, reflecting the large number of companies limited by guarantee and other nonprofit company forms. About 6.6% of payment records could not be matched to a register and have no uid; they appear as the final row.

Prefix Source Register Records Share
GB-COH Companies House 623,258 66.0%
GB-CHC Charity Commission for England & Wales 200,907 21.3%
GB-SC Office of the Scottish Charity Regulator 25,477 2.7%
GB-COOP Co-operatives UK 23,253 2.5%
GB-MPR Mutuals Public Register (FCA) 8,732 0.9%
GB-SHR Scottish Housing Regulator 352 <0.1%
GB-NIC Charity Commission for Northern Ireland 294 <0.1%
(No register match) 62,175 6.6%

A single record uses another register prefix (GB-SHPE) and is omitted from the table above.
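
The prefix can be derived directly from the uid. The sketch below assumes every uid has the form shown in the examples in Section 3 (country code, register code, then the register's own identifier, separated by hyphens) and maps the prefix to the register names in the table above.

```python
# Prefix-to-register mapping copied from the table above.
REGISTERS = {
    "GB-COH": "Companies House",
    "GB-CHC": "Charity Commission for England & Wales",
    "GB-SC": "Office of the Scottish Charity Regulator",
    "GB-COOP": "Co-operatives UK",
    "GB-MPR": "Mutuals Public Register (FCA)",
    "GB-SHR": "Scottish Housing Regulator",
    "GB-NIC": "Charity Commission for Northern Ireland",
}


def uid_prefix(uid):
    """Return the first two hyphen-separated parts of a uid, or None if blank."""
    if not uid:
        return None
    return "-".join(uid.split("-")[:2])


def source_register(uid):
    """Name of the source register, or a fallback label for other cases."""
    prefix = uid_prefix(uid)
    if prefix is None:
        return "(No register match)"
    return REGISTERS.get(prefix, "(Other register)")
```

Taking the first two parts, rather than splitting on the last hyphen, keeps the prefix stable even if a register's own identifier happens to contain a hyphen.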

5. What Can You Learn?

The procurement dataset enables a wide range of research questions about the relationship between the public sector and civil society in the United Kingdom.

Research Questions

  • Scale and trends — How much does the UK government spend with civil society organisations, and how has this changed over time?
  • Funder analysis — Which government departments and NHS bodies are the largest funders of civil society? How does spending vary across funder types?
  • Sectoral composition — What types of civil society organisations receive the most public procurement funding — charities, community interest companies, or co-operatives?
  • Concentration — How concentrated is procurement spending? Do a small number of organisations receive the majority of payments?
  • Cross-sector linkage — By linking to the TCSS Organisation Register, researchers can explore how procurement recipients differ from the broader civil society population in terms of size, age, location, and industrial classification.
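
As an illustration of the concentration question, the sketch below computes the share of total payment value received by the top n suppliers. It groups on supplier_id because that field is present for every record; the sample rows are invented, and real analyses would first decide how to treat negative amounts (see Section 6).

```python
from collections import defaultdict


def top_share(rows, n):
    """Share of total payment value received by the n largest suppliers."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["supplier_id"]] += float(row["amount"])
    ranked = sorted(totals.values(), reverse=True)
    grand_total = sum(ranked)
    if grand_total == 0:
        return None
    return sum(ranked[:n]) / grand_total


# Invented payments to three suppliers.
sample_payments = [
    {"supplier_id": "S1", "amount": "60"},
    {"supplier_id": "S1", "amount": "20"},
    {"supplier_id": "S2", "amount": "15"},
    {"supplier_id": "S3", "amount": "5"},
]
top_one = top_share(sample_payments, 1)
top_two = top_share(sample_payments, 2)
```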

Example: Mapping UK Civil Society Procurement

The procurement dataset was used in the research report Mapping and Understanding the UK Civil Society Sector (McDonnell et al., 2026), which analysed the full set of payment records to characterise the flow of public money to civil society organisations. Key findings include:

  • UK Government dominates — UK Government departments are the largest source of public procurement spend to civil society across all organisation types, accounting for the majority of both payment volume and total value.
  • Charities receive the largest share — Charities receive 42% of total procurement spend to civil society, reflecting their central role in public service delivery.
  • Scale of spending — UK Government departments account for over £131 billion in cumulative payments, dwarfing the NHS (£15 billion), local government (£8 billion), and all other funder types combined.
  • Department-level variation — Analysis at the individual department level reveals substantial variation in how much each department spends with different types of civil society organisation, from health-focused charities receiving NHS payments to social enterprises delivering local government contracts.
Table 9: Procurement by CSO type and funder type
Number of organisations, payments, and total value for each combination of CSO type and funder type
CSO Type Funder Type Organisations Payments Total Value
CIC CSO 18 35 £7.1M
CIC Education Institution 14 21 £1.1M
CIC Local Government 312 615 £795.4M
CIC NHS 258 12,611 £4721.9M
CIC Other 38 54 £191.0M
CIC Police, Fire and Rescue 16 25 £9.7M
CIC UK Government 267 3,599 £386.6M
Charity CSO 154 247 £230.7M
Charity Education Institution 167 443 £89.1M
Charity Local Government 1,784 5,353 £5386.0M
Charity NHS 1,765 42,808 £8745.4M
Charity Other 234 382 £875.9M
Charity Police, Fire and Rescue 109 206 £407.0M
Charity UK Government 3,798 171,112 £50731.5M
Co-operative / Mutual CSO 15 32 £9.4M
Co-operative / Mutual Education Institution 9 17 £3.3M
Co-operative / Mutual Local Government 105 570 £1537.6M
Co-operative / Mutual NHS 96 3,607 £557.8M
Co-operative / Mutual Other 17 28 £83.8M
Co-operative / Mutual Police, Fire and Rescue 3 3 £1.7M
Co-operative / Mutual UK Government 141 24,364 £942.2M
Other CSO 67 109 £64.2M
Other Education Institution 74 321 £39.6M
Other Local Government 415 1,058 £386.7M
Other NHS 260 8,232 £1185.2M
Other Other 72 109 £65.1M
Other Police, Fire and Rescue 24 85 £20.5M
Other UK Government 3,524 600,720 £79275.8M
Source: UK Civil Society Spine, Contracts Finder and procurement data

UK Government departments by CSO type

Table 10 shows the proportion of each UK Government department's procurement spend that goes to each civil society organisation type. Departments vary considerably in where their procurement is directed. The Department for Culture, Media and Sport and the Department for International Development direct over 90% of their civil society spend to charities, while the Department for Education and the Department for Business and Trade allocate over 75% to other nonprofit companies. The Department for Work and Pensions stands out for relatively high CIC (6.1%) and co-operative/mutual (7.1%) shares compared with other departments.

Table 10: UK Government department procurement spend by CSO type
Proportion of each department's total spend to civil society by organisation type (%)
Department Charity CIC Co-operative / Mutual Other CSO
Bank of England 55.4% 0.0% 0.0% 44.6%
Cabinet Office 72.4% 0.5% 0.0% 27.1%
Care Quality Commission 98.5% 0.0% 0.0% 1.5%
Crown Commercial Service 35.9% 0.3% 2.2% 61.6%
Defence Science & Technology Laboratory 76.3% 3.4% 0.0% 20.3%
Department for Business & Trade 17.0% 0.3% 0.5% 82.2%
Department for Culture, Media & Sport 98.0% 0.1% 0.1% 1.9%
Department for Education 23.3% 0.1% 0.3% 76.0%
Department for Energy Security & Net Zero 1.9% 2.0% 18.8% 77.3%
Department for Environment, Food & Rural Affairs 71.9% 0.7% 0.6% 26.8%
Department for International Development 92.9% 0.0% 0.0% 7.1%
Department for International Trade 7.8% 0.0% 0.0% 92.2%
Department for International Trade 6.5% 0.3% 0.0% 93.2%
Department for Science, Innovation & Technology 38.7% 4.4% 0.0% 56.9%
Department for Transport 20.9% 0.1% 1.7% 77.3%
Department for Work & Pensions 57.6% 6.1% 7.1% 29.1%
Department of Health & Social Care 82.1% 0.4% 0.0% 17.5%
Driver & Vehicle Licensing Agency 21.4% 0.0% 0.0% 78.6%
Food Standards Agency 12.6% 0.5% 0.0% 83.4%
Foreign Office 93.9% 0.0% 0.0% 6.0%
Foreign, Commonwealth & Development Office 84.0% 0.0% 0.0% 16.0%
HM Land Registry 0.3% 5.8% 0.0% 93.8%
HM Revenue & Customs 8.7% 0.1% 0.2% 91.0%
HM Treasury 5.3% 0.1% 0.3% 94.3%
Health Education England 35.8% 25.5% 0.3% 38.5%
Highways England 94.6% 0.0% 0.0% 5.4%
Home Office 38.2% 1.4% 0.2% 60.2%
Homes England 95.0% 0.0% 3.5% 1.5%
Institute for Apprenticeships & Technical Education 100.0% 0.0% 0.0% 0.0%
Medicines & Healthcare Products Regulatory Agency 99.6% 0.0% 0.0% 0.4%
Ministry of Defence 71.2% 0.8% 5.0% 22.9%
Ministry of Housing, Communities & Local Government 79.1% 0.3% 2.1% 18.5%
Ministry of Justice 63.4% 3.1% 3.3% 30.2%
Money & Pensions Service 100.0% 0.0% 0.0% 0.0%
NHS Blood & Transplant 2.7% 0.0% 93.9% 3.4%
Public Health England 59.9% 10.8% 0.6% 27.1%
Scotland Office 53.8% 0.2% 4.3% 41.7%
Sellafield Ltd 1.1% 0.0% 0.0% 98.9%
Skills Funding Agency 50.0% 0.0% 0.0% 50.0%
Transport for London 66.7% 25.0% 6.5% 1.9%
UK Health Security Agency 1.8% 0.0% 0.0% 98.2%
UK Research & Innovation 42.0% 20.6% 3.3% 34.1%
UK Shared Business Services 34.3% 7.1% 3.5% 55.0%
Source: UK Civil Society Spine, Contracts Finder and procurement data
Note: Includes 43 departments with total civil society spend of at least £10 million.

Tip: This dataset can be linked to the TCSS Organisation Register using the uid field, enabling enrichment with organisation characteristics such as location, registration dates, and industrial classification codes. See Appendix A5 for worked examples.

6. Limitations & Caveats

Threshold Bias

The source data includes only transactions above £25,000, as mandated by UK government transparency requirements. Smaller payments and grants below this threshold are not captured. This means the dataset over-represents larger contracts and under-represents routine smaller purchases, and total spending figures will understate the true volume of public procurement from civil society.

Missing Organisation Names

Approximately 14.4% of records lack an organisation_name value (coverage is 85.6%). These records remain linked to a supplier through supplier_id and, where a register match exists, through uid; the name was simply not present in the source spending data. Where a uid is present, users can recover the organisation name by joining with the TCSS Organisation Register on uid.
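
A minimal sketch of that join, written as a left join that only fills blank names. The register's column names are not documented in this guide, so the name_field parameter and the sample register rows are assumptions for illustration.

```python
def fill_missing_names(payments, register, name_field="organisation_name"):
    """Left-join payments to the register on uid and fill blank names.

    `register` is an iterable of dicts with at least `uid` and `name_field`;
    the real register's column names may differ (assumption).
    """
    names = {entry["uid"]: entry[name_field] for entry in register}
    filled = []
    for payment in payments:
        row = dict(payment)  # copy so the input rows are left untouched
        if not row.get("organisation_name") and row.get("uid") in names:
            row["organisation_name"] = names[row["uid"]]
        filled.append(row)
    return filled


# Invented rows: one recoverable name, one record with no uid at all.
register_rows = [{"uid": "GB-CHC-1234567", "organisation_name": "EXAMPLE HOSPICE"}]
payment_rows = [
    {"uid": "GB-CHC-1234567", "supplier_id": "S000001", "organisation_name": ""},
    {"uid": "", "supplier_id": "S000002", "organisation_name": ""},
]
named = fill_missing_names(payment_rows, register_rows)
```

Records without a uid keep a blank name; they can still be grouped by supplier_id.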

Year Outliers

Thirty-six records fall outside the core 2010–2025 range (see Section 4). A few of these carry implausible payment years such as 1900 or 2027, resulting from errors in the source data; the rest predate 2010 or are dated 2026. All are retained for transparency. Users conducting temporal analysis should filter to the core range of 2010–2025, and should note that the 2024 and 2025 counts are incomplete as source data collection is ongoing.

Duplicate and Reversal Entries

Some records represent payment corrections or reversals, indicated by negative values in the amount field. These are retained to preserve the source data faithfully. Users conducting aggregate analysis should be aware that naïve summation of amounts may overstate or understate totals; consider filtering or handling negative amounts depending on the research question.
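
One way to make the effect of reversals visible is to report gross, reversal and net totals side by side rather than a single sum. The sketch below is illustrative; the three sample amounts are invented.

```python
def summarise_amounts(amounts):
    """Split a list of payment amounts into gross, reversal and net totals."""
    gross = sum(a for a in amounts if a > 0)
    reversals = sum(a for a in amounts if a < 0)
    return {"gross": gross, "reversals": reversals, "net": gross + reversals}


# Invented example: a payment, its full reversal, and a corrected re-payment.
summary = summarise_amounts([30000.0, -30000.0, 45000.0])
```

Here the gross total (75,000) counts the reversed payment, while the net total (45,000) reflects what was actually paid. Which figure is appropriate depends on the research question.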

NHS Coverage Gap

The NHSSpend data collection concluded around April 2020. NHS procurement records after this date are not included in the dataset, creating a gap in NHS-specific coverage for 2020 onwards. Central government and Contracts Finder data continue beyond this date.

Supplier Matching

Organisations in the raw payment data were matched to the TCSS Organisation Register using a combination of exact and fuzzy name matching (Jaro-Winkler similarity, threshold ≥ 0.90). Some false positives (incorrect matches) and false negatives (missed matches) are possible, particularly for organisations with common or ambiguous names. The alias resolution process (see Appendix A3) mitigates but does not eliminate this issue.

Funder Classification

Funders are classified using a multi-layer cascade of metadata signals, external lookups, and keyword rules (see Appendix A2). While the cascade achieves high accuracy for well-known funder types (NHS, UK Government), edge cases may be misclassified, particularly funders with ambiguous names or those not present in external reference databases. The Junk/Invalid category captures clearly erroneous entries (175 records in the funder type table in Section 4).

What’s NOT in the Data

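The payments dataset (cso-procurement-dataset.csv) does not include the items below. Contract descriptions and sector codes are available only for Contracts Finder awards, in the companion file documented in Section 3B.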

  • Contract descriptions or service categories
  • Geographic detail of the contract delivery location
  • Payments below £25,000
  • Payments to organisations not identified in the TCSS Organisation Register
  • Procurement from non-civil-society suppliers (private companies, individuals, etc.)

7. Citation & Licence

Licence: This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to share, adapt, and build upon this data for any purpose, provided you give appropriate credit.

Suggested Citation

McDonnell, D. et al. (2026). TCSS Procurement Records. UK Third and Civil Society Sector Database. Available at: https://uk-third-sector-database.github.io/data/. Licensed under CC BY 4.0.

If you would like to learn more about this dataset and how it can be applied to your project or research programme, please contact research@brawdata.com.

8. Changelog

Version Date Changes
1.1 June 2026 Refreshed the payments dataset (944,449 records) and refreshed all coverage and distribution figures. Added documentation for the companion contract-level Contracts Finder dataset (Section 3B).
1.0 March 2026 Initial release of the Procurement Records guidance document and dataset.

A1. Pipeline Overview

The procurement dataset is produced by a six-step R preprocessing pipeline that transforms raw spending data into the final linked dataset. The pipeline is orchestrated by a single script (run-all-preprocessing-pipeline.R) that runs each step in sequence, skipping steps whose inputs have not changed since the last successful run.

Pipeline Steps

Step Script Purpose
1 02-build-funder-lookup.R Classifies approximately 1,400 unique funders into eight types using a multi-layer cascade of metadata signals, external lookups, and keyword rules. See Appendix A2.
2 03-build-supplier-lookup.R Matches supplier names from the raw payment data to the TCSS Organisation Register using exact and fuzzy name matching (Jaro-Winkler, threshold ≥ 0.90). Assigns a uid and supplier_type to each matched supplier.
3 03a-generate-alias-batches.R Identifies potential supplier name aliases — cases where the same organisation appears under different names — using deterministic rules. Generates review batches for manual or LLM-assisted validation.
4 03b-assemble-alias-decisions.R Assembles alias decisions from both rule-based determinations and LLM-reviewed batch results into a single validated alias file.
5 03c-apply-alias-merges.R Applies validated alias merges to the supplier lookup, consolidating duplicate supplier entries under a single canonical uid.
6 04-assemble-final-datasets.R Joins funder classifications, supplier lookups, and raw payment data into the final output file. Computes per-organisation summary statistics (total_value_payments_to_org, total_number_payments_to_org).

Note: Steps 3–5 handle the alias review pipeline. In a fully automated run, Step 3 generates candidate batches, but the LLM review must occur externally before Step 4 can assemble decisions. When reviewed batches already exist, all steps run in sequence without manual intervention.

A2. Funder Classification

Funders are classified into eight types using a multi-layer cascade. Classification is applied independently to funder_name (producing funder_type) and funder_name_alt (producing funder_type_alt). The cascade proceeds in order; the first match wins.

Classification Cascade

Pre-filter: Junk/Invalid Detection

Entries with purely numeric names, hash-like strings, or very short names (fewer than 3 meaningful characters) are flagged as Junk/Invalid and excluded from subsequent cascade layers. This affects 20 records.
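
The pre-filter can be sketched as follows. The guide does not define "hash-like" or "meaningful characters" precisely, so this sketch assumes that meaningful characters are ASCII letters and digits, and that a hash-like string is an unbroken run of 16 or more hexadecimal characters; both thresholds are assumptions, not the pipeline's actual rules.

```python
import re


def is_junk(name):
    """Flag funder names that are numeric, hash-like, or too short to be real."""
    stripped = name.strip()
    meaningful = re.sub(r"[^A-Za-z0-9]", "", stripped)
    if len(meaningful) < 3:  # very short names
        return True
    if meaningful.isdigit():  # purely numeric names
        return True
    if re.fullmatch(r"[0-9a-fA-F]{16,}", stripped):  # assumed hash pattern
        return True
    return False
```

For example, the Junk/Invalid entries quoted in the taxonomy table (3149053 and 37300) are caught by the numeric rule, while MINISTRY OF DEFENCE passes through to the cascade.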

Layer 1: Metadata Signals

Information already present in the source data is used as the first classification signal:

  • Funders from the NHSSpend data source are classified as NHS
  • Funders from the Central Government data source are classified as UK Government
  • The note field in the funder masterlist may contain signals such as “ministerial” or “non-ministerial”, indicating UK Government

Layer 2: External Lookups

Unclassified funders are matched against two external reference databases:

  • findthatcharity — a comprehensive lookup of UK organisations. The organisationType field is mapped to funder types (e.g., nhs-trust → NHS, local-authority → Local Government). Both exact and fuzzy matching (Jaro-Winkler, threshold ≥ 0.90) are used.
  • TCSS Organisation Register — funders that match the Organisation Register are classified as CSO (a civil society organisation acting as a funder).

Layer 3: Keyword Rules

Remaining unclassified funders are matched using pattern-matching rules applied to the funder name. Rules are applied in priority order; the first match wins:

  1. NHS — names containing “NHS” combined with “TRUST”, “CCG”, “ICB”, etc.
  2. UK Government — known department abbreviations (FCDO, DFID, etc.) and patterns like “BRITISH EMBASSY”, “SCOTTISH GOVERNMENT”
  3. Local Government — names containing “CITY COUNCIL”, “COUNTY COUNCIL”, “BOROUGH COUNCIL”, etc.
  4. Police, Fire and Rescue — names containing “CONSTABULARY”, “POLICE”, “FIRE” + “RESCUE”
  5. Education Institution — names containing “UNIVERSITY”, “COLLEGE”, “ACADEMY TRUST”, school patterns
  6. CSO — names ending in “CIC” or containing “TRUST” or “CHARITY” (after NHS and Academy Trusts have been captured)
  7. Other — all remaining unclassified funders
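
The priority ordering above can be expressed as an ordered list of patterns that is scanned until one matches. The regular expressions below are simplified stand-ins built from the keywords listed in this section, not the pipeline's actual rules, and classify_by_keyword is a hypothetical name.

```python
import re

# Illustrative patterns only, in the priority order described above.
KEYWORD_RULES = [
    ("NHS", re.compile(r"\bNHS\b.*\b(TRUST|CCG|ICB)\b|\b(TRUST|CCG|ICB)\b.*\bNHS\b")),
    ("UK Government", re.compile(r"\b(FCDO|DFID)\b|BRITISH EMBASSY|SCOTTISH GOVERNMENT")),
    ("Local Government", re.compile(r"\b(CITY|COUNTY|BOROUGH) COUNCIL\b")),
    ("Police, Fire and Rescue", re.compile(r"CONSTABULARY|\bPOLICE\b|\bFIRE\b.*\bRESCUE\b")),
    ("Education Institution", re.compile(r"UNIVERSITY|COLLEGE|ACADEMY TRUST|\bSCHOOL\b")),
    ("CSO", re.compile(r"\bCIC$|\bTRUST\b|\bCHARITY\b")),
]


def classify_by_keyword(name: str) -> str:
    """Layer 3 sketch: return the first matching funder type, else 'Other'."""
    upper = " ".join(name.upper().split())
    for funder_type, pattern in KEYWORD_RULES:
        if pattern.search(upper):
            return funder_type
    return "Other"


for name in ["Barts Health NHS Trust", "Harris Academy Trust", "Wellcome Trust", "Acme Widgets"]:
    print(name, "->", classify_by_keyword(name))
```

Because the NHS and Education Institution rules run before the CSO rule, names containing "TRUST" reach the CSO rule only if they are neither NHS trusts nor academy trusts.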

Funder Type Taxonomy

Funder Type | Description | Examples
UK Government | Central government departments, agencies, and arm’s-length bodies | DEPARTMENT FOR EDUCATION, MINISTRY OF DEFENCE, DVLA
NHS | NHS trusts, clinical commissioning groups, integrated care boards | NHS ENGLAND, BARTS HEALTH NHS TRUST, NHS HRAW CCG
Local Government | County, district, borough, and unitary councils; combined authorities | MANCHESTER CITY COUNCIL, KENT COUNTY COUNCIL
Education Institution | Universities, colleges, academy trusts, schools | UNIVERSITY OF OXFORD, HARRIS FEDERATION
CSO | Civil society organisations acting as funders (grant-makers, intermediaries) | THE NATIONAL LOTTERY COMMUNITY FUND
Police, Fire and Rescue | Police forces, fire and rescue services | METROPOLITAN POLICE, LONDON FIRE BRIGADE
Other | Funders not classifiable into the above categories | Various unclassified public bodies
Junk/Invalid | Clearly erroneous entries (numeric strings, hash tokens) | 3149053, 37300

A3. Supplier Matching & Alias Resolution

The raw spending data contains supplier names as entered by government departments — often inconsistent in spelling, abbreviation, and formatting. The pipeline matches these names to organisations in the TCSS Organisation Register to assign a standardised uid to each supplier.

Matching Process

Matching proceeds in two stages:

  1. Exact matching — supplier names are normalised (uppercased, punctuation removed, whitespace collapsed) and matched exactly against the Organisation Register.
  2. Fuzzy matching — unmatched suppliers are compared to the Register using Jaro-Winkler string similarity. Matches with a similarity score ≥ 0.90 are accepted. This captures variations in spelling, abbreviation (e.g., “LTD” vs “LIMITED”), and minor data entry errors.
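
The two stages can be sketched in plain Python as follows. This is a textbook Jaro-Winkler implementation written for illustration, not the pipeline's own code; the function names, the in-memory register dictionary, and the uid values are hypothetical.

```python
import re
from typing import Optional


def normalise(name: str) -> str:
    """Uppercase, strip punctuation, and collapse whitespace."""
    no_punct = re.sub(r"[^\w\s]", "", name.upper())
    return " ".join(no_punct.split())


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used1, used2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not used2[j] and s2[j] == ch:
                used1[i] = used2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order in each string
    out_of_order, k = 0, 0
    for i, ch in enumerate(s1):
        if used1[i]:
            while not used2[k]:
                k += 1
            out_of_order += ch != s2[k]
            k += 1
    t = out_of_order / 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, scaling: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * scaling * (1 - base)


def match_supplier(name: str, register: dict[str, str], threshold: float = 0.90) -> Optional[str]:
    """Return the uid of the exact match or best fuzzy match, or None."""
    key = normalise(name)
    if key in register:                                   # Stage 1: exact match
        return register[key]
    best = max(register, key=lambda r: jaro_winkler(key, r), default=None)
    if best is not None and jaro_winkler(key, best) >= threshold:
        return register[best]                             # Stage 2: fuzzy match
    return None


# Hypothetical register entries; the uids are invented for this example
register = {normalise("St Luke's Hospice Limited"): "GB-CHC-0000001",
            normalise("Oxfam"): "GB-CHC-0000002"}
print(match_supplier("ST. LUKE'S HOSPICE LTD", register))
print(match_supplier("Acme Widgets Ltd", register))
```

Comparing every unmatched supplier against every Register entry is quadratic, so a production pipeline would likely restrict comparisons to candidate blocks (for example, names sharing a first token); that optimisation is left out here.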

Alias Resolution

After initial matching, the pipeline identifies potential aliases — cases where the same organisation appears under different supplier names. This is common when departments record the same supplier differently (e.g., “ST LUKE’S HOSPICE” vs “SAINT LUKES HOSPICE”).

Alias resolution proceeds in three steps:

  1. Candidate generation (Script 03a) — deterministic rules identify supplier name pairs that may refer to the same organisation, based on shared UIDs, similar names, or overlapping funder relationships.
  2. Batch review — candidate pairs are grouped into batches and reviewed using a combination of LLM-assisted classification and manual checks. Each pair is labelled as a confirmed alias or a false positive.
  3. Merge (Scripts 03b–03c) — confirmed aliases are assembled into a validated alias file, and the supplier lookup is updated to consolidate duplicate entries under a single canonical record.
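
One way to picture the merge step is as grouping confirmed alias pairs into connected sets and pointing every name in a set at a single canonical name. The sketch below uses a small union-find structure for this. How the pipeline actually chooses the canonical record is not described in this guide, so picking the alphabetically first name is purely an assumption, as is the function name build_alias_lookup.

```python
def build_alias_lookup(confirmed_pairs: list[tuple[str, str]]) -> dict[str, str]:
    """Map every supplier name to one canonical name per alias group."""
    parent: dict[str, str] = {}

    def find(name: str) -> str:
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]   # path compression
            name = parent[name]
        return name

    for left, right in confirmed_pairs:
        root_a, root_b = find(left), find(right)
        if root_a != root_b:
            # Assumption: the alphabetically first name becomes the canonical one
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {name: find(name) for name in parent}


pairs = [
    ("ST LUKE'S HOSPICE", "SAINT LUKES HOSPICE"),
    ("SAINT LUKES HOSPICE", "ST LUKES HOSPICE LTD"),
]
lookup = build_alias_lookup(pairs)
for alias, canonical in sorted(lookup.items()):
    print(alias, "->", canonical)
```

Chained pairs are handled transitively: if A is an alias of B and B of C, all three resolve to the same canonical name.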

Note: The LLM review step occurs externally between Scripts 03a and 03b. In a fully automated run where reviewed batches already exist, all scripts execute in sequence without manual intervention.

A4. Contracts Finder Enrichment

Contracts Finder is the UK government’s portal for public procurement opportunities. The TCSS project operates its own collector, which retrieves awarded contract notices through the portal’s API.

Integration Process

Contracts Finder records are integrated into the procurement dataset through a crosswalk file (contracts-finder-crosswalk.csv) that links awarded contract notices to the main payment data. The crosswalk maps Contracts Finder notice identifiers and awarded amounts to the standardised format used by the rest of the dataset.

Unlike the payment-level data from Central Government and NHS sources, Contracts Finder provides contract-level information — each record represents an awarded contract rather than an individual payment. These records are assigned a data_source value of Contracts Finder or contractsfinder (reflecting different collection batches).
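
Because the two data_source spellings refer to the same source, analyses that separate contract-level from payment-level records may want to treat them as one. A minimal sketch, using a small made-up table in place of the real file and a hypothetical is_contracts_finder column:

```python
import pandas as pd

# Toy rows standing in for the procurement dataset; only data_source is needed here
records = pd.DataFrame(
    {"data_source": ["Contracts Finder", "contractsfinder", "NHSSpend", "Central Government"]}
)

# Remove spaces and case differences so both spellings compare equal
source_key = records["data_source"].str.replace(" ", "", regex=False).str.lower()
records["is_contracts_finder"] = source_key.eq("contractsfinder")

print(records)
```

Filtering on this flag keeps contract award values from being summed together with individual payments.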

Note: Contracts Finder records account for a small share of the dataset (1.9%) but contribute a disproportionate share of its unique organisations (approximately 4,400 of the 11,862), because they capture a broader range of contract awards, including awards that may not appear in the £25k+ payment transparency data.

A5. Linking to Other TCSS Datasets

The procurement dataset can be linked to other datasets in the UK Third and Civil Society Sector Database using the uid field. This unique organisation identifier is consistent across all TCSS datasets, enabling researchers to enrich procurement records with organisation characteristics, financial data, and other information.

Available Linkages

Dataset | Join Key | What It Adds
Organisation Register | uid | Organisation name, postcode, registration and removal dates, source registers, SIC codes
Charity Financial Records | uid | Annual income, expenditure, and detailed financial breakdowns for registered charities (GB-CHC, GB-SC, GB-NIC prefixed UIDs)
Nonprofit Financial Records (guidance forthcoming) | uid | Companies House accounts data (balance sheets, profit and loss, employee numbers) for nonprofit companies (GB-COH prefixed UIDs)
CIC 36 Forms | uid | Community interest statements, beneficiary descriptions, and activity summaries for Community Interest Companies

Example: Linking to the Organisation Register

import pandas as pd

# Load the payment-level procurement data and the Organisation Register
procurement = pd.read_csv("cso-procurement-dataset.csv")
spine = pd.read_csv("TSCS_spine.spine.csv")

# Left join: keep every payment record and attach organisation characteristics
merged = procurement.merge(
    spine[["uid", "organisationname", "postcode", "dateregistered"]],
    on="uid",
    how="left"
)

# Example: the ten supplier postcodes with the most payment records
# (aggregating to regions would require a separate postcode-to-region lookup)
print(merged.groupby("postcode").size().sort_values(ascending=False).head(10))

Example: Linking to Charity Financial Records

import pandas as pd

# Load the payment-level procurement data and the charity financial history
procurement = pd.read_csv("cso-procurement-dataset.csv")
charity_finance = pd.read_csv("cso-spine-charity-financial-history.csv")

# Keep payments to registered charities only; na=False treats missing UIDs as non-matches
is_charity = procurement["uid"].str.startswith(("GB-CHC", "GB-SC", "GB-NIC"), na=False)
charity_procurement = procurement[is_charity]

# Keep each charity's most recent financial year as one intact row
# (groupby().last() would mix values from different years when fields are missing)
latest_finance = charity_finance.sort_values("fy").drop_duplicates("uid", keep="last")
merged = charity_procurement.merge(
    latest_finance[["uid", "inc", "exp"]],
    on="uid",
    how="left"
)

# Example: compare individual payment amounts to each charity's latest annual income
print(merged[["uid", "amount", "inc"]].head(10))

Tip: When linking datasets, use a left join from the procurement data to preserve all payment records, even if some organisations are not found in the target dataset. Check for missing values after the join to assess linkage coverage.
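
Linkage coverage can be checked directly with the indicator option of pandas merge, which records whether each row found a partner. The sketch below uses small made-up tables in place of the real files; the uid values, postcodes, and amounts are invented for illustration.

```python
import pandas as pd

# Toy stand-ins for the procurement data and a target dataset
procurement = pd.DataFrame(
    {"uid": ["GB-CHC-1", "GB-CHC-2", "GB-COH-3"], "amount": [100.0, 250.0, 75.0]}
)
register = pd.DataFrame(
    {"uid": ["GB-CHC-1", "GB-COH-3"], "postcode": ["AB1 2CD", "EF3 4GH"]}
)

merged = procurement.merge(
    register,
    on="uid",
    how="left",
    indicator=True,            # adds a _merge column: "both" or "left_only"
    validate="many_to_one",    # raises if the target has duplicate uids
)

coverage = (merged["_merge"] == "both").mean()
unmatched = merged.loc[merged["_merge"] == "left_only", "uid"].unique().tolist()

print(f"Linked {coverage:.1%} of payment records")
print("Unmatched uids:", unmatched)
```

The validate argument guards against a common pitfall: duplicate uids in the target dataset would silently multiply payment rows and inflate totals.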