Skip to content

Migrate from USAspending bulk download

You've been pulling USAspending's nightly bulk CSVs, unpacking them, and bundling transactions into your own award records. This guide moves the FPDS contracts side of that pipeline to the Tango API.

Scope

In scope. Prime contract awards (FPDS) and their subawards. The FPDS data Tango ingests is the same FPDS feed USAspending republishes — Tango pulls it directly twice daily and bundles transactions into one row per award.

Out of scope (today). Financial assistance (FABS) — grants, loans, direct payments, cooperative agreements, insurance. If your pipeline consumes USAspending's assistance_transactions.csv, subaward_* assistance rows, or any award_type in the assistance set (02, 03, 04, 05, 06, 07, 08, 09, 10, 11), keep that on USAspending bulk for now. Tango has no first-class FABS endpoint yet — /api/contracts/ is contract-only.

Also out of scope. Outlays on prime contracts (USAspending's prime_award_total_outlayed_amount on the prime side). Obligations are the canonical money metric on /api/contracts/. Outlays are exposed on subawards only today.

What you actually gain

  • Twice-daily FPDS ingest vs USAspending's nightly FPDS pull. Tango is ~12 hours fresher on contract data specifically. For FABS, USAspending is your only option and they're equivalent (both refresh daily).
  • One row per award, not one row per modification. USAspending ships contract_award_transactions.csv with a row per FPDS mod; you have to group by award and reconcile dollar fields yourself. Tango's /api/contracts/ returns one row per award with all mods rolled up; if you need the per-mod detail, request ?shape=...,transactions(*) or hit GET /api/contracts/{key}/transactions/.
  • Stable canonical key per award. Tango exposes a canonical key derived from the FPDS award components (CONT_AWD_{piid}_{agency}_{parent_piid}_{parent_agency}). It's stable across re-pulls and is the URL identifier (/api/contracts/{key}/). PIID is still available for human display and cross-checks.
  • Filter + shape at request time. Replace the ZIP-and-grep step with query params (?awarding_agency=GSA&obligated_gte=1000000) and a response-shape selector (?shape=key,piid,obligated). Cuts payload size 80–95% vs the full bulk schema.
  • Push delivery (optional). Subscribe to filter alerts via webhooks instead of diffing CSVs nightly. See Polling vs webhooks below.

Step-by-step

1. Inventory the USAspending columns you actually consume

Most pipelines touch 20–40 columns out of USAspending's ~280-column contract schema. List the columns your downstream code reads (parquet writer, BI tool, vendor enrichment, etc.) — that's the only set you need to map.

2. Map each column to a Tango field

Use the Field mapping table below. Two things to know up front:

  • Agency codes. USAspending uses two codes per office: a 3-digit CGAC for the top-tier (e.g. 047 = GSA) and a 4-digit FPDS code for the sub-tier (e.g. 4740 = GSA Public Buildings Service). Tango exposes both on the awarding_office / funding_office expansions as department_code (CGAC) and agency_code (FPDS sub-tier). Don't swap them — the names look transposable but the codes won't match.
  • Code/description objects. Several USAspending columns are split across <thing>_code and <thing>_description. Tango returns them as {code, description} objects via expansion. To get the bare scalar, request ?shape=naics_code (scalar) instead of ?shape=naics(code,description) (object).

3. Replace the CSV download with a paginated GET

/api/contracts/ uses cursor pagination — there is no page= parameter. Each response is {count, next, previous, cursor, results}. Follow next until it's null, or pass the value from the cursor field on the response into the next request.

/api/subawards/ uses page-number pagination (page=N&limit=M). Different surface, same idea.

Both endpoints accept the same filter syntax as the API reference: range filters like obligated_gte, multi-value filters using | (e.g. naics=541511|541512), and full-text via search=.

4. Shape the response

Pass ?shape= listing only the fields you actually use. Without shape, the API returns a sensible default but it's not the full FPDS record — and it's still much bigger than what most ETLs need.

5. Decide your dedup key

Use key for upserts (canonical, stable, URL-safe). PIID alone is not unique — same PIID can recur across agencies and across fiscal years. The key field encodes the full FPDS award tuple.

Field mapping

This covers the most-used contract and subaward columns. For everything else, request ?show_shapes=true on /api/contracts/ to see the full available field set, or check the Contracts data dictionary.

Contracts (/api/contracts/)

Identifiers

USAspending column Tango field Notes
contract_award_unique_key / award_id key Use key for all API operations. Format: CONT_AWD_{piid}_{agency}_{parent_piid}_{parent_agency}.
award_id_piid piid Same string. PIID alone is not unique across agencies — prefer key for joins.
parent_award_id_piid parent_award.piid parent_award is a {key, piid} reference.
parent_award_id parent_award.key

Dollar amounts

USAspending column Tango field Notes
total_obligated_amount obligated Total dollars obligated across all mods.
current_total_value_of_award base_and_exercised_options_value
potential_total_value_of_award total_contract_value Includes unexercised options.
federal_action_obligation (transaction-level) transactions[].obligated One per FPDS mod. Request via ?shape=transactions(*).

Dates

USAspending column Tango field Notes
action_date (initial transaction) award_date Date the award was first signed.
action_date (per modification) transactions[].transaction_date One per mod, when expanded.
period_of_performance_start_date period_of_performance.start_date
period_of_performance_current_end_date period_of_performance.current_end_date
period_of_performance_potential_end_date period_of_performance.ultimate_completion_date
fiscal_year fiscal_year Federal FY (Oct–Sep). Same semantics as USAspending.

Awarding / funding agency

USAspending splits agency identity across four code/name pairs per side (top-tier + sub-tier + office). Tango folds them into a single awarding_office / funding_office object with three levels. Watch the code semantics — department_code is CGAC, agency_code is FPDS sub-tier.

USAspending column Tango field Notes
awarding_agency_code awarding_office.department_code CGAC top-tier code (3-digit, e.g. 047 for GSA).
awarding_agency_name awarding_office.department_name
awarding_sub_agency_code awarding_office.agency_code FPDS sub-tier code (4-digit, e.g. 4740 for GSA PBS).
awarding_sub_agency_name awarding_office.agency_name
awarding_office_code awarding_office.office_code AAC (Activity Address Code), e.g. 47PF52.
awarding_office_name awarding_office.office_name
funding_agency_code funding_office.department_code Same CGAC/FPDS distinction as awarding.
funding_agency_name funding_office.department_name
funding_sub_agency_code funding_office.agency_code
funding_sub_agency_name funding_office.agency_name
funding_office_code funding_office.office_code
funding_office_name funding_office.office_name

To filter by agency, use ?awarding_agency= or ?funding_agency= — these accept the CGAC code, the FPDS code, the name, or the abbreviation (best-effort matching). For reproducible jobs, use the CGAC code.

Recipient

USAspending column Tango field Notes
recipient_uei recipient.uei UEI is canonical post-SAM transition.
recipient_name recipient.display_name DBA / common name. recipient.legal_business_name is the SAM legal name.
recipient_duns (not exposed on /api/contracts/) DUNS is no longer the primary entity key. Use UEI. The legacy DUNS is still on subawards as recipient_duns.
cage_code recipient.cage_code
recipient_parent_uei recipient.parent_uei (via ?shape=recipient(parent_uei))

Place of performance

USAspending's primary_place_of_performance_* columns map to the place_of_performance expansion:

USAspending column Tango field
primary_place_of_performance_country_code place_of_performance.country_code
primary_place_of_performance_country_name place_of_performance.country_name
primary_place_of_performance_state_code place_of_performance.state_code
primary_place_of_performance_state_name place_of_performance.state_name
primary_place_of_performance_city_name place_of_performance.city_name
primary_place_of_performance_zip_4 place_of_performance.zip_code

NAICS / PSC / set-aside

USAspending column Tango field Notes
naics_code naics_code (scalar) or naics.code (via ?shape=naics(code,description))
naics_description naics.description
product_or_service_code psc_code (scalar) or psc.code
product_or_service_code_description psc.description
type_of_set_aside set_aside.code (via ?shape=set_aside(code,description))
type_of_set_aside_description set_aside.description

Competition

USAspending flattens competition fields onto the row; Tango groups them under competition. Request via ?shape=competition(*).

USAspending column Tango field Notes
type_of_contract_pricing competition.contract_type.code Returned as {code, description} — use .code for direct equivalence.
type_of_contract_pricing_description competition.contract_type.description
extent_competed competition.extent_competed.code
extent_competed_description competition.extent_competed.description
solicitation_identifier competition.solicitation_identifier (also exposed as top-level solicitation_identifier)
solicitation_procedures competition.solicitation_procedures.code
number_of_offers_received competition.number_of_offers_received
other_than_full_and_open_competition competition.other_than_full_and_open_competition.code

Description and award type

USAspending column Tango field Notes
award_description / prime_award_base_transaction_description description
award_type_code award_type.code (via ?shape=award_type(code,description))
award_type award_type.description

Subawards (/api/subawards/)

Subawards are looser — most columns map 1:1 by name. The default shape already returns most of what FSRS-based pipelines need.

USAspending column Tango field Notes
prime_award_unique_key award_key Joins to /api/contracts/{award_key}/.
prime_award_piid piid
prime_awardee_uei prime_awardee_uei Also prime_recipient.uei if you request prime_recipient(*).
prime_awardee_name prime_awardee_name
subawardee_uei / recipient_uei recipient_uei Also subaward_recipient.uei.
subawardee_name / recipient_name recipient_name
subawardee_duns / recipient_duns recipient_duns DUNS is retained on subawards as a legacy column.
subawardee_parent_uei recipient_parent_uei
subawardee_parent_duns recipient_parent_duns
subaward_number subaward_details.number
subaward_action_date subaward_details.action_date
subaward_amount subaward_details.amount
subaward_description subaward_details.description
usaspending_permalink usaspending_permalink Passed through verbatim.
highly_compensated_officer_*_name / _amount highly_compensated_officers[].name / .amount List of {name, amount} objects.
subaward_fsrs_report_last_modified_date fsrs_details.last_modified_date Closest thing to a change-detection timestamp on subawards.

Reference data

USAspending file Tango endpoint Notes
Recipient profiles ZIP GET /api/entities/, GET /api/entities/{uei}/ UEI-keyed.
Agency reference CSV GET /api/organizations/ Unified Federal Hierarchy with CGAC, FPDS, and FH keys. See Federal Agency Hierarchy.
NAICS reference CSV GET /api/naics/ code + description, full-text search=.
PSC reference CSV GET /api/psc/ Same shape as NAICS.

Worked example: replace one CSV job

The classic: "GSA contracts awarded in FY2026 over $1M, with recipient and place of performance." On USAspending this is a bulk-download request + ZIP unpack + filter pass. On Tango it's a paginated loop with a shape selector.

# GSA: pass the name, "047" (CGAC), or the abbreviation — best-effort matching.
# Use CGAC for reproducible jobs.
curl -H "X-API-KEY: $TANGO_API_KEY" \
  "https://tango.makegov.com/api/contracts/?\
awarding_agency=047&\
fiscal_year=2026&\
obligated_gte=1000000&\
ordering=-award_date&\
limit=100&\
shape=key,piid,award_date,obligated,total_contract_value,\
recipient(uei,display_name),naics(code,description),\
place_of_performance(state_code,city_name)"

# Then follow the `next` URL on each response until it's null.
import os
from urllib.parse import parse_qs, urlparse

from tango import TangoClient

client = TangoClient(api_key=os.environ["TANGO_API_KEY"])

SHAPE = (
    "key,piid,award_date,obligated,total_contract_value,"
    "recipient(uei,display_name),naics(code,description),"
    "place_of_performance(state_code,city_name)"
)

cursor: str | None = None
rows: list = []

while True:
    resp = client.list_contracts(
        awarding_agency="047",   # GSA CGAC
        fiscal_year=2026,
        obligated_gte="1000000",
        sort="award_date",
        order="desc",
        limit=100,
        cursor=cursor,
        shape=SHAPE,
    )
    rows.extend(resp.results)
    if not resp.next:
        break
    cursor = parse_qs(urlparse(resp.next).query).get("cursor", [None])[0]

print(f"Pulled {len(rows)} contracts")
import { TangoClient } from "@makegov/tango-node";

const client = new TangoClient({ apiKey: process.env.TANGO_API_KEY! });

const SHAPE = [
  "key",
  "piid",
  "award_date",
  "obligated",
  "total_contract_value",
  "recipient(uei,display_name)",
  "naics(code,description)",
  "place_of_performance(state_code,city_name)",
].join(",");

let cursor: string | null | undefined = undefined;
const rows: unknown[] = [];

while (true) {
  const resp = await client.listContracts({
    awarding_agency: "047", // GSA CGAC
    fiscal_year: 2026,
    obligated_gte: "1000000",
    sort: "award_date",
    order: "desc",
    limit: 100,
    cursor,
    shape: SHAPE,
  });
  rows.push(...resp.results);
  if (!resp.next) break;
  cursor = new URL(resp.next).searchParams.get("cursor");
}

console.log(`Pulled ${rows.length} contracts`);

The subaward equivalent uses client.list_subawards(...) / client.listSubawards(...) with page=N instead of cursor.

Change capture: polling vs webhooks

Two patterns. Pick the one that matches your tolerance for latency and ops complexity.

Polling (simpler default)

Run a scheduled job that pulls awards newer than your last cursor. The change-capture filter is award_date_gte:

curl -H "X-API-KEY: $TANGO_API_KEY" \
  "https://tango.makegov.com/api/contracts/?award_date_gte=2026-05-01&ordering=-award_date&limit=100&shape=..."

What this catches:

  • New awards since the last poll, ordered by award_date. ✅
  • Mods to existing awards — only if the mod is significant enough that Tango re-materialized the award. The /api/contracts/ API does not expose a updated_at / modified_after filter today; award_date is the only date-window filter. If you need per-mod change detection, pull the transactions endpoint for the awards you care about (next section) or use webhooks.

The simple pattern most teams need:

  1. Persist the latest award_date you've seen.
  2. On each run, request award_date_gte=<that date> and follow next until exhausted.
  3. Upsert by key.

This is enough for most BI / dashboards / vendor-watch use cases.

Webhooks (push, lower latency)

For "tell me when a new contract matches this filter, within minutes of FPDS landing," subscribe to a filter alert and consume alerts.contract.match events. The webhook payload ships matched IDs only — your receiver pulls the full record from /api/contracts/{key}/.

Full walkthrough including signature verification, deduping, and the receiver loop: Stream contract awards. The full subscription / payload / retry / circuit-breaker reference: Webhooks user guide.

One thing to know before you POST /api/webhooks/alerts/: if your account has more than one webhook endpoint, you must include "endpoint": "<endpoint_uuid>" in the request body. Single-endpoint accounts auto-resolve; multi-endpoint accounts get a 400 without it. The Python SDK's client.create_webhook_alert(..., endpoint="...") exposes the same kwarg.

Historical backfill

For seeding history (years of contracts at cutover), use a date-windowed cursor crawl. The pattern:

  1. Pick a window — fiscal-year-sized windows (fiscal_year=2018, then 2019, ...) parallelize well because they're independent and roughly balanced (FPDS volumes per FY are within an order of magnitude).
  2. For each window, cursor-paginate /api/contracts/?fiscal_year=<YYYY>&ordering=-award_date&limit=100 and write batches as you go.
  3. Checkpoint the cursor value after each successful batch so you can resume without re-pulling.
  4. Subawards: ?fiscal_year_gte=<YYYY>&fiscal_year_lte=<YYYY> with page-number pagination.

For reference, FY2026 contracts on production are ~1.8M rows; a window-per-FY crawl at limit=100 is on the order of ~20K requests per fiscal year. Use shape= aggressively to keep payloads small.

If you're cutting over from USAspending, validate by spot-checking 50–100 random key lookups against your last USAspending CSV pull for the same agency/FY and confirm obligated, recipient.uei, and awarding_office.agency_code agree.

Transactions completeness

/api/contracts/{key}/transactions/ returns the FPDS modifications Tango has for that award. Each item: modification_number, transaction_date, obligated, description, action_type. Transactions are loaded from FPDS at award materialization — for an award that exists in /api/contracts/, the transaction list reflects what FPDS has published for that award up to the most recent ingest cycle.

If your downstream wants USAspending's per-transaction row shape (one row per mod, denormalized), don't try to recreate it from the bundled award. Hit the transactions detail endpoint per key, or request ?shape=...,transactions(*) on the list call to get the mods inline.

NULL semantics

USAspending bulk CSVs and Tango JSON disagree on how absent values look. Plan for both:

  • USAspending CSVs emit empty strings (""), and in some columns sentinel strings like "None" or "NULL". Your ingest probably already has cleanup for this.
  • Tango JSON uses JSON null for absent values and omits keys from shaped responses when the field isn't requested or isn't present. A shaped response with ?shape=key,piid,recipient(uei) will not include recipient at all if the recipient is unresolved — vs an empty object — so check key presence, not just truthiness, when transforming.
  • Numeric fields (obligated, total_contract_value) come back as JSON numbers, not strings. USAspending's CSVs are strings — your existing parser needs to handle both or you'll write strings to a numeric column.
  • DUNS on contracts is not exposed — see the field-mapping note above. Don't try to map recipient_duns through /api/contracts/; it'll be missing every time. DUNS lives on subawards (recipient_duns) for legacy joins.
  • awarding_office codes can be null independently. An award can have a known department (department_code = "047") but a missing sub-tier or office. Don't assume the three levels are co-populated.

Pitfalls

  • Don't filter agency by name unless you have to. awarding_agency=047 (CGAC code) is exact for GSA. awarding_agency=GSA does best-effort name resolution — great for ad-hoc, can match more or fewer orgs than you expect on a production job. Use CGAC for reproducibility. The FPDS sub-tier code (e.g. 4700) is stored on org records but isn't a reliable filter on /api/contracts/ — use CGAC.
  • fiscal_year is federal FY (Oct–Sep). Same as USAspending; just don't accidentally pass a calendar year.
  • obligated_gte / obligated_lte are exact USD. Pass 1000000, not "1M".
  • Multi-value filters use |. naics=541511|541512 matches either. Same for psc and other scalar filters.
  • Cursor pagination doesn't accept page=. /api/contracts/ uses keyset cursors — follow next or pass cursor=<token>. /api/subawards/ is the opposite — page-number pagination with page=N. They are not interchangeable.
  • Default response shape is opinionated. It's a sensible middle ground, not the full FPDS record. Pass ?shape= explicitly for production jobs; ?show_shapes=true lists the available fields.
  • PIID is not unique. Use key for upserts and joins. PIID alone repeats across agencies.

What's out of scope today

  • FABS / assistance data. No grants, loans, direct payments, cooperative agreements. Keep that on USAspending bulk. See the Grants API reference for what Tango does ship on the grants side — it's Grants.gov data, not FABS-derived.
  • Prime-contract outlays. Obligations only on /api/contracts/. prime_award_total_outlayed_amount is exposed on subawards.
  • Per-mod change feed across all contracts. No global "give me every FPDS mod since timestamp X" endpoint. Polling on award_date_gte catches new awards; webhook filter alerts cover filtered slices in near-real-time; per-award transaction history is on /api/contracts/{key}/transactions/.