# Protests Data Dictionary

This document describes the fields available in the Protests API endpoints (`/api/protests/`). For filtering, ordering, and pagination, see the [Protests API Reference](../api-reference/protests.md).

## Overview

Protests are bid protest records from the Government Accountability Office (GAO) and the Court of Federal Claims (COFC). The API exposes a narrow set of client-facing fields. Source ingest payload details and internal fields (e.g. `external_id`, `data_quality`) are not exposed. Filter by `source_system` (`gao` or `cofc`) to scope results to a single venue.

**Authentication**: Protests endpoints require authentication (API key or OAuth2).

**Date range**: The API only serves protests from **2015 onward**. Pre-2015 records exist in the database but are not available through the API.

**Note**: Both **list** and **detail** endpoints return case-level objects. Each case is identified by `case_id` (a deterministic UUID derived from `source_system` + `base_case_number`). Detail lookup uses `case_id` in the URL path: `GET /api/protests/{case_id}/`. Use `?shape=...,dockets(...)` to expand nested docket records. The default (unshaped) response does not include dockets.
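
As a rough illustration of what "deterministic" means here, the sketch below derives a UUID from the two inputs with Python's `uuid.uuid5`. The namespace, the `:` separator, the normalization, and the helper names `example_case_id` and `detail_path` are all assumptions made for this example. The derivation Tango actually uses is not specified in this document, so the values produced here will not match real `case_id` values.

```python
import uuid

# Hypothetical namespace: the namespace Tango really uses is not documented here.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "protests.example.invalid")


def example_case_id(source_system: str, base_case_number: str) -> uuid.UUID:
    """Derive a stable UUID from the two identifying values (illustrative only)."""
    # Normalize so "GAO"/"B-424214" and "gao"/"b-424214" map to the same ID.
    key = f"{source_system.strip().lower()}:{base_case_number.strip().lower()}"
    return uuid.uuid5(EXAMPLE_NAMESPACE, key)


def detail_path(case_id: uuid.UUID) -> str:
    """Build the detail path in the form documented above."""
    return f"/api/protests/{case_id}/"


same_a = example_case_id("gao", "b-424214")
same_b = example_case_id("GAO", " B-424214 ")
other = example_case_id("cofc", "b-424214")
print(same_a == same_b, same_a == other)
print(detail_path(same_a))
```

In practice, read `case_id` from a list response rather than computing it client-side. The sketch only shows why the same inputs always yield the same identifier.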

## Update Frequency

GAO and COFC protest data is refreshed on the loader schedule, as new source files become available. COFC data is sourced from CourtListener (dockets and opinions).

## Fields (API response)

Only the following fields are returned in list and detail responses (and are available via `shape`). The one exception is `docket_number`, which appears only inside the `dockets(...)` expansion:

| Field | Type | Description | Source |
| ----- | ---- | ----------- | ------ |
| `case_id` | UUID | Deterministic case UUID derived from `source_system` + `base_case_number`. Use for detail lookup: `GET /api/protests/{case_id}/`. | Tango |
| `source_system` | String | Source venue identifier (`gao` or `cofc`). | Tango |
| `case_number` | String | Base case number used to group sub-dockets (for example `b-424214` from `B-424214.1`). Model field: `base_case_number`. Use with `decision_date` for decision-level counting. | Derived |
| `docket_number` | String | Source docket identifier (for example `b-424214.1`, `b-424214.2`). Model field: `case_number`. Only available inside the `dockets(...)` expansion. Use to distinguish sub-dockets under the same `case_number`. | Source |
| `title` | String | Protest title. | Source |
| `protester` | String | Protester name. | Source |
| `agency` | String | Protested agency. | Source |
| `solicitation_number` | String | Solicitation number when provided. | Source |
| `case_type` | String | Protest case type. | Source |
| `outcome` | String | Protest outcome, when available. | Source |
| `filed_date` | Date | Date the protest was filed. | Source |
| `posted_date` | Date | Date the protest was posted. | Source |
| `decision_date` | Date | Decision date, when available. | Source |
| `due_date` | Date | Protest due date, when available. | Source |
| `docket_url` | String | Source docket URL. | Source |
| `decision_url` | String | Source decision URL, when available. | Source |

**Opt-in via shape only:** `digest` — when requested with `?shape=...,digest`, the value from `raw_data.digest` (e.g. decision summary text) is returned. Not included in the default list/detail response.
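
The relationship between `docket_number` and `case_number` in the table above, and the decision-level counting it mentions, can be sketched as follows. This is a minimal example under stated assumptions: the suffix rule only covers the GAO-style `B-424214.1` pattern shown in the table, the rows are made-up sample data, and it assumes `decision_date` can be requested per docket, which this document does not confirm. The helper names are hypothetical.

```python
import re


def base_case_number(docket_number: str) -> str:
    """Strip a trailing numeric sub-docket suffix: 'B-424214.1' -> 'b-424214'."""
    return re.sub(r"\.\d+$", "", docket_number.strip().lower())


def count_decisions(docket_rows: list) -> int:
    """Count distinct (case_number, decision_date) pairs, skipping undecided rows."""
    decided = {
        (base_case_number(row["docket_number"]), row["decision_date"])
        for row in docket_rows
        if row.get("decision_date")
    }
    return len(decided)


# Made-up sample rows; the dates are illustrative, not real decisions.
rows = [
    {"docket_number": "b-424214.1", "decision_date": "2025-01-10"},
    {"docket_number": "b-424214.2", "decision_date": "2025-01-10"},
    {"docket_number": "b-424214.3", "decision_date": "2025-03-02"},
    {"docket_number": "b-424215.1", "decision_date": None},
]

print(base_case_number("B-424214.1"))  # b-424214
print(count_decisions(rows))           # 2
```

Two sub-dockets decided on the same date count as one decision, while a later decision under the same base case counts separately.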

## Resolution Fields (shape expansions)

Protests support entity and organization resolution via the Bayesian resolver. These fields are available as shape expansions and return `null` when no confident match exists.

### `resolved_protester(...)` — Entity Resolution

| Field | Type | Description |
| ----- | ---- | ----------- |
| `uei` | String | Unique Entity Identifier of the matched entity. |
| `name` | String | Display name of the matched entity. |
| `match_confidence` | String | `"confident"` (high confidence, auto-linkable) or `"review"` (medium confidence, needs human review). |
| `rationale` | String | Human-readable explanation (e.g. `"Exact name match"`, `"Similar name match; multiple entities share this name"`). |

### `resolved_agency(...)` — Organization Resolution

| Field | Type | Description |
| ----- | ---- | ----------- |
| `key` | String (UUID) | UUID primary key of the matched organization in Tango. |
| `name` | String | Display name of the matched organization. |
| `match_confidence` | String | `"confident"` or `"review"`. |
| `rationale` | String | Human-readable explanation (e.g. `"Exact name match; parent agency confirmed"`). |

**Confidence mapping**: Internal resolution tiers map to public labels as follows: `high` → `"confident"`, `medium` → `"review"`. Low-confidence and no-match results are excluded (return `null`).
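
The mapping above is small enough to express as a lookup table. The sketch below is illustrative only: `public_confidence` and `is_auto_linkable` are hypothetical helper names, and the internal tier strings `"low"` and `None` (for no match) are assumed spellings.

```python
from typing import Optional

# Internal tier -> public label, as described in the confidence mapping above.
PUBLIC_LABELS = {"high": "confident", "medium": "review"}


def public_confidence(internal_tier: Optional[str]) -> Optional[str]:
    """Return the public label, or None when the result is excluded."""
    return PUBLIC_LABELS.get(internal_tier)


def is_auto_linkable(resolved: Optional[dict]) -> bool:
    """Client-side check on a resolved_protester / resolved_agency value."""
    # The expansion is null (None) when no confident match exists.
    return resolved is not None and resolved.get("match_confidence") == "confident"


print(public_confidence("high"))    # confident
print(public_confidence("medium"))  # review
print(public_confidence("low"))     # None
print(is_auto_linkable(None))       # False
```

A client would typically link `"confident"` matches automatically and queue `"review"` matches for a person to check.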

The following are **not** exposed in the API: `external_id`, `data_quality`, `source_last_updated`, `created`, `modified`, `raw_data`, `field_provenance`.

## Data Quality (internal)

Every protest record has an internal `data_quality` tier (not exposed in the API response). Records with `quarantined` or `unknown` quality are **always excluded from API responses**. Quality is recomputed on each ingestion cycle.
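
The exclusion rule can be pictured as a simple filter. This is only a sketch of the described behavior, not the server's implementation: the `visible_records` helper and the `"ok"` tier name are hypothetical, and treating a missing tier as `unknown` is an assumption.

```python
# Tiers that are always excluded from API responses, per the paragraph above.
EXCLUDED_QUALITY = {"quarantined", "unknown"}


def visible_records(records: list) -> list:
    """Keep only records whose internal quality tier is servable."""
    return [
        record
        for record in records
        # Assumption: a record with no tier set is treated as "unknown".
        if record.get("data_quality", "unknown") not in EXCLUDED_QUALITY
    ]


# Made-up records; "ok" is a placeholder for any non-excluded tier.
sample = [
    {"case_number": "b-424214", "data_quality": "ok"},
    {"case_number": "b-424215", "data_quality": "quarantined"},
    {"case_number": "b-424216", "data_quality": "unknown"},
    {"case_number": "b-424217"},
]

print([r["case_number"] for r in visible_records(sample)])  # ['b-424214']
```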

## Shaping

Both list and detail return case-level objects. `shape` lets you request a subset of fields. Use the `dockets(...)` expansion to include nested docket records.

Allowed shape fields (case-level):

- `case_id`, `source_system`, `case_number`, `title`, `protester`, `agency`
- `solicitation_number`, `case_type`, `outcome`
- `filed_date`, `posted_date`, `decision_date`, `due_date`
- `docket_url`, `decision_url`
- `digest` (from `raw_data.digest`; opt-in only)
- `dockets` (expand with docket fields, e.g. `dockets(case_number,docket_number,filed_date)`)
- `resolved_protester` (expand: `uei`, `name`, `match_confidence`, `rationale`)
- `resolved_agency` (expand: `key`, `name`, `match_confidence`, `rationale`)

Inside `dockets(...)`, one additional field is available:

- `docket_number` (specific sub-docket identifier, e.g. `b-424214.1`)

Examples:

```bash
# List: case-level and nested dockets
GET /api/protests/?shape=case_id,case_number,title,dockets(docket_number,filed_date,outcome)

# List: case-level only
GET /api/protests/?shape=case_id,title,outcome,decision_date

# Detail with nested dockets
GET /api/protests/{case_id}/?shape=case_id,title,dockets(docket_number,filed_date)
```
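
For clients that assemble `shape` values programmatically, a small helper along these lines may be useful. It is a sketch, not part of any official client: `build_shape` and `list_path` are hypothetical names, and the validation only checks top-level field and expansion names against the lists in this section (sub-fields inside an expansion are passed through unchecked).

```python
from typing import Dict, List, Optional

# Case-level fields and expansions listed in this section.
CASE_FIELDS = {
    "case_id", "source_system", "case_number", "title", "protester", "agency",
    "solicitation_number", "case_type", "outcome",
    "filed_date", "posted_date", "decision_date", "due_date",
    "docket_url", "decision_url", "digest",
}
EXPANSIONS = {"dockets", "resolved_protester", "resolved_agency"}


def build_shape(fields: List[str],
                expansions: Optional[Dict[str, List[str]]] = None) -> str:
    """Compose a shape string such as 'case_id,title,dockets(docket_number)'."""
    expansions = expansions or {}
    unknown = [f for f in fields if f not in CASE_FIELDS]
    unknown += [e for e in expansions if e not in EXPANSIONS]
    if unknown:
        raise ValueError(f"unsupported shape fields: {unknown}")
    parts = list(fields)
    parts += [f"{name}({','.join(sub)})" for name, sub in expansions.items()]
    return ",".join(parts)


def list_path(shape: str) -> str:
    """Build the list path with a shape query parameter."""
    return f"/api/protests/?shape={shape}"


shape = build_shape(
    ["case_id", "case_number", "title"],
    {"dockets": ["docket_number", "filed_date", "outcome"]},
)
print(list_path(shape))
```

The printed path matches the first example above. Rejecting unknown names before sending the request catches attempts to ask for internal-only fields such as `raw_data`.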

`raw_data` and field-level provenance metadata are internal-only and not exposed in API responses.

## Data Sources

- **GAO** – primary source for protest records.
- **CourtListener** – source for COFC dockets and opinions.
- **Tango** – `case_id` (deterministic case UUID) and normalized API behavior.
