download_manifest.db Table Reference

This page documents table purpose, key structure, and invariants for download_manifest.db.

Overview

download_manifest.db stores download-time catalog and inventory state:

  • canonical act catalog
  • identifier aliases
  • deduplicated artifact blobs and logical file observations
  • run/event audit trail
  • ambiguity backlog
  • derived coverage view

Table Map (Mermaid Flowchart)

flowchart LR
    dm_act_catalog["Act Catalog (dm_act_catalog)<br/>One canonical row per act"]
    dm_act_alias["Act Alias (dm_act_alias)<br/>External identifiers mapped to acts"]
    dm_act_progress["Enumeration Progress (dm_act_progress)<br/>Resume checkpoints per window"]
    dm_act_meta["Enumeration Metadata (dm_act_meta)<br/>Small workflow key-value state"]
    dm_artifact_blob["Artifact Blob (dm_artifact_blob)<br/>Deduplicated physical content by hash"]
    dm_act_file["Act File Observation (dm_act_file)<br/>Logical file rows linked to act/source/blob"]
    dm_source_ref["Source Reference (dm_source_ref)<br/>Source publication identity"]
    dm_download_run["Download Run (dm_download_run)<br/>One row per download invocation"]
    dm_download_event["Download Event (dm_download_event)<br/>Per-attempt result/audit events"]
    dm_download_ambiguity["Download Ambiguity (dm_download_ambiguity)<br/>Unresolved ownership mapping backlog"]
    act_file_coverage["Act File Coverage View (act_file_coverage)<br/>Derived per-act coverage projection"]

    dm_act_catalog -->|FK: act_pk| dm_act_alias
    dm_act_catalog -->|FK: act_pk| dm_act_file
    dm_artifact_blob -->|FK: blob_id| dm_act_file
    dm_source_ref -->|FK: source_ref_id| dm_act_file
    dm_download_run -->|FK: run_id| dm_download_event
    dm_act_catalog -->|FK: act_pk| dm_download_event
    dm_source_ref -->|FK: source_ref_id| dm_download_event
    dm_download_run -->|FK: run_id| dm_download_ambiguity
    dm_act_catalog -->|FK: act_pk| dm_download_ambiguity
    dm_act_catalog -.->|derived from catalog + files| act_file_coverage
    dm_act_file -.->|derived from catalog + files| act_file_coverage

Core Catalog Tables

dm_act_catalog

Contains:
  • One canonical row per known act identity in the download hub.

Purpose:
  • Canonical act row used by downloader tables.

Keys:
  • Primary key: act_pk

Foreign keys:
  • Referenced by dm_act_alias.act_pk
  • Referenced by dm_act_file.act_pk
  • Referenced by dm_download_event.act_pk (nullable)
  • Referenced by dm_download_ambiguity.act_pk (nullable)

Invariants:
  • One row per canonical act in this DB instance.
  • act_pk is an internal DB identity, not a cross-service identity.
  • canonical_urn_nir and primary_eli_id are promoted primary identifiers for fast deterministic lookup.

dm_act_alias

Contains:
  • Alias rows mapping external identifier pairs (kind, value) to act_pk.

Purpose:
  • Maps external/domain identifiers to act_pk.

Keys:
  • Primary key: alias_id
  • Unique index: (kind, value) via ux_dm_act_alias_kind_value

Foreign keys:
  • act_pk -> dm_act_catalog.act_pk

Invariants:
  • Each (kind, value) alias pair points to at most one act.
  • Used for canonical URNs and source-specific owner ids.
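The alias invariant above can be illustrated with a minimal sketch. It assumes the database is SQLite (suggested by the .db filename but not stated on this page). Column types, the alias kind 'urn_nir', the sample URNs, and the helper resolve_alias are hypothetical; only the table, column, and index names come from this reference.

```python
import sqlite3

# Hypothetical minimal schema: only columns named on this page, with assumed types.
SCHEMA = """
CREATE TABLE dm_act_catalog (
    act_pk INTEGER PRIMARY KEY,
    canonical_urn_nir TEXT,
    primary_eli_id TEXT
);
CREATE TABLE dm_act_alias (
    alias_id INTEGER PRIMARY KEY,
    act_pk INTEGER NOT NULL REFERENCES dm_act_catalog(act_pk),
    kind TEXT NOT NULL,
    value TEXT NOT NULL
);
CREATE UNIQUE INDEX ux_dm_act_alias_kind_value ON dm_act_alias(kind, value);
"""


def resolve_alias(conn, kind, value):
    """Return the act_pk for an alias pair, or None if the pair is unknown."""
    row = conn.execute(
        "SELECT act_pk FROM dm_act_alias WHERE kind = ? AND value = ?",
        (kind, value),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO dm_act_catalog (act_pk, canonical_urn_nir) VALUES (1, 'urn:example:act-1')")
conn.execute("INSERT INTO dm_act_catalog (act_pk, canonical_urn_nir) VALUES (2, 'urn:example:act-2')")
conn.execute("INSERT INTO dm_act_alias (act_pk, kind, value) VALUES (1, 'urn_nir', 'urn:example:act-1')")

try:
    # Pointing the same (kind, value) pair at a second act violates the unique index.
    conn.execute("INSERT INTO dm_act_alias (act_pk, kind, value) VALUES (2, 'urn_nir', 'urn:example:act-1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because the unique index covers (kind, value) only, one act may still carry many aliases, while a given alias can never resolve to two acts.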

Manifest Enumeration Checkpoint Tables

dm_act_progress

Contains:
  • Resume/checkpoint rows keyed by enumeration kind + window key.

Purpose:
  • Resume-safe checkpoint ledger for manifest windows.

Keys:
  • Composite primary key: (kind, window_key)

Foreign keys:
  • None

Invariants:
  • At most one progress row per (kind, window_key) pair.
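A resume-safe checkpoint write under the composite key might look like the sketch below. It assumes SQLite; the last_position column, the helper save_checkpoint, and the sample kind and window values are hypothetical and only illustrate how the (kind, window_key) primary key keeps one row per window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns: last_position is an illustrative checkpoint payload.
conn.execute(
    """
    CREATE TABLE dm_act_progress (
        kind TEXT NOT NULL,
        window_key TEXT NOT NULL,
        last_position TEXT,
        PRIMARY KEY (kind, window_key)
    )
    """
)


def save_checkpoint(conn, kind, window_key, last_position):
    """Insert or overwrite the single progress row for (kind, window_key)."""
    conn.execute(
        "INSERT OR REPLACE INTO dm_act_progress (kind, window_key, last_position) "
        "VALUES (?, ?, ?)",
        (kind, window_key, last_position),
    )


save_checkpoint(conn, "monthly", "2020-01", "page-3")
save_checkpoint(conn, "monthly", "2020-01", "page-7")  # overwrites the first row
save_checkpoint(conn, "monthly", "2020-02", "page-1")

rows = conn.execute(
    "SELECT kind, window_key, last_position FROM dm_act_progress ORDER BY window_key"
).fetchall()
```

Re-running a window replaces its checkpoint instead of accumulating rows, which is what makes resuming after an interruption safe.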

dm_act_meta

Contains:
  • Small workflow metadata key/value rows for enumeration state.

Purpose:
  • Small metadata key/value state for enumeration workflows.

Keys:
  • Primary key: key

Foreign keys:
  • None

Invariants:
  • At most one value per metadata key.

Artifact Inventory Tables

dm_artifact_blob

Contains:
  • Deduplicated physical blob records keyed by SHA-256 digest.

Purpose:
  • Physical dedupe layer keyed by content digest.

Keys:
  • Primary key: blob_id
  • Unique: sha256

Foreign keys:
  • Referenced by dm_act_file.blob_id

Invariants:
  • One row per unique SHA-256 digest.
  • Multiple dm_act_file rows may reference one blob.
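Content-addressed deduplication of this kind can be sketched as a get-or-create keyed by digest. The example assumes SQLite and a hex-encoded digest stored as text; the helper get_or_create_blob and the sample payloads are hypothetical.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed types: the page names blob_id and sha256 but not their storage classes.
conn.execute(
    """
    CREATE TABLE dm_artifact_blob (
        blob_id INTEGER PRIMARY KEY,
        sha256 TEXT NOT NULL UNIQUE
    )
    """
)


def get_or_create_blob(conn, content: bytes) -> int:
    """Return the blob_id for the content, inserting a row only for a new digest."""
    digest = hashlib.sha256(content).hexdigest()
    # INSERT OR IGNORE is a no-op when the digest already exists.
    conn.execute("INSERT OR IGNORE INTO dm_artifact_blob (sha256) VALUES (?)", (digest,))
    row = conn.execute(
        "SELECT blob_id FROM dm_artifact_blob WHERE sha256 = ?", (digest,)
    ).fetchone()
    return row[0]


first = get_or_create_blob(conn, b"<act>same bytes</act>")
second = get_or_create_blob(conn, b"<act>same bytes</act>")
other = get_or_create_blob(conn, b"<act>different bytes</act>")
blob_count = conn.execute("SELECT COUNT(*) FROM dm_artifact_blob").fetchone()[0]
```

Identical bytes always map to the same blob_id, so any number of logical file rows can share one physical blob record.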

dm_act_file

Contains:
  • Logical file observation rows linking acts, source refs, and blob ids.

Purpose:
  • Logical artifact observations per act.

Keys:
  • Primary key: file_id

Foreign keys:
  • act_pk -> dm_act_catalog.act_pk
  • source_ref_id -> dm_source_ref.source_ref_id (nullable)
  • blob_id -> dm_artifact_blob.blob_id

Invariants:
  • No strict uniqueness on (act_pk, mode, format, path).
  • Repeated logical observations are allowed for audit history.
  • Path integrity is validated operationally against export root.
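The absence of a uniqueness constraint on (act_pk, mode, format, path) means the same logical file can be recorded more than once. A minimal sketch, assuming SQLite, with foreign key clauses omitted for brevity and with hypothetical column types and sample values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign keys are left out here; column types and sample values are assumptions.
conn.execute(
    """
    CREATE TABLE dm_act_file (
        file_id INTEGER PRIMARY KEY,
        act_pk INTEGER NOT NULL,
        source_ref_id INTEGER,
        blob_id INTEGER NOT NULL,
        mode TEXT,
        format TEXT,
        path TEXT
    )
    """
)

observation = (1, None, 10, "original", "xml", "acts/1/original.xml")
for _ in range(2):
    # The same logical observation is accepted twice; nothing enforces uniqueness.
    conn.execute(
        "INSERT INTO dm_act_file (act_pk, source_ref_id, blob_id, mode, format, path) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        observation,
    )

observed = conn.execute("SELECT COUNT(*) FROM dm_act_file").fetchone()[0]
# Consumers that want one row per logical file have to collapse duplicates themselves.
distinct_files = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT act_pk, mode, format, path FROM dm_act_file)"
).fetchone()[0]
```

Queries that need a deduplicated inventory should therefore group or select distinct on the logical key rather than rely on the table to enforce it.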

dm_source_ref

Contains:
  • Source publication identity rows keyed by (source, owner_alias_kind, owner_alias_value).

Purpose:
  • Source publication identity hub (source, owner_alias_kind, owner_alias_value).

Keys:
  • Primary key: source_ref_id
  • Unique: (source, owner_alias_kind, owner_alias_value) via ux_dm_source_ref_source_owner_alias

Foreign keys:
  • Referenced by dm_act_file.source_ref_id (nullable)
  • Referenced by dm_download_event.source_ref_id (nullable)

Invariants:
  • One row per unique source publication key.

Run Audit Tables

dm_download_run

Contains:
  • One run ledger row per download invocation.

Purpose:
  • One row per download command invocation.

Keys:
  • Primary key: run_id

Foreign keys:
  • Referenced by dm_download_event.run_id
  • Referenced by dm_download_ambiguity.run_id (nullable)

Invariants:
  • Run IDs are stable, UUID-like text identifiers.

dm_download_event

Contains:
  • Per-attempt event rows recording request dimensions and outcomes.

Purpose:
  • Attempt/observation event stream for each run.

Keys:
  • Primary key: event_id

Foreign keys:
  • run_id -> dm_download_run.run_id
  • act_pk -> dm_act_catalog.act_pk (nullable)
  • source_ref_id -> dm_source_ref.source_ref_id (nullable)

Invariants:
  • An event may exist without a resolved act_pk.
  • Each event record preserves request dimensions and result status.
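The run/event relationship can be sketched as follows, assuming SQLite with foreign key enforcement switched on for the connection. The status column, its sample values, and the NOT NULL on run_id are assumptions; uuid4 is just one way to produce a UUID-like text id.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE dm_download_run (run_id TEXT PRIMARY KEY);
    CREATE TABLE dm_act_catalog (act_pk INTEGER PRIMARY KEY);
    CREATE TABLE dm_download_event (
        event_id INTEGER PRIMARY KEY,
        run_id TEXT NOT NULL REFERENCES dm_download_run(run_id),
        act_pk INTEGER REFERENCES dm_act_catalog(act_pk),  -- nullable
        status TEXT NOT NULL                               -- hypothetical column
    );
    """
)

run_id = str(uuid.uuid4())
conn.execute("INSERT INTO dm_download_run (run_id) VALUES (?)", (run_id,))

# An event may be recorded before its act is resolved: act_pk stays NULL.
conn.execute(
    "INSERT INTO dm_download_event (run_id, act_pk, status) VALUES (?, NULL, 'not_found')",
    (run_id,),
)

try:
    # An event pointing at an unknown run is rejected by the foreign key.
    conn.execute(
        "INSERT INTO dm_download_event (run_id, act_pk, status) "
        "VALUES ('no-such-run', NULL, 'ok')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

unresolved = conn.execute(
    "SELECT COUNT(*) FROM dm_download_event WHERE act_pk IS NULL"
).fetchone()[0]
```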

Ambiguity Backlog Table

dm_download_ambiguity

Contains:
  • Backlog rows for unresolved ownership/alias mapping ambiguities.

Purpose:
  • Tracks unresolved ownership/alias mapping ambiguity.

Keys:
  • Primary key: ambiguity_id
  • Unique constraint: (act_pk, reason)

Foreign keys:
  • act_pk -> dm_act_catalog.act_pk (nullable)
  • run_id -> dm_download_run.run_id (nullable)

Invariants:
  • act_pk must remain nullable (runtime contract check).
  • resolved is an integer-encoded boolean (0/1).
  • observed_keys_json and candidate_act_pks_json are JSON payloads.
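One consequence of combining a nullable act_pk with the (act_pk, reason) unique constraint is worth spelling out. In SQLite, NULL values count as distinct in a unique constraint, so several unresolved rows with the same reason can coexist, while a resolved act gets at most one row per reason. The sketch below assumes SQLite; the CHECK on resolved, the helper record_ambiguity, and the reason string are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dm_download_ambiguity (
        ambiguity_id INTEGER PRIMARY KEY,
        act_pk INTEGER,                       -- nullable by contract
        run_id TEXT,
        reason TEXT NOT NULL,
        resolved INTEGER NOT NULL DEFAULT 0 CHECK (resolved IN (0, 1)),  -- assumed check
        observed_keys_json TEXT NOT NULL,
        candidate_act_pks_json TEXT NOT NULL,
        UNIQUE (act_pk, reason)
    )
    """
)


def record_ambiguity(conn, act_pk, reason, observed_keys, candidate_act_pks):
    """Store one backlog row, serializing both payloads as JSON text."""
    conn.execute(
        "INSERT INTO dm_download_ambiguity "
        "(act_pk, reason, observed_keys_json, candidate_act_pks_json) VALUES (?, ?, ?, ?)",
        (act_pk, reason, json.dumps(observed_keys), json.dumps(candidate_act_pks)),
    )


# Two rows with NULL act_pk and the same reason are both accepted.
record_ambiguity(conn, None, "multiple_owners", {"eli": "x"}, [1, 2])
record_ambiguity(conn, None, "multiple_owners", {"eli": "y"}, [3, 4])
record_ambiguity(conn, 1, "multiple_owners", {"eli": "z"}, [1, 5])

try:
    # A second row for the same non-NULL (act_pk, reason) violates the constraint.
    record_ambiguity(conn, 1, "multiple_owners", {"eli": "z"}, [1, 5])
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_rows = conn.execute(
    "SELECT COUNT(*) FROM dm_download_ambiguity WHERE act_pk IS NULL"
).fetchone()[0]
candidates = json.loads(
    conn.execute(
        "SELECT candidate_act_pks_json FROM dm_download_ambiguity WHERE act_pk = 1"
    ).fetchone()[0]
)
```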

Derived View

act_file_coverage (view)

Contains:
  • Read-only per-act aggregate coverage projections derived from act/file rows.

Purpose:
  • Read-only per-act coverage rollup from catalog + file rows.

Key shape:
  • One row per act_pk projection.

Base relations:
  • dm_act_catalog left-joined with dm_act_file

Invariants:
  • Rebuilt idempotently by storage initialization.
  • Derived state only; not a source-of-truth table.
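An idempotent rebuild of such a view can be sketched as drop-then-create over the left join. This assumes SQLite; the file_count column and the function init_coverage_view are hypothetical, since this page does not list the view's actual columns.

```python
import sqlite3


def init_coverage_view(conn):
    """Recreate the view from scratch so repeated initialization is harmless."""
    conn.executescript(
        """
        DROP VIEW IF EXISTS act_file_coverage;
        CREATE VIEW act_file_coverage AS
        SELECT c.act_pk AS act_pk,
               COUNT(f.file_id) AS file_count  -- hypothetical coverage column
        FROM dm_act_catalog AS c
        LEFT JOIN dm_act_file AS f ON f.act_pk = c.act_pk
        GROUP BY c.act_pk;
        """
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dm_act_catalog (act_pk INTEGER PRIMARY KEY);
    CREATE TABLE dm_act_file (file_id INTEGER PRIMARY KEY, act_pk INTEGER NOT NULL);
    INSERT INTO dm_act_catalog (act_pk) VALUES (1), (2);
    INSERT INTO dm_act_file (act_pk) VALUES (1), (1);
    """
)

init_coverage_view(conn)
init_coverage_view(conn)  # second call succeeds and leaves the same view in place

coverage = conn.execute(
    "SELECT act_pk, file_count FROM act_file_coverage ORDER BY act_pk"
).fetchall()
```

The left join keeps acts that have no file rows, so they appear with a zero count instead of disappearing from the projection.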