Normattiva Pipeline
This page documents the current operational behavior of the Normattiva download and ingest flow.
Entry Points
- Downloader entrypoint: API: Normattiva Download (backend/src/law_graph/pipelines/normattiva/download/italy.py)
- Ingest entrypoint: API: Normattiva Ingest (backend/src/law_graph/pipelines/normattiva/ingest/__init__.py)
- Async investigation notes: Normattiva Async Notes
Operational Truth (Current)
- M (richiestaExport="M", multivigente) is reliable for automation and regular synchronization.
- Bulk historical windows for V/O are best-effort and frequently return empty archives in async export flows.
- Per-act codiceRedazionale exports for V/O can produce meaningful results and are currently the practical strategy for targeted historical backfill.
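The difference between reliable and best-effort modes can be made explicit in how an empty archive is handled. The following is a minimal sketch, not the pipeline's actual code: it assumes exports arrive as ZIP payloads, and the names archive_is_empty, classify_export, and make_zip are hypothetical. Treating an empty M archive as a failure is also an assumption made for illustration.

```python
import io
import zipfile

# Per this page, only M is treated as reliable; V and O are best-effort.
RELIABLE_MODES = {"M"}


def archive_is_empty(payload: bytes) -> bool:
    """Return True if the ZIP payload contains no file entries."""
    with zipfile.ZipFile(io.BytesIO(payload)) as zf:
        return not any(not info.is_dir() for info in zf.infolist())


def classify_export(mode: str, payload: bytes) -> str:
    """Classify an export result as 'ok', 'skipped', or 'failed'."""
    if not archive_is_empty(payload):
        return "ok"
    # Assumption: an empty M archive is unexpected and should surface as a
    # failure, while an empty V/O archive is routine and is only skipped.
    return "failed" if mode in RELIABLE_MODES else "skipped"


def make_zip(files: dict[str, str]) -> bytes:
    """Build an in-memory ZIP, used here only to exercise the sketch."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()
```

With this shape, a caller can keep running after a "skipped" V/O window and stop or alert only on "failed".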
What We Are Doing Now
- Run continuous/regular syncs with multivigente (M) as the default strategy.
- Persist inventory and ingest state in manifest DBs to support incremental operation and reproducible backfill intent.
- Treat broad V/O async windows as opportunistic (non-blocking) inputs, not core guarantees.
- Use targeted backfill techniques where explicit act identities are known.
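To illustrate how persisted state supports incremental operation, here is a minimal sketch of a manifest table using SQLite from the standard library. The schema, the status values, and the function names are assumptions made for this example; the project's real manifest DBs may look quite different.

```python
import sqlite3

# Hypothetical schema: one row per (act, export mode) with its latest status.
SCHEMA = """
CREATE TABLE IF NOT EXISTS manifest (
    act_id TEXT NOT NULL,
    mode   TEXT NOT NULL,
    status TEXT NOT NULL,
    PRIMARY KEY (act_id, mode)
)
"""


def open_manifest(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the manifest database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, act_id: str, mode: str, status: str) -> None:
    """Insert or overwrite the status for one act in one export mode."""
    conn.execute(
        "INSERT OR REPLACE INTO manifest (act_id, mode, status) VALUES (?, ?, ?)",
        (act_id, mode, status),
    )
    conn.commit()


def pending(conn: sqlite3.Connection, mode: str) -> list[str]:
    """Return act ids that have been seen but not yet ingested."""
    rows = conn.execute(
        "SELECT act_id FROM manifest "
        "WHERE mode = ? AND status != 'ingested' ORDER BY act_id",
        (mode,),
    )
    return [row[0] for row in rows]
```

A later run can call pending() to resume where the previous one stopped, instead of re-downloading everything.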
What Is Intentionally Not Guaranteed
- Full bulk historical coverage from async V/O windows.
- Exhaustive automatic discovery of all codiceRedazionale values.
- API behavior stability for historical bulk exports beyond observed M reliability.
Practical Contributor Guidance
- If implementing automation, optimize for M first.
- If historical V/O coverage is required, design per-act workflows with explicit IDs and retry/reporting controls.
- Keep docs and CLI help explicit about the distinction between reliable and best-effort modes.
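A per-act workflow with retry and reporting controls could look roughly like the sketch below. It is illustrative only: backfill_acts, BackfillReport, and the fetch callback are hypothetical names, and the callback stands in for whatever per-act export call the downloader actually makes.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class BackfillReport:
    succeeded: list[str] = field(default_factory=list)
    failed: dict[str, str] = field(default_factory=dict)  # act id -> last error


def backfill_acts(
    act_ids: Iterable[str],
    fetch: Callable[[str], None],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> BackfillReport:
    """Fetch each act by explicit id, retrying with exponential backoff."""
    report = BackfillReport()
    for act_id in act_ids:
        last_error = ""
        for attempt in range(max_attempts):
            try:
                fetch(act_id)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = str(exc)
                if attempt < max_attempts - 1:
                    sleep(base_delay * 2 ** attempt)
            else:
                report.succeeded.append(act_id)
                break
        else:
            # Every attempt failed: record it and continue with the next act.
            report.failed[act_id] = last_error
    return report
```

Because failures are collected rather than raised, one unavailable act does not block the rest of the batch, and the report can be written out for a later targeted rerun.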
Refactor TODO (Proposed)
- Introduce downloader strategy selection in backend/src/law_graph/pipelines/normattiva/download/runner.py so ItalyDownloader is not hard-wired and future sources/profiles can be configured without structural duplication.
- Narrow the ingest persistence surface by moving parse/finalize row-shape assembly behind IngestManifestDbStore runner-facing intent methods and reducing direct row-contract construction in ingest orchestration.
- Externalize Normattiva client defaults (base URL, origin, referer, retry tuning) into environment/config wiring so deployments can override network behavior without code edits; keep sane defaults in code.
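As a rough illustration of the first and third proposals, the sketch below combines a name-keyed downloader registry with environment-driven client defaults. Everything here is hypothetical: the environment variable names, the placeholder default URL, the registry, and the stand-in ItalyDownloader class (only its name comes from this page) are assumptions, not the project's real wiring.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ClientConfig:
    base_url: str
    max_retries: int


# Placeholder defaults for illustration; not the project's real values.
DEFAULTS = ClientConfig(base_url="https://example.invalid", max_retries=3)


def load_config(env: Mapping[str, str] = os.environ) -> ClientConfig:
    """Read overrides from the environment, falling back to in-code defaults."""
    return ClientConfig(
        base_url=env.get("NORMATTIVA_BASE_URL", DEFAULTS.base_url),
        max_retries=int(env.get("NORMATTIVA_MAX_RETRIES", DEFAULTS.max_retries)),
    )


class ItalyDownloader:
    """Stand-in for the real downloader class."""

    def __init__(self, config: ClientConfig) -> None:
        self.config = config


# Hypothetical registry: new sources or profiles register here by name.
DOWNLOADERS = {"italy": ItalyDownloader}


def make_downloader(name: str, config: ClientConfig):
    """Select a downloader by name instead of hard-wiring one class."""
    try:
        factory = DOWNLOADERS[name]
    except KeyError:
        raise ValueError(f"unknown downloader strategy: {name!r}") from None
    return factory(config)
```

The runner would then call make_downloader with a name taken from configuration, so adding a source means adding a registry entry rather than duplicating the runner.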