Formats Comparison

🧠 Why they differ so much

Each export format reflects a different layer of the Normattiva publishing pipeline:

Layer | Format | Origin
Legal text layer | Akoma Ntoso (AKN) | The official, structured XML standard used for legislation (FRBR/ELI, anchors, legal semantics).
Editorial database layer | JSON | Normattiva’s internal content-management export; focuses on metadata, versioning, and links between acts.
Legacy NIR XML | NIR/Legge… XML | The older Italian Norme in Rete XML schema (pre-AKN), still useful as a fallback for some identifiers/labels.
Presentation layer | HTML | The rendering for public viewing: lots of <div>, <span>, anchors, little semantics.

They don’t come from the same serialization of one object; they come from different internal models. That’s why field sets and structures differ so dramatically.


📦 What extra information each format gives you

🟩 JSON

✅ Extra information

  • Cross-act change tracking (aggiornamentiAtto, attiAggiornati)
  • Temporal versioning (dataVigoreVersione, inizioVigore, fineVigore)
  • Internal identifiers (idNir, idDettNir, idInterno, idProgrNir)
  • Editorial states and flags (abrogato, modificheAttive, errore)
  • “Database” fields (Gazzetta info, rubrica, codiceRedazionale)

🚫 Missing

  • Full structural hierarchy (chapters, paragraphs)
  • FRBR/ELI metadata lattice
  • Textual modification instructions (<textualMod>)

🔧 Best for

  • Incremental syncs
  • Building the temporal graph (what changed / when)
  • Analytics & indexing (fast Parquet conversion)
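
As a rough illustration of the temporal-graph use case, the sketch below sorts the versions of an act by in-force start date and looks up which version covers a given day. The field names inizioVigore and fineVigore come from the list above; everything else is an assumption made for the example, in particular the record shape (a flat list of dicts), ISO YYYY-MM-DD date strings, and an empty or missing fineVigore meaning "still in force". The real export may differ.

```python
from datetime import date
from typing import Optional


def parse_day(value: Optional[str]) -> Optional[date]:
    """Parse an ISO 'YYYY-MM-DD' string; empty or missing values mean open-ended."""
    return date.fromisoformat(value) if value else None


def build_timeline(versions: list[dict]) -> list[tuple[date, Optional[date]]]:
    """Return (start, end) in-force intervals sorted by start date.

    Assumes every record carries a non-empty inizioVigore.
    """
    intervals = [
        (parse_day(v["inizioVigore"]), parse_day(v.get("fineVigore")))
        for v in versions
    ]
    return sorted(intervals, key=lambda pair: pair[0])


def in_force_on(timeline, day: date):
    """Return the first interval covering `day`, or None (end date is inclusive)."""
    for start, end in timeline:
        if start <= day and (end is None or day <= end):
            return (start, end)
    return None


# Hypothetical sample records, for illustration only.
sample = [
    {"inizioVigore": "2020-01-01", "fineVigore": None},
    {"inizioVigore": "2015-06-01", "fineVigore": "2019-12-31"},
]
timeline = build_timeline(sample)
current = in_force_on(timeline, date(2021, 3, 1))
```

Keeping the intervals as plain tuples makes it easy to load them into a dataframe or Parquet file later.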

🟦 Akoma Ntoso (AKN)

✅ Extra information

  • Complete legal structure: <body>, <chapter>, <article>, <paragraph> etc. with @eId anchors
  • FRBR / ELI metadata: Work / Expression / Manifestation, authors, publication, workflow
  • Lifecycle and textualMod elements = explicit “amend X → insert Y”
  • Meta / references / analysis = cross-links with rich provenance

🚫 Missing

  • Normattiva-specific update registry (no aggiornamentiAtto)
  • In-force intervals per element (AKN has lifecycle metadata, but not at element level)
  • Editorial internal IDs

🔧 Best for

  • Semantic graph (citations, references)
  • EU interoperability (ELI/FRBR alignment)
  • NLP / RAG / search on legal text
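
To make the "@eId anchors" point concrete, here is a minimal sketch that walks an AKN document and collects the eId of each structural element. It matches on local element names so it does not depend on the exact namespace version. The sample fragment and the eId values in it are hypothetical and heavily trimmed; the function names are illustrative only.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def collect_anchors(akn_xml: str, wanted=("chapter", "article", "paragraph")):
    """Return (element name, eId) pairs in document order."""
    root = ET.fromstring(akn_xml)
    return [
        (local_name(el.tag), el.attrib["eId"])
        for el in root.iter()
        if local_name(el.tag) in wanted and "eId" in el.attrib
    ]


# Hypothetical, heavily trimmed fragment for illustration only.
sample = """
<akomaNtoso xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0">
  <act><body>
    <article eId="art_1">
      <paragraph eId="art_1__para_1"><content><p>Text</p></content></paragraph>
    </article>
  </body></act>
</akomaNtoso>
"""
anchors = collect_anchors(sample)
```

The resulting pairs can serve as stable keys when chunking text for search or linking citations to their targets.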

🟨 NIR XML

✅ Extra information

  • Italian-specific β€œNIR” descriptors (intestazione, descrittori, urn, redazione)
  • Often contains labels/identifiers that are absent or inconsistent across JSON/AKN

🚫 Missing

  • FRBR/ELI semantics (these came with AKN, which replaced NIR)
  • Update relationships (JSON-only)

🔧 Best for

  • Fallback extraction when AKN/JSON are missing a value
  • Bridging legacy datasets and validating identifiers
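
The fallback role described above can be sketched as a simple priority lookup: take a value from JSON or AKN when present, and only fall back to NIR XML when both are empty. The source keys ("json", "akn", "nir"), the flat per-source dicts, and the sample values are all assumptions for illustration; they stand in for whatever each format's parser actually produces.

```python
from typing import Mapping, Optional

# Priority order follows the recommendation in the text: JSON and AKN first, NIR XML last.
PRIORITY = ("json", "akn", "nir")


def resolve_field(field: str, sources: Mapping[str, Mapping[str, Optional[str]]]):
    """Return (value, source name) from the first source with a non-empty value."""
    for name in PRIORITY:
        value = sources.get(name, {}).get(field)
        if value:
            return value, name
    return None, None


def conflicting_sources(field: str, sources) -> list:
    """List sources whose non-empty value disagrees with the resolved one."""
    resolved, _ = resolve_field(field, sources)
    return [
        name
        for name in PRIORITY
        if sources.get(name, {}).get(field) not in (None, "", resolved)
    ]


# Hypothetical extracted values, for illustration only.
sample = {
    "json": {"urn": ""},
    "akn": {},
    "nir": {"urn": "urn:example:act-1"},
}
value, origin = resolve_field("urn", sample)
conflicts = conflicting_sources("urn", sample)
```

Recording which source supplied each value keeps the fallback auditable, and the conflict list doubles as a cheap identifier validation step.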

🟥 HTML

✅ Extra information

  • Readable text, formatting, anchors (id, href)
  • Browser-rendered layout

🚫 Missing

  • All semantic data (everything is a <div> or <span>)

🔧 Best for

  • UI / OCR comparison / anchor extraction (e.g., to detect where articles start)
  • Not for canonical data
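
For the anchor-extraction use case, a small sketch using only the standard library is shown below. It collects every id and href attribute in document order and ignores everything else, in line with treating HTML as a non-canonical layer. The sample markup and the class name are hypothetical; real Normattiva pages will have a different structure.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect id and href attribute values in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:  # attribute present without a value
                continue
            if name == "id":
                self.ids.append(value)
            elif name == "href":
                self.hrefs.append(value)


# Hypothetical markup, for illustration only.
sample = (
    '<div id="art1"><span>Art. 1</span> <a href="#art2">next</a></div>'
    '<div id="art2"></div>'
)
collector = AnchorCollector()
collector.feed(sample)
collector.close()
```

The collected ids can then be compared against AKN eId anchors to spot where the rendered page and the structured text diverge.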

🧭 So, what should you actually do now?

Goal | Recommended format(s) | Why
Synchronize acts, detect changes, build timeline graph | ✅ JSON (Multivigente) | Contains all change and validity metadata.
Extract structured text and legal anchors | ✅ AKN (Multivigente) | Full text + FRBR/ELI + explicit anchors.
Fill missing identifiers/labels (fallbacks) | 🟡 NIR XML (fallback) | Legacy descriptors sometimes missing elsewhere.
Presentation or anchor alignment | 🟡 HTML (light use) | Only for comparing visible anchors.

In other words:

  • Use JSON as your temporal backbone.
  • Use AKN as your semantic structure.
  • Use NIR XML only as a fallback/auxiliary layer.
  • Treat HTML (if used at all) as an auxiliary layer.

🧩 Why you should test others (lightly)

Yes, test them, but only to extract what JSON/AKN lack.

  • From NIR XML, capture urn, pubblicazione[@num|@norm], and redazione descriptors.

  • From HTML, optionally collect @id/@href anchors for front-end search linking.

No need to integrate them fully; treat them as auxiliary layers.

⚖️ TL;DR recommendation

Function | Preferred format
Incremental sync / change tracking | JSON
Full legal structure / EU graph | AKN
Missing identifier/label fallback | NIR XML
Visual/anchor validation | HTML

Format comparisons

Comparison of the Classic (C) and Responsive (R) attribute exports for AKN and JSON: the two are identical.
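
One way such an identity check could be reproduced is sketched below: JSON exports are compared after parsing (so key order and whitespace do not matter), and XML exports are compared after C14N 2.0 canonicalization (so attribute order and empty-tag style do not matter). This is only an assumption about how one might verify the claim, not a description of how the comparison above was performed; the function names and sample strings are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def same_json(a: str, b: str) -> bool:
    """Compare two JSON documents, ignoring key order and whitespace."""
    return json.loads(a) == json.loads(b)


def same_xml(a: str, b: str) -> bool:
    """Compare two XML documents after C14N 2.0 canonicalization."""
    return ET.canonicalize(a, strip_text=True) == ET.canonicalize(b, strip_text=True)


# Hypothetical stand-ins for a Classic and a Responsive export.
json_equal = same_json('{"a": 1, "b": [1, 2]}', '{"b": [1, 2], "a": 1}')
xml_equal = same_xml('<a b="1" c="2"> <x/> </a>', '<a c="2" b="1"><x></x></a>')
```

Note that strip_text=True also discards leading and trailing whitespace inside text nodes, which is usually acceptable for a structural comparison but would hide whitespace-only differences in the legal text.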