Exact Matching

Exact matching is the second predicate run in the pipeline, producing the highest-confidence non-distance matches by aligning entries through their shared UIC reference numbers.

flowchart TB
    CTX["MatchingContext"] --> UA["atlas.get_unmatched_records()"]
    CTX --> OSM["osm.get_by_uic(uic_ref)<br/><i>excl. used + siblings + is_station nodes</i>"]
    UA -->|group by uic_ref| G["For each UIC group"]
    G --> C{"How many<br/>on each side?"}
    C -->|"0 OSM"| X["No match"]
    C -->|"1 OSM, N ATLAS"| M1["All ATLAS → single OSM<br/>(many-to-one)"]
    C -->|"N OSM, 1 ATLAS"| M2["ATLAS → all OSM<br/>(one-to-many)"]
    C -->|"N OSM, M ATLAS"| D{"designation<br/>== local_ref?"}
    D -->|paired| M3["1:1 matches"]
    D -->|unpaired| X
    M1 & M2 & M3 -->|"ctx.commit()"| OUT["MatchRecord entities"]

Overview

The UIC (Union Internationale des Chemins de fer) reference is a standardized identifier used across European rail systems.

| Dataset | UIC Field | Domain Model Field | Example |
|---------|-----------|--------------------|---------|
| ATLAS | number | AtlasNode.uic_ref | 8503000 |
| OSM | uic_ref tag | OsmNode.uic_ref | 8503000 |

Result: 21,769 exact matches

How It Works

ExactUicPredicate groups all unmatched AtlasNode entries by uic_ref, then queries ctx.osm.get_by_uic() for each UIC. Matches are recorded immediately via ctx.commit().
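The grouping step can be illustrated with a minimal sketch. The `AtlasStub` class and the `group_by_uic` helper are hypothetical stand-ins, not the project's actual types, and skipping records without a `uic_ref` is an assumption made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AtlasStub:
    """Hypothetical stand-in for AtlasNode; only the fields the sketch needs."""
    record_id: str
    uic_ref: str


def group_by_uic(records):
    """Group records by uic_ref, preserving first-seen order of the keys."""
    groups = defaultdict(list)
    for rec in records:
        # Assumption: a record without a UIC reference cannot take part
        # in exact matching, so it is left out of every group.
        if rec.uic_ref:
            groups[rec.uic_ref].append(rec)
    return dict(groups)


records = [
    AtlasStub("a", "8503000"),
    AtlasStub("b", "8503000"),
    AtlasStub("c", "8507000"),
    AtlasStub("d", ""),
]
groups = group_by_uic(records)
```

Each key of `groups` would then be passed to the OSM lookup, one query per UIC.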

Implementation note: exact matches store the actual haversine distance between the ATLAS and OSM coordinates.
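For reference, the haversine distance can be computed as in the sketch below. The function name, the metre unit, and the mean Earth radius of 6,371 km are assumptions for illustration; the project's own helper may differ.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 so rounding error cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))
```

Storing the real distance, rather than a placeholder, keeps exact matches comparable with matches produced by distance-based stages.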

Matching Scenarios

| Scenario | Example | Resolution |
|----------|---------|------------|
| 1 OSM → N ATLAS | Single node, multiple platforms | All ATLAS entries match to one node |
| N OSM → 1 ATLAS | Multiple nodes, single platform | One platform matches all nodes |
| N OSM ↔ M ATLAS | Multiple both sides | Disambiguate by designation ↔ local_ref |
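The case split above can be sketched as a small classifier. The `Scenario` enum and both helper functions are hypothetical names; treating a 1:1 group as the many-to-one case with N = 1 is an assumption.

```python
from enum import Enum


class Scenario(Enum):
    NO_MATCH = "no OSM candidates"
    MANY_TO_ONE = "all ATLAS entries -> single OSM node"
    ONE_TO_MANY = "single ATLAS entry -> all OSM nodes"
    DISAMBIGUATE = "pair by designation / local_ref"


def classify(n_atlas, n_osm):
    """Pick the scenario for one UIC group from the counts on each side."""
    if n_atlas == 0 or n_osm == 0:
        return Scenario.NO_MATCH
    if n_osm == 1:
        return Scenario.MANY_TO_ONE  # also covers the plain 1:1 case
    if n_atlas == 1:
        return Scenario.ONE_TO_MANY
    return Scenario.DISAMBIGUATE


def pair_simple(atlas_ids, osm_ids):
    """Build (atlas_id, osm_id) pairs for the two cases that need no disambiguation."""
    scenario = classify(len(atlas_ids), len(osm_ids))
    if scenario is Scenario.MANY_TO_ONE:
        return [(a, osm_ids[0]) for a in atlas_ids]
    if scenario is Scenario.ONE_TO_MANY:
        return [(atlas_ids[0], o) for o in osm_ids]
    # NO_MATCH yields nothing; DISAMBIGUATE is resolved by a separate step.
    return []
```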

Disambiguation

When both sides have multiple entries with the same UIC, the designation field (AtlasNode.designation — the ATLAS platform number) is matched against local_ref (OsmNode.local_ref — the OSM platform identifier), case-insensitively.

If no unambiguous pairing can be built, the remaining ATLAS entries are not matched in this stage. The predicate also checks ctx.osm.is_used() to avoid double-booking an OSM node within the many-to-many case.
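A minimal sketch of the many-to-many pairing follows. It works on plain `(id, label)` tuples instead of the real node types, and a local `used` set stands in for `ctx.osm.is_used()`; both are assumptions for illustration. It also assumes that "unambiguous" means exactly one unused OSM candidate per ATLAS entry, which makes the result depend on input order when two ATLAS entries share a designation.

```python
from collections import defaultdict


def pair_by_platform(atlas, osm):
    """Pair ATLAS and OSM entries of one UIC group by platform label.

    atlas: list of (atlas_id, designation)
    osm:   list of (osm_id, local_ref)
    Returns a list of (atlas_id, osm_id) pairs; unpaired entries are left out.
    """
    by_ref = defaultdict(list)
    for osm_id, local_ref in osm:
        if local_ref:
            by_ref[local_ref.casefold()].append(osm_id)

    used = set()  # stand-in for ctx.osm.is_used()
    pairs = []
    for atlas_id, designation in atlas:
        if not designation:
            continue  # nothing to compare against
        candidates = [o for o in by_ref.get(designation.casefold(), []) if o not in used]
        if len(candidates) == 1:  # commit only when the choice is unambiguous
            pairs.append((atlas_id, candidates[0]))
            used.add(candidates[0])
    return pairs


pairs = pair_by_platform(
    atlas=[("a1", "A"), ("a2", "b"), ("a3", "3")],
    osm=[("o1", "a"), ("o2", "B"), ("o3", "C")],
)
```

Here `a3` stays unmatched because no OSM node carries `local_ref` "3"; it would remain available to later pipeline stages.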

| Field | Domain Model | Meaning | Example |
|-------|--------------|---------|---------|
| designation | AtlasNode.designation | Platform identifier (letter/number) | "A", "1", "Nord" |
| designationOfficial | AtlasNode.designation_official | Full stop name | "Zurich HB" |
| local_ref | OsmNode.local_ref | Platform identifier | "A", "1" |
| uic_ref | OsmNode.uic_ref | UIC reference number | "8503000" |

Key distinction: designation is not designation_official

  • designation: Platform-level identifier (e.g., "Track 1")
  • designation_official: Station/stop name (e.g., "Zurich HB")

This ensures we match individual platforms rather than entire station buildings.

Code Reference

| Class | Description |
|-------|-------------|
| ExactUicPredicate | Predicate class; groups by UIC, handles all three scenarios including disambiguation |

All logic is in predicates/exact_matching.py.
