Route Matching
Route matching is the ninth predicate run in the pipeline, after the full distance-matching block, and correlates ATLAS platforms with OSM nodes based on shared GTFS transit routes and directions.
Overview
While the previous predicates rely on exact UICs, names, or nearest distance alone, Route Matching provides an alternative way to confidently match stops.
Importantly, this is still a spatial, stop-to-stop matching process, not a linking of abstract routes. For every unmatched ATLAS stop, the predicate looks for unmatched OSM stops within a 50 m radius. If an ATLAS platform and a nearby OSM node share strong GTFS route-token evidence or compatible direction-name evidence, they are matched together. Route data acts as the "proof" that two physically close points are indeed the same stop. The spatial candidate filter uses OsmNode.is_station, so aerialway stations remain eligible.
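The radius step described above can be illustrated with a minimal sketch. It is not the predicate's actual code: the real implementation uses the batched OsmState.batch_query_radius(), whereas this example uses a plain loop with a haversine distance. The function names, the dictionary layout of stops and nodes, and the field names lat, lon, and node_id are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, good enough at a 50 m scale


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def candidates_within(atlas_stop, osm_nodes, max_distance=50.0):
    """Return (distance, node_id) pairs within max_distance metres, nearest first.

    atlas_stop and osm_nodes are hypothetical plain dicts, not the real
    AtlasState / OsmNode objects.
    """
    hits = []
    for node in osm_nodes:
        d = haversine_m(atlas_stop["lat"], atlas_stop["lon"],
                        node["lat"], node["lon"])
        if d <= max_distance:
            hits.append((d, node["node_id"]))
    return sorted(hits)
```

Only the nodes returned here would go on to the route-evidence check; distance alone does not produce a match in this predicate.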
Result: 0 route-based matches
Unlike the exact, name, and distance predicates, route matching now re-validates each batched candidate list against the current used_ids set before selecting a match. This prevents an OSM representative consumed earlier in the same predicate run from being re-used by a later ATLAS row.
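A rough sketch of that re-validation step follows. It assumes the batched query yields, for each ATLAS SLOID, a candidate list that has already passed the route-evidence check and is ordered by preference; the function name select_matches and this input shape are hypothetical.

```python
def select_matches(batched_candidates, used_ids=None):
    """Pick one OSM node per ATLAS row without re-using a node.

    batched_candidates: dict mapping sloid -> list of OSM node ids that
    were computed up front in one batch (assumed shape).
    used_ids: set of OSM node ids already consumed by earlier predicates.
    """
    used_ids = set() if used_ids is None else used_ids
    matches = {}
    for sloid, candidates in batched_candidates.items():
        # The batch was computed before any match in this run was made,
        # so filter it against the *current* used_ids before choosing.
        still_free = [node_id for node_id in candidates if node_id not in used_ids]
        if not still_free:
            continue
        chosen = still_free[0]
        matches[sloid] = chosen
        used_ids.add(chosen)  # later rows can no longer select this node
    return matches
```

Without the filter inside the loop, two ATLAS rows whose batched lists both start with the same node would both claim it.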
Required Data
Route matching relies entirely on data owned by the state layer — the predicate performs no file I/O:
- OsmNode candidates found via batched OsmState.batch_query_radius() within max_distance
- OsmState.name_dirs: per-node direction strings (loaded from the osm_directions.csv sidecar or parsed from XML relations)
- OsmState._node_routes via ctx.osm.get_node_routes(node_id): per-node GTFS route memberships derived from OSM XML relations during OsmState.from_xml_file()
- AtlasState._routes_by_sloid via ctx.atlas.get_routes(sloid): GTFS route entries loaded from atlas_routes_gtfs.csv during AtlasState.from_dataframe()
Token-Based Matching
Route data is converted into comparable tokens. The predicate tries two priority levels:
P1: GTFS Route-ID Tokens
The predicate primarily compares per-stop GTFS route tokens that are already loaded into AtlasState and OsmState:
- ATLAS Tokens: {(route_id_normalized, direction_id)} from atlas_routes_gtfs.csv.
- OSM Candidates: For each nearby node, ctx.osm.get_node_routes(node_id) contributes (gtfs_route_id, direction_id) and (normalize_route_id(gtfs_route_id), direction_id) tokens derived from the XML relation pass.
If RouteState already contains an in-process mapping for an OSM relation ID, the predicate also adds that mapped ATLAS route ID and its normalized form to the OSM candidate token set before intersecting it with the ATLAS tokens.
Normalized route IDs are therefore carried on the ATLAS side in atlas_routes_gtfs.csv and computed on the OSM side at match time. RouteState uses the same normalization helper when it is populated.
If ref_trips does not yield a direction, OSM route extraction currently emits both direction buckets (0 and 1) for that relation membership so route-id evidence can still participate.
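The P1 token construction can be sketched as below. This is an illustration under several assumptions: route entries are plain dicts, direction IDs are the strings "0" and "1", the relation_id key and the relation_map argument stand in for the RouteState lookup, and normalize_route_id here is a placeholder (strip and lowercase), since the real normalization rule is not described in this section.

```python
def normalize_route_id(route_id):
    # Placeholder normalization for illustration only; the project's real
    # helper may apply different rules.
    return route_id.strip().lower()


def atlas_tokens(atlas_routes):
    """Tokens from ATLAS rows, which already carry a normalized route ID."""
    return {(r["route_id_normalized"], r["direction_id"]) for r in atlas_routes}


def osm_tokens(node_routes, relation_map=None):
    """Tokens for one OSM node: raw and normalized route IDs per direction.

    relation_map is a hypothetical dict {osm_relation_id: atlas_route_id}
    standing in for an in-process RouteState mapping.
    """
    relation_map = relation_map or {}
    tokens = set()
    for r in node_routes:
        route_ids = [r["gtfs_route_id"]]
        mapped = relation_map.get(r.get("relation_id"))
        if mapped:
            route_ids.append(mapped)
        direction = r.get("direction_id")
        # No direction known: emit both buckets so route-id evidence still counts.
        directions = (direction,) if direction is not None else ("0", "1")
        for rid in route_ids:
            for d in directions:
                tokens.add((rid, d))
                tokens.add((normalize_route_id(rid), d))
    return tokens


def shared_tokens(atlas_routes, node_routes, relation_map=None):
    """Non-empty result means the two stops share P1 route evidence."""
    return atlas_tokens(atlas_routes) & osm_tokens(node_routes, relation_map)
```

Emitting both direction buckets trades some precision for recall: a relation without direction information can match an ATLAS token in either direction.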
P2: Name-Based Direction Fallback
ATLAS direction names are compared against OSM route relation direction strings (first/last member names like "Zurich HB → Bern"), stored in OsmState.name_dirs. The current implementation checks exact direction-string membership.
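The two priority levels can be combined as in the following sketch. The function name route_evidence and its argument shapes are assumptions; the returned labels reuse the gtfs_tokens and direction_name values that this page says are recorded in notes.

```python
def route_evidence(atlas_tokens, osm_tokens, atlas_directions, osm_name_dirs):
    """Return the evidence label for a candidate pair, or None.

    atlas_tokens / osm_tokens: sets of (route_id, direction_id) tuples.
    atlas_directions: iterable of ATLAS direction-name strings.
    osm_name_dirs: set of direction strings for the OSM node.
    """
    # P1: any shared GTFS route token wins outright.
    if atlas_tokens & osm_tokens:
        return "gtfs_tokens"
    # P2: exact string membership only; no fuzzy or case-insensitive matching.
    if any(direction in osm_name_dirs for direction in atlas_directions):
        return "direction_name"
    return None
```

Because P2 is an exact membership test, a spelling difference such as "Zürich HB → Bern" versus "Zurich HB → Bern" produces no match in this sketch.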
Data Sources
| Source | File / Origin | Loaded by | Description |
|---|---|---|---|
| GTFS routes | data/processed/atlas_routes_gtfs.csv | AtlasState | Timetable-derived route entries per SLOID for stop-level matching |
| OSM routes | OSM XML relations | OsmState.from_xml_file() | Route memberships per OSM node (via relation ID) |
| Equivalency cache | data/processed/atlas_routes.csv + data/processed/osm_routes.csv | RouteState | Optional atlas-route crosswalk, primarily populated by the route import path |
Related Documentation
- 1.2 GTFS – GTFS route extraction
- 1.3 OSM data – OSM route extraction
- 1.4 Route-Route Matching – Route-level ATLAS↔OSM linking in importer
(Route provenance is tracked in the output via match_type — currently always route_gtfs_gtfs; the specific evidence is recorded in notes as either gtfs_tokens or direction_name.)
When Route Matching Succeeds
Route matching is particularly effective for:
- Platforms without UIC: Some OsmNode entities lack uic_ref but have route memberships
- Ambiguous proximity: When multiple OsmNode entities are nearby, shared routes disambiguate
Code Reference
| Class / Method | Description |
|---|---|
| RouteMatchPredicate | Predicate class; leverages RouteState and batch_query_radius() |
| ctx.atlas.get_routes(sloid) | Returns ATLAS route assignments for a SLOID |
| ctx.osm.get_node_routes(node_id) | Returns relation memberships for an OSM node |
| RouteState.get_atlas_route() | Returns the mapped ATLAS route for a given OSM relation ID |
All predicate logic is in predicates/route_matching_gtfs.py. Route state logic lives in route_state.py.