OSM Data

OpenStreetMap provides community-maintained public transport (PT) data. We fetch Swiss PT nodes and their route relations via the Overpass API.

Way Ingestion Policy (Issue #37)

In addition to OSM nodes, we now ingest a narrow subset of OSM ways as virtual stop elements.

Included way categories:

  1. aerialway=station + public_transport=station
  2. ways with uic_ref where no existing node has the same uic_ref

Kept ways receive virtual IDs (way_<osm_way_id>) and are represented as point elements. Their coordinates come from the center that Overpass returns for out center, or from a centroid computed in the parser as a fallback. These virtual IDs are propagated through matching and route membership the same way as node IDs.
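The policy above can be sketched as follows. This is a minimal illustration, not the project's parser code: the function names (is_kept_way, virtual_id, centroid) are hypothetical, and the centroid is assumed to be a plain arithmetic mean of the way's node coordinates, which may differ from the actual fallback.

```python
def is_kept_way(tags: dict[str, str], node_uic_refs: set[str]) -> bool:
    """Return True if a way falls into one of the two included categories."""
    # Category 1: aerialway station mapped as a way.
    if tags.get("aerialway") == "station" and tags.get("public_transport") == "station":
        return True
    # Category 2: the way carries a uic_ref that no existing node provides.
    uic_ref = tags.get("uic_ref")
    return uic_ref is not None and uic_ref not in node_uic_refs


def virtual_id(osm_way_id: int) -> str:
    """Build the virtual stop ID used in place of a node ID."""
    return f"way_{osm_way_id}"


def centroid(coords: list[tuple[float, float]]) -> tuple[float, float]:
    """Assumed fallback: arithmetic mean of (lat, lon) pairs."""
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return sum(lats) / len(lats), sum(lons) / len(lons)
```

A way tagged only with a uic_ref is kept when that reference is new, and dropped when a node already carries it, so the same stop is not represented twice.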

flowchart LR
    subgraph Query["Overpass Query"]
        N["PT Nodes<br/>~67K"]
        R["Route Relations<br/>~3.5K"]
    end
    subgraph Process["Processing"]
        X[osm_data.xml] --> P1["osm_nodes_with_routes.csv<br/>osm_directions.csv"]
        X --> P2["osm_routes.csv<br/>osm_route_tags.csv<br/>osm_route_members.csv"]
    end
    Query --> X

Statistics

  • OSM Nodes with Route Data: 0

Data Source

| Property | Value |
|----------|-------|
| Endpoint | https://overpass-api.de/api/interpreter |
| Coverage | Switzerland (ISO3166-1=CH) |
| Updates | Live (reflects current OSM state) |

Node Selection Criteria

| Tag | Values |
|-----|--------|
| public_transport | platform, stop_position, station, halt, stop |
| railway | tram_stop, halt, station |
| highway | bus_stop |
| amenity | ferry_terminal, bus_station |
| aerialway | station |

Generated Route Artifacts

The raw XML currently feeds two output families: entity-first route tables for route import, plus flattened/sidecar files used by stop-level matching and stats helpers.

Entity-first route tables

| File | Description | Used By |
|------|-------------|---------|
| osm_routes.csv | One row per OSM route relation with core metadata (route, name, ref, operator, network, gtfs_route_id, ...). | Route import, RouteState |
| osm_route_tags.csv | Exploded key/value tags for each route relation. | Route import |
| osm_route_members.csv | Ordered relation members with resolved node/way IDs and a derived direction_id_derived. | Route import |

Flattened / sidecar stop-level files

Alongside the entity-first tables, the wider pipeline also maintains two per-node exports:

osm_nodes_with_routes.csv

One row per node–route combination. This compact flattened view uses these columns:

| Column | Description | Example |
|--------|-------------|---------|
| node_id | OSM node ID | 123456789 |
| node_type | public_transport value when present | platform |
| route_name | OSM route relation name | Zürich HB - Oerlikon |
| gtfs_route_id | OSM relation gtfs:route_id | 11-T-j25-1 |
| direction_id | Direction parsed from ref_trips (0/1, or empty if unknown) | 0 |
| uic_ref | Node uic_ref tag | 8503000 |

This CSV is a flattened export used by route stats/UI helpers. The matcher itself reads route memberships directly from the OSM XML relation pass in OsmState.from_xml_file().
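As an illustration of how a stats or UI helper might consume this flattened export, the sketch below groups route memberships by node. It assumes the file has a header row with the column names listed above; the helper name routes_by_node and the inline sample rows (built from the example values in the table) are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the documented column layout; a header row is assumed.
SAMPLE = """node_id,node_type,route_name,gtfs_route_id,direction_id,uic_ref
123456789,platform,Zürich HB - Oerlikon,11-T-j25-1,0,8503000
123456789,platform,Zürich HB - Oerlikon,11-T-j25-1,,8503000
"""


def routes_by_node(csv_file):
    """Group (gtfs_route_id, direction_id) pairs by node_id."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # An empty direction_id means the direction is unknown.
        direction = int(row["direction_id"]) if row["direction_id"] else None
        grouped[row["node_id"]].append((row["gtfs_route_id"], direction))
    return dict(grouped)


result = routes_by_node(io.StringIO(SAMPLE))
```

For a real file, pass an open handle instead, for example open("osm_nodes_with_routes.csv", newline="", encoding="utf-8").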

osm_directions.csv

One row per node with an extracted textual direction (by node name or UIC reference):

| Column | Description | Example |
|--------|-------------|---------|
| node_id | OSM node ID | 12345 |
| dir_type | Direction type (name or uic) | name |
| direction_string | Full start->end direction | Auzelg → Rehalp |

Direction Extraction

OSM routes often have a ref_trips tag encoding direction:

  • .H suffix → outbound (direction_id = 0)
  • .R suffix → inbound (direction_id = 1)

Many routes lack this tag, leaving the exported direction_id / direction_id_derived empty. In the in-memory matcher, relations without a detectable suffix are expanded to both directions so GTFS token matching can still fall back to route ID alone.
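The suffix rule and the both-directions fallback can be sketched as below. This is an assumption-laden illustration: the function names are hypothetical, and ref_trips is treated as a single string ending in .H or .R, which may be simpler than the values the real parser handles.

```python
from typing import Optional


def direction_from_ref_trips(ref_trips: Optional[str]) -> Optional[int]:
    """Map a ref_trips suffix to a direction_id, or None if undetectable."""
    if not ref_trips:
        return None
    value = ref_trips.strip()
    if value.endswith(".H"):
        return 0  # outbound
    if value.endswith(".R"):
        return 1  # inbound
    return None


def matcher_directions(ref_trips: Optional[str]) -> list[int]:
    """Directions the in-memory matcher would consider for a relation."""
    direction = direction_from_ref_trips(ref_trips)
    # No detectable suffix: expand to both directions.
    return [0, 1] if direction is None else [direction]
```

The exported CSV column stays empty when the first function returns None, while the matcher keeps both directions as candidates.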

Key OSM Tags for Matching

| Tag | Purpose | Used In |
|-----|---------|---------|
| uic_ref | UIC reference number | Exact matching |
| name / uic_name | Stop name | Name matching |
| local_ref | Platform identifier | Disambiguation |
| ref | Route number | Route matching |
| from / to | Route direction | Route matching |

Operator Normalization

OSM operator names vary widely (for example, CFF, FFS, and SBB CFF FFS all refer to SBB). These are normalized before matching; see 1.3.1 OSM operator normalization.
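One simple way to picture this normalization is an alias table keyed on a cleaned-up operator string. The sketch below is only an assumption about the general shape; the actual rules are described in the referenced section, and the names OPERATOR_ALIASES and normalize_operator are hypothetical. Only the SBB variants mentioned above are included.

```python
# Hypothetical alias table; keys are lowercased, whitespace-collapsed names.
OPERATOR_ALIASES = {
    "sbb": "SBB",
    "cff": "SBB",
    "ffs": "SBB",
    "sbb cff ffs": "SBB",
}


def normalize_operator(raw: str) -> str:
    """Return the canonical operator name, or the cleaned input if unknown."""
    cleaned = " ".join(raw.split())  # collapse repeated whitespace
    return OPERATOR_ALIASES.get(cleaned.casefold(), cleaned)
```

Unknown operators pass through unchanged apart from whitespace cleanup, so the lookup never discards information.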

Overpass Query

The following Overpass QL query is used to fetch the data from the Overpass API:

[out:xml][timeout:360];
area["ISO3166-1"="CH"]->.searchArea;

(
    node(area.searchArea)["public_transport"~"platform|stop_position|station|halt|stop"];
    node(area.searchArea)["railway"="tram_stop"];
    node(area.searchArea)["amenity"="ferry_terminal"];
    node(area.searchArea)["amenity"="bus_station"];
    node(area.searchArea)["highway"="bus_stop"];
    node(area.searchArea)["railway"="halt"];
    node(area.searchArea)["railway"="station"];
    node(area.searchArea)["aerialway"="station"];
)->.pt_nodes;

(
    way(area.searchArea)["aerialway"="station"]["public_transport"="station"];
    way(area.searchArea)["uic_ref"];
)->.candidate_ways;

.pt_nodes out body qt;
.candidate_ways out body center qt;

(
    relation(bn.pt_nodes)[type=route];
    relation(bw.candidate_ways)[type=route];
);
out meta;
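
A query like the one above can be submitted with the Python standard library alone. The following is a minimal sketch, not the project's fetch code: the helper names and the client-side timeout are assumptions, while the endpoint and the osm_data.xml file name come from this page. Overpass accepts the query as a form field named data in a POST request.

```python
import urllib.parse
import urllib.request

OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_request(query: str) -> urllib.request.Request:
    """Wrap an Overpass QL query in a form-encoded POST request."""
    body = urllib.parse.urlencode({"data": query}).encode("utf-8")
    return urllib.request.Request(OVERPASS_URL, data=body, method="POST")


def fetch_osm_xml(query: str, out_path: str = "osm_data.xml",
                  timeout: float = 400.0) -> None:
    """Send the query and write the raw XML response to out_path."""
    request = build_overpass_request(query)
    # The client timeout is assumed to sit slightly above the query's
    # server-side [timeout:360] so the server can answer first.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(out_path, "wb") as fh:
            fh.write(response.read())
```

Keeping request construction separate from the network call makes the encoding step easy to check without contacting the public endpoint.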