Scaling Pedestrian Crossing Analysis to 100 U.S. Cities via AI-based Segmentation of Satellite Imagery

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv paper.

Accurately measuring street dimensions is essential to evaluating how their design influences both travel behavior and safety. However, gathering street-level information at city scale with precision is difficult given the quantity and complexity of urban intersections. To address this challenge in the context of pedestrian crossings, a crucial component of walkability, we introduce a scalable and accurate method for automatically measuring crossing distance at both marked and unmarked crosswalks, applied to America’s 100 largest cities. First, OpenStreetMap coordinates were used to retrieve satellite imagery of intersections throughout each city, totaling roughly three million images. Next, Meta’s Segment Anything Model was trained on a manually labelled subset of these images to differentiate drivable from non-drivable surfaces (i.e., roads vs. sidewalks). Third, all available crossing edges from OpenStreetMap were extracted. Finally, crossing edges were overlaid on the segmented intersection images, and a grow-cut algorithm was applied to connect each edge to its adjacent non-drivable surface (e.g., sidewalk or private property), thus enabling the calculation of crossing distance. This achieved 93 percent accuracy in measuring crossing distance, with a median absolute error of 2 feet 3 inches (0.69 meters), when compared to manually verified data for an entire city. Across the 100 largest US cities, median crossing distance ranges from 32 feet to 78 feet (9.8 to 23.8 m), with detectable regional patterns. Median crossing distance also displays a positive relationship with cities’ year of incorporation, illustrating in a novel way how American cities increasingly emphasize wider (and more car-centric) streets.


💡 Research Summary

This paper presents a scalable, fully automated pipeline for measuring pedestrian crossing distances—including both marked and unmarked crosswalks—across the 100 largest U.S. cities. The authors combine open‑source geographic data (OpenStreetMap) with Meta’s Segment Anything Model (SAM) to segment satellite imagery of intersections and then compute precise crossing lengths using a grow‑cut algorithm.

First, OSMnx is used to extract latitude‑longitude coordinates for every street intersection and mid‑block crossing in each target city. These coordinates drive bulk retrieval of high‑resolution satellite tiles from the Google Maps API. Each tile covers a 25‑meter radius around the intersection, is 1629 × 1629 pixels, and has a ground sample distance of 30 cm. In total, roughly three million tiles (≈8.3 TB) are collected, with parallel serverless workers reaching a peak download rate of 2,000 tiles per minute.
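
The collection figures above imply a rough per-tile size and a floor on download time. The following is a back-of-the-envelope sketch only: it assumes "TB" means decimal terabytes and that the peak rate could be sustained for the whole run, neither of which the summary states. The function names are illustrative.

```python
# Figures quoted in the summary; everything derived from them is an estimate.
N_TILES = 3_000_000          # "roughly three million tiles"
TOTAL_BYTES = 8.3e12         # "≈8.3 TB", assumed to be decimal terabytes
PEAK_TILES_PER_MIN = 2_000   # quoted peak download rate


def mean_tile_megabytes(total_bytes: float, n_tiles: int) -> float:
    """Average storage per tile, in decimal megabytes."""
    return total_bytes / n_tiles / 1e6


def min_download_hours(n_tiles: int, tiles_per_min: float) -> float:
    """Lower bound on download time if the peak rate were sustained."""
    return n_tiles / tiles_per_min / 60


tile_mb = mean_tile_megabytes(TOTAL_BYTES, N_TILES)
hours = min_download_hours(N_TILES, PEAK_TILES_PER_MIN)
print(f"{tile_mb:.2f} MB per tile, at least {hours:.0f} h of downloading")
```

Under these assumptions each tile averages a little under 3 MB, and the full download would take at least 25 hours of aggregate time even at the peak rate.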

Second, the authors fine‑tune SAM, originally a zero‑shot segmentation model, on a manually annotated subset of intersection images. From 14 representative cities, 193 high‑entropy images are hand‑labeled for non‑drivable surfaces (sidewalks, parks, refuge islands, etc.). Data augmentation (flipping, rotation, Gaussian blur, random noise) expands this set to 5,790 samples. The fine‑tuned SAM is then run on all three million images using a 16 × 16 point grid and a 1024 × 1024 input size, producing pixel‑level masks of non‑drivable areas. Post‑processing includes (a) removing masks that overlap OpenStreetMap road centerline buffers (2 m for primary roads, 1.2 m for others) to eliminate false positives, and (b) inserting merged building footprints from OSM where the model missed non‑drivable surfaces, thereby reducing false negatives. The masks are converted to geographic polygons and exported as ESRI shapefiles.
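
To make the prompting and augmentation numbers concrete, here is a minimal sketch of a 16 × 16 point grid over a 1024 × 1024 input and of the augmentation ratio implied by 193 → 5,790. Placing one prompt point at the centre of each grid cell is an assumption; the summary does not say how the points are positioned, and `point_grid` is a hypothetical helper, not part of SAM.

```python
def point_grid(points_per_side: int, image_size: int) -> list[tuple[float, float]]:
    """Evenly spaced (x, y) prompt points, one at the centre of each grid cell.

    Cell-centre placement is an assumption made for illustration.
    """
    step = image_size / points_per_side
    coords = [(i + 0.5) * step for i in range(points_per_side)]
    return [(x, y) for y in coords for x in coords]


grid = point_grid(16, 1024)
print(len(grid), grid[0], grid[-1])

# Augmentation ratio implied by the quoted counts. Whether the 5,790 samples
# include the 193 original images is not stated in the summary.
LABELED_IMAGES = 193
AUGMENTED_SAMPLES = 5_790
samples_per_image = AUGMENTED_SAMPLES / LABELED_IMAGES
print(samples_per_image)
```

With these settings the model is prompted at 256 locations per image, spaced 64 pixels apart, and each labeled image contributes 30 training samples on average.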

Third, pedestrian crossing edges are extracted from OSM. When a crossing is represented only as a node, the node is matched to the nearest highway edge and a synthetic crossing edge is generated. Edges whose endpoints lie within 5 m of each other and whose bearings differ by less than 5° are treated as duplicates and merged.
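
The duplicate test described above can be sketched as follows. This is an illustration under several assumptions: coordinates are already projected to metres, bearings are compared as undirected orientations (so a reversed edge still counts as a match), and "merging" is simplified to keeping the first edge of each duplicate group. All function names are hypothetical.

```python
import math

Point = tuple[float, float]
Edge = tuple[Point, Point]


def orientation_deg(edge: Edge) -> float:
    """Undirected orientation of an edge, in degrees within [0, 180)."""
    (x1, y1), (x2, y2) = edge
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def orientation_diff_deg(a: Edge, b: Edge) -> float:
    """Smallest angle between two undirected orientations."""
    d = abs(orientation_deg(a) - orientation_deg(b))
    return min(d, 180.0 - d)


def endpoints_close(a: Edge, b: Edge, tol_m: float) -> bool:
    """True if both endpoint pairs match within tol_m, in either direction."""
    same = math.dist(a[0], b[0]) <= tol_m and math.dist(a[1], b[1]) <= tol_m
    flipped = math.dist(a[0], b[1]) <= tol_m and math.dist(a[1], b[0]) <= tol_m
    return same or flipped


def is_duplicate(a: Edge, b: Edge, tol_m: float = 5.0, tol_deg: float = 5.0) -> bool:
    """Apply the 5 m endpoint and 5 degree bearing thresholds from the text."""
    return endpoints_close(a, b, tol_m) and orientation_diff_deg(a, b) < tol_deg


def merge_duplicates(edges: list[Edge]) -> list[Edge]:
    """Greedy simplification: keep the first edge of each duplicate group."""
    kept: list[Edge] = []
    for edge in edges:
        if not any(is_duplicate(edge, k) for k in kept):
            kept.append(edge)
    return kept


a = ((0.0, 0.0), (10.0, 0.0))
b = ((10.0, 0.3), (0.2, 0.0))   # same crossing, digitised in reverse
c = ((0.0, 0.0), (0.0, 10.0))   # perpendicular crossing at the same corner
merged = merge_duplicates([a, b, c])
print(merged)
```

Comparing orientations modulo 180° matters here because two mappers may draw the same crossing in opposite directions.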

Finally, the grow‑cut algorithm is applied. Each OSM crossing edge is expanded by 5 % at both ends while preserving its original orientation. The expanded line is intersected with the non‑drivable polygons; the intersection points define the true start and end of the crossing. The edge is then cut at these points, yielding a precise crossing segment. Post‑processing removes any remaining duplicates by buffering edges by 2 m (6.5 ft) and merging intersecting edges with similar bearings, and filters out implausibly short (<6 ft) or long (>160 ft) segments.
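
One possible reading of the extend, intersect, and cut steps is sketched below in plain Python. It assumes planar coordinates in metres, represents each non‑drivable polygon as a list of vertices, and takes the nearest boundary hit on each side of the line's midpoint as the crossing endpoints; that selection rule is an assumption, since the summary does not say how the intersection points are chosen. A GIS library would normally handle the geometry; the helpers here are hypothetical.

```python
import math

Point = tuple[float, float]
M_PER_FT = 0.3048


def extend(p: Point, q: Point, frac: float = 0.05) -> tuple[Point, Point]:
    """Lengthen segment pq by `frac` of its length at both ends, same bearing."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return (p[0] - frac * dx, p[1] - frac * dy), (q[0] + frac * dx, q[1] + frac * dy)


def hit_params(p: Point, q: Point, polygon: list[Point]) -> list[float]:
    """Parameters t in [0, 1] where segment pq meets the polygon boundary."""
    rx, ry = q[0] - p[0], q[1] - p[1]
    hits = []
    for a, b in zip(polygon, polygon[1:] + polygon[:1]):
        sx, sy = b[0] - a[0], b[1] - a[1]
        denom = rx * sy - ry * sx
        if denom == 0:               # parallel boundary edge, no single hit
            continue
        ax, ay = a[0] - p[0], a[1] - p[1]
        t = (ax * sy - ay * sx) / denom   # position along pq
        u = (ax * ry - ay * rx) / denom   # position along boundary edge ab
        if 0 <= t <= 1 and 0 <= u <= 1:
            hits.append(t)
    return hits


def clip_crossing(p: Point, q: Point, polygons: list[list[Point]], frac: float = 0.05):
    """Extend pq, then cut it at the nearest boundary hit on each side of its
    midpoint. Returns None when either side has no hit."""
    p2, q2 = extend(p, q, frac)
    hits = [t for poly in polygons for t in hit_params(p2, q2, poly)]
    before = [t for t in hits if t < 0.5]
    after = [t for t in hits if t > 0.5]
    if not before or not after:
        return None
    t0, t1 = max(before), min(after)
    at = lambda t: (p2[0] + t * (q2[0] - p2[0]), p2[1] + t * (q2[1] - p2[1]))
    return at(t0), at(t1)


def plausible(seg: tuple[Point, Point], min_ft: float = 6.0, max_ft: float = 160.0) -> bool:
    """Length filter from the text: drop segments under 6 ft or over 160 ft."""
    return min_ft <= math.dist(*seg) / M_PER_FT <= max_ft


# Two sidewalk strips 10 m apart; the OSM edge stops short of both curbs.
left = [(-3.0, -5.0), (0.0, -5.0), (0.0, 5.0), (-3.0, 5.0)]
right = [(10.0, -5.0), (13.0, -5.0), (13.0, 5.0), (10.0, 5.0)]
seg = clip_crossing((0.2, 0.0), (9.7, 0.0), [left, right])
length_ft = math.dist(*seg) / M_PER_FT
print(seg, round(length_ft, 1), plausible(seg))
```

In this toy case the 9.5 m OSM edge is stretched past both curbs and then cut back to the 10 m curb‑to‑curb distance, which passes the length filter.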

The complete workflow takes roughly one hour per city, dominated by image fetching and SAM segmentation. Across all 100 cities it yields 808,377 measured pedestrian crossings, with 93 % accuracy and a median absolute error of 0.69 m (2 ft 3 in) when validated against manually verified data for an entire city. Median crossing distances across cities range from 9.8 m (32 ft) to 23.8 m (78 ft), displaying clear regional patterns. Moreover, a positive correlation is observed between a city’s year of incorporation and its median crossing distance, suggesting that newer American cities tend to design wider, more car‑centric streets.

The study’s strengths lie in its exclusive use of freely available data and open‑source tools (ensuring reproducibility), the strategic selection of high‑entropy training images and extensive augmentation to fine‑tune SAM efficiently, and the novel integration of OSM network data with image segmentation via a grow‑cut approach to capture both marked and unmarked crossings. Limitations include potential segmentation errors caused by shadows, tree canopy, or low‑resolution imagery, and the reliance on OSM’s crowd‑sourced completeness, which can vary across regions. Future work could incorporate higher‑resolution aerial or LiDAR data, apply temporal satellite series to monitor changes in crossing design, and extend the methodology to other street‑level attributes (e.g., curb ramps, bike lanes) to further inform pedestrian safety and urban design policy.

