The Shortest Path In San Francisco dataset is a synthetic dataset of shortest paths around San Francisco. These paths have been calculated from random start and end points, using weight maps derived from distance and from travel time, together with small random polygonal obstructions that force individual paths to avoid certain small regions. In total, the dataset contains 20,242 trajectories with about 5 million points.
It has been created for the ACM SIGSPATIAL GIS Cup 2017 on range queries under the Fréchet distance and is given as a set of trajectories in Global Web Mercator (EPSG:3857). It is based on the submission that won the ACM SIGSPATIAL GIS Cup 2015 (Werner, 2015), from which shortest paths under polygonal constraints can be extracted.
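Since the coordinates are given as Web Mercator meters, you may want to reproject them to longitude and latitude for visualization or geographic analysis. The following is a minimal sketch of such a conversion using pyproj; pyproj is not part of the dataset distribution, and the helper function name is only illustrative.

# Sketch: reproject a trajectory from EPSG:3857 (Web Mercator) to WGS84 lon/lat.
# Assumes pyproj is installed; trajectory_to_lonlat is a hypothetical helper name.
import numpy as np
from pyproj import Transformer

to_wgs84 = Transformer.from_crs("EPSG:3857", "EPSG:4326", always_xy=True)

def trajectory_to_lonlat(m):
    # m is an (N, 2) array of Web Mercator x/y coordinates
    lon, lat = to_wgs84.transform(m[:, 0], m[:, 1])
    return np.column_stack([lon, lat])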
The data is derived from OpenStreetMap, and we republish the derived data under identical terms, that is, under the Open Data Commons Open Database License (ODbL). For details, see https://www.openstreetmap.org/copyright
When you use it in your scientific work, we encourage you to cite the associated publication (Werner, 2015).
To get you started, the following Python snippet creates a list of all trajectories, each formatted as an individual numpy array, by parsing the TGZ file directly. It is rather slow and is best used only once, when importing the data into whatever working format you intend to use.
import tarfile
from os.path import isfile
from urllib.request import urlretrieve  # Python 3; in Python 2 use urllib.urlretrieve

import numpy as np
from tqdm import tqdm

if __name__ == "__main__":
    print("Checking if data exists")
    if not isfile('shortest-sf.tgz'):
        print("Downloading...")
        urlretrieve("https://www.martinwerner.de/files/shortest-sf.tgz", "shortest-sf.tgz")
    else:
        print("Found local file")

    # open the archive and extract every trajectory file
    dataset = tarfile.open('shortest-sf.tgz')
    loa = list()  # list of arrays: one numpy array per trajectory
    for f in tqdm(dataset.getmembers()):
        if f.isfile():
            f_trajectory = dataset.extractfile(f)
            m = np.loadtxt(f_trajectory, skiprows=1)  # skip the header line
            loa.append(m)  # add this trajectory to the list of arrays
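As a quick sanity check after loading, you can render a sample of the trajectories with matplotlib. The snippet below is only a sketch and assumes the list loa built by the code above is populated and that matplotlib is installed.

# Sketch: plot a sample of the loaded trajectories (assumes loa from above)
from matplotlib import pyplot as plt

for m in loa[:100]:
    plt.plot(m[:, 0], m[:, 1], linewidth=0.5)
plt.gca().set_aspect('equal')  # keep the Web Mercator geometry undistorted
plt.title("Sample of shortest paths in San Francisco")
plt.show()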