Suppose that you are a singer and you want to present some of your new songs on a concert tour through several cities. The concerts need to happen in four out of five possible cities, but you don't know the exact schedule in advance, i.e. the order in which you must visit them. Four variants seem most likely, and you would like to know how far you would need to travel in each, so that you can estimate your time and budget needs.
You are looking for the total distance of your trip in each of the four cases, given that the cities are Berlin, Hamburg, Frankfurt, Cologne and Saarbrücken and that you will be traveling between them by car, so that individual paths won't be straight lines.
One simple way to answer such queries is to construct a small distance matrix for all participating cities. We could use an online distance calculator to measure each path segment separately.
    from collections import OrderedDict

    import pandas as pd

    # Road distances between the cities; every row uses the same city order
    distances = (
        ('Berlin', [0, 287, 545, 579, 723]),
        ('Hamburg', [287, 0, 492, 431, 670]),
        ('Frankfurt', [545, 492, 0, 190, 185]),
        ('Cologne', [579, 431, 190, 0, 259]),
        ('Saarbrücken', [723, 670, 185, 259, 0])
    )

    # The four candidate schedules, each listed in visiting order
    possible_paths = (
        ('Hamburg', 'Berlin', 'Frankfurt', 'Saarbrücken'),
        ('Cologne', 'Saarbrücken', 'Frankfurt', 'Berlin'),
        ('Berlin', 'Cologne', 'Hamburg', 'Saarbrücken'),
        ('Hamburg', 'Frankfurt', 'Cologne', 'Berlin')
    )

    distance_matrix = OrderedDict()
    for city, dist in distances:
        distance_matrix[city] = dist

    # Use the city names as both column and row labels
    df = pd.DataFrame(distance_matrix, index=distance_matrix.keys())
    print(df)
This gives us the following data frame:
                 Berlin  Hamburg  Frankfurt  Cologne  Saarbrücken
    Berlin            0      287        545      579          723
    Hamburg         287        0        492      431          670
    Frankfurt       545      492          0      190          185
    Cologne         579      431        190        0          259
    Saarbrücken     723      670        185      259            0
What is beautiful is that by using df.loc['Berlin', 'Hamburg'], for example, we obtain a very concise and descriptive way to extract the individual distances from this data frame. What remains is to compute the distances of the possible paths as sums of path segments.
    path_distances = []
    for path in possible_paths:
        path_dist = [path, 0]
        # Add up the distance of every pair of consecutive cities on the path
        for from_city, to_city in zip(path, path[1:]):
            # int() turns the NumPy integer from df.loc into a plain Python int
            path_dist[1] += int(df.loc[from_city, to_city])
        path_distances.append(path_dist)

    print(path_distances)
    """
    [
     [('Hamburg', 'Berlin', 'Frankfurt', 'Saarbrücken'), 1017],
     [('Cologne', 'Saarbrücken', 'Frankfurt', 'Berlin'), 989],
     [('Berlin', 'Cologne', 'Hamburg', 'Saarbrücken'), 1680],
     [('Hamburg', 'Frankfurt', 'Cologne', 'Berlin'), 1261]
    ]
    """
This shows that if the concerts are scheduled first in Cologne, then in Saarbrücken, then in Frankfurt and finally in Berlin, the total travel distance of 989 would be the shortest among the four alternatives, and presumably the least stressful for the singer.
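The comparison can also be left to the code instead of being read off the printed list. The following is a minimal sketch using only the standard library. The nested dictionary `DIST` and the helper `path_length` are hypothetical names introduced for this illustration, and the figures are assumed to be kilometres, which the text does not state explicitly.

```python
# Distances copied from the matrix above (assumed to be kilometres).
CITIES = ["Berlin", "Hamburg", "Frankfurt", "Cologne", "Saarbrücken"]
ROWS = [
    [0, 287, 545, 579, 723],
    [287, 0, 492, 431, 670],
    [545, 492, 0, 190, 185],
    [579, 431, 190, 0, 259],
    [723, 670, 185, 259, 0],
]

# Nested dict, so DIST["Berlin"]["Hamburg"] reads much like df.loc[...].
DIST = {city: dict(zip(CITIES, row)) for city, row in zip(CITIES, ROWS)}

POSSIBLE_PATHS = [
    ("Hamburg", "Berlin", "Frankfurt", "Saarbrücken"),
    ("Cologne", "Saarbrücken", "Frankfurt", "Berlin"),
    ("Berlin", "Cologne", "Hamburg", "Saarbrücken"),
    ("Hamburg", "Frankfurt", "Cologne", "Berlin"),
]


def path_length(path):
    """Sum the distances of all consecutive city pairs on the path."""
    return sum(DIST[a][b] for a, b in zip(path, path[1:]))


# Rank the candidate schedules from shortest to longest.
ranked = sorted(POSSIBLE_PATHS, key=path_length)
shortest = ranked[0]

for path in ranked:
    print(f"{path_length(path):>5}  {' -> '.join(path)}")
```

Sorting rather than calling `min` keeps the full ranking available, which helps if the shortest schedule turns out to be impossible for other reasons.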
For bigger tasks, Google provides a Distance Matrix API (it seems to be free within a request limit), which may be more accurate. It also returns an estimated travel time for each route, and its JSON response is convenient to use in all kinds of applications.
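As a rough sketch of how such a response might be consumed, the snippet below builds a request URL and turns a response into a lookup table. It assumes the JSON endpoint and the `rows` / `elements` / `distance.value` (metres) / `duration.value` (seconds) layout of the public Distance Matrix API. `build_url`, `parse_matrix` and `SAMPLE_RESPONSE` are hypothetical names, `YOUR_API_KEY` is a placeholder, and the numbers in the sample are invented for illustration rather than real API output. No network request is made.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Distance Matrix API (JSON output).
BASE_URL = "https://maps.googleapis.com/maps/api/distancematrix/json"


def build_url(origins, destinations, api_key):
    """Build the request URL; multiple places are separated by '|'."""
    params = {
        "origins": "|".join(origins),
        "destinations": "|".join(destinations),
        "key": api_key,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def parse_matrix(response, origins, destinations):
    """Map (origin, destination) to (kilometres, minutes).

    Rows follow the order of the origins, elements within a row follow
    the order of the destinations. Pairs without a route are skipped.
    """
    result = {}
    for origin, row in zip(origins, response["rows"]):
        for destination, element in zip(destinations, row["elements"]):
            if element.get("status") != "OK":
                continue
            km = element["distance"]["value"] / 1000
            minutes = element["duration"]["value"] / 60
            result[(origin, destination)] = (km, minutes)
    return result


# Hand-written stand-in for a response; the values are placeholders.
SAMPLE_RESPONSE = {
    "status": "OK",
    "rows": [
        {
            "elements": [
                {
                    "status": "OK",
                    "distance": {"value": 287000},
                    "duration": {"value": 10800},
                }
            ]
        }
    ],
}

url = build_url(["Berlin"], ["Hamburg"], "YOUR_API_KEY")
matrix = parse_matrix(SAMPLE_RESPONSE, ["Berlin"], ["Hamburg"])
print(url)
print(matrix)
```

In a real application the URL would be fetched with an HTTP client, and the resulting table could replace the hand-typed distance matrix.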