Visualizing the weight of transported goods

Suppose that you run a transportation company whose many trucks constantly travel between a number of cities to deliver goods. Your drivers already know how to optimize the value they deliver and the routes they take. Now you would like to know how much weight is actually transported along each route, to see which routes matter more and which matter less. To support that decision, you install sensors in each truck that report this information in real time. You collect the data, but the problem is that there is too much of it: your drivers serve many routes, and several trucks travel each route in both directions every single day. The number of cities your company serves, however, is relatively limited: München, Leipzig, Aachen, Koblenz, Dresden, Dortmund, Frankfurt/Main, Stuttgart, Osnabrück, Ulm, Berlin and Köln. You would like a clear way of seeing which pairs of cities are responsible for most of the weight transported. The following simple code can help with that:

from itertools import combinations

import matplotlib.pyplot as plt


def label_if_unseen(city, lat, lon):
    # Annotate a city just below its point, but only the first time it is seen.
    if city not in seen_cities:
        plt.annotate(city, xy=(lon, lat), xytext=(lon, lat - txt_voffset),
                     ha='center', fontsize=9, fontweight='bold', alpha=0.6)
        seen_cities.add(city)


# Weight in kg per (origin, destination) pair.
# Note: Python keeps only the last value for a key repeated in a dict literal.
route_kgs = {
    ('Berlin', 'Stuttgart'): 320,
    ('Frankfurt/Main', 'Ulm'): 120,
    ('Koblenz', 'Stuttgart'): 265,
    ('München', 'Ulm'): 36,
    ('Aachen', 'Osnabrück'): 92,
    ('Dresden', 'Dortmund'): 114,
    ('Dortmund', 'Köln'): 74,
    ('Dortmund', 'Frankfurt/Main'): 195,
    ('Frankfurt/Main', 'Leipzig'): 167,
    ('Frankfurt/Main', 'Dresden'): 95,
    ('Berlin', 'Frankfurt/Main'): 180,
    ('Aachen', 'Ulm'): 128,
    ('Leipzig', 'München'): 64,
    ('München', 'Aachen'): 87,
    ('Köln', 'Leipzig'): 92,
    ('Köln', 'Koblenz'): 54,
    ('Koblenz', 'Dresden'): 63,
    ('Stuttgart', 'Osnabrück'): 110,
    ('Dresden', 'Ulm'): 135,
    ('Berlin', 'Aachen'): 71,
    ('Dresden', 'Köln'): 55,
    ('Leipzig', 'Stuttgart'): 104,
    ('Leipzig', 'Aachen'): 85,
    ('Ulm', 'Leipzig'): 82,
    ('Frankfurt/Main', 'Koblenz'): 63,
    ('Berlin', 'Köln'): 215,
    ('Dresden', 'Aachen'): 111,
    ('Dortmund', 'Dresden'): 81,
    ('Aachen', 'Stuttgart'): 115,
    ('Köln', 'Osnabrück'): 92,
    ('Osnabrück', 'Ulm'): 73,
    ('Dortmund', 'Dresden'): 125,
    ('Köln', 'Frankfurt/Main'): 150,
    ('Frankfurt/Main', 'Berlin'): 180,
    ('Berlin', 'Dortmund'): 167,
    ('Koblenz', 'Dresden'): 89,
    ('Leipzig', 'Berlin'): 140,
    ('München', 'Osnabrück'): 90,
    ('Stuttgart', 'Aachen'): 96,
    ('Ulm', 'Berlin'): 88,
    ('Köln', 'Osnabrück'): 34,
    ('Stuttgart', 'Berlin'): 102,
    ('Aachen', 'München'): 85,
    ('Dortmund', 'Stuttgart'): 145,
    ('Osnabrück', 'Leipzig'): 44,
    ('Koblenz', 'Frankfurt/Main'): 67,
    ('Osnabrück', 'Ulm'): 82,
    ('Berlin', 'Koblenz'): 124,
    ('Ulm', 'Dresden'): 94,
    ('Koblenz', 'Frankfurt/Main'): 64,
    ('Berlin', 'Dortmund'): 167,
    ('Osnabrück', 'Dortmund'): 68,
    ('Dresden', 'Stuttgart'): 56,
    ('Stuttgart', 'Dresden'): 86,
    ('Aachen', 'Ulm'): 64,
    ('Dresden', 'Leipzig'): 75,
    ('Koblenz', 'Dresden'): 86,
    ('Osnabrück', 'Aachen'): 56,
    ('Berlin', 'Frankfurt/Main'): 110,
    ('Koblenz', 'Aachen'): 84,
    ('Berlin', 'Ulm'): 115,
    ('Köln', 'Ulm'): 87,
    ('Köln', 'Osnabrück'): 93,
    ('Frankfurt/Main', 'Aachen'): 66,
    ('Frankfurt/Main', 'München'): 177,
    ('München', 'Berlin'): 148,
    ('Dresden', 'Aachen'): 92,
    ('Köln', 'München'): 115,
    ('München', 'Berlin'): 214,
    ('Berlin', 'Köln'): 122,
    ('Aachen', 'Frankfurt/Main'): 112,
    ('Dortmund', 'Berlin'): 127,
    ('Dresden', 'Koblenz'): 69,
    ('Berlin', 'Ulm'): 112,
    ('Osnabrück', 'Leipzig'): 103,
    ('Stuttgart', 'Frankfurt/Main'): 157,
    ('Dresden', 'Stuttgart'): 92,
    ('Berlin', 'Aachen'): 85,
    ('Koblenz', 'Dresden'): 75,
    ('Köln', 'Frankfurt/Main'): 93,
    ('Leipzig', 'Ulm'): 82,
    ('Berlin', 'Leipzig'): 115,
    ('Stuttgart', 'Berlin'): 183,
    ('Stuttgart', 'Frankfurt/Main'): 115,
    ('Köln', 'Aachen'): 90,
    ('Frankfurt/Main', 'Aachen'): 55,
    ('Koblenz', 'Aachen'): 67,
    ('Dortmund', 'Aachen'): 100,
    ('Dortmund', 'Berlin'): 112,
    ('Dresden', 'Dortmund'): 78,
    ('Köln', 'Dortmund'): 129,
    ('Köln', 'Ulm'): 88,
    ('Koblenz', 'Dortmund'): 94,
    ('Koblenz', 'Aachen'): 67,
    ('Dresden', 'Frankfurt/Main'): 117,
    ('Köln', 'Frankfurt/Main'): 82,
    ('Leipzig', 'Köln'): 84,
    ('Ulm', 'Stuttgart'): 77,
    ('Dortmund', 'München'): 134,
    ('Koblenz', 'Leipzig'): 114,
    ('Köln', 'München'): 176,
    ('Dresden', 'Stuttgart'): 112,
    ('Ulm', 'Osnabrück'): 65,
    ('Ulm', 'Dortmund'): 114,
    ('Frankfurt/Main', 'Dortmund'): 85,
    ('Stuttgart', 'Dortmund'): 121,
    ('Leipzig', 'Stuttgart'): 105,
    ('Dresden', 'Frankfurt/Main'): 95,
    ('Osnabrück', 'Dresden'): 66,
    ('Köln', 'Dresden'): 92,
    ('Stuttgart', 'Dortmund'): 132,
    ('Ulm', 'Leipzig'): 88,
    ('Koblenz', 'Berlin'): 81,
    ('Frankfurt/Main', 'Dresden'): 70,
    ('Aachen', 'Koblenz'): 58,
    ('Köln', 'Dresden'): 65,
    ('Dortmund', 'Berlin'): 112,
    ('Stuttgart', 'Frankfurt/Main'): 138,
    ('Ulm', 'Dortmund'): 77,
    ('Ulm', 'Dresden'): 69,
    ('Berlin', 'München'): 308,
    ('Dortmund', 'München'): 197,
    ('München', 'Osnabrück'): 76,
    ('München', 'Koblenz'): 97,
    ('Osnabrück', 'Koblenz'): 69,
    ('Osnabrück', 'Aachen'): 59,
    ('Stuttgart', 'Dortmund'): 95,
    ('Koblenz', 'Köln'): 65,
    ('Stuttgart', 'Frankfurt/Main'): 160,
    ('Frankfurt/Main', 'Ulm'): 120,
    ('München', 'Ulm'): 96,
    ('Dresden', 'München'): 85,
    ('Stuttgart', 'Koblenz'): 68,
}

# (latitude, longitude) of each city.
city_lat_lons = {
    'München': (48.135125, 11.581980),
    'Leipzig': (51.339695, 12.373075),
    'Aachen': (50.775346, 6.083887),
    'Koblenz': (50.356943, 7.588996),
    'Dresden': (51.050409, 13.737262),
    'Dortmund': (51.513587, 7.465298),
    'Frankfurt/Main': (50.110922, 8.682127),
    'Stuttgart': (48.775846, 9.182932),
    'Osnabrück': (52.279911, 8.047179),
    'Ulm': (48.401082, 9.987608),
    'Berlin': (52.520007, 13.404954),
    'Köln': (50.937531, 6.960279),
}

route_keys = route_kgs.keys()
city_lat_lons_items = city_lat_lons.items()

# Collect every city that appears in at least one route.
s = set()
for c1, c2 in route_keys:
    s |= {c1, c2}

# One zero-initialized total per unordered pair of cities.
total_weights = {(c1, c2): 0 for c1, c2 in combinations(s, 2)}

# Add each route's weight to its pair, whichever direction it was recorded in.
for (c1, c2), w in route_kgs.items():
    if (c1, c2) in total_weights:
        total_weights[(c1, c2)] += w
    if (c2, c1) in total_weights:
        total_weights[(c2, c1)] += w

# Sort by total weight, heaviest first; the maximum is used for normalization.
sorted_weights = sorted(total_weights.items(), key=lambda x: x[1], reverse=True)
max_weight = sorted_weights[0][1]

txt_hoffset, txt_voffset = 0.7, 0.12
legendx, legendy = 5.7, 47.8
show_labels_in_legend = 5
color = '#D48F5C'
seen_cities = set()

plt.figure(figsize=(6, 8))
plt.title('Routes by total weight transported')

# Draw one line per pair; heavier routes are thicker and more opaque.
for (c1, c2), w in total_weights.items():
    c1lat, c1lon = city_lat_lons[c1]
    c2lat, c2lon = city_lat_lons[c2]
    normw = w / max_weight
    plt.plot((c1lon, c2lon), (c1lat, c2lat), '-', color=color,
             lw=0.2 + 2.4*normw, alpha=normw)
    label_if_unseen(c1, c1lat, c1lon)
    label_if_unseen(c2, c2lat, c2lon)

# Draw all city points with a single scatter call.
lats = [v[0] for _, v in city_lat_lons_items]
lons = [v[1] for _, v in city_lat_lons_items]
plt.scatter(lons, lats, s=40, facecolor=color, lw=0.5)

# Legend: the routes with the highest totals, one row of two texts each.
for i, ((c1, c2), w) in enumerate(sorted_weights[:show_labels_in_legend]):
    ly = (show_labels_in_legend - i) * txt_voffset
    plt.text(legendx, legendy + ly, str(w) + 'kg', fontsize=9, fontstyle='italic')
    plt.text(legendx + txt_hoffset, legendy + ly, '-> %s - %s' % (c1, c2), fontsize=9)

plt.xlabel('longitude')
plt.ylabel('latitude')
plt.tight_layout()
plt.show()

We start with the definition of a helper function that draws a city label if it hasn't already been drawn. Then we define a dictionary whose keys are tuples naming two cities and whose values are the weight transported between them by one of the trucks. The dictionary is meant to hold every known trip; note, however, that Python keeps only the last value for a key repeated in a dict literal, so when the same ordered pair of cities appears more than once, only its last entry is counted. Since we know the cities, we also define their latitude and longitude coordinates so that we can use them when needed.

We create all possible combinations of two cities. Remember that if a combination contains A and B, there won't be another one for B and A. We use these combinations to create a compact dictionary, initialized with zeros, that will hold the total weight computed for each route. Once we have summed the weights by route, we sort them to find the maximum (so that we can normalize line widths and alpha values in our graphic) and to show a small legend with the five routes that carry the most weight.

We plot the lines between the cities, adjusting the line width and alpha value of each. We add the labels if needed. We add the city points with a single call to the scatter function (which is usually more expensive than the plot function). Finally we plot the legend, each row consisting of two texts showing the total weight (in kg) and the route. We could have used a single text, but two give more opportunities for styling. The result looks like this:

Visualization of the total weight of transported goods through multiple routes
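
The summing step from the walkthrough can also be written so that the direction of travel is folded away in the key itself. The following is a minimal sketch, not the code that produced the figure: it assumes the sensor readings arrive as a list of (origin, destination, kg) records, and the names trips, route_key and total_by_route are hypothetical. Unlike a dict literal, a list keeps repeated trips on the same route, so every reading contributes to the total.

```python
from collections import defaultdict

# Hypothetical sample readings: (origin, destination, weight in kg).
trips = [
    ('Berlin', 'München', 308),
    ('München', 'Berlin', 214),
    ('Berlin', 'Stuttgart', 320),
    ('Stuttgart', 'Berlin', 183),
    ('München', 'Ulm', 96),
]


def route_key(c1, c2):
    """Return the same key for both directions of a route."""
    return tuple(sorted((c1, c2)))


def total_by_route(records):
    """Sum the weights of all records that share an undirected route."""
    totals = defaultdict(int)
    for c1, c2, kg in records:
        totals[route_key(c1, c2)] += kg
    return dict(totals)


totals = total_by_route(trips)

# Heaviest routes first, as in the legend of the figure.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
for (c1, c2), kg in ranked:
    print('%s - %s: %dkg' % (c1, c2, kg))
```

Sorting the two city names gives each route a single canonical key, so no table of all city combinations has to be built in advance; pairs without any trip simply never appear in the result.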

We immediately see the two thick lines corresponding to the routes Berlin-Stuttgart and Berlin-München, while routes with less weight appear more muted. Presenting the routes as straight lines, when real roads are not straight, is a simplification. Using this method we can draw large networks (possibly without the labels) and see immediately which connections carry more weight, whether that is literal weight, cost or something else. If we have the exact coordinates, we can draw the network directly, without having to rely on a specialized graph drawing library like networkx, whose layouts can be more randomized.
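
The mapping from a route's total to its appearance is the only part of the script that depends on the scale of the data, so it can be useful to isolate it when the measure is something other than kilograms. Below is a small sketch that uses the same constants as the script above (a base width of 0.2 plus up to 2.4, and an alpha equal to the normalized weight). The function name line_style, the sample totals and the guard against a non-positive maximum are assumptions added for illustration.

```python
def line_style(weight, max_weight, base_lw=0.2, lw_span=2.4):
    """Map a route total to (line width, alpha), scaled by weight / max_weight."""
    if max_weight <= 0:
        # Assumed fallback: without a positive maximum, draw thin and transparent.
        return base_lw, 0.0
    norm = weight / max_weight
    return base_lw + lw_span * norm, norm


# Hypothetical totals: kilograms, costs or any other non-negative edge measure.
totals = {('A', 'B'): 500, ('A', 'C'): 250, ('B', 'C'): 0}
max_total = max(totals.values())

styles = {route: line_style(w, max_total) for route, w in totals.items()}
for route, (lw, alpha) in styles.items():
    print(route, round(lw, 2), round(alpha, 2))
```

Each resulting pair can then be passed to plt.plot as the lw and alpha arguments, exactly where the script computes them inline.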