The port of Hamburg publishes a list of all ships together with some of their technical parameters, which allows us to compare them by our own custom criteria. For instance, if we have the ship name, deadweight tonnage, container capacity, engine power and speed of each ship (as in this data), we can compute which ship carries the most tonnage per unit of engine power. We can also create an index that divides the product of all weight-related attributes by the product of all mobility-related attributes (which also makes maximal use of this incomplete data), and we can look for correlations among the parameters. Here is some sample code:
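import pandas as pd

# Load the ship list; the string 'None' marks a missing value and incomplete rows are dropped
df = pd.read_csv('hamburg_port_ships_data.csv', na_values=['None']).dropna()

# Pairwise correlations between the numeric attributes (the ship name is skipped)
print(df.corr(numeric_only=True))

# Derived comparison criteria
df['t/KW'] = df['max tonnage'] / df['engine power']
df['TEU/kn'] = df['container capacity'] / df['speed']
df['kn/KW'] = df['speed'] / df['engine power']
df['weight/mobility'] = (df['max tonnage'] * df['container capacity']) / (df['engine power'] * df['speed'])

# Print the ten highest-ranked ships for each criterion
for crit in ['t/KW', 'TEU/kn', 'kn/KW', 'weight/mobility']:
    print(df[['name', crit]].nlargest(10, crit))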
Here are the correlations between the individual attributes:
Based on the data about our small selection of ships, we see that max tonnage and container capacity seem very much related. Similarly, engine power seems related to max tonnage and to container capacity, but it seems slightly less related to speed. Speed seems most related to engine power.
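To read statements such as "speed seems most related to engine power" directly off the correlation matrix, one option is to blank out the diagonal and pick the strongest remaining partner for each attribute. The following is only a sketch: the column names follow the sample code, while the values are invented for illustration and are not taken from the Hamburg data.

```python
import numpy as np
import pandas as pd

# Invented toy values (NOT the Hamburg data); column names follow the sample code.
toy = pd.DataFrame({
    "max tonnage": [10, 20, 30, 40, 50],
    "container capacity": [1, 2, 3, 4, 5],
    "engine power": [5, 9, 16, 20, 24],
    "speed": [10, 12, 11, 14, 13],
})

corr = toy.corr()

# Every attribute correlates perfectly with itself, so hide the diagonal.
off_diag = corr.mask(np.eye(len(corr), dtype=bool))

# For each attribute, the name of the attribute it correlates with most strongly.
strongest = off_diag.abs().idxmax()
print(strongest)
```

Taking the absolute value means that a strong negative correlation would also count as "most related", which may or may not be what you want.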
If needed, with the help of Seaborn we could also see the pairwise relationships:
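A minimal sketch of such a plot, assuming Seaborn is installed and that the columns are named as in the sample code. The toy values and the helper name `plot_pairwise` are made up for illustration.

```python
import pandas as pd

# Column names follow the sample code; the values below are invented.
ATTRIBUTES = ["max tonnage", "container capacity", "engine power", "speed"]

toy = pd.DataFrame({
    "name": ["Ship A", "Ship B", "Ship C", "Ship D"],
    "max tonnage": [12000, 35000, 68000, 110000],
    "container capacity": [900, 2800, 5500, 9400],
    "engine power": [8000, 21000, 41000, 62000],
    "speed": [17.0, 21.0, 24.0, 25.0],
})


def plot_pairwise(df, path="pairplot.png"):
    """Draw a scatter plot for every pair of attributes and save it to `path`."""
    # Imported here so the data preparation above runs without the plotting stack.
    import seaborn as sns

    grid = sns.pairplot(df, vars=ATTRIBUTES)
    grid.savefig(path)
    return grid
```

Calling `plot_pairwise(toy)` writes the figure to `pairplot.png`; passing the real `df` from the sample code instead of `toy` should work the same way as long as the column names match.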
Then we rank the ships according to the different criteria as seen in the code:
These are the ships with the largest tonnage to engine power ratio (t/KW). The first one looks quite different from the rest, as if the data we have were wrong (always assume this first). A quick look at the vessel search, however, shows that our values match those given on the website.
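One way to make "quite different from the rest" a little more concrete is a simple interquartile-range check on the ratio. This is a generic rule of thumb, not something the original analysis used, and the toy values below are invented. Flagged rows are candidates for a manual check against the source, not proof of an error.

```python
import pandas as pd

# Invented toy values (NOT the Hamburg data); column names follow the sample code.
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F"],
    "max tonnage": [60000, 20000, 22000, 24000, 26000, 28000],
    "engine power": [2000, 10000, 10000, 10000, 10000, 10000],
})
toy["t/KW"] = toy["max tonnage"] / toy["engine power"]

# Tukey's rule: anything above Q3 + 1.5 * IQR is treated as suspicious.
q1 = toy["t/KW"].quantile(0.25)
q3 = toy["t/KW"].quantile(0.75)
upper_fence = q3 + 1.5 * (q3 - q1)

suspects = toy[toy["t/KW"] > upper_fence]
print(suspects[["name", "t/KW"]])
```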
For the criterion container capacity to speed (TEU/kn), the results look more normal. We assume that this value has a natural upper limit, since we can't load an infinite number of containers and still have a ship that moves.
The criterion speed to engine power (kn/KW) shows how efficiently the engine power is turned into speed. Between the first and the tenth ship on the list alone, the value differs by more than a factor of two, and 435 ships are ranked in total.
This is the criterion weight to mobility, created artificially from the observation that two of our attributes relate to weight (max tonnage and container capacity) and two others relate to mobility (engine power and speed). We ask ourselves: "What is actually important for a ship?" Intuitively, we might think that it is carrying as much weight as fast as possible. So we try to integrate everything we know into a single index that we can use for comparisons. By this index, the ship named "Adelheid-S" once again appears at the top of the list.
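The same index can be written so that the two groups of attributes are passed in as lists, which makes it easy to add further columns to either group later. This is a sketch with a hypothetical helper name, `weight_mobility_index`, and invented toy values. It follows the formula from the sample code: the product of the weight-related columns divided by the product of the mobility-related columns.

```python
import pandas as pd


def weight_mobility_index(df, weight_cols, mobility_cols):
    """Row-wise product of the weight columns divided by that of the mobility columns."""
    return df[weight_cols].prod(axis=1) / df[mobility_cols].prod(axis=1)


# Invented toy values (NOT the Hamburg data); column names follow the sample code.
toy = pd.DataFrame({
    "name": ["Ship A", "Ship B"],
    "max tonnage": [100, 200],
    "container capacity": [10, 20],
    "engine power": [50, 40],
    "speed": [10, 10],
})

toy["weight/mobility"] = weight_mobility_index(
    toy,
    weight_cols=["max tonnage", "container capacity"],
    mobility_cols=["engine power", "speed"],
)
print(toy[["name", "weight/mobility"]])
```

Note that the index has no natural unit and its scale depends on the units of the input columns, so it is only meaningful for ranking ships within the same data set.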
Note: These are not all of the ships that operate in the port of Hamburg. Only ships with a concrete value for container capacity (given in TEU) were selected, and not all ships had one (especially cruise ships and other non-container vessels). For more accurate results, feel free to include the ones that were omitted here.