Finding combinations of ingredients based on good pairs

It is not always easy to select the right ingredients when preparing food. Not everyone is a cook with many years of experience who consistently makes the right choices. Online recipes might help, but following a recipe step by step can still lead to disaster, and we mostly learn by trial and error. Combining the wrong ingredients may spoil the taste of a dish and make people less likely to enjoy it. It also erodes the cook's confidence, so that next time they may no longer want to cook at all.

The idea that we can mix a large number of ingredients and still get a good taste seems strange. The more ingredients we mix, the more careful we have to be about the mixing ratios, so it becomes progressively harder to steer the taste the way we want. (Not to mention the digestive problems that could arise from inappropriate food combinations.)

Colors may serve as an example here. A common rule of thumb is that a palette limited to about 5-8 colors can look beautiful. Once we go beyond that, the colors start to interfere with each other in the viewer's perception and create too much noise. In short, too many colors are rarely good.

Team sizes are also often limited to around five people to keep the number of communication channels manageable, so that everyone gets a chance to speak (five people already have 10 pairwise channels, ten people have 45). If teams were much bigger than that, the potential for misunderstandings would increase dramatically.

This leads us to a question: can we find a combination of up to five ingredients that could make a dish we might like? This is not an easy question, and the answer may be less than perfect. But we might have heard about the food pairings database, which seeks to rank good pairs of foods. We don't know anything about how it was produced or how accurate it is. We see some indices, but it is hard to imagine the context. What if we can still learn something valuable from it? The interesting thing is that although the data is presented in tabular form, it is well suited to a graph representation, since there are many relationships between the ingredients. This means that once the data is extracted (which I have already done), we can apply graph algorithms to it, using the pairing index as the edge weight.
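To make the graph idea concrete, here is a minimal sketch of how the extracted rows could be turned into a weighted graph using only the standard library. It assumes each row is a tuple (ingr1, ingr2, recipe_rating, pairing_index), the same layout the sample code in this post unpacks. The rows shown are made-up toy values, not entries from the database, and treating a pair as undirected is also an assumption.

```python
from collections import defaultdict

# Hypothetical toy rows in the form (ingr1, ingr2, recipe_rating, pairing_index).
# The numbers are invented for illustration, not taken from the database.
products = [
    ("arugula", "pear", 4.2, 0.8),
    ("arugula", "walnut", 4.0, 0.6),
    ("pear", "walnut", 3.9, 0.7),
]


def build_graph(rows):
    """Return an undirected weighted graph as {node: {neighbor: weight}}."""
    graph = defaultdict(dict)
    for ingr1, ingr2, _recipe_rating, pairing_index in rows:
        # Assumption: a pairing applies in both directions.
        graph[ingr1][ingr2] = pairing_index
        graph[ingr2][ingr1] = pairing_index
    return graph


graph = build_graph(products)
print(len(graph))        # number of nodes (ingredients)
print(graph["arugula"])  # neighbors of one ingredient with their weights
```

A dictionary of dictionaries like this is enough for neighbor lookups; a graph library only becomes necessary for heavier algorithms.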

If we did this, we would obtain a graph with 595 nodes and many relationships between them. But we can also do relatively well without a graph library like networkx. Here is some sample code that uses an ingredient as a starting point and finds a chain of ingredients in which every two neighbors are said to pair well together.

# Find a path of 5 ingredients which pair well, starting from a given ingredient.
# `products` is the extracted data: an iterable of
# (ingr1, ingr2, recipe_rating, pairing_index) tuples.
path_len = 5
start_ingredient = 'arugula'
path_elems = [start_ingredient]
while len(path_elems) < path_len:
    # Collect every ingredient paired with the current one.
    adjacent_ingredient = {}
    for prod in products:
        ingr1, ingr2, recipe_rating, pairing_index = prod
        if ingr1 == start_ingredient:
            adjacent_ingredient[ingr2] = (pairing_index, recipe_rating)
    # Take the best-ranked neighbor that is not already in the path.
    for ing in sorted(adjacent_ingredient.items(), key=lambda x: x[1], reverse=True):
        if ing[0] in path_elems:
            continue
        start_ingredient = ing[0]
        path_elems.append(start_ingredient)
        break
    else:
        # No unused neighbor is left; stop instead of looping forever.
        break
print(path_elems)

# Sample outputs for different start ingredients:
# ['arugula', 'pomegranate juice', 'cherry', 'italian bread', 'marjoram']
# ['turmeric', 'dill seed', 'chive', 'pineapple juice', 'guava']
# ['zucchini', 'red leaf lettuce', 'paprika', 'fat free cream cheese', 'bell pepper']
# ['almond', 'olive', 'radicchio', 'sunflower seed', 'brazil nut']
# ['amaranth', 'cornstarch', 'millet flour', 'rice flour', 'coriander seed']
# ['basil', 'fresh dill weed', 'lemon peel', 'biscuit', 'macadamia']
# ['oregano', 'gruyere cheese', 'arugula', 'pomegranate juice', 'cherry']
# ['apple juice', 'cashew', 'safflower oil', 'chili powder', 'pasilla']
# ['soy sauce', 'shredded parmesan cheese', 'yogurt', 'broad bean', 'fresh parsley']
# ['grapefruit', 'pecan', 'star anise', 'radish', 'pimento']
# ['olive oil', 'wheat gluten', 'bread', 'pancake', 'scallion']
# ['kale', 'walnut oil', 'nutmeg', 'nectarine', 'cantaloupe']
# ['yogurt', 'broad bean', 'fresh parsley', 'corn syrup', 'white pepper']
# ['banana', 'liquid pectin', 'onion', 'watercress', 'olive oil']
# ['spaghetti', 'mexican blend cheese', 'hot sauce', 'flounder', 'onion powder']
# ['mango', 'snapper', 'cheddar cheese', 'yellow hominy', 'cream']
# ['turkey', 'strawberry', 'brie cheese', 'pesto', 'pistachio']
# ['thyme', 'english muffin', 'cinnamon', 'bagel', 'milk']
# ['chicken', 'chives', 'cream', 'caraway', 'lemon juice']
# ['parmigiano-reggiano', 'margarine', 'nectarine', 'cantaloupe', 'walnut']
# ['walnut', 'horseradish', 'wheat bread', 'pineapple', 'mustard seed']
# ['bread', 'pancake', 'scallion', 'tomato juice', 'salad dressing']
# ['butter', 'cracker meal', 'lemon juice', 'fried onion', 'unsalted butter']
# ['carrot', 'pecans', 'lettuce', 'fried onion', 'unsalted butter']
# ['cauliflower', 'bamboo shoot', 'almond', 'olive', 'radicchio']
# ['celery', 'corn starch', 'milk', 'poi', 'onion']
# ['broccoli', 'apricot', 'caper', 'hearts of palm', 'spinach']
# ['cabbage', 'chayote', 'celery', 'corn starch', 'milk']
# ['chicken', 'chives', 'cream', 'caraway', 'lemon juice']
# ['chocolate', 'crispix', 'vegetable oil', 'unsweetened orange juice', 'grape']
# ['buckwheat', 'barley', 'cucumber', 'summer squash', 'dried parsley']
# ['coconut', 'mango nectar', 'banana', 'liquid pectin', 'onion']
# ['cucumber', 'summer squash', 'dried parsley', 'maple syrup', 'ground sage']
# ['eggs', 'cauliflower', 'bamboo shoot', 'almond', 'olive']

As you can see, by varying the start ingredient, we obtain interesting combinations that could give us ideas for new recipes. The full data contains a lot more to explore, but this is a good start.
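To try many starting ingredients in one go, the greedy search can be wrapped in a function. The sketch below is one possible way to do that, not the only one. The name greedy_chain and the toy rows are hypothetical, and the numbers are invented for illustration. It follows the same rule as the sample code above: from the current ingredient, move to the unused neighbor with the highest (pairing_index, recipe_rating), and stop early if no unused neighbor is left.

```python
from collections import defaultdict


def greedy_chain(rows, start, path_len=5):
    """Greedily extend a chain with the best-ranked unused neighbor."""
    # Index outgoing pairs once, so each step is a lookup, not a full scan.
    neighbors = defaultdict(dict)
    for ingr1, ingr2, recipe_rating, pairing_index in rows:
        neighbors[ingr1][ingr2] = (pairing_index, recipe_rating)

    path = [start]
    current = start
    while len(path) < path_len:
        candidates = [
            (name, score)
            for name, score in neighbors[current].items()
            if name not in path
        ]
        if not candidates:
            break  # dead end: the chain stays shorter than path_len
        current, _ = max(candidates, key=lambda item: item[1])
        path.append(current)
    return path


# Hypothetical toy rows: (ingr1, ingr2, recipe_rating, pairing_index).
toy_rows = [
    ("arugula", "pear", 4.0, 0.9),
    ("arugula", "walnut", 4.5, 0.5),
    ("pear", "walnut", 3.0, 0.4),
    ("pear", "arugula", 4.0, 0.9),
    ("walnut", "honey", 3.5, 0.7),
]

for start in ("arugula", "pear", "honey"):
    print(start, "->", greedy_chain(toy_rows, start))
```

Because the search is greedy, it commits to the locally best neighbor at each step and never backtracks, so the chain it returns is not guaranteed to have the best total pairing score.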