Structure

Everything we create is subjected to the test of stability. If we carelessly stack many bricks on top of each other, not only will we fail to build a house, but a falling brick may be the last thing we see when the wind blows. Earthquakes and floods will test the base of the house, rain will test the top, wind will test the walls on the side. Noise from the neighbors will test how stable the rules of good citizenship are. If they aren't, building a stable house is hardly worthwhile, especially when anyone could break into it. Proudly declaring yourself a Ferrari owner amid a sea of Volkswagen Golfs will likely leave you without a car. This shows that the structure within the context matters a lot.

Imagine that a fully furnished home has 72 nicely decorated rooms. How do we know where to find anything there? Without a strong mental model of how the home is organized, we would be completely lost. We build this model almost unconsciously as we seek to approximate our reality. But it is also an unstable one, since each change, if left unnoticed, can partially invalidate it. If we put something at position A, but another person comes later and moves it to position B, we won't find it the next time we look at position A. We either need to speak with the right person about its new position or try to put ourselves into their mode of thinking. In the worst case we would need to ransack the room(s) most likely to contain it. The small task of finding a particular pair of socks can turn into an endless search in an unstructured, messy home. In the end, we may still be tempted to take any pair if all we could find is a single sock.

We can find things in the house because it is structured into rooms with specific purposes. Kitchens are likely to have stoves, refrigerators, dishwashers, mixers, plates, utensils, napkins etc. Bedrooms are likely to have beds, sheets, coverlets, night lights etc. Living rooms are likely to have a TV, a radio, a couch, bookshelves, flowers and everything else that would also make guests feel comfortable. Although they are all rooms, each one is still uniquely identifiable to us, as it has different characteristics like floor area, exposure (N, E, S, W) etc. But when we start mixing things intended for different purposes, we lose our ability to easily keep track of them. Deep in our mind we may know that the napkins are in the bedroom, but a new guest probably won't find this structure logical.

Thinking in hierarchies is just one way to organize complexity as we perceive it. Each room has furniture, which contains various objects, which hold other things as well. We know what is where or, in other words, where in this idealized hierarchy to look once we need something. Drawers are subdivided to keep different kinds of utensils, so that the most frequently used items like spoons and forks are more conveniently accessible than infrequently used ones like can openers or corkscrews. Once we learn the structural organization of a home and our model of it is fairly accurate, finding the path from one room to another becomes easier. And so complexity becomes slightly more manageable rather than completely overwhelming. The more we train ourselves to retain complex models, the better we become at it.

A refrigerator also needs to be well-organized, so that we can quickly find the products we need. Relying on our visual system to find the right item is not enough, since some items may remain hidden behind other ones. But our mental model tells us that last time we put product A behind product B, so it must still be there, if not already consumed by someone else. So we know where to look first.

Finding the right file is possible only through the structure of a directory with an accurate label. We may have multiple files with the same name, so there is no way to know in advance which one we need. A search would return all similar-looking names, which doesn't tell us much. But we may have an idea of the (approximate) label of the directory in which that file is located. This label is likely to come from a model we have created for ourselves, based on how important we perceive our data to be. By introducing structure through the labels, we can more easily find both individual and related items.
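
A minimal sketch of this point, in Python: searching by file name alone returns several candidates, while a remembered directory label narrows them down to one. The paths and the helper names find_by_name and find_by_label are hypothetical and serve only as an illustration.

```python
from pathlib import PurePosixPath

# Hypothetical file paths; several of them share the same file name.
FILES = [
    "taxes/2014/report.pdf",
    "projects/website/report.pdf",
    "projects/website/notes.txt",
    "photos/summer/report.pdf",
]


def find_by_name(paths, name):
    """Return every path whose file name matches, regardless of directory."""
    return [p for p in paths if PurePosixPath(p).name == name]


def find_by_label(paths, name, label):
    """Return matching paths that also have `label` among their directories."""
    return [
        p for p in find_by_name(paths, name)
        if label in PurePosixPath(p).parent.parts
    ]


by_name = find_by_name(FILES, "report.pdf")             # several candidates
by_label = find_by_label(FILES, "report.pdf", "taxes")  # narrowed by label
```

The name alone is ambiguous here; it is the label, taken from our own model of the data, that does the narrowing.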

The common theme behind all these cases is the derivation of the model from reality as a means to enable structure. But it is more interesting to observe how a mental model (head-first) could be used to create real products. If we can hold a complex structure in our head, this can enable the creation of products that rely on reading from and writing to this structure. Then, despite all the complexity, we would know where the blinking on this structure occurs, based on how we chose to arrange the concepts and things in our head. If we look at disk defragmentation software, we see that at the start a lot of different regions are blinking, which suggests that the software is inspecting the structure of its given world. But once the model of it is built, the defragmentation process becomes much more linear, with only occasional reads from distant locations. This suggests that given incomplete information, we can only proceed in a suboptimal way. The structure itself enhances our decision-making.

As we have already seen, interfering with other people's structure and their own understanding of the world is never a good idea, unless we want to slow them down. This is often not well understood in organizations where people move fast from one project to the next or change from one team to another. In such conditions, there is not enough time to build any model or architecture that will stand the test of time.

Web design is another activity that relies heavily on structure: a website containing thousands of pages needs a very well-documented way to organize and link them. A better structure makes it harder to link to nonexistent pages or to make mistakes that are costly to fix. Subdomains can be used to organize the content into themes that address the interests of different groups of people. The content is written in a structured way, guided by editorial standards, and also stored in a structured way (information architecture); the images need to be of a certain quality; the layout is subject to the box model and perfect box alignment (you may say that a grid works well too); related items must be placed together; different sections must be easily distinguishable; the typeface needs to match the site theme; the text needs to be readable; the color scheme needs to convey the appropriate mood; links and buttons must be easily clickable even on small-screen devices while giving appropriate feedback on hover and focus at minimum; the forms have to be accessible in order to sell more and now; everything has to support business goals and nothing more; the extra tools should support fast client feedback without being obtrusive; the frameworks shouldn't have too many dependencies. Getting all these details to work well together can be challenging and is just one reason why web design is so hard. It would be a mistake to think that web design is just about looks. Even web design is about stable structures, without which a project is unlikely to flourish. The structure behind the design points to the structure behind the business and may even reflect it.

A monolithic codebase makes it hard to make changes or to contribute. Trying to separate our concerns while working in code is another way to create a structure where everything can be found quickly. It is helpful to think in terms of a “maximum-margin separating hyperplane” to understand how the various responsibilities within the system can be clustered before we start to manually assign individual files to them. Then, the proper labeling scheme will naturally fall into place. For instance, all operations touching the database would belong to the Model cluster, all layouts to the View cluster and all interactions to the Controller cluster. MVC is just an idea, which everyone understands differently, but it illustrates well the attempt to keep the concerns separated by a wide margin. As the number of clusters grows and they increase in size, they may move closer together, which will increase the potential for mistakes.
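
The labeling step can be sketched in a few lines of Python. This does not compute any separating hyperplane; it only shows what the outcome might look like once each file has been tagged by hand with its main responsibility. The file names, the responsibility tags and the cluster_files helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from a file's main responsibility to its cluster.
CLUSTER_OF = {
    "database": "model",
    "layout": "view",
    "interaction": "controller",
}

# Hypothetical files, each tagged by hand with its main responsibility.
FILES = {
    "user_queries.py": "database",
    "order_queries.py": "database",
    "profile_page.html": "layout",
    "checkout_page.html": "layout",
    "login_handler.py": "interaction",
}


def cluster_files(files, cluster_of):
    """Group file names by the cluster their responsibility maps to."""
    clusters = defaultdict(list)
    for name, responsibility in files.items():
        clusters[cluster_of[responsibility]].append(name)
    return dict(clusters)


clusters = cluster_files(FILES, CLUSTER_OF)
```

A file that needs two tags at once is a sign that the margin between two clusters is shrinking.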

An interesting question at the start of a new project is “Which objects does our system need to have?” Somehow we assume that having the objects guarantees a good structure. This is often not the case: many projects are collections of objects communicating through channels that are hard to follow and to understand. As already mentioned, everyone has a different understanding of the world, so even when the designer of the system finds its organization very logical, the people who work on it may not think so. They may be missing details that are important for understanding how their work fits in and why it even matters. If we haven't yet figured out the right structure, one that can be easily understood by a large number of diverse people, we may lack a deeper understanding of the problem our system is trying to solve. Not developing a system at an appropriate granularity is a bad symptom. Perhaps a slightly better starting question would be “What kind of structure do we need to fully model/describe this problem?” The answer to this may still hint at the object-oriented approach, but we aren't immediately thinking in terms of objects yet. Instead we are evaluating how suitable different paradigms (functional/dynamic/linear/event-driven/parallel/scientific programming) are for our task. This is important, since they all have their advantages and disadvantages. Sometimes the heart of a system is an algorithm that accepts a dozen parameters to do its job; sometimes we need a small system that does one thing well and we need to create it fast; sometimes we are working on a streaming solution, where data manipulation needs to happen online and in-place; sometimes we create a website that stands still most of the time, but infrequently accepts large images as user input and then does something with them; sometimes we need to support a large number of concurrent users on a site, eliminating potential bottlenecks; sometimes our software has to work on the moon. In all these cases we will need to choose a different structure and likely a different programming language that is more suitable for the task, as there isn't a universal hammer for every nail.

It is said that programming is mostly about data structures and algorithms. Knowing the building blocks of data structures is therefore a first step towards understanding them better. Arrays/matrices, objects, stacks, queues/deques, trees, graphs, sets and bit vectors are just some examples. We can combine them in different ways where it makes sense, or we can leave them to exist separately but close to each other. For instance, it is fairly common to see an array of objects or an object having many arrays. Sometimes tuples can be used for indexing into an array. We can read from one structure and write to another; we can write a value in the same structure based on a previous one or on a formula involving many previous ones. We can update a matrix in a staircase pattern; we can access individual items from it based on a given stride (for instance every third element on the fifth line, starting from column eight). We can multiply two matrices to get a third one, which is different from taking their dot or Kronecker products. In many cases we form a new structure from previously available ones or we decompose an existing, large structure into many smaller ones, each of which can be sent as an input to a different parallel processor if needed (don't ask me how). We can swap items in-place without the need for extra memory. With a single loop we can read five different arrays, element by element, where the contents of some of them can affect those of the others. We could also apply weights if we don't like the result. Although we seem to have an almost infinite number of possibilities, we should consider that keeping large structures can take a lot of memory. It is better to keep the smallest one that could work in the context of our task. This is why careful modeling is important: if our structure is smaller than the one we need, we may get an incorrect result; if it is bigger than we need, we will waste memory, and scalability issues will quickly become visible with a sufficiently large problem size.
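
A few of these operations can be sketched in plain Python with nested lists. The matrix sizes, the values and the helper names matmul and kron are assumptions made only for illustration; in practice a numerical library would likely be used instead.

```python
# A 6 x 12 matrix stored as a list of rows; each value encodes (row, column).
matrix = [[100 * r + c for c in range(12)] for r in range(6)]

# Strided access: every third element on the fifth line (index 4),
# starting from column eight (index 7).
picked = matrix[4][7::3]

# In-place swap of the first and last items; no second list is created.
items = [1, 2, 3, 4]
items[0], items[-1] = items[-1], items[0]


def matmul(a, b):
    """Ordinary matrix product: rows of `a` combined with columns of `b`."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def kron(a, b):
    """Kronecker product: every entry of `a` scales a full copy of `b`."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
product = matmul(a, b)  # 2 x 2 result
blocks = kron(a, b)     # 4 x 4 result
```

The two products differ even in shape: multiplying two 2 x 2 matrices gives another 2 x 2 matrix, while their Kronecker product is 4 x 4. This is a small reminder that the chosen operation also decides how much memory the resulting structure needs.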

Knowing when structure works against us is valuable as well. Too much of it can slow down processes or lead to projects with missed deadlines. Sometimes a system can collapse under its own weight, created by excessive structure. This makes it hard to understand and leads to companies of many people, none of whom truly knows how things work in their entirety. Too much organizational structure inhibits creativity and can lead to increased bureaucracy.

While trying to ship faster, we often penalize projects that would benefit from more structure. The release itself becomes the celebrated event, behind which the quality of work is actually deteriorating, especially after the next batch of changes is introduced. “We need it fast and now” becomes a tool for applying stress and for concurrently hacking at the structures that the programmer is trying to hold in mind, which is dangerous for any software project. When overworked or tired programmers are unable to hear their own voices, their choices and the quality of their products begin to suffer.

bit.ly/1zhgYt1