String repetitions

The building blocks of a website are often subject to some form of code repetition. For instance, a web page can host a user discussion on a certain topic. In this case, each comment is wrapped in the same code, so the total amount of markup grows with the number of comments. The way we handle repetitive code can play an important role in keeping our sites fast.

There are many ways to generate these repetitions, and it is not always obvious which ones we should prefer. Accumulating a large string through individual concatenations within a tight loop can be slow, especially when the string grows large. Concatenating fewer, but larger strings may be slower than concatenating many smaller ones. Using loops to call a print function repeatedly is flexible, but it can also be slow.

In Python there are many other ways to form a large string that has a repeating pattern. The itertools module allows us to iterate over the repetitions and print them individually, but this too requires a loop. We could use a list comprehension to loop more efficiently and join the items of the resulting list into a string at the end. We could use the repeat function in the operator module, which accepts the string and a number of repetitions as arguments. Or we could use the * operator, which works on strings the same way it works on lists. There may be even more ways, but right now I am not aware of them.
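As a rough illustration, the four approaches could look like the sketch below. The snippet, the count and the function names are made up for this example. Note that operator.repeat exists only in Python 2, so the sketch uses operator.mul as a stand-in; under Python 3 it gives the same result for a string and an integer.

```python
import io
import itertools
import operator

SNIPPET = "<div class='comment'>...</div>"  # hypothetical markup for one comment
COUNT = 1000                                # hypothetical number of repetitions


def with_itertools(snippet, count):
    # Loop over itertools.repeat and print each repetition individually.
    # An in-memory buffer stands in for the real output stream.
    out = io.StringIO()
    for piece in itertools.repeat(snippet, count):
        print(piece, end="", file=out)
    return out.getvalue()


def with_comprehension(snippet, count):
    # Fill a list with the repetitions and join them at the end.
    return "".join([snippet for _ in range(count)])


def with_operator(snippet, count):
    # Stand-in for Python 2's operator.repeat(snippet, count).
    return operator.mul(snippet, count)


def with_star(snippet, count):
    # The * operator applied directly to the string.
    return snippet * count


if __name__ == "__main__":
    for build in (with_itertools, with_comprehension, with_operator, with_star):
        print(build.__name__, len(build(SNIPPET, COUNT)))
```

All four functions should return the same string; they differ only in how they get there.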

Measuring how these alternatives perform gives us a better picture of what to expect. Sometimes, however, what matters is not how fast the string is generated, but how fast the browser can work with the large piece of code we have generated. Many browsers struggle with an excessive number of DOM nodes, yet we need those nodes to create anything worthwhile. Here are the averages of the five measurements I made for each of the four cases described above.

[Figure: tests on string repetition in Python]
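For anyone who wants to try something similar, here is a minimal sketch of averaging five runs per variant with the timeit module. It is not the setup used for the numbers above: it measures only string generation, not the browser, and the snippet, the count and the names are assumptions chosen to keep the run short.

```python
import operator
import timeit

SNIPPET = "<li>comment</li>"  # hypothetical markup for one comment
COUNT = 10000                 # assumed; far fewer repetitions than a stress test
RUNS = 5                      # five measurements per variant


def loop_concat():
    # Baseline: grow the string one concatenation at a time.
    result = ""
    for _ in range(COUNT):
        result += SNIPPET
    return result


VARIANTS = {
    "loop concatenation": loop_concat,
    "list comprehension + join": lambda: "".join([SNIPPET for _ in range(COUNT)]),
    "operator.mul": lambda: operator.mul(SNIPPET, COUNT),
    "* operator": lambda: SNIPPET * COUNT,
}


def average_seconds(func, runs=RUNS):
    # timeit.timeit returns the total time for `runs` calls.
    return timeit.timeit(func, number=runs) / runs


results = {name: average_seconds(func) for name, func in VARIANTS.items()}

if __name__ == "__main__":
    for name, seconds in sorted(results.items(), key=lambda item: item[1]):
        print("{:<28} {:.6f} s".format(name, seconds))
```

The absolute numbers will vary between machines and runs, so only the relative ordering is worth reading into.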

The * operator seems to be the fastest here, but not by a large margin, considering that the number of repetitions is almost impractical for applications that require fast response times. If such an operation takes over 40 seconds on average, we can’t expect to do much with it in front of a user. During the tests I observed high variability in the results, and since the strings were sent to the browser (where they were only partially visible), I expect that differences in how it handled each run also played a role.

What is surprising here is that the list comprehension seems to be almost as fast as the repeat function in the operator module, whereas I had expected the latter to be more efficient. It doesn’t seem logical to me that something general purpose would keep up with something focused on a single purpose, but this is what the numbers show. If I had to choose between the two, I would probably still pick the repeat function.

This shows that nothing is as clear as it seems. Repeating similar UI elements is good when it makes the design feel more consistent, but within the code the repetition should be abstracted away to avoid character soup. A decision must be made whether it is better to repeat one thing an almost infinite number of times or to repeat many things much less frequently. The two choices lead to very different looks and feels.
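One way to read "abstracted" here is to keep the repeated wrapper in a single small function instead of pasting it around. The sketch below assumes a made-up comment markup and made-up function names; it is only meant to show the idea.

```python
from html import escape


def render_comment(author, body):
    # The wrapper markup lives in one place; only the content varies.
    return "<div class='comment'><b>{}</b><p>{}</p></div>".format(
        escape(author), escape(body)
    )


def render_thread(comments):
    # comments is an iterable of (author, body) pairs.
    return "".join(render_comment(author, body) for author, body in comments)


if __name__ == "__main__":
    print(render_thread([("Ann", "First!"), ("Bob", "Is 1 < 2?")]))
```

The page still repeats the same element many times, but the code states the pattern once.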

bit.ly/1oYQKal