Linear equations, matrices and solvers

Solving equations is one of the first things we learn from an early age. Many diverse numerical problems can be solved by identifying the combination of variables that satisfies a given equation. To solve for n variables, we need n independent equations in which those variables occur. But this information may be hard to extract from the environment: it may be incomplete or even wrong, in which case we would arrive at a wrong result. We can be easily misled by the results of the various measurements and experiments needed to obtain the relations between these variables, and our initial margin of error may grow in the process of solving the equations, leaving us with something that is far from correct.

Systems of linear equations can be solved with the help of matrices: we write the variable coefficients in ordered form and then apply Gaussian elimination. The goal of the latter is to bring the matrix into upper triangular form through elementary row operations: adding a multiple of one row to another, multiplying a row by a non-zero scalar, and swapping two rows. The same method also gives us the inverse of a matrix. If we write a matrix A on the left and the identity matrix on the right, we can apply Gaussian elimination to the combined rows; once the initial matrix A has been reduced to the identity matrix, the initial identity matrix has become a new matrix B, and B is the inverse of A. Gaussian elimination also lets us determine the rank of a matrix, which equals the number of non-zero rows (the “stairs”) in the upper triangular (row echelon) form. A system of linear equations is solvable exactly when the rank of the coefficient matrix A equals the rank of the augmented matrix A|b, where b is the single column holding the values from the right side of each equation (the Rouché–Capelli theorem). If both ranks are equal, we can solve A.x = b by back substitution, working upwards from the last non-zero row of the triangular form.
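
A minimal sketch of the procedure in Python with numpy (which the next paragraph also relies on). The 3x3 system is a textbook example, and the helper name gaussian_elimination is my own:

    import numpy as np

    def gaussian_elimination(A, b):
        # Bring the augmented matrix [A|b] to upper triangular form,
        # then back-substitute. Partial pivoting (row swaps) avoids
        # dividing by a zero or tiny pivot.
        M = np.hstack([A, b.reshape(-1, 1)]).astype(float)
        n = len(b)
        for col in range(n):
            pivot = col + np.argmax(np.abs(M[col:, col]))
            M[[col, pivot]] = M[[pivot, col]]   # swap in the best pivot row
            for row in range(col + 1, n):       # eliminate below the pivot
                M[row] -= (M[row, col] / M[col, col]) * M[col]
        x = np.zeros(n)
        for row in range(n - 1, -1, -1):        # back substitution
            x[row] = (M[row, -1] - M[row, row + 1:n] @ x[row + 1:]) / M[row, row]
        return x

    A = np.array([[ 2.0,  1.0, -1.0],
                  [-3.0, -1.0,  2.0],
                  [-2.0,  1.0,  2.0]])
    b = np.array([8.0, -11.0, -3.0])

    # Rouché–Capelli check: solvable when rank(A) == rank(A|b).
    Ab = np.hstack([A, b.reshape(-1, 1)])
    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(Ab))  # True

    print(gaussian_elimination(A, b))   # [ 2.  3. -1.]
    print(np.linalg.solve(A, b))        # numpy's built-in solver agrees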

When trying to solve linear equations with the help of matrices, it is useful to choose a programming language that lets us easily get and set the individual a_ij values. Python is convenient for working with big matrices, especially when more precision is needed. With the help of the numpy package, we can easily select a matrix block, a certain row or column, an individual value, every n-th value in a row or column, a diagonal of values, or even non-neighbouring values (fancy indexing). Everything is returned as a convenient numpy array that can be used in further operations. It is important to understand whether we are operating on the original array or on an independent copy: basic slices are views that share memory with the original, so writing through them changes it, while fancy indexing returns a copy (the sketch below demonstrates both). We may also need to prune our matrices, especially when they grow large and start to consume extra memory, as we don’t want to slow down our calculations; at the same time, we need to make sure that our data remains correct, or at least representative. I’m not sure how numpy arrays are stored internally, but in a regular Python list, the same small values may be stored only once in memory, with multiple references pointing to them. This is space-efficient, but at the cost of harder reference management (which remains hidden from us). Linear algebra has lots of applications (CSS transforms are only a small fraction of the visible surface), so it may be valuable to look around and see if we can find more uses for it. It’s a kind of reverse thinking, but it can give us some interesting new ideas to work on.
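
Here is a small sketch of those selection options and of the view-versus-copy behaviour; the array contents are arbitrary:

    import numpy as np

    a = np.arange(16).reshape(4, 4)   # a 4x4 matrix of the values 0..15

    block    = a[0:2, 0:2]            # top-left 2x2 block
    row      = a[1, :]                # the second row
    column   = a[:, 2]                # the third column
    value    = a[3, 3]                # a single a_ij entry
    strided  = a[0, ::2]              # every 2nd value in the first row
    diagonal = np.diag(a)             # the main diagonal
    fancy    = a[[0, 3], [1, 2]]      # non-neighbouring entries a[0,1], a[3,2]

    # Basic slices are views: writing through them changes the original.
    block[0, 0] = 99
    print(a[0, 0])    # -> 99

    # Fancy indexing returns a copy: the original stays untouched.
    fancy[0] = -1
    print(a[0, 1])    # -> 1, unchanged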

Although Gaussian elimination can be used to solve systems of linear equations, it can be computationally expensive: for an n-by-n system it needs on the order of n^3 operations. This is why specialized solvers have been developed that tackle the task more efficiently, for instance by exploiting sparsity or special structure in the matrix. Commercial math software has them built in, but it can be expensive and therefore inaccessible. GNU Octave is a free alternative, but it may be hard to install on a Windows platform. Other software packages may require virtualization, which isn’t always possible (or usable) on slower machines.
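
In the Python ecosystem, the free scipy package ships such solvers as well. A minimal sketch, assuming scipy is installed; the tridiagonal system here is made up, chosen because a sparse solver can exploit all its zeros:

    import numpy as np
    from scipy.sparse import diags
    from scipy.sparse.linalg import spsolve

    n = 1000
    # A made-up tridiagonal system: 2 on the diagonal, -1 next to it.
    # CSR storage keeps only the non-zero entries.
    A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n)).tocsr()
    b = np.ones(n)

    x = spsolve(A, b)             # sparse direct solver
    print(np.allclose(A @ x, b))  # -> True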

The title of this post hints that we are trying to close a cycle between mathematical and computational knowledge by following the arrows. This is similar to the way a web designer tries to close the cycle of everything from ideation to realization. We may pretend that an equation can exist by itself, but without a solution it won’t be interesting. We may pretend that the technology we use to build our websites can exist by itself, but without users it won’t be interesting. What matters is to know what we are doing, why we are doing it, and how everything fits together into a meaningful result.

Is maths necessary to be a good programmer? Maybe it depends on the problems we are trying to solve. If we are dealing mainly with simple database inserts and updates, we probably only keep track of the last inserted ID. As the complexity of a problem grows and it becomes more valuable to solve (see some Kaggle examples), so does the reliance on mathematical methods for its computation. If we want to be exact, we can’t avoid the exact sciences. One way to increase our motivation is to pick harder problems (although they require more advanced math skills); even if we don’t reach our goals, we would still be better off than by doing the easy things again. I don’t say that all math is automatically valuable. Many complex formulas are hard to read and digest, and going through all the materials may be time-consuming and unjustifiable. At the same time, going only for what we immediately need may leave us without clues on how to improve it later. Nevertheless, maths is much more than producing an income statement or placing the break-even point on a 2D plot. How many items do we need to sell until we are profitable? Maths and previous experience will tell us how realistic that is, if we can solve the underlying equation.
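
To make the break-even question concrete: with fixed costs F, a selling price p and a variable cost v per item, we break even after q = F / (p - v) items. A tiny sketch with invented numbers:

    # Hypothetical figures: 12000 in fixed costs, each item sells
    # for 25 and costs 10 to produce.
    from math import ceil

    F, p, v = 12000.0, 25.0, 10.0
    q = F / (p - v)        # break-even quantity
    print(ceil(q))         # -> 800 items before we turn a profit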

bit.ly/1c8Qhtp