If you have read earlier installments of this series – and I know it has gotten long – you may have noticed that very few of the matrices that appeared there have the same number of rows and columns. A matrix that does have the same number of rows as columns is known as a square matrix. Yet if you have any experience at all with matrices from high school or college, you may have noticed that almost all of those are square. It isn’t unusual for a textbook on matrix algebra to show one or two non-square matrices on the first page, and then have all remaining pages deal exclusively with square matrices. What’s so special about square matrices? Why do they end up drowning out almost all other types?
If you think back to the order matrix in part 16, you can see that for it to be square, the number of orders has to somehow be exactly the same as the number of items on the menu. That situation would be a total coincidence, and would last only until the next car shows up at the order window. It appears then that the matrices in textbooks must arise from entirely different scenarios than the ones we’ve played with till now.
What are the scenarios that underlie the matrices found in textbooks? It turns out that this is not a trivial question, or at least it doesn’t have a trivial answer. For the textbooks often don’t tell you where a matrix comes from. They may not care, or know, particularly.
My impression is that most matrices in textbooks come from two broad application areas. One, systems of equations, and two, transformations. For those broad application areas, we can justify why many – if not most – of the matrices involved would be square.
Let’s take a quick look at equations (and leave transformations for another day). An equation like 2x+y=60 has many solutions. There are lots of combinations of numbers so that double the first, added to the second, gives us 60. For example, 1 for x and 58 for y will do the trick. But so will 10 for x and 40 for y. If I think of the equation as a clue for what x and y must be, the clue isn’t powerful enough to nail down x and y. But if I have another clue, another hint, like x+2y=75, together these hints may be enough to nail down x and y precisely. Together, these clues give us what is called a system of equations, and they would normally be written in math class like this:
$$
\begin{aligned}
2x + y &= 60 \\
x + 2y &= 75
\end{aligned}
$$
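To see the two clues at work, here is a small Python sketch. It is my own illustration, and it restricts x to whole numbers from 0 to 30 purely for convenience. It lists the pairs that satisfy the first clue, then keeps only those that also satisfy the second.

```python
# Clue 1: 2x + y = 60.  Clue 2: x + 2y = 75.
# Assumption for illustration only: x is a whole number from 0 to 30.
clue1_pairs = [(x, 60 - 2 * x) for x in range(31)]

# Every pair above satisfies clue 1, so one clue alone leaves many candidates.
print(len(clue1_pairs), "pairs satisfy the first clue")

# Keep only the pairs that also satisfy clue 2.
both_clues = [(x, y) for (x, y) in clue1_pairs if x + 2 * y == 75]
print(both_clues)  # [(15, 30)]
```

With both clues in play, only one candidate survives: 15 for x and 30 for y.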
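By the time students get to matrix algebra in school, this same system of equations would now be written like this:
$$
\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} 60 \\ 75 \end{bmatrix}
$$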
where the group of 2 by 2 numbers on the left is called the matrix of coefficients, the vector with the x and y in it is called the vector of unknowns, and the vector on the right is called the known vector, or the right-hand side vector. This use of a matrix and vectors is consistent with the notion of matrix multiplication, but at this point the matrix and vectors are often introduced simply as a shift in the notation for the system of equations. The matrix form is shorter and more compact, which helps especially when you go through the steps of what is known as Gaussian elimination. If the system of equations can be solved, it turns out that the solution depends on the known vector in an interesting way. This way can itself be expressed in matrix notation, using what is called an inverse matrix, and we can write
$$
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}^{-1}
\begin{bmatrix} 60 \\ 75 \end{bmatrix}
$$
I’m skipping a bunch of steps on purpose here, including non-trivial ones like how we would find this inverse matrix in the first place. Here, my main interest is in recovering (imputing) the logic of the progression of topics and techniques in traditional textbooks of matrix algebra.
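For the curious, here is a minimal Python sketch of one of those skipped steps, for the 2 by 2 case only. It uses the standard closed-form inverse of a 2 by 2 matrix (swap the diagonal entries, negate the off-diagonal ones, divide by the determinant). The helper names `inverse_2x2` and `mat_vec` are made up for this illustration; this is not how larger systems are solved in practice.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Invert a 2 by 2 matrix [[a, b], [c, d]] (hypothetical helper)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix has no inverse")
    # Swap the diagonal, negate the off-diagonal, divide by the determinant.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def mat_vec(m, v):
    """Multiply a matrix by a vector, one row at a time."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in m]

coefficients = [[2, 1], [1, 2]]   # matrix of coefficients
known = [60, 75]                  # right-hand side vector

inverse = inverse_2x2(coefficients)
x, y = mat_vec(inverse, known)
print(x, y)  # 15 30
```

Multiplying the matrix of coefficients by the result gives back 60 and 75, which is a quick check on the answer.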
If square matrices come from systems of equations, where do the systems of equations come from? In many textbooks, systems of equations are simply the starting point – they appear as if dropped from the sky. Let’s take another look at the following comic strip (we encountered it earlier in this post):
The brother in the strip can at least come up with a half-way reasonable scenario that might have given rise to the system of equations shown above. In doing so, he is at a disadvantage: he has to make something up. He’s working backwards. He is making up a “story problem” working backwards from the system of equations. You can’t blame him for coming up with something that – even though Paige can relate to it – is still kind of lame. In what real-life situation would you really know (and remember) the cost of two shirts and a sweater, as well as the cost of one shirt and two sweaters – but not remember the price of each item? If the price of a sweater and the price of a shirt are not known to us, and can be recovered only by solving a system of equations, it is only because they have been deliberately hidden from us – and what store would have any reason to do that? But many people like puzzles, and we could view this as a puzzle, the type of puzzle we would call a math puzzle.
From the systems of equations and the matrices in math class, you might never ever guess that vast amounts of money and vast amounts of computer resources are used every day all across the world to perform matrix operations – operations on square matrices even, operations above and beyond the matrix-vector inner product stuff like pricing out orders in the way we saw in the prior parts of this series.
Let me end by sketching a somewhat more realistic example of a problem where you end up with a system of equations. You test a sample of concrete that is supposed to have a certain amount of steel in it. Concrete is cheap, and steel is expensive but crucial to the strength of the concrete. Your supplier would have had incentives to skimp on the amount of steel used. You know how much pure steel weighs per cubic inch, you know how much pure concrete weighs per cubic inch, and you have measured the weight and the volume (cubic inches) of your sample. What is the composition of your sample?
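One way to set this up, as a sketch: let s be the cubic inches of steel in the sample and c the cubic inches of concrete. The two volumes must add up to the measured volume, and the two weights must add up to the measured weight, which gives two equations in two unknowns. All numbers below are invented for illustration (rough densities in pounds per cubic inch and a hypothetical sample), not real measurements.

```python
# Illustrative, assumed numbers only (not real measurements):
STEEL_DENSITY = 0.28      # pounds per cubic inch, rough value
CONCRETE_DENSITY = 0.08   # pounds per cubic inch, rough value
sample_volume = 100.0     # cubic inches, hypothetical sample
sample_weight = 9.0       # pounds, hypothetical sample

# System of equations in the unknown volumes s (steel) and c (concrete):
#   s + c = sample_volume
#   STEEL_DENSITY * s + CONCRETE_DENSITY * c = sample_weight
# Solved here with Cramer's rule for a 2 by 2 system.
det = 1 * CONCRETE_DENSITY - 1 * STEEL_DENSITY
steel = (sample_volume * CONCRETE_DENSITY - 1 * sample_weight) / det
concrete = (1 * sample_weight - STEEL_DENSITY * sample_volume) / det

print(f"steel:    {steel:.1f} cubic inches")
print(f"concrete: {concrete:.1f} cubic inches")
print(f"steel share of volume: {steel / sample_volume:.1%}")
```

With these made-up numbers the sample works out to about 5 cubic inches of steel and 95 of concrete. If the specification called for more steel than that, you would have grounds for a conversation with the supplier.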
The structure of the concrete/steel problem isn’t all that different from the shirt/sweater problem. It is different in that for the shirt/sweater problem you might simply call the store, or look up the prices on the web. In the concrete/steel problem you might go ahead and destroy the sample to look at the steel inside, and this might well be a good thing to do. Yet solving the system of equations would be a quick way to establish that there is insufficient steel in the concrete.
Testing the purity of drugs (whether the legal or illegal kind) can be done by similar techniques.
Do you have better examples of solving systems of equations? The criteria I’m looking for are (1) that it’s real-life, and (2) that it’s easy to state for a non-specialist audience.