In the prior parts of this series, I have looked at a number of problem situations from which a certain way of grouping, organizing and operating on data arises straightforwardly, a way of grouping data that is well known in traditional mathematics as vectors and matrices. I have also hinted along the way that the traditional textbook treatment of these subjects is quite different. In this post I want to make some comparisons with what’s usually taught in the later grades of high school and in undergraduate curricula (specifically for science and engineering). I have no particular illusion that this will be of interest to people who have not encountered that curriculum before, so feel free to skip this post; the next posts will return to applications and models.
In school, a vector is often identified with an arrow in three-dimensional space. It is pictured as having a size and a direction. The vector has three components: an x-component, a y-component and a z-component. One example would be the force acting on a point mass, another would be the velocity of that point mass, and the electric field at a particular location can also be pictured as such a vector. The joint effect of two forces, often called the resultant force, can be drawn as the diagonal of a parallelogram for which the original forces form the sides. This is then called the sum of the two vectors.
A matrix is often introduced as a three-by-three arrangement of numbers that can have a variety of uses. In algebra class, a matrix may be first introduced when looking at a system of linear equations with multiple unknowns, e.g.
The system of three equations with three unknowns, shown at the top, is then rewritten by separating out the pattern of coefficients, as shown on the bottom. This pattern of coefficients is called a matrix, the arrangement with x, y and z is called the unknown vector, and the group of numbers 23, 11, 17 is called the known vector, or sometimes the right-hand side vector. The big things that look like parentheses are traditionally used to “hold” the numbers in the matrix, and similarly to hold the numbers in a vector. Sometimes big square brackets are used instead. The matrix of coefficients is said to be multiplied with the unknown vector to give the known vector.
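To make "matrix times unknown vector gives known vector" concrete, here is a minimal sketch in Python. The coefficients and the values of x, y and z are made up for illustration; they are not the ones from the system pictured above, and the helper name `mat_vec` is my own.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) with a vector (list of numbers).

    Each row of coefficients is combined with the vector: multiply
    matching entries and add up the products, one result per equation.
    """
    return [sum(c * v for c, v in zip(row, vector)) for row in matrix]


# Hypothetical coefficients of three equations in x, y and z.
coefficients = [
    [2, 1, 1],  # 2x + 1y + 1z
    [1, 3, 2],  # 1x + 3y + 2z
    [1, 0, 1],  # 1x + 0y + 1z
]

# Hypothetical values for the unknowns x, y, z.
unknowns = [1, 2, 3]

# The "known vector" is what the three left-hand sides evaluate to.
known = mat_vec(coefficients, unknowns)
print(known)  # [7, 13, 4]
```

Solving the system is the reverse problem: given the coefficients and the known vector, find the unknowns.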
The matrix of coefficients is an example of a square matrix, here 3 by 3, but you can also use the same idea for a system of 2 equations with 2 unknowns, or 7 equations with 7 unknowns.
Multiplication of one matrix and another is often introduced using square matrices only, and this has some appeal because it is a closed system. When you multiply one three-by-three matrix with another three-by-three matrix, what you get is yet another three-by-three matrix. For the curriculum writers, matrix multiplication using square matrices is often their earliest opportunity to show a non-commutative system, one where multiplication is associative but not commutative: the order in which you multiply two square matrices matters. How you multiply two matrices, or why, is often dealt with strictly as a recipe, as a number of steps to follow. Here is an example of such an attempt, typical in its recipe-like approach, yet unusually visual and animated in its execution. Watch the moving hands! Compared to the moving hands, a college textbook’s formula seems positively antediluvian. But even the nice moving hands don’t make clear why you are multiplying this way. Most baffling, to me, is how anybody is supposed to keep track of what all these numbers in the matrix stand for.
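The textbook claim that square-matrix multiplication is associative but not commutative can be checked with a small sketch. The function `matmul` below is my own plain-Python version of the usual row-times-column recipe, and the three 2-by-2 matrices are arbitrary examples.

```python
def matmul(a, b):
    """Row-times-column recipe for matrices stored as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of the left matrix must match rows of the right")
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


# Three arbitrary 2-by-2 matrices, bare numbers without labels.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 3]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)  # [[2, 1], [4, 3]]
print(BA)  # [[3, 4], [1, 2]]

# Not commutative: swapping the order changes the result.
print(AB == BA)  # False

# Associative: the grouping does not matter, only the order does.
print(matmul(AB, C) == matmul(A, matmul(B, C)))  # True
```

Note that nothing in these lists of numbers says what any entry stands for, which is exactly the bookkeeping problem.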
In our approach in prior parts of this series, I indicated how matrix multiplication is symmetric, that is, commutative. This would seem to fly in the face of everything you learn in normal textbooks. There is no real conflict between the two approaches, but rather an easy way to confuse what is going on when the rows and columns of a matrix are not labeled the way I’ve suggested doing. In traditional treatments, the issue of matrix multiplication being non-commutative usually only shows up when multiplying square matrices. When you multiply a 3×5 matrix with a 5×7 matrix, you get a 3×7 matrix, but you can’t multiply a 5×7 matrix with a 3×5 matrix. In the traditional treatment, that multiplication just cannot be done. And because of that, there is rarely any confusion between matrix A times matrix B versus matrix B times matrix A, since only one of them makes sense at all. When both matrices are square, confusion can arise. In our treatment earlier, for two matrices to be multiplied, there has to be a shared edge. To have a shared edge requires more than simply having the same number of numbers: the labels have to match as well. After all, I would want to multiply the number of fries ordered with the price of fries, and not multiply the number of fries with the price of chicken. The labels help you keep it all straight.
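The traditional shape rule from this paragraph can be written down in a few lines. This is only a sketch; the function name `product_shape` is hypothetical, and shapes are given as (rows, columns) pairs.

```python
def product_shape(left, right):
    """Shape of left-times-right under the traditional rule.

    The number of columns of the left matrix must equal the number
    of rows of the right matrix; otherwise the product is undefined.
    """
    rows_left, cols_left = left
    rows_right, cols_right = right
    if cols_left != rows_right:
        raise ValueError(f"cannot multiply {left} with {right}")
    return (rows_left, cols_right)


print(product_shape((3, 5), (5, 7)))  # (3, 7)

# The reverse order is simply not defined.
try:
    product_shape((5, 7), (3, 5))
except ValueError as err:
    print(err)  # cannot multiply (5, 7) with (3, 5)

# With square matrices both orders pass the check, so the shapes
# alone no longer tell you which order was intended.
print(product_shape((3, 3), (3, 3)))  # (3, 3)
```

The check only counts rows and columns. It cannot tell whether the five columns on the left mean the same five things as the five rows on the right.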
In the traditional treatment, the order of the numbers in the matrix and the separation between rows and columns become sacrosanct. Matrix multiplication is then described in terms of combining a row from the matrix on the left with a column from the matrix on the right. In the treatment we have given in the prior parts of this series, what is shown as a row and what is shown as a column, and in what order the products on the menu are listed, are largely a matter of convenience. What matters is that we combine pieces of data that all relate to french fries, or pieces of data that all relate to chickenburgers.
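Here is a minimal sketch of what combining by label could look like, using Python dictionaries keyed by menu item. The quantities, the prices (in cents) and the function name `combine` are made up for illustration and are not taken from the earlier posts.

```python
def combine(a, b):
    """Multiply entries that carry the same label and add up the products.

    The two collections must share exactly the same labels (the
    "shared edge"); the order in which items are listed is irrelevant.
    """
    if set(a) != set(b):
        raise ValueError("labels do not match, so there is no shared edge")
    return sum(a[label] * b[label] for label in a)


# Hypothetical order and price list, deliberately listed in different orders.
order = {"fries": 3, "chickenburger": 2}
price_in_cents = {"chickenburger": 450, "fries": 175}

print(combine(order, price_in_cents))  # 1425
print(combine(price_in_cents, order))  # 1425, same result either way round

# Same number of numbers, but different labels: refused.
drinks_price = {"cola": 150, "water": 100}
try:
    combine(order, drinks_price)
except ValueError as err:
    print(err)  # labels do not match, so there is no shared edge
```

Because the pairing is done by label rather than by position, swapping the two operands or reordering the menu changes nothing, while a mismatch in labels is caught even when the sizes agree.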
The compactness of the traditional notation for matrices, achieved by just listing the numbers, comes at the risk of losing track of what number stands for what, and of what the numbers, vectors and matrices mean.