After the work we’ve done in prior parts of this series, e.g. part 11, to show vector inner products and matrix multiplication using an example of pricing out an order at a fast food place, I’d like to work through an example that looks very different at first: an example of a perspective drawing.
If we first focus on the top drawing, we could come up with a likely story about what the drawing represents. I think of it as a picture of four cubes, stacked in such a way that one cube isn’t really visible. The sides of the cubes are 1 unit long in each direction. I’ve drawn axes and scales to correspond to this. The x-axis points towards the right, the z-axis points straight up, and the y-axis points away from us, into the paper, so to speak. The point labeled P is 2 units to the right, 1 unit back, and 1 unit up, and so has x,y,z coordinates of 2,1,1, which in standard notation is given as P(2,1,1). The point Q(0,2,2) isn’t marked, but you should be able to locate it: it’s on the top cube, on top, all the way in the back, and on the left.
Using our vector notation, we could indicate the information we have about P and Q as follows:
We can show the information about P and Q separately, as is done on the left, or combine them into a single matrix as is done on the right.
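As a minimal sketch of the same idea in code, the two points can be held as separate vectors or stacked into one matrix. The names `P`, `Q`, and `points_xyz` are illustrative, and the layout (one row per point, one column per coordinate) is an assumption chosen to match the row-times-column rule used later in this article.

```python
# Coordinates taken from the text: P(2,1,1) and Q(0,2,2), in x, y, z order.
P = [2, 1, 1]
Q = [0, 2, 2]

# Combined into a single matrix: one row per point, one column per coordinate.
# (This row/column layout is an assumption, not necessarily the figure's layout.)
points_xyz = [P, Q]

for name, (x, y, z) in zip("PQ", points_xyz):
    print(f"{name}: x={x}, y={y}, z={z}")
```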
If we now focus on the bottom drawing, we note that the picture, as such, is identical to the one on top. But this one we’re going to interpret entirely as a two-dimensional picture. And in a way, it really is! It’s a picture of squares and parallelograms, all drawn on a single plane, the plane of the paper, or the plane of the screen. We can also use a pair of coordinate axes for this drawing, and I’ve indicated such a pair, though you may note I’ve used u and v rather than the more standard x and y. In this picture, the point P could be given as P(2.5,1.5), or P and Q could be given in matrix notation as follows:
There is a relationship between the x,y,z values and the u,v values, and this relationship is given by the way the perspective works. The perspective I’ve used in the drawing is not the one you learn in art class, with horizons and points where parallel lines meet in the distance. The perspective here is the one you see (with some variations) in engineering drawings. These variations are sometimes called oblique perspective, or cavalier perspective, or cabinet perspective. I’m not sure where this one fits in; I think it is closest to cavalier perspective. (Where I grew up, it was called “engineer’s perspective.”) Regardless, an essential feature is that lines that are parallel in space will remain parallel in the perspective drawing, and that equal distances along parallel lines in space will remain equal distances in the perspective drawing (though equal distances in different directions in space may not show as equal distances in the perspective drawing).
The relationship between x,y,z and u,v can itself be represented as a matrix:
and we can call this matrix the projection matrix (textbooks will usually call this the transformation matrix). One way to view this relationship is to look at the rows. The row for x indicates that one unit of x contributes a unit to u and nothing to v. The row for y indicates that one unit of y contributes half a unit to u and half a unit to v. The row for z indicates that one unit of z contributes nothing to u and one unit to v. We can determine these values by moving one unit in the direction of x and seeing what that does to u and v, and then repeating this for y and z. Or we could see this by looking at the marker “1” on the x axis and writing down its coordinates in the u,v system. Since the x axis and the u axis are kind of the same, it makes sense to have x be one u and no v. Similarly, the z axis and the v axis are the same. The trickier one is the y axis, and we can see that the “1” marker on the y axis corresponds to (.5, .5) in the (u,v) grid.
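To make the row-by-row reading concrete, here is a small sketch that writes the projection matrix out with the values described above and moves one unit along each axis in turn. The names `projection` and `project` are hypothetical helpers introduced only for this illustration.

```python
# Projection matrix as described in the text: rows are x, y, z; columns are u, v.
projection = [
    [1.0, 0.0],  # one unit of x -> one unit of u, nothing to v
    [0.5, 0.5],  # one unit of y -> half a unit of u, half a unit of v
    [0.0, 1.0],  # one unit of z -> nothing to u, one unit of v
]

def project(point):
    """Map an (x, y, z) point to (u, v) by adding up each coordinate's contribution."""
    u = sum(coord * row[0] for coord, row in zip(point, projection))
    v = sum(coord * row[1] for coord, row in zip(point, projection))
    return (u, v)

# Stepping one unit along each axis reproduces the matching row of the matrix.
for axis, step in zip("xyz", [(1, 0, 0), (0, 1, 0), (0, 0, 1)]):
    print(axis, project(step))
```

Each printed pair is the (u,v) position of the “1” marker on that axis, which is just another way of reading the rows.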
The relationship between x,y,z and u,v for any particular point can now be expressed as a matrix multiplication:
To the points P and Q you can see that we’ve added points O and R. O is the origin of the axes, and R is the point that all four cubes have in common. On the top right we’ve shown the projection matrix which connects x,y,z and u,v; on the bottom left we have the coordinates of the four points in the x,y,z grid; on the bottom right we have the coordinates of the four points in the u,v grid. And, guess what, the matrix on the bottom right is the matrix product of the other two matrices. For example, the u coordinate of Q is found by multiplying the x coordinate of Q by 1, the y coordinate of Q by .5 and the z coordinate of Q by 0, and then adding up all these products. This gives us (0 × 1) + (2 × .5) + (2 × 0) = 1. Similarly, the v coordinate of Q is found as (0 × 0) + (2 × .5) + (2 × 1) = 3. Each number in the bottom right matrix is found as the inner product of the row on the left and the column on top.
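The whole calculation can be sketched in a few lines of plain Python. This is only an illustration: `matmul` is a hypothetical helper written for this example, and R is left out because its coordinates are not spelled out in the text.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows.

    Entry (i, j) of the result is the inner product of row i of a
    with column j of b.
    """
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# One row per point, columns x, y, z.
points_xyz = [
    [0, 0, 0],  # O, the origin
    [2, 1, 1],  # P
    [0, 2, 2],  # Q
]

# Rows x, y, z; columns u, v.
projection = [
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
]

points_uv = matmul(points_xyz, projection)
for name, (u, v) in zip("OPQ", points_uv):
    print(f"{name}: u={u}, v={v}")
```

The row for P comes out as (2.5, 1.5) and the row for Q as (1, 3), matching the values worked out by hand above.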
If you wonder what the practical use of this might be, you don’t have to look very far. Almost any recent computer game you care to examine will spend a lot of its resources figuring out where to put a pixel on your two-dimensional screen, a pixel that represents a point in three-dimensional space. Even though the game’s perspective may include horizons and vanishing points, and even though it also needs to worry about color gradients and shading, the process of taking a simulated three-dimensional scene and rendering it on a two-dimensional screen is done through vast amounts of matrix multiplications that are essentially of the type shown above. The numbers in the projection matrix will be different, depending on the location and position of the simulated camera, but ultimately it still boils down to matrix multiplication. Before computers (specifically, computer graphics cards) could handle massive amounts of matrix multiplication in real time and in high resolution, games were largely restricted to what were called side-scrolling games.
What we’ve seen here is that both pricing orders and projecting three-dimensional scenes on a two-dimensional screen have something surprising in common: a relationship between numbers in the matrices that goes beyond the particular labels that indicate what these numbers mean. This core pattern must be important in some way.