Dr. Mark V. Sapir

Vector Spaces

One of the main ideas of algebra is the following. Consider a set of objects studied in some area of mathematics or physics or any other science (say, the set of all numbers, the set of all vectors on a plane, the set of all functions, the set of all theorems in a calculus book, etc.). Usually there are certain operations that one can perform on these objects (say, one can add, multiply, subtract numbers, vectors or functions). A collection of objects and operations is usually called an algebraic system.

Then we choose some minimal set of basic operations such that all other useful operations are composed from the basic ones.

Then we look at what kind of important statements about our system of objects we can prove. We examine the proofs and find a minimal system of basic properties of our operations that these proofs use.

After that we say that every algebraic system which satisfies these basic properties is similar to the one we started with. And indeed, all important theorems which hold in the initial algebraic system will hold in any similar algebraic system. The important fact is that even if two algebraic systems are similar from the algebraic point of view, they may come from completely different parts of science, and their nature may be completely different. This allows us to transfer knowledge from one part of science to another.

For example, consider the set of all vectors on a plane. Choose a coordinate system with two unit orthogonal vectors e and f. Then every vector v on the plane is a linear combination of e and f, that is,

v = xe + yf

for some numbers x and y (the coordinates of v). We can perform the following three operations on vectors:

Addition: (xe+yf)+(pe+qf)=(x+p)e+(y+q)f,
Multiplication by a scalar: p*(xe+yf)=(px)e+(py)f,
Dot product: <(xe+yf),(pe+qf)>=xp+yq.

Notice the difference between the first two of these operations and the third one. The results of the first two are vectors (so we do not escape the set of vectors), the result of the third one is a scalar (not a vector on the plane).
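The three operations can be sketched in a few lines of Python. This is only an illustration: the function names add, scale and dot are chosen here, and a vector xe+yf is stored as the pair (x, y).

```python
# A vector xe + yf on the plane is stored as the coordinate pair (x, y).

def add(u, v):
    """Addition: (xe+yf)+(pe+qf) = (x+p)e+(y+q)f."""
    return (u[0] + v[0], u[1] + v[1])

def scale(p, u):
    """Multiplication by a scalar: p*(xe+yf) = (px)e+(py)f."""
    return (p * u[0], p * u[1])

def dot(u, v):
    """Dot product: <(xe+yf),(pe+qf)> = xp+yq."""
    return u[0] * v[0] + u[1] * v[1]

u, v = (1, 2), (3, -1)
print(add(u, v))    # a vector: (4, 1)
print(scale(2, u))  # a vector: (2, 4)
print(dot(u, v))    # a scalar: 1
```

The first two functions return pairs (vectors again), while the third returns a single number.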

Using these basic operations we can express many other operations used in geometry. For example, from geometry we know another formula for the dot product:

<a,b> = ||a|| ||b|| cos(A)

where A is the angle between the vectors a and b, and ||u|| is the length of u. Taking a=b=u, so that A=0 and cos(A)=1, gives <u,u>=||u||², so the length ||u|| of a vector u is equal to sqrt(<u,u>). Therefore, for nonzero vectors a and b, cos(A)=<a,b>/(||a|| ||b||). This formula can be viewed as a definition of the angle between the vectors a and b. Thus using the operation of dot product we can express other operations (taking the length and taking the angle between two vectors).

We can also find the area of a triangle. Of course, first we need to define a triangle in terms of vectors. An appropriate definition could be the following: a triangle is a triple of vectors (a, b, a-b). From school, you remember the formula for the area of a triangle:

S=1/2 ||a|| ||b|| sin(A)

Since we know how to compute the angle between two vectors using only the dot product, we can express the area in terms of the dot product.
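One way to carry this out, given here as a sketch rather than the only possible formula: since sin(A)² = 1 - cos(A)² and sin(A) is non-negative for an angle between 0 and 180 degrees, we get S = 1/2 sqrt(<a,a><b,b> - <a,b>²). The function name triangle_area is chosen here for illustration.

```python
import math

def dot(u, v):
    """Dot product of two vectors on the plane given by coordinates."""
    return u[0] * v[0] + u[1] * v[1]

def triangle_area(a, b):
    """Area of the triangle (a, b, a-b), using only dot products."""
    # ||a||^2 ||b||^2 sin(A)^2 = <a,a><b,b> - <a,b>^2
    gram = dot(a, a) * dot(b, b) - dot(a, b) ** 2
    # max(...) guards against a tiny negative value caused by rounding
    return 0.5 * math.sqrt(max(gram, 0.0))

print(triangle_area((3, 0), (0, 4)))  # right triangle with legs 3 and 4: 6.0
```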

Thus, we can see that using only the basic operations (addition, multiplication by scalar, dot product) we can define many important geometric concepts and operations.

Now we shall consider four other examples of algebraic systems.

For every n let Rⁿ be the set of all row vectors with n components (a₁,...,a_n). This set is called n-space. A row vector with n components will be called an n-vector. The components of an n-vector are called coordinates.

Let C[0,1] be the set of all continuous functions on the unit interval [0,1].

For every k and n let M_kn be the set of all k by n matrices with real entries.

Let C be the set of all complex numbers.

One can view a 2-vector (x,y) as just another notation for a vector on a plane (xe+yf) or as a notation for the complex number x+iy.

We can add two n-vectors and multiply an n-vector by a scalar (treating them as 1 by n matrices). We can also add two continuous functions (matrices, complex numbers) and multiply a function (matrix, complex number) by a scalar (=real number).

Since n-vectors are matrices, they satisfy the following properties:

1. The addition is commutative and associative: A+B=B+A, A+(B+C)=(A+B)+C
2. The multiplication by scalar is distributive with respect to the addition: a(B+C)=aB+aC
3. The product by a scalar is distributive with respect to the addition of scalars: (a+b)C=aC+bC
4. a(bC)=(ab)C
5. 1*A=A (here 1 is the scalar 1)
6. There exists a zero n-vector 0 such that 0+A=A+0=A for every A
7. 0*A=0 (here the first 0 is the scalar 0, the second 0 is the zero-vector)

Continuous functions, matrices and complex numbers satisfy the same properties. In particular, the zero function is the function which takes every number to zero.

Using these properties we can deduce other properties of vectors, functions, matrices or complex numbers. For example, in order to solve an equation

X+A=B

we can add (-1)*A (usually denoted by -A) to both sides of the equation:

(X+A)+(-A)=B+(-A)

Then we can use the associative law:

X+(A+(-A))=B+(-A)

Then we remember that A=1*A and rewrite the equality in the following way:

X+(1*A+(-1)*A)=B+(-A)

Then we use one of the distributivity laws:

X+(1+(-1))*A=B+(-A)

Since 1+(-1)=0, 0*A=0=0*X, we can rewrite the equality again:

X+0*X=B+(-A)

Now X=1*X and we use one of the distributive laws again:

1*X+0*X=(1+0)X=B+(-A)

Since 1+0=1 and 1*X=X, we finally get:

X=B+(-A)

Of course, we denote B+(-A) by B-A, so subtraction is a derived operation: we derive it from addition and multiplication by a scalar. Notice that in the solution of the equation X+A=B we did not use the fact that A, B, X are functions or vectors (or matrices or numbers for that matter); we used only the properties of our basic operations.
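This independence from the nature of the objects can be illustrated with a short Python sketch. The function solve below is written once, in terms of an addition and a scalar multiplication passed in as arguments, and is then applied both to n-vectors and to functions. All names here (solve, vec_add, fn_add and so on) are chosen for illustration only.

```python
def solve(add, scale, A, B):
    """Return X = B + (-1)*A, the solution of X + A = B."""
    return add(B, scale(-1, A))

# n-vectors stored as tuples of numbers
def vec_add(A, B):
    return tuple(a + b for a, b in zip(A, B))

def vec_scale(k, A):
    return tuple(k * a for a in A)

# functions on [0,1] stored as Python callables
def fn_add(f, g):
    return lambda x: f(x) + g(x)

def fn_scale(k, f):
    return lambda x: k * f(x)

X = solve(vec_add, vec_scale, (1, 2, 3), (4, 6, 8))
print(X)       # (3, 4, 5)

F = solve(fn_add, fn_scale, lambda x: x, lambda x: x * x)
print(F(0.5))  # the function x*x - x at x = 0.5: -0.25
```

The same solve works in both cases because it uses nothing but the two basic operations.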

Any set of objects V where addition and scalar multiplication are defined and satisfy properties 1--7 is called a vector space. Here by addition we mean any operation which associates with each pair of objects A and B from V another object (the sum) C also from V; by a scalar multiplication we mean any operation which associates with every scalar k and every object A from V another object from V called the scalar multiple of A and denoted by kA.

Elements of general vector spaces are usually called vectors. For any system of vectors A₁,...,A_n and for any system of numbers a₁,...,a_n one can define a linear combination of A₁,...,A_n with coefficients a₁,...,a_n as

a₁A₁+...+a_nA_n.

Notice that a vector space does not necessarily consist of n-vectors or vectors on the plane. The set of continuous functions, the set of k by n matrices, and the set of complex numbers are all examples of vector spaces.

Not every set of objects with addition and scalar multiplication is a vector space. For example, we can define the following operations on the set of 2-vectors:

Addition: (a,b)+(c,d)=(a+c,d).

Scalar multiplication: k(a,b)=(k²a,b).

Then the resulting algebraic system will not be a vector space because if we take k=3, m=2, a=1, b=1 we have:

(k+m)(a,b)=(3+2)(1,1)=5(1,1)=(25,1), but k(a,b)+m(a,b)=3(1,1)+2(1,1)=(9,1)+(4,1)=(13,1).

Thus the third property of vector spaces does not hold.
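The counterexample can be checked mechanically. The sketch below implements the two unusual operations exactly as defined above (the names odd_add and odd_scale are chosen here) and compares the two sides of the third property.

```python
def odd_add(u, v):
    """The unusual addition: (a,b)+(c,d) = (a+c, d)."""
    return (u[0] + v[0], v[1])

def odd_scale(k, u):
    """The unusual scalar multiplication: k(a,b) = (k^2 a, b)."""
    return (k * k * u[0], u[1])

k, m, v = 3, 2, (1, 1)
left = odd_scale(k + m, v)                         # (k+m)(a,b)
right = odd_add(odd_scale(k, v), odd_scale(m, v))  # k(a,b)+m(a,b)
print(left, right)    # (25, 1) (13, 1)
print(left == right)  # False
```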

Euclidean Vector Spaces

Now let us define dot products. The dot product (Euclidean inner product) of n-vectors is defined as follows:

<(a₁,...,a_n),(b₁,...,b_n)>=a₁b₁+...+a_nb_n

We can also define dot products in other vector spaces considered in the previous section. The most important one is the dot product of functions. If f and g are two continuous functions on the interval [0,1] then the dot (inner) product of f and g is the integral of the product f(x)g(x) from 0 to 1. The dot product of functions f(x) and g(x) will be denoted by <f(x),g(x)>.
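The dot product of functions can be approximated numerically. The sketch below uses the midpoint rule with a fixed number of subintervals; both the rule and the name dot_fn are choices made here for illustration, and the result is an approximation of the integral, not its exact value.

```python
def dot_fn(f, g, steps=1000):
    """Approximate <f,g>, the integral of f(x)g(x) from 0 to 1 (midpoint rule)."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(steps))

# <x, x> is the integral of x^2 from 0 to 1, which equals 1/3
print(dot_fn(lambda x: x, lambda x: x))

# <1, x> is the integral of x from 0 to 1, which equals 1/2
print(dot_fn(lambda x: 1.0, lambda x: x))
```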

The dot product of n-vectors satisfies the following properties:

1. <A,B>=<B,A>;
2. <(A+B),C>=<A,C>+<B,C>;
3. <(kA),B>=<A,(kB)>=k<A,B>;
4. <A,A> is greater than or equal to 0, and <A,A>=0 if and only if A=0.

The dot product of functions from C[0,1] satisfies the same properties.

Any vector space V with a dot product which satisfies properties 1-4 is called a Euclidean vector space.

Using the dot product one can define most of the geometric concepts, so one can transfer the elementary geometry to arbitrary Euclidean vector spaces.

In particular, one can define the length (norm) of a vector in a Euclidean vector space as

||A|| = sqrt(<A,A>)

The following theorem shows that this norm satisfies the usual properties of length:

Theorem. Let V be a Euclidean vector space. Then the norm has the following properties:

1. ||A|| is greater than or equal to 0, and ||A||=0 if and only if A=0.
2. ||kA||=|k| ||A||.
3. |<A,B>| is less than or equal to ||A|| ||B|| (the Cauchy-Schwarz inequality).
4. ||A+B|| is less than or equal to ||A||+||B|| (the triangle inequality).

All these properties have clear geometric meanings in planar geometry:

The first property means that length is always non-negative and the zero vector is the only vector of length 0.

The second property means that if we multiply a vector by a number k, the vector gets longer by a factor of |k| (if k is negative, the vector changes its direction).

The third property (the Cauchy-Schwarz inequality) means that for nonzero vectors A and B the quotient <A,B>/(||A|| ||B||) is always between -1 and 1, which is true for vectors on the plane since this quotient is precisely the cosine of the angle between these two vectors.

The fourth property means that the length of every side of a triangle does not exceed the sum of the lengths of the other two sides.
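The third and fourth properties can be checked on a concrete pair of 3-vectors. The sketch below is a numerical illustration for one example, not a proof; the names dot and norm are chosen here.

```python
import math

def dot(A, B):
    """Dot product of two n-vectors given as tuples."""
    return sum(a * b for a, b in zip(A, B))

def norm(A):
    """Length of an n-vector: sqrt(<A,A>)."""
    return math.sqrt(dot(A, A))

A, B = (1.0, -2.0, 2.0), (3.0, 0.0, 4.0)
S = tuple(a + b for a, b in zip(A, B))  # the sum A+B

print(norm(A), norm(B))                     # 3.0 5.0
print(abs(dot(A, B)) <= norm(A) * norm(B))  # Cauchy-Schwarz: True
print(norm(S) <= norm(A) + norm(B))         # triangle inequality: True
```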

Using the norm, one can define the distance between two vectors:

d(A,B)=||A-B||

This distance satisfies the usual properties of distances:

Theorem. Let V be a Euclidean vector space. Then the distance function has the following properties:

1. d(A,B) is greater than or equal to 0, and d(A,B)=0 if and only if A=B.
2. d(A,B)=d(B,A).
3. d(A,B) is less than or equal to d(A,C)+d(C,B) (the triangle inequality).

We can define many other geometric concepts using the dot product. For example, we can call two vectors A and B orthogonal if <A,B>=0 (their dot product is 0). Orthogonal vectors in arbitrary Euclidean vector spaces have properties similar to orthogonal vectors on a plane. For example, if A and B are orthogonal then pA and qB are also orthogonal for all scalars p and q (prove it!).

The following theorem is an analogue of the Pythagorean theorem.

Theorem. Let A₁,...,A_n be pairwise orthogonal vectors in a Euclidean vector space. Then

||A₁+A₂+...+A_n||²=||A₁||²+||A₂||²+...+||A_n||².

The proof is left as an exercise.
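A numerical check is no substitute for the proof, but it can help to see the statement at work. The sketch below takes three pairwise orthogonal 3-vectors (chosen here only as an example) and compares both sides of the equality, using the fact that ||A||² = <A,A>.

```python
def dot(A, B):
    """Dot product of two n-vectors given as tuples."""
    return sum(a * b for a, b in zip(A, B))

vectors = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]

# Confirm that the chosen vectors are pairwise orthogonal.
pairwise_orthogonal = all(
    dot(vectors[i], vectors[j]) == 0
    for i in range(len(vectors))
    for j in range(i + 1, len(vectors))
)

total = tuple(sum(c) for c in zip(*vectors))  # A1 + A2 + A3
lhs = dot(total, total)                       # ||A1+A2+A3||^2
rhs = sum(dot(A, A) for A in vectors)         # ||A1||^2+||A2||^2+||A3||^2

print(pairwise_orthogonal)  # True
print(lhs, rhs)             # 8 8
```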

Using dot products in Rⁿ we can rewrite any system of linear equations

a₁₁ x₁+...+a_1n x_n = b₁
a₂₁ x₁+...+a_2n x_n = b₂
.....................
a_m1 x₁+...+a_mn x_n = b_m

in the following form:

<A₁,v> = b₁
<A₂,v> = b₂
..........
<A_m,v> = b_m

where

A_i=(a_i1,...,a_in),

v is the vector of unknowns (x₁,...,x_n) and <A_i,v> denotes the dot product of A_i and v. This gives another interpretation of systems of linear equations.

In particular, if b₁=b₂=...=b_m=0 then to solve this system of linear equations means to find a vector v which is orthogonal to all of the given vectors A₁,...,A_m. Thus homogeneous systems of linear equations arise naturally in the geometry of Euclidean vector spaces.
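As a small illustration, consider the homogeneous system x₁+x₂+x₃=0, x₁-x₂=0 (an example chosen here, not taken from the text above). Its rows are A₁=(1,1,1) and A₂=(1,-1,0), and v=(1,1,-2) is one solution. The sketch below checks that v is orthogonal to both rows.

```python
def dot(A, B):
    """Dot product of two n-vectors given as tuples."""
    return sum(a * b for a, b in zip(A, B))

rows = [(1, 1, 1), (1, -1, 0)]  # coefficient vectors A1, A2
v = (1, 1, -2)                  # a candidate solution

products = [dot(A, v) for A in rows]
print(products)  # [0, 0], so v is orthogonal to A1 and A2
```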