The Linear Logo

Dr. Mark V. Sapir

How to prove that two sets are equal

The only direct way to prove that a set A is equal to a set B is to take an element from A and show that it belongs to B, then take an element from B and prove that it belongs to A. Thus the statement that A is equal to B is in fact a conjunction of two statements "every element from A belongs to B" and "every element from B belongs to A".

For example, if you want to prove that the set of all numbers which have real square roots coincides with the set of all non-negative real numbers, you need to show that:

  1. Every real number which has a real square root is non-negative.
  2. Every non-negative real number has a real square root.

Another example: if you need to prove that the kernel of a linear operator in R2 which projects every vector to a line L is the set of vectors which are perpendicular to L, you need to prove two statements:

  1. Every vector in the kernel of the projection is perpendicular to L and
  2. Every vector which is perpendicular to L is in the kernel of the projection.

How to disprove a statement of the form "For every...".

The only direct way to disprove such a statement is to present a concrete example for which this statement is wrong.

For example, if you want to disprove that statement "Every number is even", you need to present a number which is not even, say, 1.

Another example: If you need to prove that a set R2 with addition:

(a,b)+(c,d)=(a,d)

and scalar multiplication

k(a,b)=(ka,kd)

is not a vector space, and you want to prove that the commutativity property of addition does not hold, you need to present two concrete vectors u and v such that u+v is not equal to v+u, say, u=(1,2) and v=(3,4).

Types of Proofs

There are many types of proofs or, more precisely, many methods of proving mathematical statements. Here I am going to talk about some of these methods.

There are several books devoted entirely to methods of proofs. The most famous books on this subject are two books by G. Polya, "How to solve it" and "Mathematics and plausible reasoning". I strongly recommend these books to everybody.

Proofs by contradiction


Suppose that we want to prove a statement A which (as any statement) has some assumptions and a conclusion. The proof by contradiction consists in assuming, in addition to all the assumptions of the statement A, that the conclusion of A is false; then using this additional assumption we deduce an obviously false statement (2=3 or something like this). This would mean that the additional assumption was wrong and the conclusion of A is in fact true.

Example 1.

Prove that if a square matrix A is a zero divisor (that is AB=0 for some non-zero matrix B) then det(A)=0.

Proof. By contradiction, assume that the assumption that A is a zero divizor is true and in addition assume that the conclusion is false, that is det(A) is not 0. Since A is a zero divisor, there exists a non-zero matrix B such that AB=0. Since det(A) is not zero, by the third theorem about determinants A is an invertible matrix. Multiply the equality AB=0 by A-1 on the left: A-1*(A*B)=0. Since A-1A=I and I*B=B, we get B=0, a contradiction (B is not zero by our assumption). Thus we assumed that our statement is false and deduced an obvious contradiction. This means that our statement is in fact true.

The main feature of a proof by contradiction is that we are allowed to operate with all the assumptions of the original statement A plus the new assumption that the conclusion of A is false.

Example 2.

Prove that if A is a square matrix and det(A)=0 then A is a zero divisor.

Proof. Indeed, assume that det(A)=0 (the assumption of the original statement) and A is not a zero divisor (that is the conclusion of our statement is wrong). By the third theorem about determinants A is not invertible matrix (since det(A)=0). Therefore by the second theorem about inverses the reduced row echelon form C of our matrix A is not the identity matrix. Therefore C contains a zero row. Since C is in the reduced row echelon form, the n-th row of C is zero where n is the size of C.

We want to show that there exists a non-zero matrix B such that CB=0. Let B be a "generic" n by n matrix with unknown entries B(i,j), the number of entries in B is, of course, n2, so we have n2 unknowns. Represent C as a column of rows r1,...,rn and represent B as a row of columns c1,...,cn. Then the equation CB=0 is equivalent to a system of n2 linear equations ricj=0. If i=n then such an equation has the form 0=0 since rn=0, so we can delete these equations from our system. Thus we have a system of (n2-n) homogeneous equations with n2 unknowns. Since n2-n < n2, we can apply the theorem about homogeneous systems of equations and conclude that this system has a non-zero solution. Thus indeed, there exists a non-zero matrix B such that CB=0.

We know that C is obtained from A by a series of row operations. Then A also can be obtained from C also by a series of row operations (see the lemma about inverses of elementary matrices and the paragraph before that lemma). By the theorem about elementary matrices A is equal to E1E2...EkC for some elementary matrices E1,...,Ek. Therefore

AB=(E1E2...EkC)B=(E1...Ek)CB=0

This contradicts our assumption that A is not a zero divizor (indeed, B is not 0 and AB=0). This contradiction completes the proof.

Proofs by case-analyzis


This method consists of dividing a statement that we want to prove into several sub-statements and proving them separately. We have used this method in the proof of the theorem about elementary matrices.

The following is an easier example.

Example. Prove that |-x|=|x| for every real number x. Here |.| denotes the absolute value of a number.

Proof. The definition of absolute value suggests that we need to consider two cases: in the first case x is a non-negative number and in the second case x is a negative number.

Case 1. x is non-negative. Then by the definition of absolute value |-x|=x and |x|=x, so |-x|=|x| as required.

Case 2. x is negative. Then again by the definition of absolute value |x|=-x and |-x|=-x, so again |-x|=|x|.

The statement is proved.

To be continued


Index Concepts Cover Page