Here is another article that I hope to develop into an entry on the Tricks Wiki. It concerns the use of linear algebra to solve extremal problems in combinatorics. The method is quite easy to illustrate with some well-known examples, but what I find interesting is the question of how to recognise the kind of problem where the method is likely to apply. I have something to say about that, but I’d like to make clear that I didn’t think of it for myself. If I remember rightly, I read it in something that Noga Alon wrote. I’ll draw attention to it when I get there.
Incidentally, the use of linear algebra in combinatorics has become a well-developed art, and the examples I present are just the tip of a large iceberg. Also, there are many combinatorialists who are much better practitioners of this art than I will ever be. So I would fully expect the article below to be expanded by other contributors. (Alternatively, if you see obvious remarks that I have missed, then you can make comments here and I will incorporate them into the version that appears on the Tricki.)
Title: Solving extremal problems using linear algebra.
Quick description: If you want to find the maximum of a combinatorial quantity, and if the maximum seems to be very far from uniquely attained, then try to relate the quantity to the dimension of a certain vector space.
Prerequisites: Basic linear algebra over finite fields. The notions of set, subset, element.
General discussion/Example 1: Let $n$ be an even integer. How many subsets of the set $\{1,2,\dots,n\}$ can you pick if they all have to have odd size but the intersection of any two of them has to have even size?
Let us try to attack this problem directly, and to begin with let us just try to find the biggest example we can in the greediest way we can. And to save space let us use notation such as 134 for the set with elements 1, 3 and 4 rather than proper notation such as $\{1,3,4\}$. The obvious set to begin with is 1. Since the next set has to have even-sized intersection with 1 and odd size, the obvious choice is 2. Continuing, the greedy algorithm gives us the $n$ sets 1, 2, …, n, and we then find that we cannot choose any more sets (since any other set of odd size will have an odd intersection with any singleton formed from one of its elements).
Can we do better than this? Let’s experiment by starting with a different set and then continuing greedily. If we start with 123, then the next set has to have odd size and even-sized intersection with 123, so the two most obvious choices are 124 and 4. If we go for 124, then the obvious continuation is 125, 126, …, 12n. Can we choose any more sets? Yes we can: the set 134567…n, which a moment’s thought will reveal to have odd size and an even intersection with all the other sets. And then we notice that we can also choose the set 234567…n, at which point it is not hard to prove that we cannot choose any more. (Proof: any set that contains 12 cannot contain any of 3,4,5,…,n, so must be 12, which has odd intersection with 134567…n. Any set that contains 13 must contain all of 4,5,6,…,n because if it doesn’t contain m then it has odd intersection with 12m. And any set that contains 23 must contain all of 4,5,6,…,n for a similar reason.) Once again, we have ended up with $n$ sets.
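For small $n$ these claims are easy to check by machine. Here is a minimal sketch (my own illustration, not part of the original article) that verifies, for $n=6$, that both families described above have the odd-size/even-intersection property and contain exactly $n$ sets:

```python
from itertools import combinations

def valid_family(sets):
    """All sets odd-sized, all pairwise intersections even-sized."""
    return (all(len(s) % 2 == 1 for s in sets) and
            all(len(a & b) % 2 == 0 for a, b in combinations(sets, 2)))

n = 6  # any even n works the same way
singletons = [frozenset({i}) for i in range(1, n + 1)]
# The family 123, 124, ..., 12n, 134...n, 234...n from the discussion above.
second = ([frozenset({1, 2, m}) for m in range(3, n + 1)] +
          [frozenset(range(1, n + 1)) - {2},
           frozenset(range(1, n + 1)) - {1}])

assert valid_family(singletons) and len(singletons) == n
assert valid_family(second) and len(set(second)) == n
```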
If you continue experimenting in this way, you will find the same thing every time: if you have chosen fewer than $n$ sets then you can choose more, but once you get to $n$ sets then you get stuck. But there seems to be a great deal of freedom about how you choose the sets. This contrasts with many problems in extremal combinatorics, where the best possible examples are often unique, or unique up to some obvious symmetry. (A good example of the latter: how many subsets of $\{1,2,\dots,n\}$ can you choose if none of your subsets is contained in any other? The unique best possible collection of sets is all sets of size $n/2$.)
Are there any other circumstances where you have a collection of objects, and a condition on pairs of those objects, such that (i) whenever you have chosen objects $x_1,\dots,x_m$ with $m<n$ there are many different ways of choosing $x_{m+1}$ such that the condition holds for each pair $(x_i,x_{m+1})$, and (ii) it is impossible to choose more than $n$ of them if any two have to satisfy the condition? Yes there are: a common one is when the objects are vectors in $\mathbb{R}^n$ and the condition on a pair $(v,w)$ is that $v$ and $w$ should be orthogonal. Since orthogonality implies linear independence, we cannot choose more than $n$ non-zero vectors that are all orthogonal to each other.
Can we relate these two situations? To do so we would need to associate vectors with our subsets of $\{1,2,\dots,n\}$ and interpret the even-intersections condition as a kind of orthogonality. The obvious way to associate a vector with a subset $A$ of $\{1,2,\dots,n\}$ is to take its characteristic function, that is, the function $f_A$ that takes the value 1 for every $x$ that belongs to $A$ and 0 for every $x$ that does not belong to $A$. Now the size of the intersection of two sets is equal to the standard scalar product of their characteristic functions (a fact that has many uses). Therefore, if two sets have even intersection, then their scalar product is an even integer.
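That scalar-product fact is worth making concrete. A two-line computation (a hypothetical example of my own, not from the original post) shows it in action:

```python
n = 8
A, B = {1, 3, 4, 7}, {2, 3, 7, 8}

# Characteristic vectors: coordinate i-1 is 1 exactly when i belongs to the set.
chi_A = [1 if i in A else 0 for i in range(1, n + 1)]
chi_B = [1 if i in B else 0 for i in range(1, n + 1)]

# The scalar product counts exactly the common elements, here {3, 7}.
dot = sum(x * y for x, y in zip(chi_A, chi_B))
assert dot == len(A & B) == 2
```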
But we wanted orthogonality. That is, we wanted a scalar product of zero. Can we think of an even integer as being “zero” somehow? Yes we can, if we look at it mod 2. So we are led to interpret the characteristic functions of the sets as elements of the set $\{0,1\}^n$, and to think of that set as an $n$-dimensional vector space over the field $\mathbb{F}_2$.
Of course, we must check that orthogonality still implies independence in this context. The usual proof is this: if $\sum_i\lambda_iv_i=0$, then the scalar product of the left-hand side with any $v_j$ must be 0; but the orthogonality of the $v_i$ implies that this scalar product is $\lambda_j\langle v_j,v_j\rangle$; since $v_j$ is a non-zero vector, its scalar product with itself is non-zero, so $\lambda_j$ must be zero; since $j$ was arbitrary we are done. In the mod-2 context the only part of that proof that fails is the assertion that if $v_j$ is a non-zero vector, then $\langle v_j,v_j\rangle$ is non-zero: it is equal to the number of 1s in $v_j$, reduced mod 2. But if $v_j$ is the characteristic function of a set of odd size, then this will be 1, so the proof of linear independence works in this case.
A quick summary of the above proof: you can choose at most $n$ sets of odd size with even intersections because the characteristic functions of such sets must be linearly independent in the vector space $\mathbb{F}_2^n$.
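The whole argument can be checked mechanically for any particular family: reduce the characteristic vectors over $\mathbb{F}_2$ and confirm that they have full rank. The sketch below (my own illustration; the bitmask encoding is just a convenience) performs Gaussian elimination mod 2 on the second family from the experiment above:

```python
def gf2_rank(rows):
    """Rank over F_2 of vectors encoded as int bitmasks (Gaussian elimination)."""
    basis = {}  # leading-bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]  # clear the leading bit and keep reducing
    return len(basis)

n = 6
family = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 2, 6},
          {1, 3, 4, 5, 6}, {2, 3, 4, 5, 6}]
masks = [sum(1 << (i - 1) for i in s) for s in family]

# Odd sizes and even intersections force linear independence over F_2.
assert gf2_rank(masks) == len(family) == n
```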
Further general discussion: How can we recognise extremal problems where linear algebra is likely to be useful? One answer was touched on above, and this is the insight that I think I got from Noga Alon. It is that such problems are often characterized by the presence of a wide variety of best possible examples. The reason this suggests linear algebra as a possible tool is that the problem, “How many linearly independent vectors can you find in $\mathbb{F}_2^n$?” has a very similar quality: you can’t find more than $n$, but there are many ways of finding $n$. Of course, if your problem has that quality, there is no guarantee that linear algebra can be used, and even when it can, one often needs much more ingenuity to recast it in terms of vectors in a vector space than was needed above.
Example 2: An obvious follow-up problem to the first is this. How many sets can you pick if their sizes have to be even and their intersections also even?

Note that the approach above now fails, since if $A$ has even size and if $v$ is its characteristic function, then $\langle v,v\rangle=0$. Furthermore, a moment’s thought reveals that much bigger collections of sets are possible. For instance, take the collection of all possible unions of the sets 12, 34, 56, …, (n-1)n. There are $2^{n/2}$ such unions, they all have even size, and the intersection of any two of them is even. That collection of sets turns out to be merely the simplest and most natural of a wide variety of examples of collections of size $2^{n/2}$ with the desired property. Combining that observation with the fact that we can express the even-intersections property as a condition about scalar products mod 2, it seems likely that linear algebra will help us.
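As a sanity check (again a small illustration of my own), one can generate all $2^{n/2}$ unions for $n=6$ and confirm the property directly:

```python
from itertools import combinations

n = 6
blocks = [frozenset({2 * i + 1, 2 * i + 2}) for i in range(n // 2)]  # 12, 34, 56

# All unions of subcollections of the blocks (the empty union is the empty set).
unions = {frozenset().union(*choice)
          for r in range(len(blocks) + 1)
          for choice in combinations(blocks, r)}

assert len(unions) == 2 ** (n // 2)
assert all(len(u) % 2 == 0 for u in unions)
assert all(len(a & b) % 2 == 0 for a, b in combinations(unions, 2))
```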
There are two approaches we could try: one would be to find some $2^{n/2}$-dimensional space $W$ and find some way of converting any system of $m$ sets that satisfy the property we want into a collection of $m$ linearly independent vectors in $W$. The other is to stick with our previous way of turning sets into vectors and see where it gets us. Since it is not at all obvious what we would choose as $W$ in the first approach, let us try the second.
Thus, we are trying to choose as many vectors in $\mathbb{F}_2^n$ as we can, with the property that every $v$ that we choose has an even number of 1s, and $\langle v,w\rangle=0$ for any $v$ and $w$ that we choose. Since the parity of the number of 1s in $v$ is $\langle v,v\rangle$, we can say this quite neatly: we want a collection $V$ of vectors with the property that $\langle v,w\rangle=0$ for any two vectors $v,w\in V$ (not necessarily distinct).
Imagine that we are choosing a collection of such vectors in a greedy way: we just keep going until we cannot go any further. If we have chosen vectors $v_1,\dots,v_m$, then any vector $w$ we add to the collection will have to have the property that $\langle v_i,w\rangle=0$ for $i=1,2,\dots,m$. Now the condition $\langle v_i,w\rangle=0$ is a linear condition on $w$, so we are placing $m$ linear conditions on all subsequent vectors. It might seem as though they must therefore all live in an $(n-m)$-dimensional subspace of $\mathbb{F}_2^n$, but of course that would only be the case if the linear conditions were independent, and there is no reason for them to be. But this does give us a huge clue about how to proceed with the proof: if we are trying to keep going for as long as possible, then we should try to make the conditions we impose as dependent as possible, which comes to the same thing as saying that we should try to cram the vectors $v_i$ into as small a subspace of $\mathbb{F}_2^n$ as we can.
From here the proof more or less writes itself. If we have chosen more than $2^{n/2}$ vectors, then at least $n/2+1$ of them must be linearly independent (since an $n/2$-dimensional subspace of $\mathbb{F}_2^n$ has size $2^{n/2}$). But every vector in the collection is orthogonal to those $n/2+1$ independent vectors, so all vectors in the collection are confined to a subspace of dimension at most $n/2-1$, which is an obvious contradiction. So the simple example we started with is indeed best possible.
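For $n=4$ the claim is small enough to verify by exhaustive search. The brute-force sketch below is my own (sets are encoded as bitmasks, and the empty set is allowed, since it is one of the unions in the example above):

```python
from itertools import combinations

n = 4
# Candidates: subsets of {1,...,n} of even size, encoded as bitmasks.
candidates = [v for v in range(2 ** n) if bin(v).count("1") % 2 == 0]

def pairwise_even(fam):
    """Every intersection (including each set with itself) has even size."""
    return all(bin(v & w).count("1") % 2 == 0 for v in fam for w in fam)

best = max(len(fam)
           for r in range(len(candidates) + 1)
           for fam in combinations(candidates, r)
           if pairwise_even(fam))
assert best == 2 ** (n // 2)  # the bound proved above
```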
Example 3: To give a glimpse of how much more sophisticated the linear-algebra method can be, here is a beautiful proof, due to Peter Frankl and János Pach, of the famous Sauer-Shelah lemma. This is a solution to the following problem. Let $\mathcal{A}$ be a collection of subsets of $\{1,2,\dots,n\}$. Given any subset $B$ of $\{1,2,\dots,n\}$, define the projection $\mathcal{A}|_B$ of $\mathcal{A}$ onto $B$ to be the collection of all sets $A\cap B$ with $A\in\mathcal{A}$. How large can $\mathcal{A}$ be if there is no set $B$ of size $k$ for which $\mathcal{A}|_B$ is the set of all subsets of $B$? (Many authors say that $B$ is shattered by $\mathcal{A}$ if $\mathcal{A}|_B$ is the set of all subsets of $B$.)
A moment’s thought leads to an obvious maximal example (by which I mean an example that cannot be extended rather than an example of maximal size, though it will eventually turn out to be of maximal size too). It is to take $\mathcal{A}$ to be the set of all subsets of $\{1,2,\dots,n\}$ of size less than $k$.
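The lemma's content can be confirmed by brute force for tiny parameters. The following sketch (mine, not Frankl and Pach's) takes $n=4$ and $k=2$, where the example above has $\binom{4}{0}+\binom{4}{1}=5$ sets, and checks both that it shatters no 2-element set and that every larger family does:

```python
from itertools import combinations

n, k = 4, 2
universe = range(1, n + 1)
all_subsets = [frozenset(c) for r in range(n + 1)
               for c in combinations(universe, r)]

def shatters_a_k_set(family):
    """Does some k-element B have all 2**k of its subsets as projections A & B?"""
    for b in combinations(universe, k):
        b = frozenset(b)
        if len({a & b for a in family}) == 2 ** k:
            return True
    return False

small = [s for s in all_subsets if len(s) < k]  # the maximal example
bound = len(small)                              # 1 + 4 = 5
assert not shatters_a_k_set(small)

# Every family of bound + 1 subsets shatters some 2-element set.
assert all(shatters_a_k_set(f) for f in combinations(all_subsets, bound + 1))
```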
Imagine now that we have experimented a bit and been unable to find any larger examples. Given the naturalness of that example, we begin to suspect that it may be of maximal size. If we have experimented sufficiently diligently, we will also see that if our hunch is correct, then there are many examples of maximal size. So once again we wonder about using linear algebra. The most obvious approach would be to associate with each set in $\mathcal{A}$ a vector in some space of dimension $\sum_{i<k}\binom{n}{i}$ in such a way that the condition we are requiring $\mathcal{A}$ to satisfy implies that all those vectors are linearly independent.
But how do we associate coordinates with a subset of $\{1,2,\dots,n\}$? The least unnatural way would seem to be to relate these coordinates in some way to the sets of size at most $k-1$. With this thought in mind, the following idea is quite a natural one: let $\mathcal{B}$ be the set of all subsets of $\{1,2,\dots,n\}$ of size at most $k-1$, and associate with any subset $A$ of $\{1,2,\dots,n\}$ the function $f_A:\mathcal{B}\to\mathbb{R}$ that takes a set $B\in\mathcal{B}$ to 1 if $B\subset A$ and 0 otherwise.
You may have wondered why I described this as a function to $\mathbb{R}$ rather than to $\mathbb{F}_2$. The reason was that it turns out to be best to regard the functions $f_A$ as belonging to a vector space over $\mathbb{R}$ rather than over $\mathbb{F}_2$. (This is often true, so this example is a useful illustration.) Indeed, it turns out that if $\mathcal{A}$ has no full projections onto sets of size $k$, then the functions $f_A$ with $A\in\mathcal{A}$ are linearly independent.
Let us see how to prove this. To do this, we take a linear dependence $\sum_{A\in\mathcal{A}}\lambda_Af_A=0$ and try to prove that every $\lambda_A$ is zero. The information we have is that for no set $B$ of size $k$ does $\mathcal{A}|_B$ contain all subsets of $B$. An obvious way to try to use this information is to assume that not all the $\lambda_A$ are zero, to define some set $B$ of size at least $k$ in a natural way, and to prove that every subset of $B$ is equal to $A\cap B$ for some $A\in\mathcal{A}$. (Since a subset of a shattered set is itself shattered, a shattered set of size at least $k$ yields a shattered set of size exactly $k$, which is the contradiction we want.)
Let us consider what the statement $\sum_{A\in\mathcal{A}}\lambda_Af_A=0$ actually says. If $B$ is a set of size at most $k-1$, then $\sum_A\lambda_Af_A(B)$ can be rewritten as $\sum_{A\supset B}\lambda_A$. So we know that this quantity is 0 for every set $B$ of size at most $k-1$. So we have here a function that vanishes for every set of size at most $k-1$, and we are looking for a set of size at least $k$. This suggests that we should look for a set $B$ such that $\sum_{A\supset B}\lambda_A$ does not vanish, since such a set is guaranteed to have size at least $k$. We could then hope that every subset of this set $B$ was of the form $A\cap B$ for some $A\in\mathcal{A}$.
Must there exist a set $B$ such that the expression in question does not vanish? Clearly yes: we can just take $B$ to be a maximal $A\in\mathcal{A}$ such that $\lambda_A\ne0$. However, it is too much to hope that every set $B$ with this property will work (and it is easy to find examples of sets $B$ that do not work). If we want to make it as likely as possible that every subset of $B$ is of the form $A\cap B$, then obviously we would like $B$ to be as small as possible. So let us choose a minimal $B$ such that $\sum_{A\supset B}\lambda_A\ne0$.
Let $C\subset B$. We would like to prove that there exists $A\in\mathcal{A}$ such that $A\cap B=C$. And a natural way of doing that, given what we have so far, would be to prove that $\sum_{A\cap B=C}\lambda_A\ne0$. And a natural way of doing that, if we want to exploit the minimality of $B$, is to try to express this sum as a combination of sums of the form $\sum_{A\supset D}\lambda_A$ with $C\subset D\subset B$.
This can indeed be done. One can check (and this is basically the inclusion-exclusion formula) that
$$\sum_{A\cap B=C}\lambda_A=\sum_{C\subset D\subset B}(-1)^{|D\setminus C|}\sum_{A\supset D}\lambda_A.$$
By the minimality of $B$, every term on the right-hand side with $D\ne B$ vanishes, so the sum is equal to $(-1)^{|B\setminus C|}\sum_{A\supset B}\lambda_A$, which is non-zero, and we are done.
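The inclusion-exclusion identity in that last step is easy to test numerically for arbitrary coefficients. Here is a quick check of my own, with randomly chosen integer values of $\lambda_A$ and one particular choice of $B$ and $C$:

```python
import random
from itertools import combinations

random.seed(0)
n = 5
universe = range(1, n + 1)
subsets = [frozenset(c) for r in range(n + 1)
           for c in combinations(universe, r)]
lam = {a: random.randint(-5, 5) for a in subsets}  # arbitrary coefficients

B, C = frozenset({1, 2, 4}), frozenset({1, 4})

# Left-hand side: sum of lam[A] over the sets with A intersect B equal to C.
lhs = sum(lam[a] for a in subsets if a & B == C)

# Right-hand side: sum over C <= D <= B of (-1)^|D \ C| * sum_{A >= D} lam[A].
rhs = 0
extras = sorted(B - C)
for r in range(len(extras) + 1):
    for e in combinations(extras, r):
        D = C | frozenset(e)
        rhs += (-1) ** r * sum(lam[a] for a in subsets if D <= a)

assert lhs == rhs
```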
Recommended further reading. This singleton list needs to be extended, but as a start, one could look at Part III of “Extremal Combinatorics: with Applications in Computer Science” by Stasys Jukna. It can be looked at online here (just click on “Preview this book”).
End of sample article.
The expectation is that, like Wikipedia articles, Tricki articles will be edited many times by people other than the original author. This would be an obvious candidate for further work: it would be good to have more examples and a more coherent account of how to use the method itself. (For instance, is there a more systematic way of converting problems into linear algebra, is there a way of telling which scalars to use, etc.?)
August 3, 2008 at 1:29 am |
I also recently discussed (in less detail) Example 3 here, after reading about it in van den Dries’s book on o-minimal structures. He also gives a reference for it in a probability theory paper of Vapnik and Chervonenkis (and does not mention Sauer’s name).

The proof in your post seems more natural than the one in the book of van den Dries (which is essentially the same as that of Vapnik-Chervonenkis), by induction on k.
August 13, 2008 at 9:08 pm |
Dear Tim,
Here is an exercise that is related to Example 2 which you may find interesting. Fix k and let n >> k. How many subsets of {1,..,n} can one find such that each has even size and every intersection of k of them also has even size?
August 20, 2008 at 8:01 am |
[…] “dimension arguments in combinatorics”. For more examples and a general discussion see this post in Gowers’s […]
September 5, 2008 at 7:55 pm |
There is a monograph (I am not sure if it is available as a book) by Babai and Frankl titled `Linear Algebra Methods in Combinatorics’. It covers a wide variety of examples of this type. For instance, examples involving binomial coefficients can be handled using the vector space of $k$-alternating forms, and so on.
September 28, 2008 at 3:11 am |
[…] described as an “eigentheorem” because it is important in many different places. It was mentioned(with an algebraic proof by Frankl and Pach) in Gowers blog and also, in another context, in […]
November 14, 2008 at 11:57 pm |
I have another interesting question:

Let S be a finite set and let T be a proper nonempty subset of S. Suppose R is selected uniformly at random from among all subsets of S of odd size. What is the probability that the size of $T \cap R$ is odd?

Any ideas?
May 21, 2009 at 7:09 am |
[…] applications. It has several proofs, all based on linear algebra methods (also referred to as dimension arguments). The original proof is based on a careful study of incidence matrices for families of sets. […]
July 25, 2012 at 4:24 am |
Hi Tim,
Can I answer your first question this way? (I would like to know if there is any mistake in the approach I take below.)
So, you represent a set $A_i$ by its characteristic vector $v_i$: these are bitstrings of length $n$, each containing an odd number of 1’s. I assume that you can get $m$ such $A_i$’s which have an odd number of elements and the intersection of $A_i$ and $A_j$ contains an even number of elements. Like you have done above, let us view these as vectors in a vector space over $\mathbb{F}_2$. Now, if it is possible for me to stuff in more than $n$ sets, the vectors must be linearly dependent, so let us look at some $v_i$ as a linear combination of a minimal number of the other vectors (which has fewer than $n+1$ elements). Then, if $v_i = v_{j_1} \oplus \dots \oplus v_{j_k}$, then $\langle v_i, v_{j_l} \rangle$ is 1 for all $l$ (by minimality). (Here $\oplus$ is XOR.) But since every intersection is even, the value should be 0, which is not the case. Contradiction => m is at most n. And m = n is possible. So, the result follows.
January 11, 2015 at 11:47 pm |
[…] This result shows the power of using linear algebra in combinatorics. It has been used extensively to solve several interesting combinatorial and geometrical problems, many of which could not be proved in any other way. Here is a nice mathoverflow question that discusses the use of linear algebra in combinatorics: http://mathoverflow.net/questions/17006/linear-algebra-proofs-in-combinatorics, and here is a blog post by Gowers discussing this kind of dimension argument in combinatorics: https://gowers.wordpress.com/2008/07/31/dimension-arguments-in-combinatorics/ […]
March 6, 2022 at 7:03 am |
[…] example would be Tim Gowers’ beautiful proof of the following […]