Dimension arguments in combinatorics

Here is another article that I hope to develop into an entry on the Tricks Wiki. It concerns the use of linear algebra to solve extremal problems in combinatorics. The method is quite easy to illustrate with some well-known examples, but what I find interesting is the question of how to recognise the kind of problem where the method is likely to apply. I have something to say about that, but I’d like to make clear that I didn’t think of it for myself. If I remember rightly, I read it in something that Noga Alon wrote. I’ll draw attention to it when I get there.

Incidentally, the use of linear algebra in combinatorics has become a well-developed art, and the examples I present are just the tip of a large iceberg. Also, there are many combinatorialists who are much better practitioners of this art than I will ever be. So I would fully expect the article below to be expanded by other contributors. (Alternatively, if you see obvious remarks that I have missed, then you can make comments here and I will incorporate them into the version that appears on the Tricki.)

Title: Solving extremal problems using linear algebra.

Quick description: If you want to find the maximum of a combinatorial quantity, and if the maximum seems to be very far from uniquely attained, then try to relate the quantity to the dimension of a certain vector space.

Prerequisites: Basic linear algebra over finite fields. The notions of set, subset, element.

General discussion/Example 1: Let n be an even integer. How many subsets of the set \{1,2,\dots,n\} can you pick if they all have to have odd size but the intersection of any two of them has to have even size?

Let us try to attack the first of these problems directly, and to begin with let us just try to find the biggest example we can in the greediest way we can. And to save space let us use notation such as 134 for the set with elements 1, 3 and 4 rather than proper notation such as \{1,3,4\}. The obvious set to begin with is 1. Since the next set has to have even-sized intersection with 1 and odd size, the obvious choice is 2. Continuing, the greedy algorithm gives us the n sets 1, 2, …, n, and we then find that we cannot choose any more sets (since any other set of odd size will have an odd intersection with any singleton formed from one of its elements).

Can we do better than this? Let’s experiment by starting with a different set and then continuing greedily. If we start with 123, then the next set has to have odd size and even-sized intersection with 123, so the two most obvious choices are 124 and 4. If we go for 124, then the obvious continuation is 125, 126, …, 12n. Can we choose any more sets? Yes we can: the set 134567…n, which a moment’s thought will reveal to have odd size and an even intersection with all the other sets. And then we notice that we can also choose the set 234567…n, at which point it is not hard to prove that we cannot choose any more. (Proof: any set that contains 12 cannot contain any of 3,4,5,…,n, so must be 12, which has odd intersection with 134567…n. Any set that contains 13 must contain all of 4,5,6,…,n because if it doesn’t contain m then it has odd intersection with 12m. And any set that contains 23 must contain all of 4,5,6,…,n for a similar reason.) Once again, we have ended up with n sets.

If you continue experimenting in this way, you will find the same thing every time: if you have chosen fewer than n sets then you can choose more, but once you get to n sets then you get stuck. But there seems to be a great deal of freedom about how you choose the sets. This contrasts with many problems in extremal combinatorics, where the best possible examples are often unique, or unique up to some obvious symmetry. (A good example of the latter: how many subsets of \{1,2,\dots,n\} can you choose if none of your subsets is contained in any other? The unique best possible collection of sets is all sets of size n/2.)

Are there any other circumstances where you have a collection of objects, and a condition on pairs (x,y) of those objects, such that (i) whenever you have chosen objects x_1,x_2,\dots,x_m with m<n there are many different ways of choosing y such that the condition holds for each pair (x_i,y), and (ii) it is impossible to choose more than n of them if any two have to satisfy the condition? Yes there are: a common one is when the objects are vectors in \mathbb{R}^n and the condition on a pair (x,y) is that x and y should be orthogonal. Since orthogonality implies independence, we cannot choose more than n non-zero vectors that are all orthogonal to each other.

Can we relate these two situations? To do so we would need to associate vectors with our subsets of \{1,2,\dots,n\} and interpret the even-intersections condition as a kind of orthogonality. The obvious way to associate a vector with a subset A of \{1,2,\dots,n\} is to take its characteristic function, that is, the function f that takes the value 1 for every m that belongs to A and 0 for every m that does not belong to A. Now the size of the intersection of two sets is equal to the standard scalar product of their characteristic functions (a fact that has many uses). Therefore, if two sets have even intersection, then their scalar product is an even integer.

But we wanted orthogonality. That is, we wanted a scalar product of zero. Can we think of an even integer as being “zero” somehow? Yes we can, if we look at it mod 2. So we are led to interpret the characteristic functions of the sets as elements of the set \{0,1\}^n, and to think of that set as an n-dimensional vector space over the field \mathbb{F}_2.

Of course, we must check that orthogonality still implies independence in this context. The usual proof is this: if \sum_i c_iv_i=0, then the scalar product with any x_j must be 0; but the orthogonality of the v_i implies that this scalar product is c_j\langle v_j,v_j\rangle; since v_j is a non-zero vector, its scalar product with itself is non-zero, so c_j must be zero; since j was arbitrary we are done. In the mod-2 context the only part of that proof that fails is the assertion that if v_j is a non-zero vector, then \langle v_j,v_j\rangle is non-zero: it is equal to the number of 1s in v_j, reduced mod 2. But if v_j is the characteristic function of a set of odd size, then this will be 1, so the proof of linear independence works in this case.

A quick summary of the above proof: you can choose at most n sets of odd size with even intersections because the characteristic functions of such sets must be linearly independent in the vector space \mathbb{F}_2^n.

Further general discussion: How can we recognise extremal problems where linear algebra is likely to be useful? One answer was touched on above, and this is the insight that I think I got from Noga Alon. It is that such problems are often characterized by the presence of a wide variety of best possible examples. The reason this suggests linear algebra as a possible tool is that the problem, “How many linearly independent vectors can you find in \mathbb{F}^n?” has a very similar quality: you can’t find more than n, but there are many ways of finding n. Of course, if your problem has that quality, there is no guarantee that linear algebra can be used, and even when it can, one often needs much more ingenuity to recast it in terms of vectors in a vector space than was needed above.  

Example 2: An obvious follow-up problem to the first is this. How many sets can you pick if their sizes have to be even and their intersections also even?

Note that the approach above now fails, since if A has even size and if v is its characteristic function, then \langle v,v\rangle=0. Furthermore, a moment’s thought reveals that much bigger collections of sets are possible. For instance, take the collection of all possible unions of the sets 12, 34, 56, …, (n-1) n. There are 2^{n/2} such unions, they all have even size, and the intersection of any two of them is even. That collection of sets turns out to be merely the simplest and most natural of a wide variety of examples of collections of size 2^{n/2} with the desired property. Combining that observation with the fact that we can express the even-intersections property as a condition about scalar products mod 2, it seems likely that linear algebra will help us. 

There are two approaches we could try: one would be to find some 2^{n/2}-dimensional space V and find some way of converting any system of m sets that satisfy the property we want into a collection of m linearly independent vectors in V. The other is to stick with our previous way of turning sets into vectors and see where it gets us. Since it is not at all obvious what we would choose as V in the first approach, let us try the second.

Thus, we are trying to choose as many vectors in \mathbb{F}_2^n as we can, with the property that every v that we choose has an even number of 1s, and \langle v,w\rangle=0 for any v and w that we choose. Since the parity of the number of 1s in v is \langle v,v\rangle, we can say this quite neatly: we want a collection Z of vectors with the property that \langle v,w\rangle=0 for any two vectors v,w\in Z (not necessarily distinct).

Imagine that we are choosing a collection of such vectors in a greedy way: we just keep going until we cannot go any further. If we have chosen vectors v_1,v_2,\dots,v_m, then any vector v we add to the collection will have to have the property that \langle v_i,v\rangle=0 for i=1,2,\dots,m. Now the condition \langle v_i,v\rangle=0 is a linear condition of v, so we are placing m linear conditions on all subsequent vectors. It might seem as though they must therefore all live in an (n-m)-dimensional subspace of \mathbb{F}_2^n, but of course that would only be the case if the linear conditions were independent, and there is no reason for them to be. But this does give us a huge clue about how to proceed with the proof: if we are trying to keep going for as long as possible, then we should try to make the conditions we impose as dependent as possible, which comes to the same thing as saying that we should try to cram the vectors v_1,v_2,\dots,v_m into as small a subspace of \mathbb{F}_2^n as we can.

From here the proof more or less writes itself. If we have chosen more than 2^{n/2} vectors, then at least n/2+1 of them must be linearly independent (since an n/2-dimensional subspace of \mathbb{F}_2^n has size 2^{n/2}. From this it follows that all vectors in the collection are confined to a subspace of dimension at most n/2-1, which is an obvious contradiction. So the simple example we started with is indeed best possible.

Example 3: To give a glimpse of how much more sophisticated the linear-algebra method can be, here is a beautiful proof, due to Peter Frankl and Janos Pach, of the famous Sauer-Shelah lemma. This is a solution to the following problem. Let \mathcal{F} be a collection of subsets of \{1,2,\dots,n\}. Given any subset Y\subset\{1,2,\dots,n\}, define the projection \mathcal{F}(Y) of \mathcal{F} onto Y to be the collection of all sets Y\cap F with F\in\mathcal{F}. How large can \mathcal{F} be if there is no set Y of size k for which \mathcal{F}(Y) is the set of all subsets of Y. (Many authors say that Y is shattered by \mathcal{F} if \mathcal{F}(Y) is the set of all subsets of Y.)

A moment’s thought leads to an obvious maximal example (by which I mean an example that cannot be extended rather than an example of maximal size, though it will eventually turn out to be of maximal size too). It is to take \mathcal{F} to be the set of all subsets of \{1,2,\dots,n\} of size less than k

Imagine now that we have experimented a bit and been unable to find any larger examples. Given the naturalness of that example, we begin to suspect that it may be of maximal size. If we have experimented sufficiently diligently, we will also see that if our hunch is correct, then there are many examples of maximal size. So once again we wonder about using linear algebra. The most obvious approach would be to find associate with each set in \mathcal{F} a vector in some space of dimension 1+\binom n 1+\dots+\binom n{k-1} in such a way that the condition we are requiring \mathcal{F} to satisfy implies that all those vectors are linearly independent. 

But how do we associate 1+\binom n 1+\dots+\binom n{k-1} coordinates with a subset of \{1,2,\dots,n\}? The least unnatural way would seem to be to relate these coordinates in some way to the sets of size at most k-1. With this thought in mind, the following idea is quite a natural one: let \mathcal{Z} be the set of all subsets of \{1,2,\dots,n\} of size at most k-1, and associate with any subset F of \{1,2,\dots,n\} the function \phi_F:\mathcal{Z}\rightarrow\mathbb{R} that takes a set Z\in\mathcal{Z} to 1 if Z\subset F and 0 otherwise.

You may have wondered why I described this as a function to \mathbb{R} rather than to \{0,1\}. The reason was that it turns out to be best to regard the functions \phi_F as belonging to a vector space over \mathbb{R} rather than over \mathbb{F}_2. (This is often true, so this example is a useful illustration.) Indeed, it turns out that if \mathcal{F} has no full projections onto sets of size k, then the functions \phi_F are linearly independent.

Let us see how to prove this. To do this, we take a linear dependence \sum\lambda_F\phi_F=0 and try to prove that every \lambda_F is zero. The information we have is that for no set Y of size k does \mathcal{F}(Y) contain all subsets of Y. An obvious way to try to use this information is to assume that not all the \lambda_F are zero, to define some set Y of size k in a natural way, and to prove that every subset of Y is equal to F\cap Y for some F\in\mathcal{F}

Let us consider what the statement \sum\lambda_F\phi_F=0 actually says. If Z is a set of size at most k-1, then \sum\lambda_F\phi_F(Z) can be rewritten as \sum_{F\in\mathcal{F}, Z\subset F}\lambda_F. So we know that this quantity is 0 for every set Z of size at most k-1. So we have here a function that vanishes for every set of size k-1, and we are looking for a set of size at least k. This suggests that we should look for a set Y such that \sum_{F\in\mathcal{F}, Y\subset F}\lambda_F does not vanish, since such a set is guaranteed to have size at least k. We could then hope that every subset of this set Y was of the form F\cap Y for some F\in\mathcal{F}.

Must there exist a set Y such that the expression in question does not vanish? Clearly yes: we can just take Y to be a maximal F\in\mathcal{F} such that \lambda_F\ne 0. However, it is too much to hope that every set Y with this property will work (and easy to find examples of sets Y that do not work). If we want to make it as likely as possible that every subset of Y is of the form F\cap Y then obviously we would like Y to be as small as possible. So let us choose a minimal Y such that \sum_{F\in\mathcal{F}, Y\subset F}\lambda_F\ne 0.

Let Z\subset Y. We would like to prove that there exists F\in\mathcal{F} such that F\cap Y=Z. And a natural way of doing that, given what we have so far, would be to prove that \sum_{F\in\mathcal{F}, F\cap Y=Z}\lambda_F\ne 0. And a natural way of doing that, if we want to exploit the minimality of Y, is to try to express this sum as a combination of sums of the form \sigma(W)=\sum_{F\in\mathcal{F}, W\subset F}\lambda_F with W\subset Y.

This can indeed be done. One can check (and this is basically the inclusion-exclusion formula) that \null\sum_{F\in\mathcal{F}, F\cap Y=Z}\lambda_F=\sum_{Z\subset W\subset Y}(-1)^{|W-Z|}\sigma(W). By the minimality of Y, the sum on the right-hand side is equal to (-1)^{|Y-Z|}\sigma(Y), which is non-zero, and we are done.

Recommended further reading. This singleton list needs to be extended, but as a start, one could look at Part III of “Extremal Combinatorics: with Applications in Computer Science” by Stasys Jukna. It can be looked at online here (just click on “Preview this book”).

End of sample article.

The expectation is that, like Wikipedia articles, Tricki articles will be edited many times by people other than the original author. This would be an obvious candidate for further work: it would be good to have more examples and a more coherent account of how to use the method itself. (For instance, is there a more systematic way of converting problems into linear algebra, is there a way of telling which scalars to use, etc.?)

10 Responses to “Dimension arguments in combinatorics”

  1. Emmanuel Kowalski Says:

    I also recently discussed (in less detail) Example 3 here, after reading about it in van den Dries’s book on o-minimal structures. He also gives a reference for it in a probability theory paper of Vapnik and Chernovenkis (and does not mention Sauer’s name).
    The proof in your post seems more natural than the one in the book of van den Dries (which is essentially the same as that of Vapnik-Chernovenkis), by induction on k.

  2. van vu Says:

    Dear Tim,

    Here is an exercise that is related to Example 2 which you may find interesting. Fix k and let n >>k. How many subsets of {1,..,n} can one find such that each has even size and every k-intersection also have even size ?

  3. Extremal Combinatorics I: Extremal Problems on Set Systems « Combinatorics and more Says:

    […] “dimension arguments in combinatorics”. For more examples and a general discussion see this post  in Gowers’s […]

  4. Niranjan Says:

    There is a monograph (I am not sure if it is available as a book) by Babai and Frankl titled, `Linear Algebra methods in Combinatorics’. It covers a wide variety of examples of this type. So examples involving the binomial coefficients can be handled using the vector space of $k$-alternating forms and so on.

  5. Extremal Combinatorics III: Some Basic Theorems « Combinatorics and more Says:

    […] described as an “eigentheorem” because it is important in many different places. It was mentioned(with an algebraic proof by Frankl and Pach) in Gowers blog and also, in another context, in […]

  6. mudassir Says:

    I have an other interesting question:

    Let S be a finite set and let T be a proper nonempty subset of S. Suppose R is selected uniformly at random from among all subsets of S of odd size. what is the probability that size of (T intersection R) is odd?

    any ideas?

  7. Extremal Combinatorics VI: The Frankl-Wilson Theorem « Combinatorics and more Says:

    […] applications. It has several proofs, all based on linear algebra methods (also referred to as dimension arguments). The original proof is based on a careful study of incidence matrices for families of sets. […]

  8. student Says:

    Hi Tim,

    Can I answer your first question this way? (I would like to know if there is any mistake in the approach I take below.)

    So, you represent a set A_i by its characteristic vector which are bitstrings of length n each containing an odd number of 1‘s. I assume that you can get m such A_i‘s which have odd number of elements and the intersection of A_i and A_j contains even number of elements.

    Like you have done above, let us view this as a vector space over \mathbb{F}_2. Now, if it is possible for me to stuff more than n number of sets, let us look at A_{n+1} as a linear combination of minimal number of basis elements (which has fewer than n) elements. Then, if A_{n+1} = A_{i1} \oplus A_{i2} \oplus \ldots \oplus A_{ik}, then A_{n+1} \oplus A_{ij} is 1 for all j \leq k (by minimality).(Here \oplus is XOR) and since intersection is even, the value should be 0 be true for XOR which is not the case. Contradiction => m is at most n. And m = n is possible. So, the result follows.

  9. The origins of the linear algebra method in combinatorics | Anurag's Math Blog Says:

    […] This result shows the power of using linear algebra in combinatorics. It has been used extensively to solve several interesting combinatorial and geometrical problems, many of which could not be proved in any other way. Here is a nice mathoverflow question that discusses the use of linear algebra in combinatorics: http://mathoverflow.net/questions/17006/linear-algebra-proofs-in-combinatorics, and here is a blog post by Gower discussing this kind of dimension argument in combinatorics: https://gowers.wordpress.com/2008/07/31/dimension-arguments-in-combinatorics/  […]

  10. Elegant applications of advanced techniques to “olympiad” problems – Math Solution Says:

    […] example would be Tim Gowers’ beautiful proof of the following […]

Leave a comment