2.3 True Complexity for Vector Spaces over Finite Fields

2.3.3 Square-Independence is Sufficient

dimension $d_1$. Then $V = W \cap Y$ and $Y$ has dimension at least $n - d_1$. It follows easily that

$$r = \operatorname{codim} V \leq \operatorname{codim} W + \operatorname{codim} Y \leq \operatorname{codim} W + d_1,$$

which implies that the codimension of $W$ is at least $r - d_1$. Hence the codimension of $V$ inside $Y$ is at least $r - 2d_1$.

We are now in a position to state the version of the structure theorem that we shall be using. It can be read out of (but is not explicitly stated in) [Gre05b] and [GT05b].

Theorem 2.11. Let $p$ be a fixed prime, let $\delta > 0$, let $r : \mathbb{N} \to \mathbb{N}$ be an arbitrary function (which may depend on $\delta$) and suppose that $n > n_0(r, \delta)$ is sufficiently large. Then given any function $f : \mathbb{F}_p^n \to [-1,1]$, there exist $d_0 = d_0(r, \delta)$ and a quadratic factor $(\mathcal{B}_1, \mathcal{B}_2)$ of rank at least $r(d_1 + d_2)$ and complexity at most $(d_1, d_2)$, $d_1, d_2 \leq d_0$, together with a decomposition

$$f = f_1 + f_2 + f_3,$$

where

$$f_1 := \mathbb{E}(f \mid \mathcal{B}_2), \qquad \|f_2\|_2 \leq \delta \quad \text{and} \quad \|f_3\|_{U^3} \leq \delta.$$

Note that $\mathbb{E} f_1 = \mathbb{E} f$. In particular, $\mathbb{E} f_1 = 0$ whenever $f$ is the balanced function of a subset of $\mathbb{F}_p^n$. It can be shown that $f_1$ is uniform whenever $f$ is uniform: roughly speaking, the reason for this is that $\mathbb{E}(f \mid \mathcal{B}_1)$ is approximately zero and the atoms of $\mathcal{B}_2$ are uniform subsets of the atoms of $\mathcal{B}_1$. However, we shall not need this fact.

We shall apply Theorem 2.11 when $r$ is the function $d \mapsto 2md + C$ for a constant $C$. Unfortunately, ensuring that factors have high rank is an expensive process: even for this modest function the argument involves an iteration that increases $d_0$ exponentially at every step. For this reason we have stated the theorem in a qualitative way. A quantitative version would involve a tower-type bound.


linear system $L$ of complexity at most 2, our aim is to show, assuming that $\|f\|_{U^2}$ is sufficiently small, that

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x))$$

is also small. (Once we have done that, it will be straightforward to show that the same average, with $A$ replacing $f$, is close to $\alpha^m$.) In order to carry out this estimate, we first apply the structure theorem to decompose $f$ as $f_1 + f_2 + f_3$, where $f_1$ is quadratically structured, $f_2$ is small in $L_2$ and $f_3$ is quadratically uniform. This then allows us to decompose the product into a sum of $3^m$ products, one for each way of choosing $f_1$, $f_2$ or $f_3$ from each of the $m$ brackets. If we ever choose $f_2$, then the Cauchy-Schwarz inequality implies that the corresponding term is small, and if we ever choose $f_3$ then a similar conclusion follows from Theorem 2.4. Thus, the most important part of the proof is to use the linear uniformity and quadratic structure of $f_1$ to prove that the product

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_1(L_i(x))$$

is small. This involves a calculation that generalizes the one we used to prove Theorem 2.7. The main step is the following lemma, in which we do the calculation in the case where the linear factor $\mathcal{B}_1$ is trivial.

To understand its significance, let us briefly think about what happens when we map a 4-term progression to $(\mathbb{F}_p^{d_2})^4$ using the quadratic map $\Gamma_2$. Because of the relation between the squares of the forms defining the 4-term progression, we find that there are roughly the expected number of progressions in the pre-image $(\Gamma_2^{-1}(b^{(1)}), \Gamma_2^{-1}(b^{(2)}), \Gamma_2^{-1}(b^{(3)}), \Gamma_2^{-1}(b^{(4)})) \subseteq (\mathbb{F}_p^n)^4$ whenever the $b^{(i)} \in \mathbb{F}_p^{d_2}$ satisfy $b^{(1)} - 3b^{(2)} + 3b^{(3)} - b^{(4)} = 0$, and no progressions at all otherwise. For a general square-independent linear system, it turns out that the pre-images are roughly uniformly distributed, independently of any relations between the $b^{(i)}$.
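For the 4-term progression forms $L_i(x, y) = x + (i-1)y$, the relation between the squares can be checked directly by expanding:

```latex
L_1^2 - 3L_2^2 + 3L_3^2 - L_4^2
  = x^2 - 3(x+y)^2 + 3(x+2y)^2 - (x+3y)^2
  = (1 - 3 + 3 - 1)\,x^2 + (-6 + 12 - 6)\,xy + (-3 + 12 - 9)\,y^2
  = 0.
```

Consequently, for any quadratic form $q$, the values $q(L_i(x,y))$ satisfy $q(L_1) - 3q(L_2) + 3q(L_3) - q(L_4) = 0$, which forces the same relation on the $b^{(i)}$ coordinate by coordinate.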

Lemma 2.12. Let $L = (L_1, \ldots, L_m)$ be a square-independent system of linear forms and let $\Gamma_2 = (q_1, \ldots, q_{d_2})$ be a quadratic map from $\mathbb{F}_p^n$ to $\mathbb{F}_p^{d_2}$ of rank at least $r$. Let $\phi_1, \ldots, \phi_m$ be linear maps from $(\mathbb{F}_p^n)^d$ to $\mathbb{F}_p^{d_2}$ and let $b_1, \ldots, b_m$ be elements of $\mathbb{F}_p^{d_2}$. Let $x = (x_1, \ldots, x_d)$ be a randomly chosen element of $(\mathbb{F}_p^n)^d$. Then the probability that $\Gamma_2(L_i(x)) = \phi_i(x) + b_i$ for every $i$ differs from $p^{-md_2}$ by at most $p^{-r/2}$.

Proof. Let $\Lambda$ be the set of all $m \times d_2$ matrices $\lambda = (\lambda_{ij})$ over $\mathbb{F}_p$, and let us write $\phi_i = (\phi_{i1}, \ldots, \phi_{id_2})$ and $b_i = (b_{i1}, \ldots, b_{id_2})$ for each $i$. The probability we are interested in is the probability that $q_j(L_i(x)) = \phi_{ij}(x) + b_{ij}$ for every $i \leq m$ and every $j \leq d_2$. This equals

$$\mathbb{E}_x \mathbb{E}_{\lambda \in \Lambda} \prod_{i=1}^m \prod_{j=1}^{d_2} \omega^{\lambda_{ij}(q_j(L_i(x)) - \phi_{ij}(x) - b_{ij})},$$

since if $q_j(L_i(x)) = \phi_{ij}(x) + b_{ij}$ for every $i$ and $j$, then the expectation over $\lambda$ is 1, and otherwise, if we choose $i$ and $j$ such that $q_j(L_i(x)) \neq \phi_{ij}(x) + b_{ij}$ and consider the expectation over $\lambda_{ij}$ while all other entries of $\lambda$ are fixed, then we see that the expectation over $\lambda$ is zero.
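The averaging over $\lambda_{ij}$ here is the standard orthogonality relation for characters of $\mathbb{F}_p$: the average of $\omega^{\lambda t}$ over $\lambda \in \mathbb{F}_p$ is the indicator of $t = 0$. A quick numerical check of this standard fact (the choice $p = 7$ is arbitrary):

```python
import cmath

p = 7
omega = cmath.exp(2j * cmath.pi / p)  # primitive p-th root of unity

def char_avg(t):
    """Average of omega^(lambda * t) over lambda in F_p.

    Equals 1 if t == 0 (mod p) and 0 otherwise; this is what converts
    the product of characters into the indicator of the event
    q_j(L_i(x)) = phi_ij(x) + b_ij for all i, j."""
    return sum(omega ** (lam * t) for lam in range(p)) / p

print(abs(char_avg(0) - 1) < 1e-12)                        # True
print(all(abs(char_avg(t)) < 1e-12 for t in range(1, p)))  # True
```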

We can rewrite the above expectation as

$$\mathbb{E}_{\lambda \in \Lambda} \mathbb{E}_x\, \omega^{\sum_{i,j} \lambda_{ij}(q_j(L_i(x)) - \phi_{ij}(x) - b_{ij})}.$$

If $\lambda = 0$, then obviously the expectation over $x$ is 1. This happens with probability $p^{-md_2}$. Otherwise, for each $i$ let us say that the coefficients of $L_i$ are $c_{i1}, \ldots, c_{id}$. That is, let $L_i(x) = \sum_{u=1}^d c_{iu} x_u$. Then

$$q_j(L_i(x)) = \sum_{u,v} c_{iu} c_{iv} \beta_j(x_u, x_v),$$

where $\beta_j$ is the bilinear form associated with $q_j$. Choose some $j$ such that $\lambda_{ij}$ is non-zero for at least one $i$. Then the square-independence of the linear forms $L_i$ implies that there exist $u$ and $v$ such that $\sum_i \lambda_{ij} c_{iu} c_{iv}$ is not zero.

Fix such a $j$, $u$ and $v$, and do it in such a way that $u = v$ if this is possible. We shall now consider the expectation as $x_u$ and $x_v$ vary with every other $x_w$ fixed. Notice first that

$$\sum_{i,j} \lambda_{ij} q_j(L_i(x)) = \sum_{i,j} \sum_{t,w} \lambda_{ij} c_{it} c_{iw} \beta_j(x_t, x_w).$$

Let us write $\beta_{tw}$ for the bilinear form $\sum_{i,j} \lambda_{ij} c_{it} c_{iw} \beta_j$, so that this becomes $\sum_{t,w} \beta_{tw}(x_t, x_w)$. Let us also write $\phi(x)$ for $\sum_{i,j} \lambda_{ij} \phi_{ij}(x)$ and let $\phi_1, \ldots, \phi_d$ be linear maps from $\mathbb{F}_p^n$ to $\mathbb{F}_p$ such that $\phi(x) = \sum_t \phi_t(x_t)$ for every $x$. Then

$$\sum_{i,j} \lambda_{ij}(q_j(L_i(x)) - \phi_{ij}(x)) = \sum_{t,w} \beta_{tw}(x_t, x_w) - \sum_t \phi_t(x_t).$$

Notice that if we cannot get $u$ to equal $v$, then $\sum_i \lambda_{ij} c_{iu}^2 = 0$ for every $u$ and every $j$, which implies that $\beta_{uu} = 0$ for every $u$. Notice also that the assumption that $\Gamma_2$ has rank at least $r$ and the fact that $\sum_i \lambda_{ij} c_{iu} c_{iv} \neq 0$ for at least one $j$ imply that $\beta_{uv}$ has rank at least $r$.


If we fix every $x_t$ except for $x_u$ and $x_v$, then $\sum_{t,w} \beta_{tw}(x_t, x_w) - \sum_t \phi_t(x_t)$ is a function of $x_u$ and $x_v$ of the form

$$\beta_{uv}(x_u, x_v) + \psi_u(x_u) + \psi_v(x_v),$$

where $\psi_u$ and $\psi_v$ are linear functionals on $\mathbb{F}_p^n$ (that depend on the other $x_t$).

Now let us estimate the expectation

$$\mathbb{E}_{x_u, x_v}\, \omega^{\sum_{i,j} \lambda_{ij}(q_j(L_i(x)) - \phi_{ij}(x) - b_{ij})},$$

where we have fixed every $x_t$ apart from $x_u$ and $x_v$. Letting $b = \sum_{i,j} \lambda_{ij} b_{ij}$ and using the calculations we have just made, we can write this in the form

$$\mathbb{E}_{x_u, x_v}\, \omega^{\beta_{uv}(x_u, x_v) + \psi_u(x_u) + \psi_v(x_v) - b}.$$

If $u = v$, then the expectation is just over $x_u$ and the exponent has the form $q(x_u) + \psi(x_u) - b$ for some quadratic form $q$ of rank at least $r$ and some linear functional $\psi$. Therefore, by Lemma 2.29, the expectation is at most $p^{-r/2}$. If $u \neq v$ (and therefore every $\beta_{uu}$ is zero), then for each $x_v$ the exponent is linear in $x_u$. This means that either the expectation over $x_u$ is zero or the function $\beta_{uv}(x_u, x_v) + \psi_u(x_u)$ is constant. If the latter is true when $x_v = y$ and when $x_v = z$, then $\beta_{uv}(x_u, y - z)$ is also constant, and therefore identically zero.

Since $\beta_{uv}$ has rank at least $r$, $y - z$ must lie in a subspace of codimension at least $r$. Therefore, the set of $x_v$ such that $\beta_{uv}(x_u, x_v) + \psi_u(x_u)$ is constant is an affine subspace of $\mathbb{F}_p^n$ of codimension at least $r$, which implies that the probability (for a randomly chosen $x_v$) that the expectation (over $x_u$) is non-zero is at most $p^{-r}$. When the expectation is non-zero, it has modulus 1.

In either case, we find that, for any non-zero $\lambda \in \Lambda$, the expectation over $x$ is at most $p^{-r/2}$, and this completes the proof of the lemma.

We now want to take into account $\Gamma_1$ as well as $\Gamma_2$. This turns out to be a short deduction from the previous result. First let us do a simple piece of linear algebra.

Lemma 2.13. Let $L = (L_1, \ldots, L_m)$ be a collection of linear forms in $d$ variables, and suppose that the linear span of $L_1, \ldots, L_m$ has dimension $d_0$. Let $\Gamma_1 : \mathbb{F}_p^n \to \mathbb{F}_p^{d_1}$ be a surjective linear map and let $\phi : (\mathbb{F}_p^n)^d \to (\mathbb{F}_p^{d_1})^m$ be defined by the formula

$$\phi : x \mapsto (\Gamma_1(L_1(x)), \ldots, \Gamma_1(L_m(x))).$$

Then the image of $\phi$ is the subspace $Z$ of $(\mathbb{F}_p^{d_1})^m$ that consists of all sequences $(a_1, \ldots, a_m)$ such that $\sum_i \mu_i a_i = 0$ whenever $\sum_i \mu_i L_i = 0$. The dimension of $Z$ is $d_0 d_1$.

Proof. Since the $m$ forms $L_i$ span a space of dimension $d_0$, the set of sequences $\mu = (\mu_1, \ldots, \mu_m)$ such that $\sum_i \mu_i L_i = 0$ is a linear subspace $W$ of $\mathbb{F}_p^m$ of dimension $m - d_0$. Therefore, the condition that $\sum_i \mu_i a_i = 0$ for every sequence $\mu \in W$ restricts $(a_1, \ldots, a_m)$ to a subspace of $(\mathbb{F}_p^{d_1})^m$ of codimension $d_1(m - d_0)$. (An easy way to see this is to write $a_i = (a_{i1}, \ldots, a_{id_1})$ and note that for each $j$ the sequence $(a_{1j}, \ldots, a_{mj})$ is restricted to a subspace of codimension $m - d_0$.) Therefore, the dimension of $Z$ is $d_0 d_1$, as claimed.

Now let us show that $Z$ is the image of $\phi$. Since $\phi$ is linear, $Z$ certainly contains the image of $\phi$, so it will be enough to prove that the rank of $\phi$ is $d_0 d_1$.

Abusing notation, let us write $\Gamma_1(x)$ for the sequence $(\Gamma_1 x_1, \ldots, \Gamma_1 x_d)$, which belongs to $(\mathbb{F}_p^{d_1})^d$. Then $\phi(x)$ can be rewritten as $(L_1(\Gamma_1(x)), \ldots, L_m(\Gamma_1(x)))$. Since $\Gamma_1$ is a surjection, it is also a surjection when considered as a map on $(\mathbb{F}_p^n)^d$. Therefore, the rank of $\phi$ is the rank of the map $\psi : (\mathbb{F}_p^{d_1})^d \to (\mathbb{F}_p^{d_1})^m$ defined by

$$\psi : y \mapsto (L_1(y), \ldots, L_m(y)).$$

Since the $L_i$ span a space of dimension $d_0$, the nullity of this map is $d_1(d - d_0)$, so its rank is $d_0 d_1$. Therefore, the image of $\phi$ is indeed $Z$.
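The dimension count can be checked by brute force in a toy case (all parameters here are illustrative): take $p = 2$, $d_1 = 2$, and the forms $L_1 = x_1$, $L_2 = x_2$, $L_3 = x_1 + x_2$, so that $d = d_0 = 2$ and the only dependence is $L_1 + L_2 - L_3 = 0$.

```python
from itertools import product

p, d1, d0 = 2, 2, 2
vecs = list(product(range(p), repeat=d1))  # the four vectors of F_2^2

def add(u, v):
    return tuple((a + b) % p for a, b in zip(u, v))

# psi(y1, y2) = (L1(y), L2(y), L3(y)) = (y1, y2, y1 + y2)
image = {(y1, y2, add(y1, y2)) for y1 in vecs for y2 in vecs}

# Z = all (a1, a2, a3) with a1 + a2 - a3 = 0 (note -a3 = a3 over F_2)
Z = {(a1, a2, add(a1, a2)) for a1 in vecs for a2 in vecs}

print(len(image) == p ** (d0 * d1))  # True: the image has dimension d0*d1 = 4
print(image == Z)                    # True: the image is exactly Z
```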

Lemma 2.14. Let $L = (L_1, \ldots, L_m)$ be a square-independent system of linear forms in $d$ variables, and suppose that the linear span of $L_1, \ldots, L_m$ has dimension $d_0$. Let $\Gamma_1 : \mathbb{F}_p^n \to \mathbb{F}_p^{d_1}$ be a surjective linear map and let $\Gamma_2 : \mathbb{F}_p^n \to \mathbb{F}_p^{d_2}$ be a quadratic map of rank at least $r$. Let $a_1, \ldots, a_m$ be elements of $\mathbb{F}_p^{d_1}$, let $b_1, \ldots, b_m$ be elements of $\mathbb{F}_p^{d_2}$, and let $\phi$ and $Z$ be as defined in the previous lemma. Then the probability, if $x$ is chosen randomly from $(\mathbb{F}_p^n)^d$, that $\Gamma_1(L_i(x)) = a_i$ and $\Gamma_2(L_i(x)) = b_i$ for every $i \leq m$ is zero if $(a_1, \ldots, a_m) \notin Z$, and otherwise it differs from $p^{-d_0 d_1 - m d_2}$ by at most $p^{d_1 - d_0 d_1 - r/2}$.

Proof. If $a = (a_1, \ldots, a_m) \notin Z$, then there exists $\mu \in \mathbb{F}_p^m$ such that $\sum_i \mu_i a_i \neq 0$ and $\sum_i \mu_i L_i(x) = 0$ for every $x$. Since $\Gamma_1$ is linear, it follows that there is no $x$ such that $\Gamma_1(L_i(x)) = a_i$ for every $i$.

Otherwise, by Lemma 2.13, $a$ lies in the image of $\phi$, which has rank $d_0 d_1$, so $\phi^{-1}(\{a\})$ is an affine subspace of $(\mathbb{F}_p^n)^d$ of codimension $d_0 d_1$. Therefore, the probability that $\phi(x) = a$ is $p^{-d_0 d_1}$. Now let us use Lemma 2.12 to estimate the probability, conditional on this, that $\Gamma_2(L_i(x)) = b_i$ for every $i \leq m$.


In the proof of Lemma 2.13, we observed that $\phi(x)$ depends on $\Gamma_1(x)$ only, so we shall estimate the required probability, given the value of $\Gamma_1(x)$. (Recall that this is notation for $(\Gamma_1 x_1, \ldots, \Gamma_1 x_d)$.) In order to specify the set on which we are conditioning, let $V$ be the kernel of $\Gamma_1$ (considered as a map defined on $\mathbb{F}_p^n$), and given a sequence $(w_1, \ldots, w_d) \in (\mathbb{F}_p^n)^d$, let us estimate the required probability, given that $x_u \in V + w_u$ for every $u$.

Let us write $x_u = y_u + w_u$. Thus, we are estimating the probability that $\Gamma_2(L_i(y + w)) = b_i$ for every $i \leq m$. But for each $i$ we can write $\Gamma_2(L_i(y + w))$ as $\Gamma_2(L_i(y)) + \phi_i(y) + b_i'$ for some linear function $\phi_i : V^d \to \mathbb{F}_p^{d_2}$ and some vector $b_i' \in \mathbb{F}_p^{d_2}$.

Because $\Gamma_2$ has rank at least $r$ and the codimension of $V$ in $\mathbb{F}_p^n$ is $d_1$, Lemma 2.10 implies that the rank of the restriction of $\Gamma_2$ to $V$ is at least $r - 2d_1$. Therefore, by Lemma 2.12, the probability that $\Gamma_2(L_i(y)) = -\phi_i(y) + b_i - b_i'$ for every $i$ differs from $p^{-md_2}$ by at most $p^{d_1 - r/2}$.

Since this is true for all choices of $w$, we have the same estimate if we condition on the event that $\phi(x) = a$ for some fixed $a \in Z$. Therefore, the probability that $\Gamma_1(L_i(x)) = a_i$ and $\Gamma_2(L_i(x)) = b_i$ for every $i$ differs from $p^{-d_0 d_1 - m d_2}$ by at most $p^{d_1 - d_0 d_1 - r/2}$, as claimed.

Next, we observe that Lemma 2.14 implies that all the atoms of $\mathcal{B}_2$ have approximately the same size.

Corollary 2.15. Let $\Gamma_1$ and $\Gamma_2$ be as above and let $x$ be a randomly chosen element of $\mathbb{F}_p^n$. Then for every $a \in \mathbb{F}_p^{d_1}$ and every $b \in \mathbb{F}_p^{d_2}$, the probability that $\Gamma_1(x) = a$ and $\Gamma_2(x) = b$ differs from $p^{-d_1 - d_2}$ by at most $p^{-r/2}$.

Proof. Let us apply Lemma 2.14 in the case where $L$ consists of the single one-variable linear form $L(x) = x$. This has linear rank 1 and is square-independent, so when we apply the lemma we have $d_0 = m = 1$. If we let $a_1 = a$ and $b_1 = b$, then the conclusion of the lemma tells us precisely what is claimed.
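The role of the rank can be seen concretely in a toy case with trivial $\Gamma_1$ and $d_2 = 1$ (the parameters are chosen purely for illustration): over $\mathbb{F}_3^4$, the quadratic form $q(x) = x_1 x_2 + x_3 x_4$ has rank $4$, and each of its level sets has size within $p^{n - r/2} = 9$ of the main term $p^{n-1} = 27$.

```python
from itertools import product
from collections import Counter

p, n, r = 3, 4, 4  # the form q below has rank 4 on F_3^4

def q(x):
    """Rank-4 quadratic form x1*x2 + x3*x4 on F_3^4."""
    return (x[0] * x[1] + x[2] * x[3]) % p

counts = Counter(q(x) for x in product(range(p), repeat=n))
print(dict(counts))  # {0: 33, 1: 24, 2: 24}

main_term = p ** (n - 1)   # 27: perfect equidistribution of level sets
error = p ** (n - r // 2)  # 9: the Gauss-sum error term for rank r
print(all(abs(c - main_term) <= error for c in counts.values()))  # True
```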

The next two lemmas are simple technical facts about projections on to linear factors. The first one tells us that if $g$ is any function that is uniform and constant on the atoms of a linear factor, then it has small $L_2$-norm. The second tells us that projecting on to a linear factor decreases the $U^2$-norm.

Lemma 2.16. Let $G$ be a function from $\mathbb{F}_p^{d_1}$ to $[-1,1]$, let $\Gamma_1 : \mathbb{F}_p^n \to \mathbb{F}_p^{d_1}$ be a surjective linear map and let $g = G \circ \Gamma_1$. Then $\|g\|_2^4 \leq p^{d_1} \|g\|_{U^2}^4$.


Proof. Since $\Gamma_1$ takes each value in $\mathbb{F}_p^{d_1}$ the same number of times, $\|g\|_{U^2} = \|G\|_{U^2}$. But

$$\|G\|_{U^2}^4 = \mathbb{E}_a \bigl(\mathbb{E}_b\, G(b) G(b+a)\bigr)^2 \geq p^{-d_1} \bigl(\mathbb{E}_b\, G(b)^2\bigr)^2 = p^{-d_1} \|G\|_2^4,$$

which proves the result, since $\|g\|_2 = \|G\|_2$ as well.
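The inequality $\|G\|_2^4 \le p^{d_1} \|G\|_{U^2}^4$ can be checked numerically on a small group (a sketch over $\mathbb{F}_3^2$; the random $G$ is just an arbitrary test function):

```python
import random
from itertools import product

p, d1 = 3, 2
group = list(product(range(p), repeat=d1))  # F_3^2
N = len(group)

def add(u, v):
    return tuple((a + b) % p for a, b in zip(u, v))

random.seed(0)
G = {x: random.uniform(-1, 1) for x in group}  # arbitrary test function

# ||G||_{U^2}^4 = E_{x,a,b} G(x) G(x+a) G(x+b) G(x+a+b)
u2_fourth = sum(G[x] * G[add(x, a)] * G[add(x, b)] * G[add(x, add(a, b))]
                for x in group for a in group for b in group) / N ** 3

# ||G||_2^4 = (E_x G(x)^2)^2
l2_fourth = (sum(G[x] ** 2 for x in group) / N) ** 2

print(l2_fourth <= p ** d1 * u2_fourth)  # True, as Lemma 2.16 predicts
```

The inequality holds for every $G$, since in Fourier terms it is just Cauchy-Schwarz over the $p^{d_1}$ characters of the group.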

Lemma 2.17. Let $f$ be a function from $\mathbb{F}_p^n$ to $\mathbb{R}$, let $\mathcal{B}_1$ be a linear factor and let $g = \mathbb{E}(f \mid \mathcal{B}_1)$. Then $\|g\|_{U^2} \leq \|f\|_{U^2}$.

Proof. On every atom of $\mathcal{B}_1$, $g$ is constant and $f - g$ averages zero. Let $\Gamma_1$ be the linear map that defines $\mathcal{B}_1$ and, as we did earlier, for each $a \in \mathbb{F}_p^{d_1}$ let $V_a$ stand for $\Gamma_1^{-1}(\{a\})$. Then

$$\|f\|_{U^2}^4 = \mathbb{E}_{a_1 + a_2 = a_3 + a_4}\; \mathbb{E}_{\substack{x_1 + x_2 = x_3 + x_4 \\ \Gamma_1(x_i) = a_i}} f(x_1) f(x_2) f(x_3) f(x_4).$$

Let us fix a choice of $a_1 + a_2 = a_3 + a_4$ and consider the inner expectation. Writing $\lambda_i$ for the constant value of $g$ on $V_{a_i}$ and setting $g_0 = f - g$, this has the form

$$\mathbb{E}_{\substack{x_1 + x_2 = x_3 + x_4 \\ \Gamma_1(x_i) = a_i}} (\lambda_1 + g_0(x_1))(\lambda_2 + g_0(x_2))(\lambda_3 + g_0(x_3))(\lambda_4 + g_0(x_4)).$$

This splits into sixteen parts. Each part that involves at least one $\lambda_i$ and at least one $g_0(x_i)$ is zero, because any three of the $x_i$ can vary independently and $g_0$ averages zero on every atom of $\mathcal{B}_1$. This means that the expectation is

$$\lambda_1 \lambda_2 \lambda_3 \lambda_4 + \mathbb{E}_{\substack{x_1 + x_2 = x_3 + x_4 \\ \Gamma_1(x_i) = a_i}} g_0(x_1) g_0(x_2) g_0(x_3) g_0(x_4).$$

If we now take expectations over $a_1 + a_2 = a_3 + a_4$, we find that $\|f\|_{U^2}^4 = \|g\|_{U^2}^4 + \|f - g\|_{U^2}^4$. Notice that this is a general result about how the $U^2$-norm of a function is related to the $U^2$-norm of a projection on to a linear factor.

Now we are ready to estimate the product we are interested in, for functions that are constant on the atoms of $\mathcal{B}_2$.

Lemma 2.18. Let $\Gamma_1 : \mathbb{F}_p^n \to \mathbb{F}_p^{d_1}$ be a linear function and let $\Gamma_2 : \mathbb{F}_p^n \to \mathbb{F}_p^{d_2}$ be a quadratic function. Let $(\mathcal{B}_1, \mathcal{B}_2)$ be the corresponding quadratic factor and suppose that this has rank at least $r$. Let $c > 0$, let $f : \mathbb{F}_p^n \to [-1,1]$ be a function with $\|f\|_{U^2} \leq c$ and let $f_1 = \mathbb{E}(f \mid \mathcal{B}_2)$. Let $L = (L_1, \ldots, L_m)$ be a square-independent system of linear forms in $d$ variables. Then

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f_1(L_i(x)) \leq 4^m c\, p^{d_1/4} + 2^{m+1} p^{m(d_1 + d_2) - r/2}.$$


Proof. Let $g = \mathbb{E}(f_1 \mid \mathcal{B}_1)$ and let $h = f_1 - g$. Then $\|g\|_1 \leq \|g\|_2 \leq p^{d_1/4} \|g\|_{U^2}$, by Cauchy-Schwarz and Lemma 2.16. By Lemma 2.17, $\|g\|_{U^2} \leq \|f\|_{U^2}$, which is at most $c$, by hypothesis. Therefore, $\|g\|_1 \leq c\, p^{d_1/4}$.

Since $f_1 = g + h$, we can split the product up into a sum of $2^m$ products, in each of which we replace $f_1(L_i(x))$ by either $g(L_i(x))$ or $h(L_i(x))$. Since $\|g\|_1 \leq c\, p^{d_1/4}$ and $\|h\|_\infty \leq 2$, any product that involves at least one $g$ has average at most $2^m c\, p^{d_1/4}$. It remains to estimate

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m h(L_i(x)).$$

Let $Z$ be as defined in Lemma 2.13, and for each $a = (a_1, \ldots, a_m)$ and $b = (b_1, \ldots, b_m)$, let $P(a, b)$ be the probability that $\Gamma_1(L_i(x)) = a_i$ and $\Gamma_2(L_i(x)) = b_i$ for every $i$. By Lemma 2.14, for each $a \in Z$ we can write $P(a, b) = p^{-d_0 d_1 - m d_2} + \epsilon(a, b)$, with $|\epsilon(a, b)| \leq p^{d_1 - d_0 d_1 - r/2}$. Now let $H$ be defined by the formula $h(x) = H(\Gamma_1 x, \Gamma_2 x)$. Because $h$ is constant on the atoms of $\mathcal{B}_2$, $H$ is well-defined on the set of all elements of $\mathbb{F}_p^{d_1} \times \mathbb{F}_p^{d_2}$ of the form $(\Gamma_1 x, \Gamma_2 x)$. Since $h$ takes values in $[-2, 2]$, so does $H$.

Next, we show that $\mathbb{E}_b H(a, b)$ is small for any fixed $a \in \mathbb{F}_p^{d_1}$, using the facts that $h$ averages 0 on every cell of $\mathcal{B}_1$ and that it is constant on the cells of $\mathcal{B}_2$. Let us fix an $a$ and write $P(b)$ for the probability that $\Gamma_2(x) = b$ given that $\Gamma_1(x) = a$, that is, for the density of $V_a \cap W_b$ inside $V_a$. Then

$$0 = \mathbb{E}_{x \in V_a} h(x) = \mathbb{E}_{x \in V_a} H(\Gamma_1 x, \Gamma_2 x) = \sum_b P(b) H(a, b).$$

By Corollary 2.15, we can write $P(b) = p^{-d_2} + \epsilon(b)$, with $|\epsilon(b)| \leq p^{d_1 - r/2}$ for every $b$. Therefore, the right-hand side differs from $\mathbb{E}_b H(a, b)$ by at most $2 p^{d_1 + d_2 - r/2}$, which implies that $|\mathbb{E}_b H(a, b)| \leq 2 p^{d_1 + d_2 - r/2}$.

Now

$$\mathbb{E}_x \prod_{i=1}^m h(L_i(x)) = \mathbb{E}_x \prod_{i=1}^m H(\Gamma_1(L_i(x)), \Gamma_2(L_i(x))) = \sum_{a \in Z} \sum_b P(a, b) \prod_{i=1}^m H(a_i, b_i).$$

Let us split up this sum as

$$p^{-d_0 d_1 - m d_2} \sum_{a \in Z} \sum_b \prod_{i=1}^m H(a_i, b_i) \;+\; \sum_{a \in Z} \sum_b \epsilon(a, b) \prod_{i=1}^m H(a_i, b_i).$$

The first term equals $\mathbb{E}_{a \in Z} \prod_{i=1}^m \bigl(\mathbb{E}_b H(a_i, b)\bigr)$, which is at most $(2 p^{d_1 + d_2 - r/2})^m$. The second is at most $p^{d_0 d_1 + m d_2} \cdot 2^m p^{d_1 - d_0 d_1 - r/2} = 2^m p^{d_1 + m d_2 - r/2}$. Therefore, the whole sum is at most $2^{m+1} p^{m(d_1 + d_2) - r/2}$. Together with our estimate for the terms that involved $g$, this proves the lemma.

We have almost finished the proof of our main result.

Theorem 2.19. For every $\epsilon > 0$ there exists $c > 0$ with the following property. Let $f : \mathbb{F}_p^n \to [-1,1]$ be a $c$-uniform function. Let $L = (L_1, \ldots, L_m)$ be a square-independent system of linear forms in $d$ variables, with Cauchy-Schwarz complexity at most 2. Then

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m f(L_i(x)) \leq \epsilon.$$

Proof. Let $\delta > 0$ be a constant to be chosen later. Let $C$ be such that $2^{m+1} p^{-C/2} \leq \epsilon/3$ and let $r$ be the function $d \mapsto 2md + C$. Then according to the structure theorem (Theorem 2.11) there exists $d_0$, depending on $r$ and $\delta$ only, and a quadratic factor $(\mathcal{B}_1, \mathcal{B}_2)$ of rank at least $2m(d_1 + d_2) + C$ and complexity $(d_1, d_2)$, with $d_1$ and $d_2$ both at most $d_0$, such that we can write $f = f_1 + f_2 + f_3$, with $f_1 = \mathbb{E}(f \mid \mathcal{B}_2)$, $\|f_2\|_2 \leq \delta$ and $\|f_3\|_{U^3} \leq \delta$.

Let us show that the sum does not change much if we replace $f(L_m(x))$ by $f_1(L_m(x))$. The difference is what we get if we replace $f(L_m(x))$ by $f_2(L_m(x)) + f_3(L_m(x))$. Now $\|f_2\|_1 \leq \|f_2\|_2$ and $\|f\|_\infty \leq 1$, so the contribution from the $f_2$ part is at most $\delta$. As for the $f_3$ part, since $\|f_3\|_{U^3} \leq \delta$ and $\|f\|_\infty \leq 1$, Theorem 2.4 tells us that the contribution is at most $\delta$. Therefore, the total difference is at most $\delta + \delta = 2\delta$.

Now let us replace $f$ by $f_1$ in the penultimate bracket. The same argument works, since $\|f_1\|_\infty \leq 1$. Indeed, we can carry on with this process, replacing every single $f$ by $f_1$, and the difference we make will be at most $2m\delta$.
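The successive replacements just described can be written out as a telescoping identity (a routine reconstruction of the step in question):

```latex
\prod_{i=1}^m f(L_i(x)) \;-\; \prod_{i=1}^m f_1(L_i(x))
  \;=\; \sum_{j=1}^{m} \Bigl(\prod_{i<j} f_1(L_i(x))\Bigr)\,
        (f_2 + f_3)(L_j(x))\,
        \Bigl(\prod_{i>j} f(L_i(x))\Bigr).
```

Taking expectations, the $j$-th summand is at most $2\delta$ in absolute value by the argument above, since every factor other than $(f_2 + f_3)(L_j(x))$ is bounded by 1.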

We are left needing to show that the product with every $f$ replaced by $f_1$ is small. This is what Lemma 2.18 tells us. It gives us an upper bound of $4^m c\, p^{d_1/4} + 2^{m+1} p^{m(d_1 + d_2) - r/2}$, where for $r$ we can take $2m(d_1 + d_2) + C$. Therefore, the upper bound is at most $4^m c\, p^{d_0/4} + 2^{m+1} p^{-C/2}$, which, by our choice of $C$, is at most $4^m c\, p^{d_0/4} + \epsilon/3$.

To finish, let $\delta = \epsilon/6m$. This determines the value of $d_0$, and we can then set $c$ to be $4^{-m} p^{-d_0/4} \epsilon/3$, which will be a function of $\epsilon$ only.

Because of our use of Theorem 2.11, the bounds in the above result and in the corollary that we are about to draw from it are both very weak. However, we have been explicit about all the bounds apart from d0, partly in order to make it clear how the parameters depend on each other and partly to demonstrate that our weak bound derives just from the weakness of d0 in the structure theorem.


Corollary 2.20. For every $\epsilon > 0$ there exists $c > 0$ with the following property. Let $A$ be a $c$-uniform subset of $\mathbb{F}_p^n$ of density $\alpha$. Let $L = (L_1, \ldots, L_m)$ be a square-independent system of linear forms in $d$ variables, with Cauchy-Schwarz complexity at most 2. Let $x = (x_1, \ldots, x_d)$ be a random element of $(\mathbb{F}_p^n)^d$. Then the probability that $L_i(x) \in A$ for every $i$ differs from $\alpha^m$ by at most $\epsilon$.

Proof. We shall choose as our $c$ the $c$ that is given by the previous theorem when $\epsilon$ is replaced by $\epsilon/2^m$. Our assumption is then that we can write $A = \alpha + f$ for a $c$-uniform function $f$. The probability we are interested in is

$$\mathbb{E}_{x \in (\mathbb{F}_p^n)^d} \prod_{i=1}^m A(L_i(x)),$$

which we can split into $2^m$ parts, obtained by replacing each occurrence of $A$ either by $\alpha$ or by $f$.

For each part that involves at least one occurrence of $f$, we have a power of $\alpha$ multiplied by a product over some subsystem of $L$. This subsystem will also be square-independent and have Cauchy-Schwarz complexity at most 2, and the number of linear forms will not have increased. Therefore, the previous theorem and our choice of $c$ tell us that the contribution it makes is at most $\epsilon/2^m$, so the contribution from all such parts is at most $\epsilon$. The only remaining part is the one where every $A(L_i(x))$ has been replaced by $\alpha$, and that gives us the main term $\alpha^m$.
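The corollary can be illustrated numerically for 4-term progressions, which form a square-independent system of Cauchy-Schwarz complexity 2 when $p \geq 5$. A Monte Carlo sketch (all parameters are illustrative; a random set is $c$-uniform with high probability, so its 4-AP density should be close to $\alpha^4$):

```python
import random
from itertools import product

p, n, trials = 5, 4, 50000
points = list(product(range(p), repeat=n))  # F_5^4

random.seed(1)
A = {x for x in points if random.random() < 0.5}  # random set, density ~ 1/2
alpha = len(A) / len(points)

def form(x, y, i):
    """L_i(x, y) = x + (i - 1) y, computed coordinatewise mod p."""
    return tuple((a + (i - 1) * b) % p for a, b in zip(x, y))

hits = 0
for _ in range(trials):
    x = random.choice(points)
    y = random.choice(points)
    if all(form(x, y, i) in A for i in range(1, 5)):
        hits += 1

density = hits / trials
# For a quasirandom A this is close to alpha^4 (about 0.0625 here)
print(round(density, 3), round(alpha ** 4, 3))
```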