
4.3 From the Model Case to $\mathbb{Z}_N$

4.3.1 Estimating the Size of the Niveau Set

The following lower bound on the cardinality of the niveau set $A$ is proved in [Ruz91]. It is the analogue of Lemma 4.3 in the case $G = \mathbb{Z}_N$.

Proposition 4.11. Let $\varepsilon \le 1/4$ and suppose $k \ll \log N/\log\log N$. Then the set $A$ with parameters $\varepsilon$ and $k$ as defined above has cardinality at least $N/3$.

For the sake of clarity and self-containment, and because we want to use a very similar argument later on, we give a concise exposition of Ruzsa's proof in this section. We shall proceed in two steps. First, we compare the character sum appearing in the definition of $A$ to a sum of independent random variables distributed uniformly on the unit circle. Second, we approximate this sum of independent random variables by a normal distribution, which allows us to perform explicit computations.

A crucial tool in proving the first step is the following theorem in probability theory, which is known as Esseen's Inequality. It dates back to Esseen [Ess45] and independently Berry [Ber41], but see Shiryayev [Shi84] for a general introductory reference.

Theorem 4.12. Let $F_1, F_2$ be probability distribution functions with corresponding characteristic functions $\varphi_1, \varphi_2$. Assume $F_1'$ exists and is pointwise bounded by a constant $V$. Then
\[
\sup_x |F_1(x) - F_2(x)| \ll \frac{V}{T} + \int_0^T \frac{|\varphi_1(t) - \varphi_2(t)|}{t} \, dt .
\]

We briefly recall that the characteristic function $\varphi_X$ of a random variable $X$ is defined to be $\varphi_X(t) := \mathbb{E}\exp(itX)$, and that therefore the probability density function of a random variable is the inverse Fourier transform of its characteristic function. From now on we shall be using the notation $a \ll b$ to indicate that there exists an absolute constant $c$ such that $a \le cb$.

A special case of Theorem 4.12, also known as the Berry-Esseen Inequality, will help us complete the second step. It bounds the uniform distance between the distribution function of a sum of independent identically distributed random variables and the normal distribution; in other words, it gives us information about the rate of convergence in the Central Limit Theorem. More precisely, let $X_1, X_2, \dots, X_k$ be independent random variables, each distributed uniformly on the unit circle, and define their sum to be
\[
X := \sum_{j=1}^{k} X_j \quad \text{with real part} \quad \widetilde{X} := \Re X .
\]
Let $\sigma := \sqrt{k/2}$ denote the standard deviation of $\widetilde{X}$. The following formulation of the Berry-Esseen Inequality is taken from page 374 of [Shi84].

Theorem 4.13. Let $\widetilde{X}$ be defined as above, and let $\Phi$ denote the standard normal distribution function. Then
\[
\sup_x \bigl| F_{\widetilde{X}/\sigma}(x) - \Phi(x) \bigr| \ll \frac{\mathbb{E}|\widetilde{X}|^3}{\sigma^4} ,
\]
provided that the third absolute moment $\mathbb{E}|\widetilde{X}|^3$ is finite.
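To make this setup concrete, the following sketch (a numerical illustration only, not part of Ruzsa's argument; the values of $k$ and the sample size are arbitrary) simulates $\widetilde{X}$, checks that its standard deviation is close to $\sigma = \sqrt{k/2}$, and estimates the uniform distance to the standard normal distribution, in line with Theorem 4.13.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
k, trials = 50, 200_000                      # illustrative parameters, chosen arbitrarily

# X_j = exp(i*theta_j) with theta_j uniform on [0, 2*pi); X_tilde is the real part of their sum
theta = rng.uniform(0.0, 2.0 * np.pi, size=(trials, k))
X_tilde = np.cos(theta).sum(axis=1)

sigma = sqrt(k / 2.0)
print(X_tilde.std(), sigma)                  # empirical vs. theoretical standard deviation

# empirical uniform (Kolmogorov) distance between X_tilde/sigma and the standard normal
xs = np.sort(X_tilde / sigma)
F_emp = np.arange(1, trials + 1) / trials
Phi = 0.5 * (1.0 + np.vectorize(erf)(xs / sqrt(2.0)))
print(np.max(np.abs(F_emp - Phi)))           # small, consistent with the normal approximation
```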

In order to estimate the difference between two characteristic functions effectively using Theorem 4.12, we need to consider the moments of the corresponding random variables. Given a random variable $\widetilde{X}$ as defined above, we can express its $l$th moment $\widetilde{\mu}_l := \mathbb{E}\widetilde{X}^l$ as
\[
\widetilde{\mu}_l = \frac{1}{2^l} \sum_{i=0}^{l} \binom{l}{i} \widetilde{\mu}_{i,l-i}
\]
by writing $\widetilde{\mu}_{i,j} := \mathbb{E}\, X^i \overline{X}^j$.
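For instance, since $\widetilde{\mu}_{1,1} = \mathbb{E}|X|^2 = k$ while $\widetilde{\mu}_{2,0} = \widetilde{\mu}_{0,2} = 0$, the case $l = 2$ of this expansion gives $\widetilde{\mu}_2 = \frac{1}{4}(0 + 2k + 0) = k/2$, in accordance with $\sigma = \sqrt{k/2}$.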

We set up analogous expressions for the character sum defining $A$ by writing
\[
f(x) := \sum_{j=1}^{k} \gamma_j(x) \quad \text{with real part} \quad \widetilde{f}(x) := \Re f(x) \quad \text{and $l$th moment} \quad \widetilde{\nu}_l := \frac{1}{N} \sum_{x=1}^{N} \widetilde{f}(x)^l .
\]

The $l$th moment of $\widetilde{f}$ can likewise be expanded as
\[
\widetilde{\nu}_l = \frac{1}{2^l} \sum_{i=0}^{l} \binom{l}{i} \widetilde{\nu}_{i,l-i}
\]
upon setting $\widetilde{\nu}_{i,j} := \frac{1}{N} \sum_{x=1}^{N} f(x)^i \overline{f(x)}^j$.

Let $F_{\widetilde{X}}$, $F_{\widetilde{f}}$ denote the obvious distribution functions, and write $\varphi_{\widetilde{X}}$, $\varphi_{\widetilde{f}}$ for the corresponding characteristic functions.

We are interested in the distribution of $\widetilde{f}$. More precisely, in order to estimate the size of $A$ we want to count the number of elements $x \in \mathbb{Z}_N$ such that $\widetilde{f}(x) \ge \varepsilon\sqrt{k}$. This means that $1 - F_{\widetilde{f}}(\varepsilon\sqrt{k})$ is the quantity we are ultimately interested in.
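To illustrate these definitions, the sketch below computes $\widetilde{f}$ and the resulting niveau set for a small example. The modulus $N$, the parameters $k$ and $\varepsilon$, and the frequencies $r_j$ are hypothetical choices made only for the sake of the illustration (random frequencies rather than the $K$-independent set used in the actual construction).

```python
import numpy as np

# Hypothetical example parameters; the r_j are random stand-ins, not a K-independent set
N, k, eps = 10_007, 8, 0.25
rng = np.random.default_rng(1)
r = rng.choice(np.arange(1, N), size=k, replace=False)

x = np.arange(N)
# f(x) = sum_j e(r_j x / N) with e(t) = exp(2*pi*i*t); f_tilde is its real part
f = np.exp(2j * np.pi * np.outer(x, r) / N).sum(axis=1)
f_tilde = f.real

A = x[f_tilde >= eps * np.sqrt(k)]        # A = {x in Z_N : f_tilde(x) >= eps * sqrt(k)}
print(len(A) / N)                         # density of A; Proposition 4.11 predicts at least 1/3
```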

Our first lemma shows that $K$-independence guarantees that the lower moments of $\widetilde{f}$ and $\widetilde{X}$ are equal.

Lemma 4.14. With the moments $\widetilde{\mu}_l$ and $\widetilde{\nu}_l$ defined as above and the characters $\gamma_1, \gamma_2, \dots, \gamma_k$ assumed to be $K$-independent, we have $\widetilde{\nu}_l = \widetilde{\mu}_l$ for all $l = 1, 2, \dots, K$.

Proof. Under the assumption of $K$-independence, it is not too difficult to compute the mixed moments explicitly. Indeed, we can rewrite $\widetilde{\nu}_{i,j}$ as
\[
\frac{1}{N} \sum_{x=1}^{N} \left( \sum_{m=1}^{k} \gamma_m(x) \right)^{\!i} \, \overline{\left( \sum_{n=1}^{k} \gamma_n(x) \right)}^{\,j}
= \frac{1}{N} \sum_{\substack{m_1,\dots,m_i \\ n_1,\dots,n_j}} \sum_{x=1}^{N} e\bigl( (r_{m_1} + \dots + r_{m_i} - r_{n_1} - \dots - r_{n_j}) x/N \bigr) .
\]
Whenever $i + j \le K$, the latter sum equals zero by $K$-independence unless $m_1, \dots, m_i$ is a permutation of $n_1, \dots, n_j$, in which case it equals $N$. We compare this with
\[
\widetilde{\mu}_{i,j} = \mathbb{E} \left( \sum_{m=1}^{k} X_m \right)^{\!i} \, \overline{\left( \sum_{n=1}^{k} X_n \right)}^{\,j}
= \sum_{m_1,\dots,m_i=1}^{k} \; \sum_{n_1,\dots,n_j=1}^{k} \mathbb{E}\, X_{m_1} \cdots X_{m_i} \overline{X}_{n_1} \cdots \overline{X}_{n_j} .
\]
Again, since $X_i$ is independent of $X_j$ for $i \ne j$, the expectation is non-zero only when $m_1, \dots, m_i$ is a permutation of $n_1, \dots, n_j$, in which case it equals 1. Hence $\widetilde{\nu}_{i,j} = \widetilde{\mu}_{i,j}$ for all $i + j \le K$, and the result follows as stated.
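The mechanism behind this computation is easy to verify numerically: for a single $X$ uniform on the unit circle, $\mathbb{E}\, X^a \overline{X}^b$ vanishes unless $a = b$, and the corresponding exponential sums over $\mathbb{Z}_N$ vanish unless the frequency is $0 \bmod N$. The following sketch (with arbitrary small parameters) checks both facts.

```python
import numpy as np

rng = np.random.default_rng(2)

# E[X^a * conj(X)^b] for X uniform on the unit circle: 1 if a == b, and 0 otherwise
theta = rng.uniform(0.0, 2.0 * np.pi, size=500_000)
X = np.exp(1j * theta)
for a, b in [(2, 2), (3, 1), (1, 0)]:
    print(a, b, np.mean(X**a * np.conj(X)**b))         # close to 1 for (2, 2), close to 0 otherwise

# orthogonality over Z_N: (1/N) * sum_x e(r x / N) equals 1 if r = 0 mod N, and 0 otherwise
N = 101
x = np.arange(N)
for r in [0, 7, 55]:
    print(r, np.mean(np.exp(2j * np.pi * r * x / N)))   # 1 for r = 0, essentially 0 otherwise
```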

In order to usefully estimate the difference between the two characteristic functions we also need to infer a decent bound on the $l$th moment $\widetilde{\mu}_l$.

Lemma 4.15. For any even integer $l \le K$ and $\widetilde{\mu}_l$ defined as above, we have the upper bound
\[
\widetilde{\mu}_l \le \min\left\{ k^l, \; \frac{l!}{2^l (l/2)!} \, k^{l/2} \right\} .
\]

Proof. The first part of the bound is obvious, and the second follows from the fact that the only non-zero mixed moments $\widetilde{\mu}_{i,l-i}$ are those for which $i = l/2$, when they are of magnitude at most $(l/2)! \, k^{l/2}$.
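For small even values of $l$ the bound of Lemma 4.15 can also be confirmed by simulation; the sketch below estimates $\widetilde{\mu}_l$ by Monte Carlo for a few illustrative (and arbitrary) choices of $l$ and $k$ and prints the bound alongside it.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(3)
k, trials = 20, 400_000                   # illustrative parameters

theta = rng.uniform(0.0, 2.0 * np.pi, size=(trials, k))
X_tilde = np.cos(theta).sum(axis=1)

for l in (2, 4, 6):
    mu_l = np.mean(X_tilde**l)            # Monte Carlo estimate of the l-th moment
    bound = min(k**l, factorial(l) / (2**l * factorial(l // 2)) * k**(l // 2))
    # the estimate should not exceed the bound (for l = 2 equality holds, so sampling
    # noise may place the estimate on either side of it)
    print(l, mu_l, bound)
```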

We are now ready to carry out the first step of the argument, namely showing that $\widetilde{f}$ and $\widetilde{X}$ are close in distribution using Theorem 4.12.

Proposition 4.16. Under the same assumptions as before, $\widetilde{f}$ and $\widetilde{X}$ are close in distribution in the sense that
\[
\sup_x \bigl| F_{\widetilde{X}}(x) - F_{\widetilde{f}}(x) \bigr| \ll \min\left\{ \frac{1}{\sqrt{K}}, \; \frac{\sqrt{k}}{K} \right\} .
\]

Proof. In order to apply Esseen's Inequality, we first need to verify that $F_{\widetilde{X}}'$ exists and is bounded above by a suitable constant. As we have already mentioned, it is a well-known fact in probability theory that the density function of a random variable is the inverse Fourier transform of its characteristic function, hence
\[
F_{\widetilde{X}}'(x) \le \int_{-\infty}^{\infty} |\varphi_{\widetilde{X}}(t)| \, dt .
\]

We thus require the following bounds on the characteristic function $\varphi_{\widetilde{X}}$ of $\widetilde{X}$, which we state here without proof. The interested reader is referred to [Ruz91] for details.

Lemma 4.17. There exist constants $a, b > 0$ and $T_0 > 1$ such that $\varphi_{\widetilde{X}}$ satisfies
\[
|\varphi_{\widetilde{X}}(t)| \le
\begin{cases}
\exp(-akt^2) & \text{if } |t| \le T_0 \sigma , \\
(b|t|)^{-k/2} & \text{if } |t| > T_0 \sigma .
\end{cases}
\]
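Although we do not prove Lemma 4.17 here, its first bound can be illustrated numerically for small $|t|$: by independence $|\varphi_{\widetilde{X}}(t)|$ is the $k$th power of $|\mathbb{E}\exp(it\cos\theta)|$ for a single uniform angle $\theta$, and for small $|t|$ this power decays like a Gaussian in $t$. The sketch below (with arbitrary parameters, and the constant $1/4$ chosen purely for illustration) estimates $|\varphi_{\widetilde{X}}(t)|$ by Monte Carlo and prints $\exp(-kt^2/4)$ alongside it.

```python
import numpy as np

rng = np.random.default_rng(5)
k = 40                                                      # illustrative
theta = rng.uniform(0.0, 2.0 * np.pi, size=2_000_000)

for t in (0.5, 1.0, 1.5):
    phi_single = np.mean(np.exp(1j * t * np.cos(theta)))    # char. function of a single summand's real part
    phi_sum = abs(phi_single) ** k                           # |phi of X_tilde| by independence
    print(t, phi_sum, np.exp(-k * t**2 / 4))                 # Gaussian-type decay for small |t|
```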

It is immediate to deduce that $F_{\widetilde{X}}'(x)$ is bounded above by a constant times the standard deviation $\sigma$. Next we observe that by Taylor's Theorem with remainder we can write
\[
\varphi_{\widetilde{X}}(t) = \sum_{j=0}^{l-1} \frac{\widetilde{\mu}_j}{j!} (it)^j + \delta \, \widetilde{\mu}_l \, \frac{|t|^l}{l!} ,
\]

and similarly
\[
\varphi_{\widetilde{f}}(t) = \sum_{j=0}^{l-1} \frac{\widetilde{\nu}_j}{j!} (it)^j + \delta' \, \widetilde{\nu}_l \, \frac{|t|^l}{l!}
\]
for some $|\delta|, |\delta'| \le 1$. With the benefit of hindsight, this allows us to justify why we were so keen to compare moments in the first place. $K$-independence gave us through Lemma 4.14 that all moments $\widetilde{\mu}_j$ and $\widetilde{\nu}_j$ up to order $K$ were equal, and thus
\[
\bigl| \varphi_{\widetilde{X}}(t) - \varphi_{\widetilde{f}}(t) \bigr| \le 2 \widetilde{\mu}_K \frac{|t|^K}{K!} .
\]

It now follows from Theorem 4.12 that for any $T > 1$,
\[
\sup_x \bigl| F_{\widetilde{X}}(x) - F_{\widetilde{f}}(x) \bigr| \ll \frac{\sigma}{T} + \widetilde{\mu}_K \frac{T^K}{K! \, K} .
\]
Using the bound on $\widetilde{\mu}_K$ derived in Lemma 4.15 and setting $T = \sigma (K!/\widetilde{\mu}_K)^{1/(K+1)}$, followed by a short computation, concludes the proof of Proposition 4.16.

We have thus successfully approximated $\widetilde{f}$ by $\widetilde{X}$. It remains to compare a suitably normalized version of $\widetilde{X}$ to a standard normal random variable. The following proposition states that $\widetilde{X}$ is close to a normal distribution with mean 0 and standard deviation $\sigma$.

Proposition 4.18. Let $\widetilde{X}$ be defined as above, and let $\Phi$ denote the standard normal distribution function. Then
\[
\sup_x \bigl| F_{\widetilde{X}/\sigma}(x) - \Phi(x) \bigr| \ll \frac{1}{\sigma} .
\]

Proof. This is a straightforward application of Theorem 4.13. The third absolute moment $\mathbb{E}|\widetilde{X}|^3$ can be bounded by the Cauchy-Schwarz Inequality as
\[
\mathbb{E}|\widetilde{X}|^3 \le \bigl( \mathbb{E}|\widetilde{X}|^2 \bigr)^{1/2} \bigl( \mathbb{E}|\widetilde{X}|^4 \bigr)^{1/2} .
\]

Splitting $X_j$ into real and imaginary parts $X_j = R_j + iI_j$, we first observe that $\mathbb{E}R_j^2 = 1/2$ and $\mathbb{E}I_j^2 = 1/2$ as well as $\mathbb{E}R_j^4 = 3/8$. It is not hard to see that $X_i$ and $X_j$ are independent if and only if the pairs $(R_i, I_i)$ and $(R_j, I_j)$ are independent (but see page 273 of [Shi84] for a justification of this claim), which yields
\[
\mathbb{E}\widetilde{X}^2 = \mathbb{E}\sum_{j,l=1}^{k} R_j R_l = \sum_{j=1}^{k} \mathbb{E}R_j^2 + \sum_{\substack{j,l=1 \\ j \ne l}}^{k} \mathbb{E}R_j R_l = \frac{k}{2}
\]
and
\[
\mathbb{E}\widetilde{X}^4 = \sum_{j=1}^{k} \mathbb{E}R_j^4 + 3 \sum_{\substack{j,l=1 \\ j \ne l}}^{k} \mathbb{E}R_j^2 \, \mathbb{E}R_l^2 = \frac{3}{8}k + \frac{3}{4}k(k-1) .
\]
This implies that $\mathbb{E}|\widetilde{X}|^3 \ll \sigma^3$, and the result follows as claimed from Theorem 4.13.
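These moment computations can be sanity-checked by simulation; the sketch below (with an arbitrary choice of $k$ and sample size) estimates $\mathbb{E}\widetilde{X}^2$, $\mathbb{E}\widetilde{X}^4$ and $\mathbb{E}|\widetilde{X}|^3$ and prints the corresponding theoretical quantities alongside.

```python
import numpy as np

rng = np.random.default_rng(4)
k, trials = 30, 500_000                      # illustrative parameters

theta = rng.uniform(0.0, 2.0 * np.pi, size=(trials, k))
X_tilde = np.cos(theta).sum(axis=1)
sigma = np.sqrt(k / 2.0)

print(np.mean(X_tilde**2), k / 2.0)                              # second moment
print(np.mean(X_tilde**4), 3 * k / 8 + 3 * k * (k - 1) / 4)      # fourth moment
print(np.mean(np.abs(X_tilde)**3), sigma**3)                     # third absolute moment, O(sigma^3)
```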

We remark that in fact Ruzsa [Ruz91] proves the slightly stronger error term of $\sigma^{-2}$, but we shall not need this here. Proposition 4.18 completes the second step of the argument, so we are now in a position to estimate the size of the niveau set $A$.

Proof of Proposition 4.11. Bearing in mind that by definition of the distribution function $F_{\widetilde{X}/\sigma}(x) = F_{\widetilde{X}}(\sigma x)$, we deduce from Propositions 4.16 and 4.18 the existence of two constants $c$ and $c'$ such that
\[
F_{\widetilde{f}}(\varepsilon\sqrt{k}) \le F_{\widetilde{X}}(\varepsilon\sqrt{k}) + c \min\left\{ \frac{1}{\sqrt{K}}, \frac{\sqrt{k}}{K} \right\}
\le \Phi(\varepsilon\sqrt{2}) + c \min\left\{ \frac{1}{\sqrt{K}}, \frac{\sqrt{k}}{K} \right\} + c' \frac{1}{\sqrt{k}} .
\]

It is easy to compute that for $\varepsilon \le 1/4$, the value of the standard normal distribution function $\Phi$ at $\varepsilon\sqrt{2}$ is bounded above by $2/3$, so that the size of the set $A$ is at least $N/3$. In fact, the density can be made arbitrarily close to $1/2$ by choosing $\varepsilon$ small enough. We also need to ensure that the error term $\sqrt{k}/K$ tends to 0 as $N$ tends to infinity, and that $K$ satisfies $K \ll N^{1/k}$. We therefore require that $k$ grow at most like a constant times $\log N/\log\log N$. This proves Proposition 4.11 for $N$ sufficiently large.
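The numerical fact used in this last step, namely that $\Phi(\varepsilon\sqrt{2}) \le 2/3$ whenever $\varepsilon \le 1/4$, can be verified directly, for instance as follows.

```python
from math import erf, sqrt

def Phi(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

eps = 0.25
print(Phi(eps * sqrt(2.0)))   # approximately 0.638 <= 2/3, so the density of A is at least 1/3
```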