The large sieve and the Bombieri-Vinogradov theorem

1. Motivation

Large sieve: A Philosophy Reflecting a Large Group of Inequalities

The large sieve is a philosophy reflected in a large family of inequalities that are very effective in controlling certain linear or square sums of correlations of arithmetic functions. The idea can be traced back to harmonic analysis, and it relies almost entirely on almost orthogonality.

One fundamental example is the estimation of the quantity:

$$
\left|\sum_{n\leq x}\Lambda(n)\overline{\chi(n)}\right|
$$

One naive idea to control this quantity is to use the Cauchy-Schwarz inequality. However, applied blindly, Cauchy-Schwarz gives something even worse than the trivial estimate. Indeed, by the triangle inequality and the Chebyshev bound $\sum_{n\leq x}\Lambda(n)=\psi(x)\ll x$, we obtain the trivial bound:

$$
\left|\sum_{n\leq x}\Lambda(n)\overline{\chi(n)}\right|\leq \sum_{n\leq x}\Lambda(n)\ll x
$$

But applying Cauchy-Schwarz blindly, and using $\sum_{n\leq x}|\Lambda(n)|^2\leq \log x\sum_{n\leq x}\Lambda(n)\ll x\log x$, we only get the following weaker bound:

$$
\left|\sum_{n\leq x}\Lambda(n)\overline{\chi(n)}\right|\leq \left(\left(\sum_{n\leq x}|\Lambda(n)|^2\right)\left(\sum_{n\leq x}|\chi(n)|^2\right)\right)^{\frac{1}{2}}\ll x\log^{\frac{1}{2}}x
$$
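To get a feel for the sizes involved, here is a small numerical sanity check; the specific choices ($\chi$ the Legendre symbol mod $7$, cutoff $x=10^5$) are ours and purely illustrative.

```python
# Sketch of a numerical sanity check (illustrative choices only:
# chi = Legendre symbol mod 7, cutoff x = 10**5).
import math

def von_mangoldt_table(x):
    """lam[n] = Lambda(n) for 0 <= n <= x, via a simple sieve."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(2 * p, x + 1, p):
                is_prime[m] = False
            q = p
            while q <= x:              # Lambda(p^k) = log p
                lam[q] = math.log(p)
                q *= p
    return lam

def chi(n):
    """Legendre symbol mod 7: 1 on residues 1,2,4; -1 on 3,5,6; 0 on multiples of 7."""
    r = n % 7
    return 0 if r == 0 else (1 if r in (1, 2, 4) else -1)

x = 10**5
lam = von_mangoldt_table(x)
S = sum(lam[n] * chi(n) for n in range(1, x + 1))     # chi is real, so no conjugation needed
trivial = sum(lam)                                    # psi(x), of size about x
cauchy = math.sqrt(sum(v * v for v in lam) * sum(chi(n) ** 2 for n in range(1, x + 1)))
print(f"|S| = {abs(S):.1f}, trivial bound ~ {trivial:.1f}, Cauchy-Schwarz bound ~ {cauchy:.1f}")
```

For such parameters $|S|$ comes out far smaller than either bound, which is exactly the cancellation that both estimates fail to exploit.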

But this does not mean Cauchy-Schwarz is useless for controlling this quantity. If we look carefully at the inequality and ask why the bound gets worse, we notice that every successful application of Cauchy-Schwarz exhibits two phenomena: first, it lowers the complexity of the quantity we wish to bound, and second, it loses almost nothing. So the strategy is to reformulate the quantity so that the complexity drops while we stay close to the equality case of Cauchy-Schwarz. For example, we have the following identity:

$$
\left|\sum_{n\leq x}\Lambda(n)\overline{\chi(n)}\right|=\left(\sum_{n\leq x}\Lambda(n)\overline{\chi(n)}\,\overline{\sum_{m\leq x}\Lambda(m)\overline{\chi(m)}}\right)^{\frac{1}{2}}=\left(\sum_{a,b\in \mathbb F_p^{\times}}\overline{\chi(a)}\chi(b)\sum_{\substack{n,m\leq x\\ n\equiv a,\ m\equiv b\ (\mathrm{mod}\ p)}}\Lambda(n)\Lambda(m)\right)^{\frac{1}{2}}
$$

So we can understand this quantity as measuring the variation of the primes among the arithmetic progressions $\{pn+b \mid b\in\{1,2,\dots,p-1\}\}$. But this is still difficult to estimate, mainly because we would need to control the variation of the convolution of $\Lambda$ with itself over the residue classes $\mathbb F_p^{\times}\simeq \{pn+b \mid b\in\{1,2,\dots,p-1\}\}$.
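The identity above is easy to confirm numerically. The following sketch (again with the illustrative choices $p=7$, $\chi$ the Legendre symbol mod $7$, and $x=2000$) groups $\Lambda$ by residue class modulo $p$ and compares the two sides.

```python
# Sketch of a numerical check of the residue-class identity above
# (illustrative choices: p = 7, chi = Legendre symbol mod 7, x = 2000).
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k is a prime power, 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, int(math.isqrt(n)) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)                      # n itself is prime

p, x = 7, 2000
chi = {r: (0 if r == 0 else (1 if r in (1, 2, 4) else -1)) for r in range(p)}

# left-hand side: |sum_{n<=x} Lambda(n) * conj(chi(n))|  (chi is real here)
lhs = abs(sum(von_mangoldt(n) * chi[n % p] for n in range(1, x + 1)))

# T[a] = sum of Lambda(n) over n <= x with n = a (mod p)
T = [0.0] * p
for n in range(1, x + 1):
    T[n % p] += von_mangoldt(n)

# right-hand side: sqrt of the double sum over a, b in F_p^*, which collapses to T-values
rhs = math.sqrt(sum(chi[a] * chi[b] * T[a] * T[b]
                    for a in range(1, p) for b in range(1, p)))
print(lhs, rhs)                             # the two sides agree up to rounding error
```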

Now we change our perspective and recall a variant of the Cauchy-Schwarz inequality, called Bessel's inequality:

Bessel inequality

Let $g_1,\dots,g_J: \mathbb{N} \rightarrow \mathbb{C}$ be finitely supported functions obeying the orthonormality relationship,

$$\displaystyle \sum_n g_j(n) \overline{g_{j'}(n)} = 1_{j=j'}$$

for all $1 \leq j,j' \leq J$. Then for any function $f: \mathbb{N} \rightarrow \mathbb{C}$, we have

$$\displaystyle \left(\sum_{j=1}^J \left|\sum_{n} f(n) \overline{g_j(n)}\right|^2\right)^{1/2} \leq \left(\sum_n |f(n)|^2\right)^{1/2}.$$

Pf: The proof is not difficult; we just need to keep the orthogonal picture in mind. Regard $\{g_{j}\}_{1\leq j\leq J}$ as an orthonormal system in $\ell^2(\mathbb{N})$; the inequality then follows by comparing $\|f\|_{\ell^2}$ with the norm of the orthogonal projection of $f$ onto the span of the $g_j$.
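As a quick sanity check of the statement itself, one can test the inequality on a concrete orthonormal system; the sketch below uses the normalized additive characters $g_j(n)=e^{2\pi i jn/N}/\sqrt N$ on $\{1,\dots,N\}$ and a random $f$, with $N=64$ and $J=16$ chosen arbitrarily.

```python
# Sketch: test Bessel's inequality with g_j(n) = exp(2*pi*i*j*n/N)/sqrt(N),
# n = 1..N, j = 1..J (J <= N), which form an orthonormal system in l^2.
import cmath, math, random

N, J = 64, 16                                   # illustrative sizes
def g(j, n):
    return cmath.exp(2 * math.pi * 1j * j * n / N) / math.sqrt(N)

random.seed(0)
f = {n: complex(random.gauss(0, 1), random.gauss(0, 1)) for n in range(1, N + 1)}

lhs = math.sqrt(sum(abs(sum(f[n] * g(j, n).conjugate() for n in range(1, N + 1))) ** 2
                    for j in range(1, J + 1)))
rhs = math.sqrt(sum(abs(v) ** 2 for v in f.values()))
print(lhs <= rhs + 1e-9, round(lhs, 4), round(rhs, 4))   # Bessel: lhs <= rhs
```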

With this inequality in mind, and by the standard argument that passes from the orthogonal to the almost orthogonal setting (explained in the previous note), we can expect the following almost orthogonal variant of the Bessel inequality to hold:

Generalised Bessel inequality

Let $g_1,\dots,g_J: \mathbb{N} \rightarrow \mathbb{C}$ be finitely supported functions, and let $u: \mathbb{N} \rightarrow \mathbb{R}^+$ be a non-negative function. Let $f: \mathbb{N} \rightarrow \mathbb{C}$ be a function that vanishes whenever $u$ vanishes. Then we have

$$\displaystyle \left(\sum_{j=1}^J \left|\sum_{n} f(n) \overline{g_j(n)}\right|^2\right)^{1/2} \leq \left(\sum_n \frac{|f(n)|^2}{u(n)}\right)^{1/2} \times \left( \sum_{j=1}^J \sum_{j'=1}^J c_j \overline{c_{j'}} \sum_n u(n) g_j(n) \overline{g_{j'}(n)} \right)^{1/2}$$

for some sequence $c_1,\dots,c_J$ of complex numbers with $\sum_{j=1}^J |c_j|^2 = 1$, with the convention that $\frac{|f(n)|^2}{u(n)}$ vanishes whenever $f(n), u(n)$ both vanish.
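
One standard way to see why such an estimate should hold is by duality. Writing $a_j := \sum_n f(n)\overline{g_j(n)}$ (notation introduced here just for this sketch) and, assuming the $a_j$ are not all zero, taking $c_j := a_j/\big(\sum_{j'}|a_{j'}|^2\big)^{1/2}$, we have

$$\left(\sum_{j=1}^J |a_j|^2\right)^{1/2} = \sum_{j=1}^J \overline{c_j}\, a_j = \sum_n f(n)\,\overline{\sum_{j=1}^J c_j g_j(n)}.$$

Splitting $f(n) = \frac{f(n)}{u(n)^{1/2}}\cdot u(n)^{1/2}$ on the support of $u$ and applying Cauchy-Schwarz in $n$ then gives

$$\sum_n f(n)\,\overline{\sum_{j=1}^J c_j g_j(n)} \leq \left(\sum_n \frac{|f(n)|^2}{u(n)}\right)^{1/2}\left(\sum_n u(n)\Big|\sum_{j=1}^J c_j g_j(n)\Big|^2\right)^{1/2},$$

and expanding the square in the last factor yields exactly the double sum over $j,j'$ on the right-hand side.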