Speeding Up Sum-Check Proving (Bagad et al.)
TLDR: Recaps existing proving algorithms, introduces new algorithm to (1) trade off multiplications between large field elements (expensive) for multiplications between small field elements cheap, and (2) optimize for the common case where has as one of its factors, as common in Spartan and Jolt.

Algorithm 1: Linear time & space prover
Proves for an -variate polynomial which is a product of multilinear polynomials:
The degree- univariate polynomials sent to the verifier as part of the Multivariate Sum-Check Protocol are represented by their evaluations on a fixed set of .
Their definition of
They define and . For a polynomial , is defined as the highest degree coefficient of . Lemma 2.2 shows how to do Lagrange interpolation in this setting. I’m not sure what’s the advantage over using e.g. .
Notes:
- The algorithm starts out by storing the polynomials (for ) in evaluation form in arrays of size each.
- Step 4 computes (of size ) from (of size ) to maintain the stated invariant, binding one variable to the verifier challenge. The update uses the fact that for any multilinear polynomial , we have . This allows us to bind the variable to any value.
- Step 1 can be broken down as follows:
- The prover message is a univariate polynomial , whose representation is computed by evaluating for all
- is computed as
- is computed as
- is computed from and (both stored in ) using the same fact used in in step 4
Notes on performance:
- In each round, the main memory usage is size- field elements. Note that in practice, these field elements are usually small in the first round and large after (because they depend on a random challenge).
- The number of multiplications halves in each round, but from round 2, the multiplications are between large field elements.
Algorithm 2: Quasilinear time & square-root space
The core idea of this algorithm is that the terms (where , , ) can also be computed directly from the evaluation form of :
Notes:
- The evaluation form of polynomials can be accessed sequentially. This enables a streaming prover: Witness generation is re-run for each of the first sum-check rounds; the prover never has to store the entire witness.
- The terms can be computed in linear time. The time and space needed is , i.e., at round , this array will have elements.
- After rounds, the prover switches to Algorithm 1. At this point, the arrays contain elements.
Notes on performance:
- Memory usage is
- The inner product is between a large field element (the term) and a small field element (the )
- The outer product is between large field elements (after round 1)