Author: JustinThaler Version: July 18, 2023 ProofsArgsAndZK_annotated.pdf
Chapter 1: Introduction
- Definitions:
- Verifiable Computing (VC) and Interactive Proofs (IP): enable a prover to provide a guarantee to a verifier that the prover performed a requested computation correctly
 - Arguments of Knowledge: Allows for false proofs if they require exorbitant computational power to find
 - Zero Knowledge: Proof reveals nothing but the validity of the statement
 
 
Chapter 2: The Power of Randomness: Fingerprinting and Freivalds’ Algorithm
- Reed-Solomon Fingerprinting:
- Setting: Alice and Bob want to check whether they have the same file
 - Protocol: Both parties compute a fingerprint and check whether it is the identical
 - Reed-Solomon Encoding: For some prime , interpret the file as messages in and consider the family of hash functions: for
 - → Collision resistant because to unequal polynomials of degree can only agree on points
 
 - Freivald’s algorithm:
- Setting: Given two matrices and of size , compute . The fastest known algorithm is , but it is possible to verify that is computed correctly in time
 - Let , compute and , accept if they are equal
 - → Each entry in the the vectors are Reed-Solomon fingerprints of a row of and , respectively
 
 - Univariate Lagrange interpolation: See Low Degree and Multilinear Extensions
 - Evaluating : All values  can be evaluated in  operations in total, by starting with  and computing  from  (which has almost the same terms) in constant time
- → Total time to evaluate is
 
 
Chapter 3: Definitions and Technical Preliminaries
- Interactive Proof System (IP):
- Probabilistic verifier and deterministic prover which are “next-message passing algorithms”
 - Both parties are given ; at the beginning, the prover provides a value claimed to equal
 - is called the transcript
 - is the deterministic output of given its randomness , input and prover strategy
 - An IP is valid if it is sound and complete
 
 - Argument systems: An IP for which soundness is only required to hold against prover strategies that run in polynomial time
 - Schwartz-Zippel Lemma
 - Low Degree and Multilinear Extensions
 
Chapter 4: Interactive Proofs
- Multivariate Sum-Check Protocol
 - Application of the Sum-Check protocol to SAT:
- Boolean formula: A boolean circuit with fan-out of 1 for all non-output gates
 - Goal: Given a boolean formula , compute , i.e., the number of inputs for which outputs
 - → Fastest known algorithm needs exponential time!
 - Approach: Turn into an arithmetic circuit over by replacing gates by , by , etc.; then apply the sum-check protocol to this multivariate polynomial
 
 - A simple IP for Counting Triangles in Graphs:
- Given an adjacency matrix for a given graph, the number of triangles can be computed as
 - For any given , there is a function that takes the binary representations of row and column indices and outputs the entry of the matrix at that position
 - Therefore,
 - → Can apply the sum-check protocol to
 
 - Reducing Multiple Evaluations to One
 - Super-Efficient IP for Matrix Multiplication
 - GKR
 
Chapter 5: Publicly Verifiable, Non-interactive Arguments via Fiat-Shamir
- Random Oracle Model (ROM): Prover and verifier have access to some random function
- Does not exist in practice: A random function would have to be represented by a function table which would be huge for cryptographic security levels
 - In practice, no deployed system has been broken because of this
 
 - Fiat-Shamir Transformation: Verifier message in round  is the evaluation of a cryptographic hash function quere the query point is the list of messages sent by the prover in rounds 
- Secure in the ROM on constant-round public coin IPs where the prover has to run in polynomial time
 - Hash chaining: To save on the cost of hashing, one can also use
 - If the prover can choose (e.g. the input to the circuit proved by the GKR protocol) freely, it is essential that is part of the transcript
 
 - Correlation-intractability (CI): A hash family satisfies CI on some property if it is computationally infeasible to find a pair  that satisfies the property
- There are hash families for various classes of IPs & arguments, with the CI property holding under standard cryptographic assumptions, but they are very expensive and hence impractical
 
 
Chapter 6: Front Ends: Turning Computer Programs Into Circuits
- Front end: Program -to-circuit compiler
 - Back end: The argument system
 - Goal: turn arbitrary programs running on a Random Access Machine (RAM) (memory + registers + instructions + PC) in steps into arithmetic circuits
 - Preliminary techniques:
- Evaluate step-by-step: In each layer, list the entire machine configuration (e.g. representing each memory cell as 64 binary field elements) and repeatedly apply a sub-circuit that performs one computations step
- Unusable in the context of GKR, because the number of layers (and hence verifier time) is
 - Massive circuit, because each memory cell is repeated in each step
 
 - Turning small-space programs into shallow circuits: If the program uses little space the configuration graph is small and can be represented using an adjacency matrix. Running the machine for steps is equivalent to squaring the matrix times.
 
 - Evaluate step-by-step: In each layer, list the entire machine configuration (e.g. representing each memory cell as 64 binary field elements) and repeatedly apply a sub-circuit that performs one computations step
 - Circuit satisfiability problem: There exists a witness  such that 
- This is in contrast to the circuit evaluation problem where we show that
 
 - Transcript or trace: The list of changes to the machine configuration in each time step, of size 
- If a memory operation happened, contains a memory location & value
 
 - Transformation from computer programs to arithmetic circuit satisfiability: The circuit takes the transcript as witness and checks two properties:
- Memory consistency: When a value is read, it needs to return the most recently written value to that location
 - Time consistency: Assuming memory consistency holds, check that the transition from step to was correct
 
 - Checking for time consistency:
- Can be done in parallel!
 - Example: Implementing  (, )
- Add witness values representing the bit representations of and
 - Add output gates that compute → 0 iff. all are binary
 - Add output gate that computes  → 0 iff. binary decomposition is correct
- Error: I don’t think that’s sound, the sum could overflow the field
 
 - For , define:
- 
- → iff. the most significant bits are equal
 
 
 - 
 - The following expression equal iff. :
 
 
 - Checking for memory consistency:
- Idea: Sort memory accesses by (address, time), then consistency can be checked by putting constraints on neighboring items
 - Routing Networks: A graph of depth that allows the prover to permute the memory accesses in any order, using additional witness values
 - Optimizations if we allow for witnesses to false statements, but that are hard to find:
- Memory Consistency via Merkle Trees: Every read is followed by a Merkle proof, which is validated by the circuit
- → Lots of hashes, but efficient if there are few memory operations
 
 - Adapt Offline Memory Checking
 
 - Memory Consistency via Merkle Trees: Every read is followed by a Merkle proof, which is validated by the circuit
 
 - Efficiently representing non-arithmetic operations over large prime-order fields:
- Bootle et al. (2018, Arya):
- Instead of bit-decomposing elements, use base where is the word size and is a constant
 - The range-check happens via the following lookup argument:
- To show: values are all contained in
 - → There are non-negative integers such that the following polynomials are equal: and
 - The witness contains the bit representations of each , and an arithmetic circuit of gates computes for a challenge point :
 
 
 - Gabizon & Williamson (2020, Plookup):
- Simplified variant due to Cairo: All elements appear at least once and the lookup table covers a contiguous interval
 - Witness sequence is claimed to equal in sorted order
 - To show:
- is a permutation of
 - 
- Circuit checks , , and for each :
 
 
 
 
 - Bootle et al. (2018, Arya):
 
Chapter 7: A First Succinct Argument for Circuit Satisfiability, from Interactive Proofs
- Idea: Instead of sending the witness to the verifier (which might be ), use a polynomial commitment scheme (PCS) to commit to 
- If is the concatenation of and (both of the same length, a power of ), it’s multilinear extension is:
 
 - A First (Relaxed) Polynomial Commitment Scheme:
- Build a Merkle tree of all  evaluations of the polynomial
- Impractical, but included for didactic reasons
 
 - Then, we have to run a low-degree test to make sure that the evaluations are close to those of a low-degree polynomial
 - Example: point-versus-line test
- Request evaluations of a random line in and make sure that it corresponds to a univariate polynomial of degree at most
 - Repeat a few times to reduce the soundness error
 
 
 - Build a Merkle tree of all  evaluations of the polynomial
 - Knowledge Soundness: Establishes not only that there exists a satisfying witness, but that the prover knows it
- Formally shown by proving the existence of an efficient extractor: A program that can repeatedly interact with the prover to extract a valid witness
 - Why efficient? Because if it is efficient, the prover could run it by simulating the interaction with itself to compute the witness, so we can assume (s)he knows it!
 - Knowledge Soundness is a stronger property than binding
 
 
Chapter 8: MIPs and Succinct Arguments
- Multi-prover interactive proof: Instead of one prover, we have multiple, which are not allowed to communicate after the start of the protocol (otherwise, they could be simulated with just one prover)
- In practice, they only use the second prover to model a PCS!
 
 - Example: Run GKR, but with a PCS answering the final query on the multilinear extension of the witness!
 - Spartan
 
Chapter 9: PCPs and Succinct Arguments
- Probabilistically Checkable Proof (PCP): The verifier has oracle access to a static proof of length , where typically the verifier would query random parts of the proof
 - Closely related to MIPs: Can be converted in both directions, but with considerable constant overhead
 - Can be compiled to a succinct argument by having the prover commit to the proof (e.g. via a Merkle tree) and then answer the verifier queries
 - A First Polynomial Length PCP, From an MIP: Basically Clover (see Spartan), but using a base larger than (to reduce variables), with the proof consisting of every possible query by the verifier → Proof size is at least
 - A PCP of Quasilinear Length for Arithmetic Circuit Satisfiability: Skipped
 
Chapter 10: Interactive Oracle Proofs
- Interactive Oracle Proof (IOP): An IP where each round is given query access to each prover “special” message → Generalization of IP and PCP
 - Polynomial IOP: Special messages specify polynomials, the verifier can query evaluations
 - A polynomial IOP for R1CS-SAT via univariate sum-check:
- Setting: To show that , let , , and , and , , and  be their unique univariate extensions over a multiplicative subgroup  and show:
- For all , 
- Run the Univariate Zero-Check Protocol on
 
 - For all  and , 
- Let denote the bivariate low-degree extension of
 - Because is the unique extension, the following equality holds:
 - The verifier checks this equality at a random point by executing the Univariate Zero-Check Protocol asserting that for
 - As part of the protocol, the verifier has to evaluate , which involves evaluating
 - This can be done by having a trusted party commit to during pre-processing using a holographic commitment scheme, which means that the prover can reveal in a time that’s linear in the number of non-zero entries of . Page 152 gives a holographic commitment scheme that involves committing to 3 univariate polynomials of degree (row, column, and value).
 
 - *Question: Does the prover really have to commit to and send the polynomials ? Can’t it work like in Spartan where the prover sends on demand which is then verified via the sum check?
 
 - For all , 
 
 - Setting: To show that , let , , and , and , , and  be their unique univariate extensions over a multiplicative subgroup  and show:
 - FRI
 - From univariate to multilinear polynomials:
- A multilinear polynomial is fully specified by its evaluations on the boolean hypercube, so we interpolate a univariate polynomial that maps each element in to an evaluation at one of the hypercube points
 - To reveal , let be the polynomial that encodes the evaluation of all Lagrange basis polynomials at
 - Then, it suffices to show that
 - → Univariate Sum-Check Protocol
 - At the end of the protocol, the verifier has to evaluate for a random point . This is only possible in linear time.
 - However, there is a layered circuit of size and depth , so GKR can be used to outsource the work to the prover
 
 - Ligero and Brakedown Polynomial Commitments
 
Chapter 11: Zero-Knowledge Proofs and Arguments
- Zero Knowledge: There exists a polynomial-time simulator algorithm that produces transcripts indistinguishable from those of a real interaction between a verifier & an honest prover
- Perfect zero-knowledge: The distributions are identical
 - Statistical zero-knowledge: Negligible statistical distance
 - Computational zero-knowledge: No polynomial-time algorithm can distinguish the transcripts
 - Honest verifier zero-knowledge: Only real interactions with an honest count
 
 - Why does the existence of a simulator not mean that the system is broken?
- In fact, the simulator might even produce transcripts that “prove” false statements!
 - But: In a real proof, there is interaction. The simulator can already “know” later verifier messages when generating early prover messages.
 
 - I skipped the rest of the chapter!
 
Chapter 12: -Protocols and Commitments from Hardness of Discrete Logarithm
- Cryptographic background:
- Group: A set + an operation that behaves much like multiplication, i.e.:
- Closure:
 - Associativity:
 - Identity: There is an element s.t.
 - Invertibility: For every there is a s.t.
 - Commutativity? (for abelian groups):
 
 - The order of a subgroup always divides the order of the original group
 - Any prime-order subgroup is cyclic and any non-identity member is a generator
 - Other concepts not summarized here: Discrete log problem, elliptic curve groups
 
 - Group: A set + an operation that behaves much like multiplication, i.e.:
 - Schnorr’s Σ-Protocol for knowledge of discrete logarithm
 - Pedersen Commitment
 
Chapter 13: Zero-Knowledge via Commit-And-Prove and Masking Polynomials
- Commit-and-prove:
- Setting: Let be an arithmetic circuit and let the prover make the claim that (s)he knows such that
 - The prover sends a Pedersen commitment for each entry of and the output of each multiplication gate
 - Then, the prover proves knowledge of an opening to these comments
 - The verifier can on its own derive commitments to outputs of addition gates
 - For multiplication gates, the prover proves a product relationship between the committed values
 
 - zk-Arguments form IPs:
- Again, the prover sends Pedersen commitments to entires in
 - Then, GKR is executed, but with the prover messages replaced with commitments
 - → The verifier checks can be replaced with a commit-and-prove protocol using a circuit of size
 
 - Zero-knowledge via masking polynomials:
- A zero-knowledge sum-check protocol:
- Goal: Make the sum-check protocol applied to a polynomial zero-knowledge
 - Let the prover commit to a random multi-linear polynomial
 - Let the prover send claimed to equal
 - Let the verifier choose a random challenge
 - → Run the sum-check protocol on (a random polynomial)
 - At the end of the protocol, the verifier queries 
- So, at this point, it’s not perfectly zero-knowledge!
 
 
 - Mask : Replace  with a slighly higher-degree extension of 
- Insist that the verifier challenges are from
 - The prover picks and at random and choses:
 - Insist that the commitment scheme is zero-knowledge
 - Apply the zk-variant of the sum-check protocol
 
 
 - A zero-knowledge sum-check protocol:
 
Chapter 14: Polynomial Commitments from Hardness of Discrete Logarithm
- Polynomial commitment: Allows a prover to commit to a polynomial and then later reveal the evaluation at points chosen by the verifier
- Because the commitment is usually smaller than the polynomial, it can only be computationally binding
 - Evaluations proof can be shorter than evaluating the actual polynomial
 - They can be zero-knowledge, i.e., the verifier learns nothing about the polynomial except the evaluations
 - Often, polynomial commitment schemes are more general: They commit to a vector (e.g. the coefficients of the polynomial) and the prover to make a claim about for some known vector (e.g. )
 
 - Linear-size Pedersen Polynomial Commitment
 - Generalized Pedersen Commitment
 - Hyrax
 - Bulletproofs
 
Chapter 15: Polynomial Commitments from Pairings
- Cryptographic Background:
- Decisional Diffie-Hellman Assumption (DDH): Given a group, it is no feasible to distinguish  and , where  are chosen randomly
- → Stronger than discrete log or computational Diffie-Helman (CDH, infeasible to compute from and )
 
 - Bilinear map: A map  such that  (where  is called the target group)
- implies that DDH is false!
 - In practice, if is an elliptic curve group defined over then is a subgroup of where is called the embedding degree and equal to the smallest integer such that divides
 - → Embedding degree must be low for a curve to be pairing friendly
 
 
 - Decisional Diffie-Hellman Assumption (DDH): Given a group, it is no feasible to distinguish  and , where  are chosen randomly
 - KZG
 - Dory
 
Chapter 16: Wrap-Up of Polynomial Commitments
- Batch evaluation of homomorphically committed polynomials:
- Suppose a verifier wants to know both vaues and
 - Instead of opening both commitments individually, the verifier can pick a random challenge , compute the commitment to and ask the prover to open this commitment at point to value
 
 - Commitment scheme for sparse Polynomials (sketch): A commitment scheme for sparse multilinear polynomials, using any commitment scheme for dense multilinear polynomials (proposed by Setty (2020))
- Idea: Design a layered arithmetic circuit that takes a description of the -sparse polynomial and and computes
 - The description  is the list of:
- coefficients
 - For each coefficient , the bits of the Lagrange basis polynomial
 
 - → gates, apply GKR
 - → To reduce to , specify basis polynomials via a triple of integers and use a Random Access Machine (Chapter 6) to evaluate
 
 - Commitment schemes summary:

 
Chapter 17: Linear PCPs and Succinct Arguments
- Idea: Super-polynomial PCPs are fine, if the prover does not have to materialize them
 - Linear PCP: The proof is interpreted as a function  for some integer  (e.g. the circuit size ). The honest proof is the evaluation table of a linear function, i.e., . Equivalently,  is a -variate polynomial of total degree  with a constant term of .
- That means that just to specify a query point to the polynomial, the verifier has to specify field elements!
 
 - Commitment to a Linear function
 - A first linear PCP for arithmetic circuit satisfiability: Skipped, PCP has size
 - GGPR
 
Chapter 18: SNARK Composition and Recursion
- Idea: We can compose SNARKs, i.e., first compute a proof  of the inner SNARK (e.g. one with fast prover time, not necessarily zero-knowledge) and then compute an outer SNARK (e.g. one with low proof size / verification time, possibly zero-knowledge) that only claims to have validated the inner SNARK correctly
- When composing a SNARK with itself repeatedly, one will eventually hit the recursion threshold, where the circuit size of the validating circuit does not decrease anymore.
 - Note that the knowledge extractor runtime grows exponentially with the recursion depth. So, for super-constant recursion depth, we don’t know how to prove knowledge soundness!
 
 - Cycle of Curves: In order to efficiently validate a SNARK using an elliptic curve group, we need a second curve whose scalar field is equal to the original curve’s base field. With a cycle of curves (e.g. the Pasta curves), we can recurse infinitely.
 - Applications:
- Incremental computation: e.g. Verifiable delay functions
 - Incrementally Verifiable Computation (IVC)
 - Proof aggregation
 
 - Nova