From dad34f9c669532d20f7235151c07b9bf8b2a6aa1 Mon Sep 17 00:00:00 2001
From: Christopher Patton <cpatton@cloudflare.com>
Date: Fri, 11 Oct 2024 12:27:37 -0700
Subject: [PATCH] Pass of Prio3

* Update motivating example for FLP truncation (current example is out
  of date).

* Move FLP parameter table to the bottom of the section.

* Add `PROOFS` to VDAF parameter table.

* Move prose explanations of sharding, preparation to before the code
  listings.
---
 draft-irtf-cfrg-vdaf.md      | 643 +++++++++++++++++------------------
 poc/vdaf_poc/flp_bbcggi19.py |  10 +-
 poc/vdaf_poc/vdaf_prio3.py   |  14 +-
 3 files changed, 322 insertions(+), 345 deletions(-)

diff --git a/draft-irtf-cfrg-vdaf.md b/draft-irtf-cfrg-vdaf.md
index cbefc9bb..b2330f72 100644
--- a/draft-irtf-cfrg-vdaf.md
+++ b/draft-irtf-cfrg-vdaf.md
@@ -2357,45 +2357,40 @@ multiple parties need to agree on. We call this input the "binder string".
 
 # Prio3 {#prio3}
 
-This section describes Prio3, a VDAF for Prio {{CGB17}}. Prio is suitable for
-a wide variety of aggregation functions, including (but not limited to) sum,
-mean, standard deviation, estimation of quantiles (e.g., median), and linear
-regression. In fact, the scheme described in this section is compatible with any
-aggregation function that has the following structure:
+This section describes Prio3, a VDAF for general-purpose aggregation. Prio3 is
+suitable for a wide variety of aggregation functions, including (but not
+limited to) sum, mean, standard deviation, histograms, and linear regression.
+It is compatible with any aggregation function that has the following
+structure:
 
 * Each measurement is encoded as a vector over some finite field.
-* Measurement validity is determined by an arithmetic circuit evaluated over
-  the encoded measurement. (An "arithmetic circuit" is a function comprised of
-  arithmetic operations in the field.) The circuit's output is a single field
-  element: if zero, then the measurement is said to be "valid"; otherwise, if
-  the output is non-zero, then the measurement is said to be "invalid".
-* The aggregate result is obtained by summing up the encoded measurement
-  vectors and computing some function of the sum.
-
-At a high level, Prio3 distributes this computation as follows. Each Client
-first shards its measurement by first encoding it, then splitting the vector into
-secret shares and sending a share to each Aggregator. Next, in the preparation
-phase, the Aggregators carry out a multi-party computation to determine if their
-shares correspond to a valid measurement (as determined by the arithmetic
-circuit). This computation involves a "proof" of validity generated by the
-Client. Next, each Aggregator sums up its shares locally. Finally, the
-Collector sums up the aggregate shares and computes the aggregate result.
-
-This VDAF does not have an aggregation parameter. Instead, the output share is
-derived from the measurement share by applying a fixed map. See {{poplar1}} for
-an example of a VDAF that makes meaningful use of the aggregation parameter.
-
-The core component of Prio3 is a "Fully Linear Proof (FLP)" system. Introduced
-by {{BBCGGI19}}, the FLP encapsulates the functionality required for encoding
-and validating measurements. Prio3 can be thought of as a transformation of a
-particular class of FLPs into a VDAF.
+* Measurement validity is determined by an "arithmetic circuit" evaluated over
+  the encoded measurement. An arithmetic circuit is a function composed of
+  arithmetic operations in the field. (We specify these in full detail in
+  {{flp-bbcggi19-valid}}.)
+* The aggregate result is obtained by summing up the encoded measurements and
+  computing some function of the sum.
+
+Clients protect the privacy of their measurements by secret sharing them and
+distributing the shares among the Aggregators. To ensure each measurement is
+valid, the Aggregators run a multi-party computation on their shares, the
+result of which is the output of the arithmetic circuit. This involves
+verification of a "Fully Linear Proof (FLP)" ({{flp}}) generated by the Client.
+FLPs are the core component of Prio3, as they specify the types of
+measurements, how they are encoded, and how they are aggregated. In fact,
+Prio3 can be thought of as a transformation of an FLP into a VDAF.
+
+Prio3 does not have an aggregation parameter. Instead, each output share is
+derived from each input share by applying a fixed map. See {{poplar1}} for an
+example of a VDAF that makes meaningful use of the aggregation parameter.
 
 The remainder of this section is structured as follows. The syntax for FLPs is
 described in {{flp}}. The generic transformation of an FLP into Prio3 is
 specified in {{prio3-construction}}. Next, a concrete FLP suitable for any
-validity circuit is specified in {{flp-bbcggi19}}. Finally, instantiations of
-Prio3 for various types of measurements are specified in
-{{prio3-instantiations}}. Test vectors can be found in {{test-vectors}}.
+validity circuit is specified in {{flp-bbcggi19}}. Finally, variants of Prio3
+for various types of aggregation tasks are specified in
+{{prio3-instantiations}}. Test vectors for each variant can be found in
+{{test-vectors}}.
 
 ## Fully Linear Proof (FLP) Systems {#flp}
 
@@ -2403,50 +2398,33 @@ Conceptually, an FLP is a two-party protocol executed by a prover and a
 verifier. In actual use, however, the prover's computation is carried out by
 the Client, and the verifier's computation is distributed among the
 Aggregators. The Client generates a "proof" of its measurement's validity and
-distributes shares of the proof to the Aggregators. Each Aggregator then
-performs some computation on its measurement share and proof share locally and
-sends the result to the other Aggregators. Combining the exchanged messages
-allows each Aggregator to decide if it holds a share of a valid measurement.
-(See {{prio3-construction}} for details.)
-
-As usual, we will describe the interface implemented by a concrete FLP in terms
-of an abstract base class `Flp` that specifies the set of methods and parameters
-a concrete FLP must provide.
-
-The parameters provided by a concrete FLP are listed in {{flp-param}}.
-
-| Parameter             | Description                            |
-|:----------------------|:---------------------------------------|
-| `PROVE_RAND_LEN: int` | Length of the prover randomness, the number of random field elements consumed by the prover when generating a proof |
-| `QUERY_RAND_LEN: int` | Length of the query randomness, the number of random field elements consumed by the verifier |
-| `JOINT_RAND_LEN: int` | Length of the joint randomness, the number of random field elements consumed by both the prover and verifier |
-| `MEAS_LEN: int`       | Length of the encoded measurement ({{flp-encode}}) |
-| `OUTPUT_LEN: int`     | Length of the aggregatable output ({{flp-encode}}) |
-| `PROOF_LEN: int`      | Length of the proof                    |
-| `VERIFIER_LEN: int`   | Length of the verifier message generated by querying the measurement and proof |
-| `Measurement`         | Type of the measurement                |
-| `AggResult`           | Type of the aggregate result           |
-| `field: type[F]`      | Class object for the field ({{field}}) |
-{: #flp-param title="FLP parameters."}
+distributes shares of the proof to the Aggregators. During preparation, each
+Aggregator performs some computation on its measurement share and proof share
+locally, then broadcasts the result in its prep share. The validity decision is
+then made by the `prep_shares_to_prep()` algorithm ({{sec-vdaf-prepare}}).
+
+As usual, we describe the interface implemented by a concrete FLP in terms of
+an abstract base class, denoted `Flp`, that specifies the set of methods and
+parameters a concrete FLP must provide.
 
-An FLP specifies the following algorithms for generating and verifying proofs of
-validity (encoding is described below in {{flp-encode}}):
+The parameters provided by a concrete FLP are listed in {{flp-param}}. A
+concrete FLP specifies the following algorithms for generating and verifying
+proofs of validity (encoding is described below in {{flp-encode}}):
 
 * `flp.prove(meas: list[F], prove_rand: list[F], joint_rand: list[F]) ->
-  list[F]` is the deterministic proof-generation algorithm run by the prover.
-  Its inputs are the encoded measurement, the "prover randomness" `prove_rand`,
-  and the "joint randomness" `joint_rand`. The prover randomness is used only
-  by the prover, but the joint randomness is shared by both the prover and
-  verifier.
+  list[F]` is the proof-generation algorithm run by the prover. Its inputs are
+  the encoded measurement, the "prover randomness" `prove_rand`, and the "joint
+  randomness" `joint_rand`. The prover randomness is used only by the prover,
+  but the joint randomness is shared by both the prover and verifier.
 
 * `flp.query(meas: list[F], proof: list[F], query_rand: list[F], joint_rand:
   list[F], num_shares: int) -> list[F]` is the query algorithm run by the
-  verifier. This is used to "query" the measurement and proof. The result of
-  the query (i.e., the output of this function) is called the "verifier
-  message". In addition to the measurement and proof, this algorithm takes as
-  input the query randomness `query_rand` and the joint randomness
-  `joint_rand`. The former is used only by the verifier. `num_shares` specifies
-  how many shares were generated.
+  verifier on the encoded measurement and proof. The result of the query (i.e.,
+  the output of this function) is called the "verifier message". In addition to
+  the measurement and proof, this algorithm takes as input the query randomness
+  `query_rand` and the joint randomness `joint_rand`. The former is used only
+  by the verifier. `num_shares` specifies the number of shares (more on this
+  below).
 
 * `flp.decide(verifier: list[F]) -> bool` is the deterministic decision
   algorithm run by the verifier. It takes as input the verifier message and
@@ -2456,19 +2434,19 @@ validity (encoding is described below in {{flp-encode}}):
 Our application requires that the FLP is "fully linear" in the sense defined in
 {{BBCGGI19}}. As a practical matter, what this property implies is that, when
 run on a share of the measurement and proof, the query algorithm outputs a
-share of the verifier message. Furthermore, the privacy property of the FLP
-system ensures that the verifier message reveals nothing about the measurement
-other than whether it is valid. Therefore, to decide if a measurement is valid,
-the Aggregators will run the query algorithm locally, exchange verifier shares,
-combine them to recover the verifier message, and run the decision algorithm.
+share of the verifier message (hereafter the "verifier share"). Furthermore,
+the privacy property of the FLP system ensures that the verifier message
+reveals nothing about the measurement other than the fact that it is valid.
+Therefore, to decide if a measurement is valid, the Aggregators will run the
+query algorithm locally, exchange verifier shares, combine them to recover the
+verifier message, and run the decision algorithm.
 
 The query algorithm includes a parameter `num_shares` that specifies the number
-of shares that were generated. If these data are not secret shared, then
-`num_shares == 1`. This parameter is useful for certain FLP constructions. For
-example, the FLP in {{flp-bbcggi19}} is defined in terms of an arithmetic
-circuit; when the circuit contains constants, it is sometimes necessary to
-normalize those constants to ensure that the circuit's output, when run on a
-valid measurement, is the same regardless of the number of shares.
+of shares of the measurement and proof that were generated. If these data are
+not secret shared, then `num_shares == 1`. This parameter is useful for
+normalizing constants in arithmetic circuits so that each Aggregator properly
+computes a secret share of the circuit's output. See {{flp-bbcggi19}} for
+details.
 
 An FLP is executed by the prover and verifier as follows:
 
@@ -2519,22 +2497,39 @@ def run_flp(
     return flp.decide(verifier)
 ~~~
 
-The proof system is constructed so that, if `meas` is valid, then `run_flp(flp,
-meas, 1)` always returns `True`. On the other hand, if `meas` is invalid, then
-as long as `joint_rand` and `query_rand` are generated uniform randomly, the
-output is `False` with high probability. False positives are possible: there is
-a small probability that a verifier accepts an invalid input as valid. An FLP
-is said to be "sound" if this probability is sufficiently small. The soundness
-of the FLP depends on a variety of parameters, like the length of the
-input and the size of the field. See {{flp-bbcggi19}} for details.
-
-Note that soundness of an FLP system is not the same as robustness for a VDAF
-In particular, soundness of the FLP is necessary, but insufficient for
-robusntess of Prio3 ({{prio3}}). See {{security-multiproof}} for details.
-
-We remark that {{BBCGGI19}} defines a much larger class of fully linear proof
+The proof system is designed so that, if `meas` is valid, then `run_flp(flp,
+meas, num_shares)` always returns `True`. On the other hand, if `meas` is
+invalid, then as long as `joint_rand` and `query_rand` are generated uniformly
+at random, the output is `False` with high probability. False positives are
+possible: there is a small probability that a verifier accepts an invalid input
+as valid. An FLP is said to be "sound" if this probability is sufficiently
+small. The soundness of the FLP depends on a variety of parameters, like the
+length of the input and the size of the field. See {{flp-bbcggi19}} for
+details.
+
+Note that soundness of an FLP system is not the same as robustness for the VDAF
+that uses it. In particular, soundness of the FLP is necessary, but
+insufficient for robustness of Prio3 ({{prio3}}). See {{security-multiproof}}
+for details.
+
+We remark that {{BBCGGI19}} defines a larger class of fully linear proof
 systems than we consider here. In particular, what is called an "FLP" here is
-called a 1.5-round, public-coin, interactive oracle proof system in their paper.
+called a 1.5-round, public-coin, interactive oracle proof system in their
+paper.
+
+| Parameter             | Description                             |
+|:----------------------|:----------------------------------------|
+| `PROVE_RAND_LEN: int` | Length of the prover randomness, the number of random field elements consumed by the prover when generating a proof. |
+| `QUERY_RAND_LEN: int` | Length of the query randomness, the number of random field elements consumed by the verifier. |
+| `JOINT_RAND_LEN: int` | Length of the joint randomness, the number of random field elements shared by the prover and verifier. |
+| `MEAS_LEN: int`       | Length of the encoded measurement ({{flp-encode}}). |
+| `OUTPUT_LEN: int`     | Length of the aggregatable output ({{flp-encode}}). |
+| `PROOF_LEN: int`      | Length of the proof.                    |
+| `VERIFIER_LEN: int`   | Length of the verifier message generated by querying the measurement and proof. |
+| `Measurement`         | Type of the measurement.                |
+| `AggResult`           | Type of the aggregate result.           |
+| `field: type[F]`      | Class object for the field ({{field}}). |
+{: #flp-param title="FLP parameters."}
 
 ### Encoding the Input {#flp-encode}
 
@@ -2543,38 +2538,48 @@ also specifies a method of encoding raw measurements as a vector of field
 elements:
 
 * `flp.encode(measurement: Measurement) -> list[F]` encodes a raw measurement
-  as a vector of field elements. The return value MUST be of length `MEAS_LEN`.
+  as a vector of field elements.
+
+  Post-conditions:
+
+  * The encoded measurement MUST have length `flp.MEAS_LEN`.
 
 For some FLPs, the encoded measurement also includes redundant field elements
 that are useful for checking the proof, but which are not needed after the
-proof has been checked. An example is the "integer sum" data type from
-{{CGB17}} in which an integer in `[0, 2^k)` is encoded as a vector of `k`
-field elements, each representing a bit of the integer (this type is also
-defined in {{prio3sum}}). After consuming this vector, all that is needed is
-the integer it represents. Thus the FLP defines an algorithm for truncating the
-encoded measurement to the length of the aggregated output:
+proof has been checked. An example is the `Sum` type defined in {{prio3sum}}
+for which each measurement is an integer in range `[0, max_measurement]`. The
+range check requires encoding the measurement with several field elements,
+though just one is needed for aggregation. Thus the FLP defines an algorithm
+for truncating the encoded measurement to the length of the aggregated output:
 
 * `flp.truncate(meas: list[F]) -> list[F]` maps an encoded measurement (e.g.,
   the bit-encoding of the measurement) to an aggregatable output (e.g., the
-  singleton vector containing the measurement). The length of the input MUST be
-  `MEAS_LEN` and the length of the output MUST be `OUTPUT_LEN`.
+  singleton vector containing the measurement).
+
+  Pre-conditions:
+
+  * The length of the input MUST be `flp.MEAS_LEN`.
+
+  Post-conditions:
+
+  * The length of the output MUST be `flp.OUTPUT_LEN`.
 
-Once the aggregate shares have been computed and combined together, their sum
-can be converted into the aggregate result. This could be a projection from
-the FLP's field to the integers, or it could include additional
-post-processing.
+Once the aggregate shares have been transmitted to the Collector, their sum can
+be converted into the aggregate result. This could be a projection from the
+FLP's field to the integers, or it could include additional post-processing.
+Either way, this functionality is implemented by the following method:
 
 * `flp.decode(output: list[F], num_measurements: int) -> AggResult` maps a sum
   of aggregate shares to an aggregate result.
 
   Pre-conditions:
 
-  * The length of `output` MUST be `OUTPUT_LEN`.
-  * `num_measurements` MUST equal the number of measurements that contributed
-    to the `output`.
+  * The length of the output MUST be `flp.OUTPUT_LEN`.
+  * `num_measurements` MUST equal the number of measurements that were
+    aggregated.
 
-We remark that, taken together, these three functionalities correspond roughly
-to the notion of "Affine-aggregatable encodings (AFEs)" from {{CGB17}}.
+We remark that, taken together, these three functionalities correspond to the
+notion of "Affine-aggregatable encodings (AFEs)" from {{CGB17}}.
 
 ### Multiple Proofs {#multiproofs}
 
@@ -2582,54 +2587,49 @@ It is sometimes desirable to generate and verify multiple independent proofs
 for the same input. First, this improves the soundness of the proof system
 without having to change any of its parameters. Second, it allows a smaller
 field to be used (e.g., replace Field128 with Field64, see {{flp-bbcggi19}})
-without sacrificing soundness. Generally, choosing a smaller field can
-significantly reduce communication cost. (This is a trade-off, of course, since
+without sacrificing soundness. This is useful because it reduces the overall
+communication cost of the protocol. (This is a trade-off, of course, since
 generating and verifying more proofs requires more time.) Given these benefits,
 this feature is implemented by Prio3 ({{prio3}}).
 
 To generate these proofs for a specific measurement, the prover calls
-`flp.prove` multiple times, each time using an independently generated prover
-and joint randomness string. The verifier checks each proof independently, each
-time with an independently generated query randomness string. It accepts the
-measurement only if all the decision algorithm accepts on each proof.
+`flp.prove()` multiple times, each time using fresh prover and joint
+randomness. The verifier checks each proof independently, each time with fresh
+query randomness. It accepts the measurement only if the decision algorithm
+accepts on each proof.
 
-See {{security-multiproof}} below for discussions on choosing the right number
+See {{security-multiproof}} for guidance on choosing the field size and number
 of proofs.
 
 ## Construction {#prio3-construction}
 
 This section specifies `Prio3`, an implementation of the `Vdaf` interface
-({{vdaf}}). It has three generic parameters: an `NttField`
-({field-ntt-friendly}}), an `Flp` ({{flp}}) and a `Xof` ({{xof}}). It also has
-an associated constant, `PROOFS`, with a value in the range `[1, 256)`,
-denoting the number of FLPs generated by the Client ({{multiproofs}}).
-
-The associated constants and types required by the `Vdaf` interface are
-defined in {{prio3-param}}. The methods required for sharding,
-preparation, aggregation, and unsharding are described in the remaining
-subsections. These methods refer to constants enumerated in
-{{prio3-const}}.
+defined in {{vdaf}}. The parameters and types required by the `Vdaf` interface
+are defined in {{prio3-param}}. The methods required for sharding, preparation,
+aggregation, and unsharding are described in the remaining subsections. These
+methods refer to constants enumerated in {{prio3-const}}.
 
 | Parameter         | Value                                           |
 |:------------------|:------------------------------------------------|
-| `flp`             | an instance of `Flp` ({{flp}})                  |
+| `flp`             | An instance of `Flp` ({{flp}}).                 |
 | `xof`             | `XofTurboShake128` ({{xof-turboshake128}})      |
+| `PROOFS`          | Any `int` in the range `[1, 256)`.              |
 | `VERIFY_KEY_SIZE` | `xof.SEED_SIZE`                                 |
 | `RAND_SIZE`       | `xof.SEED_SIZE * SHARES if flp.JOINT_RAND_LEN == 0 else 2 * xof.SEED_SIZE * SHARES` |
 | `NONCE_SIZE`      | `16`                                            |
 | `ROUNDS`          | `1`                                             |
-| `SHARES`          | in `[2, 256)`                                   |
-| `Measurement`     | as defined by `flp`                             |
+| `SHARES`          | Any `int` in the range `[2, 256)`.              |
+| `Measurement`     | As defined by `flp`.                            |
 | `AggParam`        | `None`                                          |
 | `PublicShare`     | `Optional[list[bytes]]`                         |
 | `InputShare`      | `tuple[list[F], list[F], Optional[bytes]] | tuple[bytes, Optional[bytes]]` |
 | `OutShare`        | `list[F]`                                       |
 | `AggShare`        | `list[F]`                                       |
-| `AggResult`       | as defined by `flp`                             |
+| `AggResult`       | As defined by `flp`.                            |
 | `PrepState`       | `tuple[list[F], Optional[bytes]]`               |
 | `PrepShare`       | `tuple[list[F], Optional[bytes]]`               |
 | `PrepMessage`     | `Optional[bytes]`                               |
-{: #prio3-param title="VDAF parameters for Prio3."}
+{: #prio3-param title="Parameters for Prio3."}
 
 | Variable                      | Value |
 |:------------------------------|:------|
@@ -2653,25 +2653,23 @@ Section 6.2.3 of {{BBCGGI19}}.)
 
 The sharding algorithm involves the following steps:
 
-1. Encode the Client's measurement for the FLP
+1. Encode the Client's measurement as specified by the FLP
 2. Shard the measurement into a sequence of measurement shares
 3. Derive the joint randomness from the measurement shares and nonce
-4. Run the FLP proof-generation algorithm using the derived joint randomness
+4. Generate the proof using the derived joint randomness
 5. Shard the proof into a sequence of proof shares
-6. Return the public share, consisting of the joint randomness parts, and the
-   input shares, each consisting of the measurement share, proof share, and
-   blind of one of the Aggregators
 
-As described in {{multiproofs}}, the soundness of the FLP can be amplified
-by generating and verifying multiple FLPs. (This in turn improves the
-robustness of Prio3.) To support this, in Prio3:
+As described in {{multiproofs}}, robustness of Prio3 can be amplified by
+generating and verifying multiple proofs. To support this:
 
 * In step 3, derive as much joint randomness as required by `PROOFS` proofs
 * Repeat step 4 `PROOFS` times, each time with a unique joint randomness
 
 Depending on the FLP, joint randomness may not be required. In particular, when
 `flp.JOINT_RAND_LEN == 0`, the Client does not derive the joint randomness
-(Step 3). The sharding algorithm is specified below.
+(Step 3).
+
+The sharding algorithm is specified below:
 
 ~~~ python
 def shard(
@@ -2705,7 +2703,17 @@ subsections below.
 #### FLPs Without Joint Randomness {#prio3-shard-without-joint-rand}
 
 The following method is used for FLPs that do not require joint randomness,
-i.e., when `flp.JOINT_RAND_LEN == 0`:
+i.e., when `flp.JOINT_RAND_LEN == 0`. It consists of the following steps:
+
+1. Shard the encoded measurement into shares
+1. Generate proofs and shard each into shares
+1. Encode each measurement share and shares of each proof into an input share
+
+Only one pair of measurement and proof(s) shares (called the "Leader" shares
+below) are vectors of field elements. The other shares (called the "Helper"
+shares) are represented instead by an XOF seed, which is expanded into vectors
+of field elements. The methods on `Prio3` for deriving the prover randomness,
+measurement shares, and proof shares are defined in {{prio3-auxiliary}}.
 
 ~~~ python
 def shard_without_joint_rand(
@@ -2744,7 +2752,7 @@ def shard_without_joint_rand(
         )
 
     # Each Aggregator's input share contains its measurement share
-    # and share of proof(s).
+    # and its share of the proof(s).
     input_shares: list[Prio3InputShare[F]] = []
     input_shares.append((
         leader_meas_share,
@@ -2759,25 +2767,25 @@ def shard_without_joint_rand(
     return (None, input_shares)
 ~~~
 
-The steps in this method are as follows:
-
-1. Shard the encoded measurement into shares
-1. Generate and shard each proof into shares
-1. Encode each measurement and shares of each proof into an input share
+#### FLPs With Joint Randomness
 
-Notice that only one pair of measurement and proof(s) share (called the
-"leader" shares above) are vectors of field elements. The other shares (called
-the "helper" shares) are represented instead by an XOF seed, which is expanded
-into vectors of field elements.
+The following method is used for FLPs that require joint randomness, i.e., for
+which `flp.JOINT_RAND_LEN > 0`. Joint randomness derivation involves an
+additional XOF seed for each Aggregator called the "blind". The computation
+involves the following steps:
 
-The methods on `Prio3` for deriving the prover randomness, measurement shares,
-and proof shares and the methods for encoding the input shares are defined in
-{{prio3-auxiliary}}.
+1. Compute a "joint randomness part" from each measurement share and blind
+1. Compute a "joint randomness seed" from the joint randomness parts
+1. Compute the joint randomness for each proof evaluation from the joint
+   randomness seed
 
-#### FLPs With Joint Randomness
+This three-step process is designed to ensure that the joint randomness does
+not leak the measurement to the Aggregators while preventing a malicious Client
+from tampering with the joint randomness in a way that allows it to break
+robustness. To bootstrap the required check, the Client encodes the joint
+randomness parts in the public share. (See {{prio3-preparation}} for details.)
 
-The following method is used for FLPs that require joint randomness,
-i.e., for which `flp.JOINT_RAND_LEN > 0`:
+All functions used in the following listing are defined in {{prio3-auxiliary}}:
 
 ~~~ python
 def shard_with_joint_rand(
@@ -2814,7 +2822,7 @@ def shard_with_joint_rand(
     joint_rand_parts.insert(0, self.joint_rand_part(
         ctx, 0, leader_blind, leader_meas_share, nonce))
 
-    # Generate the proof and shard it into proof shares.
+    # Generate each proof and shard it into proof shares.
     prove_rands = self.prove_rands(ctx, prove_seed)
     joint_rands = self.joint_rands(
         ctx, self.joint_rand_seed(ctx, joint_rand_parts))
@@ -2856,59 +2864,33 @@ def shard_with_joint_rand(
     return (joint_rand_parts, input_shares)
 ~~~
 
-The difference between this procedure and previous one is that here we compute
-joint randomnesses `joint_rands`, split it into multiple `joint_rand`, and pass
-each `joint_rand` to the proof generationg algorithm. (In
-{{prio3-shard-without-joint-rand}} the joint randomness is the empty vector,
-`[]`.) This requires generating an additional value, called the "blind", that
-is incorporated into each input share.
-
-The joint randomness computation involves the following steps:
-
-1. Compute a "joint randomness part" from each measurement share and blind
-1. Compute a "joint randomness seed" from the joint randomness parts
-1. Compute the joint randomness for each proof evaluation from the joint randomness seed
-
-This three-step process is designed to ensure that the joint randomness does
-not leak the measurement to the Aggregators while preventing a malicious Client
-from tampering with the joint randomness in a way that allows it to break
-robustness. To bootstrap the required check, the Client encodes the joint
-randomness parts in the public share. (See {{prio3-preparation}} for details.)
-
-The methods used in this computation are defined in {{prio3-auxiliary}}.
-
 ### Preparation {#prio3-preparation}
 
 This section describes the process of recovering output shares from the input
 shares. The high-level idea is that each Aggregator first queries its
-measurement and share of proof(s) locally, then exchanges its share of
-verifier(s) with the other Aggregators. The shares of verifier(s) are then
-combined into the verifier message(s) used to decide whether to accept.
-
-In addition, for FLPs that require joint randomness, the Aggregators must
-ensure that they have all used the same joint randomness for the query
-algorithm. To do so, they collectively re-derive the joint randomness from
-their measurement shares just as the Client did during sharding.
-
-In order to avoid extra round of communication, the Client sends each
-Aggregator a "hint" consisting of the joint randomness parts. This leaves open
-the possibility that the Client cheated by, say, forcing the Aggregators to use
-joint randomness that biases the proof check procedure some way in its favor.
-To mitigate this, the Aggregators also check that they have all computed the
-same joint randomness seed before accepting their output shares. To do so, they
-exchange their parts of the joint randomness along with their shares of
-verifier(s).
-
-Implementation note: the preparation state for Prio3 includes the output share
-that will be released once preparation is complete. In some situations, it may be
+measurement share and share of the proof(s) locally, then broadcasts its
+verifier share(s) in its prep share. The shares of verifier(s) are then
+combined into the verifier message(s) used to decide whether to accept.
+
+In addition, the Aggregators must recompute the same joint randomness used by
+the Client to generate the proof(s). In order to avoid an extra round of
+communication, the Client includes the joint randomness parts in the public
+share. This leaves open the possibility that the Client cheated by, say,
+forcing the Aggregators to use joint randomness that biases the proof check
+procedure some way in its favor. To mitigate this, the Aggregators also check
+that they have all computed the same joint randomness seed before accepting
+their output shares. To do so, they exchange their parts of the joint
+randomness along with their shares of verifier(s).
+
+Implementation note: the prep state for Prio3 includes the output share that
+will be released once preparation is complete. In some situations, it may be
 necessary for the Aggregator to encode this state as bytes and store it for
 retrieval later on. For all but the first Aggregator, it is possible to save
 storage by storing the measurement share rather than output share itself. It is
-relatively inexpensive to expand this seed into the input share, then truncate
-the input share to get the output share.
+relatively inexpensive to expand this seed into the measurement share, then
+truncate the measurement share to get the output share.
 
-The definitions of constants and a few auxiliary functions are defined in
-{{prio3-auxiliary}}.
+All functions used in the following listing are defined in {{prio3-auxiliary}}:
 
 ~~~ python
 def prep_init(
@@ -2985,7 +2967,7 @@ def prep_shares_to_prep(
         ctx: bytes,
         _agg_param: None,
         prep_shares: list[Prio3PrepShare[F]]) -> Optional[bytes]:
-    # Unshard the verifier shares into the verifier message.
+    # Unshard each set of verifier shares into each verifier message.
     verifiers = self.flp.field.zeros(
         self.flp.VERIFIER_LEN * self.PROOFS)
     joint_rand_parts = []
@@ -2995,7 +2977,7 @@ def prep_shares_to_prep(
             assert joint_rand_part is not None
             joint_rand_parts.append(joint_rand_part)
 
-    # Verify that each proof is well-formed and input is valid
+    # Verify that each proof is well-formed and input is valid.
     for _ in range(self.PROOFS):
         verifier, verifiers = front(self.flp.VERIFIER_LEN, verifiers)
         if not self.flp.decide(verifier):
@@ -3012,18 +2994,13 @@ def prep_shares_to_prep(
 
 ### Validity of Aggregation Parameters
 
-Every input share MUST only be used once, regardless of the aggregation
-parameters used.
+`Prio3` only permits a report to be aggregated once.
 
 ~~~ python
 def is_valid(
         self,
         _agg_param: None,
         previous_agg_params: list[None]) -> bool:
-    """
-    Checks if `previous_agg_params` is empty, as input shares in
-    Prio3 may only be used once.
-    """
     return len(previous_agg_params) == 0
 ~~~
 
@@ -3054,7 +3031,8 @@ def merge(self,
 ### Unsharding
 
 To unshard a set of aggregate shares, the Collector first adds up the vectors
-element-wise. It then converts each element of the vector into an integer.
+element-wise, then decodes the aggregate result from the sum according to the
+FLP ({{flp-encode}}).
 
 ~~~ python
 def unshard(
@@ -3071,8 +3049,6 @@ def unshard(
 This section defines a number of auxiliary functions referenced by the main
 algorithms for Prio3 in the preceding sections.
 
-The following methods are called by the sharding and preparation algorithms.
-
 ~~~ python
 def helper_meas_share(
         self,
@@ -3179,12 +3155,12 @@ def joint_rands(self,
 ### Message Serialization {#prio3-encode}
 
 This section defines serialization formats for messages exchanged over the
-network while executing Prio3. It is RECOMMENDED that implementations provide
-serialization methods for them.
+network while executing Prio3. Messages are specified in the presentation
+language of TLS as defined in {{Section 3 of !RFC8446}}.
 
-Message structures are defined following {{Section 3 of !RFC8446}}). In the
-remainder we use `S` as an alias for `prio3.xof.SEED_SIZE` and `F` as an alias
-for `prio3.field.ENCODED_SIZE`. XOF seeds are represented as follows:
+Let `prio3` denote an instance of `Prio3`. In the remainder we use `S` as an
+alias for `prio3.xof.SEED_SIZE` and `F` as an alias for
+`prio3.field.ENCODED_SIZE`. XOF seeds are represented as follows:
 
 ~~~ tls-presentation
 opaque Prio3Seed[S];
@@ -3199,11 +3175,11 @@ opaque Prio3Field[F];
 
 #### Public Share
 
-The encoding of the public share depends on whether joint randomness is
+The content of the public share depends on whether joint randomness is
 required for the underlying FLP (i.e., `prio3.flp.JOINT_RAND_LEN > 0`). If
-joint randomness is not used, then the public share is the empty string. If
-joint randomness is used, then the public share encodes the joint randomness
-parts as follows:
+joint randomness is not used, then the public share is the empty string.
+Otherwise, if joint randomness is used, then the public share encodes the joint
+randomness parts as follows:
 
 ~~~ tls-presentation
 struct {
@@ -3213,15 +3189,15 @@ struct {
 
 #### Input Share
 
-Just as for the public share, the encoding of the input shares depends on
+Just as for the public share, the content of the input shares depends on
 whether joint randomness is used. If so, then each input share includes the
 Aggregator's blind for generating its joint randomness part.
 
 In addition, the encoding of the input shares depends on which aggregator is
 receiving the message. If the aggregator ID is `0`, then the input share
-includes the full measurement and share of proof(s). Otherwise, if the aggregator ID
-is greater than `0`, then the measurement and shares of proof(s) are
-represented by an XOF seed. We shall call the former the "Leader" and the
+includes the full measurement share and proof(s) share(s). Otherwise, if the
+aggregator ID is greater than `0`, then the measurement and shares of proof(s)
+are represented by an XOF seed. We shall call the former the "Leader" and the
 latter the "Helpers".
 
 In total there are four variants of the input share. When joint randomness is
@@ -3269,20 +3245,17 @@ When joint randomness is not used, the prep share is structured as follows:
 
 ~~~ tls-presentation
 struct {
-    Prio3Field verifiers_share[
-        F * prio3.flp.VERIFIER_LEN * prio3.PROOFS
-    ];
+    Prio3Field verifiers_share[F * V];
 } Prio3PrepShare;
 ~~~
 
-When joint randomness is used, the prep share includes the Aggregator's joint
-randomness part and is structured as follows:
+where `V = prio3.flp.VERIFIER_LEN * prio3.PROOFS`. When joint randomness is
+used, the prep share includes the Aggregator's joint randomness part and is
+structured as follows:
 
 ~~~ tls-presentation
 struct {
-    Prio3Field verifiers_share[
-        F * prio3.flp.VERIFIER_LEN * prio3.PROOFS
-    ];
+    Prio3Field verifiers_share[F * V];
     Prio3Seed joint_rand_part;
 } Prio3PrepShareWithJointRand;
 ~~~
@@ -3311,34 +3284,37 @@ struct {
 
 ## FLP Construction {#flp-bbcggi19}
 
-This section specifies an implementation of the `Flp` interface ({{flp}}) based
-on the construction from {{BBCGGI19}}, Section 4.2. We begin in
-{{flp-bbcggi19-overview}} with an overview of the proof system and some
-extensions to it. {{flp-bbcggi19-valid}} defines validity circuits, the core
-component of the proof system that determines measurement validity and how
-measurements are aggregated. The proof-generation algorithm, the query
-algorithm, and the decision algorithm are defined in
-{{flp-bbcggi19-construction-prove}}, {{flp-bbcggi19-construction-query}}, and
-{{flp-bbcggi19-construction-decide}} respectively.
-
-| Parameter        | Value                    |
-|:-----------------|:-------------------------|
-| `valid`          | instance of `Valid` ({{flp-bbcggi19-valid}}) |
-| `PROVE_RAND_LEN` | `valid.prove_rand_len()` |
-| `QUERY_RAND_LEN` | `valid.query_rand_len()` |
-| `JOINT_RAND_LEN` | `valid.JOINT_RAND_LEN`   |
-| `MEAS_LEN`       | `valid.MEAS_LEN`         |
-| `OUTPUT_LEN`     | `valid.OUTPUT_LEN`       |
-| `PROOF_LEN`      | `valid.proof_len()`      |
-| `VERIFIER_LEN`   | `valid.verifier_len()`   |
-| `Measurement`    | `valid.Measurement`      |
-| `AggResult`      | `valid.AggResult`        |
-| `field`          | `valid.field`            |
+| Parameter        | Value                                            |
+|:-----------------|:-------------------------------------------------|
+| `valid`          | An instance of `Valid` ({{flp-bbcggi19-valid}}). |
+| `field`          | `valid.field`                                    |
+| `PROVE_RAND_LEN` | `valid.prove_rand_len()`                         |
+| `QUERY_RAND_LEN` | `valid.query_rand_len()`                         |
+| `JOINT_RAND_LEN` | `valid.JOINT_RAND_LEN`                           |
+| `MEAS_LEN`       | `valid.MEAS_LEN`                                 |
+| `OUTPUT_LEN`     | `valid.OUTPUT_LEN`                               |
+| `PROOF_LEN`      | `valid.proof_len()`                              |
+| `VERIFIER_LEN`   | `valid.verifier_len()`                           |
+| `Measurement`    | As defined by `valid`.                           |
+| `AggResult`      | As defined by `valid`.                           |
 {: #flp-bbcggi19-param title="FLP parameters for a validity circuit."}
 
+This section specifies an implementation of the `Flp` interface ({{flp}}) based
+on the construction from {{BBCGGI19}}, Section 4.2. The types and parameters
+required by this interface are listed in the table above.
+
+We begin in {{flp-bbcggi19-overview}} with an overview of the proof system and
+some extensions to it. {{flp-bbcggi19-valid}} defines validity circuits, the
+core component of the proof system that determines measurement validity and how
+measurements are aggregated. The proof-generation algorithm, query algorithm,
+and decision algorithm are defined in {{flp-bbcggi19-construction-prove}},
+{{flp-bbcggi19-construction-query}}, and {{flp-bbcggi19-construction-decide}}
+respectively.
+
 ### Overview {#flp-bbcggi19-overview}
 
-Conventional zero-knowledge proof systems involve two parties:
+An FLP is a type of "zero-knowledge proof". A conventional zero-knowledge proof
+system involves two parties:
 
 * The prover, who holds a measurement and generates a proof of the
   measurement's validity
@@ -3368,8 +3344,8 @@ This circuit contains one subtraction gate (`x-1`) and one multiplication gate
 (`x * (x-1)`). Observe that `C(x) = 0` if and only if `x` is in `[0, 2)`.
 
 The goal of the proof system is to allow each Aggregator to privately and
-correctly compute a secret share of `C(x)` from its secret share of `x`. This
-way all they need to determine validity is to add up their shares of `C(x)`.
+correctly compute a share of `C(x)` from its share of `x`. Then all they need
+to do to determine validity is broadcast and add up their shares of `C(x)`.
 
 Suppose for a moment that `C` is an affine arithmetic circuit, meaning its only
 operations are addition, subtraction, and multiplication-by-constant. (The
@@ -3399,8 +3375,8 @@ Applying this idea to the example circuit `C` above:
 1. The Client, given its measurement `x`, constructs the lowest degree
    polynomial `p` for which `p(0) = s` and `p(1) = x * (x-1)`, where `s` is a
    random blinding value generated by the Client. (The blinding value is to
-   protect the privacy of the measurement.) It then sends secret shares of `x`
-   and the coefficients of `p` to each of the Aggregators.
+   protect the privacy of the measurement.) It then sends shares of `x` and
+   shares of the coefficients of `p` to each of the Aggregators.
 
 1. Each Aggregator locally computes and broadcasts its share of `p(1)`, which
    is equal to its share of `C(x)`.
@@ -3426,7 +3402,7 @@ a malicious Client to produce a gadget polynomial `p` that would result in
 measurement being accepted. To prevent this, the Aggregators perform a
 probabilistic test to check that the gadget polynomial was constructed
 properly. This "gadget test", and the procedure for constructing the
-polynomial, are described in detail in {{flp-bbcggi19}}.
+polynomial, are described in detail in {{flp-bbcggi19-construction-prove}}.
 
 #### Extensions {#flp-bbcggi19-overview-extensions}
 
@@ -3451,8 +3427,8 @@ C(x, r) = r * Range2(x[0]) + ... + r^N * Range2(x[N-1])
 (Note that this is a special case of {{BBCGGI19}}, Theorem 5.2.) Here `x` is
 the length-`N` input and `r` is a random field element. The gadget circuit
 `Range2` is the "range-check" polynomial described above, i.e., `Range2(x) =
-x^2 - x`. The idea is that, if `x` is valid (i.e., each `x[j]` is in
-`[0, 2)`), then the circuit will evaluate to zero regardless of the value of
+x^2 - x`. The idea is that, if `x` is valid, i.e., each `x[j]` is in
+`[0, 2)`, then the circuit will evaluate to zero regardless of the value of
 `r`; but if some `x[j]` is not in `[0, 2)`, then the output will be non-zero
 with high probability.
 
@@ -3473,12 +3449,11 @@ circuit can be expressed using a simpler gadget, namely multiplication, but the
 resulting proof would be longer.
 
 Third, rather than interpolate the gadget polynomial at inputs `1`, `2`, ...,
-`j`, ..., where `j` is the `j`-th invocation of the gadget, we use powers of
-`alpha`, where `alpha` is a root of unity for the field. This allows us to
-construct each gadget polynomial via the number theoretic transform {{SML24}},
-which is far more efficient than  generic formulas. Note that the roots of
-unity are powers of the generator of NTT-friendly fields (see
-{{field-ntt-friendly}}).
+`j`, ..., where `j` is the `j`-th invocation of the gadget, we use roots of
+unity for the field. This allows us to construct each gadget polynomial via the
+number theoretic transform {{SML24}}, which is far more efficient than generic
+formulas. Note that the roots of unity are powers of the generator for the
+NTT-friendly field (see {{field-ntt-friendly}}).
 
 Finally, the validity circuit in our FLP may have any number of outputs (at
 least one). The input is said to be valid if each of the outputs is zero. To
@@ -3489,17 +3464,17 @@ probability.
 
 ### Validity Circuits {#flp-bbcggi19-valid}
 
-| Parameter                 | Description                           |
-|:--------------------------|:--------------------------------------|
-| `GADGETS: list[Gadget]`   | A list of gadgets                     |
-| `GADGET_CALLS: list[int]` | Number of times each gadget is called |
-| `MEAS_LEN: int`           | Length of the measurement             |
-| `JOINT_RAND_LEN: int`     | Length of the random input            |
-| `EVAL_OUTPUT_LEN: int`    | Length of the circuit output          |
-| `OUTPUT_LEN: int`         | Length of the aggregatable output     |
-| `Measurement`             | The type of measurement               |
-| `AggResult`               | Type of the aggregate result          |
-| `field: type[F]`          | Class object for the field            |
+| Parameter                 | Description                                          |
+|:--------------------------|:-----------------------------------------------------|
+| `GADGETS: list[Gadget]`   | A list of gadgets.                                   |
+| `GADGET_CALLS: list[int]` | Number of times each gadget is called.               |
+| `MEAS_LEN: int`           | Length of the measurement.                           |
+| `JOINT_RAND_LEN: int`     | Length of the joint randomness.                      |
+| `EVAL_OUTPUT_LEN: int`    | Length of the circuit output.                        |
+| `OUTPUT_LEN: int`         | Length of the aggregatable output.                   |
+| `Measurement`             | Type of the measurement.                             |
+| `AggResult`               | Type of the aggregate result.                        |
+| `field: type[F]`          | Class object for the field ({{field-ntt-friendly}}). |
 {: title="Validity circuit parameters."}
 
 An instance of the proof system is defined in terms of a validity circuit that
@@ -3515,19 +3490,19 @@ has the following interface:
   gadget `Mul(x,y) = x*y` has arity of 2.
 
 * `DEGREE: int` is the arithmetic degree of the gadget circuit. This is defined
-  to be the degree of the polynomial that computes it. This exists because
-  the circuit is arithmetic. For example, `Mul` has degree 2.
+  to be the degree of the polynomial that computes it. This exists
+  because the circuit is arithmetic. For example, `Mul` has degree 2.
 
-* `gadget.eval(field: type[F], inp: list[F]) -> F` evaluates the gadget circuit
-  over the given inputs and field.
+* `gadget.eval(field: type[F], inp: list[F]) -> F` evaluates the gadget over
+  the given inputs and field.
 
 * `gadget.eval_poly(field: type[F], inp_poly: list[list[F]]) -> list[F]` is the
   same as `eval()` except it evaluates the circuit over the polynomial ring of
-  the field. This is well defined since the circuit is arithmetic.
+  the field. This is well defined because the circuit is arithmetic.
 
 In addition to the list of gadgets, the validity circuit specifies how many
 times each gadget is called (`GADGET_CALLS`). It also specifies the length of
-the circuit's input (`MEAS_LEN`), the length of its random input
+the circuit's input (`MEAS_LEN`), the length of the joint randomness
 (`JOINT_RAND_LEN`), and the length of the circuit's output (`EVAL_OUTPUT_LEN`).
 
 A validity circuit also specifies parameters and methods needed for Prio3
@@ -3562,8 +3537,8 @@ def proof_len(self) -> int:
     """Length of the proof."""
     length = 0
     for (g, g_calls) in zip(self.GADGETS, self.GADGET_CALLS):
-        P = next_power_of_2(1 + g_calls)
-        length += g.ARITY + g.DEGREE * (P - 1) + 1
+        p = next_power_of_2(1 + g_calls)
+        length += g.ARITY + g.DEGREE * (p - 1) + 1
     return length
 
 def verifier_len(self) -> int:
@@ -3627,8 +3602,8 @@ def prove(self,
         # Compute the wire polynomials for this gadget. For each `j`,
         # find the lowest degree polynomial `wire_poly` for which
         # `wire_poly(alpha^k) = g.wires[j][k]` for all `k`. Note that
-        # each `g.wires[j][0]` is set to seed of wire `j`, which is
-        # included in the prove randomness.
+        # each `g.wires[j][0]` is set to the seed of wire `j`, which
+        # is included in the prove randomness.
         #
         # Implementation note: `alpha` is a root of unity, which
         # means `poly_interp()` can be evaluated using the NTT. Note
@@ -3686,17 +3661,18 @@ To start a gadget test, we first construct the (shares of the) wire polynomials
 just as the prover did. First, we record `g.wires[j][k]` as the input (share)
 of the `j`-th wire of the `k`-th invocation of the gadget. Again, this is
 accomplished by a wrapper gadget, `QueryGadget`, listed in {{gadget-wrappers}}.
-This gadget also evaluates the gadget polynomial for each gadget invocation in order to produce the gadget's output. Then we
-compute the wire polynomials from the recorded values.
+This gadget also evaluates the gadget polynomial for each gadget invocation in
+order to produce the gadget's output. Then we compute the wire polynomials from
+the recorded values.
 
 Next, we choose a random point `t` (parsed from the query randomness), evaluate
 each wire polynomial at `t`, and evaluate the gadget polynomial at `t`. The
 results are recorded in the verifier message passed to the decision algorithm,
 where we finish the test.
 
-The random point `t` MUST NOT be one of the fixed evaluation points used to interpolate the wire
-polynomials. Otherwise, the verifier message may partially leak the encoded
-measurement.
+The random point `t` MUST NOT be one of the fixed evaluation points used to
+interpolate the wire polynomials. Otherwise, the verifier message may partially
+leak the encoded measurement.
 
 ~~~ python
 def query(self,
@@ -3781,7 +3757,7 @@ output will not equal the gadget check with high probability.
 ~~~ python
 def decide(self, verifier: list[F]) -> bool:
     # Check the output of the validity circuit.
-    v, verifier = verifier[0], verifier[1:]
+    ([v], verifier) = front(1, verifier)
     if v != self.field(0):
         return False
 
@@ -3799,9 +3775,9 @@ def decide(self, verifier: list[F]) -> bool:
 
 This section specifies instantiations of Prio3 for various aggregation tasks.
 Each variant is determined by a field ({{field}}), a validity circuit
-({{flp-bbcggi19-valid}}), an XOF ({{xof}}), and the number of proofs to
-generate and verify. Test vectors for each can be found in {{test-vectors}}.
-All gadgets are listed in {{gadgets}}.
+({{flp-bbcggi19-valid}}), and the number of proofs to generate and verify. All
+gadgets are listed in {{gadgets}}. Test vectors for each can be found in
+{{test-vectors}}.
 
 ### Prio3Count
 
@@ -3816,7 +3792,7 @@ Our first variant of Prio3 is for a simple counter: each measurement is either
 one or zero and the aggregate result is the sum of the measurements. Its
 validity circuit uses the multiplication gadget `Mul` specified in
 {{gadget-mul}}, which takes two inputs and multiplies them. The circuit is
-specified below.
+specified below:
 
 ~~~ python
 class Count(Valid[int, int, F]):
@@ -3875,22 +3851,22 @@ two encoded integers are consistent. Let
 * `offset = 2^bits - 1 - max_measurement`
 
 The first bit-encoded integer is the measurement itself. Note that only
-measurements between `0` and `2^bits - 1` can be encoded this way with `bits`
-many bits. The second bit-encoded integer is the sum of the measurement and
+measurements between `0` and `2^bits - 1` can be encoded this way with `bits`
+bits. The second bit-encoded integer is the sum of the measurement and
 `offset`. Observe that this sum can only be encoded this way if it is between
 `0` and `2^bits - 1`, which implies that the measurement is between `-offset`
 and `max_measurement`.
 
 The circuit first checks that each entry of both bit vectors is a one or a
 zero. It then decodes both the measurement and the offset measurement, and
-subtracts `offset` from the latter. It then checks if these two values are
+subtracts the offset from the latter. It then checks if these two values are
 equal. Since both the measurement and the measurement plus `offset` are in the
-same range of `[0, 2^bits)`, this means that the measurement itself is
-between `0` and `max_measurement`.
+same range of `[0, 2^bits)`, this means that the measurement itself is between
+`0` and `max_measurement`.
 
 The circuit uses the polynomial-evaluation gadget `PolyEval` specified in
 {{gadget-poly-eval}}. The polynomial is `p(x) = x^2 - x`, which is equal to `0`
-if and only if `x` is in `[0, 2)`. The complete circuit is specified below.
+if and only if `x` is in `[0, 2)`. The complete circuit is specified below:
 
 ~~~
 class Sum(Valid[int, int, F]):
@@ -3968,10 +3944,10 @@ consecutively, in LSB to MSB order.
 The validity circuit uses the `ParallelSum` gadget in {{gadget-parallel-sum}}.
 This gadget applies an arithmetic subcircuit to multiple inputs in parallel,
 then returns the sum of the results. Along with the subcircuit, the
-parallel-sum gadget is parameterized by an integer `count` specifying how many
-times to call the subcircuit. It takes in a list of inputs and passes them
-through to instances of the subcircuit in the same order. It returns the sum of
-the subcircuit outputs.
+parallel-sum gadget is parameterized by an integer, denoted `count`, specifying
+how many times to call the subcircuit. It takes in a list of inputs and passes
+them through to instances of the subcircuit in the same order. It returns the
+sum of the subcircuit outputs.
 
 Note that only the `ParallelSum` gadget itself, and not its subcircuit,
 participates in the FLP's wire recording during evaluation, gadget consistency
@@ -3989,7 +3965,7 @@ vector, and the other is equal to the same measurement element minus one. These
 `Mul` subcircuits are evaluated by a `ParallelSum` gadget, and the results are
 added up both within the `ParallelSum` gadget and after it.
 
-The complete circuit is specified below.
+The complete circuit is specified below:
 
 ~~~ python
 class SumVec(Valid[list[int], list[int], F]):
@@ -4080,13 +4056,18 @@ The `chunk_length` parameter provides a trade-off between the arity of the
 gadget is called. The proof length is asymptotically minimized when the chunk
 length is near the square root of the length of the measurement. However, the
 relationship between VDAF parameters and proof length is complicated, involving
-two forms of rounding (the circuit pads the inputs to its last `ParallelSum`
-gadget call, up to the chunk length, and proof system rounds the degree of wire
-polynomials -- determined by the number of times a gadget is called -- up to
-the next power of two). Therefore, the optimal choice of `chunk_length` for a
-concrete measurement size will vary, and must be found through trial and error.
-Setting `chunk_length` equal to the square root of the appropriate measurement
-length will result in proofs up to 50% larger than the optimal proof size.
+two forms of rounding:
+
+  * The circuit pads the inputs to its last `ParallelSum` gadget call, up to
+    the chunk length.
+
+  * The proof system rounds the degree of wire polynomials, determined by the
+    number of times a gadget is called, up to the next power of two.
+
+Therefore, the optimal choice of `chunk_length` for a concrete measurement size
+will vary, and must be found through trial and error. Setting `chunk_length`
+equal to the square root of the appropriate measurement length will result in
+proofs up to 50% larger than the optimal proof size.
 
 ### Prio3Histogram
 
@@ -4216,10 +4197,10 @@ class Histogram(Valid[int, list[int], F]):
 
 For this instance of Prio3, each measurement is a vector of ones and zeros,
 where the number of ones is bounded. This provides a functionality similar to
-Prio3Histogram except that more than one entry may be non-zero. This allows
-Prio3MultihotCountVec to be composed with a randomized response mechanism, like
-{{EPK14}}, for providing differential privacy. (For example, each Client would
-set each entry to one with some small probability.)
+Prio3Histogram except that more than one entry (or none at all) may be
+non-zero. This allows Prio3MultihotCountVec to be composed with a randomized
+response mechanism, like {{EPK14}}, for providing differential privacy. (For
+example, each Client would set each entry to one with some small probability.)
 
 The validity circuit is denoted `MultihotCountVec` and has three parameters:
 `length`, the number of entries in the count vector; `max_weight`, the
diff --git a/poc/vdaf_poc/flp_bbcggi19.py b/poc/vdaf_poc/flp_bbcggi19.py
index 8dc7ead6..2a4b9d3e 100644
--- a/poc/vdaf_poc/flp_bbcggi19.py
+++ b/poc/vdaf_poc/flp_bbcggi19.py
@@ -108,8 +108,8 @@ def proof_len(self) -> int:
         """Length of the proof."""
         length = 0
         for (g, g_calls) in zip(self.GADGETS, self.GADGET_CALLS):
-            P = next_power_of_2(1 + g_calls)
-            length += g.ARITY + g.DEGREE * (P - 1) + 1
+            p = next_power_of_2(1 + g_calls)
+            length += g.ARITY + g.DEGREE * (p - 1) + 1
         return length
 
     def verifier_len(self) -> int:
@@ -353,8 +353,8 @@ def prove(self,
             # Compute the wire polynomials for this gadget. For each `j`,
             # find the lowest degree polynomial `wire_poly` for which
             # `wire_poly(alpha^k) = g.wires[j][k]` for all `k`. Note that
-            # each `g.wires[j][0]` is set to seed of wire `j`, which is
-            # included in the prove randomness.
+            # each `g.wires[j][0]` is set to the seed of wire `j`, which
+            # is included in the prove randomness.
             #
             # Implementation note: `alpha` is a root of unity, which
             # means `poly_interp()` can be evaluated using the NTT. Note
@@ -444,7 +444,7 @@ def decide(self, verifier: list[F]) -> bool:
         assert len(verifier) == self.VERIFIER_LEN  # REMOVE ME
 
         # Check the output of the validity circuit.
-        v, verifier = verifier[0], verifier[1:]
+        ([v], verifier) = front(1, verifier)
         if v != self.field(0):
             return False
 
diff --git a/poc/vdaf_poc/vdaf_prio3.py b/poc/vdaf_poc/vdaf_prio3.py
index 66605b9b..bbfac282 100644
--- a/poc/vdaf_poc/vdaf_prio3.py
+++ b/poc/vdaf_poc/vdaf_prio3.py
@@ -122,10 +122,6 @@ def is_valid(
             self,
             _agg_param: None,
             previous_agg_params: list[None]) -> bool:
-        """
-        Checks if `previous_agg_params` is empty, as input shares in
-        Prio3 may only be used once.
-        """
         return len(previous_agg_params) == 0
 
     # NOTE: The prep_init(), prep_next(), and prep_shares_to_prep()
@@ -164,7 +160,7 @@ def prep_init(
             joint_rands = self.joint_rands(
                 ctx, corrected_joint_rand_seed)
 
-        # Query the measurement and proof share.
+        # Query the measurement and proof(s) share(s).
         query_rands = self.query_rands(verify_key, ctx, nonce)
         verifiers_share = []
         for _ in range(self.PROOFS):
@@ -208,7 +204,7 @@ def prep_shares_to_prep(
             ctx: bytes,
             _agg_param: None,
             prep_shares: list[Prio3PrepShare[F]]) -> Optional[bytes]:
-        # Unshard the verifier shares into the verifier message.
+        # Unshard each set of verifier shares into each verifier message.
         verifiers = self.flp.field.zeros(
             self.flp.VERIFIER_LEN * self.PROOFS)
         joint_rand_parts = []
@@ -218,7 +214,7 @@ def prep_shares_to_prep(
                 assert joint_rand_part is not None
                 joint_rand_parts.append(joint_rand_part)
 
-        # Verify that each proof is well-formed and input is valid
+        # Verify that each proof is well-formed and input is valid.
         for _ in range(self.PROOFS):
             verifier, verifiers = front(self.flp.VERIFIER_LEN, verifiers)
             if not self.flp.decide(verifier):
@@ -311,7 +307,7 @@ def shard_without_joint_rand(
             )
 
         # Each Aggregator's input share contains its measurement share
-        # and share of proof(s).
+        # and its share of the proof(s).
         input_shares: list[Prio3InputShare[F]] = []
         input_shares.append((
             leader_meas_share,
@@ -364,7 +360,7 @@ def shard_with_joint_rand(
         joint_rand_parts.insert(0, self.joint_rand_part(
             ctx, 0, leader_blind, leader_meas_share, nonce))
 
-        # Generate the proof and shard it into proof shares.
+        # Generate each proof and shard it into proof shares.
         prove_rands = self.prove_rands(ctx, prove_seed)
         joint_rands = self.joint_rands(
             ctx, self.joint_rand_seed(ctx, joint_rand_parts))