owned this note
owned this note
Published
Linked with GitHub
# BBS+ Improvement Proposal
There is quite a bit of intereset in the BBS+ signature scheme as implemented in Hyperledger Ursa. A few forks have been made in Golang and C. As it stands today BBS+ is implemented as follows:
1- It uses the Hash to curve IETF [draft 11](https://datatracker.ietf.org/doc/draft-irtf-cfrg-hash-to-curve/) with BLS12381G1\_XMD:BLAKE2B\_SSWU\_RO\_BBS+\_SIGNATURES:1_0_0 as the domain separation tag.
1- It uses Blake2b as its hashing function for hash to curve
1- Generating secret keys happens identically to [BLS IETF draft vs](https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-bls-signature-04) but uses `BBS-SIG-KEYGEN-SALT-` as the salt and HKDF with Blake2b instead of SHA256.
1- Hashing to a field uses Blake2b with a 48 byte output.
1- Selective disclosure proofs solve the equations in section 4.5 from the [CDL](https://eprint.iacr.org/2016/663.pdf) which requires sending two commitments in G1 and 4+ field elements.
1- The signature runs on top of the BLS12-381 curve
1- The public keys come in two forms: short with just a commitment to the secret key in G2, and full which includes generators for the messages that will be signed
1- The signature generates two random 256-bit nonces in the signature - `e` and `s`
While none of these choices is inherently wrong, there are some changes that can be made to enhance the protocol and it's efficiency.
This proposal contains the improvements and enhancements with an explaination of the change.
## Change the hashing function from Blake2b to Shake256
Shake256 an extensible output function (XOF) and a member of the SHA3 family which has been accepted by NIST. The 256-bits in the name is equivalent to 256-bits of post quantum security. The output size can practically be any value less than 8KB. This moves the BBS+ scheme one step closer to NIST and FIPS compliance. Some performance will be lost when compared to Blake2b, however, it will be minimal since the hash function is not the most commonly used operation. It is also used by [Ed448](https://tools.ietf.org/html/rfc8032).
### Notation
| Parameter | Name | Value |
| -------- | -------- | -------- |
| Field Modulus | p | 0x1a0111ea397fe69a4b1ba7b6434bacd764774b84f38512bf6730d2a0f6b0f6241eabfffeb153ffffb9feffffffffaaab |
| Subgroup Size | q | 0x73eda753299d7d483339d80809a1d80553bda402fffe5bfeffffffff00000001 |
| Byte concatenation | \|\| | a \|\| b = ab |
| G1 Base Point | P | x=0x120177419e0bfb75edce6ecc21dbf440f0ae6acdf3d0e747154f95c7143ba1c17817fc679976fff55cb38790fd530c16 y=0x0bbc3efc5008a26a0e1c8c3fad0059c051ac582950405194dd595f13570725ce8c22631a7918fd8ebaac93d50ce72271 |
| G2 Base Point | <span style="text-decoration:overline">P</span> | x=0x058191924350bcd76f67b7631863366b9894999d1a3caee9a1a893b53e2ae580b3f5fb2687b4961af5f28fa202940a1011922a097360edf3c2b6ed0ef21585471b1ab6cc8541b3673bb17e18e2867806aaa0c59dbccd60c3a5a9c0759e23f606 y=0x0083fd8e7e80dae507d3a975f0ef25a2bbefb5e96e0d495fe7e6856caa0a635a597cfa1f5e369c5a4c730af860494c4a0b2bc2a163de1bf2e7175850a43ccaed79495c4ec93da33a86adac6a3be4eba018aa270a2b1461dcadc0fc92df64b05d |
| Point at infinity in G1 | 1<sub>G1</sub> | |
| Point at infinity in G2 | 1<sub>G2</sub> | |
| Point at infinity in GT | 1<sub>GT</sub> | |
| Ate Type-3 pairing | e() | |
### Hash arbitrary length sequence to cryptographic value
Strings are not directly signed by cryptography, usually the hash of a value is. However, simply hashing a string then reducing by the modulus to the same size as a field element often introduces bias--not uniformly random. The modulo bias should be corrected by using the following method.
This method should also be used when hashing to compute the fiat-shamir heuristic since the SHA3 familty does not suffer from length extension attacks.
`dst` a domain separation tag. This value is optional but should be used to distinguish between contexts like signing or creating a proof.
`value` an variable length byte sequence input
`L` the number of bytes to output which should be calculated as ceil((ceil(log2\(p\)) + k) / 8), where k is the security parameter of the suite (e.g., k = 256). For BLS12-381, L = 64
```
hash_to_field_element(value, dst, q):
tv = SHAKE256(value || dst, L)
return tv mod q
```
One method to achieve this is done by ZCash by decomposing it into two 256-bit digits with the higher bits multiplied by 2<sup>256</sup>. Thus, performing two reductions
1. the lower bits are multiplied by `R`<sup>2</sup>, as normal
2. the upper bits are multiplied by `R`<sup>2</sup> * 2<sup>256</sup> = `R`<sup>3</sup>
and computing their sum in the field. It remains to see that arbitrary 256-bit numbers can be placed into Montgomery form safely using the reduction. The reduction works so long as the product is less than R multiplied by the modulus. This holds because for any `c` smaller than the modulus, we have that (2<sup>256</sup> - 1)\*c is an acceptable product for the reduction. Therefore, the reduction always works so long as `c` is in the field; in this case it is either the constant `R2` or `R3` where `R = 2`<sup>256</sup>`mod p`, `R`<sup>2</sup>`= 2`<sup>512</sup>`mod p`, `R`<sup>3</sup>` = 2`<sup>768</sup>`mod p`.
Another approach is to use F = 2<sup>192</sup> mod `q` and compute from `L` bytes. The first half in big endian are converted to a scalar `d0`, the last half in big endian are convert to a scalar `d1`. The resulting value is computed as `d0` + `F` * `d1`.
## Change selective disclosure proofs to just be Schnorr Proofs and Fiat-Shamir
Currently the BBS+ selective disclosure proof has three subproofs: Proof1 and Proof2. Proof1 is fixed in size and links the signature to the hidden messages.
- Proof1
- Point (48 bytes) T1 = e' A' + r<sub>2</sub>'' H<sub>0</sub>
- Scalar (32 bytes) e^= e' + ce
- Scalar (32 bytes) r<sub>2</sub>^ = r<sub>2</sub>' + cr<sub>2</sub>
Proof 2 includes a proof for each hidden message that is not disclosed to a verifier and varies in size depending on each hidden message.
- Proof2
- Point (48 bytes) T2 = -r<sub>3</sub>' D + s' H<sub>0</sub> + m<sub>i</sub>'H<sub>i</sub>
- Scalar (32 bytes) r<sub>3</sub>^
- Scalar (32 bytes) s'^
- Scalar (32 bytes) m<sub>i</sub>^
Proof 3
This is the signature proof of knowledge is fixed in size to be 144 bytes.
Verification happens by recomputing the equivalents points of T1 and T2 and comparing to see if they equal the originals sent by the Prover. While not inherently wrong, it is not necessary. The points for T1 and T2 do not need to be sent as part of the proof thus saving 96 bytes of proof space and reducing comparisons from 3 to 1 and three scalar multiplies.
Instead, verification can be done be simply recomputing the fiat shamir challenge sent by the prover and doing a single check as is common is many Schnorr based proofs.
In summary, instead of computing `T1 == vT1`, `T2 == vT2`, `c == vc`, it can be simplified to just c == vc. (Note: the `v` prefix is for verifier).
This allows the proofs to just be (4 + hidden message cnt) * 32 bytes. For example, a proof with 5 hidden messages yields a proof size 432 bytes instead of 528.
## Deterministically compute `e` and `s`
Random signatures have been criticized rightly so due to the inherint weakness of random number generators and the ability to predict them leaks the secret key as in ECDSA. While this weakness does not currently exist in BBS+, it is easy to change this so signing generatings deterministic nonces similar to [RFC6979](https://tools.ietf.org/html/rfc6979).
`e` then followed by `s` can be generated determinstically in a secure manner by computing the following:
```
xof = SHAKE256(sk || H0 || H1 ... || Hn || m1 || m2 ... || mn)
e = xof.Read(64) mod q
s = xof.Read(64) mod q
```
This computation is the byte concatenation of the secret key, all message generators in uncompressed form, and all the messages to be signed using SHAKE256. The resulting XOF is then used to read a 64 byte output for both `e` and `s` which are then reduced by the field modulus as described [earlier](https://hackmd.io/Q587Q9p7T5ab30NTn4MvTA#Hash-arbitrary-length-sequence-to-cryptographic-value).
## No two different public keys
This proposals now says the short form of the public key is now the only public key going forward. Instead of the message generators being tied to the public key, they should be known by what they really are, message generators. Because BLS12-381 does not have a prime order subgroup, care should be given to how the generators are created since not every generator is equal. For example, generating a random scalar and multiplying by the base point should not be done as explained in the hash to curve [IETF spec](https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-hash-to-curve-11#appendix-A).
Message generators should be in the G1 group. The G2 group can work but will be much slower, but may be useful in situations where G2 is required.
Message generators can be created randomly one time and reused but will need to be published alongside the public key.
Message generators include an extra generator for the BBS+ blinding factor `s`. They are labeled as H<sub>0</sub> for the blinding factor `s`, H<sub>1</sub> for the first message to be signed, H<sub>2</sub> for the second and so on.
## Data Encoding
Data can be encoded multiple ways depending on how it will be used with other ZKPs. For example, it will not be useful to hash an integer to a 32 byte value using SHA256 and expect it to work properly with Range Proofs. The following is a list of encodings and the use cases they can be used:
1. **Hashed**: for aribitrary length data fields that will not fit in the base field e.g. images, strings, biometrics, blockchain transaction info. Use hash2field or some other method to hash the data into a 32 byte base field value.
2. **Bytes**: a value that is already in the base field
3. **Numbers**: for range proofs. The value is zero-centered by take computing the base field modulus `z`=`q` / 3 integer division as the zero center and adding the positive number or substracting the negative number. To support complex numbers like decimal, the number is converted to fixed point arithmetic. The [Q format](https://en.wikipedia.org/wiki/Q_(number_format)) is used for these numbers as Q64.160. This leaves two bytes to avoid numbers greater than `q`. While it does not represent the full breadth of IEEE754 numbers, it does give a considerable resolution -2<sup>63</sup> to 2<sup>63</sup>-2<sup>-160</sup> signed and 0 to 2<sup>64</sup>-2<sup>-160</sup> unsigned or about 48 decimal places of precision. If decimals are rounded to the nearest integer, then Q format is not necessary.
4. **Null**: Hard coded as 5. Means the value is not included or not used.
5. **Empty**: Hard coded as 11. Means the value is included but is an empty string or zero bytes. For example, a person does not have a middle name but a value is required to be entered.
6. **Ignored**: Hard coded as 23. For data fields that were ignored or not answered by the person. This is different than empty where the person's answer is literally nothing vs they chose not to answer it at all.
Small safe primes were used for the last three but can really be any value that is not `0` or `1`.
# Algorithms
## Keygen
### Use the BLS signature key generation scheme
The proposal here is to drop the separate salt value for BBS+. This allows any BLS key to be used for BBS+. The public key should be in G2 since BBS+ signatures in G1 are much more efficient.
For generating keys see the section 2.3 in the [BLS Signature IETF spec](https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-bls-signature-04#section-2.3).
Outputs:
sk = SecretKey
sk * P = Q = G1 Public Key
sk * <span style="text-decoration:overline">P</span> = <span style="text-decoration:overline">Q</span> = G2 PublicKey
## MessageGenerator
The proposal here is to continue to use the previous algorithm to compute deterministic message generators using hash to curve as defined in the IETF draft with the domain separation tag as `BLS12381G1_XMD:SHA256_SSWU_RO` or `BLS12381G2_XMD:SHA256_SSWU_RO`
H<sub>0</sub> <- H2C(Uncompressed(<span style="text-decoration:overline">Q</span>) || I2OSP(0, 4) || I2OSP(0, 1) || I2OSP(message_count, 4))
H<sub>1</sub> <- H2C(Uncompressed(<span style="text-decoration:overline">Q</span>) || I2OSP(1, 4) || I2OSP(0, 1) || I2OSP(message_count, 4))
...
where H2C is the hash to curve function.
## Message Encoding
Messages must be encoded to a field scalar before it can be signed. The encoding depends on the type of data to be signed. For example, numbers should be encoded in order to support range proofs vs arbitrary length strings must be encoded to a fixed size regardless of their length. Below is the currently recommended encodings where *msg* is the original value to be signed as the input and *m_i* is a field scalar that will be signed as the output.
**Strings**
- H = SHAKE256
- b[64] = SHAKE(msg)
- output b mod q
**Integers**
- o = q/4
- output o + msg
**Floating point**
Round to nearest even integer
**Unique Identifiers**
-
## Sign
The signing algorithm takes as inputs the message generators, the secret key, and all the messages that will be signed.
Algorithm: `sig = Sign(sk, gens, msgs)`
Inputs:
1. `sk` - The BLS secret key.
2. `gens` - Elliptic Curve Points in G1 (preferrably) or in G2 where gens[0] = H<sub>0</sub>, gens[1] = H<sub>1</sub>, etc.
3. `msgs` - Messages to be signed as scalar values.
Outputs:
1. `sig` - The BBS+ signature
Steps:
1. if len(`gens`) < len(`msgs`) return Err
1. if len(msgs) < 0 return Err
1. if `sk` == 0 return Err
1. xof = SHAKE256(`sk` || `gens[0]` || `gens[1]` || ... || `msgs[0]` || `msgs[1]`)
1. e = xof.Read(64) mod q until e != 0
3. s = xof.Read(64) mod q until s != 0
4. B = G<sub>1</sub> + `gens[0]` * s + `gens[1]` * `msgs[0]`+ ... + `gens[n]` * `msgs[n-1]`
5. x = (`sk` + e)<sup>-1</sup> mod q
6. A = x * B
7. sig = {A, e, s}
8. return sig
## Verify
The verify algorithm takes as inputs a public key in the opposite group as the signature, the signature, the message generators, and the signed messages and outputs whether the signature is valid over the generators and messages.
Algorithm: `ISVALID = verify(sig, `<span style="text-decoration:overline">`Q`</span>`, gens`, `msgs`)
Inputs:
1. `sig` - The BBS+ signature
2. <span style="text-decoration:overline">`Q`</span> - The BLS public key
3. `gens` - Elliptic Curve Points in G1 (preferrably) or in G2 where gens[0] = H<sub>0</sub>, gens[1] = H<sub>1</sub>, etc.
4. `msgs` - Messages to are signed as scalar values.
Outputs:
1. `ISVALID` - true if the signature, generators, and messages all match, false otherwise
Steps:
1. if len(`gens`) < len(`msgs`) return False
1. if len(msgs) < 0 return False
1. if <span style="text-decoration:overline">`Q`</span> == 1<sub>G2</sub> return False
1. A, e, s = `sig`
1. if A == 1<sub>G1</sub> return False
1. B = G<sub>1</sub> + `gens[0]` * s + `gens[1]` * `msgs[0]`+ ... + `gens[n]` * `msgs[n-1]`
1. return e(A, <span style="text-decoration:overline">P</span> * e + <span style="text-decoration:overline">Q</span>) * e(B, -<span style="text-decoration:overline">P</span>) == 1<sub>GT</sub>
## Proof Generation
There are two scenarios with BBS+ signatures where zero-knowledge proofs (ZKPs) are used: blind signatures, and signature proofs of knowledge. All ZKPs should be non-interactive and use the [Strobe](https://strobe.sourceforge.io/specs/) framework to compute the fiat-shamir heuristic at the 128-bit level. [Merlin transcripts](https://merlin.cool/) are one method to make this system even more simple to use.
### Blind signatures
Blind signatures allow a potential signature holder to hide all or a subset of messages and the final signature output from the signer.
The protocol requires 3 functions: first the holder hides the desired messages and computes a proof knowledge of the hidden messages, second the signer checks the proof of knowledge and if valid computes a blind signature, and third the holder unblinds the signature.
### Signature proof of knowledge and selective disclosure
Signature proof of knowledge allows a holder to present a proof of a valid signature instead of the signature itself. Selective disclosure proof is a proof knowledge of signed messages instead of revealing messages. None to all messages can be revealed or hidden.