Zero knowlege proof of compositeness

110 points by ColinWright a day ago

tooltower 21 hours ago

Are we sure that the base reveals nothing about the factors if n is composite? I have never seen a proof of that.

Usually, zero knowledge proofs also require a prover who knows the answer (the factors in this case). This is just a primality test that can be performed locally.

madars 21 hours ago

That's a very good question. It all depends on how you pick the witness b: there is a procedure that definitely is not zero-knowledge: say, if prover uses his knowledge of factorization to construct an explicit b that betrays that factorization.
For example, if n = p1*p2*...*pk is square-free and not a Carmichael number, then by Korselt's criterion there exists a pi such that pi-1 does not divide n-1 (this also implies that pi>2). Use the Chinese Remainder Theorem to produce b such that b=1 (mod pj) for all j!=i, and b (mod pi) is a generator of (Z/piZ)^*. Then b is a Fermat witness: gcd(b, n) = 1 (because b is non-zero modulo every prime factor) and b^(n-1) != 1 (mod n) because b^(n-1) != 1 (mod pi) (as pi-1 does not divide n-1).
However, b "betrays" the prime factorization of n, since gcd(b-1, n)>1 (by construction b-1 is divisible by all pj with j!=i, but not divisible by pi>2), and thus gcd(b-1, n) is a non-trivial factor of n. (I assumed square-free above but if pi^ei (ei>=2) divides n, then b=1+pi^(ei-1) (mod pi^ei), b=1 (mod pj^ej) (j!=i) also would have worked.)
On the other hand, it is also known that for non-Carmichael numbers at least half of the bases b with gcd(b, n) = 1 are Fermat witnesses. So if you pick b uniformly at random, the verifier does not gain any new information from seeing b: they could have sampled such a witness themselves by running the same random test. Put another way, the Fermat test itself is an OK ingredient, but a prover who chooses b in a factorization-dependent way can absolutely leak the factors - the final protocol won't be ZK.
phkahler 19 hours ago

On a related note, I've often wondered what each congruence in the quadratic seive reveals. Once you have enough of them you can factor the number, but what does a partial set of congruences reveal?
Its a matrix problem, so each row could be reducing the degrees of freedom of something. But what? And in what space?
- arcastroe 17 hours ago
  
  If:
  a^2 = b^2 (mod m)
  Then:
  a^2 - b^2 = 0 (mod m)
  (a + b)(a - b) = 0 (mod m)
  So, (a + b) must be a multiple of one of the factors of m. And (a - b) must be a multiple of the other factor of m.
  > I've often wondered what each congruence in the quadratic seive reveals.
  Each congruence reveals that the sum of the bases (a plus b) contains a factor of m. And the difference of the bases (a minus b) contains another factor of m.
  The only thing you have to watch out for is the trivial case when one of the factors you find through this method is "1" and the other factor is "m". That case isn't very helpful.
  It's not that each congruence gives you new information. You only have to find one single non-trivial congruence. But the other (trivial) congruences you find along the way only reveal that 1*m=m, which you already knew.
  - dmurray 16 hours ago
    
    > It's not that each congruence gives you new information
    So it's not that each congruence gives you N bits of information, and you want kN bits in total. It's more like each congruence has a 1/k chance of giving you the full kN bits.
    But in some information theory sense those are the same! Or concretely, if you were testing a large quantity of numbers in parallel, you would get information from each congruence.
    
    arcastroe 16 hours ago
    
    I suppose you're correct. Even if you find a trivial congruence, you do get some information. Mainly: "It's not that one!" :)
    The same information as trying two bases that don't form a quadratic congruence at all
    
    thaumasiotes 11 hours ago
    
    Well, if a+b and a-b are difficult to factor, finding more congruences might give you another pair x+y, x-y, and if that pair is nontrivial but also difficult to factor, you may be able to make progress by looking at x+y / a+b and x+y / a-b. As long as they're not integer multiples of each other, the combination of the two facts is more informative than either fact alone, in the sense that you're now trying to factor a smaller number.
  - phkahler 10 hours ago
    
    No, not that congruence. I mean a row in the big matrix. A residue r = x^2 mod m factored over the factor base. When you get enough of these you can derive a^2 - b^2 = 0 mod m. What does each factored r provide prior to having enough of them?
ColinWright 21 hours ago

My understanding is that there is a difference between the concept of a Zero-Knowledge Proof (ZKP), and then the applications that such a thing is possible.
In the example given, I can prove that N is composite without revealing anything (well, almost anything) about the factors. But in practice we want to use a ZKP to show that I have specific knowledge without revealing the knowledge itself.
For example:
You can give me a graph, and I can claim that I can three-colour it. You may doubt this, but there is a process by which I can ... to any desired level of confidence ... demonstrate that I have a colouring, without revealing what the colouring is. I colour the vertices RGB, map those colours randomly to ABC, and cover all the vertices. You choose any edge, and I reveal the "colours" (from ABC) of the endpoints. If I really can colour the graph then I will always be able to reveal two different colours. If I can't colour the graph then as we do this more and more, eventually I will fail.
So you are right, but the message of the post is, I think, still useful and relevant.
- hansvm 14 hours ago
  
  Suppose the graph admits only 4+ colorings, but when attempting a 3-coloring it's possible for only one edge to be misaligned. Then (A) you need O(n_edges) calls to the oracle to gain any confidence about the 3-colorability of a 3-colorable graph (else you might be easily duped by the one misaligned edge), and (B) in so doing, you learn almost all of the structure of the graph (since you have way more random calls than there are edges).
  Restating, not only is the ZK algorithm slow, but by the time you have confidence in the ZK proof you also have additional knowledge about the structure whose properties you're proving.
- mathgradthrow 19 hours ago
  
  can you explain this a little better?
  - ColinWright 19 hours ago
    
    I can certainly explain it more, a question of "better" is debatable!
    Here's the process:
    (A) You give me a graph to 3-colour;
    (B) I claim I can 3-colour it;
    (C) You demand that I prove it;
    (D) I colour it with colours ABC and cover the vertices;
    (E) You point at an edge;
    (F) I reveal the colours of the vertices at the ends of the edge;
    (G) If I have coloured the graph then the colours revealed will always be different;
    (H) We repeat this process with a permutation of the colours between each trial;
    (I) If I'm lying then eventually you'll pick an edge where either the vertices are not coloured, or the have the same colour.
    (J) This process reveals nothing about the colouring, but proves (to some level of confidence) that I'm telling the truth.
    So ... what's unclear?
    Instructions on how to email me are in my profile if you prefer ...
    
    kadoban 18 hours ago
    
    How do I know/prove that you're not just saying any random two colors for whichever edge I choose?
    
    ColinWright 18 hours ago
    
    The version I'm describing has it physically sitting in front of you at the time, so you can see that the colours haven't been changed "on the fly" after you pick an edge. In this version:
    (A) I colour it;
    (B) I cover the vertices so you can't see any of them, but I can no longer change them;
    (C) You choose the edge, and I reveal the endpoints.
    Converting this to a digital version requires further work ... my intent here was to explain the underlying idea that I can prove (to some degree of confidence) that I have a colouring without revealing anything about it.
    So just off the top of my head, for example, I can, for each vertex, create a completely random string that starts with "R", "G", or "B" depending on the colour of the vertex. Then I hash each of those, and send you all of them. You choose an edge and send me back the two hashes for the endpoints, and I provide the associated random strings so you can check that the hashes match.
    
    lordnacho 17 hours ago
    
    This reminds me of the "Where's Waldo (Wally in UK)" example:
    You can prove that you found Wally with a large piece of paper with a hole in it. You move the hole over Wally, and the person you're sitting with can see you found it, but he's no wiser about where.
    
    FridgeSeal 16 hours ago
    
    Another way is to get them to put marks/signatures over the back of the blank. Overlay to e blank, and cut Wally out of it where he occurs on the actual page and give them the cutout.
    
    fragmede 17 hours ago
    
    https://youtu.be/5qzNe1hk0oY for a video if you can't picture that.
  - fragmede 19 hours ago
    
    The key insight is that Colin can show you a red-green-blue coloring of the graph, and flip the whole graph secretly, so it's blue-red-green instead when you look at an individual section, but really the graph is yellow-pink-orange colored. Even after showing you all the intersections of the graph individually in the red green blue coloring to satisfy that he can 3-color it, you still have no idea what is yellow pink or orange on his copy of the graph.
schoen 21 hours ago

We also don't technically have proofs for some of the computational hardness assumptions that popular "real" ZK proof constructions rely on!
This might feel different because those assumptions were chosen in part because people had studied them and they certainly seem to be right, whereas perhaps here nobody has really studied this particular random number theory topic one way or the other.
But in some sense, there isn't a proof that regular ZK proof methods are actually completely zero-knowledge (against a computationally bounded adversary).

jeremysalwen 18 hours ago

To be honest I feel like I have seen much better expositions of zero knowledge proofs. The playing cards example is nice in some ways, but people are often exposed to trickery regarding playing cards. The recipient of the proof needs to verify that the deck of cards is a normal deck of cards, that no cards have been swapped out or altered, etc. These are all precisely the things that magicians are regularly able to fool people about. So really you have to make an additional assumption of "no funny business", which distracts from the mathematical core of what you are trying to demonstrate.

Likewise, the example of compositeness is a bit off because even though there is knowledge about the composite number that the proof does not reveal, that knowledge is in fact not known the to person constructing the proof either! The proof is not really zero knowledge either, since it gives the reader knowledge of a specific witness to its compositeness.

Even the wikipedia example of going into the cave (which used to be featured more prominently in the article) I think is terrible. Why wouldn't you just walk a loop to prove you know the way through the secret door? Also, it's clearly not zero knowledge, as it reveals some information about how quickly they can pass through the gate.

In general I think avoiding physical examples is necessary, since reality is complicated, and in the real world some information always leaks.

I think the best example for teaching about ZKPs is the graph isomorphism problem: Given two large graphs, you can prove that you know a isomorphism between two graphs by generating a new randomly labeled graph that is isomorphic to both of them and showing it to the provee, who can then ask you to demonstrate that this new graph is isomorphic to either graph A or graph B. Since you don't know ahead of time which one they will ask for, the only way you could consistently pass this test is if you actually do have a graph that was isomorphic to both A and B simultaneously. But since you only reveal one of the isomorphisms, it really is zero knowledge.

andreareina 18 hours ago

My conclusion from watching a lot of Penn and Teller is that when you're invited to examine the deck it probably is normal and often the trick will involve a force.

petermcneeley 20 hours ago

A much smaller simpler example would have been useful for us mere mortals.

Ok 6 is not a prime. 5=b is not a multiple of 6

5^(6-1) = 3125 mod 6 = 5 which is not 1. Therefore 6 cannot be prime.

jstanley 19 hours ago

> A zero knowledge proof (ZKP) answers a question without revealing anything more than answer. For example, a digital signature proves your possession of a private key without revealing that key.

I don't think a digital signature is a Zero-Knowledge Proof because someone else could copy and paste the signature and then it would look like they know the key, and because other third parties could check whether the signature was valid or not.

To be a true Zero-Knowledge Proof it needs to:

* show that you know the thing without revealing the thing

* not allow other people to copy your answer

* not allow anyone other than your intended counterparty to even verify the answer

amluto 16 hours ago

I think you can make a nice zero-knowledge interactive protocol to prove knowledge of an RSA secret key. First, the prover and the verifier jointly agree on a random number m between 1 and n-1. [0] Then the verifier signs that number, and tells the prover the signature. (The signature is d^m mod n.) The verifier verifies the signature, which, critically, is exactly the same as verifying that the signature encrypts to m.
Why is this zero-knowledge? Because the verifier could invent an entire transcript of the protocol without the prover’s help: choose a random signature and encrypt it to generate the “random message”. So the ability to work with the prover to generate random pairs of (message, signature) accomplishes nothing at all except to convince the verifier that the prover knows the secret key.
This, by the way, is one of many footguns involved in using raw RSA: you cannot assume that a private key was used properly just because someone presents the signature of some message. Better signature schemes built on top of RSA avoid this problem.
[0] This is fairly straightforward using cryptographic hashes. The verifier could instead choose freely, but then the protocol isn’t zero-knowledge.
phkahler 19 hours ago

>> I don't think a digital signature is a Zero-Knowledge Proof because someone else could copy and paste the signature and then it would look like they know the key, and because other third parties could check whether the signature was valid or not.
One of us is confused. You can't copy a digital signature in a useful way. Without the message it doesnt mean anything. With the message its proof that the message was signed by someone with the private key.
To meet your second two (arbitrary) requirements, have the signer encrypt the signed message with your public key before sending it to you.
- jstanley 18 hours ago
  
  They're not my arbitrary requirements, see https://en.wikipedia.org/wiki/Zero-knowledge_proof
  Specifically:
  > In light of the fact that one should be able to generate a proof of some statement only when in possession of certain secret information connected to the statement, the verifier, even after having become convinced of the statement's truth by means of a zero-knowledge proof, should nonetheless remain unable to prove the statement to further third parties.
  - pastel8739 11 hours ago
    
    I’m not sure that requirement is violated here; the interactive nature of a challenge-response protocol is required to prove that someone knows the private key. Without an interactive process, the prover could have just found the signatures lying around somewhere and reused them without knowing the private key at all. This means that the verifier would not be able to prove anything beyond “the private key X signed these messages”.
- pastel8739 19 hours ago
  
  I think it’s the original quote that is unclear:
  > a digital signature proves your possession of a private key without revealing that key.
  Signatures do not themselves do this; but they can be used to construct a protocol that does (e.g. the provee provides a random challenge that the prover must sign). But still this is not AFAIU a zero-knowledge proof as the signature is itself “knowledge”.
  - drdeca 18 hours ago
    
    I think a definition of the security of a signature scheme is that a computationally limited attacker should not have a non-negligibly better than chance guess of the secret key.
    I think some of the “ZKP” techniques are supposed to only be “ZK” for a computationally limited observer? Though I may be mistaken, and maybe non-interactive ZKP schemes are only assuming that the prover has limited computational resources, not that the observer/attacker hoping to get information from them does?
    
    pastel8739 12 hours ago
    
    I know very little about ZKPs, but it does indeed sound like there is a notion of “computational zero knowledge”. I don’t know whether digital signatures would meet that definition or not, or if it’s algorithm-dependent.
pastel8739 19 hours ago

I think even aside from that (which can be solved with challenge-response) digital signatures are typically not ZKPs because the signature itself constitutes information that must be transferred during the proof.

spir 11 hours ago

If you like zero knowledge proofs, check out https://ethproofs.org/blocks