ZK Journey Week 3 — Why Prime? From Group Theory to Real Cryptography

Last week was theory — groups, rings, fields. This week, the question that made everything click: "But WHY does the modulus have to be prime?" One proof, two algorithms, and suddenly the theory had teeth.

✍ 0xTheBlackPanther 📅 Mar 2026 ⏱ 14 min read 🏷 ZK, Primes, ECC, Hash Functions, Signatures
🔐 ZK Journey — Week 3 of 14
A Move security researcher learning zero-knowledge proofs from scratch. Documenting the full journey.

The Question That Started Everything

Week 3 was an office hours session — no slides, no formal lecture, just João answering questions from the cohort. And it turned out to be one of the best sessions so far, because the discussion went from abstract theory straight to real cryptography.

Rinke opened with a deceptively simple question: "How can you prove that when the modulus is prime, all numbers have a multiplicative inverse?"

In Week 2, we just stated this as a fact — prime modulus means every non-zero element has an inverse, which is why prime fields work. But we never proved it. This week, we got the proof. And once you see the proof, everything about why cryptography cares about primes becomes obvious.


Proving Inverses Exist — Two Ways

João showed two different proofs. Both arrive at the same conclusion, but from different directions.

Proof 1: Fermat's Little Theorem

Fermat's Little Theorem says that if p is prime and gcd(a, p) = 1 (which is guaranteed when a is between 1 and p−1, since p is prime), then:

# Fermat's Little Theorem
a^p ≡ a (mod p)

# Divide both sides by a (legal because gcd(a, p) = 1)
a^(p−1) ≡ 1 (mod p)

# Rewrite as:
a^(p−2) × a ≡ 1 (mod p)

# So the inverse of a is a^(p−2)
a^(−1) ≡ a^(p−2) (mod p)

That's it. If p is prime and a ≠ 0, then a^(p−2) mod p is the inverse of a. You don't need to search for it; you can compute it directly. In Python, pow(a, p - 2, p) gives the same result as pow(a, -1, p) (though Python's negative-exponent form computes the inverse via the extended Euclidean algorithm rather than Fermat).
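To see it run, here is a minimal Python sketch of the Fermat route (the helper name `fermat_inverse` is mine, not from the course):

```python
def fermat_inverse(a: int, p: int) -> int:
    # Modular inverse via Fermat's Little Theorem: a^(p-2) mod p.
    # Only valid when p is prime and a is not a multiple of p.
    return pow(a, p - 2, p)

p = 7
for a in range(1, p):
    inv = fermat_inverse(a, p)
    assert (a * inv) % p == 1  # every non-zero element has an inverse
    print(f"{a}^-1 mod {p} = {inv}")
```

Running it prints the full inverse table mod 7: every non-zero element gets exactly one inverse, with no searching.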

Proof 2: Extended Euclidean Algorithm

Bézout's identity says that for any two integers x and p, there exist integers A and B such that gcd(x, p) = x·A + p·B. The Extended Euclidean Algorithm is what computes them:

# Bézout's Identity (from the Extended Euclidean Algorithm)
gcd(x, p) = x·A + p·B

# If gcd(x, p) = 1 (true when p is prime and 0 < x < p):
1 = x·A + p·B

# Take both sides mod p — the p·B term vanishes:
1 ≡ x·A (mod p)

# So A is the multiplicative inverse of x mod p
x^(−1) ≡ A (mod p)

Same conclusion, different route. The Extended Euclidean Algorithm actually finds the coefficients A and B, so it constructively gives you the inverse.
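Here is a sketch of the constructive route, a textbook recursive Extended Euclidean Algorithm (my own toy implementation, not course code):

```python
def extended_gcd(x: int, p: int) -> tuple[int, int, int]:
    # Returns (g, A, B) with g = gcd(x, p) = x*A + p*B (Bézout's identity).
    if p == 0:
        return x, 1, 0
    g, a, b = extended_gcd(p, x % p)
    # gcd(x, p) = gcd(p, x mod p); back-substitute the coefficients
    return g, b, a - (x // p) * b

def inverse(x: int, p: int) -> int:
    g, A, _B = extended_gcd(x, p)
    if g != 1:
        raise ValueError(f"{x} has no inverse mod {p} (gcd = {g})")
    return A % p  # reduce the Bézout coefficient into [0, p)

print(inverse(3, 7))  # 3 * 5 = 15 ≡ 1 (mod 7), so prints 5
```

Unlike the Fermat route, this version works for any modulus, prime or not, as long as gcd(x, p) = 1, and it fails loudly when no inverse exists.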

The common thread: Both proofs depend on gcd(a, p) = 1. And here's the key insight — if p is prime, then gcd(a, p) = 1 for ALL a between 1 and p−1, automatically. No exceptions. That's what makes primes special: they're coprime with everything smaller than them.
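A quick Python sketch confirms the insight:

```python
from math import gcd

p = 101  # prime: every a in [1, p-1] is coprime with p, no exceptions
assert all(gcd(a, p) == 1 for a in range(1, p))

n = 100  # composite: coprimality fails for many elements
print([a for a in range(2, n) if gcd(a, n) != 1][:5])  # [2, 4, 5, 6, 8]
```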


What Breaks When P Isn't Prime

João then showed exactly what goes wrong with a non-prime modulus. Let's take p = 6 and try to find multiplicative inverses:

# Trying to find multiplicative inverses mod 6

# Does 2 have an inverse? Need: 2 × ? ≡ 1 (mod 6)
2 × 1 = 2   2 × 2 = 4   2 × 3 = 0   2 × 4 = 2   2 × 5 = 4
# None give 1. No inverse. ❌ (gcd(2,6) = 2 ≠ 1)

# Does 3 have an inverse?
3 × 1 = 3   3 × 2 = 0   3 × 3 = 3   3 × 4 = 0   3 × 5 = 3
# No inverse. ❌ (gcd(3,6) = 3 ≠ 1)

# Does 4 have an inverse?
4 × 1 = 4   4 × 2 = 2   4 × 3 = 0   4 × 4 = 4   4 × 5 = 2
# No inverse. ❌ (gcd(4,6) = 2 ≠ 1)

# Does 5 have an inverse?
5 × 1 = 5   5 × 2 = 4   5 × 3 = 3   5 × 4 = 2   5 × 5 = 1
# Yes! 5 × 5 ≡ 1 (mod 6). ✅ (gcd(5,6) = 1)

Out of {1, 2, 3, 4, 5}, only 1 and 5 have inverses. The others — 2, 3, and 4 — are all non-coprime with 6, so they're stuck without inverses. You can't divide by 2 in this system. Division is broken.

The Units of a Modulus

João introduced a concept I hadn't seen before: the units of a modulus n. These are the elements whose gcd with n equals 1 — the ones that do have multiplicative inverses. They form a group under multiplication.

Units of n — the elements that "work"

For n = 6:  units are {1, 5} — only 2 elements form a group
For n = 7:  units are {1, 2, 3, 4, 5, 6} — all 6 non-zero elements form a group
For n = 8:  units are {1, 3, 5, 7} — only 4 out of 7 elements work
For n = 12: units are {1, 5, 7, 11} — only 4 out of 11 elements work

When n is prime: every non-zero element is a unit. That's the best case.
Euler's totient function φ(n) counts how many units n has.

When p is prime, the units are {1, 2, ..., p−1} — the full set. When p isn't prime, you lose elements. Some numbers become "dead" — you can't divide by them, can't find their inverses, and the group shrinks. For cryptography, you want the largest possible group. Prime gives you the maximum.
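Computing the units and φ(n) for any modulus is a few lines (a sketch using Python's math.gcd):

```python
from math import gcd

def units(n: int) -> list[int]:
    # The units mod n: elements in [1, n) that are coprime with n.
    return [a for a in range(1, n) if gcd(a, n) == 1]

def phi(n: int) -> int:
    # Euler's totient function: the number of units mod n.
    return len(units(n))

for n in (6, 7, 8, 12):
    print(f"n = {n}: units = {units(n)}, phi = {phi(n)}")
```

Running it reproduces the table above: n = 7 keeps all six non-zero elements, while 6, 8, and 12 all lose some.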


The Closure Argument — Why Primes Protect Groups

Brandon asked a sharp question: "If the multiplicative group 𝔽p* doesn't include zero, how can we be sure multiplying two elements never produces zero?"

João's answer was elegant. For a × b ≡ 0 (mod p), the product a × b would have to be a multiple of p. But p is prime, so if p divides a × b, it must divide a or divide b (Euclid's lemma). Since both a and b are between 1 and p−1, p divides neither. So the product can never be a multiple of p, and zero can never appear.

Non-prime breaks closure. With mod 8: {1, 2, 3, 4, 5, 6, 7} under multiplication — but 2 × 4 = 8 ≡ 0 (mod 8). Zero just appeared inside a set that doesn't contain zero. The group isn't closed anymore. With mod 6: 2 × 3 = 6 ≡ 0. Same problem. A non-prime modulus means the set "leaks" — two perfectly valid elements can combine to produce something outside the set.
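A quick check of the leak:

```python
# Non-prime modulus: two non-zero elements multiply to zero.
print((2 * 4) % 8)  # 0 — zero appears inside a set that excludes it
print((2 * 3) % 6)  # 0 — same problem mod 6

# Prime modulus: exhaustively confirm zero never appears.
p = 7
assert all((a * b) % p != 0 for a in range(1, p) for b in range(1, p))
print("mod 7: the non-zero elements are closed under multiplication")
```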

This immediately answered Brandon's earlier question about Diffie-Hellman: "If p is not prime, I don't have a group. And if I don't have a group, I can't define the discrete log problem." The entire security foundation of Diffie-Hellman depends on having a proper group, and you only get that with a prime modulus.


The Discrete Log Problem — How Big Is Big Enough?

Rinke asked the most practical question of the session: "In real terms, how big does the prime need to be for the discrete log problem to be secure?"

João's answer was blunt: it depends on which group you're using.

INTEGER DISCRETE LOG (𝔽p*)
G^x ≡ Y (mod p)
Know Y and G, find x?

Known attacks exist (baby-step giant-step, Pohlig-Hellman, number field sieve)
Required key size: ~4000-5000 bits
Pre-ECC era: ElGamal signatures used this

ELLIPTIC CURVE DISCRETE LOG
x·G = P (point multiplication)
Know point P and generator G, find x?

No known efficient attacks
Required key size: ~256 bits
Modern era: ECDSA, used in Ethereum/Bitcoin

That's a massive difference. For the same security level, ECC needs keys roughly 20x smaller than traditional discrete log. This is why elliptic curves took over cryptography in the 2000s — same security, much smaller keys, faster operations. The old approach (ElGamal, which João mentioned by name) worked on the same philosophical principle, but needed 4000+ bit primes to be secure. ECC achieves equivalent security with 256 bits.

Why the difference? Because for the integer discrete log, mathematicians have found some clever sub-exponential algorithms — not fast enough to actually break it with large keys, but fast enough to require bigger keys. For the elliptic curve discrete log, no such algorithms exist. The best known attack is essentially brute force (Pollard's rho, which is still exponential). No mathematical shortcut has been found, so the keys can stay small.

João's phrasing stuck with me: "It's hard to find something better than elliptic curves because we don't know basically any kind of attack." The security of ECC isn't proven — it's based on the absence of known attacks. If someone discovers an efficient algorithm for the elliptic curve discrete log tomorrow, the entire foundation of blockchain cryptography collapses. But after decades of research, no one has.


The Quantum Elephant in the Room

Rinke raised the next obvious question: "But elliptic curves will also be broken by Shor's algorithm, right?"

Yes. Shor's algorithm, running on a sufficiently powerful quantum computer, would break both traditional discrete log and elliptic curve discrete log. Both. The entire foundation of modern public-key cryptography — RSA, Diffie-Hellman, ECDSA, everything based on the hardness of factoring or discrete logs — would fall.

João's response was pragmatic: quantum computers capable of running Shor's at scale don't exist yet. Classical cryptography remains secure against current hardware. But the threat is taken seriously enough that the entire field of post-quantum cryptography exists to prepare for that day.

Brandon brought up lattice-based cryptography — a family of cryptographic schemes based on entirely different hard problems (shortest vector problem, learning with errors) that are believed to be quantum-resistant. João acknowledged he wasn't deeply familiar with lattices, but confirmed they're a leading candidate for post-quantum standards. They use different algebraic structures than elliptic curves — they're not based on the discrete log problem at all.

Where ZK fits in the quantum picture

SNARKs (Groth16, PLONK): Built on elliptic curves and pairings → NOT quantum-secure
STARKs: Built on hash functions only → Quantum-resistant

João: "If you want quantum resistance, you need to drop most of the cryptography you know." STARKs achieve this by using only hash functions — no elliptic curves, no pairings, no discrete log assumptions.

This is a major reason STARKs exist alongside SNARKs. Groth16 (what we're building toward in this course) produces the smallest proofs and fastest verification — but it's built on elliptic curves, so it's not quantum-safe. STARKs produce larger proofs but rely only on hash functions, making them quantum-resistant. In practice, some systems even wrap a STARK proof inside a Groth16 proof — using the STARK for the heavy computation and Groth16 for the final compression.


Hash Functions — Two Completely Different Worlds

This was the part of the discussion I found most surprising. A student asked whether hash functions use elliptic curves behind the scenes. João's answer: absolutely not. Traditional hash functions and elliptic curve cryptography are fundamentally different beasts.

Traditional Hashes: No Math, Just Chaos

SHA-2, SHA-3, MD5 — these are based on bit mixing. You take an input, shuffle it, rotate it, permute it, do another round, shuffle, rotate, permute, and output a fixed-size result. There is no mathematical structure. No hard problem. No formula. Just carefully designed chaos.

And here's the remarkable thing: no one has ever proven these are secure. There is no mathematical proof that SHA-256 is collision-resistant. It's secure because, after decades of cryptanalysis, nobody has been able to break it. That's a very different kind of security than "this is as hard as the discrete log problem."

João made the distinction sharply: Diffie-Hellman's security is based on the discrete log problem — a formally defined hard problem. If you break the DLP, you break Diffie-Hellman. SHA's security is based on nobody having broken it yet. There's no underlying hard problem to point to. Same for AES, DES, and all symmetric encryption. They're "ugly" — no elegant math, just effective bit mangling.

But this "ugliness" is exactly why hash functions are quantum-resistant. Shor's algorithm exploits mathematical structure — the periodicity in modular exponentiation, the group structure of elliptic curves. Hash functions have no structure to exploit. The best quantum attack on SHA-256 is Grover's algorithm, which only gives a square-root speedup (effectively halving the bit security from 256 to 128 — still plenty secure).

ZK-Friendly Hashes: Math Returns

Then João introduced a completely different category: hash functions designed specifically for zero-knowledge proofs. Names like Poseidon and Pedersen.

Unlike SHA, these hashes do have algebraic structure. They're built from mathematical formulas — operations like repeated exponentiation over finite fields. This structure is exactly what makes them useful for ZK: if you have a formula, you can build a circuit to verify that formula. When we learn circuits later in the course, this will make sense — a circuit is essentially a mathematical equation that a verifier checks.

SHA-256 (Traditional)
No algebraic structure
Just bit manipulation
Proven secure? No (practical only)
Quantum-resistant? Yes
ZK circuit size: ~millions of gates

POSEIDON (ZK-Friendly)
Built on finite field arithmetic
Has a mathematical formula
Proven secure? No (but analyzable)
Quantum-resistant? Uncertain
ZK circuit size: ~hundreds of gates

That difference in circuit size is staggering — millions vs hundreds. If you're building a ZK proof that needs to verify a hash, using Poseidon instead of SHA-256 makes your circuit thousands of times smaller and cheaper. That's why ZK-friendly hashes were invented: not because SHA is insecure, but because SHA is impossibly expensive to prove in a circuit.

Practical note: Ethereum and Bitcoin signatures use SHA (specifically Keccak-256 for Ethereum). If your ZK circuit needs to verify an Ethereum signature, you're stuck using SHA inside the circuit — and paying the cost. But if you're designing a new system from scratch, you'd choose Poseidon or similar to keep circuits small. This is a real engineering trade-off in ZK system design.


Ethereum Signatures — S, R, and V

The final topic connected everything back to blockchain. João explained the three components of an Ethereum ECDSA signature: S, R, and V. As someone who audits Move smart contracts daily, I've seen these values in transaction objects and signature verification code — but I'd never understood where they come from mathematically.

ECDSA signature = (R, S, V)

R and S: The actual signature. These two values are enough to verify that a signature is valid for a given public key.

V: The recovery parameter. Allows recovering the public key (and therefore the address) from the signature itself.

This is why Ethereum transactions don't include the sender's address — it's derived from the signature using V. The signature proves who you are AND what you signed, simultaneously.

João mentioned that coding the ECDSA algorithm from scratch used to be one of the course homeworks — understanding where S comes from, where R comes from, and why V is needed. The S and R values come from elliptic curve point operations (which we'll learn about when we cover elliptic curves). V essentially tells the verifier which of two possible public keys the signature corresponds to, since the math produces an ambiguity that V resolves.

The security researcher angle: Understanding that V enables public key recovery explains why ecrecover in Solidity and similar functions in Move can derive a signer's address from just a signature. It also explains a class of bugs: if V is malleated (flipped between 27 and 28, or between 0 and 1 depending on the chain), the same message can produce two valid-looking signatures that recover to different addresses. Signature malleability bugs have been found in real protocols.
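The malleability comes down to plain arithmetic: for any valid ECDSA signature (r, s), the mirrored pair (r, n − s) with a flipped recovery id also verifies, where n is the curve order. That is why Ethereum's EIP-2 (and Bitcoin's low-s policy) require s in the lower half of the range. A sketch — the constant below is the published secp256k1 group order; the helper names and toy values are mine:

```python
# secp256k1 group order n (a public constant)
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def malleate(s: int, v: int) -> tuple[int, int]:
    # The mirrored signature: (r, n - s) with recovery id 0/1 flipped.
    # (Ethereum encodes these recovery ids as 27/28 on the wire.)
    return N - s, v ^ 1

def enforce_low_s(s: int, v: int) -> tuple[int, int]:
    # Canonicalization in the style of EIP-2: flip any s in the upper half.
    return malleate(s, v) if s > N // 2 else (s, v)

s, v = 0xDEADBEEF, 0          # toy values, not a real signature
s2, v2 = malleate(s, v)
assert enforce_low_s(s2, v2) == (s, v)  # flipping twice round-trips
```

A verifier that accepts both halves without canonicalizing invites exactly the duplicate-signature bugs described above.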


"Why Do We Need to Know This Math?"

A student asked the question that I think everyone in the cohort was thinking: "Would knowing this math help me spot security mistakes in ZK implementations?"

João's answer was nuanced and honest. It depends on what you're doing:

  1. Writing ZK circuits? You need to know how to write circuits. The abstract algebra is helpful context, but you don't need to derive Fermat's Little Theorem to write a valid Circom template. Know what a group is, what a field is, and you're functional.
  2. Implementing a library (SnarkJS, Plonky3)? You need deep math. Multiplying two 256-bit numbers efficiently requires Number Theoretic Transforms, which are built on group theory. The math isn't optional — it's the implementation.
  3. Reading documentation and papers? You need the vocabulary. Papers will say "group," "field," "isomorphism," "homomorphism" without explanation. If you can't parse these terms, you can't read the literature.
  4. Auditing ZK systems? You need to know what the constraints should be. If a protocol uses a non-prime modulus where a prime is required, if a circuit fails to constrain a witness value, if a hash function is used where a ZK-friendly hash should be — you need enough understanding to recognize when something is wrong.

João's broader point resonated with how I think about auditing: "You can always go deeper. You can always ask 'why' one more time. But at some point, you accept certain facts and work with them. The important thing is to know enough that when something doesn't look right, you notice."

That's exactly how security research works too. You don't need to be able to prove every theorem. But you need enough understanding that when a protocol says "we use a 256-bit prime field" and you see they're actually using a non-prime modulus, alarm bells go off.


What's Next

Next week: elliptic curves. We're finally going to construct a group from geometric objects — points on a curve — and see the point addition operation that makes elliptic curve cryptography work. Everything from Weeks 1-3 (modular arithmetic, group theory, why primes matter) converges in elliptic curves. The finite field we built is the coordinate system. The group properties we learned are what the points satisfy. The prime modulus we proved is necessary is the field those coordinates live in.

João mentioned homomorphisms and isomorphisms: structure-preserving maps between groups. The ones cryptography cares about are easy to compute in one direction and hard to invert. That's the mathematical formalization of public-key cryptography: easy to compute a public key from a private key, hard to go back. Elliptic curves are where that idea becomes concrete.

Week 3 done. The theory has teeth now — primes aren't arbitrary, hash functions aren't magic, and every Ethereum signature carries elliptic curve math inside it. Next week we meet the curves themselves. 🔑

📝 ZK and cryptography evolve fast — if any tip here is outdated, incorrect, or no longer applies, please reach out on X so I can update this article accordingly.

Course: Rare Skills ZK Bootcamp
Instructor: João Paulo Morais (PhD Physics, 12+ years in ZK)
Duration: ~13-14 weeks
Goal: Code Groth16 from scratch
Week: 3 / 14
Format: Office hours Q&A — theory meets practice
Topics: Inverse proofs, units, discrete log sizes, ECC vs ElGamal, quantum, hash functions, ECDSA
Rare Skills: rareskills.io
Follow my journey: @thepantherplus