Secure Randomness in Go 1.22

https://go.dev/blog/chacha8rand

The Go Blog

Computers aren’t random. On the contrary, hardware designers work very hard to make sure computers run every program the same way every time. So when a program does need random numbers, that requires extra effort. Traditionally, computer scientists and programming languages have distinguished between two different kinds of random numbers: statistical and cryptographic randomness. In Go, those are provided by math/rand and crypto/rand, respectively. This post is about how Go 1.22 brings the two closer together, by using a cryptographic random number source in math/rand (as well as math/rand/v2, as mentioned in our previous post). The result is better randomness and far less damage when developers accidentally use math/rand instead of crypto/rand.

Before we can explain what Go 1.22 did, let’s take a closer look at statistical randomness compared to cryptographic randomness.

Statistical Randomness

Random numbers that pass basic statistical tests are usually appropriate for use cases like simulations, sampling, numerical analysis, non-cryptographic randomized algorithms, random testing, shuffling inputs, and random exponential backoff. Very basic, easy to compute mathematical formulas turn out to work well enough for these use cases. Because the methods are so simple, however, an observer who knows what algorithm is being used can typically predict the rest of the sequence after seeing enough values.

Essentially all programming environments provide a mechanism for generating statistical random numbers that traces back through C to Research Unix Third Edition (V3), which added a pair of functions: srand and rand. The manual page included a note that read:

WARNING   The author of this routine has been writing random-number generators for many years and has never been known to write one that worked.

This note was partly a joke but also an acknowledgement that such generators are inherently not random.

The source code of the generator makes clear how trivial it is. Translated from PDP-11 assembly to modern C, it was:

uint16 ranx;
void
srand(uint16 seed)
{
    ranx = seed;
}
int16
rand(void)
{
    ranx = 13077*ranx + 6925;
    return ranx & ~0x8000;
}

Calling srand seeds the generator with a single integer seed, and rand returns the next number from the generator. The AND in the return statement clears the sign bit to make sure the result is positive.

This function is an instance of the general class of linear congruential generators (LCGs), which Knuth analyzes in The Art of Computer Programming, Volume 2, section 3.2.1. The main benefit of LCGs is that constants can be chosen such that they emit every possible output value once before repeating, as the Unix implementation did for 15-bit outputs. A serious problem with LCGs, however, is that the high bits of the state do not affect the low bits at all, so every truncation of the sequence to k bits necessarily repeats with a smaller period. The low bit must toggle: 0, 1, 0, 1, 0, 1. The low two bits must count up or down: 0, 1, 2, 3, 0, 1, 2, 3, or else 0, 3, 2, 1, 0, 3, 2, 1. There are four possible three-bit sequences; the original Unix implementation repeats 0, 5, 6, 3, 4, 1, 2, 7. (These problems can be avoided by reducing the value modulo a prime, but that would have been quite expensive at the time. See S. K. Park and K. W. Miller’s 1988 CACM paper “Random number generators: good ones are hard to find” for a short analysis and the first chapter of Knuth Volume 2 for a longer one.)

Even with these known problems, the srand and rand functions were included in the first C standard, and equivalent functionality was included in essentially every language since then. LCGs were once the dominant implementation strategy, although they’ve fallen off in popularity due to some important drawbacks. One significant remaining use is java.util.Random, which powers java.lang.Math.random.

Another thing you can see from the implementation above is that the internal state is completely exposed by the result of rand. An observer who knows the algorithm and sees a single result can easily compute all future results. If you are running a server that calculates some random values that become public and some random values that must stay secret, using this kind of generator would be disastrous: the secrets wouldn’t be secret.

More modern random generators aren’t as terrible as the original Unix one, but they’re still not completely unpredictable. To make that point, next we will look at the original math/rand generator from Go 1 and the PCG generator we added in math/rand/v2.

The Go 1 Generator

The generator used in Go 1’s math/rand is an instance of what is called a linear-feedback shift register. The algorithm is based on an idea by George Marsaglia, tweaked by Don Mitchell and Jim Reeds, and further customized by Ken Thompson for Plan 9 and then Go. It has no official name, so this post calls it the Go 1 generator.

The Go 1 generator’s internal state is a slice vec of 607 uint64s. In that slice, there are two distinguished elements: vec[606], the last element, is called the “tap”, and vec[334] is called the “feed”. To generate the next random number, the generator adds the tap and the feed to produce a value x, stores x back into the feed, shifts the entire slice one position to the right (the tap moves to vec[0] and vec[i] moves to vec[i+1]), and returns x. The generator is called “linear feedback” because the tap is added to the feed; the entire state is a “shift register” because each step shifts the slice entries.

Of course, actually moving every slice entry forward would be prohibitively expensive, so instead the implementation leaves the slice data in place and moves the tap and feed positions backward on each step. The code looks like:

func (r *rngSource) Uint64() uint64 {
    r.tap--
    if r.tap < 0 {
        r.tap += len(r.vec)
    }
    r.feed--
    if r.feed < 0 {
        r.feed += len(r.vec)
    }
    x := r.vec[r.feed] + r.vec[r.tap]
    r.vec[r.feed] = x
    return uint64(x)
}

Generating the next number is quite cheap: two subtractions, two conditional adds, two loads, one add, one store.

Unfortunately, because the generator directly returns one slice element from its internal state vector, reading 607 values from the generator completely exposes all its state. With those values, you can predict all the future values, by filling in your own vec and then running the algorithm. You can also recover all the previous values, by running the algorithm backward (subtracting the tap from the feed and shifting the slice to the left).

As a complete demonstration, here is an insecure program generating pseudorandom authentication tokens along with code that predicts the next token given a sequence of earlier tokens. As you can see, the Go 1 generator provides no security at all (nor was it meant to). The quality of the generated numbers also depends on the initial setting of vec.

The PCG Generator

For math/rand/v2, we wanted to provide a more modern statistical random generator and settled on Melissa O’Neill’s PCG algorithm, published in 2014 in her paper “PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation”. The exhaustive analysis in the paper can make it hard to notice at first glance how utterly trivial the generators are: PCG is a post-processed 128-bit LCG.

If the state p.x were a uint128 (hypothetically), the code to compute the next value would be:

const (
    pcgM = 0x2360ed051fc65da44385df649fccf645
    pcgA = 0x5851f42d4c957f2d14057b7ef767814f
)
type PCG struct {
    x uint128
}
func (p *PCG) Uint64() uint64 {
    p.x = p.x * pcgM + pcgA
    return scramble(p.x)
}

The entire state is a single 128-bit number, and the update is a 128-bit multiply and add. In the return statement, the scramble function reduces the 128-bit state down to a 64-bit state. The original PCG used (again using a hypothetical uint128 type):

func scramble(x uint128) uint64 {
    return bits.RotateLeft(uint64(x>>64) ^ uint64(x), -int(x>>122))
}

This code XORs the two halves of the 128-bit state together and then rotates the result according to the top six bits of the state. This version is called PCG-XSL-RR, for “xor shift low, right rotate”.

Based on a suggestion from O’Neill during proposal discussion, Go’s PCG uses a new scramble function based on multiplication, which mixes the bits more aggressively:

func scramble(x uint128) uint64 {
    hi, lo := uint64(x>>64), uint64(x)
    hi ^= hi >> 32
    hi *= 0xda942042e4dd58b5
    hi ^= hi >> 48
    hi *= lo | 1
}

O’Neill calls PCG with this scrambler PCG-DXSM, for “double xorshift multiply.” Numpy uses this form of PCG as well.

Although PCG uses more computation to generate each value, it uses significantly less state: two uint64s instead of 607. It is also much less sensitive to the initial values of that state, and it passes many statistical tests that other generators do not. In many ways it is an ideal statistical generator.

Even so, PCG is not unpredictable. While the scrambling of bits to prepare the result does not expose the state directly like in the LCG and Go 1 generators, PCG-XSL-RR can still be reversed, and it would not be surprising if PCG-DXSM could too. For secrets, we need something different.

Cryptographic Randomness

Cryptographic random numbers need to be utterly unpredictable in practice, even to an observer who knows how they are generated and has observed any number of previously generated values. The safety of cryptographic protocols, secret keys, modern commerce, online privacy, and more all critically depend on access to cryptographic randomness.

Providing cryptographic randomness is ultimately the job of the operating system, which can gather true randomness from physical devices—timings of the mouse, keyboard, disks, and network, and more recently electrical noise measured directly by the CPU itself. Once the operating system has gathered a meaningful amount of randomness—say, at least 256 bits—it can use cryptographic hashing or encryption algorithms to stretch that seed into an arbitrarily long sequence of random numbers. (In practice the operating system is also constantly gathering and adding new randomness to the sequence too.)

The exact operating system interfaces have evolved over time. A decade ago, most systems provided a device file named /dev/random or something similar. Today, in recognition of how fundamental randomness has become, operating systems provide a direct system call instead. (This also allows programs to read randomness even when cut off from the file system.) In Go, the crypto/rand package abstracts away those details, providing the same interface on every operating system: rand.Read.

It would not be practical for math/rand to ask the operating system for randomness each time it needs a uint64. But we can use cryptographic techniques to define an in-process random generator that improves on LCGs, the Go 1 generator, and even PCG.

The ChaCha8Rand Generator

Our new generator, which we unimaginatively named ChaCha8Rand for specification purposes and implemented as math/rand/v2’s rand.ChaCha8, is a lightly modified version of Daniel J. Bernstein’s ChaCha stream cipher. ChaCha is widely used in a 20-round form called ChaCha20, including in TLS and SSH. Jean-Philippe Aumasson’s paper “Too Much Crypto” argues persuasively that the 8-round form ChaCha8 is secure too (and it’s roughly 2.5X faster). We used ChaCha8 as the core of ChaCha8Rand.

Most stream ciphers, including ChaCha8, work by defining a function that is given a key and a block number and produces a fixed-size block of apparently random data. The cryptographic standard these aim for (and usually meet) is for this output to be indistinguishable from actual random data in the absence of some kind of exponentially costly brute force search. A message is encrypted or decrypted by XOR’ing successive blocks of input data with successive randomly generated blocks. To use ChaCha8 as a rand.Source, we use the generated blocks directly instead of XOR’ing them with input data (this is equivalent to encrypting or decrypting all zeros).

We changed a few details to make ChaCha8Rand more suitable for generating random numbers. Briefly:

  • ChaCha8Rand takes a 32-byte seed, used as the ChaCha8 key.
  • ChaCha8 generates 64-byte blocks, with calculations treating a block as 16 uint32s. A common implementation is to compute four blocks at a time using SIMD instructions on 16 vector registers of four uint32s each. This produces four interleaved blocks that must be unshuffled for XOR’ing with the input data. ChaCha8Rand defines that the interleaved blocks are the random data stream, removing the cost of the unshuffle. (For security purposes, this can be viewed as standard ChaCha8 followed by a reshuffle.)
  • ChaCha8 finishes a block by adding certain values to each uint32 in the block. Half the values are key material and the other half are known constants. ChaCha8Rand defines that the known constants are not re-added, removing half of the final adds. (For security purposes, this can be viewed as standard ChaCha8 followed by subtracting the known constants.)
  • Every 16th generated block, ChaCha8Rand takes the final 32 bytes of the block for itself, making them the key for the next 16 blocks. This provides a kind of forward secrecy: if a system is compromised by an attack that recovers the entire memory state of the generator, only values generated since the last rekeying can be recovered. The past is inaccessible. ChaCha8Rand as defined so far must generate 4 blocks at a time, but we chose to do this key rotation every 16 blocks to leave open the possibility of faster implementations using 256-bit or 512-bit vectors, which could generate 8 or 16 blocks at a time.

We wrote and published a C2SP specification for ChaCha8Rand, along with test cases. This will enable other implementations to share repeatability with the Go implementation for a given seed.

The Go runtime now maintains a per-core ChaCha8Rand state (300 bytes), seeded with operating system-supplied cryptographic randomness, so that random numbers can be generated quickly without any lock contention. Dedicating 300 bytes per core may sound expensive, but on a 16-core system, it is about the same as storing a single shared Go 1 generator state (4,872 bytes). The speed is worth the memory. This per-core ChaCha8Rand generator is now used in three different places in the Go standard library:

  1. The math/rand/v2 package functions, such as rand.Float64 and rand.N, always use ChaCha8Rand.

  2. The math/rand package functions, such as rand.Float64 and rand.Intn, use ChaCha8Rand when rand.Seed has not been called. Applying ChaCha8Rand in math/rand improves the security of programs even before they update to math/rand/v2, provided they are not calling rand.Seed. (If rand.Seed is called, the implementation is required to fall back to the Go 1 generator for compatibility.)

  3. The runtime chooses the hash seed for each new map using ChaCha8Rand instead of a less secure wyrand-based generator it previously used. Random seeds are needed because if an attacker knows the specific hash function used by a map implementation, they can prepare input that drives the map into quadratic behavior (see Crosby and Wallach’s “Denial of Service via Algorithmic Complexity Attacks”). Using a per-map seed, instead of one global seed for all maps, also avoids other degenerate behaviors. It is not strictly clear that maps need a cryptographically random seed, but it’s also not clear that they don’t. It seemed prudent and was trivial to switch.

Code that needs its own ChaCha8Rand instances can create its own rand.ChaCha8 directly.

Fixing Security Mistakes

Go aims to help developers write code that is secure by default. When we observe a common mistake with security consequences, we look for ways to reduce the risk of that mistake or eliminate it entirely. In this case, math/rand’s global generator was far too predictable, leading to serious problems in a variety of contexts.

For example, when Go 1.20 deprecated math/rand’s Read, we heard from developers who discovered (thanks to tooling pointing out use of deprecated functionality) they had been using it in places where crypto/rand’s Read was definitely needed, like generating key material. Using Go 1.20, that mistake is a serious security problem that merits a detailed investigation to understand the damage. Where were the keys used? How were the keys exposed? Were other random outputs exposed that might allow an attacker to derive the keys? And so on. Using Go 1.22, that mistake is just a mistake. It’s still better to use crypto/rand, because the operating system kernel can do a better job keeping the random values secret from various kinds of prying eyes, the kernel is continually adding new entropy to its generator, and the kernel has had more scrutiny. But accidentally using math/rand is no longer a security catastrophe.

There are also a variety of use cases that don’t seem like “crypto” but nonetheless need unpredictable randomness. These cases are made more robust by using ChaCha8Rand instead of the Go 1 generator.

For example, consider generating a random UUID. Since UUIDs are not secret, using math/rand might seem fine. But if math/rand has been seeded with the current time, then running it at the same instant on different computers will produce the same value, making them not “universally unique”. This is especially likely on systems where the current time is only available with millisecond precision. Even with auto-seeding using OS-provided entropy, as introduced in Go 1.20, the Go 1 generator’s seed is only a 63-bit integer, so a program that generates a UUID at startup can only generate 2⁶³ possible UUIDs and is likely to see collisions after 2³¹ or so UUIDs. Using Go 1.22, the new ChaCha8Rand generator is seeded from 256 bits of entropy and can generate 2²⁵⁶ possible first UUIDs. It does not need to worry about collisions.

As another example, consider load balancing in a front-end server that randomly assigns incoming requests to back-end servers. If an attacker can observe the assignments and knows the predictable algorithm generating them, then the attacker could send a stream of mostly cheap requests but arrange for all the expensive requests to land on a single back-end server. This is an unlikely but plausible problem using the Go 1 generator. Using Go 1.22, it’s not a problem at all.

In all these examples, Go 1.22 has eliminated or greatly reduced security problems.

Performance

The security benefits of ChaCha8Rand do have a small cost, but ChaCha8Rand is still in the same ballpark as both the Go 1 generator and PCG. The following graphs compare the performance of the three generators, across a variety of hardware, running two operations: the primitive operation “Uint64,” which returns the next uint64 in the random stream, and the higher-level operation “N(1000),” which returns a random value in the range [0, 1000).

The “running 32-bit code” graphs show modern 64-bit x86 chips executing code built with GOARCH=386, meaning they are running in 32-bit mode. In that case, the fact that PCG requires 128-bit multiplications makes it slower than ChaCha8Rand, which only uses 32-bit SIMD arithmetic. Actual 32-bit systems matter less every year, but it is still interesting that ChaCha8Rand is faster than PCG on those systems.

On some systems, “Go 1: Uint64” is faster than “PCG: Uint64”, but “Go 1: N(1000)” is slower than “PCG: N(1000)”. This happens because “Go 1: N(1000)” is using math/rand’s algorithm for reducing a random int64 down to a value in the range [0, 1000), and that algorithm does two 64-bit integer divide operations. In contrast, “PCG: N(1000)” and “ChaCha8: N(1000)” use the faster math/rand/v2 algorithm, which almost always avoids the divisions. Removing the 64-bit divisions dominates the algorithm change for 32-bit execution and on the Ampere.

Overall, ChaCha8Rand is slower than the Go 1 generator, but it is never more than twice as slow, and on typical servers the difference is never more than 3ns. Very few programs will be bottlenecked by this difference, and many programs will enjoy the improved security.

Conclusion

Go 1.22 makes your programs more secure without any code changes. We did this by identifying the common mistake of accidentally using math/rand instead of crypto/rand and then strengthening math/rand. This is one small step in Go’s ongoing journey to keep programs safe by default.

These kinds of mistakes are not unique to Go. For example, the npm keypair package tries to generate an RSA key pair using Web Crypto APIs, but if they’re not available, it falls back to JavaScript’s Math.random. This is hardly an isolated case, and the security of our systems cannot depend on developers not making mistakes. Instead, we hope that eventually all programming languages will move to cryptographically strong pseudorandom generators even for “mathematical” randomness, eliminating this kind of mistake, or at least greatly reducing its blast radius. Go 1.22’s ChaCha8Rand implementation proves that this approach is competitive with other generators.

{
"by": "spacey",
"descendants": 4,
"id": 40237491,
"kids": [
40237591,
40238055,
40253468,
40237832
],
"score": 37,
"time": 1714664421,
"title": "Secure Randomness in Go 1.22",
"type": "story",
"url": "https://go.dev/blog/chacha8rand"
}
{
"author": "Russ Cox and Filippo Valsorda 2 May 2024",
"date": null,
"description": "ChaCha8Rand is a new cryptographically secure pseudorandom number generator used in Go 1.22.",
"image": "https://go.dev/doc/gopher/runningsquare.jpg",
"logo": "https://logo.clearbit.com/go.dev",
"publisher": "Go",
"title": "Secure Randomness in Go 1.22 - The Go Programming Language",
"url": "https://go.dev/blog/chacha8rand"
}
{
"url": "https://go.dev/blog/chacha8rand",
"title": "Secure Randomness in Go 1.22 - The Go Programming Language",
"description": "The Go Blog Computers aren’t random. On the contrary, hardware designers work very hard to make sure computers run every program the same way every time. So when a program does need random numbers, that...",
"links": [
"https://go.dev/blog/chacha8rand"
],
"image": "https://go.dev/doc/gopher/runningsquare.jpg",
"content": "<div>\n <h2><a target=\"_blank\" href=\"https://go.dev/blog/\">The Go Blog</a></h2>\n <p>Computers aren’t random.\nOn the contrary, hardware designers work very hard to make sure computers run every program the same way every time.\nSo when a program does need random numbers, that requires extra effort.\nTraditionally, computer scientists and programming languages\nhave distinguished between two different kinds of random numbers:\nstatistical and cryptographic randomness.\nIn Go, those are provided by <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/\"><code>math/rand</code></a>\nand <a target=\"_blank\" href=\"https://go.dev/pkg/crypto/rand\"><code>crypto/rand</code></a>, respectively.\nThis post is about how Go 1.22 brings the two closer together,\nby using a cryptographic random number source in <code>math/rand</code>\n(as well as <code>math/rand/v2</code>, as mentioned in our <a target=\"_blank\" href=\"https://go.dev/blog/randv2\">previous post</a>).\nThe result is better randomness and far less damage when\ndevelopers accidentally use <code>math/rand</code> instead of <code>crypto/rand</code>.</p>\n<p>Before we can explain what Go 1.22 did, let’s take a closer look\nat statistical randomness compared to cryptographic randomness.</p>\n<h2 id=\"statistical-randomness\">Statistical Randomness</h2>\n<p>Random numbers that pass basic statistical tests\nare usually appropriate for use cases like simulations, sampling,\nnumerical analysis, non-cryptographic randomized algorithms,\n<a target=\"_blank\" href=\"https://go.dev/doc/security/fuzz/\">random testing</a>,\n<a href=\"https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle\" target=\"_blank\">shuffling inputs</a>,\nand\n<a href=\"https://en.wikipedia.org/wiki/Exponential_backoff#Collision_avoidance\" target=\"_blank\">random exponential backoff</a>.\nVery basic, easy to compute mathematical formulas turn out to work\nwell enough for these use cases.\nBecause the methods are so simple, however, an observer who\nknows what algorithm is being used can typically predict the rest\nof the sequence after seeing enough values.</p>\n<p>Essentially all programming environments provide a mechanism for generating\nstatistical random numbers\nthat traces back through C to\nResearch Unix Third Edition (V3), which added a pair of functions: <code>srand</code> and <code>rand</code>.\nThe manual page included\na note that read:</p>\n<blockquote>\n<p><em>WARNING   The author of this routine has been writing\nrandom-number generators for many years and has never been\nknown to write one that worked.</em></p>\n</blockquote>\n<p>This note was partly a joke but also an acknowledgement that such\ngenerators are <a href=\"https://www.tuhs.org/pipermail/tuhs/2024-March/029587.html\" target=\"_blank\">inherently not random</a>.</p>\n<p>The source code of the generator makes clear how trivial it is.\nTranslated from PDP-11 assembly to modern C, it was:</p>\n<pre><code>uint16 ranx;\nvoid\nsrand(uint16 seed)\n{\n ranx = seed;\n}\nint16\nrand(void)\n{\n ranx = 13077*ranx + 6925;\n return ranx &amp; ~0x8000;\n}\n</code></pre>\n<p>Calling <code>srand</code> seeds the generator with a single integer seed,\nand <code>rand</code> returns the next number from the generator.\nThe AND in the return statement clears the sign bit to make sure the result is positive.</p>\n<p>This function is an instance of the general class of\n<a href=\"https://en.wikipedia.org/wiki/Linear_congruential_generator\" target=\"_blank\">linear congruential generators (LCGs)</a>,\nwhich Knuth analyzes in <em>The Art of Computer Programming</em>, Volume 2, section 3.2.1.\nThe main benefit of LCGs is that constants can be chosen such that they\nemit every possible output value once before repeating,\nas the Unix implementation did for 15-bit outputs.\nA serious problem with LCGs, however, is that the high bits of the state do not affect the low bits at all,\nso every truncation of the sequence to <em>k</em> bits necessarily repeats with a smaller period.\nThe low bit must toggle: 0, 1, 0, 1, 0, 1.\nThe low two bits must count up or down: 0, 1, 2, 3, 0, 1, 2, 3, or else 0, 3, 2, 1, 0, 3, 2, 1.\nThere are four possible three-bit sequences; the original Unix implementation repeats 0, 5, 6, 3, 4, 1, 2, 7.\n(These problems can be avoided by reducing the value modulo a prime,\nbut that would have been quite expensive at the time.\nSee S. K. Park and K. W. Miller’s 1988 CACM paper\n“<a href=\"https://dl.acm.org/doi/10.1145/63039.63042\" target=\"_blank\">Random number generators: good ones are hard to find</a>”\nfor a short analysis\nand the first chapter of Knuth Volume 2 for a longer one.)</p>\n<p>Even with these known problems,\nthe <code>srand</code> and <code>rand</code> functions were included in the first C standard,\nand equivalent functionality was included in essentially every language since then.\nLCGs were once the dominant implementation strategy,\nalthough they’ve fallen off in popularity due to some important drawbacks.\nOne significant remaining use is <a href=\"https://github.com/openjdk/jdk8u-dev/blob/master/jdk/src/share/classes/java/util/Random.java\" target=\"_blank\"><code>java.util.Random</code></a>,\nwhich powers <a href=\"https://github.com/openjdk/jdk8u-dev/blob/master/jdk/src/share/classes/java/util/Random.java\" target=\"_blank\"><code>java.lang.Math.random</code></a>.</p>\n<p>Another thing you can see from the implementation above\nis that the internal state is completely exposed by the result of <code>rand</code>.\nAn observer who knows the algorithm and sees a single result\ncan easily compute all future results.\nIf you are running a server that calculates some random values\nthat become public and some random values that must stay secret,\nusing this kind of generator would be disastrous:\nthe secrets wouldn’t be secret.</p>\n<p>More modern random generators aren’t as terrible as the original Unix one,\nbut they’re still not completely unpredictable.\nTo make that point, next we will look at the original <code>math/rand</code> generator from Go 1\nand the PCG generator we added in <code>math/rand/v2</code>.</p>\n<h2 id=\"the-go-1-generator\">The Go 1 Generator</h2>\n<p>The generator used in Go 1’s <code>math/rand</code> is an instance of what is called a\n<a href=\"https://en.wikipedia.org/wiki/Linear-feedback_shift_register\" target=\"_blank\">linear-feedback shift register</a>.\nThe algorithm is based on an idea by George Marsaglia,\ntweaked by Don Mitchell and Jim Reeds,\nand further customized by Ken Thompson for Plan 9 and then Go.\nIt has no official name, so this post calls it the Go 1 generator.</p>\n<p>The Go 1 generator’s internal state is a slice <code>vec</code> of 607 uint64s.\nIn that slice, there are two distinguished elements: <code>vec[606]</code>, the last element, is called the “tap”,\nand <code>vec[334]</code> is called the “feed”.\nTo generate the next random number,\nthe generator adds the tap and the feed\nto produce a value <code>x</code>,\nstores <code>x</code> back into the feed,\nshifts the entire slice one position to the right\n(the tap moves to <code>vec[0]</code> and <code>vec[i]</code> moves to <code>vec[i+1]</code>),\nand returns <code>x</code>.\nThe generator is called “linear feedback” because the tap is <em>added</em> to the feed;\nthe entire state is a “shift register” because each step shifts the slice entries.</p>\n<p>Of course, actually moving every slice entry forward would be prohibitively expensive,\nso instead the implementation leaves the slice data in place\nand moves the tap and feed positions backward\non each step. The code looks like:</p>\n<pre><code>func (r *rngSource) Uint64() uint64 {\n r.tap--\n if r.tap &lt; 0 {\n r.tap += len(r.vec)\n }\n r.feed--\n if r.feed &lt; 0 {\n r.feed += len(r.vec)\n }\n x := r.vec[r.feed] + r.vec[r.tap]\n r.vec[r.feed] = x\n return uint64(x)\n}\n</code></pre>\n<p>Generating the next number is quite cheap: two subtractions, two conditional adds, two loads, one add, one store.</p>\n<p>Unfortunately, because the generator directly returns one slice element from its internal state vector,\nreading 607 values from the generator completely exposes all its state.\nWith those values, you can predict all the future values, by filling in your own <code>vec</code>\nand then running the algorithm.\nYou can also recover all the previous values, by running the algorithm backward\n(subtracting the tap from the feed and shifting the slice to the left).</p>\n<p>As a complete demonstration, here is an <a target=\"_blank\" href=\"https://go.dev/play/p/v0QdGjUAtzC\">insecure program</a>\ngenerating pseudorandom authentication\ntokens along with code that predicts the next token given a sequence of earlier tokens.\nAs you can see, the Go 1 generator provides no security at all (nor was it meant to).\nThe quality of the generated numbers also depends on the initial setting of <code>vec</code>.</p>\n<h2 id=\"the-pcg-generator\">The PCG Generator</h2>\n<p>For <code>math/rand/v2</code>, we wanted to provide a more modern statistical random generator\nand settled on Melissa O’Neill’s PCG algorithm, published in 2014 in her paper\n“<a href=\"https://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf\" target=\"_blank\">PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation</a>”.\nThe exhaustive analysis in the paper can make it hard to notice at first glance how utterly trivial the generators are:\nPCG is a post-processed 128-bit LCG.</p>\n<p>If the state <code>p.x</code> were a <code>uint128</code> (hypothetically), the code to compute the next value would be:</p>\n<pre><code>const (\n pcgM = 0x2360ed051fc65da44385df649fccf645\n pcgA = 0x5851f42d4c957f2d14057b7ef767814f\n)\ntype PCG struct {\n x uint128\n}\nfunc (p *PCG) Uint64() uint64 {\n p.x = p.x * pcgM + pcgA\n return scramble(p.x)\n}\n</code></pre>\n<p>The entire state is a single 128-bit number,\nand the update is a 128-bit multiply and add.\nIn the return statement, the <code>scramble</code> function reduces the 128-bit state\ndown to a 64-bit state.\nThe original PCG used (again using a hypothetical <code>uint128</code> type):</p>\n<pre><code>func scramble(x uint128) uint64 {\n return bits.RotateLeft(uint64(x&gt;&gt;64) ^ uint64(x), -int(x&gt;&gt;122))\n}\n</code></pre>\n<p>This code XORs the two halves of the 128-bit state together\nand then rotates the result according to the top six bits of the state.\nThis version is called PCG-XSL-RR, for “xor shift low, right rotate”.</p>\n<p>Based on a <a target=\"_blank\" href=\"https://go.dev/issue/21835#issuecomment-739065688\">suggestion from O’Neill during proposal discussion</a>,\nGo’s PCG uses a new scramble function based on multiplication,\nwhich mixes the bits more aggressively:</p>\n<pre><code>func scramble(x uint128) uint64 {\n hi, lo := uint64(x&gt;&gt;64), uint64(x)\n hi ^= hi &gt;&gt; 32\n hi *= 0xda942042e4dd58b5\n hi ^= hi &gt;&gt; 48\n hi *= lo | 1\n}\n</code></pre>\n<p>O’Neill calls PCG with this scrambler PCG-DXSM, for “double xorshift multiply.”\nNumpy uses this form of PCG as well.</p>\n<p>Although PCG uses more computation to generate each value,\nit uses significantly less state: two uint64s instead of 607.\nIt is also much less sensitive to the initial values of that state,\nand <a href=\"https://www.pcg-random.org/statistical-tests.html\" target=\"_blank\">it passes many statistical tests that other generators do not</a>.\nIn many ways it is an ideal statistical generator.</p>\n<p>Even so, PCG is not unpredictable.\nWhile the scrambling of bits to prepare the result does not\nexpose the state directly like in the LCG and Go 1 generators,\n<a href=\"https://pdfs.semanticscholar.org/4c5e/4a263d92787850edd011d38521966751a179.pdf\" target=\"_blank\">PCG-XSL-RR can still be reversed</a>,\nand it would not be surprising if PCG-DXSM could too.\nFor secrets, we need something different.</p>\n<h2 id=\"cryptographic-randomness\">Cryptographic Randomness</h2>\n<p><em>Cryptographic random numbers</em> need to be utterly unpredictable\nin practice, even to an observer who knows how they are generated\nand has observed any number of previously generated values.\nThe safety of cryptographic protocols, secret keys, modern commerce,\nonline privacy, and more all critically depend on access to cryptographic\nrandomness.</p>\n<p>Providing cryptographic randomness is ultimately the job of the\noperating system, which can gather true randomness from physical devices—timings\nof the mouse, keyboard, disks, and network, and more recently\n<a href=\"https://web.archive.org/web/20141230024150/http://www.cryptography.com/public/pdf/Intel_TRNG_Report_20120312.pdf\" target=\"_blank\">electrical noise measured directly by the CPU itself</a>.\nOnce the operating system has gathered a meaningful\namount of randomness—say, at least 256 bits—it can use cryptographic\nhashing or encryption algorithms to stretch that seed into\nan arbitrarily long sequence of random numbers.\n(In practice the operating system is also constantly gathering and\nadding new randomness to the sequence too.)</p>\n<p>The exact operating system interfaces have evolved over time.\nA decade ago, most systems provided a device file named\n<code>/dev/random</code> or something similar.\nToday, in recognition of how fundamental randomness has become,\noperating systems provide a direct system call instead.\n(This also allows programs to read randomness even\nwhen cut off from the file system.)\nIn Go, the <a target=\"_blank\" href=\"https://go.dev/pkg/crypto/rand/\"><code>crypto/rand</code></a> package abstracts away those details,\nproviding the same interface on every operating system: <a target=\"_blank\" href=\"https://go.dev/pkg/crypto/rand/#Read\"><code>rand.Read</code></a>.</p>\n<p>It would not be practical for <code>math/rand</code> to ask the operating system for\nrandomness each time it needs a <code>uint64</code>.\nBut we can use cryptographic techniques to define an in-process\nrandom generator that improves on LCGs, the Go 1 generator, and even PCG.</p>\n<h2 id=\"the-chacha8rand-generator\">The ChaCha8Rand Generator</h2>\n<p>Our new generator, which we unimaginatively named ChaCha8Rand for specification purposes\nand implemented as <code>math/rand/v2</code>’s <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#ChaCha8\"><code>rand.ChaCha8</code></a>,\nis a lightly modified version of Daniel J. Bernstein’s <a href=\"https://cr.yp.to/chacha.html\" target=\"_blank\">ChaCha stream cipher</a>.\nChaCha is widely used in a 20-round form called ChaCha20, including in TLS and SSH.\nJean-Philippe Aumasson’s paper “<a href=\"https://eprint.iacr.org/2019/1492.pdf\" target=\"_blank\">Too Much Crypto</a>”\nargues persuasively that the 8-round form ChaCha8 is secure too (and it’s roughly 2.5X faster).\nWe used ChaCha8 as the core of ChaCha8Rand.</p>\n<p>Most stream ciphers, including ChaCha8, work by defining a function that is given\na key and a block number and produces a fixed-size block of apparently random data.\nThe cryptographic standard these aim for (and usually meet) is for this output to be indistinguishable\nfrom actual random data in the absence of some kind of exponentially costly brute force search.\nA message is encrypted or decrypted by XOR’ing successive blocks of input data\nwith successive randomly generated blocks.\nTo use ChaCha8 as a <code>rand.Source</code>,\nwe use the generated blocks directly instead of XOR’ing them with input data\n(this is equivalent to encrypting or decrypting all zeros).</p>\n<p>We changed a few details to make ChaCha8Rand more suitable for generating random numbers. Briefly:</p>\n<ul>\n<li>ChaCha8Rand takes a 32-byte seed, used as the ChaCha8 key.</li>\n<li>ChaCha8 generates 64-byte blocks, with calculations treating a block as 16 <code>uint32</code>s.\nA common implementation is to compute four blocks at a time using <a href=\"https://en.wikipedia.org/wiki/Single_instruction,_multiple_data\" target=\"_blank\">SIMD instructions</a>\non 16 vector registers of four <code>uint32</code>s each.\nThis produces four interleaved blocks that must be unshuffled for XOR’ing with the input data.\nChaCha8Rand defines that the interleaved blocks are the random data stream,\nremoving the cost of the unshuffle.\n(For security purposes, this can be viewed as standard ChaCha8 followed by a reshuffle.)</li>\n<li>ChaCha8 finishes a block by adding certain values to each <code>uint32</code> in the block.\nHalf the values are key material and the other half are known constants.\nChaCha8Rand defines that the known constants are not re-added,\nremoving half of the final adds.\n(For security purposes, this can be viewed as standard ChaCha8 followed by subtracting the known constants.)</li>\n<li>Every 16th generated block, ChaCha8Rand takes the final 32 bytes of the block for itself,\nmaking them the key for the next 16 blocks.\nThis provides a kind of <a href=\"https://en.wikipedia.org/wiki/Forward_secrecy\" target=\"_blank\">forward secrecy</a>:\nif a system is compromised by an attack that\nrecovers the entire memory state of the generator, only values generated\nsince the last rekeying can be recovered. The past is inaccessible.\nChaCha8Rand as defined so far must generate 4 blocks at a time,\nbut we chose to do this key rotation every 16 blocks to leave open the\npossibility of faster implementations using 256-bit or 512-bit vectors,\nwhich could generate 8 or 16 blocks at a time.</li>\n</ul>\n<p>We wrote and published a <a href=\"https://c2sp.org/chacha8rand\" target=\"_blank\">C2SP specification for ChaCha8Rand</a>,\nalong with test cases.\nThis will enable other implementations to share repeatability with the Go implementation\nfor a given seed.</p>\n<p>The Go runtime now maintains a per-core ChaCha8Rand state (300 bytes),\nseeded with operating system-supplied cryptographic randomness,\nso that random numbers can be generated quickly without any lock contention.\nDedicating 300 bytes per core may sound expensive,\nbut on a 16-core system, it is about the same as storing a single shared Go 1 generator state (4,872 bytes).\nThe speed is worth the memory.\nThis per-core ChaCha8Rand generator is now used in three different places in the Go standard library:</p>\n<ol>\n<li>\n<p>The <code>math/rand/v2</code> package functions, such as\n<a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#Float64\"><code>rand.Float64</code></a> and\n<a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#N\"><code>rand.N</code></a>, always use ChaCha8Rand.</p>\n</li>\n<li>\n<p>The <code>math/rand</code> package functions, such as\n<a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Float64\"><code>rand.Float64</code></a> and\n<a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Intn\"><code>rand.Intn</code></a>,\nuse ChaCha8Rand when\n<a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Seed\"><code>rand.Seed</code></a> has not been called.\nApplying ChaCha8Rand in <code>math/rand</code> improves the security of programs\neven before they update to <code>math/rand/v2</code>,\nprovided they are not calling <code>rand.Seed</code>.\n(If <code>rand.Seed</code> is called, the implementation is required to fall back to the Go 1 generator for compatibility.)</p>\n</li>\n<li>\n<p>The runtime chooses the hash seed for each new map\nusing ChaCha8Rand instead of a less secure <a href=\"https://github.com/wangyi-fudan/wyhash\" target=\"_blank\">wyrand-based generator</a>\nit previously used.\nRandom seeds are needed because if\nan attacker knows the specific hash function used by a map implementation,\nthey can prepare input that drives the map into quadratic behavior\n(see Crosby and Wallach’s “<a href=\"https://www.usenix.org/conference/12th-usenix-security-symposium/denial-service-algorithmic-complexity-attacks\" target=\"_blank\">Denial of Service via Algorithmic Complexity Attacks</a>”).\nUsing a per-map seed, instead of one global seed for all maps,\nalso avoids <a href=\"https://accidentallyquadratic.tumblr.com/post/153545455987/rust-hash-iteration-reinsertion\" target=\"_blank\">other degenerate behaviors</a>.\nIt is not strictly clear that maps need a cryptographically random seed,\nbut it’s also not clear that they don’t. It seemed prudent and was trivial to switch.</p>\n</li>\n</ol>\n<p>Code that needs its own ChaCha8Rand instances can create its own <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/v2/#ChaCha8\"><code>rand.ChaCha8</code></a> directly.</p>\n<h2 id=\"fixing-security-mistakes\">Fixing Security Mistakes</h2>\n<p>Go aims to help developers write code that is secure by default.\nWhen we observe a common mistake with security consequences,\nwe look for ways to reduce the risk of that mistake\nor eliminate it entirely.\nIn this case, <code>math/rand</code>’s global generator was far too predictable,\nleading to serious problems in a variety of contexts.</p>\n<p>For example, when Go 1.20 deprecated <a target=\"_blank\" href=\"https://go.dev/pkg/math/rand/#Read\"><code>math/rand</code>’s <code>Read</code></a>,\nwe heard from developers who discovered (thanks to tooling pointing out\nuse of deprecated functionality) they had been\nusing it in places where <a target=\"_blank\" href=\"https://go.dev/pkg/crypto/rand/#Read\"><code>crypto/rand</code>’s <code>Read</code></a>\nwas definitely needed, like generating key material.\nUsing Go 1.20, that mistake\nis a serious security problem that merits a detailed investigation\nto understand the damage.\nWhere were the keys used?\nHow were the keys exposed?\nWere other random outputs exposed that might allow an attacker to derive the keys?\nAnd so on.\nUsing Go 1.22, that mistake is just a mistake.\nIt’s still better to use <code>crypto/rand</code>,\nbecause the operating system kernel can do a better job keeping the random values\nsecret from various kinds of prying eyes,\nthe kernel is continually adding new entropy to its generator,\nand the kernel has had more scrutiny.\nBut accidentally using <code>math/rand</code> is no longer a security catastrophe.</p>\n<p>There are also a variety of use cases that don’t seem like “crypto”\nbut nonetheless need unpredictable randomness.\nThese cases are made more robust by using ChaCha8Rand instead of the Go 1 generator.</p>\n<p>For example, consider generating a\n<a href=\"https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)\" target=\"_blank\">random UUID</a>.\nSince UUIDs are not secret, using <code>math/rand</code> might seem fine.\nBut if <code>math/rand</code> has been seeded with the current time,\nthen running it at the same instant on different computers\nwill produce the same value, making them not “universally unique”.\nThis is especially likely on systems where the current time\nis only available with millisecond precision.\nEven with auto-seeding using OS-provided entropy,\nas introduced in Go 1.20,\nthe Go 1 generator’s seed is only a 63-bit integer,\nso a program that generates a UUID at startup\ncan only generate 2⁶³ possible UUIDs and is\nlikely to see collisions after 2³¹ or so UUIDs.\nUsing Go 1.22, the new ChaCha8Rand generator\nis seeded from 256 bits of entropy and can generate\n2²⁵⁶ possible first UUIDs.\nIt does not need to worry about collisions.</p>\n<p>As another example, consider load balancing in a front-end server\nthat randomly assigns incoming requests to back-end servers.\nIf an attacker can observe the assignments and knows the\npredictable algorithm generating them,\nthen the attacker could send a stream\nof mostly cheap requests but arrange for all the expensive requests\nto land on a single back-end server.\nThis is an unlikely but plausible problem using the Go 1 generator.\nUsing Go 1.22, it’s not a problem at all.</p>\n<p>In all these examples, Go 1.22 has eliminated or greatly reduced\nsecurity problems.</p>\n<h2 id=\"performance\">Performance</h2>\n<p>The security benefits of ChaCha8Rand do have a small cost,\nbut ChaCha8Rand is still in the same ballpark as both the Go 1 generator and PCG.\nThe following graphs compare the performance of the three generators,\nacross a variety of hardware, running two operations:\nthe primitive operation “Uint64,” which returns the next <code>uint64</code> in the random stream,\nand the higher-level operation “N(1000),” which returns a random value in the range [0, 1000).</p>\n<p><img src=\"https://go.dev/blog/chacha8rand/amd.svg\" />\n<img src=\"https://go.dev/blog/chacha8rand/intel.svg\" />\n<img src=\"https://go.dev/blog/chacha8rand/amd32.svg\" />\n<img src=\"https://go.dev/blog/chacha8rand/intel32.svg\" />\n<img src=\"https://go.dev/blog/chacha8rand/m1.svg\" />\n<img src=\"https://go.dev/blog/chacha8rand/m3.svg\" />\n<img src=\"https://go.dev/blog/chacha8rand/taut2a.svg\" />\n</p>\n<p>The “running 32-bit code” graphs show modern 64-bit x86 chips\nexecuting code built with <code>GOARCH=386</code>, meaning they are\nrunning in 32-bit mode.\nIn that case, the fact that PCG requires 128-bit multiplications\nmakes it slower than ChaCha8Rand, which only uses 32-bit SIMD arithmetic.\nActual 32-bit systems matter less every year,\nbut it is still interesting that ChaCha8Rand is faster than PCG\non those systems.</p>\n<p>On some systems, “Go 1: Uint64” is faster than “PCG: Uint64”,\nbut “Go 1: N(1000)” is slower than “PCG: N(1000)”.\nThis happens because “Go 1: N(1000)” is using <code>math/rand</code>’s algorithm for\nreducing a random <code>int64</code> down to a value in the range [0, 1000),\nand that algorithm does two 64-bit integer divide operations.\nIn contrast, “PCG: N(1000)” and “ChaCha8: N(1000)” use the <a target=\"_blank\" href=\"https://go.dev/blog/randv2#problem.rand\">faster <code>math/rand/v2</code> algorithm</a>,\nwhich almost always avoids the divisions.\nRemoving the 64-bit divisions dominates the algorithm change\nfor 32-bit execution and on the Ampere.</p>\n<p>Overall, ChaCha8Rand is slower than the Go 1 generator,\nbut it is never more than twice as slow, and on typical servers the\ndifference is never more than 3ns.\nVery few programs will be bottlenecked by this difference,\nand many programs will enjoy the improved security.</p>\n<h2 id=\"conclusion\">Conclusion</h2>\n<p>Go 1.22 makes your programs more secure without any code changes.\nWe did this by identifying the common mistake of accidentally using <code>math/rand</code>\ninstead of <code>crypto/rand</code> and then strengthening <code>math/rand</code>.\nThis is one small step in Go’s ongoing journey to keep programs\nsafe by default.</p>\n<p>These kinds of mistakes are not unique to Go.\nFor example, the npm <code>keypair</code> package tries to generate an RSA key pair\nusing Web Crypto APIs, but if they’re not available, it falls back to JavaScript’s <code>Math.random</code>.\nThis is hardly an isolated case,\nand the security of our systems cannot depend on developers not making mistakes.\nInstead, we hope that eventually all programming languages\nwill move to cryptographically strong pseudorandom generators\neven for “mathematical” randomness,\neliminating this kind of mistake, or at least greatly reducing its blast radius.\nGo 1.22’s <a href=\"https://c2sp.org/chacha8rand\" target=\"_blank\">ChaCha8Rand</a> implementation\nproves that this approach is competitive with other generators.</p>\n </div>",
"author": "",
"favicon": "https://go.dev/images/favicon-gopher.svg",
"source": "go.dev",
"published": "",
"ttr": 685,
"type": ""
}