CS Professor’s Work Anything But Predictable

Share story

Algorithmic random number generators (RNGs) lie at the heart of many important algorithms in computer science. They are hard at work assisting in a variety of tasks, including computer simulation, computational creativity, global optimization, robotics, games and more. Yet despite our widespread reliability on them, many of the most widely used RNGs are actually flawed, says Harvey Mudd College Professor of Computer Science Melissa O’Neill.

“Most people are surprised to learn that the RNGs in languages like Python, Java and C++ are both easily predicted and have serious statistical flaws,” says O’Neill. “It’s surprising because RNGs have been around for about 50 years, so you might think that the problem would have been well solved already.” But there was no good all-round solution, until now.

O’Neill has developed a family of RNGs, under the acronym “PCG” (permuted congruential generator), that set a new performance standard. “Prior to the PCG family, there was always a trade-off,” she says. “You could pick a fast RNG, or a hard-to-predict one, or one with good statistical performance, but you couldn’t find one that would give you all the desirable features together.” O’Neill’s PCG scheme changes that, providing a broader array of features than found on any prior RNG, combining speed, ease of use and low predictability, all contained within a small amount of code and minimal memory usage.

O’Neill became interested in random number generation last year by chance. “I saw a talk where the speaker was critical of a particular kind of low-quality RNG and said it should never be used,” she says. “I always encourage students to think critically about the things they’re told, and in this case I applied that same skepticism. Should it really never be used?”

She became curious about whether she could turn a very low-quality RNG into a very good one. With the development of PCG, she succeeded. The end result was what she calls “a new randomized algorithm for randomness,” allowing an RNG to apply its own randomness to itself.

O’Neill first presented the work to Harvey Mudd summer research students, in part to show them how a fresh perspective on an old problem can yield new results. “As they embarked on their own projects, I wanted them to understand how valuable curiosity and play are,” says O’Neill. Colleagues who saw her presentation encouraged her to publish it. As a result, she formalized and generalized her work, culminating in a detailed academic paper (currently under review). Analysis indicates that the performance of the PCG family in statistical tests isn’t merely good, but optimal.

RNG table

PCG family versus other RNGs

O’Neill notes that the success of an RNG doesn’t just depend on its technical qualities; the truly successful ones are those that everyone uses. To that end, she has built a website that provides an accessible description of the advantages of the PCG scheme and offers downloadable code. Not long after the website went live, O’Neill was contacted by Mudd alumnus Zvi Effron ’10, who informed her that leading game industry graphics programmers were raving about the paper on Twitter. “It’s a new world,” says O’Neill, “where part of being successful isn’t just how well the work does in academic peer review, but also how much traction it gets on social media!”

One of the most popular aspects of her work is a more whimsical part of the PCG website that shows how PCG-based RNGs can perform “party tricks.” It includes an RNG that will produce 3,000 billion billion completely random numbers, then output Hamlet, and then return to completely random numbers again.

O’Neill’s research on random number generation isn’t the first case of her work being popular with bloggers as well as academics. Her 2009 paper, “The Genuine Sieve of Eratosthenes,” was also immensely popular online.

O’Neill sees her research as imbued with the spirit of Harvey Mudd—learning for learning’s sake, healthy skepticism and, most importantly, play.  She hopes it will be an empowering example to encourage her students to think outside the box and consider novel approaches to problem solving. “Some people think that undergraduates should aim low in research because of their relative lack of experience,” she says. “But I encourage them to aim high, because sometimes a fresh perspective is a wonderful thing.”

Hired by the college in 2001, O’Neill has taught Data Structures and Program Development, Programming Languages, and Operating Systems. Her other areas of active research include memory management and making parallel and concurrent programming easier.

O’Neill recently spoke about the PCG family at the Stanford computer science colloquium. The PCG website is www.pcg-random.org.