Toha - Property Statistics

An early warning that this is a pretty nerdy post, with lots of charts that might bring back some unpleasant memories of math lessons long-forgotten.

My wife and I reviewed somewhere between 15,000 and 20,000 sample outputs during the development of Toha. Thankfully (although with a tiny tinge of disappointment), nothing in the final outputs was a total curveball.

However, we did spot that some color palettes came up much more or less frequently than we’d expected. We also noticed that the stencils were picked less uniformly than a perfect probability match would have predicted.

Alongside our own observations, several members of the GEN.ART Discord community were curious about whether the outputs of Toha were in line with my expectations.

This post is an attempt to provide a detailed answer to that question. I don’t know if it will be as interesting as they or I hope, but I’ve crunched the numbers now so I feel compelled to share them :)

How random is random?

Nearly all generative art relies on random number generation (RNG) to provide the variation between outputs. We might use this to control the choice of color palette, the creation of a noise or flow field, or determine whether a particular trait is active.

I write most of my projects using p5.js, which provides a really useful function called “random”. Called with no arguments, it returns a number from 0 up to (but not including) 1. But just how random is it? Let’s try a simple experiment to find out.

Coin Toss Simulation

In this short code sample, we ask for a random number 20 times.
If it’s less than 0.5 (which should happen 50% of the time), we increase our “heads” tally by 1; otherwise we increase “tails”.
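
A minimal sketch of that tally looks something like the following. It assumes plain JavaScript's Math.random() as a stand-in for p5's random() so that it runs outside a p5 sketch, and the function name tossCoins is just an illustrative choice.

```javascript
// Tally heads and tails over a number of simulated coin tosses.
// `rng` is any function returning a number in [0, 1). It defaults to
// Math.random, which stands in for p5's random() here (an assumption,
// so the sketch can run without p5 loaded).
function tossCoins(tosses, rng = Math.random) {
  let heads = 0;
  let tails = 0;
  for (let i = 0; i < tosses; i++) {
    if (rng() < 0.5) {
      heads += 1; // below 0.5 counts as heads
    } else {
      tails += 1; // everything else counts as tails
    }
  }
  return { heads, tails };
}

const tally = tossCoins(20);
console.log(`heads: ${tally.heads}, tails: ${tally.tails}`);
```

Each run prints a different tally, which is the whole point of the experiment.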

In a “perfect probability” world we’d expect 10 heads and 10 tails, but just like in the real world, that exact split is the exception rather than the rule (it comes up in fewer than one in five runs). Running this 5 times, the results were:

heads: 7, tails: 13
heads: 7, tails: 13
heads: 13, tails: 7
heads: 10, tails: 10
heads: 9, tails: 11

The more “coins” we toss, the nearer to a 50/50 split we tend to get, but even with 2,000 tosses the first result was:

heads: 1056, tails: 944
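
To put that result in context, here is a rough sketch of how much wobble to expect. It relies on the standard binomial result that the heads tally for n fair tosses has a standard deviation of sqrt(n × 0.5 × 0.5). The function names are illustrative, not part of Toha or p5.

```javascript
// Typical spread of the heads tally for `tosses` fair coin tosses.
// Uses the binomial standard deviation sqrt(n * p * (1 - p)) with p = 0.5.
function headsSpread(tosses) {
  const expected = tosses * 0.5;
  const stdDev = Math.sqrt(tosses * 0.5 * 0.5);
  return { expected, stdDev };
}

// How many standard deviations an observed tally sits from the expected one.
function deviationsFromExpected(heads, tosses) {
  const { expected, stdDev } = headsSpread(tosses);
  return (heads - expected) / stdDev;
}

console.log(headsSpread(2000).stdDev.toFixed(1));           // 22.4
console.log(deviationsFromExpected(1056, 2000).toFixed(1)); // 2.5
```

So 1056 heads sits about 2.5 standard deviations above the expected 1000: on the unusual side, but not outlandish. In percentage terms it is 52.8% heads, compared with anything from 35% to 65% in the 20-toss runs above.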

Toha - Expectations vs Reality

With 999 outputs and over 1,000,000,000 possibilities for the published properties alone (there are several others that work hard behind the scenes), Toha relies very heavily on those virtual coin tosses.

So let’s take a look at how that played out across each of the properties. If you are not a data nerd, this is your final chance to escape.

Still here? Awesome, let’s do this :)

In each of the charts below, the grey bar is the expected percentage and the green bar is what actually happened. In the case of Distortion, Reflection, and Stencil, I have removed the “None” value to make the remaining values easier to compare.

I won’t provide commentary as we go, but will attempt a summary at the end.

Color Palettes

Color Variation

Distortion

Edge Movement

Layer Count

Layer Gap

Layer Spread

Layer Style

Layer Thickness

Reflection

Scaling

Stencil

A Summary

In general, the outcomes were broadly as expected. The scaling chart in particular gives me a warm fuzzy feeling with its lovely near-uniformity. The main surprise when I compiled the data was the over-representation of Beachball, Dusk, Iris, and Mulan.

I re-ran the section of code responsible for color selection to retest this and got a more even distribution, but three (different) colors were still noticeably more common. With a large number of palettes, this is probably to be expected (I’d love to hear from someone more knowledgeable about this than me). I suppose that if you rolled a 30-sided die 999 times, you would still expect some numbers to come up much more often than others.
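
The dice analogy is easy to try out. The sketch below is only an illustration: the 30 sides and 999 rolls come straight from the analogy rather than from Toha's actual palette code, Math.random() again stands in for p5's random(), and rollDice is a made-up name.

```javascript
// Roll a fair die with `sides` faces `rolls` times and count each face.
// `rng` defaults to Math.random, standing in for p5's random() (an assumption).
function rollDice(sides, rolls, rng = Math.random) {
  const counts = new Array(sides).fill(0);
  for (let i = 0; i < rolls; i++) {
    const face = Math.floor(rng() * sides); // 0 .. sides - 1
    counts[face] += 1;
  }
  return counts;
}

const diceCounts = rollDice(30, 999);
const expectedPerFace = 999 / 30;
console.log(`expected per face: ${expectedPerFace.toFixed(1)}`);
console.log(`least common face: ${Math.min(...diceCounts)} rolls`);
console.log(`most common face: ${Math.max(...diceCounts)} rolls`);
```

Each face is expected about 33 times, with a standard deviation of roughly 5.7 (binomial, with a 1 in 30 chance per roll). With 30 faces in play, it would not be surprising to see the most common face in the mid 40s and the least common in the low 20s, so a handful of apparent favorites can show up by chance alone.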

Thanks again for taking the time to read this post and don’t hesitate to say hello via Twitter.