Skip to content

How To Create A Random Dice Simulator In Excel (Simulating Catan)

Are you ready for a really really (really) nerdy post? Yeah, I thought so.

So recently, on this blog, we took a look at probability in the board game Catan. The outcome was to say that, although 7 is statistically speaking the most likely single number to come out in a game of Catan, there is a high chance that it will not be the most likely result to come out across a series of dice rolls. This was simulated by me rolling 240x 2D6 and recording the results. There are, on average (according to some sources), around 80 rolls of the dice in a game of Catan, so that was a simulation of three games.

In that blog, I said that this is an incredibly fallible way of doing things as ultimately the dice results would even out and become more of a bell curve the more results we have.

Chatting with a couple of ace bloggers in the comments, Dave (from the blog Faith&Steel) and Faust (from the blog Double Down Dice), they suggested that I actually use something like Excel to simulate dice instead. This eliminates any wonky dice, as well as saves time, effort, and provides more accurate results. It allows for the generation of thousands of results rather than hundreds.

So, with that in mind, I took their advice and created a random dice generator in Excel. As we all know, there are a hundred thousand ways of doing things in Excel, so I thought I would show you my simulator, how I made it, what the formulas are, and what the results are. This is how I simulated 500 games of Catan, in a matter of minutes, using Excel. Coincidentally, by writing this article we are going to cover how to create a random dice simulator in Excel, so feel free to build along if you so wish.

How To Create A Random Dice Simulator In Excel

Step 1: Decide How Many Games To Simulate

Using the premise that a Catan game is around 80 rolls of the dice (I’m assuming 4 players, and 20 rounds looking at this as a secondary source and erring on the side of potential mathematics) and wanting to simulate 100x games to begin with we will use Columns B through CW. This is 100 cells across and then rows 2 through 81. This highlights 8000 cells across the middle. What we are looking for is figuring out the most rolled numbers across the games.

Step 2: Using A Basic 2D6 Formula

In cell B2 we are going to use the following formula:

=(RANDBETWEEN(1,6))+(RANDBETWEEN(1,6))

To explain that a bit – Randbetween generates a random number between the two numbers in the brackets. In this case it is 1 and 6 since we are simulating a D6. The Randbetween is added to itself to simulate 2D6.

It is worth noting that, if you ever want to add on an additional dice, so you want to simulate 3D6 for example, you just add another “+(RANDBETWEEN(1,6))” on the end. If you ever want to simulate a different type of dice, you turn it to “RANDBETWEEN(1,x)” where “x” is the number of sides on the dice.

This is dragged down to cover B2 through B81, and then highlighted and dragged across to CW81, filling in all the space in between.

Why start at B2? Well, I was debating if I would need a column at the side, so started with B2. You can do A2 to CV81 instead if you so wish.

How to build a random dice simulator in Excel - An example segment.

An example segment.

Just be warned that, due to the =randbetween, the cells will randomly change whenever you edit a cell from now on. You can get over this, if you want to, by copying and pasting the cells onto themselves; however, I wouldn’t recommend it at this moment in time.

Step 3: Finding the Mode

Since, in this case, we are wanting to know the most common number in each set, to get a modal average, we need to look at creating a mode calculator for each column. This means applying the following formula to space beneath each “simulated game”. I put it in the 84th row.

=MODE(B2:B81)

This provides the most commonly rolled number across the game in that cell. Dragging it across will generate the mode for each set of games.

For me, those modal formulas were written in B84 to CW84. This is important for later.

Step 4: Collate the Modes

Catan Simulated Rolls

Laid out rolls. Ignore the numbers at the moment, we will come onto those in a minute.

The next step is to collate the modes together to find how many there are of each dice roll to find out what number comes up the most in each game. For this, we will need 11 different cells and formulas, one for each combo there is when rolling 2D6 (you can’t roll a 1 on 2D6). These are laid out in a spare space like as shown on the right.

To explain this, there were, for instance, 2x games out of the first 100 where the most rolled number was 4. There were 3x games where the most rolled number was 5.

In the number column next to the Avg. 2, you need to write the following formula:

=COUNTIF($B$84:$CW$84,2)

This is going to take some explaining.

Effectively, the COUNTIF function in Excel will count the number of results. The way to think of it is COUNTIF(range,result), so, next to the Avg. 2 we want to count the modal results that are of 2. The messy looking $B$84 (etc.) just says that, when dragging those cells, keep that part of the formula the same. That is what the $ does in Excel.

So, for next to the Avg. 3 we want to count the number of times 3 comes up in the modal row (B84 to CW84 in this case). Thus the formula is:

=COUNTIF($B$84:$CW$84,3)

You can see the digit on the end has now changed to 3. Depending on your version of Excel, you may be able to drag this down and have the formula change automatically, or you may need to do it by hand.

Side Note: Why the Mode?

When we talk about averages we usually talk about three different types. These are Mean, Median, and Mode.

Mean is where we take all the rolls, add them together and divide by the number of rolls. This gives an average roll value; however, in most cases, it will be a decimal. This isn’t hugely what we want.

The Median gives us the middle roll so it will tell us, if we sorted all the rolls in numerical order and lined them up, what the roll around position 40 is.

The Mode, instead, gives us the most rolled number. This is what we are interested in as it gives us a full, tangible, and real number to use.

Step 5: Graph the Results

Take the results, of which you should now have 11, one for each roll possible on 2D6, into a line or bar graph. Personally, I prefer a line graph/chart in this case, but it is up to you.

The graph will be a representative of 100x games of Catan and the modal rolls for each one. Save the graph as a picture or take a snapshot or whatever you want to do to keep a record.

Step 6: Re-Enter the Formula

As mentioned near the top of this guide, the RANDBETWEEN will change every time you change something in the form. Now you want to do it intentionally, so take one box and drag it onto another. The formula is identical, but the results for all 8000 simulated rolls will change. This will change everything automatically, including the chart. Clever, eh?

So, I did it five times, giving me 40,000 simulated rolls in total, which kicks the ass of the 240 I did manually before. This is approximately 500 games (ish, depending on how long you take a game to be).

These are the results.

Simulating Catan – The Results

So, for those who read my other article (you can here) you will know that the determined result was that 80 rolls may not be enough for any probability pattern to take hold. Over the course of one game, seven may not be the most common number. This led to a graph like this, showing the results of 240 different rolls across three games.

3x 2D6, each rolled 80 times, simulating 80 games of Catan.

3x 2D6, each rolled 80 times, simulating 80 games of Catan.

Game 1, for instance, barely had a 7 come up at all (only 6 times). Across the three games we actually saw 7 be the third most common result, with 8 being most common, followed by 6. This isn’t all that unusual.

That being said, when we start collating results across games, rolling a lot more games, and taking the mode result rather than each dice result, we tend to see a different story.

Games 1-100 simulated and the most common result in each game.

Games 1-100 simulated and the most common result in each game.

We start to see more of a pattern that keeps arising.

Example 2

Games 101-200 simulated and the most common result in each game.

And the pattern keeps on and on, only varying slightly each time.

Games 201-300 simulated and the most common result in each game.

Games 201-300 simulated and the most common result in each game.

In fact, we start to see some incredibly interesting results.

Example 4

Games 301-400 simulated and the most common result in each game.

And finally…

Example 5

Games 401-500 simulated and the most common result in each game.

Interesting, eh?

So what we are seeing is two-fold. The first is that 7 is indeed the most common number to come up. That is point number one. There is no doubt about it – across enough rolls it is the most likely of all the numbers.

That being said, across all the 500 games there are also games where 4 has been the most rolled number. There are games where 10 has been. Where statistically unlikely, it is not impossible. Curiously, there appears a need for the number to have at least a 3/36 possibility of coming up for it to really stand a chance (using 500 games sample) of dominating a game.

This means that although 7 is the most likely, according to those 500 games, it still only has under a 50% of being the most dominant number in any given game. This means it is more likely that the mode will not be 7 than that it will be.

Over 10,000 Games (or 800,000 Rolls)

Over 10,000 Simulated Games of Catan

Modes Over 10,000 Simulated Games of Catan

So…never one to back down from a challenge, let’s look at 10,000 games (which goes up to Column NTQ, if we leave A blank, in Excel). This almost crashed my computer…

Needless to say, I needed to alter the final set of formulas to incorporate row NTQ into account.

Interestingly, when looking at really large numbers of games (no, I am not going to go higher than 10,000 simulated games in this article) we start seeing games where the most rolled number is 3 or 11. Part of me is curious to see how many games we would need to simulate to have two or twelve come up as the most likely number, but to be honest, I’m not sure my computer will take it.

To put all of this into a graph then it will look like this.

Example 6

10,000 Games Simulated.

That feels fairly definitive, and a much better method than rolling each individually. Judging how long it took me to simulate three games, it would have taken me around 104 days to simulate 10,000.

So there you have it – an incredibly geeky and nerdy and mathy article about probability in Catan. Weirdly, I really enjoyed writing this and had to work out a lot of Excel functions to write this post, so now I’m wondering what else I can simulate to this level.

Let’s find out!

Once again, a special thanks to Dave (you can read his latest article here) and Faust (you can read his latest article here) for their suggestions and in schooling me in how I should be doing things. Part of what this blog is about is learning as we go along, so if you have any ideas on how to improve this analysis please let me know in the comments below.

There we go – now you know how to create a random dice simulator in Excel. Now you too can simulate 10,000 games of Catan. I don’t know why you should unless you want to do analysis like this, but the important thing is you can.

Anyway, for the time being, what are your thoughts. Is there a more efficient way? How would you calculate the rolls? Let me know in the comments below.

If you enjoyed reading this
you may also enjoy:
Catan Analysis: Understanding the Numbers

19 Comments »

  1. Very awesome idea for an article! I’ll have to read more in depth and comment once I’m not toddler watching. Actually my way of doing things, is usually write a program in whatever computer language I’m using at the time. Though you probably got your answers much quicker in Excel. The only advantage to writing an app/program would possibly be less overhead when crunching 1000s of games. I can’t remember how many game simulations I had run in mine at once, but it was probably in the 1000s or 100,000s and I was checking initiative, attack rolls, defense rolls, damage, hit points, etc. Yay for robust web servers handling all my computations, haha!

    Also you can get graphs much quicker in Excel.

    Liked by 1 person

    • I can see why you’d use code. To be completely honest, I think this works in Excel because it is a simple roll. It is simply 2D6. Once you have more than one variable then it makes sense to use code and create a calculator. What code did you use?

      Liked by 1 person

      • You are definitely right. I’m also not an Excel wiz, so inputting what I was trying to do into Excel, never even occurred to me. For the system I was building, I used PHP running off a web server. There was also a mysql database filled with stats, so I could make things more dynamic if I wanted to later.

        Like

        • Whew! Just finished reading this (toddler-free!). Once again, it’s a great and well put together article. justneedsvarnish already mentioned the one thing that sprung into my mind and that was if the random number function was truly random . I ran into that in my program, and there was another function I was able to use make things more random. Again, it’s likely close enough when you spread this across 10,000 games and I don’t think you have a choice in Excel anyways.

          Always a pleasure to read your analytic approaches to games! πŸ™‚

          Like

  2. Good article and fun to watch a normal distribution of the mode develop. Of course the metaphysical component is stripped out – the component in which a human mind is coupled to each roll – the very component that assured if you need a specific number you will roll anything but! πŸ˜‰

    Liked by 1 person

    • Ahh, yes, the “why can’t I stop rolling the number 2 on 2D6” effect – I know it well.
      Actually, in all seriousness, we once had a game where one guy, when it came around to his turn he would roll 11. He did it four or five turns in a row, and then frequently throughout the rest of the game. Weird rolling.

      Like

      • Better than the game where one person rolled the Robber over 50% of the time. πŸ˜‰ Those odd games really stick in one’s mind though. We also played a game of The Resistance where one person drew ‘bad guy’ every game, coupled with the fact they were really bad at lying….it made for a hilarious night.

        Like

  3. Cheers for this Luke, you just managed to combine my two favourite geek things. Hobby and Excel. I use it every day but rand is not something I’d ever used, probably as I deal with actual rather than probabilities (you learn something every day!). I’m so going to have to try this out now.

    Liked by 2 people

    • You know, it’s probably possible to create an ace Warhammer simulator using Excel…
      Like…working out hits and misses, and then rerolls on special abilities for specific units, and thus whole armies…
      Hmmm…now that’s a thought…

      Liked by 1 person

  4. Thanks for sharing this Luke! I’ve used Excel for working out probabilities for my own wargames rules and find it it really useful! But I’ve never generated random numbers, mainly because I don’t know just how random it is when Excel does it! But from your (very large) population it seems to behave as expected!

    Liked by 2 people

    • So, this was something I was curious about so I asked a few people who know Excel and computers well. They suggested that it’s not really random, but rather somehow uses the clock to generate the numbers. I have no idea if that is true or not, but ultimately, even if it isn’t random (because it must be based on some form of computer function or other) the division it uses is so minute that it appears to be random to us.

      Liked by 1 person

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: