A Primer on RNG. Kinda.
SkittishSloth
Because absolutely no one asked for it, I thought I'd throw out a very basic intro to how RNG often (sometimes?) is handled in video games - like Timelines. A couple of quick caveats:
Dammit Jim, I'm a coder, not a mathematician (or statistician). I'm a programmer, I've been one for almost twenty years. Always business-type stuff, never anything like video games. I always wanted to write games and used to try my hand at it, but bills and reasons and whatever. Some of the stuff I say might be off, or even flat out wrong, but hopefully it's close enough for government work. If you're reading this because you need the secrets of random numbers in the next five minutes to defuse a bomb, then I hope you figure it out so you can explain the sequence of events that led you to looking for bomb defusal advice on a game forum.
I hope this is somewhat obvious, but I don't work for DB. I haven't even filed a CS ticket that I can recall - no relationship with them whatsoever, and since I think this is my first post (versus a comment), I might have less of a relationship than a lot of you reading this. I want to make that clear because I have zero idea of how their stuff actually works. It could be exactly the same, it could be entirely different and make this post look idiotic at best. If I had to guess I'd say it's probably somewhere in between.
Also, this is only indirectly inspired by the recent Kirk-gate, and has no real bearing on it at all.
With those out of the way, let's begin shall we?
First, to be that guy, we aren't dealing with truly random numbers, we're dealing with pseudo-random numbers. They're generated by an algorithm - a list of instructions - so they aren't truly random. If you're bored enough and know the algorithm used, you can figure out the next number if you know what the last one was. That's entirely useless, but now you can sound like a pedantic jerk at your next dinner party.
Random number generators normally create a number between 0 and 1, something with a ton of decimal places. Something like 0.1234567898765432. Java - the programming language I use most of the time - gives you roughly 16 digits after the decimal. I imagine whatever DB uses (if not Java, for Android at least) is pretty much the same. We won't worry a lot about that right now.
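To make the "it's an algorithm, so it repeats" point concrete, here's a minimal Java sketch. It uses java.util.Random purely as a stand-in (nobody outside DB knows what they actually use), and SeedDemo and firstValues are names made up for this example. Start the generator from the same seed twice and you get the same "random" numbers both times.

```java
import java.util.Arrays;
import java.util.Random;

class SeedDemo {
    // Returns the first `count` values produced by a generator started from `seed`.
    static double[] firstValues(long seed, int count) {
        Random rng = new Random(seed);
        double[] values = new double[count];
        for (int i = 0; i < count; i++) {
            values[i] = rng.nextDouble(); // each value is in [0.0, 1.0)
        }
        return values;
    }

    public static void main(String[] args) {
        double[] a = firstValues(42L, 3);
        double[] b = firstValues(42L, 3);
        // Same seed, same algorithm, so the two sequences match exactly.
        System.out.println(Arrays.equals(a, b)); // prints true
        System.out.println(Arrays.toString(a));
    }
}
```

That's also why the starting point (the seed) matters so much: without it, knowing the algorithm alone doesn't tell you where in the sequence you are.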
Let's say you want something to happen 50% of the time. You'll do something like this:
chance = getRandomNumber()
if (chance <= .5)
    doStuff()
If you aren't a coder, I hope that still makes sense. We get a random number - between 0 and 1 with all those decimals - and if it's less than or equal to .5 we do something. So - in case it's been a while since you took math - to check what percent the random number is, you multiply it by 100. That example number I had above would be 12.3% (yes, lots more decimal places after). It would have met our condition, so we would've done stuff.
If you have multiple conditions, say 50%, 30%, and 20% - adds up to 100%, so every random number gets picked - you would set up your conditions like this:
if (chance <= .5)
    doFirst()
else if (chance <= .8)
    doSecond()
else if (chance <= 1)
    doThird()
(Normally in programming you wouldn't have the if () part in the last one - I'm just putting it here for clarity.)
We have our first block that takes 50% of the random numbers we throw at it. We want a block for 30%, so we add 30 to 50 and get 80 - .8 - for our second block. We don't care what the actual values are. We could just as well have put the 30% block first (cut-off at .3) and the 50% block second (cut-off at .8), leaving the last 20% for the third block as before. We just care about the size of the chunk of numbers that can trigger the condition.
Now, for an absolutely perfect random number generator, over a long enough run you'll eventually get every number generated equally often. Nothing is perfect. Computer algorithms especially. So there'll be clusters of numbers that pop up. But super smart people out there spent a lot of neurons trying to figure this stuff out, and they work pretty well for all their flaws, and - especially since we don't know what actual algorithm DB is using for their RNGs - we might as well assume it's close enough to a perfect algorithm.
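The "add up the chunks" idea works for any number of blocks. Here's a rough Java sketch of it; WeightedPick and pick are names invented for this example, and it compares with < rather than <= because Java's generator never returns exactly 1 (the difference is far too small to matter here).

```java
import java.util.Random;

class WeightedPick {
    // Maps a roll in [0.0, 1.0) to a block index using running cut-offs.
    // weights = {0.5, 0.3, 0.2} gives cut-offs at 0.5, 0.8 and 1.0.
    static int pick(double roll, double[] weights) {
        double cutoff = 0.0;
        for (int i = 0; i < weights.length; i++) {
            cutoff += weights[i];
            if (roll < cutoff) {
                return i;
            }
        }
        // Only reached if rounding leaves the weights summing to slightly under 1.
        return weights.length - 1;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        double[] weights = {0.5, 0.3, 0.2};
        System.out.println("block " + pick(rng.nextDouble(), weights));
    }
}
```

Swapping the order to {0.3, 0.5, 0.2} changes which rolls land in which block, but each block still catches the same share of rolls.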
The key thing though is that you'll get good distribution over a long enough run. If you ran that little sample algorithm above 10 times, sure! There's a chance you'll get 5 hits in the 50%, 2 in the 20, and 3 in the 30. You might also get all 10 in the 20%. After 100 runs, you'll get closer to the 50/30/20 breakdown, after 1,000 it'll be better still, after 1,000,000 or 1,000,000,000 or... Well, you get the idea. As long as the chance isn't exactly 100%, you can only expect it to balance out over a long enough run. And if you can figure out what "long enough" is, stop reading this and get to the slot machines.
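If you want to watch the "long enough run" effect yourself, this small Java sketch rolls the 50/30/20 example a growing number of times and prints the share each block actually got. LongRun and observedShares are made-up names, and java.util.Random is just standing in for whatever generator a real game would use.

```java
import java.util.Random;

class LongRun {
    // Rolls `runs` times and returns the fraction of rolls that landed in
    // each of the 50% / 30% / 20% blocks.
    static double[] observedShares(Random rng, int runs) {
        int[] hits = new int[3];
        for (int i = 0; i < runs; i++) {
            double chance = rng.nextDouble();
            if (chance < 0.5) {
                hits[0]++;
            } else if (chance < 0.8) {
                hits[1]++;
            } else {
                hits[2]++;
            }
        }
        double[] shares = new double[3];
        for (int i = 0; i < 3; i++) {
            shares[i] = (double) hits[i] / runs;
        }
        return shares;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int runs : new int[] {10, 100, 1_000, 1_000_000}) {
            double[] s = observedShares(rng, runs);
            System.out.printf("%,10d runs: %.3f / %.3f / %.3f%n", runs, s[0], s[1], s[2]);
        }
    }
}
```

The 10-run line will usually look nothing like 0.5 / 0.3 / 0.2, and the numbers change every time you run it. The million-run line should sit very close to it.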
With a distributed game like STT, it's quite possible that a good amount of the RNG is done server-side. If that's the case, a "long enough" run might just be a few minutes - but that's across however many people are playing right then. So that 5% drop you aren't getting shows up as 5% in their records, but across how many different rolls for how many different people around the world?
One thing I'd like to touch on that I'm sure all of us have thought about when our tin foil hats are comfy and the rolls aren't going our way. Yes, it would be entirely possible for DB to have something like this in their code:
if (user is SkittishSloth)
    legendary chance = 0.0001
if (user is Shan)
    legendary chance = 5.0000
As a programmer I'd like to point out that adding any kind of logic to address individual users is an absolute nightmare. You have to make sure it doesn't break anything else (you'd be surprised how hard it is to make sure of that). You have to keep it in mind in the future - what if they change the drop rate, and now that 0.0001 is actually awesome? You could put it in a database, where it'd be easier to change, but that's another moving part to worry about, with its own problems. It's not worth it. When you're having a bad run, DB most likely isn't out to get you - at least, for their developers' sake, I hope they aren't.
Hopefully this shed a tiny glimmer of light on the black-magic-voodoo that decides whether you got Burnham or Shinzon in that pack you shelled out cash for. I know it doesn't change the fact that the hopes you had at seeing the gold border were changed into a little honor and a lot of heartbreak, but knowing a bit of how/why this stuff works like it does takes a touch of the edge off.
...
Okay, no it doesn't. Shinzon is horrible.
Comments
I especially like:
Edit: Incidentally, how would you defuse a bomb with random numbers? ...Asking for a friend.
First, you will not know the next number out of pseudo RNG if you don’t know the seed. And normal people reinitialize seed often to address the issues of repeating numbers.
Second, chance is not directly returned from RNG. RNG returns some integer between 0 and some max threshold (in C/C++ it’s RAND_MAX, which can differ between platforms). So now it really matters how exactly calculations after RNG are implemented. For instance, say you need random numbers between 1 and 10000. One can write
int random = getRandomNumber();   // value is between 0 and 32767 on this system
int chance = random % 10000 + 1;  // modifying the value to be in range 1..10000
if (chance <= 127) {              // 1.27%
    giveLegendaryToThePlayer();
}
Now do you see a problem here?
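One way to see it is to count. The sketch below is Java rather than C (ModuloBias and winningRawValues are names made up for this example), and it assumes every raw value from 0 to 32767 is equally likely. It walks through all of them and counts how many pass the check above.

```java
class ModuloBias {
    // Counts how many of the raw values 0..randMax pass the
    // "random % 10000 + 1 <= threshold" check from the snippet above.
    static int winningRawValues(int randMax, int threshold) {
        int wins = 0;
        for (int random = 0; random <= randMax; random++) {
            int chance = random % 10000 + 1;
            if (chance <= threshold) {
                wins++;
            }
        }
        return wins;
    }

    public static void main(String[] args) {
        int randMax = 32767;
        int wins = winningRawValues(randMax, 127);
        double actual = 100.0 * wins / (randMax + 1);
        // 32768 is not a multiple of 10000, so the low results 1..2768 can be
        // reached from four raw values while the rest can only be reached from three.
        System.out.printf("%d of %d raw values win: %.2f%% instead of 1.27%%%n",
                wins, randMax + 1, actual);
    }
}
```

With these numbers, 508 of the 32768 raw values win, which is about 1.55% rather than the intended 1.27%. In Java, Random.nextInt(bound) is documented to return uniformly distributed values, so it is one way to avoid this kind of skew.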
Third, there are implementations of RNG that utilize entropy from hardware like processors, there are problems with them too, but overall they’re much more ‘really’ random than pseudorandom algorithms, and they’re not hard to use.
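As a rough illustration of "not hard to use", here's what it could look like in Java. SecureRandom seeds itself from the platform's entropy source, which may or may not be backed by hardware depending on the system. EntropyRoll and roll are made-up names, and the 1.27% figure is just borrowed from the example above.

```java
import java.security.SecureRandom;

class EntropyRoll {
    // Returns true with roughly the given probability (0.0 to 1.0).
    static boolean roll(SecureRandom rng, double probability) {
        return rng.nextDouble() < probability; // nextDouble() is in [0.0, 1.0)
    }

    public static void main(String[] args) {
        SecureRandom rng = new SecureRandom(); // no seed to manage by hand
        System.out.println(roll(rng, 0.0127) ? "legendary" : "no legendary");
    }
}
```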
And yes, from my conclusions observing this game, I think everything here is done on the server side, clients are very thin.
If DB don’t initialize RNG on per player basis every pull - they really f.. sorry, messed up
If they initialize RNG only by current time - they really messed up
If they do with the numbers returned by RNG something like I described above - they really messed up.
I’ve seen so many developers doing wrong things with RNG, more than I would like to. And based on my own experience with RNG here, I have a strong feeling that DB are amongst them.
If I had to guess, there is a reason, statistically, to do this. I never studied statistics but I know that it's pretty funked up. Like saying a 1 in 10 chance doesn't mean that if you pull 10 times you're guaranteed 1. Sometimes when you pull 10 and get 2, you just stole someone else's 1. The law of averages makes it so that it is still a 1 in 10 chance across the board but there will be losers and winners. So... Because of this, a shared RNG may make sense to ensure that there are no anomalies of massive jackpots to everyone at once, or to ensure the statistical consistency, etc... *shrug*
Vesmer: I agree with everything you said.
Thank you for the post.
Two ways: use it to determine which wire to cut, or use it to play some kind of guessing game. Eventually the bomb will explode and it's not your problem anymore.
@Vesmer - you make solid points. I didn't really discuss some of the finer details for two reasons.
As for concerns of bias and distributions and everything - the "stringy"-ness @Enoch[SoS] mentioned - I don't think that's a horribly major concern here, and I have a simple thought experiment to elaborate.
Let's say that, instead of a random number generator, there's just a list of numbers that are cycled through. A small amount - maybe 100, 200, 10, whatever. Doesn't really matter. With the first install, it picks the first number, and grabs the next one whenever a "random" number is needed. When you quit, it saves its position and starts with that the next time. When it hits the end of the list, it starts back at the beginning. Pretty simple.
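For the coders following along, that list-based "generator" is only a few lines of Java. This is just a sketch of the thought experiment, not anything a real game is known to do, and CyclingList is a made-up name.

```java
class CyclingList {
    private final double[] numbers;
    private int position; // would be saved on quit and restored on the next start

    CyclingList(double[] numbers, int startPosition) {
        if (numbers.length == 0) {
            throw new IllegalArgumentException("need at least one number");
        }
        this.numbers = numbers.clone();
        this.position = Math.floorMod(startPosition, numbers.length);
    }

    // Hands out the next "random" number and wraps around at the end of the list.
    double next() {
        double value = numbers[position];
        position = (position + 1) % numbers.length;
        return value;
    }

    int position() {
        return position;
    }

    public static void main(String[] args) {
        CyclingList rng = new CyclingList(new double[] {0.12, 0.87, 0.45, 0.63}, 0);
        for (int i = 0; i < 6; i++) {
            System.out.println(rng.next()); // the fifth value repeats the first
        }
    }
}
```

Every call to next() moves the position forward by one, so any part of the game that asks for one extra number shifts what everything after it receives.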
STT is a pretty diverse game; there's lots of stuff to do. Each of these things needs random numbers - probably lots of them - and might even need different amounts of numbers based on whatever number was picked. Away missions, for example. Maybe it uses a random number to determine how many items it pulls, then a random number to determine what each of those items are. (We'll pretend each section of the game knows how to make sense of the random number it gets.)
Because the list resets when it runs out, you can - in theory - figure out what that cycle is and predict what's going to happen. In fact, if you know the cycle, you'll know exactly what's going to happen and when.
Here's the catch though: the cycle will only repeat if you follow the exact steps you did when you ran the cycle. Run the same missions, in the same order. Then pull from the portal - doesn't matter what, just as long as it's the same each time. Run your voyage, with the exact same crew, and pick it up exactly at the same time - if it hits one extra hazard, you just screwed up the cycle because that took another number from your set. Run shuttle missions for the same faction with the same crew (since it's going off the same random number pool, you'll get the same missions with the same success rate percentages).
Anything you do that makes use of a random number - which is most of the game - you have to repeat exactly. Every. Single. Time. You can't do additional portal pulls. You can't do extra missions, or different missions. You can't run shuttle missions against a different faction to get an item, because the mission might fail, or it might succeed - and if it allows a different number of items than your original mission (maybe it only returns 4 items, but your previous one could get you 7), you'll go through a different number of random number pulls for those items. Your voyage failed at 6h 34m? You'll still need to kill it at the same time, so improving your crew won't matter.
With even a small set of "random" numbers, knowing the cycle is completely useless - the game would be absolutely no fun at all, or else you lose the cycle and you'll need to figure it out again.
If you introduce any randomness at all in this - even keeping that same small list but it starts out at a different spot every time you start the game - that makes a difference. Now you have chance involved, so it doesn't matter what order you do things in. Sure you can play long enough to figure out where the cycle is at for this session, but what happens if you get a phone call and it kills your game? Or you have to shut it off to go to work/school/whatever?
Now make it a large bucket of numbers - the 10,000 integers you were talking about, the 0-1 with 16 decimal places I was talking about, whatever. Even without reseeding, the only way you'll ever find a usable pattern is to just keep trying to repeat things until you notice the cycle, try an "exploit" (like "okay, cool, now I should get a Behold from a 90K portal"), and then watch it screw up your cycle. If you got it, great! You'll need to get back to the exact same place somehow to really make use of it, though.
If you throw reseeding into the mix, or if it's a generator on a server somewhere, you don't need some super high-end random number generator. Java's Math.random() would be perfectly sufficient. Hell, just that list of numbers I was using as an example would probably be good enough.
As an aside, thinking a bit more about the server-side vs client-side RNG, I'm more inclined to say that most of the "day-to-day" RNG - mission results, space battles, stuff like that - is actually client side, because doing all of it on a server would be a massive load. Anything that costs money - packs, for example, or Dabo (because dil) - would probably hit the server.
And to your point about repeating cycles, if RNG is server side and is shared across all users, you would never be able to predict the cycle without getting cooperation from everyone.
But yes, I definitely agree with you now that I realize what you're getting at.