Hello all,
There has been a lot of talk about the pattern the game uses to pick villagers on mystery islands. I would like to present my findings in this thread and open it up for statistics discussion to test this theory further.
First, I'd like to thank Selkie for sending me their data to perform statistical tests on and Sheba and ForbiddenSecrets for bringing up the theory in the first place. The theory is that the game first rolls for a species. Once the species is chosen, it then rolls for a specific villager in that chosen species. The game does NOT roll for a specific villager right off the bat.
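To make the difference between the two theories concrete, here is a minimal sketch that works out the per-villager odds both ways. The roster is a toy one (only 3 species, with a made-up "other" species), and the function names are just for illustration; it is not the real 35-species game data.

```python
from fractions import Fraction


def species_first_odds(species_sizes):
    """Chance of one specific villager in each species:
    uniform roll over species, then uniform roll inside that species."""
    n_species = len(species_sizes)
    return {name: Fraction(1, n_species) * Fraction(1, size)
            for name, size in species_sizes.items()}


def flat_odds(species_sizes):
    """Chance of one specific villager if the game rolled
    straight from the whole villager pool."""
    total = sum(species_sizes.values())
    return {name: Fraction(1, total) for name in species_sizes}


# Toy roster (hypothetical: 3 species only, not the real game data).
toy_roster = {"octopus": 3, "cat": 23, "other": 10}

two_stage = species_first_odds(toy_roster)
flat = flat_odds(toy_roster)
for name in toy_roster:
    print(name, "species-first:", two_stage[name], "flat:", flat[name])
```

In the toy roster each octopus is far more likely under the species-first roll than under the flat roll, while each cat is less likely, which is the same effect people have been noticing on mystery islands.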
Here is the thread where this theory was initially proposed:
[Spreadsheet] Mystery Tour Villager "RNG" (Peppys only. Spoiler: Species gets rolled first, thus odds are not equal)
EDIT: It's clear now! TL;DR: Game rolls species first, then villager if there's more than one in the species, making the odds uneven - less possible villagers in a species mean higher chances for those villagers appearing. Thank you to ForbiddenSecrets for helping me realize this! --- So, as I...
www.belltreeforums.com
The way the tests were done is pretty straightforward. People have been noticing the octopuses appearing a lot more than they should, as there are only 3 of them. Therefore, I tested whether the number of octopuses that actually appeared in the sample was significantly different from the number of octopuses we would expect to appear in the sample.
Earlier today, before this theory was brought up, I performed a Chi Square test on Selkie's data to check whether the rolls were completely random (or pseudo-random) or whether there was a pattern. Selkie's data has a sample size of 344. Under the old theory, where the game randomly rolls a villager from the pool of 391, the expected chance of an octopus appearing is 3/391. We would then expect about 2.56 octopus appearances in the 344. The data contained 12. The Chi Square test for this model gives a value of 35.01.
As you can see, the Chi Square value is much larger than the critical value for a p-value of 0.05 in a Chi Square test with 1 degree of freedom, which is about 3.84 (the p-value produced by 35.01 is less than 0.00001). This means the observed and expected counts are indeed statistically different. Therefore, the game does NOT just choose a villager at random out of the 391.
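For anyone who wants to redo the arithmetic, here is a rough sketch of the test in plain Python. It assumes a two-cell layout ("octopus" vs. "not octopus"), which is what gives 1 degree of freedom; the function name is mine. One thing to note: with a sample of 344 the expected count comes out near 2.64, while the 2.56 quoted above (and the later 9.54) match a sample of 334, so the sketch runs both sizes rather than guessing which one was used.

```python
import math


def octopus_chi_square(observed, n, p):
    """Two-cell goodness-of-fit test (octopus vs. not octopus), 1 degree of freedom.

    Returns (expected octopus count, chi-square statistic, p-value).
    """
    expected = n * p
    observed_cells = [observed, n - observed]
    expected_cells = [expected, n - expected]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed_cells, expected_cells))
    # For 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return expected, stat, p_value


CRITICAL_05_DF1 = 3.841  # critical value for p = 0.05 with 1 degree of freedom

# Flat model: every villager equally likely, 3 octopuses out of 391 villagers.
for n in (344, 334):
    expected, stat, p_value = octopus_chi_square(12, n, 3 / 391)
    print(f"n={n}: expected={expected:.2f}, chi2={stat:.2f}, "
          f"p={p_value:.2e}, reject={stat > CRITICAL_05_DF1}")
```

Either sample size rejects the flat model by a wide margin, so the conclusion does not hinge on which number is right.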
After seeing Sheba's thread and ForbiddenSecrets' reply with the species-first theory, I decided to go back to Selkie's data and perform another Chi Square test, this time assuming that the game first chooses a species at random and then a villager in that species. Under this model, the expected chance of an octopus is 1/35, not 3/391. In the sample of 344, we would expect the octopuses to have 9.54 appearances. The new Chi Square test gives a value of 0.65.
As you can see, the Chi Square value is less than the critical value for a p-value of 0.05 in a 1 degree of freedom test (the p-value that 0.65 produces is 0.42). This means that we cannot reject the null hypothesis of "The game randomly rolls a species of villager first". I conclude that the theory that the species is rolled first and then one villager is selected within that species is basically correct.
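As a cross-check (this is not the test used in the post), here is a small sketch of a one-sided exact binomial tail: the chance of seeing 12 or more octopuses in the sample under each model. I'm adding it because the expected count under the flat model is below 5, where the Chi Square approximation gets rough. It uses the sample size of 344 and the 12 observed octopuses from the post; the function name is my own.

```python
import math


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


N_SAMPLE = 344   # sample size given in the post
OBSERVED = 12    # octopus appearances in the sample

tail_flat = binomial_upper_tail(OBSERVED, N_SAMPLE, 3 / 391)   # roll straight from 391 villagers
tail_species = binomial_upper_tail(OBSERVED, N_SAMPLE, 1 / 35)  # roll one of 35 species first

print(f"P(>= {OBSERVED} octopuses | flat model)          = {tail_flat:.2e}")
print(f"P(>= {OBSERVED} octopuses | species-first model) = {tail_species:.3f}")
```

The flat model makes 12 or more octopuses extremely unlikely, while the species-first model makes it an ordinary outcome, which agrees with the two Chi Square results.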
What is a Chi Square test?
Basically, it tests whether 2 groups of data are statistically different. A common use is to test whether your observed data differ from the expected data by more than random chance alone would explain.
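For reference, the statistic behind the test can be written as below. This is the standard goodness-of-fit form; O_i and E_i are the observed and expected counts in category i, and k is the number of categories (k = 2 for "octopus" vs. "not octopus", which gives the 1 degree of freedom used above).

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad \text{degrees of freedom} = k - 1
```

The larger the value, the further the observed counts are from what the model predicts.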
UPDATE: With the help of more data from TBT users, I have now tested whether this theory applies evenly across the board. I conclude that it does! Only 2 of the 35 species didn't uphold the theory, which is not enough to disprove it altogether, and I attribute those 2 to the nature of RNG. The chance for a specific species to be rolled is the same for every species.
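Here is a quick sketch of why 2 failures out of 35 is not alarming. It assumes each species was tested at the 0.05 level and treats the 35 tests as independent, which is only an approximation since the species counts share one sample. The function names are mine.

```python
import math


def expected_false_alarms(n_tests, alpha):
    """Average number of tests that fail by chance alone when the theory is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(at least k of n_tests fail by chance), treating tests as independent."""
    return sum(math.comb(n_tests, i) * alpha ** i * (1 - alpha) ** (n_tests - i)
               for i in range(k, n_tests + 1))


N_SPECIES = 35  # number of species, from the post
ALPHA = 0.05    # assumed significance level per species test

print("expected chance failures:", expected_false_alarms(N_SPECIES, ALPHA))
print("P(2 or more chance failures):", round(prob_at_least(2, N_SPECIES, ALPHA), 3))
```

Under those assumptions you would expect between 1 and 2 species to fail purely by chance, and seeing 2 or more happens about half the time.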
UPDATE 2: I have not yet formally tested whether the game rolls for personality after species or just rolls straight for a villager within the species (still working on this), but my initial analysis of the new data suggests there is no personality roll. The game does NOT roll for personality at all (villagers 1-5 are locked personalities). I'm 95%+ sure this is the case, but I can't outright confirm that there is no personality roll. Lacking a personality on your island also does not increase or decrease the chances for that personality to appear on mystery islands. The same applies to species: already having 1 or more of a species on your island won't increase or decrease the chances for that species to appear.
Thanks again to Selkie, Sheba, and ForbiddenSecrets for providing data and/or the theory!
Happy hunting!
Now the big question that a lot of you are wondering about...
What does this mean for Raymond hunting?
Well, I'll tell you. The cat is even more elusive than we originally thought! It means that the chance to find Raymond on mystery islands is very low. It's lower than 1/391 because the game first has to roll cat (1/35) and then pick Raymond out of the 20+ cats in this game. In fact, I have calculated the chance to find Raymond on a mystery island to be about 0.12%, or basically 1 in 800. Good luck!
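The Raymond number can be reproduced with a few lines. This is a sketch under the species-first theory; the cat count of 23 is my assumption (the post only says "20+"), chosen because it reproduces the quoted 0.12%. The helper name is hypothetical.

```python
from fractions import Fraction

N_SPECIES = 35  # number of species, from the post
N_CATS = 23     # assumption: the post says "20+"; 23 reproduces the quoted 0.12%

# Species-first theory: roll cat (1 in 35), then Raymond among the cats.
p_raymond = Fraction(1, N_SPECIES) * Fraction(1, N_CATS)


def chance_within(tickets, p):
    """Chance of meeting the villager at least once in `tickets` island visits."""
    return 1 - (1 - float(p)) ** tickets


print("per-island chance:", p_raymond, f"= {float(p_raymond):.4%}")
for tickets in (100, 500, 1000):
    print(f"{tickets} tickets: {chance_within(tickets, p_raymond):.1%}")
```

So even a stack of 100 tickets only gives a bit over a 1 in 10 chance, assuming every island is an independent roll.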
Others have pointed out a theory that if you lack a smug villager, random move-ins (i.e. letting the plot fill up on its own) will be smug. That sounds like a much better bet, especially if you're willing to TT (time travel) to force out the smug and try again!