Quick question on another variable that makes it almost impossible to do the numbers:
Is the number shown in the left-hand column on events the number of shuttles run or the VP collected?
If your odds/percentages are based on server-wide totals, then that number is ever changing, so there is no way to track it, because you never know how many shuttles have been run.
We are led to believe those counters are point totals for individual factions. It would be difficult, if not impossible, to independently verify those numbers.
What is being proposed here isn't an accounting of all shuttles across all players. While that would certainly give the best picture of shuttle behavior, it's not really feasible.
I like the visuals of the graphs at the bottom of his 2nd post; that would be a great way to go. Maybe make a graphic sheet that players can download and insert numbers into, so you get more of a standard input,
versus just randomly throwing data up on a page with no standard, which makes it harder to collect data.
Have everyone track the number of shuttles run for 30 days, then input factions and win/loss ratios for each into a downloadable graph.
That would also probably help DB and others see it, versus trying to make them pore over a crap load of numbers.
Since that thread was created, we had a major change to shuttle success formulas. The initial data gathered in that other thread I linked showed that there may well be some odd things going on with shuttle probability.
The post might be outdated; I was more using the graphs to help with new tracking.
Do two: have someone make a downloadable sheet to enter info, then at the end of 30 days take everyone's graphs and enter the info into the same graph.
Then you could go to DB with the graph:
"OK, as you can see here, we had 78 players in the first column who ran 1,212 shuttles in 30 days in the Federation faction; the probability for this came out to be xxxxxx and actual success was xxxxxx."
It would either help prove you right that there is an issue, or show that there is not one, and any layman would be able to see the differences.
Quick question on another variable that makes it almost impossible to do the numbers:
Is the number shown in the left-hand column on events the number of shuttles run or the VP collected?
If it's VP, that's not that many players, but if it's shuttles run, that's a ton.
If your odds/percentages are based on server-wide totals, then that number is ever changing, so there is no way to track it, because you never know how many shuttles have been run.
You would also need to know when it resets, and whether that is based on a certain number or a timeframe.
So collecting data is useless if you don't know all the variables.
Even if each count is a shuttle and not victory points, that total comes to less than 500 million. Some of the failure combinations were in the one-in-a-trillion odds range!
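To give a feel for how a run of failures gets to "one in a trillion", here is a rough sketch. It assumes each shuttle is an independent roll and that the displayed percentage is the true success chance, which is exactly what is in dispute; the function name and the six-shuttle run are made up for illustration and are not the actual combinations reported.

```python
import math

def all_fail_probability(displayed_percents):
    """Chance that every listed shuttle fails, assuming independent rolls
    and that each displayed percentage is the real success chance."""
    return math.prod(1 - p / 100 for p in displayed_percents)

# Hypothetical run: six shuttles, each showing 99%, all of them failing.
odds = all_fail_probability([99] * 6)
print(f"chance of that happening: {odds:.3g}")
```

Six straight failures at a displayed 99% is already (0.01)^6, i.e. about one in a trillion, so with fewer than 500 million runs server-wide you would not expect to see it even once.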
I'm in. If I have time later tonight I'll try to work on the previous 20-25 events I tracked and see if anything pops out.
But for me, collecting only the displayed percentages and shuttle results is not enough. I do concur: I felt that I passed many shuttles with a low rate and failed many at a really high rate, suggesting that there could be a different formula on the server side. I'll see later if my feelings were correct. But that could be only part of the problem. If they made an input programming error, throwing in a char instead of an integer, then a crew or a mission (who knows how they are coding it) could miss a whole bunch of stats, lowering the success rate.
Nobody has confirmed yet that they had the same experience with shared crew in the last event, nor disproved me.
No one else reported strange shuttle behavior with Drunk Tyler, who was giving the wrong bonus at one point. Was it only displayed, or server side too? In the past I sometimes skipped using crew with a wrong bonus because I didn't trust that the data was consistent. But if we are methodical, or maniacs, we may find patterns in the way it is coded, instead of just the deviation.
With this hybrid event we can run only a small number of shuttles, but with a large pool of bonus crew, many of us will be using the same crew. I want to see if any mission or crew behaves differently compared to others.
Depends on the intention. Are we trying to prove that there is a systematic problem with the displayed shuttle percentage? Or are we trying to figure out what that problem actually is?
Given a large enough sample size, this form should be able to suss out the first question. It cannot answer the second.
For instance, @[DB:DB] ~SE~ Capt. Reynolds posted above that he was only noticing the discrepancy in his data that featured shared crew. You could never tease that out from this form.
I thought about determining the problem, but that would take much more complicated data collection that the game does not make easily available. So the intent of this is to determine whether the displayed percentages are correct. If we determine it's not, then we can look into more detailed collection.
The idea for this is to take enough data, and look at it percent by percent to see how close the results are to the percentages displayed. Of course, that will take hundreds of entries for each percent indicator, but it's the only way I see to make an accurate assessment.
As for being able to do multiple shuttles at once, Google Forms doesn't make that so easy. I can make it a longer form so more can be collected at once, but then collating that data to determine the patterns becomes much more difficult, because each entry on a form gets its own column in the related spreadsheet. A multi-result form also makes things harder because it would have to cover 2, 3, or 4 shuttle options, and know to ignore the blank spaces for <4. So I decided to keep it to a simple form that can be used to make quick entries: hit Submit, then click Submit Another, without having to do lots of scrolling.
I've been using this to record my results for the past day or so, and it can be filled out, for all 4 shuttles, even with clicking, in well under a minute.
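For anyone who wants to try the percent-by-percent comparison on their own records, here is a minimal sketch. It assumes each entry boils down to a displayed percentage and a pass/fail result; the function name and the sample rows are invented for illustration and are not the form's actual spreadsheet layout.

```python
from collections import defaultdict

def success_by_displayed_percent(entries):
    """entries: iterable of (displayed_percent, succeeded) pairs.
    Returns {displayed_percent: (runs, successes, observed_percent)},
    ordered by displayed percentage."""
    tally = defaultdict(lambda: [0, 0])  # percent -> [runs, successes]
    for displayed, succeeded in entries:
        tally[displayed][0] += 1
        tally[displayed][1] += 1 if succeeded else 0
    return {
        pct: (runs, wins, 100 * wins / runs)
        for pct, (runs, wins) in sorted(tally.items())
    }

# Made-up sample rows, not real results.
sample = [(95, True), (95, True), (95, False), (80, True), (80, False)]
for pct, (runs, wins, observed) in success_by_displayed_percent(sample).items():
    print(f"displayed {pct}%: {wins}/{runs} succeeded ({observed:.1f}% observed)")
```

With only a handful of runs per percentage the observed numbers will bounce around a lot, which is why hundreds of entries per percent indicator are needed before the comparison means much.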
I could add an extra line for whether there are shared crew or not, if that would help. But I don't want to complicate it too much. It is currently designed to be able to separate out Event vs. Normal runs, which should at least point us in the direction of a shared crew problem if the event results have a discrepancy but the normals don't. At that point, maybe do some separate data collection?
I'll throw in my 2 strips of gold-pressed latinum:
Please keep in mind that there is certainly strong Von Restorff effect bias (i.e. that I am more likely to have noticed these outcomes because they seemed odd and not because they actually happened more frequently than other outcomes..)
But,... on many of my event shuttles where I only succeed 1/4 in a set the success is the mission that had the lowest displayed percent chance. Often that is because it was a mission with a ton of open slots and I just had to fill with non bonus crew. Sample size = a bunch.
Okay, so perhaps just another complainy anecdote; however, there are a couple of easy bugs that I can think of that would actually cause this behavior:
1. Crew bonuses are being counted on the display, but are left out of the actual roll comparison. This would be why using non-bonus crew seems to improve the actual results.
2. The range of the roll is being scaled by the number of crew inappropriately. This could show up in a number of places, but one example is that when calculating the "power" of the team there is a step that is meant to normalize (basically just take the mean) the value to a single crew member but instead of using a variable that corresponds to the size of the crew, it's hardcoded to 5 or whatever the maximum is. This would have the effect of causing small crew missions to fail more often, but large ones to be correct.
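To make the second guess concrete, here is a sketch of what that kind of bug could look like. This is pure speculation about code none of us can see: the function names, the skill values, and the maximum of 5 slots are all assumptions for illustration.

```python
MAX_CREW_SLOTS = 5  # assumed maximum; "5 or whatever the maximum is"

def team_power_correct(skills):
    """Normalize team power to a single crew member: a plain mean."""
    return sum(skills) / len(skills)

def team_power_buggy(skills):
    """Same step, but dividing by a hardcoded slot count instead of the
    actual crew size, so small crews come out underpowered."""
    return sum(skills) / MAX_CREW_SLOTS

two_crew = [1000, 1000]   # hypothetical 2-slot mission
five_crew = [1000] * 5    # hypothetical 5-slot mission

print(team_power_correct(two_crew), team_power_buggy(two_crew))    # 1000.0 400.0
print(team_power_correct(five_crew), team_power_buggy(five_crew))  # 1000.0 1000.0
```

In this sketch the full-size mission comes out right and the 2-slot mission is scored at 40% of its real power, which would match small crew missions failing more often while large ones behave as displayed.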
Okay, and 2 more thoughts that aren't based on my own observations, but on those of high-end players who have 90+% shuttles at 4k VP and seem to have failure rates that are hilariously impossible.
1. The displayed percentage is fudged at high % and somehow this has been executed horribly wrong. Remember back in the day when the 1 minute missions would show a 100% chance of success, and it was obvious to all that this was just due to rounding 99.5%+ up, since the curve is asymptotic approaching 100%? Well, that doesn't happen any more; the max displayed is 99%. One would assume that you could achieve this just by using a floor function to round down the displayed percentage, but what if the coder tried to do something fancy and wrecked everything?
2. What if the shuttle difficulty isn't actually capped, but the displayed percentage and the 4k VP are? I.e., what if after you hit 4k, the difficulty continues to increase like non-event shuttles until you hit the 70ish% point where successes/fails start to stabilize? This actually seems extremely plausible, since you could create this bug by copy/pasting code from the non-event mission algorithm, and it would explain why the discrepancy seems more pronounced for players that have a high displayed % at max VP.
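For the first thought, this is roughly the difference between the old rounding behavior and the simple floor approach. It is only a sketch of the idea; the function names are made up, and nobody outside DB knows how the display is actually computed.

```python
import math

def shown_rounded(true_percent):
    """Old-style display: round to nearest, so 99.5%+ shows as 100."""
    return round(true_percent)

def shown_floored(true_percent, cap=99):
    """Simple alternative: round down, and never show more than the cap."""
    return min(math.floor(true_percent), cap)

print(shown_rounded(99.6))  # 100
print(shown_floored(99.6))  # 99
print(shown_floored(87.9))  # 87
```

The point is that the floor version only ever changes what is displayed and never touches the roll, so if the display and the real chance have drifted apart, something more complicated than this must have been done.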
Okay, that's enough free consulting. Send dilithium to my in game mail for more poorly-informed wild speculation.
Oh, and if you want the playerbase to help you debug this a good first step would be to provide the actual numbers for the roll and target on the outcome screen so that we can track our stats better. It's not much of an ask since you do that for away team missions and gauntlet already...
Reminder, since most people are on the 3-hour schedule now, to try and record your shown percentage compared to actual. I'm also recording which missions have event vs. bonus crew. The more data the better. Eventually we can all agree on how to present it.
If you can, record what the skills are, if you're using a shared crew, and if you're using a skill bonus. There has been speculation that all of those are broken and the more data available, the more likely it is we can find the issue.
Hi, I've made a note of the last two sets out. I had a slight change of crew on one, but it should be pretty stable from this point. I'll try and record results where I can (I have created a spreadsheet that looks like ThisSisko1's above). I'm actually unsure who is putting this data together, so let me know where to send it after we're done. Cheers for putting the effort in.
I'll try to stick to the same missions and crew over and over, regardless of the results. Of course, I already failed my first 2 90+% missions, one was with shared crew.
Sisko, depending on how you want to rank, and given that you can make up for it in the galaxy portion, I would suggest you keep doing Lead by Example with the same crew. I don't have Georgiou, so I can't test your same conditions.
Well, at 4k I couldn't sustain anything over 90 percent for Lead by Example and had already switched up the mission before I saw the recommendation. I failed another mission overnight.
Fleetmates provided me their data and I'll eventually add theirs to mine and note it. So far it's about expected. About a 10 point difference from shown to actual.
I’m not going to start making charts. I don’t need data just common sense to know that this event is way out of whack compared to all the others I’ve participated in. Factions were the events and parts of events I enjoyed and it is painfully obvious they have screwed these up now too. From the abysmal ratios that are not even close to accurate and the multitudes of four and five star missions, with a few threes thrown in I can see the plan with this event was to hose up the faction part completely. I’m just glad I don’t care for these characters. I was planning on finishing all the thresholds and doing some of the galaxy but now I’ve got just a couple more things to get and I’m done with this debacle. I’m waiting for them to find a way to monetize logging into the game at this rate.
So far it's about expected. About a 10 point difference from shown to actual.
This is where I think this type of analysis may be misleading. With a high chance of success and so far a small sample size, comparing displayed success to actual will be skewed. Maybe a better test would be on expected number of failures.
Well, "common sense" without proof is just going on gut feelings. If you actually want things to change, you need proof that they are broken. Charts, tables, and records are vehicles for collecting evidence. With enough, a preponderance eventually becomes proof.
Agree, expected vs actual failures would be the most relevant statistic. Saying "10 point difference" is certainly accurate, but a 10 point difference in this context actually means you are experiencing failure rates THREE TIMES what the expected rate should be. This is massive. A small data set, for sure, but if that rate holds up, there's certainly something wrong.
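Here is the arithmetic behind that, as a rough sketch. The 95% displayed / 85% observed pair is an assumed example of a "10 point difference" (the real figures were not posted), the function names are made up, and the tail calculation assumes independent rolls at the displayed chance.

```python
import math

def failure_ratio(displayed_pct, observed_pct):
    """How many times higher the observed failure rate is than expected."""
    return (100 - observed_pct) / (100 - displayed_pct)

def prob_at_least_failures(runs, failures, displayed_pct):
    """Chance of seeing `failures` or more failures in `runs` shuttles
    if the displayed percentage is the true success chance."""
    q = 1 - displayed_pct / 100  # expected failure chance per shuttle
    return sum(
        math.comb(runs, k) * q**k * (1 - q) ** (runs - k)
        for k in range(failures, runs + 1)
    )

print(failure_ratio(95, 85))  # 3.0

# Hypothetical small sample: 3 failures in 20 shuttles shown at 95%.
print(f"{prob_at_least_failures(20, 3, 95):.3f}")
```

The second function is the "expected number of failures" test: it turns a small sample into a probability, so you can tell whether a handful of extra failures is just bad luck or something that should almost never happen at the displayed rate.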
Well so far all 3 hour shuttles but 1 have been 95%+ and have passed, here's to small miracles. My one overnight shuttle that was 89% failed of course because stuff.
We are, but while working in real life it's easier to post here during downtime and to deal with forms and spreadsheets on a proper computer. Subspace Eddies are collecting data and the spreadsheet has a life of its own. I would propose that at some point you grant direct access to the spreadsheets so we can copy and paste our data, which is much faster and less error-prone than typing it all over again.
Visuals always help sell your points better.
https://goo.gl/forms/W9ugTZ4R9qpCRYop2
I like it, but you might want to add the ability to do multiple shuttles at once; one at a time could get old quick.
Updated
The 9 and 5 mean you will succeed 9 times and fail 5 times.
Who are you using in each slot for that mission?
https://forum.disruptorbeam.com/stt/discussion/1713/shuttle-mission-success-chances-during-events-post-your-data-here/p1
https://docs.google.com/forms/d/e/1FAIpQLSeBZvZrwr4393m6tsf5X3SLYzF8umsdWEzT46lBaIsEM6eqlw/viewform