Frankly, the fact that it took this level of analysis from non-DB employees says something not only about the QC process but also about the "everything is working perfectly" mantra, which cannot be taken at face value for anything.
Their QC process is definitely known to be bad. I also stopped taking the "everything is working perfectly" bit seriously in 2016 after far too many instances of them being proven flat wrong. Everything wrong with DB has been gone over again and again for 2 years, but the bit about taking this much analysis is definitely what has always bugged me. I don't know if it's pride or overconfidence or something else.
When the player base collectively suspects something is off, it needs to be taken seriously and actually investigated. Sure, it can't be every single time we bring something up. And sure, something that would take one developer investigating all day, every day for a week is one thing. But it's another thing entirely when a dev with the game in front of them on their phone, and an instance of the server fired up with extra logging or a breakpoint set, could have easily verified this after 5 - 10 minutes of testing. That it didn't occur to them, or that they didn't bother, in simple cases like this really bugs me. Seriously, even if no official time could have been given to a developer for it because of the aforementioned mantra, a few minutes first thing in the morning before getting settled into the work groove (Jira task list or what have you) could have settled this particular case. I don't even know Ruby (which is what they seem to use, judging by the error messages we've sometimes seen that contain stack traces), but I'm sure that after cloning their environment even I could have verified this.
Honestly it's been fun to duplicate. Not enough data yet, but only 10 percent of the shuttles using the second-skill-only crew have passed, versus 88 percent using the first-skill-only crew. If this were a typical event, I would only be reporting 55 percent actual success against the 70 shown (I ran the rest using single skill missions to test all 3 factors, the AND missions and single skill). But digging in, I can see what to avoid now.
Honestly I've lost so many fleet members because of the false shuttle percentages that led to rage quitting. It's a relief that I can have an answer to help them stay, get close to an actual success percentage, and avoid the 0/4 95 percent fails.
They most likely only display the max % in the range, but the algorithm always picks the lowest % (which of course won’t be revealed), and that is most likely what you guys found from collecting the data.
DB probably needs more time to formulate an answer that won’t cause a player uproar.
But how is this any different than gauntlet crits, pack drop rate, or the shuttle/away missions indicator? We have experienced this daily and shouldn’t be surprised by it.
Just chiming in on the single crew experiment, I am now 7/10 using single event crew and 0/10 when using the second stat only of AND missions.... so yeah.
Edit: make that 11/14 vs 0/14....this is starting to get silly
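For anyone who wants to put a number on "silly", here is a rough sketch of a one-sided Fisher exact test on the 11/14 vs 0/14 split. It assumes every run is independent and that, if nothing were wrong, both kinds of crew would share the same real success rate. The function name is just something made up for this post.

```python
from math import comb

def fisher_one_sided(wins_a, runs_a, wins_b, runs_b):
    """Chance that group A ends up with at least wins_a of the pooled wins,
    assuming both groups really share one success rate (the null hypothesis)."""
    total_wins = wins_a + wins_b
    total_runs = runs_a + runs_b
    denom = comb(total_runs, total_wins)
    most_a_can_win = min(runs_a, total_wins)
    tail = sum(
        comb(runs_a, k) * comb(runs_b, total_wins - k)
        for k in range(wins_a, most_a_can_win + 1)
    )
    return tail / denom

# 11/14 with single event crew vs 0/14 with second-stat-only crew
p = fisher_one_sided(11, 14, 0, 14)
print(f"one-sided p-value: {p:.2e}")
```

Under those assumptions the split would come up by luck about 1 time in 59,000, so it is hard to wave away as a bad streak.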
I can't stress enuff how impurrtant all this data-gathering has been and still is. A big thankies to everybuddy who was invulved in the data-gathering process! I really hope for DB that this gets on top of the priority list of bugs to be addressed. As Sumbiesqurl stated very correctly, frustrated players won't spend further on the game and (more or less) silently leave.
I also hope that the "recommended" crew will adapt accordingly once the change is going live, so you will always be able to pick from the top to get the best actual shuttle success chance, so we won't have to pick the actual best crew ourselves.
Important edit: How about the OR function? Is that bugged, too? It's weird to see bonuses from, let's say, Eng OR Sec when you apply an Eng boost or a Sec boost ... shouldn't it work as follows: if you put in a card that has only ENG and you put in a SEC boost, it shouldn't give a bonus. If you put in an ENG boost, it should give a bonus.
My concrete hunch is that the bonus (even if it's wrong, as in an ENG bonus counting for a SEC card in an Eng OR Sec seat) visually applies, but isn't calculated in the background. That would actually be correct, but if the display shows that the bonus does apply, it's - again - misleading for us players.
"Everything about the Jem'Hadar is lethal!" - Eris (ST-DS9 Episode 2x26 "The Jem'Hadar")
Testing the OR missions for first or second slot impact would be very similar to the AND tests due to the all-or-nothing nature of the stats. The impact of boosts is a lot harder to measure, and to get meaningful data in a relatively small sample size someone would probably need to use gold level boosts.
It's different because no one has, to date, shown a statistically significant problem with anything in relation to the gauntlet. The gauntlet has plenty of people that complain about it, but I feel that is mostly due to the transparency of it. You actually see the rolls and crits (unlike shuttles, which are a black box). This leads to a TON of confirmation bias. People run dozens of gauntlets per day and, every now and then, will see a few more upsets than they are used to and instantly blame it on "Bad RNG".
In addition to hoping this shuttle percentage issue is fixed, I hope they acknowledge that it is fixed so we don't have to guess if it is fixed or not.
3/32 on 2nd AND skill missions (69% shown)
11/12 on 1st AND skill missions (70% shown)
29/44 for single skill missions
44 runs on single skill missions. 65 percent actual to 72 shown. 9 percent success on the 2nd AND skill shuttle after 32 runs. 91 percent on the first skill shuttle after 12 runs. 88 total runs.
Now single skill shuttles are still underperforming, so this knowledge might not save all the false fails, but it will certainly help.
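A quick way to sanity-check those three lines is to ask how likely each result would be if the shown percentage were the real one. The sketch below assumes independent runs and a constant displayed percentage across all runs in a group (69, 70 and 72 as listed above); the helper names are made up for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent runs at rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_most(k, n, p):
    """Probability of k or fewer successes."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def prob_at_least(k, n, p):
    """Probability of k or more successes."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

# (label, wins, runs, shown rate, which tail to look at)
results = [
    ("2nd AND skill", 3, 32, 0.69, prob_at_most),
    ("1st AND skill", 11, 12, 0.70, prob_at_least),
    ("single skill", 29, 44, 0.72, prob_at_most),
]

for label, wins, runs, shown, tail in results:
    print(f"{label}: {wins}/{runs} at {shown:.0%} shown -> {tail(wins, runs, shown):.3g}")
```

On these numbers only the second-skill result lands far outside what luck would explain. The other two are the kind of thing ordinary variance produces, which matches the idea that the first skill is the one that really counts.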
Unfortunately, I feel like there is a very high possibility of a stealth fix and we will be told that we must have been looking at sample sizes that were too small.
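If the "sample size too small" line does come out, a confidence interval is one way to answer it. Below is a small sketch of a 95% Wilson score interval for the 3/32 second-skill result posted earlier in the thread, again assuming independent runs. The function name is hypothetical, and z = 1.96 is the usual value for 95%.

```python
from math import sqrt

def wilson_interval(wins, runs, z=1.96):
    """Wilson score interval for a success rate (z=1.96 gives about 95%)."""
    p_hat = wins / runs
    denom = 1 + z * z / runs
    center = (p_hat + z * z / (2 * runs)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / runs + z * z / (4 * runs * runs)) / denom
    return center - half_width, center + half_width

low, high = wilson_interval(3, 32)
print(f"plausible real success rate: {low:.1%} to {high:.1%} (shown: 69%)")
```

Even with only 32 runs, the upper end of that range sits nowhere near the 69% shown, so a small sample alone does not explain the gap.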
There are too many variables you have not taken into account, or at least not listed here. All I see here is results data, not process data.
Percentages could look identical for different crew, but the RNG on the crits/bonuses (and I am guessing boosts) could/would be significantly different.
Did you use the exact same crew on exact same missions? Accounting for bonus? Same boosts?
Did you vary some missions, positing higher results from the higher crit spread of 5* crew?
I had been keeping track of shuttle data for months, both for events and just daily faction missions. I have built up my top end crew significantly since then, so I will try to start data gathering again. My spreadsheets indicate which missions, exact crew (and rank/fuse level), boosts (if any), percentage shown pre boost/post boost, actual win/loss.
This last event, I had exactly 4 shuttles fail across all 4 days: 2 @ 2750 (expected: they always fail at that level for some reason) and 2 @ 4000. I had only one crew slot variance, for a non-bonus ENG crew; otherwise all other shuttle slots were filled with bonus/event crew. I also spread out EVENT crew between shuttles for max effect, which last event proved out nicely.
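For anyone who would rather log this kind of process data in a script than a spreadsheet, here is one possible layout. The fields mirror the spreadsheet columns described above (mission, crew, boost, percentage shown pre/post boost, win/loss); every name and the sample rows are made up for illustration and are not real results.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class ShuttleRun:
    mission: str
    crew: Tuple[str, ...]      # crew names, optionally with rank/fuse level
    boost: Optional[str]       # None when no boost was used
    shown_pct_pre_boost: int
    shown_pct_post_boost: int
    success: bool

def summarize(runs, bucket_width=10):
    """Group runs by displayed percentage (after boost) and compare to actual wins."""
    buckets = defaultdict(lambda: [0, 0])   # bucket start -> [wins, total]
    for run in runs:
        start = (run.shown_pct_post_boost // bucket_width) * bucket_width
        buckets[start][0] += int(run.success)
        buckets[start][1] += 1
    return {
        start: (wins, total, 100 * wins / total)
        for start, (wins, total) in sorted(buckets.items())
    }

# Placeholder rows only, to show the shape of the log.
sample_runs = [
    ShuttleRun("Mission A", ("Crew 1", "Crew 2"), None, 69, 69, False),
    ShuttleRun("Mission A", ("Crew 1", "Crew 2"), None, 69, 69, True),
    ShuttleRun("Mission B", ("Crew 3",), "3* boost", 78, 85, True),
]

for start, (wins, total, actual) in summarize(sample_runs).items():
    print(f"shown {start}-{start + 9}%: {wins}/{total} passed ({actual:.0f}% actual)")
```

Keeping the crew and boost on every row is what lets someone else check whether two runs with the same shown percentage were really comparable.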
I used the crew below for my missions. I went 5/26 (updated as of now) on the second skill AND missions using Mirror Dax and Tkuvma, and 18/26 using Mudd and Spock in the single skill missions. The displayed percentage was 69 for the AND missions and 78 for the single skill missions.
No bonuses were used, neither for the skill nor for overnight. All missions used the same crew. I did not use bonuses, and bonus crew are obviously not in play. Those factors could totally mess up the calculations too, but they were not a factor in this test. All we have shown is that the client side shows a different percentage than what the server side uses.
I would expect the Spock and Mudd missions to even out to the shown percentages over time. But to make up for the 20 percent actual against the 69 shown in AND missions, I would need a ton of successes.
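To put a figure on "a ton of successes": starting from the 5/26 record above, this little sketch counts how many wins in a row it would take for the running success rate to reach the 69 shown. The function name is made up, and it only does arithmetic on the posted numbers.

```python
def successes_needed(wins, runs, target):
    """Consecutive wins needed for (wins + n) / (runs + n) to reach target."""
    if not 0 <= target < 1:
        raise ValueError("target must be at least 0 and below 1")
    extra = 0
    while (wins + extra) / (runs + extra) < target:
        extra += 1
    return extra

print(successes_needed(5, 26, 0.69))  # prints 42
```

So it would take 42 straight passes on a shuttle showing 69% just to get back to the displayed number.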
Here is something else to think about. I run the same crew in the same slots for the same missions throughout the event, starting with the first set of 3hr shuttles. When I get to 4k, the displayed % stays the same. I don't save the data from previous events or keep actual data on pass/fail. I do know for sure, though, that since I have been keeping a spreadsheet and filling as many AND mission shuttles as I can with crew that have their main skill in the first slot and secondary skill in the second slot, it has worked very well. The thing is that even though my displayed % stays the same each run, I have noticed that near the end the fail rate increases. So my question is, do we know for sure that the difficulty level stays static, or does it continue to ramp up like non-event shuttles?
This last event was during my daughter's 13th birthday weekend, so I was just sending shuttles when I could. Thursday morning I sent as many shuttles with 3* time boosts as I could, knowing that I would most likely miss some during the weekend. When I got to 4k shuttles, my displayed % without boosts was 83, 85, 87 and 95. I failed one shuttle Friday, which was one of my 9hr overnights. Saturday I passed all overnights and failed 3 missions in total all day. At 10pm I ran my 95 and 87 with a 4* boost. The 87 failed, and then I sent all four with 9hr boosts. I had 2/4 fail. I finished 232.
TL;DR The OP lists the 4k difficulty at 2000, whereas my experience and assumption was that it continued to ramp up with each success until failure was inevitable.
The single AND slot mission actually does a nice job at isolating a lot of the variables. Sure, it could always be something else, but they are showing a MASSIVE underperformance when not matching the first stat and relatively normal performance when not matching the second.
We are working with a black box situation, here. It is impossible for any of us to truly know what is wrong. The best we can do is to present DB with an explanation that describes the variation we are seeing better than their "normal variance" fallback.
Thank you for this. It’s fantastic investigative work, and should prompt a fix that will make shuttles fairer for everyone. Proud to have you as fleetmates!
First Officer - Task Force April
Squadron Leader - [TFA] Bateson’s Bulldogs
Was talking in fleet chat about this today. Do we know if this was ever confirmed by DB and/or addressed? I swear my shuttles succeed more when I match skill order, but there's a strong opinion the other direction as well.
Addressed? Yes, there was a not-very-stealthy stealth update that corrected the bug. Confirmed? Nah, not in a way that ever truly satisfied the community but time has passed and that is, at least for most players, water over the bridge.
I just snorted coffee through my nose after reading that. 🖖🏻
I posted this in the "Further Proof that shuttle success % displayed is wrong" thread last week...
Honestly it's been fun to duplicate. Not enough data yet, but only 10 percent of the shuttles using the second-skill-only crew have passed, versus 88 percent using the first-skill-only crew. If this were a typical event, I would only be reporting 55 percent actual success against the 70 shown (I ran the rest using single skill missions to test all 3 factors, the AND missions and single skill). But digging in, I can see what to avoid now.
Honestly I've lost so many fleet members because of the false shuttle percentages that led to rage quitting. It's a relief that I can have an answer to help them stay, get close to an actual success percentage, and avoid the 0/4 95 percent fails.
Thank you for your service. I appreciate it.
This guy floods.
I thought the expression is "water under the bridge" ?
For some, the watermark is still up there...
Proud Former Officer of The Gluten Empire
Retired 12-14-20. So long, and thanks for all the cat pics!
My Captain Idol's
My DataCore page
My Spreadsheet
It is...I enjoy malaphors more than most people. “We’ll burn that bridge when we come to it” is probably the one most people have heard of.
A good list can be found here: https://www.thoughtco.com/malaphor-word-play-1691298