Everything posted by Khift

  1. I'm not completely confident, but I agree with the others that Kosori likely adds skill while bowed and at home. One thing to note - this doesn't say they can't contribute their skill, it says they don't. Kosori, on the other hand, says she does contribute. It's also not obvious to me that this is even a rule; it seems perfectly logical to read this line as a description rather than a rule. It could just be pointing out that bowed characters don't contribute because they aren't in the list of characters that 3.2.3 spells out as contributing, as opposed to being an explicit (and totally redundant) prohibition on bowed characters contributing.
  2. The issue is that the people who are 4-2 don't know this. It's not an apparent outcome; in fact the opposite is the default assumption. I've spoken to several people who are themselves 4-2 and they've all been really let down when I informed them of this. I guarantee you there are more who don't know this result and are going in with false hope. And I do think that there's an element of cruelty in that, to imply that someone can achieve something (by letting them in the cut) when it is actually effectively impossible. It's also not the only issue with the situation. The 5-1 players who go 6-2 are going to make or miss the cut entirely off of a coin flip. Like, that's not a tiebreaker, that's just randomly determining who gets in and who doesn't. It only pretends to be a tiebreaker.
  3. The ANOVA was never meant to show that groups are not different in skill or different in challenge. The ANOVA was to show that SOS itself does not change in shape or distribution based on player count, which is a claim that was actually made and, if true, would be very relevant. All the ANOVA was used to show is that SOS values across the groups are interchangeable, assuming everything they are calculated from remains the same, which must be true in order to inherit tournament points fairly as well. You didn't even read the second paragraph of my post, which is really quite disappointing.
  4. Of course, no tiebreaker is ever going to accurately rank people, it's all approximation once you get to evaluating the better of two people who never played. But, I also have a hard time accepting that 2 rounds of SOS is going to be more accurate than 8 rounds would be on average. So, barring some yet unspoken argument that sinks the ability to inherit SOS across days but doesn't sink the ability to inherit tournament points across days, would 8 rounds of SOS not be better than 2?
  5. We're not even on topic any more, honestly. We went so far down the rabbit hole that we lost ourselves. The topic at hand is what's the fairest way to handle a tournament with three flights of 6 rounds each with a graduated cut into a 2 round flight which cuts into a Top 16 (+7) single elimination bracket. And on that topic I have not seen an argument against keeping SOS that doesn't fall down. Everything else put forward involves either changing the schedule of the tournament to allow more rounds in day 2 or can be shown to not be accurate. In a perfect world, sure, it'd be better to have Saturday be 6 rounds of swiss starting from scratch, leading into a 25+7 single elimination bracket on Sunday, but that's not what happened. Maybe next year will be better. If someone has a better way, I'd like to hear it, and hear why.
  6. Your leading question does not change the fact that 94ish percent of the field is actually seeded randomly from the same population. And it even more woefully fails to create an argument that SOS should be wiped but tournament points should be kept.
  7. That doesn't prove nearly as much as you think it would. If you take a random sampling of L5R players out of the general population you're still going to get a different distribution of clans each sample. In fact, that's almost exactly what happened to create these distributions, if you ignore Hatamotos who make up a pretty small chunk of the total players.
  8. Sure, but the assumption of equal skill has already been made. It's already inherent in how this tournament is being run. We can challenge that, fine, and maybe it even should be challenged - but doing so leaves us even worse off than before because there are only two rounds in day 2 and that's insufficient to do any kind of sorting. The ANOVA was very specifically to counter the claim that SOS would be higher, lower, or somehow vaguely 'different' in one flight as opposed to another. That claim, if true, would actually be a reason to wipe SOS but keep tournament points. But it wasn't true, and that's what was shown. Unfortunately I didn't make that entirely clear as honestly I thought I'd shown quite cleanly that an argument based on skill difference doesn't have any weight as a reason to support FFG's decisions.
  9. They are, yes. This is true. But the problem is that if your argument is that skill is what differs, then SOS isn't what needs to be dropped but rather tournament points -- which are instead kept. If skill is (approximately) the same, then both can be kept. If it isn't, then neither can be kept. Here, we've got a situation where we're keeping one but tossing the other and there's absolutely no good reason for it. My ANOVA test was to disprove @TheItsyBitsySpider's claim that SOS values would differ between the two days by virtue of population count and therefore give an advantage to one of the days. That claim is what was shown to be (probably) false.
  10. And statistics doesn't work any differently. Thinking doesn't have to do with it. The math does it for you; all you have to do is interpret it. @TheItsyBitsySpider, I'm honestly getting embarrassed for you. If you want to make the argument that because of Hatamotos the two aren't comparable then guess what - that also means that tournament points need to go. At which point we have a two round tournament with 100 people in it and we're determining the top cut by rolling the dice. In order for your argument to hold any water at all you need to come up with an argument that supports the claim that tournament points can be kept but SOS can't. Your one attempt at doing that was making the claim that a day with a different quantity of players would result in an unfair SOS advantage to people in one pod. This was tested and it was shown that that is not the case. You will not get more, or less, or different SOS distributions because of player count. You need to come up with an argument that supports keeping tournament points but doesn't support keeping SOS and so far you have utterly failed to do so. Every argument you've put forward either is more applicable to tournament points than it is to SOS, or doesn't shake out statistically. And until you can come up with something with merit I'm done responding to this inane train of thought.
  11. Population has a different meaning statistically. Several groups of data can be said to be from the same population if they are indistinguishable from the overall group. If the barriers are merely arbitrary then they are from the same population.
  12. I'm sorry man, but you need to give it up. It's over. The claim has been tested. The distributions are identical to distributions derived from the same grand population. There's really nothing left to argue, short of you digging up more tournament data and doing further analysis yourself. You're welcome to do so, but I doubt it'll change anything. At this point you're just tilting at windmills.
  13. How about we put our money where our mouth is, then? This claim of yours, that they are different populations, is a testable claim. We can investigate this. We can use statistics to test whether or not it's a true statement. So, I did. I ran the SOS of 4-2 players through an ANOVA test to determine how likely it is that these two samples come from the same population. Here are the results: What does all this mean? In layman's terms, an ANOVA test starts from the assumption that the samples being tested are from the same population, and then determines how often samples of this size drawn from the same population would be less similar to each other than the given samples are. Basically, if you draw random samples from a normal distribution and put them into two buckets, the buckets will not actually be identical - there's going to be a measurable difference between them each time. Similarly, there is a measurable difference between the buckets in our actual sample. So ANOVA compares the actual difference and asks how often random sampling would create more difference than this. This is reported as the P-value of the test. In our case, 89.9% of the time a pair of samples of this size drawn from the same population would be less similar to each other than these two samples are. Meaning that even if these were the same population, these samples are actually absurdly close to one another, batting well above average. For comparison, a reasonable first glance threshold for saying that two samples are not from the same population is a P-value of 5% or less. And people still get pretty queasy about that because it's still possible that that's just random variation. 89.9% utterly blows that out of the water. So, no. We have no grounds to reject the null hypothesis that the two SOS samples are from the same population.
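
The "how often would random buckets differ more than ours" idea can be sketched directly. This is a minimal illustration, not the test that was actually run: it uses a permutation test built on the one-way ANOVA F statistic instead of the classical F-distribution lookup, and the SOS values below are made up purely for demonstration. The names `f_statistic` and `permutation_p_value` are hypothetical helpers.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F: between-group variance divided by within-group variance."""
    pooled = [x for g in groups for x in g]
    grand_mean = sum(pooled) / len(pooled)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(pooled) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def permutation_p_value(groups, trials=2000, seed=0):
    """Share of random re-bucketings that differ at least as much as the real buckets."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    at_least_as_different = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            at_least_as_different += 1
    return at_least_as_different / trials

# Hypothetical SOS samples (NOT the tournament data).
day_a = [0.50, 0.55, 0.45, 0.52, 0.48, 0.51]
day_b = [0.49, 0.54, 0.47, 0.50, 0.52, 0.50]      # looks like day_a
day_shifted = [0.70, 0.72, 0.68, 0.71, 0.69, 0.73]  # clearly different

p_similar = permutation_p_value([day_a, day_b])
p_shifted = permutation_p_value([day_a, day_shifted])
print(p_similar, p_shifted)
```

A high P-value (as with the two similar samples) means random re-bucketing usually produces a bigger gap than the one observed; a P-value under 5% (as with the shifted sample) is the usual first-glance signal that the buckets differ.
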
  14. If you're going to hold a blatantly hypocritical position then I'm going to call that position hypocritical. You don't get to claim that they're different populations but then argue to keep the most important data point while simultaneously tossing a significantly less important data point. You are straight out trying to have your cake and eat it too and it does not add up. Additionally, this statement is just flat out false and is full of misconceptions. For one, the rate of pair-downs is not a function of size but rather a matter of how close the tournament is to a power of 2; at a perfect power of 2 there will be zero pair-downs, and somewhere in between two powers of 2 pair-downs are maximized. For two, pair-downs do not have a net effect on the mean SOS. Pair-downs have a zero-sum effect; they hurt one player and help another. On the whole they completely wash out. They do increase the variance slightly, but that's it. Finally, and most importantly, the effect that you describe here is not even in the same order of magnitude as the one I describe in the OP. If the goal is to create the fairest and least biased tournament possible then this doesn't even hold a candle to the realization that your SOS on day 2 is entirely determined by a single game that you are not a part of. It's not an aggregate of games; your first tiebreaker is actually just a coinflip. If you want to compare that to one day having a slightly higher variance in its SOS then fine, go ahead, but don't expect to get any traction from me.
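
The power-of-2 point can be illustrated with a toy model. This is only a sketch under strong assumptions that are not from the post: exactly half of each score bracket wins every round, an odd player out floats down one bracket, the floated-down player always wins the pair-down game, and a leftover player at the bottom gets a bye counted as a win. `count_pair_downs` is a hypothetical helper, not real pairing software.

```python
from collections import defaultdict

def count_pair_downs(players, rounds):
    """Count pair-down games in an idealized Swiss event (toy model)."""
    brackets = {0: players}  # wins -> number of players with that many wins
    pair_downs = 0
    for _ in range(rounds):
        nxt = defaultdict(int)
        floater = None  # win count of a player floated down from the bracket above
        for wins in sorted(brackets, reverse=True):
            n = brackets[wins]
            if floater is not None:
                # One player here plays the floater; the floater is assumed to win.
                pair_downs += 1
                nxt[floater + 1] += 1
                nxt[wins] += 1
                n -= 1
                floater = None
            if n % 2 == 1:
                # Odd bracket: one player floats down to the next bracket.
                floater = wins
                n -= 1
            nxt[wins + 1] += n // 2  # winners
            nxt[wins] += n // 2      # losers
        if floater is not None:
            nxt[floater + 1] += 1    # nobody left to play: bye, counted as a win
        brackets = {w: c for w, c in nxt.items() if c > 0}
    return pair_downs

for size in (32, 48, 64):
    print(size, count_pair_downs(size, rounds=6 if size > 32 else 5))
```

In this model a 64-player, 6-round event never produces an odd bracket, while a 48-player event does once the brackets stop dividing evenly. Real events with draws, drops and upsets will behave less cleanly.
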
  15. Let's assume you're right. So we drop all stats from one tournament to the next. Day 2 starts from scratch with everyone at 0 tournament points. We do 2 rounds of swiss and determine that 25 of the 100 people won their two games, and randomly pick 16 of them to make the cut. Is this what you actually support? Because if not, then you're being hypocritical.
  16. You realize that it is very possible for this situation to arise in swiss rounds of insufficient quantity to leave a single undefeated player, yes? It's unlikely, especially given the size of the tournament, but if the swiss rounds don't run all the way to completion then random chance is perfectly capable of creating an orphan group that only ever plays into itself. And again, your assertion that these two groups are from different populations just because you say so doesn't hold any merit. They aren't geographically separated. They aren't temporally separated. Players were even randomly seeded into them by the FFG random number machine from the same group of applicants. If that's not the same population then I don't know what is. You see this kind of thing in stats all the time, looking at both the sub-groups as they are gathered and the entire sample set as a whole. Also, you completely ignored my other observation. If these truly are different populations then all data from them are incompatible, and we should toss out the tournament points in addition to the SOS. Are you in favor of that? Because I don't think a single other person would be. Courtesy is reciprocated. You showed absolutely none and got none in response.
  17. I'm going to pass on bearing with you. I have no interest in being subjected to a sermon. If you make the assumption that players in both days are drawn from the same population then no, it's perfectly reasonable to group them together. Maybe that assumption is wrong, an argument could be made that there is a difference in the quality of player that would play on a Wednesday vs a Thursday perhaps, but if that's the case then keeping the tournament points from one day to the next is also incorrect and they too should be dropped. But I see no complaints from you (or anyone) on that regard, so I suppose the assumption must hold if only for convenience's sake.
  18. It depends on what it means to say 'sort'. You're right that 3 rounds isn't enough to sort 100 players to the top 1 player -- but it does sort roughly 100 players into the top 12.5 players, which is smaller than the Top16 cut. Two rounds only sorts to the top 25 players, though, which is greater than the Top16 cut, which is what leads to these issues. Two rounds of swiss with only the 6-0's and 5-1's would sort those ~36 players into roughly the top 9, which is also sufficient. Which is why two rounds would be "enough" if they didn't include the 4-2's, and also explains why none of the 4-2's can show up in the Top16 with only two rounds.
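
The arithmetic here is just repeated halving. A minimal sketch, assuming every round cleanly splits the field in two (no draws, byes or pair-downs); `top_group_size` and `rounds_needed` are hypothetical helper names:

```python
import math

def top_group_size(players, rounds):
    """Expected number of players still undefeated after the given rounds."""
    return players / 2 ** rounds

def rounds_needed(players, cut):
    """Fewest rounds after which the undefeated group fits inside the cut."""
    return max(0, math.ceil(math.log2(players / cut)))

print(top_group_size(100, 3))  # three rounds with the full Day 2 field
print(top_group_size(100, 2))  # two rounds with the full Day 2 field
print(top_group_size(36, 2))   # two rounds with only the 6-0's and 5-1's
print(rounds_needed(100, 16), rounds_needed(36, 16))
```

With about 100 players the undefeated group only drops below 16 after a third round, while a field of about 36 gets there in two.
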
  19. A third round of swiss would obviate my entire set of arguments, yeah. I would still personally prefer that they carry over the SOS from the previous day, but it'd at least work. The primary issue is that there are too many people in the top cut to be properly sorted by only two rounds. Think about it like shuffling; it's a very similar problem, except backwards. When you shuffle you're trying to destroy information. You want to make it so that each card could possibly have moved from its starting location to any other location. If you don't shuffle enough, you haven't accomplished that - there are limits to how far that card could have moved, so it isn't random yet. Swiss operates on much the same basis, except instead of each iteration destroying information it generates information. Only doing two rounds of swiss and then using just that as the basis to sort people just doesn't create enough information to sort them by. To continue the shuffling metaphor, doing three rounds would be akin to riffle shuffling exactly 7 times in a 52 card deck. It's the bare minimum necessary to get that full movement... but I doubt people would stop there as it just doesn't feel like enough.
  20. A single question: How many rounds were in day 2? Because if it's not 2, then it isn't relevant to the conversation at hand. The issue here is that 2 rounds of Swiss is woefully insufficient to generate any meaningful data. I'm sorry about your friend's situation and that IA's SOS system was very clearly broken, but that doesn't make this situation not also broken.
  21. The issue is that this entire conversation is unnecessary. Everything I posted about would be completely obviated if there was only one strength of schedule that spanned both days. Instead they split the SOS for no apparent benefit except cruelty. And it's not even the only issue caused, either. Most or all of the 5-1WLW players will make Top 16 - but few if any 5-1WLL players will. The difference between these two groups of players is the outcome of a single game that they aren't even participants in. And it's all because FFG is going to all but throw away the data from day 1. A player's entire D1SOS, all six rounds of it, only matters if their D2SOS ties with someone else's while on the bubble. The worst part is, I'm about 98% certain that the only reason it is this way is laziness on FFG's part. They don't want to have to handle merging the tournaments, so they just make a new one. And if so that's a real shame on their part.
  22. Statement: It is effectively impossible for someone with a 4-2 record to make the single elimination rounds of the World Championship unless their clan has almost no 5-1 players, in which case such a player might have a chance at being the challenger.

Corollary statement: Every 6-0 player will make the Top 16 cut unless they take a modified loss. Considering that it is completely within someone's control whether their loss is regular or modified, every 6-0 player should end up in the Top 16.

The reason these statements are true is that FFG is keeping the day 2 strength of schedule separate from the day 1 strength of schedule and is using D2SOS as the first tiebreaker. Meanwhile, they are carrying the tournament points over from day 1 to day 2. This, combined with the fact that there are only two rounds in day 2, creates some very unfortunate circumstances.

Consider the expected population. If day 1b has the same number of 6-0 players, 5-1 players, and 4-2 players as day 1a has, then Day 2 will have approximately 5 (+/- 1) players who are 6-0, 30 (+/- 2) players who are 5-1, and 68 (+/- 4) players who are 4-2. Each player has 4 possible outcomes from day 2 -- WW, WL, LW, and LL. Each of these populations will actually be equally distributed -- there will be the same number of 4-2WW and 4-2WL and 4-2LW and 4-2LL. Pair-downs and modified wins/losses screw this up some, but not by a huge amount, and those will be covered later.

There is one more factor that needs to be considered here - and that is whether a player's first opponent wins or loses their second match. This is the sole determining factor in a player's D2SOS after considering their starting record and their personal performance. So, all 4-2WW's whose first opponents win their second round will have the same D2SOS, as will all 4-2WW's whose first opponents lose their second round. So we will call the former 4-2WWW, and the latter 4-2WWL.

So, barring pair downs and barring modified wins/losses, this identifies all possible tournament point outcomes and all possible strength of schedule outcomes. Now we only need to rank them.

8-0 players consist only of 6-0WW's. There should be 1-2 8-0 players. They are guaranteed in the cut.

7-1 players consist of 6-0WL, 6-0LW, and 5-1WW players. Using Swiss logic it can be calculated that there will be 10 to 11 7-1 players; they too are guaranteed in the cut.

6-2 players consist of 6-0LL, 5-1LW, 5-1WL, and 4-2WW players. Using Swiss logic it can be calculated that there will be 0 to 2 6-0LL's, 6 to 8 5-1WL's, 6 to 8 5-1LW's, and 16 to 20 4-2WW's. Three to five of these players will make the Top16 cut.

So, let's evaluate the D2SOS of these groups:

And herein lies the issue. Every 5-1 player that goes 6-2 will have a higher D2SOS than a 4-2 player that goes 6-2, simply because they started out paired up against a 5-1 player and not a 4-2 player in the first round.

Now, I said that I was ignoring modified wins and losses and pair downs. So, let's consider these things right now. First, we can easily see that a modified win/loss or pair down will not affect whether an 8-0 or 7-1 player makes the cut unless there are several of them. Specifically, a 7-1 player with 3 modified wins, or 2 modified wins and 2 modified losses, will end up beneath the mass of 6-2 players with no modified results. With fewer than that, the 7-1 player ends up above the 6-2 mass and is guaranteed in. There's currently only one 5-1 player with any modified anything, so it's true that if that one player gets two mod wins on day 2 then they are out of the running, but it seems very unlikely. Pair-downs only hurt the D2SOS, which these populations aren't relying on to make the cut, and so have no effect on the quantity of 6-2's that make the Top 16 cut.

So that leaves us with 6-2's. On a 6-2 player, a pair down has the effect of dropping their D2SOS by 9/16. Doesn't matter when the pair down happens or who you are, getting paired down will have that result -- it effectively drops you one row on the above chart. Getting paired down against someone with a modified result will have a similarly deleterious effect, but not so severely; instead of falling an entire row you fall in between the rows. So in short, a 5-1LWL with a standard pair down has the same D2SOS as a 4-2WWW with no pair down.

Finally, because of how Swiss works, pair downs happen at most once per round per score category, and so they're nowhere near enough to get a 4-2WWW into the Top16 cut. Instead it has to be modified wins/losses. Pretty much the entire 5-1WL and 5-1LW brackets need to have a modified result in order for a 4-2WWW to make the Top16 cut. And that's quite unlikely; we're hoping that ~12 people out of a population of ~14 to 16 receive what truthfully appears to be a quite rare result; there just are not that many of them floating around. Like I said, there is currently one 5-1 who has a modified result.

The only possible out for a 4-2 to make the single elimination rounds is as a challenger, but even then that is only going to happen if every single 5-1 in their clan makes the natural Top 16 (the 6-0's are guaranteed in). Currently that could happen for Lion (which has no 5-1's at the moment), Unicorn and Phoenix (who each have one 5-1), but seems absurdly unlikely for the rest of the clans.

I'm not entirely certain what the point of this post is other than an excuse to do math, but I do think that it's quite cruel to let 4-2's play in day 2 if they don't actually have a chance to make it into the single elimination rounds. Isn't this what a graduated cut is supposed to avoid?
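
The headcount in this post can be reproduced with a few lines. This is a rough sketch using the post's own simplifications: the forecast Day 2 field of 5, 30 and 68 players, an even four-way split across WW, WL, LW and LL, and no pair-downs or modified results. `forecast_records` is a hypothetical helper name.

```python
from collections import defaultdict
from itertools import product

# Forecast Day 2 field from the post: (wins, losses) -> player count.
DAY1_COUNTS = {(6, 0): 5, (5, 1): 30, (4, 2): 68}

def forecast_records(day1_counts, rounds=2):
    """Expected players per final record if every W/L sequence is equally likely."""
    outcomes = list(product("WL", repeat=rounds))  # WW, WL, LW, LL
    totals = defaultdict(float)
    for (wins, losses), count in day1_counts.items():
        for outcome in outcomes:
            record = (wins + outcome.count("W"), losses + outcome.count("L"))
            totals[record] += count / len(outcomes)
    return dict(totals)

totals = forecast_records(DAY1_COUNTS)
slots_left_for_6_2 = 16 - totals[(8, 0)] - totals[(7, 1)]

for record in sorted(totals, reverse=True):
    print(record, totals[record])
print("Top 16 slots left for 6-2 players:", slots_left_for_6_2)
```

Under these assumptions the 8-0 and 7-1 groups take most of the sixteen seats, leaving only a handful for a 6-2 group that is several times larger, which is why the D2SOS tiebreaker ends up deciding everything.
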
  23. Oh, no, you're right, 4-2's do make it to Day 2. But it's completely impossible for a 4-2 to make the natural Top 16 cut. They could, potentially, if the competition is low enough, make the Challenger position, but due to the way the tournament is set up it actually cannot happen that a 4-2 makes it into the natural Top 16. I'm making a thread on this right now to show my numbers, should be up soonish. Long story short is, wiping the strength of schedule between days is a terrible idea and it makes having a two round Day 2 actually a waste of time for everyone.
  24. Here is my table of players currently making the Day 2 cut and their distribution. This table does ignore mod wins/losses, but there actually are not that many in the data; a total of 5 people out of the 61 have a mod win or loss. The Forecast column is the number of people with those records if Day 1b has the same distribution as Day 1a had.