Suspect suspects: The Status Quo and Suspect Tests (2024)

By a fairy. Released: 2024/04/14

Art by bro torterra.

Introduction

Suspect tests are a core part of Smogon’s DNA. If a tier isn’t actively testing something, there’s often a vocal contingent, minority or not, clamoring for something new to get tested, either to be released from the tier’s ban list or to be sent to it. At any given point, there’s likely a suspect test running, in either the laddering or the voting half of the test. Oftentimes, multiple tests run simultaneously across the various ways that we play competitive Pokémon.

Some tests come and go, forgotten other than the metagame shifts they potentially make. Other tests have seismic impacts on the community as a whole, echoing out into the wider world of Pokémon. In December 2019, it took a mere seventeen hours for over 200 of the 300 eligible voters to vote to ban Dynamax in OU. By the time the vote formally closed, only 36 voters total had voted to keep Dynamax.

For those not around for the BW original Shaymin-S vote or Philip7086 Putting His Foot Down following a discussion and overriding a supermajority vote, it’s quite possible that suspect tests have shaped how you view Pokémon as much as you’ve shaped suspect tests with your votes or laddering attempts. Each suspect test builds on every one that comes before it and contributes to the tests that come after it, both by changing the tiers they are for and by setting informal precedent for future tests.

At their base, suspect tests are just simple numbers, and with the number of suspect tests that have happened in the history of Smogon, entire stories can be woven from the scores of Ban and Do Not Ban posts in the Blind Voting subforum. The story that resulted in the collection of this dataset set out to see how often suspect tests succeeded at changing the metagame, but that's not the only story that can be found or made.

Dataset and Disclaimers

This article focuses on all tests that took place between the release of Scarlet and Violet and the release of The Indigo Disk. On a more technical level, it includes every Blind Voting thread that was posted between those two dates, but not things such as council votes or tiers that did not use the Blind Voting forum. This is a subjective decision, made to keep the dataset from growing too large (105 suspects) while giving it reasonably defined start and end points.

Due to the way Smogon approaches testing parts of its metagames, the vast majority of suspect tests follow a similar pattern, which enables uniformity in data collection. Things like the tier, suspect test target, number of voters, and required benchmark for action are all easily collectable before results are even released. The result, as well as the specific number of votes that led to it, was also collected once each suspect test concluded. This pattern is based on a binary "take action" versus "take no action" system, however, which draws attention to three cases that do not fit it.

The big one is the SV OU Terastallization test, which included more granular choices beyond the binary “Take Action” and “Do Not Take Action” options. BW OU and RBY UU are the other two, opting for a preferential voting system on multiple different possibilities for their tiers. The former has been limited to the binary options, while the latter two have been excluded from the dataset due to being incompatible with the binary nature of every other test.

Suspecting the Status Quo

Suspect tests are a huge time commitment and often are major milestones for their tiers during a generation’s lifespan. While no tier operates identically, the decision to make a suspect test is not taken lightly and involves considerable work and effort. With the power to determine tests coalescing in the hands of tiering leaders and councils, tests do not come from nowhere with little logic or reasoning.

The theory that drove this article's creation was that suspect tests should result in action taken more often than not. Suspect tests are used to bring the tier to a healthier state of being, a more competitive and enjoyable environment for its players. It stands to reason, then, that if the people entrusted with the leadership of a tier put forward a suspect test, it would result in the metagame changing, be that by the addition or removal of a Pokémon or mechanic.

Of the 105 tests completed between the release of SV and The Indigo Disk, 69 ended with their tier changing, and 36 ended with the suspect test not making changes to the tier. In nearly two-thirds of cases, the target of a suspect test saw its position in the tier change. While not every test is identical, a large enough dataset can provide something to learn about suspect tests as a whole.

In the above theory, suspect tests should change the metagame more often than not, and with 65% of suspect tests changing the status quo, this matches up. A suspect test takes a lot for a tier to do, so tier leaders presumably don’t want to spend time and energy on tests when a metagame is stable or if a large metagame change—like DLC—is on the immediate horizon. The actual subjective opinion on the success of suspect tests varies for each individual, and there is an endless rabbit hole that involves quickbans, tiering policy, usage stats theory, and so much more.
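The 65% figure follows directly from the two counts reported above. A minimal sketch of the arithmetic in Python; the variable names are illustrative and not part of the original dataset:

```python
# Counts reported in the article for tests between SV's release and The Indigo Disk.
changed = 69      # tests that ended with the tier changing
unchanged = 36    # tests that left the tier as it was

total = changed + unchanged
change_rate = changed / total

print(f"{total} tests, {change_rate:.1%} changed the status quo")
# prints: 105 tests, 65.7% changed the status quo
```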

The short answer for the theory that spawned this seems to be that suspect tests do have a success rate that is higher than random. No suspect test is unanimous (except for the select few that are!), especially when you start counting people who do not successfully get voting requisites. However, it certainly seems that, in the big picture, any test is significantly more likely to change its tier than not.
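One way to put a number on "higher than random" is a one-sided exact binomial tail: if every test were an independent coin flip between action and no action, how likely would 69 or more changes out of 105 be? The sketch below is an illustration, not the analysis behind this article; the 50% baseline and the independence of tests are assumptions, and the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Probability of at least `successes` successes in `trials` independent trials."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 69 of 105 tests changed their tier (counts from the article).
# Assumption: each test is modeled as an independent 50/50 coin flip.
p_value = binomial_upper_tail(69, 105)
print(f"P(at least 69 of 105 under a 50/50 coin) = {p_value:.5f}")
```

Under those assumptions the tail probability comes out well below 1%, which is consistent with the reading above. Real suspect tests are neither independent nor coin flips, so this is only a rough sanity check.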

Fun Tidbits

While the purpose of the exercise was to find that 65% number, there are other fascinating tidbits of information that can be found in the dataset. For example, nearly every single time a move, ability, or item is tested, it’s removed from the tier, the exception being Balanced Hackmons not banning Mortal Spin. Given how much it takes for one of those to be tested, this is perhaps not as large of a surprise as it may seem at first glance. Typical Smogon tiering policy heavily discourages these tests, so for something to get as far as a test, a lot of lines have likely been crossed already.

Not all tests are as simple as Ban or Do Not Ban, with ten tests taking some other approach. Seven of these involved the different-enough Keep Ban against Unban, and the final three come from the Other Metagame Mix and Mega, which can test Restrict against Do Not Restrict, limiting the Mega Stones a Pokémon can use. In these cases, Unban and Restrict are the options shaking up the status quo. In the end, these tests are still a binary "take action" against "do not take action", in their unique flavors.

Every test has a different number of qualified voters, but not all voters vote. Some tests have had as few as 13 voters, while the SV OU Terastallization test had a high mark of 351. One test where voters were chosen via tournament performance had 40% of the voting pool not register a vote, whereas nearly thirty tests had full voter participation, the largest of which was DOU’s Flutter Mane test with 75 voters total.

Terastallization and Excadrill

Despite the generation-long discussion of the feature, only three tiers have tested Terastallization more than once, with a total of thirteen tests across all tiers. Seven of those fell in the range of passing 50% but failing to hit the necessary 60% benchmark, mirroring seven other tests in the same unfortunate spotlight. Examples of non-Terastallization "minority victories" include OU keeping Kingambit unbanned, Mix and Mega voting to ban Magearna, and 1v1's Ogerpon-H vote ending in leaving it unbanned.
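The "passed 50% but missed 60%" band can be expressed as a small classifier. This is a sketch only: the function name and labels are hypothetical, the 60% default mirrors the benchmark mentioned above (individual tests may use a different one), treating a share exactly at the benchmark as passing is an assumption, and the vote counts in the usage lines are made up for illustration rather than taken from any real test.

```python
def classify_result(votes_for_action: int, votes_cast: int, benchmark: float = 0.60) -> str:
    """Label a binary suspect vote by where the share of action votes falls."""
    if votes_cast <= 0:
        raise ValueError("votes_cast must be positive")
    share = votes_for_action / votes_cast
    if share >= benchmark:  # assumption: exactly hitting the benchmark passes
        return "action taken"
    if share > 0.5:
        return "majority, but short of benchmark"
    return "no majority for action"

# Made-up vote counts, for illustration only.
print(classify_result(62, 100))  # action taken
print(classify_result(55, 100))  # majority, but short of benchmark
print(classify_result(40, 100))  # no majority for action
```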

For a while, BW banned Excadrill. This occurred before BW2 even, and to some extent, before we even knew it was called Excadrill due to the nature of staggered international release. Back then, the idea of complex bans or non-Pokémon bans was not unheard of, and single suspect tests could decide the fate of close to a dozen different Pokémon or other parts of the metagame. Years of trial and error and policy had yet to happen to develop a more formalized understanding of how suspect tests should work.

Of twelve different OU suspect tests during BW’s existence as the current generation, the first five featured Excadrill. It stayed comfortably unbanned through the first four, only meeting the banhammer on its fifth appearance. It would take until 2015 for the community to push for an Excadrill re-examination, with the World Cup of Pokémon being used to explore a BW metagame with Excadrill, ending in its return to the tier.

For further context, that 65% rate of tests changing their tier in SV's first third of its lifespan can be compared to BW's history. BW's twelve suspect tests (excluding one ranked-choice vote) covered fifty potential changes to the metagame, and yes, that's fifty different potential bans or unbans in just twelve suspect tests, so we can see a huge change of pace. In BW, only 33% of potential changes happened, with suspect test voters turning down a large majority of slated suspects.

Terastallization and Excadrill are not the same, and SV and BW are very different generations with very different Smogon communities. We have learned much since those days and continue to learn. Only three tiers have had more than one test of Terastallization, though more may come now that the Indigo Disk DLC has released and we have arrived at the final version of SV that will endure through the generations.

Conclusion

Data does not tell stories. People tell stories and can use data to enhance them. With over 100 tests across 40 tiers, there are plenty of stories to tell. It's little surprise that someone who had one of her first experiences of suspect tests watching Excadrill go through the gauntlet over and over again would see echoes in the Terastallization discussions.

There are more than ten pages of voting threads, with more hidden away in Thread Cryonics in Cold Storage if you know where to find them. The tests that took place between SV and Indigo Disk’s release dates are a small section of the road we have traveled since the start of suspect testing, with an entire story that could be told just in the BW OU tests referenced as a contrasting dataset.

In that time, thousands of people have contributed in some manner to the state of tiering on Smogon. Over 1,000 accounts have the Tiering Contributor badge, and that’s disregarding those without it who have still voted, those who contributed via means of policy discussion or never quite made requisites, or even those who just played against suspect test accounts on the ladder.

Whether Terastallization ends the generation banned or unbanned in the various relevant tiers is impossible to say. Whether it will stay that way long-term once SV becomes an old generation is even harder to determine. The fate of Terastallization and every other suspect test target is in the hands of the community in the end, with suspect tests driving the very heart of competitive Pokémon.

HTML by Steorra.

All guides and strategy information are © 2004 Smogon.com and its contributors. Pokémon is © 1995 Nintendo.