She’ll Never Get Those Scores Internationally

Secret Classic, the most important gymnastics competition of the year until whatever’s next week, will be upon us as fast as you can say, “That connection is stupid, honey.” With it will come a heap of Olympic team predictions and proclamations about how those scores will or will not translate to the Olympics, burying us under a pile of our dear old friend, “She’ll never get those scores internationally.”

But will she?

The answer is…mostly. Sometimes.

Let’s begin in 2012 and work forward from there.

Here, I’ve taken the average execution score each US Olympic team member received at the four major 2012 competitions (Classic, Nationals, Trials, and the Olympics) and plotted them by event to compare the scores received domestically to the scores received internationally. I’ve excluded team members who did not ultimately compete that event at the Olympics—because then there’s no point of international comparison—so vault does not include Ross’s domestic scores and the other events don’t include Maroney’s.
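For anyone who wants to redo this at home, the averaging above can be sketched roughly like this. It is a minimal sketch, not the actual spreadsheet behind the graphs: the record layout, the `average_execution` name, and every score in it are made-up placeholders, assuming one record per routine with the gymnast, meet, event, and E score.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical placeholder records, NOT real scores:
# (gymnast, meet, event, execution score)
scores = [
    ("Gymnast A", "Classic", "bars", 8.5),
    ("Gymnast A", "Olympics", "bars", 8.25),
    ("Gymnast A", "Olympics", "bars", 8.75),
    ("Gymnast B", "Classic", "bars", 9.0),
    ("Gymnast B", "Trials", "bars", 9.0),  # B never competed bars at the Olympics
]

def average_execution(records, international_meet="Olympics"):
    """Average E score per (gymnast, event, meet), keeping only gymnast/event
    pairs that were actually competed at the international meet."""
    competed = {(g, e) for g, m, e, _ in records if m == international_meet}
    grouped = defaultdict(list)
    for gymnast, meet, event, e_score in records:
        if (gymnast, event) in competed:
            grouped[(gymnast, event, meet)].append(e_score)
    return {key: mean(values) for key, values in grouped.items()}

averages = average_execution(scores)
```

Here Gymnast B drops out of the bars comparison entirely, the same way Ross drops out of vault and Maroney drops out of everything else.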


Domestically, most events saw a slight increase in execution scores toward Trials, with Trials featuring the most enthusiastic judging (or, a nice person could argue, the most perfected routines). That’s something to keep an eye on this year as well, with Classic as the most conservative of the US meets.

Once we arrive at the Olympics, the execution scores decrease on some events, but not all events and not for every gymnast. As is well documented, vault execution scores had a prescription drug problem at the 2012 Olympics and were largely off the chain, higher than at any point in the US season. Beam also remained quite constant, falling just slightly for the Olympics (a number which includes mistake routines from Douglas in EF and Raisman in the AA). The execution scores for hit beam routines between Trials and the Olympics were similar.

Of course, D score on beam was a different story, mostly because of Wieber’s walkover hell sandwich that the Olympics judges scraped off the bottom of their shoes with a stick and wiped on the curb. That’s where we can point to US judges doing a disservice by propping up unrealistic D expectations, but in execution, what we saw early was what we saw later.

More noticeable drops occurred on floor and bars. I’m not that concerned by floor because the US had issues on floor at the Olympics which depressed those execution totals. The hit-routine numbers were still a few tenths lower than they were in the US (more of a drop than on vault or beam), but not as extreme as the ultimate graph would suggest.

But let’s look at bars more closely because that is really where we see people NOT GETTING THOSE SCORES INTERNATIONALLY.


Although, it’s not the person you might expect. Raisman’s execution (in red above, red for Raisman, see how I did that) remained very steady from the US to London, while the rest fell rather dramatically. Now, Wieber had those arches at the Olympics, and Douglas went over on a handstand in the event finals, so we wouldn’t expect huge execution numbers (though that doesn’t account for the whole thing). Ross is the most interesting one to me, though, because her execution scores plummet almost seven tenths from Trials to the Olympics (spoiler alert: that’s a lot), and she showed solid Olympic routines.

So, clean-form queen Kyla Ross is the one who gets hit with the biggest NOT INTERNATIONALLY, while Brestyan’s von Brestyan’s chugs right along with the same scores. That’s the international scoring story.

But enough of ancient history, let’s get to more modern times for comparison. I’m dispensing with 2013 because there were only three US gymnasts competing at worlds, Maroney barely did routines that whole year, and Simone got nothing but 7s at Classic, so there’s little real information to work with. Let’s move on to 2014 when we have some more heft to the team.


In 2014, vault displayed a similar personality to 2012 with that very slight execution increase at worlds, though the other three events were all noticeably lower.

The bars decrease was in many ways a result of that terrible qualification performance when both Biles and Kocian fell (the execution decrease actually should be larger given those mistakes at worlds, but there were nearly as many mistakes at nationals, so it’s all just a big bag of falls).

Contrary to what we saw in 2012, however, beam execution was quite different between 2014 nationals and 2014 worlds.


Of course, some of this must be taken in context, like Baumann having a major wobble on her arabian at worlds, but once again Kyla got the brunt of the execution slam. This was the worlds, if you recall, when the judges really started cracking down on pause deductions, so once again Ross was the one hitting her routines (mostly) similarly to nationals while scoring a legit seven tenths lower at worlds. Only Simone escaped the beam’s wrath at worlds.

She did not escape it on floor, however, where we see a very traditional, expected, medium-sized decrease for every gymnast from nationals to worlds.


Across the board, the whole team lost about three tenths, no one that much more, no one that much less, no matter whether you’re Biles or Ross or Skinner.

To 2015!


Our most recent year provides probably the least compelling case for the NEVER GETTING THOSE SCORES INTERNATIONALLY argument because everyone pretty much did get those scores internationally. Vault takes on a slightly different identity in 2015 as scores actually did fall a bit from nationals, a result of harsher evaluation of those Amanars (both Douglas and Dowell actually saw their E scores increase from nationals to worlds for their DTYs).

Floor showed the smallest change in 2015 among recent years, with Raisman and Dowell dropping a little for weaker showings but Biles and Douglas’s E scores matching their hit scores from nationals quite nicely. As for beam, those scores were remarkably identical (e.g., Biles’ execution average was 8.600 at nationals and 8.606 at worlds).

Note the giant drop on bars, but that drop is almost entirely due to the disasters from Dowell (5.433 E score) and Raisman skewing the total a ton. If we exclude those routines and look at the individuals who hit their routines (four lines for Biles, Douglas, Nichols, and Kocian below), their slight decrease is consistent with the other events.


In 2015, there was very little “not getting those scores internationally” going on at all.

In general, when US gymnasts head to worlds/Olympics these days, they do lose a little. A tenth or two is normal, a little less than that on vault, a little more than that on floor, but it’s usually kept in the reasonable, unremarkable range.

Sometimes, significantly worse international execution scores rear their heads (for hit routines), like on beam in 2014 as addressed, but that seems an outlying instance, and it basically only happened to Kyla Ross. Every time. Sorry, Kyla.

What there isn’t much support for is the idea that gymnasts with weaker execution will get destroyed at worlds, while gymnasts with better execution will remain constant. Pretty much everyone loses a little, no matter who you are. Raisman’s bars was the only execution score that didn’t fall on that event at the Olympics in 2012, and Skinner’s execution scores in 2014 didn’t drop at a more precipitous rate than Ross’s or Kocian’s or Biles’.

The answer to “Her scores will drop internationally” seems to be, “Yeah, maybe like a tenth, but so will everyone’s.”

In fact, the standout cases of a larger scoring hit internationally have tended to come from the cleanest gymnasts simply because the judges at worlds aren’t willing to throw out super high Es. To anyone. The best execution scores often have to weather the international scoring landscape more than the weaker ones.

Vault in 2015 illuminates that with respect to Biles.


Biles experienced a clear drop in execution scores from nationals to worlds. Her worlds vaults were certainly not her best showing and not as good as they were at nationals, but a major factor in the drop is simply that the worlds judges had no interest in giving her the kind of 9.900 E score she got at nationals. It wasn’t going to happen, bringing her back to the realm of humans.

One outcome of the best gymnasts’ execution scores dropping and the others staying the same (or increasing) is that the international execution scores coalesce within a smaller range than we see in the US competitions. Vault in 2015 is a good example of that. With the exception of Raisman—and that long-jump landing on her Amanar that made her execution at worlds justifiably lower—the rest of the scores huddled in a pretty tight four-tenth execution range at worlds compared to the eight-tenth range at nationals.
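The "range" here is just the highest E score minus the lowest. A quick sketch of that arithmetic, using made-up placeholder numbers rather than the real 2015 vault scores, and a hypothetical `execution_range` helper:

```python
def execution_range(e_scores):
    """Spread between the best and worst execution score in a group."""
    return max(e_scores) - min(e_scores)

# Hypothetical placeholder E scores, NOT the actual 2015 vault numbers
nationals = [9.0, 9.25, 9.5, 9.75]
worlds = [9.0, 9.25, 9.25, 9.5]

domestic_spread = execution_range(nationals)
international_spread = execution_range(worlds)
```

With these placeholders the domestic spread is larger than the international one, which is the bunching effect in miniature: the top comes down while the bottom stays put.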

Many of the graphs above reflect a similar phenomenon. Look at the final dots for bars in 2012 or beam in 2014 or the overall events in 2014. The international scores are extremely bunched, much more so than the domestic scores.

We tend to bemoan US judging when it ultimately doesn’t correspond to what happens at worlds (using the notion that a harsher score is equivalent to a more correct score, which is sometimes the case but not exclusively), but in some ways the US elite judges have it right in that they’re using the code more the way it was meant to be used.

Judges have a whole ten points of execution scoring to work with in an open-ended system. There’s no reason to have everyone stuck in the same little five-tenth range for wildly different performances. While the US judges might get a little enthusiastic by comparison and throw out super high 9s here and there that will never happen at worlds, they’re utilizing more of the execution range at their disposal, which is commendable.

2 thoughts on “She’ll Never Get Those Scores Internationally”

  1. Love it Spencer! And I totally agree with you. International judges seem to think 9.1-9.2 is like the best E score a gymnast can get (maybe a wee bit higher for vault) and it really bugs me. It feels like they’re so ready to punish gymnasts for mistakes but cannot reward them for really good execution.


  2. Wow, can’t believe we still have to talk about E-scores as if the judges watch the whole routine and then deduct using their overall impression/memory of mistakes with a healthy dash of subjectivity. If that’s so, something needs to be done. The code specifies the deductions that should be taken, so every wobble, improperly flexed foot, leg sep, short handstand, bent leg, short of splits, short of rotation, overrotation, deep chest, lack of height, excessive preparation pause, etc. should lose the gymnast 0.1/0.3/0.5. Fine – gymnasts know what to cut out to score more and we see a better, cleaner routine. They are not entitled to a 9 just because they didn’t fall and threw a big D. I think perhaps judging should be more open, i.e., we get to know what deductions were taken and why, and if the scores that matter are close then scoring is checked to make sure gymnasts/teams are correctly positioned. There’s probably a case for judging using slow mo’s and multiple angles, but it is obviously difficult to treat every gymnast the same without taking all day over a single subdivision.

