Varieties of “Black Mirror” Appreciation – A Statistical Analysis

[Note: This post contains spoilers for Black Mirror]

A few days ago I finished the last available episode of Black Mirror. If you haven’t seen the show, this post will make very little sense to you and I recommend closing this tab, watching the series (ONE EPISODE PER DAY, MAX. DO NOT BINGE.), and then coming back after that approximate fortnight to read this.

For those who have seen the show but don’t remember every episode title, here is a reminder (plot descriptions from

Season 1
The National Anthem
Prime Minister Michael Callow faces a shocking dilemma when Princess Susannah, a much-loved member of the Royal Family, is kidnapped.

Fifteen Million Merits
After failing to impress the judges on a singing competition show, a woman must either perform degrading acts or return to a slave-like existence.

The Entire History of You
In the near future, everyone has access to a memory implant that records everything they do, see and hear. You need never forget a face again – but is that always a good thing?

Season 2
Be Right Back
After losing her husband in a car crash, a grieving woman uses a computer software that allows you to “talk” to the deceased.

White Bear
A woman wakes up in a strange dystopian world with no memory, where everyone is glued to their phones and there are hunters out to kill her.

The Waldo Moment
A failed comedian who voices a popular cartoon bear named Waldo finds himself mixing in politics when TV executives want Waldo to run for office.

White Christmas
In a mysterious and remote snowy outpost, Matt and Potter share an interesting Christmas meal together, swapping creepy tales of their earlier lives in the outside world.

Season 3
Nosedive
In a future entirely controlled by how people evaluate others on social media, a girl is trying to keep her “score” high while preparing for her oldest childhood friend’s wedding.

Playtest
An American traveler short on cash signs up to test a revolutionary new gaming system, but soon can’t tell where the hot game ends and reality begins.

Shut Up and Dance
When withdrawn Kenny stumbles headlong into an online trap, he is quickly forced into an uneasy alliance with shifty Hector – both at the mercy of persons unknown.

San Junipero
In a seaside town in 1987, a shy young woman and an outgoing party girl strike up a powerful bond that seems to defy the laws of space and time.

Men Against Fire
Future soldiers Stripe and Raiman must protect frightened villagers from an infestation of vicious feral mutants. Technologically, they have the edge – but will that help them survive?

Hated in the Nation
In near-future London, police detective Karin Parke and her tech-savvy sidekick Blue investigate a string of mysterious deaths with a sinister link to social media.

I like Black Mirror, for some strange, masochistic, Stockholm-syndromic meaning of “like”. Most things on TV are disappointing in that they don’t surprise you; they work firmly within particular frameworks of genre and style. Some frameworks are more narrow and cramped than others (police procedurals or teen dramas barely leave room to stand), while others are more open. You can be ambitious and constantly work to break out of the previously established frames, but if you keep doing that over and over again you’re likely to end up with an incoherent mess of a story (looking at you, “Lost”) unless you’re really awesome and have planned everything perfectly.

Black Mirror gets around this by cheating. It cheats by being an anthology, meaning you start over with a clean slate every time, never having any idea what you’re going to get. The episodes are all separate stories tied together by a common ethos and viewpoint more than anything else. They vary considerably in tone, feel, subject matter, aesthetics and genre. This has certain consequences for the fanbase: while everybody obviously likes the show, tastes in individual episodes are all over the place and nearly every story manages to be divisive among fans, who nonetheless are united in their appreciation for the show as a whole.

This offers an interesting opportunity for some data-driven erisology. I’ve mentioned here a few times before that I’m interested in people’s differing tastes in art and stories and what it is, psychologically, that makes us prefer different things. “There is no accounting for taste”, they say. My reaction to that is something like: “Well why the hell not? Have you even tried? There is obviously some kind of explanation”. I won’t try to offer any explanations now; I’m not in a position to do that. But some exploratory work is possible, and Black Mirror is a great subject for it.

Black Mirror episodes are all different and the fans react differently to them, which means the set of 13 episodes is a small sample with quite a lot of variation in “story-DNA” terms. Divisive things can be used as clues to what defines people’s taste, but a single divisive movie, TV show or book doesn’t offer a lot of data: you only get one thing people differ on. What if you could get the same set of people to watch many divisive stories, and these stories were divisive in different ways, splitting people along different dimensions?

Usually this is difficult or impossible because people’s taste doesn’t just determine how they react to stories but also which ones they seek out in the first place and which ones they care to rate, which biases such data. That’s why it’s good to have a fairly small set of relatively diverse stories that you know everyone in the group has seen and can remember separately. Black Mirror offers not a perfect but a pretty good example.

I wanted to examine Black Mirror episode preferences to see if there was any interesting structure to it. When you browse threads on Reddit’s r/blackmirror where fans rank their favorites from top to bottom there is a remarkable amount of disagreement and with a few exceptions the rankings look almost random.

I copy-pasted data from a few of those threads, comparing usernames to avoid duplicates, and ended up with 89 full ranking lists. Ideally I’d have more, but I didn’t want to start a new thread when there already were several of them. What did I find? Let’s dig in.
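For anyone who wants to repeat the exercise, the cleaning step can be sketched roughly as follows. This is a minimal sketch and not the procedure actually used for this post: the function name `clean_rankings`, the `(username, ranking)` input shape and the case-insensitive username comparison are all assumptions made for illustration.

```python
EPISODES = [
    "The National Anthem", "Fifteen Million Merits", "The Entire History of You",
    "Be Right Back", "White Bear", "The Waldo Moment", "White Christmas",
    "Nosedive", "Playtest", "Shut Up and Dance", "San Junipero",
    "Men Against Fire", "Hated in the Nation",
]

def clean_rankings(posts):
    """Keep the first complete ranking per user.

    posts: iterable of (username, ranking) pairs, where ranking is a list of
    episode titles ordered from best to worst (hypothetical input format).
    """
    seen = set()
    kept = []
    for username, ranking in posts:
        key = username.lower()  # assumption: usernames compare case-insensitively
        if key in seen:
            continue  # this user already contributed a complete list
        # A "full" list names each of the 13 episodes exactly once.
        if len(ranking) != len(EPISODES) or set(ranking) != set(EPISODES):
            continue
        seen.add(key)
        kept.append((username, list(ranking)))
    return kept

# Made-up posts: one valid list, one duplicate user, one partial list.
full = list(EPISODES)
posts = [("alice", full), ("Alice", full[::-1]), ("bob", full[:5])]
cleaned = clean_rankings(posts)
print(len(cleaned))
```

Dropping partial lists keeps every remaining row comparable, at the cost of throwing away people who only posted a top five.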

First of all, which is the best episode? Here they all are, ordered from top to bottom by average rank (lower figures indicating higher positions).

Episode title Average rank
White Christmas 3.9
Fifteen Million Merits 4.4
Shut Up and Dance 5.0
San Junipero 5.4
White Bear 5.9
The Entire History of You 6.0
Be Right Back 7.2
Hated in the Nation 7.6
Playtest 7.8
Nosedive 8.4
The National Anthem 8.5
Men Against Fire 9.4
The Waldo Moment 11.5

It seems that, among Black Mirror fans hardcore enough to post their full lists on Reddit, “White Christmas” is the favorite. Note however that its average rank is only about 4th place, far from a consensus. The community is far more in agreement about the outlier “The Waldo Moment” being the weakest episode, ranking 11.5 out of 13 on average. Interestingly, all three seasons were about equally popular, with averages of 6.3, 7.1 and 7.3, respectively.
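The average rank itself is simple to compute once the lists are in order. Here is a minimal sketch on three invented lists over three episodes; the function name `average_ranks` and the toy data are assumptions for illustration, not the real 89 lists.

```python
from statistics import mean

def average_ranks(rankings):
    """rankings: list of lists, each ordered best to worst.

    Returns {title: mean position}, where position 1 is the top spot.
    """
    positions = {}
    for ranking in rankings:
        for place, title in enumerate(ranking, start=1):
            positions.setdefault(title, []).append(place)
    return {title: mean(places) for title, places in positions.items()}

# Invented toy data, not taken from the Reddit threads.
toy = [
    ["White Christmas", "San Junipero", "The Waldo Moment"],
    ["San Junipero", "White Christmas", "The Waldo Moment"],
    ["White Christmas", "The Waldo Moment", "San Junipero"],
]
avg = average_ranks(toy)
for title, value in sorted(avg.items(), key=lambda kv: kv[1]):
    print(f"{title}: {value:.1f}")
```

A handy sanity check: when everyone ranks every episode, the average ranks must add up to 1 + 2 + … + 13 = 91, and the rounded figures in the table above do.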

Since I’m interested in divisiveness and disagreement I also checked which episodes were the most controversial. Based on the discussions on Reddit, I suspected “San Junipero” would top the list, being hailed by many as the best of the series but strongly disliked by others for deviating from the show’s usual ethos of pessimism and grimness. I also expected “The National Anthem” to be controversial, considering that fans often advise newcomers not to start with it, even though it’s the first episode, because its prime-ministerial pig sex puts many people off for some reason.

I was right. Here is the full list of standard deviations in rank, from most to least divisive.

Episode title Standard deviation
San Junipero 3.76
The National Anthem 3.52
Playtest 3.27
Fifteen Million Merits 3.25
Shut Up and Dance 3.22
Be Right Back 3.21
Hated in the Nation 3.12
White Bear 3.07
The Entire History of You 3.07
Nosedive 2.98
White Christmas 2.80
Men Against Fire 2.46
The Waldo Moment 2.29

Just looking at the standard deviations doesn’t quite do the data justice because we don’t really have intuitions for standard deviations the way we have for averages. What does 3.76 mean? Here is a figure showing the distribution of rankings for each episode, from best to worst. Note that every single episode has people ranking it in the top three (green) and bottom three (red), and a full 9 out of the 13 are both someone’s favorite and someone’s least favorite.
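To get a feel for these spread figures, here is a minimal sketch that summarises the positions a single episode received. The positions are invented, the helper name `rank_spread` is hypothetical, and the use of the sample standard deviation is an assumption, since the post does not say which variant was used.

```python
import math
from statistics import stdev

N_EPISODES = 13

def rank_spread(places, n_episodes=N_EPISODES):
    """Summarise the positions one episode received across all lists."""
    return {
        "sd": stdev(places),  # sample standard deviation (an assumption)
        "top3": sum(p <= 3 for p in places),
        "bottom3": sum(p > n_episodes - 3 for p in places),
        "someones_favorite": 1 in places,
        "someones_least_favorite": n_episodes in places,
    }

# Invented positions for one hypothetical episode, not real data.
toy_places = [1, 2, 13, 5, 12, 1, 9]
spread = rank_spread(toy_places)
print(spread)

# Yardstick: the SD of a position drawn uniformly at random from 1..13.
uniform_sd = math.sqrt((N_EPISODES**2 - 1) / 12)
print(round(uniform_sd, 2))  # 3.74
```

On that yardstick, the 3.76 for “San Junipero” is about what you would see if people dropped it into a uniformly random slot, while the 2.29 for “The Waldo Moment” reflects much tighter agreement.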


Ok, so there is a lot of variety. But is it all random and inscrutable, or is there some kind of sense to the variation? Does liking one particular episode or set of episodes make you more likely to like another? I’d presume so: all kinds of recommendation systems for movies, books and whatever are built on that principle, and I don’t think people’s preferences are random. There is accounting for taste.

Recommendation systems generally work with spotty and flawed data for reasons I described before, and Black Mirror episodes are an unusually clean data set (hopefully compensating for the small size of my sample) so odds are good we can find something.

I could look at correlations between the rankings of different episodes, but that would only give pairwise relationships. Instead I ran a statistical procedure called Principal Component Analysis, or PCA. What PCA does is take a multidimensional data set (this set has 13 dimensions, one for each episode) and retain as much as possible of the variation in the set while reducing the number of dimensions by creating composite properties (“principal components”) that consist of weighted combinations of the raw dimensions. It’s technical, requires quite a bit of “data analysis literacy” to really get, and is hard to explain properly without pictures and way more than one paragraph. What it does, in layman’s terms, is look at all relationships at once and try to find the underlying axes along which the data varies the most.
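For the curious, the core of a PCA fits in a few lines. The sketch below is one common way to do it (an eigendecomposition of the episode-by-episode correlation matrix) and is not necessarily what the software used for this post did. The data are random stand-ins, and the function name `pca_loadings` is hypothetical.

```python
import numpy as np

def pca_loadings(ranks):
    """ranks: (n_people, n_episodes) array of rank positions.

    Returns the eigenvalues in descending order and the loadings
    (episodes x components) of a PCA on the correlation matrix.
    """
    corr = np.corrcoef(ranks, rowvar=False)   # pairwise episode correlations
    eigvals, eigvecs = np.linalg.eigh(corr)   # symmetric input, ascending output
    order = np.argsort(eigvals)[::-1]         # strongest component first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings scale each unit-length eigenvector by sqrt(eigenvalue).
    # Clip guards against tiny negative eigenvalues from rounding.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0, None))
    return eigvals, loadings

rng = np.random.default_rng(0)
# Random stand-in data: 89 people each ranking 13 episodes (not real data).
ranks = np.array([rng.permutation(13) + 1 for _ in range(89)])
eigvals, loadings = pca_loadings(ranks)

print(np.round(eigvals[:3], 2))           # the three largest eigenvalues
print(int((eigvals > 1).sum()))           # how many exceed the "over 1" cutoff
```

Two things are worth knowing when reading the output. The sign of each component is arbitrary, so a whole column of loadings can be flipped without changing anything. And because every full list uses each position exactly once, one eigenvalue is always essentially zero. Note also that even shuffled data like this produces some eigenvalues above 1, so that cutoff is only a rule of thumb.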

I ran the analysis and found three components with eigenvalues significantly over 1 (that just means three strong dimensions that very probably are not random noise). By using these three combined properties instead of the full 13-dimensional rankings we can keep about 45% of the total variation, which is decent but not spectacular. Maybe more data would offer better results.
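The arithmetic behind the “over 1” rule is easy to spell out, assuming the PCA was run on standardized rankings (that is, on the correlation matrix), which the post does not state explicitly. In that setup each episode contributes exactly one unit of variance, so the eigenvalues sum to 13. The helper names below are made up for illustration.

```python
N_EPISODES = 13  # one unit of variance per standardized episode

def share_of_variance(eigenvalue, n_variables=N_EPISODES):
    """Fraction of the total variation carried by one component."""
    return eigenvalue / n_variables

def eigenvalue_for_share(share, n_variables=N_EPISODES):
    """Total eigenvalue mass corresponding to a given share of variation."""
    return share * n_variables

baseline = share_of_variance(1.0)       # a component worth exactly one episode
retained = eigenvalue_for_share(0.45)   # the three components taken together

print(f"eigenvalue 1 = {baseline:.1%} of the variation")
print(f"45% of the variation = eigenvalues summing to about {retained:.2f}")
```

So an eigenvalue of 1 corresponds to roughly 7.7% of the variation, the amount a single episode would carry on its own, and keeping about 45% means the three components together do the work of nearly six episodes.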

So without further ado, here is the first and strongest axis along which tastes vary. The numbers refer to how strongly each episode defines this dimension (1.000 is the theoretical maximum, 0.0 means complete irrelevance). The next issue is the interpretation of what the axes actually mean, and while PCA is “scientific”, interpreting the resulting axes is an art.

Shut Up and Dance 0.776
The Waldo Moment 0.482
The National Anthem 0.377
White Christmas 0.323
White Bear 0.272
Men Against Fire 0.131
Hated in the Nation 0.089
Playtest 0.044
The Entire History of You -0.136
Fifteen Million Merits -0.243
Nosedive -0.554
Be Right Back -0.589
San Junipero -0.708

So the most powerful pattern is that people who like “Shut Up and Dance”, “The Waldo Moment” and “White Christmas” more than others tend to dislike “San Junipero”, “Be Right Back” and “Nosedive”. This makes sense to me. The top 5 here are kind of grim and shocking, while the bottom 3 (and to a lesser extent the next 2) are gentler and softer, more relationship-oriented. “The Waldo Moment” sticks out among the top 5, but it’s quite cynical, which I guess fits. It’s also a bit wonky statistically: its outlier status makes the data highly asymmetrical, which isn’t ideal for PCA. If I were interested in opening up a jar of angry bees I might also guess that there could be something male vs. female about this axis.

This first axis explains 18% of the variation and is about as important as the second and third put together. The second and third are about equally strong. Here is the second:

Playtest 0.696
White Christmas 0.617
White Bear 0.286
The Entire History of You 0.148
Men Against Fire 0.143
Nosedive 0.026
San Junipero -0.003
Shut Up and Dance -0.008
Be Right Back -0.212
Fifteen Million Merits -0.234
Hated in the Nation -0.331
The Waldo Moment -0.389
The National Anthem -0.67

What this means is less obvious, but what stands out to me is that the top 2, and to a decreasing extent nos. 3, 4 and 5, all deal with mind games and terrifying, freakish mental experiences. “The National Anthem” and the others near the bottom are more about society and politics, more “extroverted”, you might say.

The third and final dimension is even harder for me to interpret.

Men Against Fire 0.67
Hated in the Nation 0.63
Be Right Back 0.219
White Christmas 0.13
San Junipero 0.13
Shut Up and Dance 0.065
White Bear 0.039
Nosedive -0.051
The National Anthem -0.193
Playtest -0.22
The Waldo Moment -0.309
Fifteen Million Merits -0.333
The Entire History of You -0.703

Ok, the top two are both kind of suspense-based. But so are the middle ones… They’re critical of society in a broad sense, but so are “Fifteen Million Merits” and “The Waldo Moment”. Could there be something about season 1 vs season 3? The top two are the last of season 3 and the bottom 2 are from season 1. Could a certain group of people rate older episodes lower because they haven’t seen them for a while and the impression has faded? Another possibility is genre conformity. “Men Against Fire” is like an action movie while “Hated in the Nation” is like a police procedural, both down-to-earth style-wise. The bottom three are a bit more mixed up and more difficult to parse. But I’m grasping now. Suggestions welcome.
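One way to cross-check the three loading tables against the variance figures quoted earlier: if the numbers are unrotated loadings from a PCA on standardized rankings (an assumption, the post does not say), then the squared loadings of a component sum to its eigenvalue, and dividing by 13 gives its share of the variation. The sketch below uses the values transcribed from the three tables; the function name `implied_share` is made up for illustration.

```python
# Loadings transcribed from the three tables above, in table order.
axis1 = [0.776, 0.482, 0.377, 0.323, 0.272, 0.131, 0.089, 0.044,
         -0.136, -0.243, -0.554, -0.589, -0.708]
axis2 = [0.696, 0.617, 0.286, 0.148, 0.143, 0.026, -0.003, -0.008,
         -0.212, -0.234, -0.331, -0.389, -0.67]
axis3 = [0.67, 0.63, 0.219, 0.13, 0.13, 0.065, 0.039, -0.051,
         -0.193, -0.22, -0.309, -0.333, -0.703]

def implied_share(loadings):
    """Sum of squared loadings divided by the number of episodes.

    Assumes unrotated loadings on standardized variables, so that the
    sum of squares equals the component's eigenvalue.
    """
    return sum(x * x for x in loadings) / len(loadings)

shares = [implied_share(axis) for axis in (axis1, axis2, axis3)]
for number, share in enumerate(shares, start=1):
    print(f"axis {number}: {share:.1%}")
print(f"together: {sum(shares):.1%}")
```

Under that assumption the three axes come out at about 18.6%, 13.8% and 13.2%, or roughly 46% together, which lines up with the 18% and “about 45%” quoted above.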

So there is a pattern behind who likes what. But few if any people will recognize their own taste perfectly in any of the dimensions. I know I don’t. I’ll end this post with my own list:

  1. San Junipero
  2. The Entire History of You
  3. Fifteen Million Merits
  4. White Bear
  5. Nosedive
  6. White Christmas
  7. Shut Up and Dance
  8. Hated in the Nation
  9. The National Anthem
  10. Be Right Back
  11. Playtest
  12. The Waldo Moment
  13. Men Against Fire

Yes, in the for-or-against “San Junipero” controversy, I come down on the “pro” side. It was such a wonderful catharsis after so much grimness (and the show surprised me yet again). But note that “SJ” would not be that high up on its own; its place at the top (for me) depends entirely on other episodes in the series being so disturbing [1]. We earned that happy ending, especially after the double gut-punch of the two preceding chapters, “Playtest” and “Shut Up and Dance”. After the end of the latter my first words were: “Sometimes I wonder why we’re even watching this show”.

I noticed that ranking all the episodes is really hard. No order seems fair because the episodes are so different as to be incomparable. And how do you rate the episodes that are extremely effective but make you feel like shit? “White Christmas”[2], “White Bear” and “Shut Up and Dance” are outstandingly well put together but I don’t want to watch them again. It makes me think of Funny Games, a masterpiece that made me want to throw up my intestines.


[1] This is a common problem for any long story. Often the very best and most appreciated elements are the ones that stand out and deviate from expectations (especially in comedy, since subversion of expectation is kind of what humor is). But you want to do more of what works, so those things become less and less outstanding as you do them more and more. A feedback mechanism that amounts to “do more of whatever you do the least” is ultimately self-defeating. It’s the mechanism behind Flanderization (warning: TV Tropes link) and the reason Family Guy cutscenes stopped being funny.

[2] I rank the fan favorite “White Christmas” somewhat low. Not because it isn’t powerful or didn’t affect me; it is and it did. I think it’s because it contains three separate stories and I find that kind of messy. In general my personal preference is for focused, highly cohesive stories.

Did you enjoy this article? Consider supporting Everything Studies on Patreon.

4 thoughts on “Varieties of “Black Mirror” Appreciation – A Statistical Analysis”

  1. Interesting post. Have you noticed how the episodes “mirror” themselves? For example:

    Season 1:

    The National Anthem (Political relations)
    15 Million Merits (Social relations)
    The Entire History of You (Personal relations)

    Season 2: *mirror*
    Be Right Back (Personal Relations)
    White Bear (Social relations)
    The Waldo Moment (Political relations)

    The last episode of season 2, “White Christmas,” was released as a separate, standalone episode well after the first three episodes. That makes it the middle of the entire series.

    Season 3 isn’t as cohesive, but I haven’t had as much time to think about it:

    Season 3:
    Nosedive (Rating others)
    Playtest (Gaming)
    Shut Up and Dance (Contacting others through technology)
    San Junipero (Meeting others through technology)
    Men Against Fire (Gamification of war)
    Hated in the Nation (Rating others)

    And I just wanna say, “The Waldo Moment” was one of my favorite episodes. I’m surprised it ranked so low. It contains a “rosetta stone” for the series, which is when the main character asks, “What are you for?!” This is one of the fundamental questions of the whole series: what (or who) is technology for? How will we use it, what will we use it for? But I guess the importance of an episode doesn’t correspond to people’s tastes 😉

  2. Interesting, there may be something here, even though there is quite a lot of interpretive flexibility. I’d like to see if one might be able to apply a similar but different scheme to them if the order was randomized.

    I did like The Waldo Moment too, and I think its place at the bottom is a little undeserved. I didn’t rank it much higher myself though, the competition really is fierce.

