If you've been unenthused about the emoji of recent years, you're not alone. A flashlight? A toolbox? A fire extinguisher? A tin can? Who even uses these?
The emoji set to appear on your phone next year are similarly dismal. A screwdriver, a toothbrush, a bell pepper—seriously, what is this, a shopping center? When you think of emoji, you don't think of a laundry list of random objects. You think of iconic, sometimes weird, expressive faces, like the face with tears of joy, the thinking face, the angry devil, the smiling pile of poo, and the see-no-evil monkey, plus classic symbols like the thumbs up and the heart. But the latest batch includes just three new faces and one new hand shape, compared with 49 new objects, from a rollerskate to a rock to a plunger.
The reason for this slide into irrelevance? The Unicode Consortium—the organization in charge of determining which symbols our devices are supposed to recognize—has more and more been measuring the wrong thing in the process of approving new emoji.
No one intends to encode boring emoji, of course. Unicode has three main criteria, and one of them is, "Is there substantial evidence that a large number of people will likely use this new emoji?" Which sounds good in theory, but what actually is "substantial evidence"? Unicode doesn't consider emoji data that comes from petitions, corporate sponsorship, or non-public data sources, judging them too easy to manipulate. After it encoded the original set of emoji from Japanese cellphone carriers, Unicode started looking to search results: If you submit a proposal for a new emoji, you need to provide screenshots showing how many webpages are found when you search for its associated word or phrase in Google Search, Bing Search, and Google Video Search, plus Google Trends.
Unicode's official guidelines note that the median emoji has 500 million search results for a regular Google search, 25 million on Bing, and 75 million on Google Video Search. While "the values are factors that are taken into consideration, not hard limits," the Emoji Subcommittee generally takes a dim view of prospective emoji that aren't in this ballpark. For example, T. rex, which did become an emoji, meets the threshold—Google reports 554 million webpages that mention this word—while ichthyosaur, which was rejected, is nowhere near it (less than a million).
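To make the ballpark comparison concrete, here is a rough sketch in Python. The three medians are the figures quoted above; everything else is an assumption made for illustration. The function names are hypothetical, the 0.5 tolerance is an arbitrary stand-in for "in this ballpark" (Unicode says explicitly that these are not hard limits), and the ichthyosaur figure is only an upper bound, since all we know is that it is under a million.

```python
# Median search-result counts for existing emoji, as quoted from
# Unicode's guidelines in the text above.
MEDIAN_RESULTS = {
    "google": 500_000_000,
    "bing": 25_000_000,
    "google_video": 75_000_000,
}


def ratio_to_median(counts):
    """Express each supplied count as a multiple of that source's median."""
    return {
        source: counts[source] / MEDIAN_RESULTS[source]
        for source in counts
        if source in MEDIAN_RESULTS
    }


def in_ballpark(counts, tolerance=0.5):
    """Hypothetical rule of thumb, NOT Unicode's actual procedure:
    every supplied count must reach at least `tolerance` times the median."""
    ratios = ratio_to_median(counts)
    return bool(ratios) and all(r >= tolerance for r in ratios.values())


# Only the Google figures appear in the text, so only those are used.
t_rex = {"google": 554_000_000}
ichthyosaur = {"google": 1_000_000}  # upper bound: "less than a million"

for name, counts in [("T. rex", t_rex), ("ichthyosaur", ichthyosaur)]:
    print(name, ratio_to_median(counts), in_ballpark(counts))
```

Under these assumptions T. rex lands a little above the Google median and ichthyosaur at a tiny fraction of it, which matches how the two proposals fared. The sketch says nothing about whether search counts are the right thing to measure in the first place.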
Search results do have some advantages. It's harder (though not impossible) to astroturf your way to half a billion search results than half a billion electronic signatures on a petition. And unlike a proprietary internal dataset, it's easy to verify a Google screenshot (you can just repeat the search yourself and look for that grey number right above your list of links). But search results also have a big disadvantage—do people really make websites about the same sorts of things they use emoji for?
As someone who's been spending serious time observing how people use emoji over the past few years, my hunch was no. But I wasn't able to prove it until a new Unicode dataset came out a few weeks ago: It's a public list of all 1,468 emoji ranked by how much people are using them. (Emoji from 2018 and later were excluded for the time being because they're not necessarily broadly available on all devices, so they might not be living up to their potential yet.) Unicode wouldn't specify the sources of the data—I'm assuming it's from major tech companies, many of which are Unicode members—but did tell me the data was international, from the past six months, and on a log scale according to the median for each emoji across several sources, to avoid getting skewed by outliers on a single platform.
There really hasn't been a comparable public dataset like this before; everything else is seriously patchy. Emojipedia puts on its homepage the top half-dozen or so emoji by searches, which can give us an idea of when a new emoji is really taking off, but has no information on anything below that top handful. Emojitracker tracks emoji use in real time on Twitter, which seems great until you realize that no new emoji have been added to the tracker since 2015 and that certain emoji (such as the recycling sign for retweet) are way more popular in spam tweets than tweets by actual people. Periodically, some company trying to court publicity will put out a press release with the "top 50" or "top 100" emoji, often from mysterious sources and lumped together into mysterious categories that prevent one from being able to do any serious stats. To be clear, I've been citing them all anyway, because there's still an emoji chapter in my book about internet linguistics, but they haven’t been rigorous or reliable enough to look for trends, except to notice that faces and hands and hearts are consistently the most popular categories.