On “Geek” Versus “Nerd”

To many people, “geek” and “nerd” are synonyms, but in fact they are a little different. Consider the phrase “sports geek” — an occasional substitute for “jock” and perhaps the arch-rival of a “nerd” in high-school folklore. If “geek” and “nerd” are synonyms, then “sports geek” might be an oxymoron. (Furthermore, “sports nerd” either doesn’t compute or means something else.)

In my mind, “geek” and “nerd” are related, but capture different dimensions of an intense dedication to a subject:

  • geek – An enthusiast of a particular topic or field. Geeks are “collection” oriented, gathering facts and mementos related to their subject of interest. They are obsessed with the newest, coolest, trendiest things that their subject has to offer.
  • nerd – A studious intellectual, although again of a particular topic or field. Nerds are “achievement” oriented, and focus their efforts on acquiring knowledge and skill over trivia and memorabilia.

Or, to put it pictorially à la The Simpsons:
[Image: geeknerd-simpsons]

Both are dedicated to their subjects, and sometimes socially awkward. The distinction is that geeks are fans of their subjects, and nerds are practitioners of them. A computer geek might read Wired and tap the Silicon Valley rumor-mill for leads on the next hot-new-thing, while a computer nerd might read CLRS and keep an eye out for clever new ways of applying Dijkstra’s algorithm. Note that, while not synonyms, they are not necessarily distinct either: many geeks are also nerds (and vice versa).

An Experiment

Do I have any evidence for this contrast? (By the way, this viewpoint dates back to a grad-school conversation with fellow geek/nerd Bryan Barnes, now a physicist at NIST.) The Wiktionary entries for “geek” and “nerd” lend some credence to my position, but I’d like something a bit more empirical…

“You shall know a word by the company it keeps” ~ J.R. Firth (1957)

To characterize the similarities and differences between “geek” and “nerd,” maybe we can find the other words that tend to keep them company, and see if these linguistic companions support my point of view?

Data and Method

(Note: If you’re neither a geek nor a nerd, don’t be scared by the math. It’s not too bad… or you can probably just skip to the “Results” subsection below…)

I analyzed two sources of Twitter data, since it’s readily available and pretty geeky/nerdy to boot. This includes a background corpus of 2.6 million tweets via the streaming API from between December 6, 2012, and January 3, 2013. I also sampled tweets via the search API matching the query terms “geek” and “nerd” during the same time period (38.8k and 30.6k total, respectively). Yes, yes, yes… I collected all the data six months ago but just now got around to crunching the numbers. It’s been a busy year!

A great little statistic for measuring how much company two words tend to keep is pointwise mutual information (PMI). It’s commonly used in the information retrieval literature to measure the co-occurrence of words and phrases in text, and it also turns out to be a good predictor of how humans evaluate semantic word similarity (Recchia & Jones, 2009) and topic model quality (Newman et al., 2010).

For two words w and v, the PMI is given by:

\mathrm{pmi}(w;v) = \log\frac{p(w,v)}{p(w)\,p(v)} = \log p(w \mid v) - \log p(w),

where in this case p(\cdot) is the probability of the word(s) in question appearing in a random tweet, as estimated from the data. For instance, if we let v = “geek,” we compute the log-probability of a word w in the “geek” search corpus, and subtract the log-probability of w in the background corpus.
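
Here is a minimal sketch of that calculation in Python. It is not the software used for this post: the function names, the whitespace tokenization, the natural log, and the tiny made-up tweets are all assumptions for illustration.

```python
import math
from collections import Counter


def tweet_probabilities(tweets):
    """Estimate p(w) as the fraction of tweets containing word w at least once."""
    counts = Counter()
    for tweet in tweets:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        counts.update(set(tweet.lower().split()))
    total = len(tweets)
    return {word: count / total for word, count in counts.items()}


def pmi_scores(search_tweets, background_tweets):
    """Compute log p(w|v) - log p(w) for every word seen in both corpora."""
    p_given_v = tweet_probabilities(search_tweets)      # e.g. the "geek" search corpus
    p_background = tweet_probabilities(background_tweets)
    return {
        word: math.log(p_given_v[word]) - math.log(p_background[word])
        for word in p_given_v
        if word in p_background  # words missing from the background are skipped
    }


# Hypothetical toy corpora, not real Twitter data.
background = ["i love pizza", "new gadget today", "pizza for dinner", "reading a book"]
geek_search = ["geek gadget haul", "such a gadget geek"]

scores = pmi_scores(geek_search, background)
print(scores["gadget"])
```

In this toy data, “gadget” shows up in every search tweet but in only a quarter of the background tweets, so its score is log 4. Words that never occur in the background corpus are simply dropped here; a real analysis would need some rule for them.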

Results

The PMI statistic measures a kind of correlation: a positive PMI score for two words means they “keep great company,” a negative score means they tend to keep their distance, and a score close to zero means they bump into each other more or less at random.

With that in mind, here is a scatterplot of various words according to their PMI scores for both “geek” and “nerd” on different axes (ignoring words with negative PMI, and treating #hashtags as distinct):
[Image: geeknerd-plot-01]

Many people have asked for a high-res PDF of this plot, so here you go.

Moving up the vertical axis, words become more geeky (“#music” → “#gadget” → “#cosplay”), and moving left to right they become more nerdy (“education” → “grammar” → “neuroscience”). Words along the diagonal are similarly geeky and nerdy, including social (“#awkward”, “weirdo”), mainstream tech (“#computers”, “#microsoft”), and sci-fi/fantasy terms (“doctorwho,” “#thehobbit”). Words in the lower-left (“chores,” “vegetables,” “boobies”) aren’t really associated with either, while those in the upper-right (“#avengers”, “#gamer”, “#glasses”) are strongly tied to both. Orange words are more geeky than nerdy, and blue words are the opposite. Some observations:

  • Collections are geeky. All derivatives of the word “collect” (“collection,” “collectables”, etc.) are orange. As are “boxset” and “#original,” which imply a taste for completeness and authenticity.
  • Academic fields are nerdy: “math,” “#history,” “physics,” “biology,” “neuroscience,” “biochemistry,” etc. Other academic words (“thesis”, “#studymode”) and institutions (“harvard”, “oxford”) are also blue.
  • The science & technology words differ. General terms (“#computers,” “#bigdata”) are on the diagonal — similarly geeky and nerdy. As you splay up toward more geeky, though, you see products, startups, brands, and more cultish technologies (“#apple”, “#linux”). As you splay down toward more nerdy you see more methodologies (“calculus”).
  • #Hashtags are geeky. OK, sure, hashtags are all over the place. But they do tend toward the upper-left. And since hashtags are “#trendy,” I take it to mean that geeks are into trends. (I take this one back. The average PMI score for all hashtags is 0.74 with “geek” but 0.73 with “nerd.” The difference isn’t statistically significant using a paired t-test or Wilcoxon test, or practically significant using a common-sense test.)
  • Hobbies: compare the more geeky pastimes (“#toys,” “#manga”) with the more nerdy ones (“chess,” “sudoku”).
  • Brains: the word “intelligence” may be geeky, but “education,” “intellectual,” and “#smartypants” are nerdy.
  • Reading: “#books” are nerdy, but “ebooks” and “ibooks” are geeky.
  • Pop culture vs. high culture: “#shiny” and “#trendy” are super-geeky, but (curiously) “cellist” is the nerdiest.

The list goes on. If you want to poke around yourself, download the raw PMI scores (4.2 MB) and let me know in the comments what you find. Since many people have asked: I computed PMI for all words appearing in the search tweets with “geek” and “nerd” (millions) and then manually scanned roughly 7,500 words with positive PMI scores for both. The scatterplot contains about 300 words that I hand-picked because they made sense.
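
The automated part of that selection, keeping words with positive PMI for both terms and coloring each by its stronger side, might look roughly like the sketch below. The function name, the “diagonal” label for exact ties, and the scores are hypothetical; the final hand-picking was manual and is not modeled.

```python
def pick_candidates(geek_pmi, nerd_pmi):
    """Keep words with positive PMI for both terms and tag the stronger side."""
    candidates = {}
    # Only words scored against both "geek" and "nerd" can be compared.
    for word in geek_pmi.keys() & nerd_pmi.keys():
        g, n = geek_pmi[word], nerd_pmi[word]
        if g <= 0 or n <= 0:
            continue  # negative or zero PMI on either axis is ignored
        if g > n:
            candidates[word] = "orange"    # more geeky than nerdy
        elif n > g:
            candidates[word] = "blue"      # more nerdy than geeky
        else:
            candidates[word] = "diagonal"  # assumption: label for exact ties
    return candidates


# Made-up scores for illustration only, not values from the raw PMI file.
geek_pmi = {"#cosplay": 2.0, "calculus": 0.5, "chores": -0.3}
nerd_pmi = {"#cosplay": 0.4, "calculus": 1.8, "chores": -0.1}

selected = pick_candidates(geek_pmi, nerd_pmi)
print(selected)
```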

(Update: I learned that Olivia Culpo — a self-described “cellist nerd” — was crowned Miss Universe on December 20, 2012. The event was heavily tweeted smack in the middle of my data collection, so that probably explains the correlation between “cellist” and “nerd” here. It also underscores the limitations of time-sensitive data.)

Conclusion

In broad strokes, it seems to me that geeky words are more about stuff (e.g., “#stuff”), while nerdy words are more about ideas (e.g., “hypothesis”). Geeks are fans, and fans collect stuff; nerds are practitioners, and practitioners play with ideas. Of course, geeks can collect ideas and nerds play with stuff, too. Plus, they aren’t two distinct personalities as much as different aspects of personality. Generally, the data seem to affirm my thinking.

I wonder how similar the results would be if you applied this method to the Google Books Ngrams corpus, or something more general instead of a niche medium like Twitter. I also wonder what other questions might be answered with this kind of analysis (for example, my wife and I have a perennial disagreement over which word is wetter: “moist” or “damp”).

Finally, when I mentioned to a friend that I was going to write up this post, she said “Well, I guess we know which one you are.” But do we really? I may be a science nerd, but I’m probably a music geek.

Update (June 25, 2013): Woah. This has gotten more attention than I ever anticipated. A few impressions. (1) Prior to writing this, I had no idea there was a “geek vs. nerd” holy war in certain corners of the Internet; fueling these flamewars was certainly not my intent. Lighten up! (2) I fear I’ll be better known for this diversion than for any of my “real” research. To be clear: this was a fun way to kill a few hours on a Saturday afternoon, not necessarily my best science. I think the writeup here is sound and self-evident, but I’m the first to acknowledge that there are better corpora, methods, and analysis techniques — which could use a grant, grad student, and/or more than an afternoon — for uncovering this all-important “Truth.” (3) For those interested in the etymologies of “geek” and “nerd,” I found this cool writeup.

310 Comments

  1. Nick says:

    nice! Just curious, what software did you use to process the data?

    1. burrsettles says:

      I wrote my own software to compute PMI. I made the scatterplot in R.

      1. andrew says:

        So #nerdy. So #geeky. So #lovely.

      2. Ritu says:

        Nice work!

  2. That’s great! Well done =)

  3. Doug K says:

    beautiful, thank you..

    moist is definitely wetter than damp. IMO.

    1. Otto B. says:

      Agreed, . . . examples: moist – coarser, distinguishable water droplets as on the outside of an ice-filled glass once ice begins to melt; damp – fog/steam/vapor, i.e. finer, indistinguishable individual water droplets . . . my take on it anyway!

    2. Ceri K. says:

      I concur.

      Also, #fuckyeahdata

  4. Awesome post, man! thank you

  5. I love this. This was the main debate in high school between my friends. 😛

  6. Rigarashi says:

    Very interesting and thought-provoking. Thank you. Never really thought about what the differences are between geek and nerd. Your post is an eye-opener that synonyms can be statistically analyzed like this.

  7. Pirtom Lubis says:

    Nice post, nerd.

  8. Ian Sober says:

    Nice, but there is a problem with your analysis if the number of occurrences of a word is low. Then your log-odds can become high even though there is in fact no interesting association between the word in question and either geek or nerd. This could explain the curious ‘cellist’, which is not nerdy at all… You could filter out things that appear too few times, or, somewhat nerdier, add pseudo counts to word occurrences.

    1. Cellist isn’t nerdy? I beg to differ. I think the cello is among the nerdiest of classical instruments – perhaps only bested by the harp.

    2. MP says:

      It would be interesting to see a variant of the analysis that, to your point, shows relative frequency information, perhaps in this visualization it could be achieved by word size or transparency (e.g. low frequency = more transparent text in the graph).

      I suspect the cellist-nerd pair would still show up due to something occurring right in the middle of the time period those tweets cover:

      http://www.google.com/search?q=Cellist+Nerd+Olivia+Culpo

      Adding a time distribution element to the results would uncover the persistence or transience/emergence of each correlation, so you’d be able to identify the sudden emergence of the cellist nerd on Dec 20th.

      To the author of the blog, this was a very cool idea and analysis.

      1. burrsettles says:

        Nice find! I’m pretty sure that the cellist-nerd correlation has something to do with this event, then. And yes, there are all kinds of interesting variants one can imagine… among other things, to overcome the problems of analyzing such a short time window.

    3. Yuen King Ho says:

      You may also consider using a Bayesian estimator with an uninformative prior. It will pull the estimator towards 0.5 for those words with low occurrences.

  9. Mansi Gandhi says:

    Woah!! This is brilliant! I can totally relate to it when you talk about the drastic difference between geek and nerd. Well, I am kind of both because I am a statistician by profession and also a tennis player! So, I can go from being a total nerd from 9 a.m. to 5 p.m. to being a sports geek on the tennis court! At times I am amazed myself at the stark contrast in my personality at work and on the field. Kudos to a great post and congrats on getting freshly pressed.

    1. Mansi Gandhi says:

      Did the graph come from R? What package would you use for that plot?

      1. burrsettles says:

        Yes, that’s an R plot. I used the textxy() function in the calibrate library.

      2. Mansi Gandhi says:

        Okay, that’s awesome! I would trust R to come up with a cool graph like that. SAS graphs can be seriously ugly at times!

  10. E Josiah Lutton says:

    Interesting study, I’m glad I can now use the terms in a well defined manner.

    I have to say I’d question the reliability of your data. Intermittent trends could cause correlations to increase for a short period. Since internet trends are generally quite short-lived, I’m not sure how relevant this would be to your data, but sampling over a longer time-period would correct for this.

    I’d be interested to see this kind of sampling done with other words too, it could actually throw up all sorts of things, such as changes in perception of words, how words are changing in meaning, evolution of new words etc.

  11. Perhaps, gathering the statistics and formulating them, and finding a display for them, categorizes you as a geek, but the impetus, and the probable enjoyment of the results, marks you as a nerd. LOL Great piece of work (and now I’m not sure which one I am)….

  12. Nice article. I’m a clinical human factors engineer by trade, a data visualization enthusiast by design, and an avid sci-fi/music/Ameritrash boardgame geek by coincidence. Generally, the term “nerd” applies to the sciences that interest me at work, while the term “geek” applies to the sciences that interest me at home. P.S. Geeks around here tend to be drawn to R, nerds to Stata. 😉

  13. ‘moist’ and ‘damp’ are often used in slightly different contexts, but they still occur together frequently enough on the Web for us to determine significant differences in intensity. We recently wrote a paper about how this can be done: http://www.demelo.org/gdm/intensity/.

    In this case, it seems that “damp” is more frequently thought of as stronger.

  14. aerislair says:

    Ingenious! You are truly a nerd, sir! 🙂

  15. Anja Jones says:

    Brilliant – nice work 🙂

  16. Anja Jones says:

    What tool/software did you use for this analysis?

  17. Stewart says:

    I wouldn’t say Apple is a cultish technology. It’s about as mainstream as you can get!

    1. burrsettles says:

      While that’s true, I meant “cultish technology” in the same way you refer to a “cult movie” like The Princess Bride or The Big Lebowski. Not that it’s on the fringes, but that it has a cult following within and beyond the mainstream.

  18. Lucy says:

    As a Librarian who deals with collections of ideas (books both print and e), it seems I might be the exact cross-over between geek and nerd.

  19. Pipeta says:

    Just great, now I don’t know what I am!

  20. Nice post! Would really really like to see a post about the steps you took to do this (or a link to somewhere that explains it).

  21. zyxo says:

    Do you consider yourself more nerdy than geeky?

  22. Geeks stuff their closets. Nerds stuff their skulls.

  23. revdrjon says:

    Possibly now requires the comparative inclusion of data on “dork”… ;}P>

    1. burrsettles says:

      Check the raw data. 🙂

  24. Pooya says:

    Wonderful job. Really original.
    You used your own software (geeky) to do a scientific (nerdy) study. Which category are you in then? Are you on the diagonal? 😉

  25. Just astonishingly remarkable how you shed light on these two words that have always beat the world. I can’t exactly place myself, i.e., as a nerd or a geek, but one thing is certain…everybody has got both a nerd and a geek in them! Thanks, pal

  26. Fantastic article. Very interesting and confirms that I am the geek I thought I was.

  27. bookwurm15 says:

    Haha, my friends and I argue about this all the time, this helps bunches!

  28. shazfall says:

    I feel like I should change my dissertation paper to this…

  29. Lena says:

    I totally agree with your statement regarding geeks: they are early adopters & innovators! Awesome article! Thnx!

  30. geeky_nerd says:

    In my impression the term “nerd” has changed its meaning in the last decades from something rather negative to something that’s almost mainstream. That’s why I’m not sure what people are referring to when they use the term “nerd”. I’ll probably link to this article in the future 🙂

    In contrast, “geek” seems to be a semantically more stable word.

    Relevant: http://www.quickmeme.com/Idiot-Nerd-Girl/

  31. Andrew Durso says:

    It seems to me that a “sports nerd” might describe someone who obsesses over sports scores and statistics but does not actually participate in sports. I’m not one myself, so I don’t have a lot of insight into the psychology, but it was the first thing I thought of.

  32. ShinyGeek says:

    Come to the geek side. We have shiny.

  33. muchomango13 says:

    This was great! I found it particularly enjoyable because I go to a school for nerds & it’s nice to clear the air between what a geek and a nerd actually stand for.

  34. I can’t believe you didn’t quote Douglas Coupland on the subject — the difference is in employability.

  35. scottleibrand says:

    Nice data and analysis.

    I disagree with the primary conclusion that geekiness is about collecting stuff, though. (Anecdotally, I don’t collect anything, and I’m definitely a geek about a lot of things.) The collection angle may still be true for some kinds of geeks, but the key distinction I see (in this data and more generally) is that geekiness is associated with technology and an outward-looking focus. Nerdiness is associated with an inward-looking focus on a usually academic topic.

  36. Written by true nerd! Sweet. 🙂

  37. Chris Gray says:

    My standard reference for the difference between geek and nerd is the comic strip Cat and Girl: http://catandgirl.com/?p=1341

    You get a definition of dork at the same time.

  38. carlisdm says:

    Nice post! I knew there was a difference between those two terms.. but believe it or not it’s very easy to be both of them, I consider myself geek and nerd… one with more strength than the other but still the two…

  39. barry says:

    Well, to my mind a geek is a SPECIALIST – having an obsessive special interest in one or two topics (such as Spiderman comics of the late 60s) – while a nerd is a GENERALIST – trying to be pedantically knowledgeable about anything and everything.

  40. Eve says:

    Fun Fact: the only two names of people in the plot are “Wheaton” and “Whedon.” I kinda hope this wasn’t intentional, because that would be much more hilarious. 🙂

    How did you choose the words in the plot?

  41. Found you in the “Freshly Pressed” email. I rarely click on any of those links, but I’m so glad I read this. This should go viral. I am comforted that one can be styled a geek/nerd. I have my own dilettante/professional struggle. Maybe I should pray, “Grant me the geekiness to enjoy the things I cannot master, the nerdiness to persist with the things I can, and the patience to tolerate the difference.”

  42. Woohoo! Like #500! So, if nerds rule the world, “be nice to nerds, they’ll probably become your boss” and all that, what of geeks?

  43. bruno martin says:

    Very interesting ! It would be great to keep this list of words and compute the google distance instead to see if it corroborates your results.
    http://en.wikipedia.org/wiki/Normalized_Google_distance
    If you have the list of words, I will be happy to contribute to your investigations.

  44. Great article, it clears a long standing argument. Especially love the graph!

  45. And nerds often put a lot of effort and time into use- or/and sense-less endeavors – e.g. a science nerd. :-p

  46. fcimeson says:

    Loved it. This also fit with my intuitive definitions.

    Q: Do you use python as well or just R?

  47. ToNYC says:

    This could have been written by a geek acting like a nerd to buy more word salad ingredients.

  48. First of all, totally nerdy that you used .png for the graphic: I love it!

    Next, I am clearly more of a nerd, but I want to be a geek, at least where books are concerned. I am definitely a music geek (I have a pretty huge record collection), but have longed to become a sci-fi fantasy geek. Sadly I’m too busy being a nerd about science and other topics to spend much time geeking.

    I’m totally sharing this at lab meeting.

  49. I love this scatter plot. I’m going to save it and study it. That’s how geeky/nerdy I am.
