A Web Journal about Machine Learning, Music, and other Mischief

Rebuilding A Vintage Tube Amplifier

Since I work mainly with digital data and software, it’s nice to have a “material” hobby that lets me get my hands dirty. That’s why I’ve been building my own guitar effects recently (but more about that in a later post…). For now, consider that the only guitar amplifier I’ve had for ages is my Peavey Classic 30 (ca. 2000, back when they were still made in the USA). It sounds great on stage, but packs too much of a punch for testing out circuits on a workbench. So I decided to build myself a small, low-power DIY practice amp. I was hitting up some thrift/antique stores one weekend in search of a lunchbox or small suitcase to build the amp into… when I came across this instead:


It’s a Harmony H303A, which I’d never seen before, though it seems to get good reviews and goes for $150-200 on eBay when listed (which is rare). The vintage appears to be 1950s or so, and online sources claim it outputs somewhere in the 2-5 watt range… too quiet for gigging but good for practicing and recording. A dealer in an antique mall had it in his collection and was really hot to move it for some reason. The price tag originally said “$50 for that sweet tube sound!” and when he went down to $25 I figured, “even if it doesn’t work, it’ll make a nice cabinet for the little solid-state practice amp I was going to build in the first place.” As it turns out, the Harmony does work… although it took a little effort to whip it into shape. Here’s a video showing the fruits of my labor:


Most Livable Cities: A Meta-Analysis

Every few weeks, my Facebook newsfeed throws me an article like “Most Livable Cities” or “Best Cities for Quality of Life” or “Happiest and Unhappiest U.S. Cities” or somesuch. These rankings are generally quite different (though with a few common themes), and often include — in the top ten or so — the home city of whoever shared the link with their fellow facefriends.

The rankings vary widely in source, methodology, and credibility (for that matter, even in use of supporting data). So I was curious to do a sort of meta-analysis, combining these lists in a reasonable way to see (1) what cities are most livable by consensus, and (2) what social/demographic indicators seem to make them that way. Here are the main things I learned:

  • The livability of a city isn’t related to the happiness of its people.
  • Livability rankings come in two types, which I call Chill Rankings and Jetsetter Rankings. The statistical models of livability that they produce are totally different, and there is no overlap in their top ten cities.
  • A few cities crack the top 25 for both types, though, suggesting a more balanced lifestyle: Washington DC, Boston, San Francisco, Pittsburgh, Minneapolis, Seattle, Buffalo, Honolulu, Portland, and Houston.
  • Surprisingly, cost of living and pollution have little relationship with livability in either type of ranking.

The Experiment

In full disclosure, this was originally just an excuse to play around with structural equation modeling (more below). But I also wanted to take an inductive approach to livability — to blenderize all these contradictory lists and try to learn something from them. The typical approach seems wantonly deductive to me — e.g., rank cities by average rent, household income, commute time, and violent crimes per capita using census data, and then sum those rankings into an overall score. Never mind that rent and income are strongly correlated (and effectively double-counted), or that crime should maybe count for more than commute time. Some of these rankings come from studies by scientists who know how to deal with these complexities, but many are compiled by journalists (or their interns) based on intuition alone.

Structural Equation Models (SEMs)

I wanted to design the meta-analysis around SEMs, which I discovered from papers on social influence in online communities. SEMs were first articulated by Sewall Wright, a geneticist at my alma mater UW-Madison… and apparently a frequent dinner guest at my band’s singer’s mother’s childhood home. (Small world!) But my training as a computer scientist never covered SEMs in school or any other formal setting, so I’ve been eager to learn more and find an interesting application.

SEMs are graphical models with the sweet ability to create hypothetical latent variables and uncover statistical relationships between them. For example, there isn’t really a way to measure livability, precisely, but we can tap into so-called “manifest variables” — like these Facebook-flung livability lists — to create a “latent construct” that summarizes them all. The idea is that this construct represents the “true but hidden” livability measure, and all these rankings are “symptomatic” manifestations of the underlying scale. The same can be done for cost of living (a construct of the average rent, price of gas, cost of a slice of pizza, etc.) or education (a construct of the percentage of residents with various degrees). We can also essentially perform a regression analysis to see how constructs like cost of living and education might influence livability.
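Fitting a real SEM takes dedicated software, but the “latent construct” idea can be sketched with a one-factor toy model: generate a hidden livability score, derive several noisy “ranking lists” from it, and recover the construct as the first principal component of the standardized lists. This is a simplification on made-up data (a poor man’s factor analysis), not the actual model used in this post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 30 cities scored by 5 different "livability" lists.
# Each list is a noisy reflection of the same hidden (latent) quality.
n_cities, n_lists = 30, 5
latent = rng.normal(size=n_cities)                 # the "true but hidden" livability
manifest = np.column_stack([
    latent + rng.normal(scale=0.5, size=n_cities)  # each list = latent + noise
    for _ in range(n_lists)
])

# Standardize each list, then take the first principal component as a
# one-factor estimate of the latent construct (a crude measurement model).
z = (manifest - manifest.mean(axis=0)) / manifest.std(axis=0)
cov = np.cov(z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
scores = z @ eigvecs[:, -1]             # project onto the top component

# The recovered construct should correlate strongly with the true latent scale.
r = abs(np.corrcoef(scores, latent)[0, 1])
print(f"correlation with true latent livability: {r:.2f}")
```

With five lists and moderate noise, the recovered construct tracks the hidden scale closely, which is exactly the intuition behind using many imperfect rankings as manifest variables.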

Model and Data

The figure below illustrates my basic model, which incorporates a lot of the general assumptions about what influences livability:
[Figure: an example SEM]

Encoding Human Thought Processes into a Computer

One of my favorite characters in William Gibson’s Neuromancer was a so-called “psychological construct” named The Dixie Flatline. Dixie wasn’t a person, really, but an emulation of a famous computer hacker named McCoy Pauley (based on a brain scan that was made before he died). As he — or, it — said in a conversation with the novel’s protagonist Henry Case:

“Me, I’m not human … but I respond like one, see? … But I’m really just a bunch of ROM. It’s one of them, ah, philosophical questions, I guess….” The ugly laughter sensation rattled down Case’s spine. “But I ain’t likely to write you no poem, if you follow me.”

The Flatline was neither a human nor an artificial intelligence, but a machine that partially emulated how a human thought. It did a pretty good job, too, playing the central role of “smart guy” in the novel’s main cyberpunk-heist plotline. Yet it wasn’t a perfect human emulation: its laugh was “wrong,” and it was self-aware enough to note its own lack of creativity. Turning its ROM disk off and back on again totally reset Dixie’s memory, and later in the story the villain tried to take out Case first (still alive and human) precisely because the Flatline was a machine, and therefore much more predictable.

Cognitive Models and Their Uses

Regardless, it’s pretty cool to think about what we can accomplish with computational cognitive models derived using real data from real people. In Neuromancer, the data was McCoy Pauley’s brain scan, which was modeled and encoded into a computer program called The Dixie Flatline. The model wasn’t quite right, but was still useful. All that is science fiction of course, but we are making progress in the real world, too. There are both practical and theoretical uses for these kinds of models, such as:

  • “Encoding” a human thought process into a computer. It’s hard to “teach” computers directly. Most machine learning algorithms learn by example (i.e., observational data) but there aren’t great ways for people to inject their instincts about a problem into the machine. If we have a good cognitive model that captures properties of our thinking, though, we can perhaps encode that more directly into a learning algorithm.
  • Understanding how people think. If a computational model predicts real human behavior pretty well, then there’s a chance that it captures something real about how we think. And if its parameters are easily interpretable, we can gain insight into how our brains work, too.

With these in mind, let me summarize a recent collaboration with fellow computer/cognitive scientists at my alma mater UW-Madison. Here, the data consist of word lists that people think up, which we model computationally for both the practical and theoretical uses mentioned above. In fact, the paper is being presented at the ICML 2013 conference this week in Atlanta. We made a short video overview of the research, too:

That’s mostly me talking in the video, but Kwang-Sung will present it at the conference. The paper itself is here:

K.S. Jun, X. Zhu, B. Settles, and T.T. Rogers. Learning from Human-Generated Lists. Proceedings of the International Conference on Machine Learning (ICML), pages 181–189, 2013.


On “Geek” Versus “Nerd”

To many people, “geek” and “nerd” are synonyms, but in fact they are a little different. Consider the phrase “sports geek” — an occasional substitute for “jock” and perhaps the arch-rival of a “nerd” in high-school folklore. If “geek” and “nerd” are synonyms, then “sports geek” might be an oxymoron. (Furthermore, “sports nerd” either doesn’t compute or means something else.)

In my mind, “geek” and “nerd” are related, but capture different dimensions of an intense dedication to a subject:

  • geek – An enthusiast of a particular topic or field. Geeks are “collection” oriented, gathering facts and mementos related to their subject of interest. They are obsessed with the newest, coolest, trendiest things that their subject has to offer.
  • nerd – A studious intellectual, although again of a particular topic or field. Nerds are “achievement” oriented, and focus their efforts on acquiring knowledge and skill over trivia and memorabilia.

Or, to put it pictorially à la The Simpsons:

Both are dedicated to their subjects, and sometimes socially awkward. The distinction is that geeks are fans of their subjects, and nerds are practitioners of them. A computer geek might read Wired and tap the Silicon Valley rumor-mill for leads on the next hot-new-thing, while a computer nerd might read CLRS and keep an eye out for clever new ways of applying Dijkstra’s algorithm. Note that, while not synonyms, they are not necessarily distinct either: many geeks are also nerds (and vice versa).

An Experiment

Do I have any evidence for this contrast? (By the way, this viewpoint dates back to a grad-school conversation with fellow geek/nerd Bryan Barnes, now a physicist at NIST.) The Wiktionary entries for “geek” and “nerd” lend some credence to my position, but I’d like something a bit more empirical…

“You shall know a word by the company it keeps” ~ J.R. Firth (1957)

To characterize the similarities and differences between “geek” and “nerd,” maybe we can find the other words that tend to keep them company, and see if these linguistic companions support my point of view?
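One standard way to operationalize Firth’s dictum is pointwise mutual information (PMI) over co-occurrence counts: words that co-occur with “geek” or “nerd” more often than chance predicts are their “company.” Below is a minimal sketch on a made-up six-sentence corpus; the full post’s actual data and method may differ:

```python
import math
from collections import Counter
from itertools import combinations

# A tiny made-up corpus; a real analysis would use millions of tweets or
# web sentences containing "geek" and "nerd".
corpus = [
    "the gadget geek bought the newest toy",
    "the computer nerd studied the algorithm",
    "a trivia geek collects facts and toys",
    "a math nerd proved the theorem",
    "the geek loves the newest gadget",
    "the nerd loves the algorithm",
]

word_counts = Counter()
pair_counts = Counter()
for sentence in corpus:
    words = set(sentence.split())  # document-level co-occurrence
    word_counts.update(words)
    pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))

n = len(corpus)

def pmi(w1, w2):
    """Pointwise mutual information of two words co-occurring in a sentence."""
    joint = pair_counts[frozenset((w1, w2))] / n
    if joint == 0:
        return float("-inf")
    return math.log2(joint / ((word_counts[w1] / n) * (word_counts[w2] / n)))

# Which words keep company with "geek" vs. "nerd" in this toy corpus?
print("PMI(geek, newest):", round(pmi("geek", "newest"), 2))
print("PMI(nerd, algorithm):", round(pmi("nerd", "algorithm"), 2))
```

A positive PMI means the pair co-occurs more than independence would predict; ranking all words by their PMI with “geek” and with “nerd” gives each term’s characteristic companions.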

Machine Learning and Social Science: Taking the Best of Both Worlds (A Case Study)

Machine learning and social science are converging, since both are hot to answer questions and challenges raised by vast modern social data sets. The more I talk to and work with social scientists, the more I realize that we use the same basic statistical tools in our research (e.g., linear or logistic regression), but in very different ways. Here are the fundamental differences in how the two camps approach things, I think (broadly speaking):

  • Social scientists (e.g., psychologists, sociologists, economists) tend to start with a hypothesis, and then design experiments — or find observational data sets — to test that hypothesis. I think of this as a deductive, top-down, or theory-driven approach.
  • Computer scientists (i.e., the machine learning and data mining communities) tend to “let the data speak for itself,” by throwing algorithms at the problem and seeing what sticks. I think of this as an inductive, bottom-up, or data-driven approach.

Both approaches have their uses (and their pitfalls). Theory-driven research is probably better for advancing scientific knowledge: the models may not predict the future very well, but they can shed light on causes and effects, or confirm/deny hypotheses. Data-driven research is often more practical: we have great spam filters and recommender systems today as a result, but the best methods are usually “black boxes” that perform well without providing much insight. Ideally, we would like sophisticated methods that can make accurate predictions and tell us something about the world.

In this post, I’ll argue for (1) a hybrid inductive + deductive research approach and (2) a specific algorithm called path-based regression, both of which help push us toward this unified vision, I think. These perspectives grew out of a recent “machine learning meets social science” project of mine to try to explain and predict how creative collaborations form in an online music community.

(A note to self-identified statisticians: I’m not blatantly ignoring you, I just don’t quite know which camp you fall into. Perhaps it depends on whether you’re more motivated by inference or prediction. I suspect, though, that good statisticians are the unicorns who already know everything I have to say here…)

Understanding and Predicting Online Creative Collaborations

Mere days ago, I launched the tenth iteration of February Album Writing Month (FAWM). FAWM is a music project I started during grad school with a few friends, the goal being to write an album in a month: “14 songs in 28 days.” Recently, it has become a bit of a research project, too, since I have collected a rich data set over the years about individuals’ online interactions and musical productivity. Last fall, I teamed up with Steven Dow from Carnegie Mellon’s Social Computing Group to look into how collaborative songwriting projects form and succeed in FAWM. We’ll present it at the CHI 2013 conference in a few months… and since CHI required us to make a promo video (sheesh), here is a 30-second overview:

The paper itself is available here:

B. Settles and S. Dow. Let’s Get Together: The Formation and Success of Online Creative Collaborations. Proceedings of the Conference on Human Factors in Computing Systems (CHI). ACM, 2013.


Duolingo’s Data-Driven Approach to Education

This blog lay fallow for the past several months. Despite the title, I haven’t been slacking… I’ve simply been busy with a new job, and wrapping up another project (which I will blog about soon). Until then, here is a link to a brief description of what I’ve been working on at Duolingo.

Machine Learning and Personality Type

Here are some thoughts on statistical approaches for pinpointing personality types. Text analysis and crowdsourcing FTW!


I recently discovered Typealyzer, a service that analyzes a web page and tries to determine the author’s personality type, in terms of Myers-Briggs Type Indicators. I’m not sure what kind of classifier it uses, but it’s apparently built on uClassify’s API and trained using psychographic text data gathered by Mattias Östmar. It determines each of the four dimensions independently, presumably using a “bag of words” document model.

In both formal and informal tests, I have always scored INTP (introverted, intuitive, thinking, perceiver) since high school. So I was curious what Typealyzer would make of my writing. I have a web presence in multiple public places, so I decided to try a few of them. Here are the results:

So one might conclude that I’m either an INTJ (“Mastermind”) with 2/5 = 40% probability, or marginalize over the four dimensions independently and say that I am in fact an INTP (“Architect”) with 17.3% probability (INTJ comes in second at 11.5%). A Bayesian might put stronger priors on my personal and academic pages, or priors based on population distribution, or use model confidences (which Typealyzer didn’t provide). At any rate, I’m either an INTJ or INTP, and probably the latter.
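The marginalization arithmetic is easy to reproduce. The five per-page predictions appeared in an image that isn’t recoverable here, so the set below is hypothetical, chosen only because it is consistent with the 40%, 17.3%, and 11.5% figures quoted above:

```python
from collections import Counter
from math import prod

# Hypothetical per-page Typealyzer predictions, one per web page (the real
# five results were shown in an image; these merely reproduce the quoted
# percentages).
predictions = ["INTJ", "INTJ", "INTP", "ENFP", "ESFP"]
n = len(predictions)

def p_type(t):
    """Treat each Myers-Briggs dimension independently: multiply the marginal
    frequencies of each letter across the five predictions."""
    return prod(
        sum(p[i] == t[i] for p in predictions) / n
        for i in range(4)
    )

print(f"P(INTP) = {p_type('INTP'):.1%}")  # marginalized over dimensions
print(f"P(INTJ) = {p_type('INTJ'):.1%}")
print(f"raw vote for INTJ = {Counter(predictions)['INTJ'] / n:.0%}")
```

The raw vote favors INTJ (it is the only full type that appears twice), but multiplying the per-dimension marginals flips the answer to INTP, because the P letter edges out J across the five predictions.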

(Update: I checked Typealyzer immediately after posting this, and it revised its prediction for this blog to INTP.)

The Enneagram

All that reminded me of a little “breakfast experiment” I did a while ago to help me determine my Enneagram type. The Enneagram is not nearly as popular as Myers-Briggs, but I find it more useful for being self-aware about bad habits or unhealthy tendencies. Without going into too much detail, there are nine basic types:


Each type also has two adjacent wings, and people have one of three instinctual variants, which allows for a total of 9×2×3 = 54 personality types! But I’m only concerned about two of the nine basic types shown above: Five (“Investigator”) and Nine (“Peacemaker”).

I’ve always tested as a Five, but with Nine in second place, which is kind of weird since they share very little in common according to the theory. I always assumed I really was a Five since I feel more like an investigator than a peacemaker, plus type Five is correlated with both INTP and INTJ in Myers-Briggs-land (cf. this study). But a little over a year ago I was going through a period of major personal stress, and my friend Charles (who introduced me to the Enneagram) suggested that I might be a Nine instead of a Five based on how I was responding to the situation(s). He said that he had recently revised his own type, and pointed me to an article by the Enneagram Institute arguing that educated male Nines tend to think they are Fives:

Despite their similarities, the main point of confusion for Nines arises around the notion of “thinking.” Nines think they are Fives because they think they have profound ideas: therefore, they must be Fives.

Part of the problem stems from the fact that individuals of both types can be highly intelligent…. Although intelligence can be manifested in different ways, being intelligent does not make Nines intellectuals, just as thinking does not make them thinkers.

They also claim that the Nine-to-Five (teehee!) misclassification is the most common… although it rarely happens the other way around. So I read up on both types, but they both felt like they described me in different ways. I re-took some tests, and Five still came out on top with Nine close behind.

A Crowdsourcing Experiment

So (of course) I decided to build a classifier. First, I collected 40 first-person statements that supposedly characterize either Fives or Nines, lightly edited them for stylistic consistency, shuffled the order (to reduce presentation bias), and emailed the list to 28 close friends and family. I asked them to reply with all the statements they thought describe me, and delete the ones that do not. In a sense, “crowdsourcing” my personality description.

Then I built a multinomial naïve Bayes classifier that computes the probability p(t | s) of an Enneagram type t given the set of statements s = {s_1, …, s_40}:

p(t | s) ∝ p(t) ∏_i p(s_i | t)^freq(s_i)

Here, freq(s_i) is the number of people who responded saying that statement s_i describes me. Estimating the probabilities for this model was tricky with no actual data, but it is called naïve Bayes, so I took it to the extreme and only used priors. For type priors p(t), I used probabilities of 44.1% for Five and 55.9% for Nine (which came from this study of the general population). For statement probabilities, I adapted an approach I have used before — for classifying text using labeled words in addition to or instead of labeled documents — and used a simple informative Dirichlet prior of 2 “pseudocounts” if statement s_i describes type t (e.g., “I stand back and try to view life objectively” describes type Five), and 1 otherwise. These get normalized to form the conditional multinomials p(s_i | t).
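Since the model only uses priors, it is simple to reimplement. In the sketch below, the assignment of statements to types (first 20 describe Five, last 20 describe Nine) and the response tallies are made up for illustration; only the population priors and the pseudocount scheme follow the description above:

```python
import math

# Population priors for types Five and Nine, as described above.
TYPES = {"Five": 0.441, "Nine": 0.559}

# ASSUMPTION for illustration: statements 0-19 describe Five, 20-39 describe Nine.
describes = {"Five": set(range(20)), "Nine": set(range(20, 40))}

def statement_probs(t):
    """Dirichlet prior: pseudocount 2 if statement i describes type t, else 1;
    normalized into the conditional multinomial p(s_i | t)."""
    alphas = [2 if i in describes[t] else 1 for i in range(40)]
    total = sum(alphas)
    return [a / total for a in alphas]

def posterior(freq):
    """p(t | s) ∝ p(t) * prod_i p(s_i | t)^freq(s_i), computed in log space."""
    logs = {}
    for t, prior in TYPES.items():
        ps = statement_probs(t)
        logs[t] = math.log(prior) + sum(f * math.log(p) for f, p in zip(freq, ps))
    m = max(logs.values())                           # normalize stably
    exp = {t: math.exp(v - m) for t, v in logs.items()}
    z = sum(exp.values())
    return {t: v / z for t, v in exp.items()}

# Made-up tallies: Five-statements endorsed a bit more often than Nine-statements.
freq = [8] * 20 + [5] * 20
post = posterior(freq)
print({t: round(p, 3) for t, p in post.items()})
```

With even a modest tilt in the tallies toward one type’s statements, the posterior swings hard, which matches the post’s observation that the prediction isn’t very sensitive to fiddling with the Dirichlet pseudocounts.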


Over about two weeks, 12 people replied (42.9% response rate), which was a pretty good cross-section of family and friends from high school, college, grad school, and my more recent Pittsburgh days. I generally agreed with the responses, although I was surprised how many people thought I would say “Tell me when you like how I look” or “Hug me and show physical affection.” I don’t think I give off those vibes. Do I? Anyway, here are the four statements all 12 respondents unanimously agreed on:

  • “I need time alone to process my feelings and thoughts.”
  • “I like to have a thorough understanding; perceiving causes and effects.”
  • “My sense of integrity: doing what I think is right and not being influenced by social pressure.”
  • “I know that most people enjoy my company, I’m easy to be around!”

The first three describe a Five, and the last one describes a Nine (although, let’s face it… that’s just flattery). The model predicts with 95.9% confidence (about 23-to-1 odds) that I am indeed a Five, a prediction that isn’t very sensitive to fiddling with the Dirichlet priors at all (although it is naïve Bayes, and the statements are probably not conditionally independent). Furthermore, I suppose that the very act of conducting an experiment like this, however silly, is a very Five kind of thing to do. So… uhmm… case closed?

More Thoughts

After living with the idea for a year or so now, I think I actually disagree with that analysis. Insofar as we have discrete personality types (which is a little dubious to begin with), I think I am in fact a Nine… just a very curious and analytical Nine. Here is why, according to the theory (which relates to the arrows in the diagram above):

  • The Five’s investigative nature supposedly stems from a fear of not being able to understand “Truths” about the world. Stressed-out Fives can be hyperactive and paranoid, spread across a lot of projects (like some Sevens). Healthy Fives, however, become confident leaders and decisive “benevolent dictators” (like some Eights).
  • In contrast, Nines want peace of mind. In the face of stress, they can become anxious worrywarts (like some Sixes), but healthy Nines pick up energy and become focused on self-improvement (like some Threes).

The latter feels more like me. I am an investigator not because of some deep need to get to the bottom of things (Five), but because it’s a hell of a lot of fun, and it forces me to learn things and develop new skills in the process (healthy-Nine/Three). And while it’s true that I try to “do what I think is right and not be influenced by social pressure” (Five), I still worry an awful lot about what other people think of my decisions (unhealthy-Nine/Six). I suspect that academia is overrun with Fives, and thus I have either taken on some Five-like traits or they are projected onto me by the friends & family who replied to my survey. Nuances this fine might be too subtle to pull out of a questionnaire-based personality test.


Anyway, fun stuff… although the Nine-to-Five misclassification got me thinking about an “active” personality test that, instead of asking a rote set of questions, could adapt and try to tease these subtleties out (like a good game of 20 Questions). Personal writing samples — which the Typealyzer folks are starting to get at — could be a good source of data for such a test. It would be cool to gather Enneagram types for a bunch of bloggers, and use NLP techniques to try to understand how language is used by the different types. A test could tailor follow-up questions based on preliminary guesses from the text. As always, though, training data is a big bottleneck…

