Valleywag is fascinated with Uber. A few weeks ago, with the goal of better understanding who uses the car-service app, Valleywag editor Dan Lyons asked readers to send in their Uber Scores alongside some demographic information.
Since I'm the numbers guy here, Gawker editors reached out to me to parse the results of the survey they'd written. This is what I found.
Ok: so before we get into it, I need to make a disclaimer: this data set is not about Uber. It's about Valleywag readers who use Uber. Well, more accurately, it's about Valleywag readers who use Uber and enjoy filling out surveys. So please act like an adult when considering the implications of these results, ok?
So, really, the data
We had 1,525 respondents, or a bit more than 2% of unique viewers of the article at the time of collection. Of those respondents, 1,429 had data that we could use. I discarded data that was incomplete or was obviously false.
Here is how the respondents broke down: About 99% of respondents were located in the US. Most of the rest were in Canada or the UK.
- White, not Hispanic: 1,161 (81%)
- Asian & Pacific Islander: 91 (6%)
- Black: 82 (6%)
- Hispanic: 62 (4%)
- Mixed: 31 (2%)
- Native American: 2 (0%)
Over 80% of survey respondents identified as being something that I interpreted as 'white'. How white is that? The above breakdown is pretty close to the demographics of Minnesota. So either Valleywag or Uber users or survey completers (or, probably, all three) are whiter than the national average.
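As a quick arithmetic check on the breakdown above, the percentages can be recomputed from the raw counts. This is only a sketch: the counts are copied from the list, and the variable names are my own.

```python
# Counts copied from the race breakdown above.
race_counts = {
    "White, not Hispanic": 1161,
    "Asian & Pacific Islander": 91,
    "Black": 82,
    "Hispanic": 62,
    "Mixed": 31,
    "Native American": 2,
}

# The counts should add up to the number of usable responses.
total = sum(race_counts.values())

# Share of each group, rounded to the nearest whole percent.
shares = {group: round(100 * n / total) for group, n in race_counts.items()}

for group, pct in shares.items():
    print(f"{group}: {race_counts[group]:,} ({pct}%)")

print("total respondents:", total)
print("sum of rounded shares:", sum(shares.values()))
```

The rounded shares sum to 99 rather than 100, which is the rounding effect mentioned in the footnote.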
- Male: 1,094 (77%)
- Female: 335 (23%)
The median age of a survey completer was 29; the average age was just under 31. The oldest was 70; the youngest was 16. Here's the distribution:
I also looked at the age distribution by race: I didn't see anything in particular jump out at me.
Who got the best scores? Who got the worst?
Overall, the average score in our sample was 4.72; the median was 4.8. Just about a quarter of our sample had a perfect score of 5.
Median scores for all races were the same, except that mixed-race people had a slightly higher score.
- Mixed: 4.9
- Asian & Pacific Islander: 4.8
- Black: 4.8
- Hispanic: 4.8
- White, Not Hispanic: 4.8
Average scores didn't change by race. Asians and Blacks had not-statistically-different (but lower) scores than whites (p=.52). Mixed-Race & Hispanic people had not-statistically-different (but higher) scores (p=.21). Basically no difference in scores by race. Which is great!
- Mixed: 4.82
- Hispanic: 4.79
- White, Not Hispanic: 4.73
- Asian & Pacific Islander: 4.70
- Black: 4.70
Most survey respondents were 30 or younger. Looking at the numbers, the riders with the lowest scores were actually older riders: respondents over 40 were the lowest-scoring group, by a statistically significant margin (p=.002).
- 25 and under: 4.71
- 26-30: 4.74
- 31-35: 4.75
- 36-40: 4.79
- 41+: 4.60
All age groups had the same median score, so this makes me think there were a few elderly users with very low scores. We have 123 scores from users over 40. I have to admit this was the exact opposite of what I expected though: I figured a bunch of drunk 20-somethings would get low ratings for puking in their drivers' cars. I guess those guys & gals don't read Valleywag.
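To show why an identical median and a lower average point to a handful of very low scores, here is a small sketch. The two score lists are made up for illustration; they are not survey data.

```python
from statistics import mean, median

# Hypothetical scores, NOT survey data: two groups of seven riders each.
typical_group = [4.6, 4.7, 4.8, 4.8, 4.9, 5.0, 5.0]
# Same group, except the two lowest riders have very poor scores.
group_with_outliers = [2.0, 3.5, 4.8, 4.8, 4.9, 5.0, 5.0]

for name, scores in [("typical", typical_group), ("outliers", group_with_outliers)]:
    print(f"{name}: median={median(scores):.2f}, mean={mean(scores):.2f}")
```

Both groups share a median of 4.8, but the two low scores pull the second group's mean well below the first's. That is the same pattern as the 41+ group: same median as everyone else, lower average.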
For respondents in the US, I also looked at scores by state where we had at least 40 data points. You're not going to believe this, but New York might have some rude/inconsiderate people.
Looking at this, I noticed the further North I went, the lower the score. So I looked at the average score by degree of latitude (the size of each dot is based on the number of respondents at each tick of latitude).
There's a pretty clear relationship here**: respondents that are further north have, on average, lower scores. To give you a better sense of the data, here's the same data, annotated with the largest city at that degree of latitude (these scores aren't just for the city, but in most cases the majority of data at a certain latitude are from the city listed).
I suspect that this relationship is not due to anything about Uber, Valleywag, or survey completers. I think it's due to the weather and sunlight.
It's well known that weather has an impact on mood: many people report some seasonal dip in mood, and a smaller share suffer from full-blown Seasonal Affective Disorder. If we had access to ratings on individual rides, we could really see whether there is a relationship between scores and the weather conditions, but alas, no one but Uber has that data.
So there you go: in our Valleywag sample, there's no sign of racism in Uber ratings, some possible signs of age preference, and a really strong sign that cold people are irritable.
* So, Race and Gender. In our initial survey, we left race as open-response fields. Some of the responses were fairly straightforward, some were not. Decisions on whether to include survey respondents or not, and how to categorize respondents by gender and race, were made by me, probably in a manner that was offensive to all sorts of people but honestly and with the best of intentions. Of particular note: one respondent identified as transsexual, but other information submitted by that respondent led me to believe the data was suspect, so their data was excluded. Also, since we have so few Native American respondents, I excluded them from race-based breakdowns for their privacy. Also also, percentages may not add up to 100% due to rounding.
** Stats nerds: I ran an ordinary least squares regression on the raw data (every score). Since individual scores are only to the tenth of a point with a max of 5, there were high levels of kurtosis and skew, so any inferences I make are somewhat dubious.
I also ran a weighted least squares regression with the data-by-latitude information (the last charts). This data actually started to look normal (kurtosis=2.7, skew <.15, JB=.17) and still had an R-squared over .9, but I'm a little dubious about using the already averaged data, and I only have 23 data points. Thoughts? Please let me know what I'm screwing up in the comments.
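For anyone who wants to poke at this, here is a minimal sketch of the two calculations described above: a weighted least squares line through per-latitude averages (weighted by respondent count) and the Jarque-Bera statistic. The latitude data in the example is made up, the function names are my own, and this is not the code behind the charts. The Jarque-Bera part assumes the textbook formula JB = n/6 * (S^2 + (K-3)^2/4) and that "JB=.17" is the test statistic rather than a p-value.

```python
def weighted_least_squares(x, y, w):
    """Fit y = intercept + slope * x, minimizing sum(w * residual**2).

    Returns (slope, intercept, weighted R-squared).
    """
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_res = sum(wi * (yi - (intercept + slope * xi)) ** 2
                 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return slope, intercept, 1 - ss_res / ss_tot


def jarque_bera(n, skew, kurtosis):
    """Jarque-Bera statistic from sample size, skewness, and (non-excess) kurtosis."""
    return n / 6 * (skew ** 2 + (kurtosis - 3) ** 2 / 4)


# Hypothetical per-latitude averages, NOT the survey data.
latitudes = [30, 34, 38, 41, 45]
mean_scores = [4.85, 4.80, 4.76, 4.70, 4.66]
respondents = [40, 210, 150, 400, 60]  # used as weights

slope, intercept, r_squared = weighted_least_squares(
    latitudes, mean_scores, respondents
)
print(f"slope per degree of latitude: {slope:.4f}, R-squared: {r_squared:.3f}")

# Plugging in the figures quoted above, with skew taken at its 0.15 upper bound.
print(f"JB statistic: {jarque_bera(23, 0.15, 2.7):.4f}")
```

With 23 points, skew of .15, and kurtosis of 2.7, the formula gives roughly 0.17, which lines up with the JB figure quoted above under that reading.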