How to Find Your Next Date Using Census Data

Welcome to another edition of “In the Minds of Our Analysts.”

At System2, we foster a culture of encouraging our team to express their thoughts, investigate, pen down, and share their perspectives on various topics. This series provides a space for our analysts to showcase their insights.

All opinions expressed by System2 employees and their guests are solely their own and do not reflect the opinions of System2. This post is for informational purposes only and should not be relied upon as a basis for investment decisions. Clients of System2 may maintain positions in the securities discussed in this post.

Today’s post was written by Seth Leonard.


Skip the Text

Want to skip the text and start pulling data from the CPS or ACS?

Who Knew Data Science Could Be So Useful

This past spring, I found myself single for the first time in 17 years. The last time I was in the dating pool the first iPhone was still a year away. Now, the strange world of dating apps is the norm. Navigating this new landscape is foreign to me, so the obvious course of action was to start pulling data from the Census Bureau API and see what my odds look like…

Where to Find a Mate

  • From ages 25 to 40, college-educated single women outnumber college-educated single men nationally by 1.09 to 1.

  • If we don’t filter by college education, from ages 25 to 40 there are fewer single women than men nationally with a ratio of 0.86 to 1.

  • Regardless of education, Alaska and Wyoming have the lowest ratio of single women to single men in this age range.

  • Regardless of education, Mississippi has the highest ratio of single women to single men in this age range.

Doing the Research

Census data can be tricky to use. For one thing, there’s lots of it. And there are infinite ways to slice it. The first step is simply to decide which dataset to use. The two obvious datasets are the Current Population Survey (CPS) and the American Community Survey (ACS). The CPS is a monthly survey of around 60,000 households in all 50 states and DC. The ACS is an annual survey sent to roughly 3.5 million addresses; not as frequent, but far more extensive. And both have the variables I’m interested in: geography, sex, age, education, and marital status.

The Census Bureau has a number of pre-sliced tables. However, if you want something very specific, the API is the way to go. The CPS and ACS APIs provide microdata in two formats. The first is raw: each row represents an (anonymized) individual and his or her characteristics (female, 35, in Manhattan, bachelor's degree, never married, just for example). This sounds perfect. The trouble is that these data are just a sample of the total population, and the sample need not mirror it. Suppose our data is disproportionately sampled from the Upper East Side. Presumably, our Manhattan data would skew older, richer, and whiter than the actual population. To address this problem, the Census Bureau provides another view called tabulated queries. Instead of raw row-level data, each cell holds the summed survey weights for a combination of characteristics, which estimates how many people in the full population share them. It’s more restrictive but should give a better picture of overall demographics.
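To make the sampling problem concrete, here is a toy sketch of how a raw head count and a weighted estimate can disagree. Every number and name in it is invented for illustration; nothing here comes from the CPS or ACS.

```python
# Toy illustration: raw sample counts vs. weighted population estimates.
# All rows and weights are made up; none of this is Census data.

# Each respondent carries a weight: roughly, how many people they stand in for.
sample = [
    {"neighborhood": "Upper East Side", "weight": 50.0},
    {"neighborhood": "Upper East Side", "weight": 50.0},
    {"neighborhood": "Upper East Side", "weight": 50.0},
    {"neighborhood": "Harlem", "weight": 150.0},
]


def raw_share(rows, neighborhood):
    """Share of respondents from a neighborhood, ignoring weights."""
    hits = sum(1 for r in rows if r["neighborhood"] == neighborhood)
    return hits / len(rows)


def weighted_share(rows, neighborhood):
    """Share of the estimated population from a neighborhood, using weights."""
    total = sum(r["weight"] for r in rows)
    hits = sum(r["weight"] for r in rows if r["neighborhood"] == neighborhood)
    return hits / total


print(raw_share(sample, "Upper East Side"))       # 0.75
print(weighted_share(sample, "Upper East Side"))  # 0.5
```

The over-sampled neighborhood makes up three quarters of the rows but only half of the weighted population, which is the kind of distortion the weights are meant to correct.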

To pull CPS and ACS data, I wrote a basic API wrapper using requests in Python. I won’t dive into technical details here, but if you want to do your own research, get an API key here. You can use (or improve!) the tools I put together on GitHub. Tabulated data allows us to get weights for one or two variables, though the list of variables we can filter by is unlimited. Geographies can be state or county for the CPS, and state or PUMA (FIPS) for the ACS.

So what does a basic call look like? For the CPS:

What’s going on here? We’re slicing the data by age (PRTAGE) and sex (PESEX). We’ll pull data for October 2023 (the most recent). State 36 is New York, and counties 061, 081, and 047 are Manhattan (New York), Queens, and Brooklyn (Kings). Finally, we’ll filter on marital status divorced, separated, or never married (4, 5, and 6) and education level bachelor's degree (43) and above.
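For readers who want a starting point, the filters described above might be assembled into a request URL roughly like this. This is only a sketch, not the wrapper from the post: the endpoint path, the PEMARITL and PEEDUCA variable names, the education codes above 43, and the comma-separated filter syntax are all my assumptions, and the weight or tabulation clause is left out entirely. Check the Census API documentation before relying on any of it.

```python
from urllib.parse import urlencode

# ASSUMPTION: endpoint path for the October 2023 basic monthly CPS.
BASE_URL = "https://api.census.gov/data/2023/cps/basic/oct"


def build_cps_query(api_key):
    """Build a query URL for the filters described in the text (no request is sent)."""
    params = {
        "get": "PRTAGE,PESEX",         # age and sex, as named in the post
        "for": "county:061,081,047",   # Manhattan, Queens, Brooklyn
        "in": "state:36",              # New York
        "PEMARITL": "4,5,6",           # ASSUMED variable name for marital status
        "PEEDUCA": "43,44,45,46",      # ASSUMED name; codes above 43 are assumed too
        "key": api_key,
    }
    # Keep commas and colons readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=",:")


print(build_cps_query("YOUR_API_KEY"))
```

Building the URL separately from sending it makes it easy to inspect a query before spending an API call on it.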

For the ACS:

Variables are the same as above, with the exception of geography. Because the ACS data is so much more extensive, this query focuses exclusively on Manhattan. The PUMA codes are, in order, Lower East Side, Chelsea and Hell’s Kitchen, Upper West Side, Upper East Side and Roosevelt Island, Morningside Heights and Hamilton Heights, Harlem, East Harlem, Washington Heights & Inwood, Financial District & Greenwich Village, and Midtown, East Midtown and Flatiron.

The resulting data provides a weight for each combination of age and sex, for men and women over 15. To figure out dating prospects, I looked at the ratio of women ages 25 to 35 (not that those numbers have any relation to my app settings…) to men ages 30 to 40. For the October CPS, that comes out to 1.3 college-educated single women ages 25 to 35 for every 1 college-educated single man ages 30 to 40. But the number bounces around quite a bit from month to month, reaching as high as 1.9 in August. In this case, the ACS provides a more reliable estimate. Focusing on Manhattan, in 2022 there were 1.77 college-educated single women ages 25 to 35 for every 1 college-educated single man ages 30 to 40. Of course, New York skews young (as do single people), and I’m using uneven age brackets. To get a truer estimate of the single female-to-male ratio we can use 25 to 40 for both sexes. In that case, the statistic is much closer: 1.11 college-educated single women for every college-educated single man. Still a big difference. However, if we get rid of the college-educated criterion, the numbers are even.
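The arithmetic behind these ratios is just a sum of weights over an age bracket for each sex, then a division. A minimal sketch follows, using invented weights and an assumed sex coding (1 for male, 2 for female); the names are hypothetical and the output has nothing to do with the figures above.

```python
# Tabulated weights keyed by (age, sex). All values are invented for illustration.
MALE, FEMALE = 1, 2  # ASSUMED coding for the sex variable

weights = {}
for age in range(25, 41):
    weights[(age, MALE)] = 100.0
    # Toy pattern: fewer single women in the older part of the bracket.
    weights[(age, FEMALE)] = 120.0 if age <= 35 else 60.0


def bracket_total(table, sex, low, high):
    """Sum the weights for one sex over an inclusive age range."""
    return sum(w for (age, s), w in table.items() if s == sex and low <= age <= high)


def women_per_man(table, women_ages, men_ages):
    """Ratio of weighted women in one bracket to weighted men in another."""
    men = bracket_total(table, MALE, *men_ages)
    if men == 0:
        raise ValueError("no men in the requested bracket")
    return bracket_total(table, FEMALE, *women_ages) / men


print(round(women_per_man(weights, (25, 35), (30, 40)), 2))  # 1.2
print(round(women_per_man(weights, (25, 40), (25, 40)), 2))  # 1.01
```

Even in this toy table, comparing younger women with older men flatters the ratio, which is why the matched 25 to 40 bracket is the fairer comparison.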

We can look more broadly at the ratio of single women to single men (college-educated or otherwise) by looping over states. Mississippi is a standout with 1.67 college-educated single women for each college-educated single man ages 25 to 40. New York state as a whole aligns with the city at 1.14. At the opposite extreme, ratios for Alaska and Wyoming are 0.74 and 0.75 respectively. Getting rid of the college-educated requirement changes the picture; in this case, single men tend to outnumber single women. What’s happening here? The upper age limit in this case is 40. What we see is that women tend to marry younger, so there are fewer single women in the age bracket we’ve selected.
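Looping over states amounts to running the same calculation once per state and sorting the results. Here is a rough sketch with a stand-in lookup instead of a real API call; the function names, state labels, and totals are all hypothetical.

```python
def ratios_by_state(state_codes, fetch_totals):
    """Return (state, women-per-man) pairs, highest ratio first.

    fetch_totals is any callable that maps a state code to a
    (weighted_women, weighted_men) pair; here it stands in for an API call.
    """
    results = []
    for code in state_codes:
        women, men = fetch_totals(code)
        if men > 0:  # skip states with no men in the bracket
            results.append((code, women / men))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Invented totals for three placeholder states, not real Census figures.
toy_totals = {"A": (150.0, 100.0), "B": (90.0, 120.0), "C": (110.0, 100.0)}

ranked = ratios_by_state(list(toy_totals), toy_totals.__getitem__)
for code, ratio in ranked:
    print(code, round(ratio, 2))
```

Passing the lookup in as a function keeps the ranking logic testable without network access; a real version would swap in a function that queries the API for each state.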

As it turns out, New York wasn’t a bad choice, particularly for dating younger. My own neighborhood, the Upper East Side, is even more skewed in my favor with 1.34 single college-educated women for every single college-educated man ages 25 to 40. But if I were really clever, maybe I should move to Mississippi.
