Project

The OKCupid dataset: A very large public dataset of dating site users

Goal: A very large dataset (N=68,371, 2,620 variables) from the dating site OKCupid is presented and made publicly available for use by others.

This project concerns the original data release, any follow-up data releases and any studies I do with the data.


Project log

Emil O. W. Kirkegaard
added a research item
We sought to assess whether previous findings regarding the relationship between cognitive ability and religiosity could be replicated in a large dataset of online daters (maximum n = 67k). We found that self-declared religious people had lower IQs than nonreligious people (atheists and agnostics). Furthermore, within most religious groups, a negative relationship between the strength of religious conviction and IQ was observed. This relationship was absent or reversed in nonreligious groups. A factor of religiousness based on five questions correlated at −0.38 with IQ after adjusting for reliability (−0.30 before). The relationship between IQ and religiousness was not strongly confounded by plausible demographic covariates (β = −0.24 in final model versus −0.30 without covariates).
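The adjustment for reliability mentioned in this abstract can be illustrated with Spearman's classic correction for attenuation, which divides the observed correlation by the square root of the product of the two measures' reliabilities. The abstract does not say which correction or which reliability values were used, so the function name and the reliabilities of 0.79 below are hypothetical, picked only so that the arithmetic lands near the reported figures.

```python
import math

def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction for attenuation: r / sqrt(rel_x * rel_y)."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)

# Hypothetical reliabilities, NOT taken from the paper.
r_corrected = disattenuate(-0.30, 0.79, 0.79)
print(round(r_corrected, 2))  # -0.38
```

Because reliabilities are at most 1, the corrected correlation is never smaller in magnitude than the observed one.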
Emil O. W. Kirkegaard
added an update
Emil O. W. Kirkegaard
added an update
Miron Zuckerman is updating his meta-analysis of the religion and intelligence relationship, and asked me to compute some extra analyses of the OKCupid dataset.
 
Emil O. W. Kirkegaard
added a research item
The relationship between criminal and antisocial (CAS) behaviors and cognitive ability (CA) was examined in a large online sample of dating site users (complete sample n = 68,371). Twelve question items were found that measured CAS to some degree. Of these, 11 showed a negative relation to CA. The answers to the CAS items were all positively related, suggesting the existence of a general factor of CAS behavior. Scores for this factor were estimated using multiple methods. The resulting scores were then entered into a series of regression models to examine whether the link between CA and CAS would hold up in the presence of other predictors. The results showed that the link between CA and CAS scores was robust to model specification, with standardized betas of -.15 to -.20. Furthermore, a CA x sex interaction was found such that the relationship between CA and CAS was stronger for men than for women (r’s of -.20 and -.13, respectively).
Emil O. W. Kirkegaard
added an update
Paper on criminal/antisocial behavior and cognitive ability submitted.
Title: Crime and cognitive ability in a sample of dating site users
Abstract: The relationship between cognitive ability and criminal/rule-breaking behavior was examined in a sample of dating site users (n ≈ 12k). Cognitive ability was scored from 14 suitable items. There were 5 criminal outcomes: ever arrested, ever imprisoned, punched someone in the face as an adult, ever cheated on an exam, and would cheat on taxes if no one would know. These variables were all negatively correlated with cognitive ability: -0.17 [CI95: -0.19, -0.14], -0.14 [CI95: -0.19, -0.09], -0.15 [CI95: -0.17, -0.13], -0.06 [CI95: -0.09, -0.02], and -0.08 [CI95: -0.09, -0.06], respectively (same order as listed above). The criminal outcomes were all positively correlated (mean correlation = .45) and formed a general crime factor. A score was computed for this factor, but it was very non-normal and unsuitable for analysis.
Instead, arrest history was used in a logistic regression with control variables (sex, age and cognitive ability x sex). Arrest history was chosen for this purpose because it was the indicator with the strongest loading on the crime factor (.96). The results showed that cognitive ability was still negatively related to arrest history, β = -.29 (logit). This result was virtually unchanged by three robustness checks: 1) restricting the sample to Whites only, 2) removing the interaction term, and 3) filtering out persons with cognitive ability scores < -2.
Key words: crime, arrested, cognitive ability, IQ, intelligence, dating site
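A logit coefficient such as the β = -.29 reported in this abstract is easier to read once converted to an odds ratio. The sketch below shows that conversion. The abstract does not state the scale of the cognitive ability predictor, so "one unit" is left abstract here, and the 10% baseline arrest probability is a hypothetical value chosen only to show the arithmetic; the function names are likewise illustrative.

```python
import math

def odds_ratio(logit_beta):
    """Multiplicative change in the odds per one-unit increase in the predictor."""
    return math.exp(logit_beta)

def shift_probability(baseline_p, logit_beta, delta=1.0):
    """Probability after moving the predictor by `delta` units from a baseline."""
    logit = math.log(baseline_p / (1.0 - baseline_p)) + logit_beta * delta
    return 1.0 / (1.0 + math.exp(-logit))

beta = -0.29  # coefficient reported in the abstract
print(round(odds_ratio(beta), 3))               # 0.748
print(round(shift_probability(0.10, beta), 3))  # 0.077 (hypothetical 10% baseline)
```

Under these assumptions, a one-unit increase in cognitive ability multiplies the odds of having been arrested by roughly 0.75.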
 
Emil O. W. Kirkegaard
added a research item
A very large dataset (N=68,371, 2,620 variables) from the dating site OKCupid is presented and made publicly available for use by others. As an example of the analyses one can do with the dataset, a cognitive ability test is constructed from 14 suitable items. To validate the dataset and the test, the relationship of cognitive ability to religious beliefs and political interest/participation is examined. Cognitive ability is found to be negatively related to all measures of religious belief (latent correlations -.26 to -.35), and positively related to all measures of political interest and participation (latent correlations .19 to .32). To further validate the dataset, we examined the relationship between Zodiac sign and every other variable. We found very scant evidence of any influence (the distribution of p-values from chi-square tests was flat). Limitations of the dataset are discussed.
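The zodiac check rests on a general property: when two variables are truly independent, p-values from repeated tests are approximately uniformly distributed, so about 5% fall below .05 by chance. The sketch below illustrates that logic on simulated data only. It does not use the OKCupid files, and every function name, the sample sizes, and the use of three-option answers are assumptions made for the illustration. Three answer options are used so that the degrees of freedom, (12 - 1) x (3 - 1) = 22, are even, which allows a closed-form p-value without external libraries.

```python
import math
import random

def chi2_sf_even_df(x, df):
    """P(X >= x) for a chi-square variable with an even number of df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def independence_p_value(xs, ys, x_levels, y_levels):
    """Pearson chi-square test of independence for two categorical lists."""
    n = len(xs)
    cell = {}
    for pair in zip(xs, ys):
        cell[pair] = cell.get(pair, 0) + 1
    row = {a: sum(cell.get((a, b), 0) for b in y_levels) for a in x_levels}
    col = {b: sum(cell.get((a, b), 0) for a in x_levels) for b in y_levels}
    stat = 0.0
    for a in x_levels:
        for b in y_levels:
            expected = row[a] * col[b] / n
            if expected > 0:
                stat += (cell.get((a, b), 0) - expected) ** 2 / expected
    df = (len(x_levels) - 1) * (len(y_levels) - 1)
    return chi2_sf_even_df(stat, df)

def simulate_p_values(n_variables=300, n_users=1000, seed=0):
    """Test a random 'zodiac sign' against many unrelated random variables."""
    rng = random.Random(seed)
    signs = list(range(12))   # 12 zodiac signs
    answers = list(range(3))  # hypothetical three-option questions
    zodiac = [rng.choice(signs) for _ in range(n_users)]
    p_values = []
    for _ in range(n_variables):
        variable = [rng.choice(answers) for _ in range(n_users)]
        p_values.append(independence_p_value(zodiac, variable, signs, answers))
    return p_values

p_values = simulate_p_values()
share_significant = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"share of p < .05: {share_significant:.3f}")
```

With independent inputs, the printed share should sit near .05. A real effect of zodiac sign would instead show up as an excess of small p-values.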
Emil O. W. Kirkegaard
added an update
 
Emil O. W. Kirkegaard
added an update
Project files are now available at https://mega.nz/#F!QIpXkL4Q!b3QXepE6tgyZ3zDhWbv1eg
 
Emil O. W. Kirkegaard
added an update
This is the dataset with usernames, but due to the ethical complaints, it would perhaps be better to have a version without the usernames and cities.
 
Emil O. W. Kirkegaard
added an update
 
Emil O. W. Kirkegaard
added an update
Current draft.
 
Emil O. W. Kirkegaard
added a project goal
A very large dataset (N=68,371, 2,620 variables) from the dating site OKCupid is presented and made publicly available for use by others.
This project concerns the original data release, any follow-up data releases and any studies I do with the data.