CM 091: Seth Stephens-Davidowitz on Big Data as Truth Serum

Do you really know your neighbors or coworkers?

To understand human behavior, we need research participants who act and respond truthfully. But that is a tall order when it comes to topics that are embarrassing or even incriminating. Social scientists have found it hard to get honest answers when they ask about topics that might reveal racism, sexism, gluttony or a slew of other socially unacceptable traits.

Researchers like Seth Stephens-Davidowitz have found a way around that problem by analyzing data from the more than 3.5 billion Google searches we make each day. And it turns out that the candid words, phrases, and questions we type in reveal a whole lot about us.

Seth is the author of the bestselling book Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us about Who We Really Are. He is also a New York Times op-ed contributor, a visiting lecturer at The Wharton School, and a former Google data scientist.

In this interview we discuss:

  • How Internet datasets help us ask bigger questions than ever before
  • How word and picture data expand the kinds of questions we can ask and yield unexpected insights
  • How data from the more than 3.5 billion Google searches we make each day serves as a digital truth serum for learning more about what we actually think and do
  • How big data is giving researchers insights into small groups of people that were rarely available before
  • How big data is helping researchers engage in rapid experimentation and conduct quick tests to see how people respond
  • How horse racing data analysts like Jeff Seder help us think beyond traditional data sets to uncover game-changing findings
  • How night lights in India revealed key insights regarding economic activity
  • Just how much creativity is involved in data science research
  • How researchers studied big data in the hopes of helping political leaders shift hate group behaviors
  • What Google search analysts learned about gender from searches on children and intelligence
  • What we are learning about poverty and economic mobility from big data
  • The connection between the health of poor people and the number of rich people living nearby
  • The connection between the number of tax accountants and how many people cheat on their taxes
  • How data scientists are using our doppelgangers to anticipate what we might want to buy
  • How the healthcare industry can use doppelgangers to personalize treatment
  • The fact that Google conducts more experiments in one day than the FDA does in one year
  • How your love of curly fries may signal high intelligence to prospective employers
  • How regulators are finding it harder than ever to stay ahead of what companies can know about us as the number of variables keeps growing
  • How researchers may use big data to figure out, once and for all, which foods are nutritious, and whether we really should be eating broccoli

Links to Episode Topics

@SethS_D

http://sethsd.com/

Freakonomics by Steven Levitt and Stephen Dubner

Jeff Seder

American Pharoah

Night Lights and Economic Activity in India

2015 San Bernardino Attack

The Rise of Hate Search New York Times article

Weapons of Math Destruction by Cathy O’Neil

The Better Angels of Our Nature by Steven Pinker

If you enjoy the podcast, please rate and review it on iTunes – your ratings make all the difference. For automatic delivery of new episodes, be sure to subscribe. As always, thanks for listening!