NLP
SENTIMENT ANALYSIS
PREDICTIVE MODELING
PYTHON
SCIKIT-LEARN
MIXED METHODS
CASE STUDY — 2022 · SENIOR CAPSTONE THESIS
Decoding Desire
Sentiment Analysis & Behavioral Modeling of Dating App Discourse
Can behavioral and linguistic patterns in dating app discourse predict swipe decisions and compatibility outcomes, and do the algorithms meant to help people connect actually reinforce the biases they came in with?
CONTEXT
SMC Interaction Design thesis
ROLE
Project manager + researcher
DURATION
5 weeks · 2022
TEAM
4 Interaction Designers
◆ RESEARCH QUESTION
Dating apps are the dominant infrastructure for how people find partners, yet the algorithms powering them are largely opaque, and the biases users carry into swiping are rarely surfaced or challenged. This project asked whether data science could do what UX alone can't: reveal the gap between what people say they want and what their behavior actually shows.
CORE QUESTION
What are people truly looking for when dating, and how much of it is emotional, future-focused, or superficial? Can those patterns be modeled and predicted?
◆ Key RESULTS
85%
Swipe behavior prediction accuracy, validated on held-out test data
450+
Reddit posts scraped across r/dating, r/OkCupid, r/hingeapp via PRAW
100+
Users surveyed across interviews, competitive analysis, and observational research at virtual speed dating events
Family and future-focused topics triggered the strongest emotional extremes in VADER sentiment scoring, stronger than either romantic or surface-level content. That asymmetry became the basis for the predictive model's key features.
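One way to make "emotional extremes" concrete is to compare the average magnitude of VADER's compound score across themes. The sketch below is illustrative, not the project's actual code: it assumes the `vaderSentiment` package, and the function names, theme labels, and the choice of mean absolute compound score as the extremity measure are all assumptions introduced here.

```python
from collections import defaultdict
from statistics import mean


def compound_scores(texts):
    """Score each text with VADER and return its compound score in [-1, 1]."""
    # Imported lazily so the grouping helper below works without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return [analyzer.polarity_scores(text)["compound"] for text in texts]


def extremity_by_theme(themes, scores):
    """Mean absolute compound score per theme (hypothetical extremity measure).

    Taking the absolute value treats strongly positive and strongly negative
    posts as equally "extreme", so they do not cancel each other out.
    """
    grouped = defaultdict(list)
    for theme, score in zip(themes, scores):
        grouped[theme].append(abs(score))
    return {theme: mean(values) for theme, values in grouped.items()}
```

Given a list of post texts and a parallel list of theme labels, `extremity_by_theme(labels, compound_scores(texts))` would return one number per theme. A theme whose posts swing hard in either direction scores higher than one whose posts sit near neutral.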
◆ TECHNICAL PIPELINE
A two-phase mixed-methods approach: qualitative UX research (100+ user surveys, competitive analysis, an expert interview with a PhD sexologist, and observational research at virtual speed dating events) followed by a full data science expansion.
Data collection (PRAW · Reddit API) → Sentiment scoring (VADER NLP) → Predictive model (scikit-learn) → Visualization (Matplotlib · Seaborn)
Python
Pandas
NumPy
scikit-learn
VADER
PRAW
Matplotlib
Seaborn
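The data collection stage can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names, the `limit_per_sub` default, the choice of the `new` listing, and the fields kept per post are assumptions, and the credentials are placeholders that would come from a registered Reddit app.

```python
def make_client(client_id, client_secret, user_agent):
    """Build a read-only PRAW client from placeholder credentials."""
    import praw  # lazy import: only needed when actually talking to Reddit

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )


def collect_posts(reddit, subreddits, limit_per_sub=150):
    """Return one dict per submission, de-duplicated by submission id.

    `limit_per_sub` is an arbitrary illustrative default, not a figure
    taken from the case study.
    """
    seen = set()
    rows = []
    for name in subreddits:
        for post in reddit.subreddit(name).new(limit=limit_per_sub):
            if post.id in seen:
                continue  # skip posts already collected
            seen.add(post.id)
            rows.append(
                {
                    "id": post.id,
                    "subreddit": name,
                    "title": post.title,
                    "text": post.selftext,
                    "created_utc": post.created_utc,
                }
            )
    return rows
```

Calling `collect_posts(make_client(...), ["dating", "OkCupid", "hingeapp"])` would return a list of dicts that loads directly into a Pandas DataFrame for the sentiment scoring stage.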
◆ VIEW THE SOURCE
The full analysis is open source: VADER sentiment pipeline, data cleaning, model training, and visualizations.
VADER (Valence Aware Dictionary and Sentiment Reasoner) is the sentiment analysis tool used to score Reddit post valence across the Decoding Desire dataset.
◆ KEY FINDINGS
01
Unconscious type patterns
Most users have statistically predictable "types" they're unaware of; surface trait preferences dominate swiping even when users report valuing emotional connection.
02
Algorithms reinforce bias
Dating recommendation systems tend to amplify existing preference patterns rather than surface diverse or compatible matches. The optimization loop works against users' stated goals.
03
Future focus triggers extremes
Family and future-oriented discourse generated the strongest emotional sentiment scores, far stronger than romantic or status content, revealing where users' deepest values actually live.
◆ OUTCOMES
85% model accuracy
Predicting swipe behavior from sentiment features and behavioral signals extracted from Reddit discourse, validated on held-out test data
100+ users surveyed
Across competitive analysis, expert interviews, and observational research at virtual speed dating events
UI concept designs
Two rounds of speculative mockups translating model findings into interface concepts, including compatibility scoring and bias-surfacing features
3-theme sentiment taxonomy
Across 450+ posts: emotional themes, family and future orientation, and surface and status signals, each with distinct predictive weight
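"Validated on held-out test data" means the accuracy is measured on examples the model never saw during training. The sketch below shows that pattern with scikit-learn. It is a sketch under stated assumptions: the case study does not name the estimator, so logistic regression is a stand-in, and the three synthetic feature columns only loosely mirror the three-theme taxonomy. The data is randomly generated, so the numbers it produces say nothing about the project's results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def synthetic_features(n=200, seed=0):
    """Generate placeholder data: three per-theme scores and a binary label."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))  # hypothetical columns: emotional, future, surface
    # The label depends mostly on the second column, plus noise.
    y = (X[:, 1] + 0.5 * rng.normal(size=n) > 0).astype(int)
    return X, y


def heldout_accuracy(X, y, test_size=0.2, seed=0):
    """Fit on a training split and score on a disjoint, stratified test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return accuracy, len(y_test)
```

Stratifying the split keeps the class balance the same in both partitions, which matters when one swipe outcome is much more common than the other.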
"This project deepened my understanding of how crucial it is to design AI systems that promote emotional intelligence and equity, not just efficiency. The model worked, but the more important finding was what the data revealed about the gap between what people say they want and what they actually pursue."