Harvey Mudd College Wins Citadel West Coast Data Open
January 8, 2022
Defeating teams of graduate students, two groups of Harvey Mudd College first years placed first and third in Citadel’s West Coast Data Open.
Members of Harvey Mudd’s first-place team—Milo Knell ’25 (CS and math), Alan Wu ’25 (CS and math), David Chen ’25 (CS) and Forrest Bicker ’25 (CS and math)—received a $10,000 cash prize and interview offers at Citadel, a leading alternative investment manager. As winners of the West Coast regional, they qualified for the Datathon Global Championship and the opportunity to compete against other top regional teams for a $100,000 cash prize.
The third-place finishers were Sahil Rane ’25, Baltazar Zuniga-Ruiz ’25, Karina Walker ’25 and Shahnawaz Mogal ’25 (University of Arizona). They received a $2,500 prize.
At the competition, participants work in teams on large and complex dataset challenges impacting the global markets, then present their findings to a panel of judges. Both teams were given a dataset from the research archive of Upworthy, a digital media platform often credited with the rise of overly dramatic clickbait headlines, due in large part to a series of A/B tests it conducted from 2013 to 2015. The teams analyzed a dataset of Upworthy’s A/B tests, consisting of 150,817 different article packages and the number of clicks each received, and reported their findings.
“Given Upworthy’s interesting reputation for clickbait, we wanted to build a machine learning model to measure whether an article is clickbait and see what it said about Upworthy’s headlines,” said Bicker, a member of HMC’s first-place team, whose members all share a love for computer science and machine learning. “To do this, we theorized that fake news tends to look very similar to clickbait because both aim to pull in viewers, so we trained an AI classifier on an external dataset of fake news.
“Applying the classifier on Upworthy’s dataset of headlines, we found that fake news predicted clickbait more accurately than click rate alone,” he said. “We found that predicted fake news is a good proxy to examine clickbait that avoids the influence of confounding variables like overall business performance and external factors that are not accounted for in the Upworthy data. Using a variety of Natural Language Processing techniques, we also found that clickbait tends to use more extreme emotional language (very positive or negative) that is potentially harmful to the public’s mental health and emotional wellbeing.”
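For readers curious about the general idea Bicker describes, the following is a minimal sketch of training a text classifier on labeled fake-news headlines and then scoring unrelated headlines with it. It is not the team's model: the article does not say which classifier or dataset they used, so a small naive Bayes classifier stands in here, and the training and test headlines are invented toy examples. All function and variable names are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(".,!?'\"").lower() for w in text.split())
    return [w for w in words if w]

def train(labeled_headlines):
    """Count words per class from (headline, is_fake) pairs (naive Bayes)."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_fake in labeled_headlines:
        doc_counts[is_fake] += 1
        word_counts[is_fake].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab

def fake_news_score(model, headline):
    """Estimate P(fake | headline) using add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_post = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        lp = math.log(doc_counts[label] / total_docs)
        for w in tokenize(headline):
            if w in vocab:  # words never seen in training are ignored
                lp += math.log((word_counts[label][w] + 1)
                               / (total_words + len(vocab)))
        log_post[label] = lp
    # Normalize in log space to avoid underflow on long headlines.
    top = max(log_post.values())
    probs = {k: math.exp(v - top) for k, v in log_post.items()}
    return probs[True] / (probs[True] + probs[False])

# Invented toy data standing in for an external fake-news dataset.
TOY_TRAINING = [
    ("You won't believe what this shocking cure does", True),
    ("Shocking secret they don't want you to know", True),
    ("City council approves annual budget", False),
    ("Researchers publish study on rainfall trends", False),
]

model = train(TOY_TRAINING)

# Invented headlines standing in for the headlines to be scored.
headlines = [
    "This shocking video will change what you believe",
    "Council to publish budget study",
]
scores = [fake_news_score(model, h) for h in headlines]
```

In this sketch, a score near 1 means the headline resembles the fake-news training examples and a score near 0 means it resembles the ordinary ones. A real analysis would need a far larger training set and a held-out evaluation before such scores could be treated as a proxy for clickbait.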
Bicker said the team took a learning-focused approach to the competition, using it as an opportunity to explore new analytical techniques. “We wanted to push ourselves to think of novel, creative solutions to the problem, so we experimented with a number of distinct approaches. It was also our priority to bring a high standard of rigor to our work, making sure not to cut corners on our analysis and budget time appropriately for quality checks,” he said.
Read the first-place team’s full report.