Defending NLP Models Against Adversarial Attacks

Proofpoint, Inc. Computer Science, 2021–22

Liaison(s): Cameron Malloy, Adam Starr
Advisor(s): Blake Jackson
Student(s): Meg Kaye (PM-F), Dana Teves (PM-S), Emily Chin, Skylar Litz, Keizo Morgan

Proofpoint uses natural language processing to classify and filter out malicious or fraudulent emails. However, its classifiers often face adversarial attacks, which alter the contents of an email to fool the classifier while keeping the message readable to a human. The team has developed a testing system to evaluate how different classification methods perform against these attacks. Using this tool, the team has analyzed combinations of classifiers, attacks, and defense methods, and has offered Proofpoint recommendations on how best to defend against these attacks.
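
To make the idea of testing combinations of classifiers, attacks, and defenses concrete, here is a minimal sketch in Python. It is an illustration only and does not reflect the team's actual tool or Proofpoint's classifiers: the keyword classifier, the homoglyph attack, the normalization defense, and every function name are hypothetical stand-ins chosen to keep the example small.

```python
from itertools import product

# All names below are hypothetical; none come from the project itself.
# Latin letters mapped to visually similar Cyrillic letters.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}
REVERSE = {v: k for k, v in HOMOGLYPHS.items()}


def keyword_classifier(text):
    """Toy classifier: flag an email if it contains a trigger word."""
    return any(word in text.lower() for word in ("password", "invoice"))


def no_attack(text):
    return text


def homoglyph_attack(text):
    """Swap some Latin letters for look-alike Cyrillic letters."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def no_defense(text):
    return text


def normalize_defense(text):
    """Map known look-alike characters back to Latin before classifying."""
    return "".join(REVERSE.get(ch, ch) for ch in text)


def evaluate(classifiers, attacks, defenses, malicious_emails):
    """Detection rate for every (classifier, attack, defense) combination."""
    results = {}
    for (c_name, clf), (a_name, atk), (d_name, dfn) in product(
        classifiers.items(), attacks.items(), defenses.items()
    ):
        # Attack the email, apply the defense, then classify the result.
        caught = sum(clf(dfn(atk(email))) for email in malicious_emails)
        results[(c_name, a_name, d_name)] = caught / len(malicious_emails)
    return results


emails = ["Please reset your password here", "Overdue invoice attached"]
results = evaluate(
    {"keyword": keyword_classifier},
    {"none": no_attack, "homoglyph": homoglyph_attack},
    {"none": no_defense, "normalize": normalize_defense},
    emails,
)
for combo, rate in sorted(results.items()):
    print(combo, rate)
```

In this toy setup the homoglyph attack leaves the emails readable to a person but hides the trigger words from the classifier, and the normalization defense restores detection. A real evaluation would substitute trained models, a labeled email corpus, and a wider range of attacks and defenses.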