Machine Learning Accuracy Research Published


Two papers authored by Harvey Mudd College computer scientists in Professor George Montañez’s AMISTAD Lab have been accepted to conferences this summer.

The IEEE 2022 International Joint Conference on Neural Networks (part of the IEEE World Congress on Computational Intelligence) accepted “Bounding Generalization Error Through Bias and Capacity,” co-authored by Montañez, Ramya Ramalingam ’21, Nico Espinosa Dice ’22 and Megan Kaye ’22. The paper is Espinosa Dice’s and Kaye’s second with the AMISTAD Lab and Ramalingam’s first. Ramalingam is a PhD student at the University of Pennsylvania, Espinosa Dice will enter the CS PhD program at Cornell in the fall and Kaye will enter the workforce. Espinosa Dice and Kaye are recipients of the 2022 Don Chamberlin Research Award given by the CS department, in recognition of this and previous work.

“Identifying Bias in Data Using Two-Distribution Hypothesis Tests,” by Montañez and co-authors William Yik ’24, Limnanthes Serafini ’24, and Tim Lindsey ’23 (Biola University) will be presented at the 5th AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society in Oxford, England, in August.

Montañez summarizes the research:

Bounding Generalization Error Through Bias and Capacity

In supervised machine learning (i.e., classification), you are given a training dataset and use it to produce an output (called a hypothesis) that you can use to classify future examples. While training, you have some notion of how accurate your classification hypothesis is on your training data. In the future, however, you may encounter examples you did not see in training, so your accuracy may differ from what it was in training. The degree to which you fail to generalize to (properly classify) new examples is called your generalization error. Formally, it is the expected difference between your error during training and your error on new examples when you deploy your model in the real world.
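
In standard notation (not taken from the paper itself), this gap can be written as the difference between a hypothesis h's true risk on the data distribution and its empirical risk on the training sample of size m:

```latex
% Illustrative definition in standard notation, not the paper's exact formulation.
\[
  \mathrm{Gen}(h)
  \;=\;
  \underbrace{\mathbb{E}_{(x,y)\sim\mathcal{D}}\!\big[\mathbf{1}[h(x)\neq y]\big]}_{\text{true risk } R(h)}
  \;-\;
  \underbrace{\frac{1}{m}\sum_{i=1}^{m}\mathbf{1}[h(x_i)\neq y_i]}_{\text{empirical risk } \hat{R}_S(h)}
\]
```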

Traditional machine learning theory has developed ways of bounding how bad your generalization error can be, so that you have guarantees that your future performance won’t differ too much from your performance during training. In this paper, we develop ways of obtaining these same sorts of generalization error guarantees using geometric notions of algorithm bias and the degree to which a machine learning model can memorize training examples (i.e., its capacity). This work draws on recent results in the machine learning literature, but for the first time creates generalization bounds within the unified machine learning/AI framework called the Algorithmic Search Framework. This allows practitioners using the framework to access the same types of generalization guarantees previously available only within other theoretical machine learning frameworks.
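
The paper's bounds are stated within the Algorithmic Search Framework, but the general shape of such a guarantee can be illustrated with a classic, much simpler finite-capacity bound, where the number of available hypotheses |H| plays the role of capacity. This is a textbook uniform-convergence result shown only for illustration, not the paper's bound:

```latex
% Classic uniform-convergence bound for a finite hypothesis class (illustration only).
\[
  \Pr\!\Big[\,\forall h\in\mathcal{H}:\;
    \big|R(h)-\hat{R}_S(h)\big|
    \;\le\;
    \sqrt{\frac{\ln|\mathcal{H}|+\ln(2/\delta)}{2m}}
  \Big]\;\ge\;1-\delta
\]
```

In words: the larger a model's capacity, the more training examples m are needed to keep the guaranteed gap between training and deployment performance small.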

Identifying Bias in Data Using Two-Distribution Hypothesis Tests

The paper presents a way of assessing whether training data is potentially too skewed (biased) to be used in training machine learning models. For example, as a hiring manager you might see that two-thirds of recent hires were male and wonder whether that was a mere statistical fluke of a fair process that hires men and women in roughly equal numbers, or whether it could indicate bias within the hiring process. Before you decide to use your hiring data to train any machine learning model, our tests would alert you to the strong possibility of bias in this data. We use two-distribution statistical hypothesis tests to determine when skewed data becomes too skewed to be plausibly explained as the result of a fair process. Our methods also tell the user what the closest plausible explanation for their observed data is, relative to their original hypothesized explanation. A user can then determine whether or not that plausible explanation aligns with their desired notion of fairness.
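
As a loose illustration of the underlying question ("could this skew plausibly come from a fair process?"), a simple one-sample binomial test already captures the idea for the hiring example above. This sketch is not the two-distribution hypothesis tests developed in the paper; the function name, counts, and significance threshold are hypothetical choices made for the example:

```python
# Sketch only: a one-sample binomial test for the hiring example above.
# The paper itself uses two-distribution hypothesis tests, which are more general.
from scipy.stats import binomtest

def skew_is_plausible(num_male_hires, total_hires, fair_rate=0.5, alpha=0.05):
    """Test whether the observed hiring skew is plausible under a fair process
    that hires men at `fair_rate`. Returns the p-value and a plausibility verdict."""
    result = binomtest(num_male_hires, n=total_hires, p=fair_rate,
                       alternative="two-sided")
    return result.pvalue, result.pvalue >= alpha

# Example: 40 of 60 recent hires (two-thirds) were male.
p_value, plausible = skew_is_plausible(40, 60)
print(f"p-value = {p_value:.4f}; plausibly the result of a fair process: {plausible}")
```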