Trushkowsky Pens Paper on Crowdsourcing Research

Harvey Mudd computer scientists are researching how to improve query processing systems by combining the precision of a computer database system and the interpretive abilities of the human mind. Such hybrid systems could improve the scope, accuracy and efficiency of information gathering and processing.

In a paper recently accepted to Communications of the ACM, the highly regarded journal of the Association for Computing Machinery (ACM), lead author and Assistant Professor of Computer Science Beth Trushkowsky discusses the benefits and challenges of answering database queries with crowdsourcing, the process of obtaining large amounts of data and human processing power from online communities known as the “crowd.” The paper, “Answering Enumeration Queries With the Crowd,” explores how computer science can leverage human intelligence to improve the overall question-answering ability of crowdsourced queries.

In such hybrid systems, the goal is for computers to perform the bulk of the work quickly and automatically. People are then brought in to interpret the data, thereby broadening the types of questions that can be asked. The challenge, Trushkowsky says, is reconciling the naturally constrained space of a database with the open-endedness of the real world. She and her research team are developing statistical tools that users and developers can employ to gauge query completeness, in the hope of ultimately helping to drive query execution and crowdsourcing strategies.

“Hybrid human/computer database systems promise to greatly expand the usefulness of query processing by incorporating the crowd,” says Trushkowsky, who earned master’s and PhD degrees researching database systems at UC Berkeley.

But such hybrid systems raise challenging implementation issues. Notably, they violate the closed-world assumption, a principle in knowledge management that assumes a database is complete—contains all data needed to answer the query—at the time the query is posed. Crowdsourcing, by its open-ended nature, creates the potential for new and unforeseen query results, necessitating human interpretation as well as creating the potential for human error.

“Computers are good at many things, but not great at others, such as reasoning, subjectivity and so on,” says Trushkowsky. “Humans are just human—they make errors. But we can take advantage of human intelligence and perception to help solve interesting problems in an open system. Our work is figuring out how to leverage the best of both worlds.”