Telemetry and Modeling for Automatic Tuning in Apache Cassandra

A Capella Computer Science, 2018-19

Liaison(s): (not listed)
Advisor(s): Beth Trushkowsky
Student(s): Jonathan Cruz (PM-S), Carissa DeRanek, Lilly Liu, Jonathan Raygoza, Ashley Schmit (PM-F)

Databases, especially large-scale ones, are the “reactor core” powering today’s software services. The performance of a large database depends on many interactions between its internal parameters and the varying external load it is asked to handle. Our goal is to improve both the “sensing” and the “control” of Cassandra, a large-scale open-source database, so that it adapts its operation to changing conditions. Our pipeline will use machine learning to guide parameter tuning for the database, based on its operational and query patterns.