We present BlowFish, a distributed data store that admits a smooth tradeoff between storage and performance for point queries. What makes BlowFish unique is its ability to navigate along this tradeoff curve efficiently at fine-grained time scales with low computational overhead.
Achieving a smooth and dynamic storage-performance tradeoff enables a wide range of applications. We apply BlowFish to several such applications from real-world production clusters: (i) as a data recovery mechanism during failures: in practice, BlowFish requires 5.4× lower bandwidth and 2.5× lower repair time compared to state-of-the-art erasure codes, while reducing the storage cost of replication from 3× to 1.9×; and (ii) data stores with spatially-skewed and time-varying workloads (e.g., due to object popularity and/or transient failures): we show that navigating the storage-performance tradeoff achieves higher system-wide utility (e.g., throughput) than selectively caching hot objects.
Shu Liu, Xiangxi Mo, Moshik Hershcovitch, Hongjun Zhang, Audrey Cheng, Guy Girmonsky, Gil Vernik, Michael Factor, Tiemo Bang, Soujanya Ponnapalli, Natacha Crooks, Joseph E. Gonzalez, Danny Harnik, Ion Stoica
Discussion(0)
No comments yet. Be the first to comment.