Performance clarity as a first-class design principle
Article 2017 en
Authors
Kay Ousterhout
Christopher Canel
Max Wolffe
Abstract
Users often struggle to reason about the performance of today's systems. Without understanding which factors matter most to performance, users do not know how to tune their system's hardware and software configuration to improve it. We argue that performance clarity, meaning that it is easy to understand where bottlenecks lie and how various system changes would affect performance, should be a first-class design goal. To illustrate that this is possible, we propose an architecture for data analytics frameworks in which jobs are decomposed into schedulable units called monotasks, each of which consumes a single resource. Because monotasks untangle the use of different resources, the system can trivially report the time spent on each resource and identify the resource bottleneck. Our prototype implementation of monotasks for Apache Spark is API-compatible with Spark, achieves performance parity with it, and yields a simple performance model that can predict the effects of future hardware and software changes.
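To make the idea concrete, the following is a minimal sketch of how per-resource reporting and a simple performance model might look once every unit of work uses exactly one resource. It is not the authors' implementation: the `Monotask` record, the function names, the resource labels, and the assumption that work on a resource divides evenly across its parallel units (so that predicted runtime is the largest per-resource time) are all illustrative choices made here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Monotask:
    """Hypothetical record of one unit of work that used a single resource."""
    resource: str   # e.g. "cpu", "disk", "network" (labels are assumptions)
    seconds: float  # time this monotask spent using that resource


def time_per_resource(monotasks):
    """Sum the time spent on each resource across all monotasks."""
    totals = defaultdict(float)
    for task in monotasks:
        totals[task.resource] += task.seconds
    return dict(totals)


def ideal_times(totals, parallelism):
    """Assumed model: work on a resource spreads evenly over its parallel units."""
    return {res: total / parallelism[res] for res, total in totals.items()}


def bottleneck(monotasks, parallelism):
    """Return the resource with the largest ideal time under the assumed model."""
    ideal = ideal_times(time_per_resource(monotasks), parallelism)
    return max(ideal, key=ideal.get)


def predict_runtime(monotasks, parallelism):
    """Assumed model: the job takes as long as its most heavily loaded resource."""
    ideal = ideal_times(time_per_resource(monotasks), parallelism)
    return max(ideal.values())


# Made-up measurements for illustration only.
tasks = [
    Monotask("cpu", 40.0),
    Monotask("cpu", 40.0),
    Monotask("disk", 60.0),
    Monotask("network", 10.0),
]
current_hw = {"cpu": 8, "disk": 2, "network": 1}
more_disks = {"cpu": 8, "disk": 4, "network": 1}  # hypothetical hardware change

print(time_per_resource(tasks))
print(bottleneck(tasks, current_hw), predict_runtime(tasks, current_hw))
print(bottleneck(tasks, more_disks), predict_runtime(tasks, more_disks))
```

Under these assumptions, answering a "what if" question such as doubling the number of disks only requires changing one entry in the parallelism map and recomputing the maximum. That is the kind of reasoning that becomes hard when a single task interleaves several resources.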