The waning benefits of device scaling have caused a push towards domain-specific accelerators (DSAs), which sacrifice programmability for efficiency. While providing huge benefits, DSAs are prone to obsolescence due to domain volatility, incur recurring design and verification costs, and have large area footprints when multiple DSAs are required in a single device. Because of the benefits of generality, this work explores how far a programmable architecture can be pushed, and whether it can come close to the performance, energy, and area efficiency of a DSA-based approach. It has been taken for granted that workload specialization is necessary, leaving software developers with no clean abstractions to target.
In this talk we dispel this myth and show that efficiency and programmability can indeed be achieved together. First, we discover and demonstrate that DSAs are more similar than dissimilar. We show that all DSAs employ common specialization principles for concurrency, computation, communication, data reuse, and coordination, and that these same principles can be exploited in a programmable architecture using a composition of known microarchitectural mechanisms. Second, we propose a universal accelerator fabric called Stream-Dataflow that can match DSAs. Our results from modeling and hardware/software implementation show that a programmable, specialized architecture can indeed be competitive with a domain-specific approach.
Karu Sankaralingam (http://www.cs.wisc.edu/~karu) is a Professor in the Computer Sciences Department at the University of Wisconsin-Madison, where he also leads the Vertical Research Group (http://www.cs.wisc.edu/vertical/). He is currently CEO and Co-Founder of SimpleMachines Inc. His research interests include microarchitecture, architecture, and software issues for massively parallel computation systems. He is a recipient of the IEEE TCCA Young Computer Architecture Award in 2012, an NSF CAREER award in 2009, the Emil H. Steiger Distinguished Teaching Award in 2014, and the Letters and Science Philip R. Certain - Gary Sandefur Distinguished Faculty Award in 2013, which recognizes outstanding teaching by a member of the College of Letters and Science. He earned a PhD from The University of Texas at Austin in December 2006, and was the lead student architect of the TRIPS chip, a 170-million-transistor chip.