NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning

Ameer Haj-Ali; Nesreen K. Ahmed; Ted Willke; Sophia Shao; Krste Asanović; Ion Stoica

doi:10.48550/arxiv.1909.13639

Abstract

One of the key challenges arising when compilers vectorize loops for today's SIMD-compatible architectures is to decide if vectorization or interleaving is beneficial. Then, the compiler has to determine how many instructions to pack together and how many loop iterations to interleave. Compilers are designed today to use fixed-cost models that are based on heuristics to make vectorization decisions on loops. However, these models are unable to capture the data dependency, the computation graph, or the organization of instructions. Alternatively, software engineers often hand-write the vectorization factors of every loop. This, however, places a huge burden on them, since it requires prior experience and significantly increases the development time. In this work, we explore a novel approach for handling loop vectorization and propose an end-to-end solution using deep reinforcement learning (RL). We conjecture that deep RL can capture different instructions, dependencies, and data structures to enable learning a sophisticated model that can better predict the actual performance cost and determine the optimal vectorization factors. We develop an end-to-end framework, from code to vectorization, that integrates deep RL in the LLVM compiler. Our proposed framework takes benchmark codes as input and extracts the loop codes. These loop codes are then fed to a loop embedding generator that learns an embedding for these loops. Finally, the learned embeddings are used as input to a Deep RL agent, which determines the vectorization factors for all the loops. We further extend our framework to support multiple supervised learning methods. We evaluate our approaches against the currently used LLVM vectorizer and loop polyhedral optimization techniques. Our experiments show 1.29X-4.73X performance speedup compared to baseline and only 3% worse than the brute-force search on a wide range of benchmarks.
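To make "vectorization factors" concrete, here is a minimal Python sketch, not the authors' implementation. It relies on one real mechanism: Clang accepts `#pragma clang loop vectorize_width(N) interleave_count(M)` as a per-loop hint. Everything else is an assumption made for illustration: the candidate factor lists, the line-based loop detection, and `toy_cost`, a hypothetical stand-in for a measured runtime. The exhaustive search plays the role of the brute-force baseline mentioned in the abstract; a learned agent would replace it with a prediction.

```python
import itertools
import re

# Assumed candidate factors (powers of two); the real action space may differ.
VECTORIZE_WIDTHS = [1, 2, 4, 8, 16]
INTERLEAVE_COUNTS = [1, 2, 4, 8]


def make_pragma(vf, ic):
    """Build the Clang loop hint for vectorization width vf and interleave count ic."""
    return f"#pragma clang loop vectorize_width({vf}) interleave_count({ic})"


def annotate_loops(c_source, factors):
    """Insert one pragma before each `for` loop, consuming one (vf, ic) pair per loop.

    Simplification: a loop is any line that starts with `for (`; a real tool
    would locate loops with a proper C parser.
    """
    pairs = iter(factors)
    out = []
    for line in c_source.splitlines():
        if re.match(r"\s*for\s*\(", line):
            vf, ic = next(pairs)
            indent = line[: len(line) - len(line.lstrip())]
            out.append(indent + make_pragma(vf, ic))
        out.append(line)
    return "\n".join(out)


def brute_force(cost):
    """Try every (vf, ic) pair and return the one with the lowest cost."""
    candidates = itertools.product(VECTORIZE_WIDTHS, INTERLEAVE_COUNTS)
    return min(candidates, key=lambda pair: cost(*pair))


def toy_cost(vf, ic):
    """Hypothetical cost with its minimum at (8, 2); real use would time the compiled loop."""
    return abs(vf - 8) + abs(ic - 2)


if __name__ == "__main__":
    src = "void f(int *a, int n) {\n    for (int i = 0; i < n; i++)\n        a[i] += 1;\n}"
    best = brute_force(toy_cost)
    print(annotate_loops(src, [best]))
```

The search cost grows with the product of the two candidate lists for every loop, which is the expense a learned policy is meant to avoid.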
