Research Paper at iWAPT 2013
Using Machine Learning in order to Improve Automatic SIMD Instruction Generation
Antoine Trouve, Arnaldo Cruz, Hiroki Fukuyama, Jun Maki, Hadrien Clarke, Kazuaki Murakami, Masaki Arai, Tadashi Nakahira, Eiji Yamanaka
The International Workshop on Automatic Performance Tuning (iWAPT)
6th June 2013
Abstract: Basic block vectorization consists of extracting instruction-level parallelism inside basic blocks in order to generate SIMD instructions and thus speed up data processing. It is, however, a double-edged technique, because the vectorized program may actually be slower than the original one. It would therefore be useful to predict beforehand whether or not vectorization will produce any speedup. In this article, we propose to do so using a machine learning technique, the support vector machine. We consider a benchmark suite containing 151 loops, unrolled with factors ranging from 1 to 20. We perform the prediction offline, both before and after unrolling. Our contribution is threefold. First, we correctly predict the profitability of vectorization for 70% of the programs in both cases. Second, we propose a list of static software characteristics that successfully describe our benchmark with respect to our goal. Finally, we determine that machine learning makes it possible to significantly improve the quality of the code generated by the Intel Compiler, with speedups of up to 2.2 times.
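To make the idea in the abstract concrete, the following is a minimal sketch of predicting vectorization profitability from static loop characteristics with a support vector machine. It is not the authors' implementation: the two features (share of contiguous memory accesses, arithmetic operations per memory operation), the sample values, the labels, and the names `FEATURES`, `LABELS`, `train_linear_svm` and `predict` are all hypothetical, and the classifier is a plain linear SVM trained by subgradient descent on the hinge loss rather than whatever kernel and solver the paper uses.

```python
# Illustrative sketch only: features, data, and labels below are made up.
# Each loop is described by two hypothetical static characteristics:
#   (share of contiguous memory accesses, arithmetic ops per memory op)
FEATURES = [
    (0.90, 2.0), (0.20, 0.5), (1.00, 1.5), (0.10, 1.0),
    (0.80, 2.5), (0.30, 0.3), (0.95, 3.0), (0.00, 0.8),
]
# +1: vectorization assumed profitable, -1: assumed not profitable.
LABELS = [+1, -1, +1, -1, +1, -1, +1, -1]


def score(w, b, x):
    """Signed value of the linear decision function w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Linear soft-margin SVM trained by subgradient descent on hinge loss.

    lam is the L2 regularization strength and lr a constant step size;
    both are arbitrary choices for this toy example.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            violated = y * score(w, b, x) < 1  # sample inside the margin?
            # Regularization step: shrink the weights slightly.
            w = [wi - lr * lam * wi for wi in w]
            if violated:
                # Hinge-loss step: move the boundary away from the sample.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 if vectorization is predicted profitable, else -1."""
    return 1 if score(w, b, x) >= 0 else -1
```

A compiler driver could then call `w, b = train_linear_svm(FEATURES, LABELS)` once offline and use `predict(w, b, features_of_loop)` to decide whether to enable vectorization for a given loop. Which static features to extract, and at which unrolling factor, is exactly what the paper studies; the two used here are placeholders.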