A long time ago, in a galaxy far away, I scribbled down some notes on how to use Prolog to exhaustively search for the best assembly instruction sequences that perform particular data manipulations, especially for SIMD. And although I actually used such an approach to verify whether code examples shown in The Software Vectorization Handbook were truly optimal, I always thought the ideas were too thin for actual publication. However, now that ML-to-optimize-ML is becoming popular, I was hoping that perhaps a few people would be interested in reading about ideas from a simpler time, when AI still meant Prolog and expert systems and such. Therefore, I made the notes available as an arXiv white paper.
Showing posts from 2021
I guess you can never take the Intel out of the boy, even if the boy is long out of Intel. A few months back, I had lots of fun making sure MLIR's code generation maps to efficient AVX512 instructions. This week, I thoroughly enjoyed designing and implementing an MLIR dialect for Intel Advanced Matrix Extensions (AMX), with integration tests that run correctly on a Sapphire Rapids emulator. Staring at some x86 assembly instructions: it does not get much better than that...