
Transforming scalar code into vector instructions is an extraordinarily difficult problem. The only way to get even passable output from a vectorizing compiler is to write the code as vectors to begin with, for example with a cross-platform assembly tool like Orc.

And even then you'll often end up significantly worse off than if you wrote the assembly by hand.

Running Intel's compiler on the C versions of our DSP functions produced a grand total of one vectorized loop, and even that one was done terribly.



The problem is that you used C, which doesn't have any syntax to represent meta-information about the problem you're trying to solve. When you write out C code to, say, add up a list of numbers, all the compiler sees is a loop, and that is hard for it to optimize. It's very easy for the compiler when you can tell it "sum this list of numbers".
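To make the "sum a list" case concrete, here is a rough sketch in plain C. The function names `sum_scalar` and `sum_lanes` and the lane width of 4 are my own assumptions for illustration, not anything from the DSP code discussed above. The first version has a single accumulator, so every add depends on the previous one; since floating-point addition is not associative, a compiler generally cannot reorder it without relaxed floating-point flags (e.g. `-ffast-math` on GCC or Clang). The second version spells out the independent lanes by hand, which is roughly what "writing the code as vectors" means here.

```c
#include <stddef.h>

/* Plain scalar sum: one accumulator, so each addition depends on the
 * result of the previous one. */
float sum_scalar(const float *x, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    return acc;
}

/* The same sum written lane-wise. The width of 4 is an assumption
 * (it matches a 128-bit register of floats); the lanes are independent,
 * so a vectorizer can map the inner loop onto one vector add. */
float sum_lanes(const float *x, size_t n) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        for (int k = 0; k < 4; k++) {
            lane[k] += x[i + k];
        }
    }

    /* Horizontal reduction of the four lanes. */
    float acc = (lane[0] + lane[1]) + (lane[2] + lane[3]);

    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        acc += x[i];
    }
    return acc;
}
```

Note that the two functions add in a different order, so on general float input they can differ in the last bits. That difference is exactly the information the plain loop cannot express: the compiler has no way to know whether the order matters to you.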



