Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Strassen's algorithm is rarely used: its primary use, to my understanding, is in matrix algorithms of more exotic fields than reals or complex numbers, where minimizing multiplies is extremely useful. Even then, Strassen only starts to beat out naive matrix multiply when you're looking at n >= 1000's--and at that size, you're probably starting to think about using sparse matrices where your implementation strategy is completely different. But for a regular dgemm or sgemm, it's not commonly used in BLAS implementations.

The more advanced algorithms than Strassen's are even worse in terms of the cutover point, and are never seriously considered.



The cross-over can be around 500 (https://doi.org/10.1109/SC.2016.58) for 2-level Strassen. It's not used by regular BLAS because it is less numerically stable (a concern that becomes more severe for the fancier fast MM algorithms). Whether or not the matrix can be compressed (as sparse, fast transforms, or data-sparse such as the various hierarchical low-rank representations) is more a statement about the problem domain, though it's true a sizable portion of applications that produce large matrices are producing matrices that are amenable to data-sparse representations.


How big are the matrixes in some modern training pipelines? We always talk of absurdly large parameter spaces.


SGD keeps the matrices small, I think.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: