performance - Why is MATLAB so fast in matrix multiplication? - Stack Overflow
Matrix Multiplication Optimization – Brian C. Becker
CUDA – Matrix Multiplication | The Elancer
CUTLASS: Fast Linear Algebra in CUDA C++ | NVIDIA Technical Blog
GitHub - jim-rafferty/cuda-matrix-multiply-mex: A mex function to perform matrix multiplication on an nvidia gpu with a potentially huge improvement in performance depending on hardware available. Matlab's parallel computing toolbox is not required.