Yes, the XNA/MonoGame layout means the three (well, four) floats that define the translation sit together in memory.
It’s like in the old days: the Atari ST interleaved its bitplanes word by word, so the four words that describe a group of 16 pixels sit next to each other in memory, where the Amiga kept each bitplane in its own block of memory.
So on the Amiga you read a word to get one bit of each of 16 pixels, then have to read different memory addresses to get the other bits.
Commodore did that because it made sense for the blitter. The original ST didn’t have a blitter, so the best Atari could do was exploit the movep.l instruction on the 68000.
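If it helps, here is a rough sketch of the two layouts in Python. It assumes a 4-bitplane screen and, for the planar case, that the planes are stored one after another (on real Amiga hardware each plane has its own pointer, so they can live anywhere). The function names are made up for illustration.

```python
PLANES = 4            # 4 bitplanes give 16 colours
PIXELS_PER_WORD = 16  # each 16-bit word holds one bit for each of 16 pixels


def planar_word_offsets(group, words_per_plane):
    """Amiga-style layout (assumed contiguous planes): the four words for one
    group of 16 pixels are a whole plane apart from each other."""
    return [plane * words_per_plane + group for plane in range(PLANES)]


def interleaved_word_offsets(group):
    """ST-style layout: the four plane words for one group of 16 pixels are
    adjacent in memory."""
    return [group * PLANES + plane for plane in range(PLANES)]


def colour_index(plane_words, pixel):
    """Combine one bit from each plane word into a colour index.
    Pixel 0 is the leftmost pixel, stored in the most significant bit."""
    bit = PIXELS_PER_WORD - 1 - pixel
    return sum(((word >> bit) & 1) << plane
               for plane, word in enumerate(plane_words))
```

For a 320x200 screen (4000 words per plane), `planar_word_offsets(0, 4000)` touches four addresses a full plane apart, while `interleaved_word_offsets(0)` touches four neighbouring words. Either way, a pixel's colour only exists once you have combined all four reads.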
Thanks. Haha, this went a bit off topic quite fast, but I understand the link, and it’s interesting to have that background.
So the matrix orientation is chosen to keep things together in memory that are commonly needed together (SIMD can then load the x, y, z, w translation from the matrix in one go). Thanks!
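For anyone reading along later, a small sketch of that point. It is Python rather than C#, and it only models the byte layout: 16 floats packed row by row, the way the M11..M44 fields of the XNA/MonoGame matrix are laid out as far as I know. The helper names are hypothetical, and the "strided" variant stands in for the transposed convention where the translation ends up in the last column.

```python
import struct


def translation_matrix(x, y, z):
    """Row-vector convention: the translation is the fourth row
    (M41, M42, M43, M44)."""
    return [
        1.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0,
        0.0, 0.0, 1.0, 0.0,
        x,   y,   z,   1.0,
    ]


def transpose(m):
    """Swap rows and columns of a flat 4x4 matrix."""
    return [m[col * 4 + row] for row in range(4) for col in range(4)]


def pack(m):
    """Pack 16 values as little-endian 32-bit floats: 64 bytes, no padding."""
    return struct.pack("<16f", *m)


def translation_contiguous(buf):
    """Translation in the last row: one 16-byte read at byte offset 48."""
    return struct.unpack_from("<4f", buf, 48)


def translation_strided(buf):
    """Translation in the last column: four separate 4-byte reads,
    16 bytes apart."""
    return tuple(struct.unpack_from("<f", buf, offset)[0]
                 for offset in (12, 28, 44, 60))
```

Both readers return the same four values; the difference is that the first is a single contiguous 16-byte read, which is the shape a 128-bit SIMD load wants, while the second has to gather from four places.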