Transformation between systems

Hey guys!

One thing I always run into and never fully understand is the transformation between different coordinate systems.
For instance, taking the position of an object and multiplying it by the camera's view and projection matrices gives the world-view-projection result in screen space. Taking that result and multiplying it by the inverse projection matrix gives … what??

Is there a complete intro to that? Or can somebody share a diagram of some kind for “which matrix to apply to what to end up in the foobar coordinate system”?

The big question is something like: which matrix multiplications result in which space, and how do I convert from any space to another? Is there an overview of that available? What is view space? Clip space? Etc.

What is the “w” component of a transformed position vector?
How do I reconstruct the world position from a depth-map?
Stuff like that.

This would be a cool intro for absolute newbies, too.

Sorry, since this is a basic topic, but I am still learning and some PRO might have time to write a cool intro to this stuff :wink:

See http://www.codinglabs.net/article_world_view_projection_matrix.aspx

There are a lot of tutorials on the subject; you can pretty much google it.
Basically the spaces match the matrix names: Model, World, View, Projection.

To put it pointedly though, spaces themselves imply relativity to some origin.

A space must be relative to something.

In the case of your model, it's relative to the origin it was created around. This is the 0,0,0 position in, say, your model editor. Vertices in this space are expressed in the modeling coordinate system, relative to that origin point.

(We define the spatial positions of the model's vertices in the model coordinate system.)

This is the local space your vertices are in when you load them into your game, because they are relative to model space.

When you drop it into your game world, you position it in World Space by using a world matrix, just like you place a piece on a map. Just moving it without rotation is called translation. Turning it to face some direction is rotation (rotations are also transformations).
The combination of translation and rotation is called a transformation.

World space transformations have an associated matrix, the world matrix. Typically each object has its own unique world matrix to handle its transformation. Your model's vertices were relative to your model editor's origin; now they're relative to your game world's origin.
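As a rough sketch of what a world matrix does, here is a plain-Python version (no MonoGame involved). It assumes row vectors multiplied on the left, `p * world`, with the translation stored in the bottom row. The helper names `mat_mul`, `transform`, `translation` and `rotation_y` are made up for this illustration.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(p, m):
    """Transform the point (x, y, z) as the row vector [x, y, z, 1] * m."""
    v = [p[0], p[1], p[2], 1.0]
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def translation(tx, ty, tz):
    # Row-vector convention (an assumption): the offset sits in the bottom row.
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [tx, ty, tz, 1]]

def rotation_y(angle):
    # Rotation around the Y axis for row vectors.
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0, -s, 0],
            [0, 1, 0, 0],
            [s, 0, c, 0],
            [0, 0, 0, 1]]

# Rotate first, then translate: with row vectors the leftmost matrix acts first.
world = mat_mul(rotation_y(math.pi / 2), translation(10, 0, 5))

# A vertex one unit along +X in model space.
model_vertex = (1, 0, 0)
world_vertex = transform(model_vertex, world)
print(world_vertex)
```

The vertex is first turned a quarter turn around Y and then moved to where the object sits in the world. Swapping the order of the two matrices would instead swing the object around the world origin.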

Now it's nice and all to have everything in your world on your map, but when you want to look at objects that are far from position 0,0,0 you need to either move them closer to you or move yourself closer to them. For this you need to define what your position and orientation will be. This choice is what we call a camera.

This is where view space comes in. What it does is take that camera position as the new origin of the world. The idea is that it transforms the entire world, so all vertices from all models are affected by it. Matrix.CreateLookAt creates it (the little trick is that it builds the inverse of the camera's own placement in the world, so moving the camera right moves everything else left). Its spatial change is very similar to the change a world matrix makes: it's just another transformation, but it is intended to be applied on top of all the other objects' world matrices.
Typically the view matrix is thought of as a camera matrix, or sometimes camera space, while the camera itself should be thought of as some position and orientation just like any other object.

(The view matrix is created from the camera object in the world.)

It is used to translate and rotate everything around its position, or at least everything that is transformed by it. This primarily means the vertices of your model, which were created in model space, placed into your world, and which you now wish to view from the point of view of a camera in that world.
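To make the "camera becomes the new origin" idea concrete, here is a small look-at sketch in plain Python. It assumes a right-handed setup with row vectors, where the camera looks down -Z in view space. I believe that is the convention Matrix.CreateLookAt follows, but treat `look_at` here as an illustration and not as MonoGame's actual source.

```python
import math

def sub(a, b):
    return [a[0] - b[0], a[1] - b[1], a[2] - b[2]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    length = math.sqrt(dot(a, a))
    return [a[0] / length, a[1] / length, a[2] / length]

def look_at(position, target, up):
    """Build a view matrix (row-vector convention, camera looks down -Z)."""
    z_axis = normalize(sub(position, target))   # points from target back to camera
    x_axis = normalize(cross(up, z_axis))       # camera's right
    y_axis = cross(z_axis, x_axis)              # camera's up
    return [[x_axis[0], y_axis[0], z_axis[0], 0],
            [x_axis[1], y_axis[1], z_axis[1], 0],
            [x_axis[2], y_axis[2], z_axis[2], 0],
            [-dot(x_axis, position), -dot(y_axis, position),
             -dot(z_axis, position), 1]]

def transform(p, m):
    """Transform the point (x, y, z) as the row vector [x, y, z, 1] * m."""
    v = [p[0], p[1], p[2], 1.0]
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

camera_position = [5, 0, 0]
view = look_at(camera_position, [0, 0, 0], [0, 1, 0])

# The camera itself lands on the origin of view space...
camera_in_view = transform(camera_position, view)
# ...and the thing it looks at ends up straight ahead on the -Z axis.
target_in_view = transform([0, 0, 0], view)
print(camera_in_view, target_in_view)
```

Here the world origin sits 5 units in front of the camera, so in view space it comes out at z = -5.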

This can be illustrated in steps on a vertex, for example.
Here we go from model to world to view to screen space:
vertex = p
p * world = p1
p1 * view = p2
p2 * projection = p3

However, transforming vertices by each matrix one at a time is inefficient. This is why we multiply the matrices together first. Once that is done we can just multiply each vertex one time by the final matrix. In this case the steps are simply:
finalMatrix = world * view * projection
vertex * finalMatrix = result

(This is usually done on the GPU, so we pass all the matrices to a shader and do that work there. You can of course test this fully on the CPU to see it in steps, including the step of building a final matrix.)
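A CPU-side check of those steps could look like the sketch below, in plain Python with row vectors. The three matrices are simple stand-ins chosen so the numbers are easy to follow. In particular the `projection` here is only a uniform scale, not a real perspective matrix, and all helper names are made up for the example.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mul(v, m):
    """Multiply the row vector [x, y, z, w] by a 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [tx, ty, tz, 1]]

def scale(s):
    return [[s, 0, 0, 0],
            [0, s, 0, 0],
            [0, 0, s, 0],
            [0, 0, 0, 1]]

world = translation(10, 0, 5)    # object placed at (10, 0, 5)
view = translation(0, 0, -20)    # camera at z = 20, not rotated
projection = scale(0.1)          # stand-in only, NOT a real projection

p = [1, 2, 3, 1]                 # model-space vertex with w = 1

# One matrix at a time.
p1 = vec_mul(p, world)
p2 = vec_mul(p1, view)
p3 = vec_mul(p2, projection)

# All matrices combined first, then a single multiply per vertex.
final_matrix = mat_mul(mat_mul(world, view), projection)
result = vec_mul(p, final_matrix)

print(p3)
print(result)
```

Both paths give the same vertex (up to floating point rounding), which is why the combined matrix can be built once and reused for every vertex of the model.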

Vector3 has a Transform function that takes a matrix just for this reason, since a Vector3 can represent a position or a direction.

Projection space deals with creating a type of projection into GPU or screen space. The projection matrix usually acts as a scaling matrix that scales x and y coordinates to the GPU's virtual space. It's special in that it can also scale x and y coordinates depending on their depth. This defines a mathematical view frustum (you can google it). It can do other things, like effects, on its own, which are beyond the scope of this topic, though it usually has to do primarily with a focal point. The same things you might learn in an art or drawing tutorial for actual painters are what a perspective projection matrix handles.

The projection matrix is essentially a fancy scaling matrix. Because GPU space is virtual, it ranges from -1 to 1 from end to end (things beyond this range are clipped, simply not drawn), so it scales everything way, way down. It seldom needs to change unless you resize the game window.
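The depth-dependent scaling is where the “w” component from the original question comes in. Below is a hedged sketch of a perspective matrix in plain Python. It assumes row vectors, a camera looking down -Z, and depth mapped to the 0..1 range, which I believe is the layout MonoGame uses, but take the exact terms as an assumption. After the multiply, w holds the distance in front of the camera, and dividing x, y and z by it gives the -1 to 1 coordinates.

```python
import math

def perspective(fov_y, aspect, near, far):
    """Perspective projection for row vectors; x and y end up in -1..1."""
    f = 1.0 / math.tan(fov_y / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, far / (near - far), -1],
            [0, 0, near * far / (near - far), 0]]

def to_clip(p, m):
    """View-space point (x, y, z) -> clip-space [x, y, z, w]."""
    v = [p[0], p[1], p[2], 1.0]
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def to_ndc(clip):
    """The perspective divide: this is the step that shrinks far things."""
    x, y, z, w = clip
    return [x / w, y / w, z / w]

proj = perspective(math.radians(90), 1.0, 0.1, 100.0)

# Two view-space points with the same x, one twice as far away as the other.
near_point = to_clip((5, 0, -5), proj)
far_point = to_clip((5, 0, -10), proj)

print(near_point[3], to_ndc(near_point))   # w is 5: the distance ahead
print(far_point[3], to_ndc(far_point))     # w is 10: x shrinks to half
```

With a 90 degree field of view, a point 5 units to the side and 5 units ahead sits right on the edge of the screen (x close to 1). The same sideways offset at twice the distance lands halfway to the edge.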

We have matrices that denote these different spaces, and you will often hear of spaces and what space you are in. Just remember that one multiplied by the other gives a combined space or, if you like, leads to a new space. Even a 2D sprite can be thought of as a model, as it has vertices.
So model vertices * world gives a model-to-world transformation. It places the model into world space.

A model (or your vertices) * world * view is a WorldView space transform. It takes vertices from model space, where they were created, through world space, where they are oriented, into view space, where they are reoriented around the view matrix, i.e. the camera's look-at orientation. Finally the projection projects them into the GPU's coordinate system, typically just called screen space.

When you multiply all the matrices first,
world * view * projection, you get a final matrix: WorldViewProjection.

Typically, though, you keep a ViewProjection matrix combination per frame, or even across multiple frames, tacking an altered world matrix onto it whenever an object's world orientation changes. This lets you multiply your model's vertices against the other matrices one time, or many times at different places. The GPU does this when you pass your vertices in through a draw call and set the model's world matrix on the shader.
With that it can multiply vertices against the frame's final matrix directly, which is of course what shaders do.
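Going the other way, which is what the depth-map part of the original question asks about, means multiplying by the inverse of that ViewProjection matrix and then dividing by w again. The sketch below is plain Python under the same assumptions as the rest of the thread (row vectors, camera looking down -Z, depth in 0..1). The camera is a simple unrotated one, and `invert` is a generic Gauss-Jordan helper written for this example, not a MonoGame call.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mul(v, m):
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]

def perspective(fov_y, aspect, near, far):
    f = 1.0 / math.tan(fov_y / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, far / (near - far), -1],
            [0, 0, near * far / (near - far), 0]]

def invert(m):
    """Invert a 4x4 matrix with Gauss-Jordan elimination and row pivoting."""
    n = 4
    a = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        pivot_value = a[col][col]
        a[col] = [x / pivot_value for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

# Camera at z = 10 looking down -Z with no rotation (an assumption to keep it short).
view = translation(0, 0, -10)
projection = perspective(math.radians(60), 16 / 9, 0.1, 100.0)
view_projection = mat_mul(view, projection)

# Forward: world position -> clip space -> screen x, y plus a depth value.
world_position = [1.0, 2.0, 3.0, 1.0]
clip = vec_mul(world_position, view_projection)
ndc = [clip[0] / clip[3], clip[1] / clip[3], clip[2] / clip[3]]

# Backward: x, y and depth (as a depth map would store it) -> world position.
unprojected = vec_mul([ndc[0], ndc[1], ndc[2], 1.0], invert(view_projection))
recovered = [c / unprojected[3] for c in unprojected[:3]]
print(recovered)
```

The divide by w on the way back is what undoes the perspective divide. Multiplying by the inverse projection alone would stop in view space, and the inverse view matrix takes it the rest of the way to world space.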

Thx guys!
The codinglabs.net link is worth a million. I finally understood it.

willmotil, thank you for your great effort, too!

You're welcome.

I have a camera class posted on here. I'm about to update it a little bit later with more functionality.
I think I'm going to keep posting below the original as it changes,
so the whole post can serve as a reference of how it evolves to do more.

If you want to take a look at it as is, it's pretty basic and doesn't do much yet. Maybe that is more helpful.