The Verge posted a nice video about how to create mixed reality videos with the HTC Vive/SteamVR.
The one big thing that's sort of missing is an explanation of how you align the real-world camera with the virtual one that has the third controller attached. Basically, you need to specify in a config file the transform (offset and rotation) from the tracked position of the controller attached to the real camera to the ideal focal point of the camera itself. You also need to specify that camera's FOV in a way that makes sense to a game engine.
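For reference, SteamVR reads this from a file called externalcamera.cfg placed next to the game executable. The values below are placeholders and the exact set of supported fields may vary by engine integration, but the shape of the file is roughly this:

```
// Offset from the tracked controller's origin to the camera's focal point, in metres
x=0.03
y=-0.05
z=-0.10
// Rotation from the controller's frame to the camera's frame, in degrees
rx=0.0
ry=180.0
rz=0.0
// Vertical field of view in degrees (what game engines usually expect),
// plus near/far clip planes for the virtual camera, in metres
fov=60.0
near=0.01
far=100.0
```

Note that the FOV a game engine wants is usually the *vertical* FOV in degrees, which is not what a lens spec sheet gives you; you typically have to derive it from the focal length and sensor height (or just tune it, as described below).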
At a basic level, we just created a test app that drew the virtual controllers and let you type keys to adjust the configuration in real time until they matched. I think I used X, Y, Z for the translation and U, R, F for the rotation axes. Put the real camera on a fixed tripod and adjust until it all looks right. The problem is there are a lot of degrees of freedom, and if something like the FOV isn't set just right, you might get things looking correct in one place while they're significantly off in the distance or at the edges of the view. So you keep moving the controllers and readjusting until you lock it in.
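To see why a slightly wrong FOV can hide in one spot but show up elsewhere, here's a little pinhole-camera sketch (my own toy numbers, not from the actual test app). It compensates a 2-degree FOV error with a camera translation so one controller lines up perfectly, then checks other positions:

```python
import math

def fpx(vfov_deg, image_h=1080):
    """Pinhole focal length in pixels for a given vertical FOV."""
    return (image_h / 2) / math.tan(math.radians(vfov_deg) / 2)

def project(x, z, vfov_deg, cam_z=0.0):
    """Pixel offset from image center for a point x metres to the side
    at depth z, with the camera slid cam_z metres along its own axis."""
    return fpx(vfov_deg) * x / (z - cam_z)

TRUE_FOV, BAD_FOV = 60.0, 62.0

# Slide the camera so a controller 0.5 m to the side at 1 m depth
# lines up exactly despite the wrong FOV.
cam_z = 1.0 - fpx(BAD_FOV) / fpx(TRUE_FOV)

for x, z in [(0.5, 1.0), (0.5, 5.0), (2.0, 5.0)]:
    err = abs(project(x, z, TRUE_FOV) - project(x, z, BAD_FOV, cam_z))
    print(f"controller at ({x} m, {z} m): {err:.1f} px of misalignment")
```

The point where you calibrated is dead on, but the same settings drift by several pixels at other depths and even more toward the edge of the frame, which is exactly why you have to keep moving the controllers around and re-checking.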
Of course, if you have more than two controllers you can do this more easily. Since we had plenty of controllers around Valve, we were able to track 8 of them at the same time (plus the HMD and the extra controller on the camera, of course!). So we distributed them around and lined up alignment points on the real camera with alignment lines we drew in the test app.
This photo shows the actual configuration of the cameras during calibration. Notice that we lined up several of the controllers on the specific alignment points, some in the foreground and some further out.
And this photo shows me getting things roughly in position in the test app with the controllers just in a circle. You can see the real camera view on the back monitor and the virtual one showing the exact same controllers right in front of me.
Hopefully more precise ways to do this will be available soon. Ideally you could have fiducial markers that software can just read to get the exact alignment of the camera. You could also measure the distortion of the real-world camera and adjust for it. For example, we noticed that our alignment was slightly off near the outsides of the frame, and you could correct for that when rendering the mixed reality view of the game. In practice this latter effect was pretty small, and since typically not a lot of action is happening at the edges, people won't notice much.
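That edge effect is what you'd expect from ordinary radial lens distortion. A common one-parameter model (the first term of the usual Brown–Conrady model; the coefficient here is made up for illustration, not something we measured) shows how the error stays negligible near the center and grows toward the corners:

```python
import math

def distort(xn, yn, k1):
    """Apply one-parameter radial distortion to a normalized image point
    (center = (0, 0)); k1 < 0 pulls points inward (barrel distortion)."""
    r2 = xn * xn + yn * yn
    scale = 1.0 + k1 * r2
    return xn * scale, yn * scale

k1 = -0.05  # hypothetical coefficient, just to show the shape of the effect
for xn, yn in [(0.1, 0.1), (0.8, 0.45)]:
    dx, dy = distort(xn, yn, k1)
    shift = math.hypot(dx - xn, dy - yn)
    print(f"point ({xn}, {yn}): shifted by {shift:.4f} (normalized units)")
```

Because the shift grows with the cube of the distance from the center, a calibration that looks perfect mid-frame can still be visibly off in the corners, and inverting this model when compositing is the "adjust for that" step mentioned above.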