Indeed, the price barrier to virtual reality goggles is coming down. VR used to be cool yet expensive; now it is on its way to just being cool. You can find decent goggles for around $500, or even build your own. Years ago, you had to spend upwards of $1,000 on a good yaw/pitch/roll head tracker. Now you can buy decent ones for about $100, and if you want to make your own, the components will cost you just a few dollars. Sure, if you want professional-grade performance - truly high resolution, high-precision tracking, high-quality optics with very little distortion - you will still pay professional-grade prices. However, the high school teacher who wants some VR in the classroom, or the gamer who wants to put some 3D on their head without breaking the bank, is starting to have viable options.
A VR goggle by itself, inexpensive as it may be, is still just a monitor on the head. To interact with the application - game, 3D paint program, operating system - you'll want to start adding sensors. The motion tracker on the goggles will tell you which way your head is turned, but that's not enough. How do you choose a menu option? Raise a weapon? Manipulate a virtual object? Sensors, sensors and more sensors. Perhaps a Kinect can help capture your posture; an eye tracker can figure out your gaze. A pair of motion controllers will tell you where your hands are. The list can go on and on: a body suit gives additional information on other limbs; a voice recognition chip can help with voice commands. Cameras on the goggles can identify faces, objects or markers around you. Biometric sensors can tell if you are breathing heavily, if your heart rate is up or even if you are drunk.
Sounds like a serious case of sensor overload, and it brings several problems. The tactical ones include:
- Complexity. How many sensors are you willing to wear in order for a game to know what it needs to know about you? Once you are done putting all the sensors on you, would you be willing to start instrumenting the gaming weapon? Putting markers on the walls? Connecting cameras in the room?
- Which sensor to use when? A Kinect might be great at understanding your posture when you are right in front of the TV, but what happens when you walk away? How robust are sensors in various lighting conditions? What happens when one sensor breaks down? How does one determine which sensors are available and which sensors are best to use at any given time?
- Timing and synchronization. Can you read out the data from multiple sensors in a synchronized, low-latency way? Can you get it over to the host computer (game console, PC, tablet) wirelessly?
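To make the synchronization problem concrete, here is a minimal sketch of one way to line up readings that arrive from different sensors at different rates: buffer each reading with its timestamp, then answer queries with the reading from each sensor closest to the requested time. The sensor names and sample values are purely illustrative.

```python
import bisect
from collections import defaultdict

class SensorAligner:
    """Buffers timestamped readings per sensor and, for a query time,
    returns the reading from each sensor nearest to that time."""

    def __init__(self):
        self._buffers = defaultdict(list)  # sensor name -> [(timestamp, value), ...]

    def push(self, sensor, timestamp, value):
        # Assumes readings arrive in time order per sensor.
        self._buffers[sensor].append((timestamp, value))

    def snapshot(self, t):
        # For each sensor, pick the buffered reading closest to time t.
        result = {}
        for sensor, buf in self._buffers.items():
            times = [ts for ts, _ in buf]
            i = bisect.bisect_left(times, t)
            candidates = buf[max(0, i - 1):i + 1]
            result[sensor] = min(candidates, key=lambda r: abs(r[0] - t))[1]
        return result

aligner = SensorAligner()
aligner.push("head_tracker", 0.010, (0.0, 15.0, 0.0))    # yaw/pitch/roll, degrees
aligner.push("head_tracker", 0.020, (0.5, 15.2, 0.0))
aligner.push("hand_controller", 0.012, (1.0, 0.9, 0.3))  # x/y/z position, meters
print(aligner.snapshot(0.019))
```

A real system would also bound the buffers, interpolate between samples instead of picking the nearest one, and deal with sensors whose clocks drift apart, but the core idea - align by timestamp before fusing - stays the same.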
Tactical problems aside, the strategic issue is that very often, what the application really needs is not the raw data but rather some higher-level understanding of the context of the user. It's nice to know that both knees are bent at 90 degrees and that the user's body is stationary, but wouldn't the application rather just know that the user is sitting down? That the gaming weapon has been raised to the shoulder? That the user is very close to the sofa in the living room and, in fact, on a collision course with it?
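The knees-bent example above can be sketched as a simple rule that turns raw joint angles into a higher-level posture label. The function name and thresholds are illustrative assumptions, not part of any real API.

```python
def infer_posture(left_knee_deg, right_knee_deg, body_speed_mps):
    """Toy rule: both knees near 90 degrees and a stationary body
    suggest the user is sitting. Thresholds are illustrative."""
    knees_bent = abs(left_knee_deg - 90) < 15 and abs(right_knee_deg - 90) < 15
    stationary = body_speed_mps < 0.1
    if knees_bent and stationary:
        return "sitting"
    if body_speed_mps > 2.0:
        return "running"
    return "unknown"

print(infer_posture(92, 88, 0.05))  # prints "sitting"
```

Production systems would lean on statistical classifiers rather than hand-written thresholds, but either way the point is the same: the application asks "is the user sitting?", not "what are the knee angles?".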
Sensors give us data - unorganized facts that need to be processed. What we really want is information - data that has been organized, processed and put in context so that it becomes useful.
Context leads us to a higher-level understanding and is highly useful. Imagine writing a game where you truly understand the user's activity (running, jumping, sitting down, on the ground, holding a steering wheel), position (1 meter away from the sofa, lifting the gaming weapon, at a heading of 45 degrees relative to another player) and even the biophysical or emotional state - tired, excited, scared.
What is needed is a framework or architecture to collect data from sensors, fuse it and generate useful information and context. A low-cost goggle with decent image quality is good news, but getting the context right is key to the next level of immersion and interaction.
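One way to picture such a framework is a three-stage pipeline: collect readings from whichever sensors happen to be available, fuse them, and run rules that emit context facts. The sketch below is a hypothetical minimal design, not a description of any existing product; sensor names and the rule are illustrative.

```python
class ContextEngine:
    """Minimal sketch of a collect -> fuse -> contextualize pipeline."""

    def __init__(self):
        self._sources = {}  # name -> callable returning the latest reading, or None
        self._rules = []    # callables mapping fused readings -> context facts

    def add_source(self, name, read_fn):
        self._sources[name] = read_fn

    def add_rule(self, rule_fn):
        self._rules.append(rule_fn)

    def update(self):
        # Collect: poll sensors; a None reading means the sensor is unavailable,
        # so rules degrade gracefully when a sensor drops out.
        readings = {n: fn() for n, fn in self._sources.items()}
        readings = {n: r for n, r in readings.items() if r is not None}
        # Contextualize: each rule emits zero or more context facts.
        context = {}
        for rule in self._rules:
            context.update(rule(readings))
        return context

engine = ContextEngine()
engine.add_source("knees", lambda: (92, 88))    # degrees of flexion
engine.add_source("body_speed", lambda: 0.05)   # meters per second
engine.add_rule(lambda r: {"posture": "sitting"}
                if "knees" in r
                and all(abs(k - 90) < 15 for k in r["knees"])
                and r.get("body_speed", 1.0) < 0.1
                else {})
print(engine.update())  # prints {'posture': 'sitting'}
```

Because rules see only the sensors that are currently reporting, the same engine keeps working when, say, the Kinect loses sight of the user - exactly the availability problem raised above.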