Friday, April 18, 2014

New iPhone utility app simplifies VR calculations

A few months ago, I realized we go back again and again to the same Excel spreadsheets for some VR calculations. For instance, if the field of view is 90 degrees diagonal and the aspect ratio is 16:9, what is the horizontal field of view? If using a goggle is like watching a 70-inch TV from 6 feet, what does that say about the field of view?
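As a rough illustration of those two questions, here is a minimal Python sketch. It assumes a flat, rectilinear image (no optical distortion), which is my simplification and not necessarily the formula the app uses; the function names are made up for this example.

```python
import math

def split_diagonal_fov(diagonal_fov_deg, aspect_w, aspect_h):
    """Return (horizontal, vertical) field of view in degrees.

    Assumes a flat, rectilinear image, so the tangent of each half-angle
    scales with the corresponding side of the screen.
    """
    tan_half_diag = math.tan(math.radians(diagonal_fov_deg) / 2)
    diag_units = math.hypot(aspect_w, aspect_h)
    horizontal = 2 * math.degrees(math.atan(tan_half_diag * aspect_w / diag_units))
    vertical = 2 * math.degrees(math.atan(tan_half_diag * aspect_h / diag_units))
    return horizontal, vertical

def screen_diagonal_fov(diagonal_size, viewing_distance):
    """Diagonal field of view (degrees) of a flat screen viewed head-on.

    Both arguments must use the same unit (for example inches).
    """
    return 2 * math.degrees(math.atan(diagonal_size / 2 / viewing_distance))

# 90 degrees diagonal at 16:9
h_fov, v_fov = split_diagonal_fov(90, 16, 9)

# 70-inch TV viewed from 6 feet (72 inches)
tv_fov = screen_diagonal_fov(70, 6 * 12)
```

Under these assumptions, the 90-degree diagonal works out to roughly 82 degrees horizontal, and the 70-inch TV at 6 feet to roughly 52 degrees diagonal.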

To help with this, the Sensics team created a simple iPhone app (Android version coming soon) for these calculations. The app is now available as a free download from iTunes. Think of it as a public service for the VR community.

In its current beta form, the app provides several useful conversions:

  • TV screen size to/from goggle field of view
  • Screen size and aspect ratio to/from field of view
  • Quaternion to/from Euler angles
The app also includes useful reference info as well as a way to ask the VR experts at Sensics.
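
For readers who want to try the quaternion conversion by hand, here is a minimal sketch. Euler angle conventions vary between systems; this one assumes a unit quaternion (w, x, y, z) and the common intrinsic Z-Y-X (yaw, pitch, roll) order, which is my choice for illustration and not necessarily the convention used in the app.

```python
import math

def euler_to_quaternion(yaw, pitch, roll):
    """Angles in radians, Z-Y-X (yaw, pitch, roll) order. Returns (w, x, y, z)."""
    cy, sy = math.cos(yaw / 2), math.sin(yaw / 2)
    cp, sp = math.cos(pitch / 2), math.sin(pitch / 2)
    cr, sr = math.cos(roll / 2), math.sin(roll / 2)
    w = cr * cp * cy + sr * sp * sy
    x = sr * cp * cy - cr * sp * sy
    y = cr * sp * cy + sr * cp * sy
    z = cr * cp * sy - sr * sp * cy
    return w, x, y, z

def quaternion_to_euler(w, x, y, z):
    """Unit quaternion to (yaw, pitch, roll) in radians, same convention."""
    roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    # Clamp to guard against rounding slightly outside [-1, 1].
    sin_pitch = max(-1.0, min(1.0, 2 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    return yaw, pitch, roll

# Round trip: yaw 30, pitch 20, roll 10 degrees
q = euler_to_quaternion(math.radians(30), math.radians(20), math.radians(10))
angles_deg = [math.degrees(a) for a in quaternion_to_euler(*q)]
```

Near a pitch of plus or minus 90 degrees, yaw and roll are no longer independent (gimbal lock), so the round trip is only well behaved away from that case.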

Some of this math already appeared on this blog such as here and here, but it is now available in your pocket.

What additional calculators would you find useful? What changes would you like to see in the app? Any bugs that need to be fixed? Let me know.

Monday, April 14, 2014

Why have two types of HMDs: those with OLED micro-displays and those with flat-panel displays?

I was recently asked why it made sense to offer HMDs based on micro displays at the same time that we offer HMDs based on flat-panel displays.

Because we can.

But seriously, here's why.

Micro-OLEDs (such as those from eMagin) have certain advantages:

  • A small display allows for a physically small HMD. Compare, for instance, the size of the zSight 1920, which uses micro-OLEDs, with the size of the dSight, which uses flat-panel displays. Both have 1920x1080 pixels per eye, but the dSight is physically larger. For applications that have space constraints or where the user needs to bring objects close to the cheek (such as a gun), a small HMD has a big advantage.
  • Micro-OLEDs currently offer a higher refresh rate: 85 Hz as opposed to the typical 60 Hz for flat panels.
  • Most available flat-panel displays use some version of LCD technology. OLEDs offer superior response time and contrast. However, several vendors have announced (or are already selling) OLED flat-panel displays.
  • If you care about pixel density, it is easier to design an optical system that would provide very high pixel density - even eye-limiting resolution - using a small micro-display. High pixel density implies lower field of view for the same number of pixels. You would care about high pixel density if you need to see every bit of detail in your image, or need to detect virtual objects at far distances, such as in military training.
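
To put rough numbers on the pixel density trade-off, here is a small sketch. It assumes pixels are spread evenly across the field of view (real optics are not uniform) and takes the commonly cited figure of about one arcminute per pixel as "eye-limiting". The fields of view are illustrative values, not the specifications of any product.

```python
def pixels_per_degree(pixels, fov_deg):
    """Average pixel density, assuming pixels are spread evenly over the FOV."""
    return pixels / fov_deg

def arcmin_per_pixel(pixels, fov_deg):
    """Average angular size of one pixel, in arcminutes (60 per degree)."""
    return 60.0 * fov_deg / pixels

# The same 1920 horizontal pixels at two hypothetical fields of view
narrow = pixels_per_degree(1920, 60)    # 32 pixels per degree
wide = pixels_per_degree(1920, 100)     # about 19 pixels per degree

# FOV at which 1920 pixels would average one arcminute each
eye_limiting_fov_deg = 1920 / 60.0      # 32 degrees
```

The same panel that looks sharp at a narrow field of view becomes visibly coarser when stretched across a wide one.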

Flat-panel displays have different advantages:

  • Their cost is much lower since they are key components of cell phones.
  • Larger supplier diversity.
  • Much easier to create very wide field-of-view systems than with the micro-OLEDs. If you care about immersion, you can usually get more immersion with flat panels. Of course, wider field of view implies lower pixel density.
  • Resolutions are rapidly increasing. 1920x1080 seems to be the current standard for high-end phones but this will soon be displaced by 2560x1440 or other high resolutions.

Ultimately, there will be many more HMDs that are based on flat panels, but there are unique professional applications that will continue to prefer OLED micro-displays.

Sunday, April 6, 2014

IEEE VR presentation: the next technical frontiers for HMDs

The annual IEEE conference on virtual reality took place in Minneapolis last week. It was a unique opportunity to meet some of the leading VR researchers in the world, to showcase new product innovations and to exchange views on the present and future of VR.

I had the pleasure of sharing the stage in "the battle of the HMDs" panel session at the conference, together with David A Smith, Chief Innovation Officer for Lockheed Martin, Stephen Ellis who leads the Advanced Displays and Spatial Perception Laboratory at NASA and Dr. Jason Jerald of NextGen Interactions.

Below are a (slightly edited) version of my slide and a free-form version of the accompanying text. The audience was primarily VR researchers, so if one thinks of "R&D" as "Research and Development", this talk was aimed more at the research side than the development side.

I believe that there are three layers to what I call the "HMD value pyramid": baseline technology, sensing and context. As one would expect, the pyramid cannot stand without its baseline technology, which we will discuss shortly, but once baseline technology exists, additional layers of value build upon it. While the baseline technologies are mandatory, the real value in my opinion is in the layers above them. This is where I am hoping the audience will focus their research: making these layers work, and then developing methods and algorithms to make these capabilities affordable and thus widespread.

There are several components that form the baseline of the VR visual experience:
  • Display(s)
  • Optics that adapt the displays to the appropriate viewing distance and provide the desired field of view, eye relief and other optical qualities.
  • Ergonomics: a way to wear these optics and displays comfortably on the head, understanding that there are different sizes and facial formations, and quickly adjust them to an optimal position
  • Wireless video, which allows disconnecting an HMD from a host computer, thus allowing freedom of motion without risk of cable entanglement
  • Processing power, whether performing the simple tasks of controlling the displays, performing calculation-intensive activities such as distortion correction or ultimately allowing applications to run completely inside the HMD without the need to connect to an external computing device.
There will clearly continue to be many improvements in these components. We will see higher-resolution and faster displays. We will continue to see innovative optical designs (as Sensics is showing in the exhibit outside). We will continue to see alternative displays such as pico projectors. But basically, we can now deliver a reasonably good visual experience at a reasonably good price. Yes, just like in cars or audio systems or airplane seats or wedding services, there are different experience levels and different price levels, but I think these topics are moving from a 'research' focus into a 'development' focus.

Once the underlying technologies of the HMD are in place, we can move to the next layer, which I think is more interesting and more valuable: the sensory layer. I've spoken and written about this before: beyond a head-worn display, the HMD is a platform. It is a computing platform but it is first and foremost a sensory platform that is uniquely positioned to gather real-time information about the user. Some of the key sensors:
  • Head orientation sensors (yaw/pitch/roll) that have become commonplace in HMDs
  • Head position sensors (X/Y/Z)
  • Position and orientation sensors for other body parts such as arms or legs
  • Sensors to detect hands and fingers
  • Eye tracking which provides real-time reporting of gaze direction
  • Biometric sensors - heart rate, skin conductivity, EEG
  • Outward-facing cameras that can provide real-time image of the surroundings (whether visible, IR or depth)
  • Inward-facing cameras that might provide clues with regards to facial expressions
Each of these sensors is at a different stage of technical maturity. Head orientation sensors, for instance, used to cost thousands of dollars just a few years back. Today, an orientation sensor can be had for a few dollars and is much more powerful than those of the past: tracking accuracy has improved, predictive tracking is sometimes built in, tremor cancellation, gravity direction sensing and sensing of magnetic north are available, and of course reporting rates keep getting higher.

HMD eye tracking sensors are behind in the development curve. Yes, it is possible to buy excellent HMD-based eye trackers for $10K-$20K, but at these prices, only a few can afford them. What would it take to have a "good enough" eye tracker follow the price curve of the orientation tracker?

HMD-based hand and finger sensors are probably even farther behind in terms of robustness, responsiveness, detection field and analysis capabilities.

All these sensors could bring tremendous benefits to the user experience, to the ability of the application to effectively serve the user, or even to the ability of remote users to naturally communicate with each other while wearing HMDs. I think the challenge for this audience is to advance these frontiers: make these sensors work; make them work robustly (e.g. across many users, in many different conditions and not just in the lab) and then make them in such a way that they can be mass-produced inexpensively. Whether these required breakthroughs are in new types of sensing elements, or new computational algorithms, that is up to you to decide, but I can't overemphasize how important sensors are beyond the basic capabilities of HMDs.

Once sensors have been figured out, context is the next and ultimate frontier. Context takes data from one or more sensors and combines it into information. It gives the application a high-level cue of what is going on: what the user is doing, where the user is, or what is going to happen next.
For instance, it's nice to know where my hand is, but tracking the hand over time might indicate that I am drawing a "figure 8" in the air. Or maybe that my hands are positioned to signal a "time out". Or maybe, as in the Microsoft patent filing image above, that the hand is brought close to the ear to signal that I would like to increase the volume. That "louder" gesture doesn't work if the hand is 50 cm from the head. It takes an understanding of where the hand is relative to the head, and thus I look at it as a higher level of information than just the positional data of the head and hand.
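
To make the "louder" example concrete, here is a toy sketch of turning raw head and hand data into a context cue. Everything in it is an assumption for illustration: the coordinate frame (x right, y up, head facing -z at zero yaw), a head that only yaws, the ear offset and the distance threshold. A real system would use the full head orientation and would also look at the gesture over time.

```python
import math

EAR_OFFSET_M = 0.08   # assumed distance from head center to each ear
NEAR_EAR_M = 0.12     # assumed threshold for "hand is at the ear"

def ear_positions(head_pos, yaw_rad):
    """Left and right ear positions in the tracking frame.

    Assumes y is up and that at zero yaw the head faces -z with its
    right ear toward +x. Only yaw is considered in this sketch.
    """
    hx, hy, hz = head_pos
    # The head's "right" direction after rotating about the y axis.
    right = (math.cos(yaw_rad), 0.0, -math.sin(yaw_rad))
    right_ear = (hx + EAR_OFFSET_M * right[0], hy, hz + EAR_OFFSET_M * right[2])
    left_ear = (hx - EAR_OFFSET_M * right[0], hy, hz - EAR_OFFSET_M * right[2])
    return left_ear, right_ear

def hand_at_ear(head_pos, yaw_rad, hand_pos):
    """True if the hand is within NEAR_EAR_M of either ear."""
    return any(math.dist(ear, hand_pos) < NEAR_EAR_M
               for ear in ear_positions(head_pos, yaw_rad))

head = (0.0, 1.7, 0.0)
near = hand_at_ear(head, 0.0, (0.10, 1.7, 0.0))   # hand next to the right ear
far = hand_at_ear(head, 0.0, (0.50, 1.7, 0.0))    # hand roughly 50 cm away
```

The same hand position can produce a different answer once the head turns, which is the point: the cue comes from the hand relative to the head, not from either position alone.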

Additional examples of context that is derived from multiple sensors: the user is walking; or jumping; or excited (through biometric data and pupil size); or smiling; or scared. The user is about to run into the sofa. The user is next to Joe. The user is holding a toy gun and is aiming at the window.

Sometimes, there are many ways to express the same thing. Consider a "yes/no" dialog box in Windows. The user might click on "yes" using the mouse, or "tab" over to the "yes" button and hit space, or press Alt-Y, or say yes, and there are perhaps a few other modes to achieve the same result. Similarly in VR, the user might speak "yes" or might nod her head up and down in a "yes" gesture, or might give the thumbs-up sign, or might touch a virtual "yes" button in space. Context enables the multi-modal interface that focuses on "what" you are trying to express as opposed to exactly "how" you are doing it.
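
In code, the "what, not how" idea might look like a simple mapping from many input events to a single intent. The modality and event names below are hypothetical, and the "decline" entries are my own addition for symmetry; the hard part in practice is recognizing the events reliably, not the lookup itself.

```python
# Hypothetical (modality, event) pairs mapped to the intent they express.
INTENT_BY_INPUT = {
    ("speech", "yes"): "confirm",
    ("head_gesture", "nod"): "confirm",
    ("hand_gesture", "thumbs_up"): "confirm",
    ("touch", "virtual_yes_button"): "confirm",
    ("speech", "no"): "decline",
    ("head_gesture", "shake"): "decline",
}

def resolve_intent(modality, event):
    """Return the intent for an input event, or None if it is not recognized."""
    return INTENT_BY_INPUT.get((modality, event))

# Different "hows", same "what"
intents = {
    resolve_intent("speech", "yes"),
    resolve_intent("head_gesture", "nod"),
    resolve_intent("hand_gesture", "thumbs_up"),
}
```

The application only ever sees "confirm" or "decline", so adding a new way of saying yes does not require changing the application.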

Context, of course, requires a lot of research. Which sensors are available? How much can their data be trusted? How can we minimize training? How can we reduce false negatives and false positives? This is yet another great challenge to this community.

In summary, we live in exciting times for the VR world, and we hope that you can join us on the journey up the HMD value pyramid.