Monday, April 11, 2016

Understanding Foveated Rendering

Foveated rendering is a rendering technique that takes advantage of the fact that the resolution of the eye is highest in the fovea (the central vision area) and lower in the peripheral areas. As a result, if one can sense the gaze direction (with an eye tracker), GPU computational load can be reduced by rendering an image that has higher resolution in the direction of gaze and lower resolution elsewhere.

The challenge in turning this from theory to reality is to find the optimal function and parameters that maximally reduce GPU computation while maintaining highest quality visual experience. If done well, the user shouldn’t be able to tell that foveated rendering is being used. The main questions to address are:
  1. In what angle around the center of vision should we keep the highest resolution? 
  2. Is there a mid-level resolution that is best to use? 
  3. What is the drop-off in “pixel density” between central and peripheral vision? 
  4. What is the maximum speed that the eye can move? This question is important because even though the eye is normally looking at the center of the image, the eye can potentially rotate so that the fovea is aimed at image areas with lower resolution.
Let's address these questions:

1. In what angle around the center of vision should we keep the highest resolution?

Source: Wikipedia
The macula portion of the retina is responsible for fine detail. It spans the central 18˚ around the gaze point, or 9˚ eccentricity (the angular distance away from the center of gaze). This would be the best place to put the boundary of the inner layer. Fine detail is processed by cones (as opposed to rods), and at eccentricities past 9˚ you see a rapid fall off of cone density, so this makes sense biologically as well. Furthermore, the “central visual field” ends at 30˚ eccentricity, and everything past that is considered periphery. This is a logical spot to put the boundary between the middle and outermost layer for foveated rendering.

2. Is there a mid-level resolution that is best to use?  and 3. What is the drop-off in “pixel density” between central and peripheral vision? 

Some vendors such as SensoMotoric Instruments (SMI) use an inner layer at full native resolution, a middle layer at 60% resolution, and an outer layer at 20% resolution. When selecting the resolution drop-off, it is important to ensure that at the layer boundaries, the resolution is at or above the eye’s acuity at that eccentricity. At 9˚ eccentricity, acuity drops to 20% of the maximum acuity, and at 30˚ acuity drops to 7.4% of the max acuity. Given this, it appears that SMI’s values work, but are generous compared to what the eye can see.
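
To make the boundary check concrete, here is a minimal Python sketch. It uses only the numbers quoted in this post (the 9˚ and 30˚ boundaries, the 100%/60%/20% layer resolutions, and the 20% and 7.4% acuity figures). The names `LAYERS`, `ACUITY_AT_BOUNDARY` and `layer_margins` are made up for illustration and are not part of any vendor's software.

```python
# Acuity at the layer boundaries, as quoted in the text:
# eccentricity in degrees -> fraction of maximum acuity.
ACUITY_AT_BOUNDARY = {9.0: 0.20, 30.0: 0.074}

# Three-layer scheme from the text:
# (eccentricity where the layer starts, resolution as a fraction of native).
LAYERS = [
    (0.0, 1.00),   # inner layer, full native resolution
    (9.0, 0.60),   # middle layer
    (30.0, 0.20),  # outer layer
]


def layer_margins(layers, acuity):
    """Return {boundary: layer resolution / eye acuity} for each known boundary."""
    margins = {}
    for start_deg, resolution in layers:
        if start_deg in acuity:
            margins[start_deg] = resolution / acuity[start_deg]
    return margins


margins = layer_margins(LAYERS, ACUITY_AT_BOUNDARY)
for boundary, margin in sorted(margins.items()):
    status = "ok" if margin >= 1.0 else "too low"
    print(f"{boundary:.0f} deg boundary: {margin:.1f}x the eye's acuity ({status})")
```

A margin above 1 means the layer renders more detail than the eye can resolve at that boundary, which is the sense in which these values are "generous".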

4.    What is the maximum speed that the eye can move?

Source: Indiana University
A saccade is a rapid movement of the eye between fixation points. Saccade speed is determined by the distance between the current gaze and the stimulus. If the stimulus is as far as 50˚ away, then peak saccade velocity can get up to around 900˚/sec. This is important because you want the high resolution layer to be large enough so that the eye can’t move to the lower resolution portion in the time it takes to get the gaze position and render the scene. So if system latency is 20 msec, and we assume the eye can move at 900˚/sec, the eye could move 18˚ in that time, meaning you would want the inner (highest resolution) layer radius to be greater than that. However, that is only the case if the stimulus presented is 50˚ away from the current gaze.
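
The arithmetic behind the 18˚ figure is simple enough to sketch in a few lines of Python. The function name `eye_travel_deg` is hypothetical, and the sketch assumes the eye moves at peak velocity for the whole latency window, so it gives an upper bound rather than a typical value.

```python
def eye_travel_deg(peak_velocity_deg_per_s, latency_ms):
    """Upper bound on how far the gaze can move during the system latency."""
    return peak_velocity_deg_per_s * latency_ms / 1000.0


# Values from the text: 900 deg/sec peak saccade velocity, 20 msec latency.
worst_case = eye_travel_deg(900, 20)
print(f"Worst-case gaze travel: {worst_case} degrees")

# The same bound for a few other (illustrative) system latencies.
for latency_ms in (5, 10, 20, 40):
    print(f"{latency_ms} msec -> {eye_travel_deg(900, latency_ms)} degrees")
```

Lower system latency shrinks the bound, so a faster eye tracker and render pipeline allow a smaller high resolution layer.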

Additional thoughts

Source: Vision and Ocular Motility by Gunter Noorden

Visual acuity decreases on the temporal side (i.e. towards the ear) somewhat more rapidly than on the nasal side. It also decreases more sharply below and, especially, above the fovea, so that lines connecting points of equal visual acuity are elliptic, paralleling the outer margins of the visual field. Following this, it might make sense to render the different layers as ellipses rather than circles. The image shows the lines of equal visual acuity for the visual field of the left eye, so one can see that it extends farther to the left (temporal side) for the left eye; for the right eye, the visual field would extend farther to the right.

For additional reading

This paper from Microsoft research is particularly interesting. 
They approach the foveated rendering problem in a more technical way, optimizing to find layer parameters based on a simple but fundamental idea: for a given acuity falloff line, find the eccentricity layer sizes which support at least that much resolution at every eccentricity, while minimizing the total number of pixels across all layers. The paper explains their methodology, though it does not give their results for the resolution values and layer sizes.

Note: special thanks to Emma Hafermann for her research on this post

For additional VR tutorials on this blog, click here
Expert interviews and tutorials can also be found on the Sensics Insight page here

Tuesday, April 5, 2016

Time-warp Explained

In the context of virtual reality, time warp is a technique to reduce the apparent latency between head movement and the corresponding image that appears inside an HMD.

In an ideal world, the rendering engine would render an image using the measured head pose (orientation and position) immediately before the image is displayed on the screen. However, in the real world, rendering takes time, so the rendering engine uses a pose reading that was taken a few milliseconds before the image is displayed on the screen. During these few milliseconds, the head moves, so the displayed image lags a little behind the actual head pose.

Let's take a numerical example. Let's assume we need to render at 90 frames per second, so there are approximately 11 milliseconds for the rendering process of each frame. Let's assume that head tracking data is available pretty much continuously but that rendering takes 10 milliseconds. Knowing the rendering time, the rendering engine starts rendering as late as possible, which is 10 milliseconds before the frame needs to be displayed. Thus, the rendering engine uses head tracking data that is 10 milliseconds old. If the head rotates at a rate of 200 degrees/second, these 10 milliseconds are equivalent to 2 degrees. If the horizontal field of view of the HMD is 100 degrees and there are 1000 pixels across the visual field, a 2-degree error means that the image lags actual head movement by about 20 pixels.
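
The numbers in this example can be reproduced with a short Python sketch. The function name `lag_from_pose_age` is made up for illustration, and the sketch assumes pixels are spread evenly across the field of view, which is the same simplification the example uses.

```python
def lag_from_pose_age(head_velocity_deg_per_s, pose_age_ms, fov_deg, pixels_across):
    """Return (lag in degrees, lag in pixels) caused by using an old pose reading."""
    lag_deg = head_velocity_deg_per_s * pose_age_ms / 1000.0
    # Assumption: uniform pixel density across the horizontal field of view.
    pixels_per_degree = pixels_across / fov_deg
    return lag_deg, lag_deg * pixels_per_degree


frame_budget_ms = 1000.0 / 90  # time available per frame at 90 frames per second
lag_deg, lag_px = lag_from_pose_age(200, 10, fov_deg=100, pixels_across=1000)

print(f"Frame budget: {frame_budget_ms:.1f} ms")
print(f"Image lag: {lag_deg} degrees, about {lag_px:.0f} pixels")
```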

However, it turns out that even a 2 degree head rotation does not dramatically change the perspective of how the image is drawn. Thus, if there were a way to move the image by 20 pixels on the screen (i.e. 2 degrees in the example), the resultant image would be pretty much exactly what the render engine would draw if the reported head position was changed by two degrees.

That's precisely what time-warping (or "TW" for short) does: it quickly (in less than 1 millisecond) translates the image a little bit based on how much the head rotated between the time the render engine took its head rotation reading and the moment time-warping begins.

The process with time warping is fairly simple: the render engine renders and then when the render engine is done, the time-warp is quickly applied to the resultant image.

But what happens if the render engine takes more time than is available between frames? In this case, a version of time-warping called asynchronous time-warping ("ATW") is often used. ATW takes the last available frame and applies time-warping to it. If the render engine did not finish in time, the last available frame is the previous frame, and ATW applies time-warping to that one. When the previous frame is used, the head has probably rotated even more, so a greater shift is required. While not as ideal as having the render engine finish on time, ATW on the previous frame is still better than just missing a frame, which typically manifests itself in 'judder' - uneven movement on the screen. This is why ATW is sometimes referred to as a "safety net" for rendering, acting in case the render did not complete on time. The "Asynchronous" part of ATW comes from the fact that ATW is an independent process/thread from the main render engine, and runs at a higher priority than the render engine so that it can present an updated frame to the display even if the render engine did not finish on time.
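
The frame selection logic described above can be sketched in Python. This is a deliberately simplified, yaw-only illustration and not how any particular runtime implements ATW: the `Frame` class, the `select_and_warp` function and the sign convention of the shift are all assumptions made for this example, and the pixels-per-degree figure comes from the numerical example earlier in the post.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    frame_id: int
    render_yaw_deg: float  # head yaw the render engine used for this frame


def select_and_warp(latest_completed, current_yaw_deg, pixels_per_degree):
    """Warp the newest completed frame to the current head yaw.

    Returns (frame_id, horizontal shift in pixels). If the render engine
    missed its deadline, latest_completed is simply the previous frame.
    """
    yaw_delta_deg = current_yaw_deg - latest_completed.render_yaw_deg
    return latest_completed.frame_id, yaw_delta_deg * pixels_per_degree


PIXELS_PER_DEGREE = 10.0  # 1000 pixels across a 100 degree field of view

# Display refresh 1: frame 1 finished on time; the head moved 2 degrees since its pose.
frame_1 = Frame(frame_id=1, render_yaw_deg=0.0)
on_time = select_and_warp(frame_1, current_yaw_deg=2.0, pixels_per_degree=PIXELS_PER_DEGREE)

# Display refresh 2: frame 2 is late, so frame 1 is reused with a larger shift.
reused = select_and_warp(frame_1, current_yaw_deg=4.0, pixels_per_degree=PIXELS_PER_DEGREE)

print("on time:", on_time)
print("previous frame reused:", reused)
```

In a real system this selection runs on its own high-priority thread or process, and the warp is a full reprojection rather than a horizontal shift.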

Let's finish with a few finer technical points:

  • The time-warping example might lead one to believe that only left-right (i.e. yaw) head motion can be compensated. In practice, all three rotation directions - yaw, pitch and roll - can be compensated, as well as head position under some assumptions. For instance, OSVR actually performs 6-DOF warping based on an assumption of objects that are 2 meters from the center of projection. It handles rotation about the gaze direction and approximates all other translations and rotations.
  • Moving objects in the scene - such as hands - will still exhibit judder if the render engine misses a frame, in spite of time-warping. 
  • For time-warping to work well, the rendered frame needs to be somewhat bigger than the size of the display. Otherwise, when shifting the image one might end up shifting empty pixels into the visible area. Exactly how much larger the rendered frame needs to be depends on the frame rate and the expected velocity of the head rotation. Larger frames mean more pixels to render and more memory, so time warping is not completely 'free'.
  • If the image inside the HMD is rendered onto a single display (as opposed to two displays - one per eye), time warping might want to use different warping amounts for each eye because typically one eye would be drawn on screen before the other.
  • Objects such as a menu that are in "head space" (e.g. should be fixed relative to head) need to be rendered and submitted to the time-warp code separately since they should not be modified for projected head movement.
  • Predictive tracking (estimating future pose based on previous reads of orientation, position and angular/linear velocity) can help as input to the render engine, but an actual measurement is always preferable to estimation of the future pose.
  • Depending on the configuration of the HMD displays, there may be some rendering delay between left eye and right eye (for instance, if the screen is a portrait-mode screen, renders top to bottom and the left eye maps to the top part of the screen). In this case, one can use different time warp values for each eye.
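
As a rough illustration of the point about rendering a frame larger than the display, here is a Python sketch that estimates the extra margin from the frame rate and the expected head velocity. The function name `overscan_pixels_per_side` and the simple linear model (rotation during one or more frame intervals, converted to pixels at a uniform density) are assumptions for this example, not a formula taken from OSVR or any other runtime.

```python
import math


def overscan_pixels_per_side(head_velocity_deg_per_s, frame_rate_hz,
                             pixels_per_degree, frames_old=1):
    """Extra pixels needed on each side so a time-warp shift shows no empty pixels.

    frames_old is how many frame intervals may pass between the pose used
    for rendering and the moment the frame is displayed.
    """
    max_rotation_deg = head_velocity_deg_per_s * frames_old / frame_rate_hz
    return math.ceil(max_rotation_deg * pixels_per_degree)


# Values from the numerical example: 200 deg/sec, 90 frames/sec, 10 pixels/degree.
margin_on_time = overscan_pixels_per_side(200, 90, 10)
margin_reused = overscan_pixels_per_side(200, 90, 10, frames_old=2)

print(f"Margin per side, frame on time: {margin_on_time} pixels")
print(f"Margin per side, previous frame reused: {margin_reused} pixels")
```

Sizing the margin for the case where the previous frame is reused roughly doubles it, which is one concrete way in which time warping is not 'free'.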


Saturday, March 19, 2016

The temporary job of a VR bridesmaid

At GDC - the Game Developers Conference - I saw quite a lot of HTC Vive demos. Compare these two images:

See any similarities? Just like the bridesmaid carries the bride's train, the "VR bridesmaid" carries the wire for the VR user.

Low-latency wireless video links have been on the market for years. I hope to see some in consumer VR very soon, so that the "VR bridesmaid" can be a thing of the past.

Tuesday, March 15, 2016

Action Items from the OSVR Software Developer Survey

A couple of weeks ago, we surveyed the OSVR community for what they would like the core software development team to focus on. You can see the questions and answers here

Based on this input, the Sensics development team met and decided to focus on the following items:

1. Smoother end-user experience
This includes:

  • 1-click installer for both end-user and developer
  • Lightweight software that will detect when new hardware is connected and help with the process of obtaining and installing the relevant drivers.
  • A graphical configurator to help the end-user select the HMD, input and output devices as well as configure key parameters for each
2. Continue to add device support. An immediate focus is to add support for the HTC Vive so that an end-user can obtain software written on the OSVR framework and then, using a simple configurator, decide what to run it on: for instance, 1) HTC Vive; 2) Oculus + hand controller; 3) OSVR HDK and more.

3. Make it easier for the OSVR community to contribute to the platform by listing key development priorities as well as high-level directions on how to perform certain development tasks.

Keep an eye on the progress of the OSVR software platform in the coming weeks.

Thank you for your feedback!

Monday, February 15, 2016

Vision Summit 2016: Using OSVR to Support (practically) Any Device in AR/VR

I delivered this presentation last week at Vision Summit 2016.

Full video:

Slides only:

The key point in the presentation is that no one wants to write AR/VR applications that work only on one device. To put a positive frame on it, the ability of applications to work across a wide range of displays, inputs and output devices is valuable to practically everyone:

  • Content providers want their applications to be used on the widest range of possible devices.
  • Makers of VR displays, input or output devices don't want to settle for a few pieces of content that are written specifically for them; they want access to a wide range of content.
  • Consumers want 2016 applications to work on 2017 hardware without having to buy upgrades or wonder if their new hardware will ever be supported.
OSVR achieves just that - it allows runtime choice of what input, output and display devices to use and the presentation illustrates this. OSVR supports numerous devices today, with new devices being added every week.

Monday, January 25, 2016

Got a new or custom HMD? Need direct render, time warping, distortion correction and game engine integration?

Google Concept from Sensics
Let's assume you built a new HMD. You've done all the hardware work - figured out the optics and display setup, created driver boards, integrated a motion tracker and perhaps an eye tracker, worked out the ergonomics issues. It's a lot of work and you should be commended on doing it.

How do you get software support for it? Specifically, you probably want to:
- Model the field of view and allow users to correct any optical or color distortion
- Obtain support for direct render and asynchronous time warp
- Get Unity and Unreal drivers for your trackers
- Get some demo content and some cool games running on it
- and, in general, make it easy for others to support your new creation.

A solution to consider is OSVR. By integrating your new HMD into the open-source OSVR framework, you can get all that done (and more) very quickly: low-latency rendering for your HMD, distortion correction in one of several possible ways, support for many game engines, and debug and demonstration software.

OSVR already supports many devices (full list here) and based on the work team Sensics did with several HMD vendors over the last few months, we put up documentation on how to add an HMD to OSVR. You can find it here and it is part of the official OSVR developer documentation repository.

Let me know what you think and how we can make it easier to get this done.

Saturday, January 23, 2016

Snow World

As I write this, not far from Washington, DC, a major snow storm is raging outside. Forecasts call for over 20 inches (50 cm) of snow on the ground, and current accumulation is already not far from that.

With all this snow outside, I can't help but remember Snow World, a VR pain management experience that was one of the first useful non-gaming VR applications. Developed many years ago at the University of Washington by Dr. Hunter Hoffman, Dr. David Patterson and a team from Firsthand, SnowWorld is a simple game that takes place in an icy canyon where users throw snowballs at snowmen and penguins. Clinical trials have shown that burn victims who use this game report a dramatic reduction in pain.

See this report from NBC News about the experience:


The ability of VR to 'trick' the brain is being used not just for therapy, as in Snow World, but also in other areas. For instance, Redirected Walking uses subtle visual cues to make people walk in circles inside a limited space, even though they think they are walking in a straight line. This technique is being used in The Void experience today. Redirected walking allows experiencing a large virtual world in spite of being confined to a smaller physical space, helps avoid physical obstacles, and allows multiple people to be immersed in the same physical space without bumping into each other.

Similarly, the usefulness of VR in therapy is not limited to pain management but extends to other areas such as PTSD, fear of flying, fear of heights, fear of public speaking and much, much more.

VR gaming is great, but VR applications like Snow World can have a more important impact on those who need it.