Yuval Boger (VRGuy) has been doing VR since 2006. He shares his experience and views on HMDs and VR technologies.
Also, check out the VRguy podcasts where I host industry experts for deeper conversations about VR and AR.
Tuesday, August 9, 2016
Why did Sensics launch the OSVR Store?
Last week, the OSVR Store came online. It offers a range of OSVR-related products, services, accessories and components. It also contains useful information, much of it adapted from this blog.
But why did the Sensics team launch it?
The first answer that comes to mind is “to make money”. That’s an obvious reason, as Sensics is a for-profit company. We invest a lot in developing OSVR and would love to see returns on our investments.
But that’s not the only reason, nor perhaps the most important one. Here are some others.
We wanted the OSVR Store to be helpful to the VR enthusiast and hacker. That’s why we offer components: optics, tracking boards from various vendors, and IR cameras. More components are coming. Some will use those to upgrade an existing system, others to build a new one.
We wanted a place for hardware developers, a platform to market their innovations. If you make something OSVR-related, we invite you to sell it on the OSVR Store. It can be an OSVR-supported HMD. It can be an accessory or component that can help OSVR users. It can even be OSVR-related services. We strive to offer fair and simple terms. If you can build it, we can help you promote it. Drop us a note at hello@osvrstore.com to get started.
To me, OSVR has always been about choice. About democratizing VR. Not forcing users to buy everything from the same vendor. Encouraging applications to run on many devices. Supporting more than one operating system.
The OSVR Store is one more way to give everyone choice. Check it out.
Monday, August 1, 2016
OSVR - a Look Ahead
Introduction
OSVR is an open source software platform and VR goggle. Sensics and Razer launched OSVR 18 months ago with the intent of democratizing VR. We wanted to provide an open alternative to walled-garden, single-device approaches. It turns out that others share this vision. We saw exponential growth in participation in OSVR. Acer, NVIDIA, Valve, Ubisoft, Leap Motion and many others joined the ecosystem. The OSVR goggle – called the Hacker Development Kit – has seen several major hardware improvements. The founding team and many other contributors expanded the functionality of the OSVR software.
I’d like to describe how I hope to see OSVR develop given past and present industry trends.
Increased Device Diversity leads to more Choices for Customers
Trends
An avalanche of new virtual reality devices has arrived. We see goggles, motion trackers, haptics, eye trackers, motion chairs and body suits. There is no slowdown in sight: many new devices will launch in the coming months. What is common to all these devices? They need software: game engine plugins, compatible content and software utilities. For device manufacturers, this software is not a core competency but ‘a necessary evil’. Without software, these new devices are almost useless. At the same time, content providers realize it’s best not to limit their content to one device. The VR market is too small for that. The more devices you support, the larger your addressable market becomes.
With such rapid innovation, what was the best VR system six months ago is anything but that today. The dream VR system might be a goggle from one vendor, input devices from another and tracking from a third. Wait another six months and you’ll want something else. Does everything need to come from the same vendor? Maybe not. The lessons of home electronics apply to VR: you don’t need a single vendor to make all your devices.
This ‘mix and match’ ability is even more critical for enterprise customers. VR arcades, for instance, might use custom hardware or professional tracking systems. They want a software environment that is flexible and extensible. They want an environment that supports ‘off-the-shelf’ products yet extends for ‘custom’ designs.
OSVR Implications
OSVR already supports hundreds of devices. The up-to-date list is here: http://osvr.github.io/compatibility/ . Every month, device vendors, VR enthusiasts and the core OSVR team add new devices. Most OSVR plugins (extension modules) are open-sourced. Thus, it is often possible to use an existing plugin as a baseline for a new one. With every new device, we come closer to achieving universal device support.
A key OSVR goal is to create abstract device interfaces. This allows applications to work without regard to the particular device or technology choice. For example, head tracking can come from optical trackers or inertial ones. A “mix and match” approach removes the risk of single-vendor lock-in. You don’t change your word processor when you buy a new printer. Likewise, you shouldn’t have to change your applications when you get a new VR device.
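To make the idea concrete, here is a conceptual Python sketch of device abstraction. This is not the actual OSVR API; the class and method names are illustrative. The point is simply that the application consumes a generic pose interface, and any tracker that implements it can sit behind it.

```python
# Conceptual sketch only -- this is NOT the actual OSVR API.
# The application codes against a generic pose interface; any tracker
# that implements it can be swapped in without touching application code.

from abc import ABC, abstractmethod


class PoseSource(ABC):
    """Abstract head-pose source: position (x, y, z) plus an orientation quaternion."""

    @abstractmethod
    def get_pose(self):
        ...


class InertialTracker(PoseSource):
    """Hypothetical IMU-based tracker: orientation only, position held fixed."""

    def get_pose(self):
        return (0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)


class OpticalTracker(PoseSource):
    """Hypothetical camera-based tracker: full 6-DOF pose."""

    def get_pose(self):
        return (0.1, 1.6, -0.3), (0.99, 0.0, 0.14, 0.0)


def render_frame(tracker):
    position, orientation = tracker.get_pose()
    print("rendering with position", position, "orientation", orientation)


# The application does not change when the tracking hardware does.
for device in (InertialTracker(), OpticalTracker()):
    render_frame(device)
```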
We try to make it easy to add OSVR support to any device. We worked with several goggle manufacturers to create plugins for their products. Others did this work themselves. Once such a plugin is ready, customers instantly gain access to all OSVR content. Many game engines – such as Unity, Unreal and SteamVR – immediately support it.
The same is also true for input and output peripherals such as eye trackers and haptic devices. If developers use an API from one peripheral vendor, they need to learn a new API for each new device. If developers use the OSVR API, they don’t need to bother with vendor-specific interfaces.
I would love to see more enhancements to the abstract OSVR interfaces. They should reflect new capabilities, support new devices and integrate smart plugins.
More People Exposed to more VR Applications in More Places
Trends
Just a few years ago, the biggest VR-centric conference of the year had 500 attendees. Most attendees had advanced computer science degrees. My company was one of about 10 presenting vendors. Today, you can experience a VR demo at a Best Buy. You can use a VR device on a roller coaster. With a $10 investment, you can turn your phone into a simple VR device. In the past, to set up a VR system you had to be a geek with plenty of time. Now, ordinary people expect to do it with ease.
More than ever, businesses are experimenting with adopting VR. Applications that were once only dreams are becoming practical. We see entertainment, therapy, home improvement, tourism, meditation, design and many other applications.
These businesses are discovering that different applications have different hardware and software requirements. A treadmill at home is not going to survive the intensive use at a gym. Likewise, a VR device designed for home use is not suitable for a high-traffic shopping mall. The computing and packaging requirements differ from use to use. Some call for a high-end gaming PC, while others prefer inexpensive Android machines. I expect to see the full gamut of hardware platforms and a wide variety of cost and packaging options.
OSVR Implications
“Any customer can have a car painted any color that he wants so long as it is black”, said Henry Ford. I’d like to see a different approach, one that encourages variety and customization. On the hardware side, Sensics is designing many products that use OSVR components. For instance, our “Goggles for public VR” use OSVR parts in an amusement park goggle. We also help other companies use OSVR components inside their own packages. For those that want to design their own hardware, the OSVR goggle is a good reference design.
On the software side, I would like to see OSVR expand to support more platforms. I’d like to see better Mac support and more complete coverage of Android and Linux platforms. I’d like to see VR work well on mid-range PCs and not be limited to the newest graphics cards. This will lower the barriers to experiencing good VR and bring more people into the fold. I’d like to see device-specific optimizations to make the most of available capabilities. The OpenCV image processing library has optimizations for many processors. OSVR could follow a similar path.
Additionally, it is important to automate, or at least simplify, the end-user experience. Make it as close to plug-and-play as possible. The task of identifying available devices and configuring them should be quick and simple.
Simplicity is not limited to configuration. We’d like to see easier ways to choose, buy and deploy software.
Reducing Latency is Becoming Complex
Trends
Presence in VR requires low latency, and reducing latency is not easy. Low latency is also not the result of one single technique. Instead, many methods work together to achieve the desired result. Asynchronous time warp modifies the image just before sending it to the display. Predictive tracking lowers perceived latency by estimating future orientation. Direct mode bypasses the operating system. Foveated rendering reduces render complexity by understanding eye position. Render masking removes pixels from hidden areas in the image. If this sounds complex, it is just the beginning. One needs to measure optical distortion and correct it in real-time. Frame rates continue to increase, thus lowering the available time to render a frame. Engines can optimize rendering by using similarities between the left- and right-eye images. Techniques that used to be exotic are now becoming mainstream.
A handful of companies have the money and people to master all these techniques. Most other organizations prefer to focus on their core competencies. What should they do?
OSVR Implications
A key goal of OSVR is to “make hard things easy without making easy things hard”. The OSVR Render Manager exemplifies this. OSVR makes these latency-reduction methods available to everyone. We work with graphics vendors to achieve direct mode through their APIs. We work with game engines to provide native integration of OSVR into their code. I expect the OSVR community to continue to keep track of the state of the art and improve the code base. Developers using OSVR can focus away from the plumbing of rendering. OSVR will continue to allow developers to focus on great experiences.
The Peripherals are Coming
Trends
A PC is useful with a mouse and keyboard. Likewise, a goggle is useful with a head tracker. A PC is better when adding a printer, a high-quality microphone and a scanner. A goggle is better with an eye tracker, a hand controller and a haptic device. VR peripherals increase immersion and bring more senses into play. In a PC environment, there are many ways to achieve the same task. You select an option using the mouse, the keyboard, by touching the screen, or even with your voice. In VR, you can do this with a hand gesture, with a head nod or by pressing a button. Applications want to focus on what you want to do rather than how you express your wishes.
More peripherals mean more configurations. If you are in a car racing experience, you’d love to use a rumble chair if you have one. Even though rumble chairs are not commonplace, there are several types of them. Applications need to be able to sense what peripherals are available and make use of them.
Even a fundamental capability like tracking will have many variants. Maybe you have a wireless goggle that allows you to roam around. Maybe you sit in front of a desk with limited space. Maybe you have room to reach forward with your hands. Maybe you are on a train and can’t do so. Applications can’t assume just one configuration.
OSVR Implications
OSVR embeds the Virtual Reality Peripheral Network (VRPN), an established open-source library. Supporting many devices and focusing on the what, not the how, is in our DNA. I expect OSVR to continue to improve its support for new devices. We might need to enhance the generic eye tracker interface as eye trackers become more common. We will need to look for common characteristics of haptics devices. We might even be able to standardize how vendors specify optical distortion.
This is a community effort, not handed down from some elder council in an imperial palace. I would love to see working groups formed to address areas of common interest.
Turning Data into Information
Trends
A stream of XYZ hand coordinates is useful. Knowing that this stream represents a ‘figure 8’ is more useful. Smart software can turn data into higher-level information. Augmented reality tools detect objects in video feeds. Eye tracking software converts eye images into gaze direction. Hand tracking software converts hand position into gestures. Analyzing real-time data gets us closer to understanding emotion and intent. In turn, applications that make use of this information can become more compelling. A game can use gaze direction to improve the quality of interaction with a virtual character. Monitoring body vitals can help achieve the desired level of relaxation or excitement.
As users experience this enhanced interaction, they will demand more of it.
OSVR Implications
Desktop applications don’t have code to detect a mouse double-click. They rely on the operating system to convert mouse data into the double-click event. OSVR needs to provide applications with both low-level data and high-level information.In “OSVR speak”, an analysis plugin is the software that converts data into information. While early OSVR work focused on lower-level tasks, several analysis plugins are already available. For example, DAQRI integrated a plugin that detects objects in a video stream.
I expect many more plugins will become available. The open OSVR architecture opens plugin development to everyone. If you are an eye tracking expert, you can add an eye tracking plugin. If you have code that detects gestures, it is easy to connect it to OSVR. One might also expect a plugin marketplace, like an asset store, to help find and deploy plugins.
Augmenting Reality
Market trends
Most existing consumer-level devices are virtual reality devices. Google Glass has not been as successful as hoped. Magic Leap is not commercial yet. Microsoft HoloLens kits are shipping to developers, but are not priced for consumers yet. With time, augmented-reality headsets will become consumer products. AR products share many of the needs of their VR cousins. They need abstract interfaces. They need to turn data into information. They need high-performance rendering and flexible sensing.
OSVR Implications
The OSVR architecture supports AR just as it supports VR. Because AR and VR have so much in common, many components are already in place. AR devices are less likely to tether to a Windows PC, so the multi-platform and multi-OS capabilities of OSVR will be an advantage. Wherever possible, I hope to continue to see a consistent cross-platform API for OSVR. This will allow developers to tailor deployment options to customer needs.
Summary
We designed OSVR to provide universal connectivity between engines and devices. OSVR makes hard things easy so developers can focus on fantastic experiences, not plumbing. It is open so that the rate of innovation is not constrained by a single company. I expect it to be invaluable for many years to come. Please join the OSVR team and me on this exciting journey.
To learn more about our work in OSVR, please visit this page
This post was written by Yuval Boger, CEO of Sensics and co-founder of OSVR. Yuval and his team designed the OSVR software platform and built key parts of the OSVR offering.
Labels:
augmented reality,
middleware,
OSVR,
roadmap,
sensors
Thursday, July 21, 2016
Key Parameters for Optical Designs
At Sensics, we have completed many optical designs for VR over the years, and are busy these days with new ones to accommodate new displays and new sets of requirements. For those thinking about optics, here is a collection of important parameters to consider when focusing on optical systems for VR.
Field of View: typically measured in degrees, the field of view defines the horizontal, vertical and diagonal extent that can be viewed at any given point. This is often specified as a monocular (single eye) field of view, but it is also customary to specify the binocular field of view and thus the binocular overlap.
Material and type of lens: a lens is typically made from optical-grade plastic or from glass. There are hundreds of different optical-grade glass types but only about a dozen optical-grade plastic materials. Different materials provide different light-bending properties (e.g. index of refraction), so it is quite common that multi-element optical systems are made with more than one material. Glass is typically heavier and more expensive to mold, but has greater variety, provides better surface quality and is often physically harder (e.g. more resistant to scratches). Plastic is cheaper and lighter. Additional lens types and non-linear optical elements such as Fresnel lenses and polarizers are also available.
Distortion: optical distortion is one type of imperfection in an optical design. Distortion causes straight lines to not appear straight when viewed through the optics. An example of this is shown below.
Eye relief: typically measured in millimeters, the eye relief indicates the distance between the eye and the closest optical element as seen in the illustration below.
Illustration of eye relief
Regular eyeglasses have an eye relief of about 12mm.
Advantages of larger eye relief:
- If the optics are too close to the eye, they generate discomfort such as when the eyelashes touch the optics.
- If the eye relief is large enough, the system might be able to accommodate people wearing glasses without the need to provide a focusing mechanism to compensate for not having glasses.
Disadvantages of larger eye relief:
- The total depth of the optical system (distance from eye to screen) becomes larger and the overall system potentially more cumbersome.
- The minimal diameter of the first optical element is dictated by a combination of the desired field of view and eye relief. Larger eye relief requires the lens to be wider and thus likely heavier.
Comparing optical quality at a distance away from the optimal eye position
Optical distortion
Distortion is reported in percentage units. If a pixel is placed at a distance of 100 pixels (or mm or degrees or inches or whichever unit you prefer) and appears as if it is at a distance of 110, the distortion at that particular point is (110-100)/100 = 10%. During the process of optical design, distortion graphs are commonly reviewed during the iterations of the design. For instance, consider the distortion graph below for a design with a 96 degree field of view (2 x 48):
Distortion graph
The graph shows, for instance, that at 30 degrees away from the center, distortion is still about 2-3%, but at 40 degrees away from the center it increases to about 8%. The effect of distortion is sometimes shown in a distortion grid, as below. If the optical design were perfect and had no distortion, each blue cross would line up perfectly at the grid intersection points.
Distortion Grid
Sometimes, distortion is monotonic, meaning that it gradually increases as one moves towards the edge. Non-monotonic distortion can cause the appearance of a 'bubble' if not corrected.
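Going back to the percentage definition above, here is a minimal sketch of the calculation; the second example point is an illustrative value consistent with the roughly 8% read off the graph:

```python
# Minimal sketch: percent distortion at a field point, given where the point
# would ideally land and where the optics actually place it. Any unit works
# (pixels, millimeters or degrees) since the units cancel.

def percent_distortion(ideal, actual):
    return 100.0 * (actual - ideal) / ideal


print(percent_distortion(100.0, 110.0))  # 10.0 -- the 10% example above
print(percent_distortion(40.0, 43.2))    # 8.0 -- a point at 40 degrees appearing at 43.2 degrees
```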
Chromatic aberration: Just like white light breaks into various colors when passing through a prism, an optical system might behave differently for different wavelengths/colors. This could cause color breakup. It is useful to explore how much the system is 'color corrected' so as to minimize this color breakup. The image below shows a nice picture at the center of the optical system but fairly significant color breakup at the edges.
Color breakup
Relative illumination: the ability of an optical system to collect light can change throughout the image. Consider a uniformly-lit surface that is viewed through an optical system. Often, the perceived brightness at the center of the optics is the highest, and it drops as one moves towards the edges. This is numerically expressed as relative illumination, as in the graph below. While the human eye has amazing dynamic range, non-monotonic illumination can cause the appearance of dark or bright 'rings' in the image.
Relative Illumination
Spot size: imagine a screen with a pattern of tiny dots. In a perfect world, all dots would appear with the same size and no smear when looking through the optical system. In reality, the dot size typically increases as one moves away from the center. The numerical measurement of this is the spot size and diagrams indicating the spot size at different points through the optics often look something like this:
Spot Size
Other characteristics: depending on the desired use case, there are often size, weight and cost limitations that need to be considered to narrow the range of acceptable solutions to the specifications. Just like it is easier to fit a higher-degree polynomial to a set of data points because more terms provide additional degrees of freedom, it is easier to achieve a set of desired optical parameters with additional lenses (or more precisely with additional surfaces), but extra lenses often add cost, size and weight.
Putting it all together: it is practically impossible to find a car that is inexpensive, has amazing fuel efficiency, offers fantastic acceleration, seats 7 people and is very pleasing to the eye. Similarly, it is difficult to design an optical system that has no distortion, provides a wide field of view and a large eye box, costs next to nothing and is very thin. When contracting out the design of an optical system, it is useful to define all desired characteristics but specify which parameters are key and which parameters are less important.
Labels:
distortion,
eye box,
eye relief,
field of view,
optics,
spot size
Wednesday, June 22, 2016
The Three (and a half) Configurations of Eye Trackers
Eye tracking could become a critical sensor in HMDs. In previous posts such as here, here and here we discussed some of the ways that eye trackers could be useful as input devices, as ways to reduce rendering load and more.
But how are eye trackers installed inside an HMD? An appropriate placement of the eye tracking camera gives a quality image of the eye regardless of the gaze direction. If the eye image is bad, the tracking quality will be bad. It's truly a 'garbage in, garbage out' situation.
The three typical ways to install a camera are:
- Underneath the optics
- Combined with the optics via a hot mirror (or an internal reflection)
- Inside the optics.
Underneath the optics
This configuration is illustrated in the image on the right, which shows the Sensics zSight HMD with an integrated Ergoneers eye tracker. The tracker is the small camera that is visible underneath the left eyepiece. The angle at which the camera is installed is important. A camera that is perpendicular - practically looking into the eye - will typically get an excellent image. If the camera angle is steep, the anatomy of the eye - eyelids, eyelashes, inset eyes - gets in the way of getting a good image. If the eye relief (distance from cornea to first element of the optics) is small, the camera will need to be placed at a steeper angle than if the eye relief were large. If the diameter of the optics is large, the camera will need to be placed lower and thus at a steeper angle than if the diameter of the optics were smaller. If the user wears glasses, an eye tracker that is placed underneath the optics might "see" the frame of the glasses instead of the eye. Having said that, the advantage of this approach is that it does not place many constraints on the optics. Eye tracker cameras can usually be added below optics that were not designed to accommodate eye tracking.
Eye tracker that is combined with the optics
Eye tracking cameras are often infra-red cameras that look at IR light reflected off the eye. As such, eye tracking cameras don't need visible light. This allows using what is called a hot mirror: a mirror that reflects IR light yet passes visible light. Consider the optical system shown to the right (copyright Sensics). Light from the screen (right side) passes through a lens, a hot mirror and another lens and reaches the eye. In contrast, if the eye is lit by an IR light source, IR light coming back from the eye is reflected off the hot mirror towards the upper part of the optical system. If a camera is placed there, it can have an excellent view of the eye without interfering with the optical quality. This configuration also gives more flexibility with regard to the camera being used. For instance, a larger camera (perhaps with a very high frame rate) would not be feasible if placed under the optics. However, when placed separately from the optical system, such as above the mirror, it might fit. The downside of this configuration, other than the need to add the hot mirror, is that the optical system needs to leave enough room for the hot mirror, and this introduces a mechanical constraint that limits the options of the optical designer. A variation on this design (what I referred to in the title as "the half" configuration) is having the IR light reflect off one of the optical surfaces, assuming this surface is coated with an IR-reflective coating. You can see this in the configuration on the right (also copyright Sensics). An optical element is curved and the IR light reflects off it into the camera. The image received by the camera might be somewhat distorted, but since that image is processed by an algorithm, that algorithm can compensate for the image distortion. This solution removes the need for a hot mirror but does require a lens that is shaped in a way that reflects the IR light into the camera. It also requires the additional expense of an IR coating.
Eye tracker integrated with the optics
The third configuration is even simpler. A miniature camera is used. A small hole is drilled through the optics and the camera is placed through it. The angle and location of the camera are balanced between getting a good image of the eye and the need to avoid introducing a significant visual distraction. This is shown on the right as part of the eye tracking option of the Sensics dSight. This configuration gives excellent flexibility with regard to camera placement, but does introduce some visual distraction and requires careful drilling of a hole through the optics.
Thursday, June 16, 2016
Notes from the Zero Latency Free-Roam VR Gameplay
I spent this past weekend in Australia working with Sensics customer Zero Latency towards their upcoming VR deployment at SEGA's Joypolis park in Tokyo. As part of the visit, I had the chance to go through the Zero Latency "Zombie Outbreak" experience and I thought I would share some notes from it.
Zero Latency has been running this experience for quite some time and has had nearly 10,000 paying customers go through it. The experience is about 1 hour long, including about 10 minutes of pre-game briefing and equipment setup, 45 minutes of play and 5 minutes to take the equipment off and get the space ready for the next group. There are 6 customer slots per hour and everyone plays together in the same space at the same time. To date, Zero Latency has opened this to customers for about 29 hours a week - mostly on weekends - but will now be adding weeknights for a total of 40 game hours per week. A ticket costs 88 Australian dollars (about 75 US dollars) and there is typically a 6-week waiting list to get in.
The experience is located in the Zero Latency office, a converted warehouse on the north side of Melbourne, Australia. Most of the warehouse is taken up by the rectangular game space, about 15 x 25 meters (50 x 80 feet), or 375 m² (4000 sq ft) to be precise. The rest of the warehouse is used for two floors of engineering and administrative offices. One can peek through the office windows at the customers playing, and during the day you can constantly hear the shouts of excitement, squeals of joy and screams of horror coming from the game space.
I had a chance to go through the game twice: once with a group of Zero Latency employees before the space was opened to customers, and once as the 6th man of a 5-person group of paying customers late at night. Once customers come in, they are greeted by a 'game master' who provides a pre-mission briefing, explains the rules and explains the gaming gun. The gun can switch between a semi-automatic rifle and a shotgun. It has a trigger, a button to switch modes, a reload button and a pump to load bullets into the shotgun and load grenades when in rifle mode. I found the gun to be comfortable and balanced, and it seems that it has undergone many iterations before arriving at its current form. Players wear a backpack that includes a lightweight Alienware portable computer, a battery and a control box. The HMD and the gun have lighted spheres on them - reminiscent of the PlayStation Move - that are used to track the players and the weapons throughout the space. Players also wear Razer headsets that provide two-way audio so that players can easily communicate with each other as well as hear instructions from the game master.
The game starts with a few minutes of acclimation where players walk across the space to a virtual shooting range and spend a couple of minutes getting comfortable with operating their weapons. The game then starts. It is essentially a simple game - players fight their way through the space while shooting zombies and other menacing characters, some of which shoot back at you. Every few minutes, players switch scenes by going through an elevator or teleportation waypoints, circles on the ground where each of the six players has to stand before the next scene can be reached. Sometimes you fight in an urban setting, sometimes on a rooftop, inside a cafeteria and so forth. Zombies can be killed by a direct shot to the head or multiple shots to the body. The players can also be killed, but then return to the game after about 10 seconds of appearing as a 'ghost'. Game 'power ups' are sometimes found throughout the space. For instance, during my gameplay I found an AK-47 assault rifle and later a heavy machine gun. At the end of the game, each player is shown their score and ranking, where the score is calculated based on the number of kills and the number of player deaths. That score sheet is emailed to players and is available for later viewing on the Web. The graphics are fine and an attacking zombie is quite compelling when it is right in your face, but the things I truly found compelling in the game are not so much the graphics and gameplay but rather a few other things:
- Free-roam VR is great. The large space offers fantastic freedom of movement. You can see players move throughout the space, duck to take cover, turn around quickly with no hesitation at all. This generates an excellent feeling of immersion. You can truly feel that you could hide behind corners or walk anywhere with no apparent limitations. Of course, every space has physical limitations and Zero Latency has implemented a system where if you get too close to a player or a wall, something like a radar appears on your screen showing you at the center and the obstacles (players, walls) on it so that you know how to avoid them. If you get too close, the game pauses until you are farther away. This felt very natural. Throughout nearly two hours of active gameplay I think I brushed once or twice against another player but no more than that, even though players were in close proximity. Immersion is such that players don't notice people that are not players around them. In the current Zero Latency office, the bathroom for the office (the "Loo" in "Australian") is right across from the playing space so to get there you can either take a detour walking alongside the walls or go straight through the playing area where the players couldn't care less because they don't even know that you are walking by.
- The social aspect is very compelling. This game is not about 6 individuals playing separately in a space. It is about 6 players acting as a team within the space. You can definitely hear "you take the right corridor and I'll take the left", or "watch your back" or "I need some help here!" shouts from one player to another. Players that work individually have little chance to stop the zombie invasion coming from all directions, but playing together gives you that chance.
- Tracking - for both the head and the weapon - is very smooth, to the point where you don't think about it. Because multiple players are tracked in the space, you can see their avatars around you (sometimes with name tags). The graphics of players walking in the game need some work in my opinion, but you can clearly see where everyone is and what they are doing.
- 45 minutes of game play go by very quickly and the game masters control the pace very well. As you can imagine, some groups take longer than others to get to the next waypoints, and the game uses waiting for elevators or helicopters as a way to condense or extend the total time. For instance, once you arrive in the cafeteria a sign shows up that the elevator will arrive in 100 seconds. I would imagine that if a group arrived earlier, they would have to wait longer for the elevator or if a group took more time, they would wait less.
Monday, June 6, 2016
Understanding Pixel Density and Eye-Limiting Resolution
If the human eye were a digital camera, its "data sheet" would say that it has a resolution of 60 pixels/degree at the fovea (the part of the retina where visual acuity is highest). This is called eye-limiting resolution.
This means that if there were an image with 3600 pixels (60 x 60) and that image fell on a 1° x 1° area of the fovea, a person would not be able to tell it apart from an image with 8100 pixels (90 x 90) that fell on the same 1° x 1° area of the fovea.
Note 1: the 60 pixels per degree figure is sometimes expressed as "1 arc-minute per pixel". Not surprisingly, an arc-minute is an angular measurement defined as 1/60th of a degree.
Note 2: this kind of calculation is the basis for what Apple refers to as a "retina display", a screen that when held at the right distance would generate this kind of pixel density on the retina.
If you have a VR goggle, you can calculate the pixel density - how many pixels per degree it presents to the eye - by dividing the number of pixels in a horizontal display line by the horizontal field of view provided by the eyepiece. For instance, the Oculus DK1 (yes, I know that was quite a while ago) had 1280 x 800 pixels across both eyes, so 640 x 800 pixels per eye, and with a monocular horizontal field of view of about 90 degrees, it had a pixel density of 640 / 90, or just over 7 pixels/degree.
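The same division can be written as a short sketch; the resolution and field-of-view figures are the approximate ones used in the table below:

```python
# Linear pixel density = horizontal pixels per eye / horizontal field of view per eye.
# The figures are the approximate values used in the table below.

hmds = {
    "Oculus DK1":           (640, 90),
    "OSVR HDK":             (960, 90),
    "HTC VIVE":             (1080, 90),
    "Sensics dSight":       (1920, 95),
    "Sensics zSight":       (1280, 48),
    "Sensics zSight 1920":  (1920, 60),
}

for name, (pixels_per_eye, fov_degrees) in hmds.items():
    density = pixels_per_eye / fov_degrees
    print(f"{name:20s} {density:5.1f} pixels/degree")
```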
Not to pile on the DK1 (it had many good things, though resolution was not one of them), 7 pixels/degree is the linear pixel density. When you think about it in terms of pixel density per surface area, it is not just 8.5 times worse than the human eye (60 / 7 ≈ 8.5) but actually a lot worse (8.5 x 8.5, which is over 70). The table below compares pixel densities for some popular consumer and professional HMDs.
Higher pixel density for the visual system is not the same as higher pixel density for the screen because pixels on the screen are magnified through the optics. The same screen could be magnified differently with two different optical systems resulting in different pixel densities presented to the eye. It is true, though, that given the same optical system, higher pixel density of pixels on the screen does translate to higher pixel density presented to the eye.
As screens get better and better, we will get increasingly closer to eye-limiting resolution in the HMD and thus to essentially photo-realistic experiences.
| Product | Horizontal pixels per eye | Approximate Horizontal Field of View (degrees per eye) | Approximate Pixel Density (pixels/degree) |
| --- | --- | --- | --- |
| Oculus DK1 | 640 | 90 | 7.1 |
| OSVR HDK | 960 | 90 | 10.7 |
| HTC VIVE | 1080 | 90 | 12.0 |
| Sensics dSight | 1920 | 95 | 20.2 |
| Sensics zSight | 1280 | 48 | 26.6 |
| Sensics zSight 1920 | 1920 | 60 | 32.0 |
| Human fovea | | | 60.0 |
Higher pixel density allows you to see finer details - read text; see the grain of the leather on a car's dashboard; spot a target at a greater distance - and in general contributes to an increasingly realistic image.
Historically, one of the things that separated professional-grade HMDs from consumer HMDs was that the professional HMDs had higher pixel density. Let's simulate this using the following four images. Let's assume that the first image, taken from Unreal Engine's Showdown demo, is shown at full 60 pixels/degree density. We can then re-sample it at half the pixel density - simulating 30 pixels/degree - and then half again (resulting in 15 pixels/degree) and half again (7.5 pixels/degree). Notice the stark differences as we go to lower and lower pixel densities.
Full resolution (simulating 60 pixels/degree)
Half resolution (simulating 30 pixels/degree)
Simulating 15 pixels/degree
Simulating 7.5 pixels/degree
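For those who want to reproduce this comparison with their own screenshots, here is a rough sketch using the Pillow library (the file name, the library choice and the resampling filters are assumptions, not how the images above were produced):

```python
# Rough sketch of the simulation above: repeatedly halving the pixel density.
# Downsample by a factor, then scale back up with nearest-neighbor resampling
# so the lost detail stays visible at the original size.
# Assumes Pillow is installed; "showdown.png" is a placeholder for your own screenshot.

from PIL import Image

source = Image.open("showdown.png")

for factor, label in [(1, "60"), (2, "30"), (4, "15"), (8, "7.5")]:
    small = source.resize((source.width // factor, source.height // factor),
                          resample=Image.BILINEAR)
    preview = small.resize(source.size, resample=Image.NEAREST)
    preview.save(f"simulated_{label}_ppd.png")
```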
Labels:
display,
eye limiting,
pixel density,
resolution,
tutorial
Tuesday, May 31, 2016
How binocular overlap impacts horizontal field of view
In a previous post, we discussed binocular overlap and how partial overlap increases the overall horizontal (and diagonal) field of view. HMD manufacturers sometimes create partially overlapped systems (i.e. overlap of less than 100%) to increase the overall horizontal field of view.
For example, imagine an eyepiece that provides a 90 degree horizontal field of view that subtends from 45° to the left to 45° to the right. If both left and right eyepieces point at the same angle, the overall horizontal field of view of the goggles is also from 45° to the left to 45° to the right, so a total of 90 degrees. When both eyepieces cover the same angles, as in this example, we call this 100% overlap.
But now let's assume that the left eyepiece is rotated a bit to the left so that it subtends from 50° to the left and 40° to the right. The monocular field of view is unchanged at 90°. If the right eye is symmetrically moved, it now covers from 40° to the left to 50° to the right. In this case, the binocular (overall) horizontal field of view is 100°, so a bit larger than in the 100% case, and the overlap is 80° (40° to the left to 40° to the right) or 80/90 = 88.9%.
The following tables provide a useful reference to see how the percentage of binocular overlap impacts the horizontal (and thus also the diagonal) field of view. We provide two tables, one for displays with a 16:9 aspect ratio (such as 2560x1440 or 1920x1080) and the other for a 9:10 aspect ratio (such as the 1080x1200 display in the HTC VIVE). Click on them to see a larger version.
For instance, if we look at the 16:9 table we can read through an example of a 90° diagonal field of view, which would translate into 82.1° horizontal and 52.2° vertical if the entire screen was visible. Going down the table we can see that at 100% overlap, the binocular horizontal field of view remains the same, i.e. 82.1°, and the diagonal also remains the same. However, if we chose 80% binocular overlap, the binocular horizontal field of view grows to 98.6°, the vertical stays the same and the diagonal grows to 103.2°.
For those interested, the exact math is below:
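Here is one way to compute these tables as a sketch, under the usual flat-screen (tangent) model; the function names and the model choice are my own, but the 90° diagonal, 16:9, 80% overlap example reproduces the values quoted above:

```python
# Sketch of the overlap math under a flat-screen (tangent) model.
# Assumptions: each eyepiece shows the full screen, each is rotated outward by
# half of the non-overlapping angle, and the combined diagonal is quoted by
# recombining the binocular horizontal with the unchanged vertical.

import math


def monocular_fov(diagonal_deg, aspect_w, aspect_h):
    """Horizontal and vertical FOV (degrees) from a diagonal FOV and aspect ratio."""
    t_d = math.tan(math.radians(diagonal_deg / 2))
    norm = math.hypot(aspect_w, aspect_h)
    h = 2 * math.degrees(math.atan(t_d * aspect_w / norm))
    v = 2 * math.degrees(math.atan(t_d * aspect_h / norm))
    return h, v


def binocular_fov(h_mono_deg, v_mono_deg, overlap_fraction):
    """Binocular horizontal and diagonal FOV for an overlap fraction between 0 and 1."""
    h_bino = h_mono_deg * (2 - overlap_fraction)
    t_h = math.tan(math.radians(h_bino / 2))
    t_v = math.tan(math.radians(v_mono_deg / 2))
    d_bino = 2 * math.degrees(math.atan(math.hypot(t_h, t_v)))
    return h_bino, d_bino


# The 16:9, 90-degree-diagonal example from the text:
h, v = monocular_fov(90, 16, 9)      # about 82.1 and 52.2 degrees
print(binocular_fov(h, v, 1.0))      # 100% overlap: about 82.1 horizontal, 90.0 diagonal
print(binocular_fov(h, v, 0.8))      # 80% overlap:  about 98.6 horizontal, 103.2 diagonal
```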
Sunday, May 8, 2016
Understanding Predictive Tracking
Image source: Adrian Boeing blog
Why is predictive tracking useful?
One common use of predictive tracking is to reduce the apparent "motion to photon" latency, meaning the time between movement and when that movement is reflected in the drawn scene. Since there is some delay between movement and an updated display (more on the sources of that delay below), using an estimated future orientation and position to update the display can shorten that perceived latency. While a lot of attention has been focused on predictive tracking in virtual reality applications, it is also very important in augmented reality. For instance, if you are displaying a graphical overlay that should appear on top of a physical object seen through augmented reality goggles, it is important that the overlay stays on the object even when you rotate your head. The object might be recognized with a camera, but it takes time for the camera to capture the frame, for a processor to determine where the object is in the frame and for a graphics chip to render the new overlay. By using predictive tracking, you can get better apparent registration between the overlay and the physical object.
How does it work?
If you saw a car travelling at a constant speed and you wanted to predict where that car would be one second in the future, you could probably make a fairly accurate prediction. You know the current position of the car, you might know (or can estimate) the current velocity, and thus you can extrapolate the position into the near future. Of course, if you compare your prediction with where the car actually is in one second, your prediction is unlikely to be 100% accurate every time: the car might change direction or speed during that time. The farther out you are trying to predict, the less accurate your prediction will be: predicting where the car will be in one second is likely much more accurate than predicting where it will be in one minute.
The more you know about the car and its behavior, the better chance you have of making an accurate prediction. For instance, if you were able to measure not only the velocity but also the acceleration, you can make a more accurate prediction.
If you have additional information about the behavior of the tracked body, this can also improve prediction accuracy. For instance, when doing head tracking, understanding how fast the head can possibly rotate and what common rotation speeds are can improve the tracking model. Similarly, if you are doing eye tracking, you can use the eye tracking information to anticipate head movements, as discussed in this post.
Sources of latency
The desire to perform predictive tracking comes from having some latency between actual movement and displaying an image that reflects that movement. Latency can come from multiple sources, such as:
- Sensing delays. The sensors (e.g. gyroscope) may be bandwidth-limited and do not instantaneously report orientation or position changes. Similarly, camera-based sensors may exhibit delay between when the pixel on the camera sensor receives light from the tracked object to that frame being ready to be sent to the host processor.
- Processing delays. Sensors are often combined using some kind of sensor fusion algorithm, and executing this algorithm can add latency.
- Data smoothing. Sensor data is sometimes noisy and to avoid erroneous jitter, software or hardware-based low pass algorithms are executed.
- Transmission delays. For example, if orientation sensing is done using a USB-connected device, there is some non-zero time between the data being ready to be read by the host processor and the completion of the data transfer over USB.
- Rendering delays. When rendering a non-trivial scene, it takes some time to have the image ready to be sent to the display device.
- Frame rate delays. If a display is operating at 100 Hz, for instance, there is a 10 mSec time between successive frames. Information that is not precisely current to when a particular pixel is drawn may need to wait until the next time that pixel is drawn on the display.
Some of these delays are very small, but unfortunately all of them add up and predictive tracking, along with other techniques such as time warping, are helpful in reducing the apparent latency.
How much to track into the future?
In two words: it depends. You will want to estimate the end-to-end latency of your system as a starting point and then optimize from there.
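As a toy example of such an estimate (every number below is illustrative, not a measurement), the individual delays simply add up into a prediction horizon:

```python
# Toy latency budget (all numbers are illustrative, in milliseconds).
# The total is a starting point for how far ahead the predictor should look.

latency_sources_ms = {
    "sensing": 2.0,
    "sensor fusion": 1.0,
    "smoothing": 1.0,
    "USB transfer": 1.0,
    "rendering": 8.0,
    "frame wait": 5.0,  # average wait at 100 Hz is half of the 10 mSec frame time
}

prediction_horizon_ms = sum(latency_sources_ms.values())
print(f"predict roughly {prediction_horizon_ms:.0f} mSec into the future")  # about 18 mSec
```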
It may be that you will need to predict several timepoints into the future at any given time. Here are some examples why this may be required:
- There are objects with different end-to-end delays. For instance, a hand tracked with a camera may have a different latency than a head tracker, but both need to be drawn in sync in the same scene, so predictive tracking with different 'look ahead' times will be used.
- In configurations where a single screen - such as a cell phone screen - is used to provide imagery to both eyes, it is often the case that the image for one eye appears with a delay of half a frame (e.g. half of 1/60 seconds, or approx 8 mSec) relative to the other eye. In this case, it is best to use predictive tracking that looks ahead 8 mSec more for that delayed half of the screen.
Common prediction algorithms
Here is some sampling of predictive tracking algorithms:
- Dead reckoning. This is a very simple algorithm: if the position and velocity (or angular position and angular velocity) are known at a given time, the prediction assumes that the last known position and velocity are correct and that the velocity remains the same. For instance, if the last known position is 100 units and the last known velocity is 10 units/sec, then the predicted position 10 mSec (0.01 seconds) into the future is 100 + 10 x 0.01 = 100.1. While this is very simple to compute, it assumes that the last position and velocity are accurate (e.g. not subject to any measurement noise) and that the velocity is constant. Both these assumptions are often incorrect.
- Kalman predictor. This is based on a popular Kalman filter that is used to reduce sensor noise in systems where there exists a mathematical model of the system's operation. See here for more detailed explanation of the Kalman filter.
- Alpha-beta-gamma. The ABG predictor is closely related to the Kalman predictor, but is less general and has simpler math, which we can explain here at a high level. ABG tries to continuously estimate both velocity and acceleration and use them in prediction. Because the estimates take into account actual data, they provide some measurement noise reduction. Configuring the parameters (alpha, beta and gamma) provides the ability to emphasize responsiveness as opposed to noise reduction. If you'd like to follow the math, here it goes:
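A minimal sketch of a standard alpha-beta-gamma predictor follows; the gains below are illustrative, near-critically-damped values, not tuned for any particular tracker:

```python
# Minimal alpha-beta-gamma predictor (standard update equations).
# The gains below are illustrative, near-critically-damped values, not tuned ones.

class AlphaBetaGamma:
    def __init__(self, alpha=0.85, beta=0.5, gamma=0.05, dt=0.01):
        self.alpha, self.beta, self.gamma, self.dt = alpha, beta, gamma, dt
        self.x = 0.0  # estimated position (or angle)
        self.v = 0.0  # estimated velocity
        self.a = 0.0  # estimated acceleration

    def update(self, measurement):
        dt = self.dt
        # Predict the state forward by one sample.
        x_pred = self.x + self.v * dt + 0.5 * self.a * dt * dt
        v_pred = self.v + self.a * dt
        # Correct with the measurement residual.
        residual = measurement - x_pred
        self.x = x_pred + self.alpha * residual
        self.v = v_pred + self.beta * residual / dt
        self.a = self.a + 2.0 * self.gamma * residual / (dt * dt)

    def predict(self, look_ahead):
        """Extrapolate the current estimate look_ahead seconds into the future."""
        return self.x + self.v * look_ahead + 0.5 * self.a * look_ahead * look_ahead


# Feed samples of an object moving at 10 units/sec, then look 20 mSec ahead.
predictor = AlphaBetaGamma()
for i in range(100):
    predictor.update(10.0 * i * 0.01)  # noiseless here; real sensor data would be noisy
print(predictor.predict(0.020))        # close to 10.1 (last sample 9.9, plus 10 * 0.02)
```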
Summary
Predictive tracking is a useful and commonly-used technique for reducing apparent latency. It offers simple or sophisticated implementations, requires some thought and analysis, but it is well worth it.
Saturday, April 30, 2016
VR and AR in 12 variations
I've been thinking about how to classify VR and AR headsets and am starting to look at them along three dimensions (no pun intended):
- VR vs AR
- PC-powered vs. Phone-powered vs. Self-powered. This looks at where the processing and video generation come from. Is it connected to a PC? Is it using a standard phone? Or does it embed processing inside the headset?
- Wide field of view vs. Narrow FOV
This generates a total of 2 x 3 x 2 = 12 options, as follows:
Looking forward to feedback and comments.
Monday, April 11, 2016
Understanding Foveated Rendering
Foveated rendering is a rendering technique that takes advantage of the fact that the resolution of the eye is highest in the fovea (the central vision area) and lower in the peripheral areas. As a result, if one can sense the gaze direction (with an eye tracker), GPU computational load can be reduced by rendering an image that has higher resolution at the direction of gaze and lower resolution elsewhere.
The challenge in turning this from theory to reality is to find the optimal function and parameters that maximally reduce GPU computation while maintaining highest quality visual experience. If done well, the user shouldn’t be able to tell that foveated rendering is being used. The main questions to address are:
- In what angle around the center of vision should we keep the highest resolution?
- Is there a mid-level resolution that is best to use?
- What is the drop-off in “pixel density” between central and peripheral vision?
- What is the maximum speed that the eye can move? This question is important because even though the eye is normally looking at the center of the image, the eye can potentially rotate so that the fovea is aimed at image areas with lower resolution.
1. In what angle around the center of vision should we keep the highest resolution?
Source: Wikipedia
The macula portion of the retina is responsible for fine detail. It spans the central 18˚ around the gaze point, or 9˚ eccentricity (the angular distance away from the center of gaze). This would be the best place to put the boundary of the inner layer. Fine detail is processed by cones (as opposed to rods), and at eccentricities past 9˚ you see a rapid fall off of cone density, so this makes sense biologically as well. Furthermore, the “central visual field” ends at 30˚ eccentricity, and everything past that is considered periphery. This is a logical spot to put the boundary between the middle and outermost layer for foveated rendering.
2. Is there a mid-level resolution that is best to use? and 3. What is the drop-off in “pixel density” between central and peripheral vision?
Some vendors such as Sensomotoric Instruments (SMI) use an inner layer at full native resolution, a middle layer at 60% resolution, and an outer layer at 20% resolution. When selecting the resolution dropoff, it is important to ensure that at the layer boundaries, the resolution is at or above the eye’s acuity at that eccentricity. At 9˚ eccentricity, acuity drops to 20% of the maximum acuity, and at 30˚ acuity drops to 7.4% of the max acuity. Given this, it appears that SMI’s values work, but are generous compared to what the eye can see.
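As a rough check of these numbers, the sketch below compares each layer's resolution against an approximate acuity-falloff model (relative acuity ≈ e2 / (e2 + eccentricity) with e2 ≈ 2.3°, an assumption that roughly reproduces the 20% and ~7% figures above); the layer values are the SMI-style ones just mentioned:

```python
# Rough check of the layer values against an approximate acuity-falloff model.
# Model (assumption): relative acuity ~ e2 / (e2 + eccentricity), with e2 ~ 2.3 degrees,
# which roughly reproduces the 20% (at 9 degrees) and ~7% (at 30 degrees) figures above.

E2 = 2.3  # degrees; approximate eccentricity at which acuity halves


def relative_acuity(eccentricity_deg):
    return E2 / (E2 + eccentricity_deg)


# (layer, eccentricity where the layer begins, relative resolution) -- SMI-style values.
layers = [
    ("inner", 0.0, 1.00),
    ("middle", 9.0, 0.60),
    ("outer", 30.0, 0.20),
]

for name, start_deg, resolution in layers:
    required = relative_acuity(start_deg)
    verdict = "ok" if resolution >= required else "too low"
    print(f"{name:6s} starts at {start_deg:4.1f} deg: needs >= {required:.2f}, "
          f"provides {resolution:.2f} -> {verdict}")
```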
4. What is the maximum speed that the eye can move?
Source: Indiana University
A saccade is a rapid movement of the eye between fixation points. Saccade speed is determined by the distance between the current gaze and the stimulus. If the stimulus is as far as 50˚ away, then peak saccade velocity can get up to around 900˚/sec. This is important because you want the high resolution layer to be large enough so that the eye can't move to the lower resolution portion in the time it takes to get the gaze position and render the scene. So if system latency is 20 msec, and we assume the eye can move at 900˚/sec, the eye could move 18˚ in that time, meaning you would want the inner (highest resolution) layer radius to be greater than that – but that is only if the stimulus presented is 50˚ away from the current gaze.
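The same worst-case calculation as a tiny sketch, using the illustrative numbers above:

```python
# Worst-case eye travel during one latency interval, using the numbers above.

peak_saccade_velocity = 900.0  # degrees/second, for a stimulus about 50 degrees away
system_latency = 0.020         # seconds, from gaze sample to displayed frame

eye_travel = peak_saccade_velocity * system_latency
print(f"the eye can travel up to {eye_travel:.0f} degrees before the display catches up")
# -> 18 degrees, so the full-resolution layer radius should exceed this in that worst case.
```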
Additional thoughts
Source: Vision and Ocular Motility by Gunter Noorden
Visual acuity decreases on the temporal side (e.g. towards the ear) somewhat more rapidly than on the nasal side. It also decreases more sharply below and, especially, above the fovea, so that lines connecting points of equal visual acuity are elliptic, paralleling the outer margins of the visual field. Following this, it might make sense to render the different layers as ellipses rather than circles. The image shows the lines of equal visual acuity for the visual field of the left eye - one can see that it extends farther to the left (temporal side) for the left eye; for the right eye, the visual field would extend farther to the right.
For additional reading
This paper from Microsoft Research is particularly interesting. They approach the foveated rendering problem in a more technical way - optimizing to find layer parameters based on a simple but fundamental idea: for a given acuity falloff line, find the eccentricity layer sizes which support at least that much resolution at every eccentricity, while minimizing the total number of pixels across all layers. It explains their methodology though does not give their results for the resolution values and layer sizes.
Note: special thanks to Emma Hafermann for her research on this post
For additional VR tutorials on this blog, click here
Expert interviews and tutorials can also be found on the Sensics Insight page here