What is AR, VR, MR, XR, 360?
Virtual Reality (VR) started to make its way into mainstream conversations a couple years ago and the industry is advancing quickly. Keeping up with the terms and acronyms can be daunting. To help keep you in the loop, we’ve created a glossary of terms extending across the immersive technology spectrum from AR to XR.
360 Video

Definition:

Frequently called "spherical videos" or "immersive videos", 360 videos are video recordings where a view in multiple directions is recorded simultaneously. They are typically shot using a specialist omnidirectional camera, or a collection of separate, connected cameras mounted as a spherical array. 360 videos can be live action (cinematography or videography that does not use animation), animated (captured from a 3D scene), or a mix of computer-generated graphics and live action. After being prepared for display via a technology such as a 3D game engine, 360 videos are then viewed by the user in a headset.

360 videos can be non-interactive or interactive. Non-interactive 360 videos are experiences where the viewer cannot influence the viewing experience outside of perhaps pausing the video or moving their head to take in different "camera angles". Interactive 360 videos are experiences where the viewer can interact with the UI or other interactable elements using gaze or a controller.

The opportunity:

360 video is an opportunity for creators to work with a variety of industries now wanting to provide content in a marketing or entertainment form. While parts of 360 video production are distinct from building content from digital assets, the post-production process is broadly comparable to creating games and other digital MR content.

Ambisonic Audio

Definition:

This surround sound technique covers sound sources both below and above the user. Officially a "full sphere" technique, it also serves audio sources positioned on the horizontal plane. Ambisonics are stored in a multi-channel format. Instead of each channel mapped to a specific speaker, ambisonics instead represent the sound field in a more general way. The sound field can then be rotated based on the listener’s orientation – for example, the user’s head rotation in XR. The sound field can also be decoded into a format that matches the speaker setup. Ambisonics are commonly paired with 360-degree videos and used as an audio skybox for distant ambient sounds.

The opportunity:

While ambisonic audio does potentially mean more expense – both in memory and production budgets – it grants your VR experience a fully immersive soundscape. Audio design and production are more important for VR than previous display methods, and "3D sound" will make most virtual reality experiences all the more convincing and immersive.

Anti-Aliasing

Definition:

At its most fundamental level, anti-aliasing is a technique that smooths the jagged lines on the edges of three-dimensional assets. The approach smooths the color of an edge with the color of pixels immediately surrounding it. Anti-aliasing is particularly crucial in VR, where the jagged edges can undermine immersion and presence.

The opportunity:

Anti-aliasing provides a straightforward and well-established means through which to improve visual fidelity in 3D virtual content. 3D engines like Unity allow developers using forward rendering to enable multisample anti-aliasing in many cases. While deferred rendering does not allow for multisample anti-aliasing, in those cases developers can opt to apply anti-aliasing as a post-effect.
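
As a rough illustration, here is how MSAA might be switched on from a Unity C# script, assuming the project uses forward rendering (the class name is illustrative):

```csharp
using UnityEngine;

// Minimal sketch: enable 4x multisample anti-aliasing at runtime in Unity.
// MSAA only applies with forward rendering; with deferred rendering,
// anti-aliasing would instead be applied as a post-effect.
public class EnableMsaa : MonoBehaviour
{
    void Start()
    {
        // Valid values are 0 (off), 2, 4, and 8 samples per pixel.
        QualitySettings.antiAliasing = 4;
    }
}
```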

API (Application Programming Interface)

Definition:

An API, or “Application Programming Interface”, is a common concept in software development, found throughout VR and AR content development. In essence, it is a standardized interface that lets software connect with an operating system and make use of its resources. APIs are not visible to the user of a VR or AR experience.

The opportunity:

Accessing and harnessing an operating system’s resources and potential is simpler, standardized and more efficient.

ARCore

Definition:

A software-only solution for AR that works on all Android phones running Android Nougat or any subsequent version of the OS. It enables mobile AR experiences at scale, much as ARKit does for iOS; in fact, the SDK offers similar functionality to ARKit. Google has yet to decide whether ARCore will be housed within Android or offered as a standalone product, but has confirmed it will not be part of the Daydream brand.

The opportunity:

If ARCore is as successful as Google hopes, then it gives a vast audience an accessible AR platform. That, of course, means a massive audience for your AR content.

ARCore SDK for Unity

Definition:

The software development kit that enables development for AR apps targeting Android devices and ARCore.

The opportunity:

A convenient and efficient way to make content for ARCore devices.

ARKit

Definition:

A framework that allows you to create and launch augmented reality experiences for iPhone and iPad.

The opportunity:

An accessible means through which to take AR experiences to the sizable iOS audience.

ARKit plugin

Definition:

The Unity software package that enables development of apps targeting ARKit for iOS.

The opportunity:

More accessible, high-quality AR development for iOS platforms.

AR Light Estimation

Definition:

Information – based on calculated approximations – about any scene lighting associated with captured video frames from an AR session.

The opportunity:

AR Light Estimation gives you an opportunity to make sure that virtual objects rendered on top of the camera feed look like they belong in the environment, which is essential for immersion.
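
As a hedged sketch of how this data might be applied – assuming Unity's AR Foundation package, with the component and field names here purely illustrative – the estimated brightness of each camera frame can drive the light that illuminates your virtual objects:

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

// Sketch: drive a directional light from the per-frame brightness estimate
// supplied by the AR session, so virtual objects roughly match the room.
public class LightEstimationDriver : MonoBehaviour
{
    public ARCameraManager cameraManager;  // reference to the AR camera manager
    public Light sceneLight;               // the light used for virtual objects

    void OnEnable()  { cameraManager.frameReceived += OnFrame; }
    void OnDisable() { cameraManager.frameReceived -= OnFrame; }

    void OnFrame(ARCameraFrameEventArgs args)
    {
        if (args.lightEstimation.averageBrightness.HasValue)
            sceneLight.intensity = args.lightEstimation.averageBrightness.Value;
    }
}
```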

Audio Spatializer

Definition:

A feature that changes the way audio is transmitted from an audio source into the surrounding space. A plugin of this kind takes the source and regulates the gains of the left and right ear contributions; in a 3D engine like Unity the calculation is based on the distance and angle between the AudioListener and the AudioSource.
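
A minimal Unity C# sketch of that setup – the class name is illustrative – marks an AudioSource as fully 3D and routes it through whichever spatializer plugin the project has selected:

```csharp
using UnityEngine;

// Sketch: configure an AudioSource so the selected spatializer plugin
// positions its sound in 3D relative to the AudioListener.
[RequireComponent(typeof(AudioSource))]
public class SpatializedSource : MonoBehaviour
{
    void Start()
    {
        var source = GetComponent<AudioSource>();
        source.spatialBlend = 1.0f; // fully 3D: gains depend on distance and angle to the listener
        source.spatialize = true;   // hand the source to the spatializer plugin
        source.Play();
    }
}
```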

The opportunity:

More convincing, immersive sound that complements the 3D nature of your VR content.

Audio Spatializer SDK

Definition:

An extension of the native audio plugin SDK that allows changing the way audio is transmitted from an audio source into the surrounding space. The built-in panning of audio sources is a simple form of spatial audio – it takes the source and regulates the gains of the left and right ear contributions based on the distance and angle between the AudioListener and the AudioSource. This provides simple directional cues for the player on the horizontal plane.

The opportunity:

This SDK offers a simple, undemanding and efficient means through which to implement the potential offered by audio spatializer features.

Augmented Reality (AR)

Definition:

Augmented reality is the overlaying of digitally-created content on top of the real world. Augmented reality – or 'AR' – allows the user to interact with both the real world and digital elements or augmentations. AR can be offered to users via headsets like Microsoft’s HoloLens, or through the video camera of a smartphone.

In both practical and experimental implementations, augmented reality can also replace or diminish the user’s perception of reality. This altered perception could include simulation of an ocular condition for medical training purposes, or gradual obstruction of reality to introduce a game world. It is worth noting that there is a point when augmented reality and virtual reality likely merge, or overlap. See also, Mixed Reality in this glossary.

The opportunity:

While much of the consumer interest, investment activity and industry hype first focused on virtual reality, augmented reality is becoming increasingly prominent thanks to its lack of dedicated hardware. The accessibility of AR – it does not completely restrict the user's vision – along with the vast potential of untethered usage has increased its popularity. As demonstrated by the phenomenal success of Pokémon GO – as well as AR's rapid uptake as a tool in industrial and creative workplaces – AR has an opportunity to enjoy substantial success reaching large audiences. For more insights into the opportunity AR presents, check out part one and part two of Unity's blog exploring the future of augmented reality.

Augmented Virtuality

Definition:

On the mixed reality continuum, augmented virtuality lies somewhere between AR and VR. The precise definition refers to bringing real-world objects into virtual worlds where they can be interacted with. It could be seen as a reversal of – or mirror to – what augmented reality is.

Augmented virtuality is perhaps best understood as a specific example or implementation of MR. Use of “augmented virtuality” is imprecise, so consider “augmented virtuality” to be flexible terminology.

The opportunity:

Augmented virtuality presents a means through which to make VR spaces more intuitive from a UI perspective, as well as more familiar and “friendly” to new users.

Cinematic VR

Definition:

VR provides tremendous potential to filmmakers and audiences, offering a new way to deliver stories, harnessing all the immersive potential of VR, and pulling on the power of presence. There are many distinct examples of Cinematic VR, from linear narratives in which the viewer can participate to branching stories and “films” with gameplay-like elements. While there are different interpretations of the term, Cinematic VR essentially covers the many approaches where virtual reality content appropriates or employs filmmaking methods to deliver narrative experiences.

The opportunity:

If you are a filmmaker, there is a revolution in creativity underway. If you are a viewer, film is about to get a great deal more varied and exciting. And if you make content for VR, such as games, Cinematic VR may open the doors for you to spread your wings and create for new industries.

CPU (Central Processing Unit)

Definition:

The central processing unit can be seen as the central component of a modern computer. The CPU's job is to carry out the instructions provided by a computer program. Today, CPUs are typically microprocessors, meaning they are built from a single integrated circuit.

The opportunity:

Using a game engine’s profiler, developers can see how much rendering demand is being put on the CPU. Understanding this data, you can optimize areas of VR content to ensure a better, more comfortable experience for users.

Cyber Sickness (AKA Virtual Reality Sickness or Simulation Sickness)

Definition:

Motion sickness – often felt on a long car journey or plane flight – happens when people are moving through physical space while their brain understands that they are stationary, as their body is not contributing to momentum. Cyber sickness, by contrast, happens when the subject is stationary but has a compelling sense of motion induced through exposure to changing visual imagery. (Arns and Cerney, 2005)

The feeling of cyber sickness is, however, comparable to the experience of motion sickness.

There is no single factor that causes cyber sickness. For example, elements like lag, refresh rate, and update rate of the visual display can each cause sickness. Other factors that may influence sickness are contrast, resolution, color, field of view, viewing region, binocular viewing, scene content, flicker and camera movement.

Early on in the current generation of VR, simulation sickness was deemed to be more common, and it continues to be a source of negative associations with VR for many users. It is now generally accepted that most of the onus for preventing cyber sickness falls on the content rather than the hardware. Many believe users can build up a tolerance to cyber sickness through usage. Much remains to be learned about the experience, particularly its effect on younger users.

The opportunity:

Cyber sickness presents a significant challenge, both to individual projects and to the overall reputation and potential of VR. Content that prompts cyber sickness can severely limit adoption and damage the reputation of the medium. Do your research. And lots of testing. Fortunately, best practices for avoiding cyber sickness are now shared across the industry.

Direct3D Transformation Pipeline

Definition:

The Direct3D Transformation Pipeline is a graphics transformation pipeline specific to the Direct3D graphics API for Microsoft Windows. This implementation uses three Direct3D matrices: world transform, view transform, and projection transform. Direct3D matrices work like those seen in high-level graphics transformation pipelines.

The opportunity:

A graphics transformation pipeline tailored for those who work with Direct3D.

Eye Tracking

Definition:

Cameras inside the head-mounted display can track which direction the user is looking. Eye tracking can be used as a new input axis – for targeting enemy aircraft in a dogfighting game, for example. The FOVE, an HMD launched on Kickstarter, promises eye tracking capabilities and a foveated rendering SDK.

While eye tracking is not a prerequisite for foveated rendering, it can deliver significant improvement by shifting the high-detail region based on the user's eye direction. Furthermore, new users tend to have difficulty overcoming the natural inclination to look around with their eyes. The problem is that HMD optics tend to work best when the user looks straight through them to the center of the screen; ideally, the user moves their head to look around. Eye tracking is the first step toward allowing users to look around naturally with their eyes within VR.

The opportunity:

A way to deliver more comfortable, intuitive VR content that is more immersive.

Face Tracking

Definition:

Computer vision technology designed to obtain data from still images and video sequences by tracking certain facial features in real-time.

The opportunity:

More convincing and natural in-game characters and interactions, empowering storytelling, immersion and presence, while providing the potential for innovative new mechanisms of interaction.

Field-of-Regard

Definition:

Related to field-of-view, field-of-regard covers the space a user can see from a given position, including when moving eyes, head, and neck.

The opportunity:

Along with field-of-view, field-of-regard is the viewer perspective on which the cinematography or framing of a given VR, AR, or MR experience is established.

Field-of-View (FOV)

Definition:

The field-of-view is all that you can see while looking straight ahead. FOV is the extent of your natural vision, both in reality and in XR content. The average human field-of-view is approximately 200 degrees.

When researching virtual reality headsets – also known as head-mounted displays or HMDs – you will see that there is a specification for field-of-view. Most current VR headsets have a minimum field of view of 90-to-110 degrees, which is a baseline for a great VR experience. The higher the field of view, the more of the environment the user will see as it will extend to the edge of their vision, and as a result, the more immersive an experience they will have. This is similar to the difference between an IMAX movie theater screen and a regular movie theater screen: the IMAX screen is much larger and therefore takes up more of your field-of-view, which allows you to see more, creating a more immersive experience.

A wide field-of-view is difficult to achieve because the limitations of lens optics – chromatic aberration and barrel distortion – become more severe, and the optics themselves have to be larger or more complex. Like a photograph taken with a fisheye lens, the images on the HMD screen are distorted to account for the optics of the HMD. Furthermore, widening the field-of-view “stretches” the available screen resolution, meaning that resolution must increase to keep the same pixel densities at higher FOV angles – the potential impact can be lessened by the use of multi-res VR shading and foveated rendering.
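
A back-of-envelope calculation makes that trade-off concrete. This is a simplification that ignores lens distortion, and the numbers are examples rather than the specification of any particular headset:

```csharp
using System;

// Sketch: approximate angular pixel density for one eye. Stretching the same
// panel resolution across a wider FOV lowers the pixels available per degree.
static class FovDensity
{
    static float PixelsPerDegree(int pixelsPerEye, float fovDegrees) => pixelsPerEye / fovDegrees;

    static void Main()
    {
        Console.WriteLine(PixelsPerDegree(1080, 100f)); // ≈ 10.8 pixels per degree
        Console.WriteLine(PixelsPerDegree(1080, 140f)); // ≈ 7.7 pixels per degree at the wider FOV
    }
}
```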

It’s also worth noting that some headsets – such as the HoloLens – also present a limited field-of-view. One could understand the field-of-view of a smartphone AR experience to be the available screen size, though this is not a strict technical definition.

In rare cases, field-of-view is referred to as field-of-vision. See also: Field-of-Regard.

The opportunity:

If you’re an HMD manufacturer, FOV issues present a lot to think about. For content creators, hardware FOV limitations effectively set the 'canvas' on which your VR or AR vision can be painted, so it’s an important factor – especially for multiformat releases.

Foveated Rendering

Definition:

By working to complement human biology, advanced VR rendering engines can spend more time on the center of the visual field, rendering less detail in the peripheral areas of the field-of-view.

The computer can render the entire scene more quickly if it allows itself to render at a lower resolution, or with simplified objects. Because human eyes perceive more detail in the center of the visual field, there is a lot of detail in each frame that we don’t even see. By rendering at low quality at the edge of the frame, the computer can either spend more time rendering detail in the center or render a single frame quicker.

The opportunity:

Foveated rendering offers tremendous speed savings. Equally, it provides more memory to play with where the GPU is concerned, and more freedom to realize your ideas in VR without being limited by the demands of rendering entire scenes at the highest resolution.

Frames-Per-Second (FPS)

Definition:

"Frames-Per-Second" – or FPS for short – refers to the number of times an image on the screen is refreshed each second.

The opportunity:

The higher the frames-per-second, the smoother the motion appears and the more comfortable a VR experience will be. This is extremely important for virtual reality because slow or choppy motion will often prompt simulation sickness. For users to feel comfortable while experiencing VR, they should ensure that they purchase a VR headset that can achieve at least 90 FPS for desktop or console VR, and at least 60 FPS for mobile. Most VR headsets on the market today achieve 60-to-120 frames per second. This is also known as the screen refresh rate and is sometimes identified in Hertz – for example, 90 Hz.
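
To put the target in perspective, a 90 FPS refresh leaves roughly 11 milliseconds to produce each frame. A simple Unity C# sketch (class name illustrative) that logs the current frame rate alongside that budget:

```csharp
using UnityEngine;

// Sketch: log an instantaneous FPS estimate and the per-frame time budget
// implied by a 90 FPS target (1000 ms / 90 ≈ 11.1 ms per frame).
public class FpsCounter : MonoBehaviour
{
    const float TargetFps = 90f;

    void Update()
    {
        float currentFps = 1.0f / Time.unscaledDeltaTime;
        float budgetMs = 1000f / TargetFps;
        Debug.Log($"FPS: {currentFps:F1} (budget per frame: {budgetMs:F1} ms)");
    }
}
```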

Frustum Culling

Definition:

Near and far clip plane properties are those which determine where the viewpoint of a camera in a scene begins and ends. The planes are laid out perpendicular to the camera’s direction and are measured from its position. The near plane is the closest location that will be rendered, and the far plane is the furthest. The near and far clip planes – together with the planes defined by the field of view of the camera – describe what is popularly known as the camera frustum. Frustum culling involves not displaying objects which are entirely outside of that area. Within 3D engines like Unity, frustum culling happens irrespective of whether you use occlusion culling in your game.
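
The same visibility test the engine performs automatically can be reproduced in a few lines of Unity C# (component and field names are illustrative), which can be handy for checking what is and isn't being culled:

```csharp
using UnityEngine;

// Sketch: test whether a renderer's bounding box intersects the camera
// frustum - the check at the heart of frustum culling.
public class FrustumCheck : MonoBehaviour
{
    public Camera viewCamera;
    public Renderer target;

    void Update()
    {
        // Six planes: left, right, bottom, top, near and far clip planes.
        Plane[] planes = GeometryUtility.CalculateFrustumPlanes(viewCamera);
        bool visible = GeometryUtility.TestPlanesAABB(planes, target.bounds);
        Debug.Log(target.name + (visible ? " is inside the frustum" : " would be culled"));
    }
}
```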

The opportunity:

Frustum culling can significantly improve performance in virtual reality, helping deliver experiences that are more comfortable, impressive and immersive.

Gaze Tracking (AKA Eye Tracking)

Definition:

Tracking the direction and movement of a user’s eyes, and sometimes using the tracked data as an input. See also: Head Tracking.
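
In practice, gaze input is often approximated with a ray cast along the user's view direction. A minimal Unity C# sketch (class and field names illustrative):

```csharp
using UnityEngine;

// Sketch: gaze-based selection by casting a ray from the eye camera along
// its forward direction and reporting what it lands on.
public class GazeSelector : MonoBehaviour
{
    public Camera eyeCamera;        // the head/eye camera of the XR rig
    public float maxDistance = 10f;

    void Update()
    {
        Ray gazeRay = new Ray(eyeCamera.transform.position, eyeCamera.transform.forward);
        if (Physics.Raycast(gazeRay, out RaycastHit hit, maxDistance))
            Debug.Log("Gaze is resting on: " + hit.collider.name);
    }
}
```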

The opportunity:

A method to allow very subtle, nuanced user control and input, and a way to pull data on how users interact with a given experience. Gaze tracking also offers a powerful tool for accessibility, by providing a means for interaction to users with, for example, limited physical movement.

Graphics Transformation Pipeline

Definition:

The graphics transformation pipeline is an established method to take objects created in graphics software, game engines and the like, and deliver them to their intended space in a scene, and ultimately the user’s view. Graphics transformation pipelines effectively work in the same way with VR and AR as they do with more traditional 3D display methods.
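
As a simple illustration of those stages in Unity C# (class name illustrative), the familiar model, view, and projection matrices carry a point from object space to clip space:

```csharp
using UnityEngine;

// Sketch: the model -> view -> projection steps of a graphics transformation
// pipeline, using matrices the engine already maintains.
public class TransformPipelineDemo : MonoBehaviour
{
    public Camera viewCamera;

    void Update()
    {
        Matrix4x4 model = transform.localToWorldMatrix;      // object space -> world space
        Matrix4x4 view = viewCamera.worldToCameraMatrix;     // world space -> view space
        Matrix4x4 projection = viewCamera.projectionMatrix;  // view space -> clip space

        Vector4 clipPos = projection * view * model * new Vector4(0f, 0f, 0f, 1f); // the object's origin
        Debug.Log("Clip-space position: " + clipPos);
    }
}
```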

The opportunity:

A reliable, established way to ensure that your objects appear in a VR or AR scene as intended. Graphics transformation pipelines and the associated matrices are often provided by a 3D game engine like Unity, meaning you will not have to worry too much about your 3D objects' journey to their VR or AR home on a user's screen.

GPU (Graphics Processing Unit)

Definition:

A graphics processing unit is a specialized electronic circuit designed to accelerate the creation of images in a frame buffer intended for output to a display. GPUs are found in personal computers, workstations, game consoles, mobile devices and many other places. Virtual reality puts considerable demand on a GPU, largely because the display method needs to create distinct images for the user's left and right eyes.

The opportunity:

Consumers need to invest a fair amount to get enough GPU power to support high-end VR solutions such as Oculus Rift and HTC Vive. While cost can significantly limit the potential audience for VR, numerous methods have emerged to optimize GPU performance in virtual reality, many of which are defined in this glossary.

Haptics (AKA Touch Feedback)

Definition:

Haptics simulate and stimulate the sense of touch by applying various forces – most commonly vibrations – to the user, via the likes of input devices or specific haptic wearables. Haptics are used to lend tangibility to an object or movement on screen. Vibrating game controllers offer the classic example, but haptics also include vibration delivered through a smartphone screen, and modern approaches like ultrasound speaker arrays that project textures into the air, which the VR user can feel as they interact with the content.
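
As a hedged sketch of the classic controller-vibration case – assuming Unity's XR input APIs, with the class and method names here illustrative – a short impulse can be sent to a tracked controller:

```csharp
using UnityEngine;
using UnityEngine.XR;

// Sketch: send a brief vibration pulse to the right-hand controller as
// simple touch feedback, e.g. when the user's hand touches a virtual object.
public class HapticPulse : MonoBehaviour
{
    public void Pulse()
    {
        InputDevice rightHand = InputDevices.GetDeviceAtXRNode(XRNode.RightHand);
        if (rightHand.isValid)
            rightHand.SendHapticImpulse(0, amplitude: 0.5f, duration: 0.1f);
    }
}
```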

The opportunity:

Another way to improve VR immersion, and particularly presence in VR.

Headset (AKA Head-Mounted Display or HMD)

Definition:

A virtual or augmented reality headset typically takes the form of a goggle-like device that the user wears on their head, covering or enclosing the eyes. VR headsets generally contain a screen and lenses that allow the user to see into the virtual world, or a translucent screen on which augmented reality content can be displayed. Many distinct headsets serve multiple different hardware platforms, and anything from a phone to a console can output VR content. That means it is best to work with creative tools and technology that support as many different VR platforms as possible.

The opportunity:

The VR headset is the foundation of modern virtual reality and established the template now followed by AR and other HMDs. Technology has come a long way over the last 50-60 years, and the lumbering, uncomfortable and wildly expensive VR headsets of the early 70s have evolved into something around the same size as ski or snowboard goggles. Some VR headsets even use your phone as the screen, like Samsung Gear VR or Google Cardboard. When researching VR headsets, check whether the screen is built in or whether the headset requires your mobile phone. If you are looking for the best immersive experience, high-end VR headsets like the Oculus Rift or HTC Vive are worth considering. But remember, high-end VR headsets will require a high-end computer to run them. If you are looking for a great quality mobile VR experience, Samsung Gear VR and Google Daydream offer more nuanced experiences than the cardboard VR viewers; the latter being extremely affordable, and a great way to demonstrate the fundamental simplicity of VR.

Head Tracking

Definition:

Using various approaches, head tracking monitors and tracks the position and movements of a given user’s head and neck, offering a potential means for input and interaction.

For example, if the user’s neck and head are slightly inclined to one side, with head tracking enabled what they see in the HMD could shift to the same angle. A user can equally stretch their neck to look around or up and over something. That same user could make a movement such as "looking at the floor" to activate a specific gameplay action.

The opportunity:

Head tracking is close to the heart of what VR offers – a chance to build worlds the user can explore in the same way they interact with the real world.

Immersion

Definition:

Immersion refers to drawing a user completely into a virtual world. While presence in VR specifically refers to the sensation or subconscious belief that you exist within a given experience, immersion tends to be a more general term for becoming entirely encompassed, and forgetting about reality. In VR, immersion takes on a practical sense, as users’ eyes, ears, and sometimes even hands and bodies are engaged, thus blocking out any cues or sensory inputs from reality.

The opportunity:

Immersion is the primary power of VR – and of some AR creations – whether you are considering immersing your users in a convincing experience, or immersing yourself in one. Immersion is the appeal of VR and provides an opportunity to engage audiences.

Immersive Experiences

Definition:

The notion of immersive experiences long predates the current generation of VR and AR, though it includes experiences which use those forms – and potentially all MR and XR content. The term has even been used to cover specific approaches to website design and the likes of amusement park ride design. Where VR is concerned, however, the term refers to fully interactive, minimally interactive and non-gaming experiences. These can be offered both as true VR and 360 video. The term is a very broad one – much like XR – but in this context does not include traditional digital and cinematic experiences consumed via a traditional flat screen.

The opportunity:

A chance to engage users, explore new creative forms, educate, entertain, train, serve, promote, and much more.

Immersive Entertainment/Hyper-Reality

Definition:

Entertainment, promotional and experiential content that combines real-world physicality with VR or AR, as well as other forms such as narrative writing and filmmaking.

The opportunity:

An opportunity to craft content for amusement parks, arcades, malls and many other physical venues, which can provide a broad access point through which the public can first experience VR.

Inertial Measurement Unit (IMU, AKA Odometry)

Definition:

An IMU – or inertial measurement unit – is an electronic device that can detect motion through various means and technologies. IMUs typically combine an accelerometer, a gyroscope, and sometimes a compass to measure the absolute rotation of the device with very low latency, and are used, for example, in head tracking. Combined with optical tracking systems, an IMU can be used to determine the view direction of an HMD.

As with any tracking system, latency and accuracy are key factors for an IMU. Generally, these features aren't advertised and don't vary significantly between devices. It is worth noting that the Samsung Gear VR includes a dedicated IMU, as opposed to Google Cardboard and Daydream, which both rely on the phone's built-in IMU.

The opportunity:

The same underlying technology that flips your phone from landscape to portrait or provides tilt control to mobile games is used in VR HMDs to match the virtual camera to the user’s head direction. That provides an opportunity for all kinds of innovative forms of control and immersion in VR experiences.

Input

Definition:

An input provides a way to interact with a machine, computer or other device. In the case of VR and AR specifically, “input” refers to the method of control you will use for virtual reality and related forms. This most likely means motion tracking with controllers, but many VR, AR, and related experiences let the user interact using a mouse and keyboard or a gamepad.

As VR matures, many alternative forms of input are becoming available and affordable, from gloves that track the movements of individual fingers, to body suits that allow the entire body to be tracked in a VR experience.

The opportunity:

For designers, inputs provide many ways to offer unusual game mechanics. For users, they are a means to interact with digital worlds and feel genuinely immersed. Input approaches that don't complement the VR content they serve can do much to disconnect the user from the experience, undermining the form's greatest potential – that of immersion. So 3D creators should give input decisions plenty of thought.

Inside-Out/Outside-In Tracking

Definition:

The two major desktop virtual reality platforms – the HTC Vive and Oculus Rift – both rely on either a camera or “lighthouse” to be placed in a fixed position in the room outside of the HMD itself. This is what defines outside-in tracking. Meanwhile, devices like Windows Immersive Mixed Reality headsets and Microsoft HoloLens use a technique called visual odometry to analyze images from cameras mounted on the HMD itself, which serve to track its position relative to the environment around it. That latter method can be understood, by contrast to external camera setups, to offer inside-out tracking.

The opportunity:

While hardware conventions are primarily in the hands of the platform holders themselves, the two options available increase the number of settings in which VR and AR are relevant, and thus the potential landscape of audiences and experiences.

Interpupillary Distance (IPD)

Definition:

The measured distance between the pupils of a given user’s eyes. IPD can be understood to be something of a "base measurement" that provides a foundation for scale in VR. Some HMDs allow for the physical adjustment of the horizontal displacement of the lenses to better match the individual user’s IPD.
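
To illustrate why IPD acts as a base measurement for scale, here is a hedged Unity C# sketch that offsets two eye cameras by half the IPD each – the kind of separation an XR runtime normally applies for you (component, field names, and default value are illustrative):

```csharp
using UnityEngine;

// Sketch: manually separate a stereo camera pair by the user's IPD.
public class StereoRig : MonoBehaviour
{
    public Transform leftEye;
    public Transform rightEye;
    public float ipdMeters = 0.063f; // an average adult IPD is roughly 63 mm

    void Start()
    {
        leftEye.localPosition  = new Vector3(-ipdMeters * 0.5f, 0f, 0f);
        rightEye.localPosition = new Vector3( ipdMeters * 0.5f, 0f, 0f);
    }
}
```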

The opportunity:

Accounting for IPD – for example, through HMDs that allow lens adjustment – helps content appear at a convincing, correct scale and keeps users comfortable, which in turn supports immersion and presence.

Latency

Definition:

Latency is the delay between a user's movement and the virtual world's reaction to it. A virtual world with high latency could be described as showing lag. As a simple rule, the less latency there is, the more comfortable a given experience will be. The rule of thumb is to keep latency below 20 milliseconds; the lower the number of milliseconds, the better an experience will be.

Latency can also refer to the rate at which a virtual world updates for the user.

The opportunity:

Low latency combats cyber sickness, and thus bolsters immersion and presence. At an even more fundamental level, it is a means through which to be comfortable in a virtual world.

Where world updating is concerned, keeping latency to a minimum makes worlds more convincing, and interactive experiences more rewarding.

Latency is an essential factor in the overall quality of an XR experience.

Light Field Technology

Definition:

Light field technology groups together various computational imaging and display technologies, hardware and image processing solutions that allow the capturing of images and video which can be altered after capture. As a result, aperture and focus in video content can be adjusted in post, and potentially within a single user's individual experience. Pioneered by the company Lytro, light field technology cameras work in a manner fundamentally similar to contemporary digital cameras. However, they use a microlens array built up from some 200,000 tiny lenses, which are used to capture myriad distinct perspectives as light hits a camera's processor from multiple angles. By contrast, a conventional digital camera's image sensor captures light as it enters from a single perspective, mimicking the fundamentals of a traditional film camera.

Much of the work is also done by processing and calibration software. See also: Light Field Video.

The opportunity:

Light field technology offers the potential for much more nuanced, realistic and variable 360-video, VR, AR, and MR content, with wild potential for innovative interactions, and the ability for users to move through video experiences without needing to stay anchored to the original viewpoint of the capturing camera.

Light Field Video

Definition:

Using a unique set-up that combines a traditional DSLR video camera and a Lytro Illum camera – the latter of which is a light field camera – a team of academics from Berkeley and San Diego built a hybrid device that allows for light field video at a consumer hardware level. Typically, light field cameras have a maximum frame rate of just 3 FPS, making them unsuitable for video. This new approach brings all the advantages of light field cameras to video work, meaning the ability to refocus, change the view, change the aperture and more, all after the point at which the video has been captured.

The opportunity:

An emerging technology that brings vast potential and flexibility to 360 and other forms of immersive video, in terms of post-production process and interactive design and creativity. More broadly, light field technology offers ways to simulate real-world cues around focus, perspective, and distance in VR and 360 video content.

Low-Persistence Display

Definition:

The ability to look around within an experience is probably one of the most fundamental strengths VR offers. However, many early VR technologies were undermined by the fact that fast user movements could cause blurred visuals, prompt discomfort and break immersion. A low-persistence display offers a means to address this issue.

As part of Google’s Daydream specification, a low-persistence mode for smartphone displays delivers a major distinguishing feature, elevating the offering from “just using a smartphone with some lenses” to being much more of a true VR HMD; albeit one with the accessibility of mobile platforms. The Samsung Gear VR switches displays into this special mode when it is inserted into the HMD and can be manually activated using Gear VR’s developer mode. In this mode, when viewed from outside of the HMD, the device appears to flicker. For that reason, it is vital that the low-persistence state is temporary.

The opportunity:

An increasingly refined way to give users the freedom to move as they want and enjoy true presence in the worlds you create.

Mixed Reality (MR)

Definition:

A mixed reality experience is one that seamlessly blends the user’s real-world environment and digitally-created content, where both environments can coexist and interact with each other. It can often be found in VR experiences and installations and can be understood to be a continuum on which pure VR and pure AR are both found. Comparable to Immersive Entertainment/Hyper-Reality.

Mixed reality has seen very broad usage as a marketing term, and many alternative definitions co-exist today, some encompassing AR experiences, or experiences that move back and forth between VR and AR. However, the definition above is increasingly emerging as the agreed meaning of the term.

The opportunity:

While mixed reality offers many design challenges, and much progress is needed concerning platforms that host and support it, there is a tremendous opportunity to bring a diversity of experiences and display methods to audiences through MR. That should mean more content can reach and serve a broader range of people, including those who do not find traditional VR or AR relevant to their abilities, comfort, taste, or budgets.

Mixed Reality Capture (AKA Mixed Cast)

Definition:

A term and approach predominantly put forward and facilitated by Oculus, mixed reality capture gives a person outside of a VR experience an impression of what it is like inside that content. The approach, as described by Oculus, lets developers "create videos and other content that merges live footage of people using Rift and Touch in the real world with in-game footage from VR apps.”

The opportunity:

Mixed reality capture provides a captivating means through which to share, market, promote and communicate VR experiences.

Motion-to-Photon Latency

Definition:

Motion-to-photon latency is the measure of time between when actual motion occurs in the real world and when the eye receives a photon from the HMD screen that reflects this change. Thanks to the extremely high speeds and rather short distances, it is very difficult to measure, but it represents the total effectiveness of a VR system from a latency standpoint. Lay users sometimes describe this phenomenon simply as "lag".

The opportunity:

A high frame rate will render smooth motion and avoid the appearance of strobing, which can contribute to motion sickness. However, the underlying cause of VR discomfort is the discrepancy between real-world motion and visual perception. In such a case the computer may well be rendering frames very quickly, but if the tracking data is on a delay, or if the frames need to be encoded and streamed, the high motion-to-photon latency will still cause motion sickness. This issue currently makes it difficult-to-impossible to do VR with cloud-based rendering.

Motion Tracking

Definition:

Motion tracking is the ability to track and record a VR user’s movement and the movement of real-world objects, reading them as inputs and replicating those movements in virtual reality in real-time.

The opportunity:

Motion tracking is what allows VR users to move around in an environment just as they would in reality. When you lean in to look at something in a virtual world, you will get closer to that object, just as you would in real life. Motion tracking is one of the most significant components required to trick your senses into thinking you are participating in the virtual environment. Equally, it now provides content creators with a means through which to create and shape VR content from within VR.

Multi-Pass Stereo Rendering

Definition:

For virtual reality to offer users stereoscopic 3D, a different image must be provided to each eye. This means rendering two distinct 3D images to an HMD. Multi-pass stereo rendering does this by rendering the whole scene once for each eye. It is less performant than single-pass stereo rendering, and therefore limits the visual fidelity or complexity of possible scenes.

The opportunity:

For those making tools for creating games and other VR content, lots of effort must be put into enabling and supporting multi-pass stereo rendering. If you are a VR consumer yourself, multi-pass rendering may be one reason you need such a powerful computer to drive a virtual reality HMD.

Non-Gaming Virtual Reality/Augmented Reality

Definition:

VR experiences that include all non-gaming content, such as educational apps, medical training software, architectural visualization, military simulation, promotional installations, amusement park rides, retail uses, and creative tools. These types of experience are beginning to make up a significant portion of the VR content made today.

The opportunity:

Good news. The more industries and sectors embrace VR, the more the ecosystem of virtual reality can grow. That means more tools, more investment, and more talent – fantastic whether you're a VR creator or a consumer.

OpenGL Transformation Pipeline

Definition:

OpenGL transformation occurs in the OpenGL pipeline and delivers the same fundamental process as general graphics transformation pipelines, specifically for the cross-language, cross-platform graphics API. OpenGL matrices are used in a way similar to the Direct3D Transformation Pipeline's tailored matrices.

The opportunity:

For those familiar with the OpenGL API, a bespoke Graphics Transformation Pipeline is available.

OpenVR SDK/API

Definition:

An SDK and API created by Valve specifically to support development for the SteamVR/HTC Vive and similar VR headsets. By contrast, the OpenXR initiative is a broader working group looking to establish general standards to support the creation and distribution of VR and AR content, tools, and hardware.

The opportunity:

A means to create content for one of the current generation of VR’s most prolific and popular platforms.

OpenXR

Definition:

An initiative to create an open standard for VR and AR apps and devices, and to eliminate the fragmentation of the industry. See also: OpenVR SDK/API.

The opportunity:

A more robust, reliable, and advanced ecosystem on which to create VR and AR content.

Panoramic 2D/3D Video

Definition:

As with many terms in the emerging VR and AR space, panoramic 2D and 3D video cover a relatively broad spectrum. They generally define video content that wraps entirely around the user, be it as a 360-degree band at eye-level, or as an entire sphere. Broadly the term includes both 360 video viewed in a VR HMD context, but equally screen-based installations at venues like amusement parks. Most live action 360 video content today is a 2D image, though with the right equipment – and budget – truly stereoscopic panoramic 3D video is entirely possible.

The opportunity:

Beyond the opportunities 360 video offers, panoramic video provides other content creators – game developers and marketers – with a means to reduce scene complexity by including video as a pre-rendered background in the place of real geometry. 3D Engines like Unity offer built-in support for this kind of video content.

Positional Tracking

Definition:

Positional tracking is the ability to record user movement and the movement of objects in real-time. This translates to users being able to move around in reality, and have those motions reproduced as interactions in a given virtual world.

While positional tracking as a term touches on similar areas to those defined by head tracking and gaze tracking, it covers HMDs, controllers, props and other real-world objects, including those seen in true mixed reality experiences.

The opportunity:

At a base level, the relative finesse of positional tracking impacts a VR experience's ability to be convincing and immersive. Beyond that, the expanding potential of the kinds of objects and inputs that can be positionally tracked is significantly broadening the spectrum of experiences virtual reality can offer.

Post FX for VR (or Post Processing Stack)

Definition:

Post FX for VR offers the application of various visual effects applied after a scene has been created. A post-processing stack combines a complete set of image effects into a single post-process pipeline, allowing cinematic effects to be applied directly and efficiently, and in the correct order for a single pass.

The opportunity:

Approaches like that seen with the post-processing stack provide a straightforward, relatively fast way to garnish VR worlds with an extra degree of nuance and detail, ultimately making them more convincing.

Presence (or Sense of Presence)

Definition:

The sense of being somewhere, be it in reality or a virtual reality. In reality, a present person may be particularly aware and socially interactive. In VR, the term applies to the experience of believing you do occupy the virtual world. We could arguably also be understood to be “present” in a book or film when we forget the real world exists, and that fiction is reality. VR offers a sense of presence unrivaled by almost any other medium or form.

The opportunity:

Presence is arguably the founding strength of VR and a vital tool for achieving immersion of the player. Engendering presence comes from the many techniques used to make quality VR, but perhaps the most important rule is that “anything that reminds a user they are in a VR experience – rather than a reality – will counter presence.” That means an incongruous menu or moment of lag could undermine presence in an instant.

Render Loop (AKA Render Pipeline)

Definition:

A render loop provides the logical architecture that dictates how a rendered frame is composed. A typical render loop might take the following structure and order, for example:

Culling > Shadows > Opaque > Transparent > Post Processing > Present

While a render loop is the set of steps the 3D engine goes through to compose a scene, the graphics transformation pipeline is the set of steps to transform an object from its own space into physical screen space.

The opportunity:

For VR, two distinct images must be rendered, one for each eye. While running two render loops is workable, it is extremely demanding on CPU and GPU resources. However, the development of techniques like Unity's Single-Pass Stereo Rendering makes render loops that support VR content much more efficient, freeing up GPU and CPU for other tasks.

Render Target (including Render Target Array)

Definition:

A render target is a memory buffer that effectively serves as a place to draw to, allowing objects to appear in an end user’s headset. An array of render targets allows for outputting to multiple render targets simultaneously.

The opportunity:

Render Targets are an established convention of game and related development and provide a useful ability to render objects offscreen.

Render Texture

Definition:

Render textures are unique types of textures that are created and updated at runtime. You can create a new render texture before designating one of your cameras to render into it.
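
A minimal Unity C# sketch of that workflow (class and field names illustrative) – useful, for example, for an in-world mirror or monitor screen:

```csharp
using UnityEngine;

// Sketch: create a render texture at runtime, have a camera draw into it,
// and display the result on a surface in the scene.
public class MirrorCamera : MonoBehaviour
{
    public Camera sourceCamera;
    public Renderer screenSurface; // the mesh that shows the live image

    void Start()
    {
        var rt = new RenderTexture(1024, 1024, 24); // width, height, depth-buffer bits
        sourceCamera.targetTexture = rt;            // the camera now renders into the texture
        screenSurface.material.mainTexture = rt;    // the material samples it like any texture
    }
}
```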

The opportunity:

Render textures can be used in a material within a game engine, letting surfaces display imagery that is rendered and updated at runtime.

Scene Graph

Definition:

Scene graphs are specialized data structures that organize the information needed to render a scene. The scene graph is consumed and understood by the renderer. The scene graph can refer to either the scene in its entirety or the portion currently in view. In the latter case, the term "culled scene graph" is used.
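
A toy C# sketch of the underlying data structure (the type and method names are illustrative, not any engine's API) shows how world transforms are composed by walking the hierarchy:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch: a scene graph node holding a local transform and a list of children.
// A renderer would consume the world transforms produced by the traversal.
public class SceneNode
{
    public Matrix4x4 localTransform = Matrix4x4.identity;
    public readonly List<SceneNode> children = new List<SceneNode>();

    public void CollectWorldTransforms(Matrix4x4 parentWorld, List<Matrix4x4> output)
    {
        Matrix4x4 world = parentWorld * localTransform; // compose with the parent's transform
        output.Add(world);
        foreach (var child in children)
            child.CollectWorldTransforms(world, output);
    }
}
```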

The opportunity:

Well-ordered VR worlds are efficient and reliable while placing minimal computational demands on a system, thanks to efficient positioning and scaling of objects in a scene.

Screen Resolution

Definition:

Screen resolution refers to the number of pixels that are displayed on the screen. Much like a computer monitor or television, the more pixels present, the clearer and more realistic the image quality will be. However, in the case of the screen resolution of a VR headset, because the image is only a few inches away from your eyes, a higher screen resolution is needed so users don't perceive the gaps between individual pixels. Additionally, in the case of VR headsets, the screen is split in half to show one image accurately to each eye.

The opportunity:

As a developer or consumer, when looking for a mid-level or high-end VR headset, look for a screen resolution of at least 2160 by 1200 (or 1080 by 1200 per eye). Anything lower and you may notice what is called the "screen door effect", which feels like you are looking through a screen door; in other words, you can see the little black dots or lines in the screen.

Single-Pass Stereo Rendering

Definition:

Single-pass stereo rendering is a feature that renders both eye images at the same time into one packed render texture, meaning that the whole scene is only rendered once, and CPU processing time is significantly reduced.

With single-pass stereo rendering or stereo instancing enabled, a 3D engine like Unity shares culling – determining whether a 3D object should be rendered based on whether it is visible to the camera – and shadow data between the eyes. So for each visible object, we only need to render the object once, which can result in considerable speedups, while still delivering robust performance.

The difference between single-pass rendering and stereo instancing is that the latter is even more performant, but requires hardware support. Without single-pass stereo rendering, the 3D engine renders the scene twice: first to render the left-eye image, and then again for the right-eye image.

As of Unity 2017.3, all supported platforms offer Single-Pass Stereo and, where available, Single-Pass Instancing.
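
In Unity the mode is chosen in Player Settings; as a hedged, editor-only sketch (the menu paths and class name are illustrative), the same setting can also be flipped from script:

```csharp
#if UNITY_EDITOR
using UnityEditor;

// Sketch: select the stereo rendering mode from an editor script.
// Single-Pass Instancing additionally requires hardware support.
public static class StereoRenderingSetup
{
    [MenuItem("Tools/Use Single-Pass Stereo")]
    static void EnableSinglePass()
    {
        PlayerSettings.stereoRenderingPath = StereoRenderingPath.SinglePass;
    }

    [MenuItem("Tools/Use Single-Pass Instancing")]
    static void EnableInstancing()
    {
        PlayerSettings.stereoRenderingPath = StereoRenderingPath.Instancing;
    }
}
#endif
```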

The opportunity:

With the demands traditional render loops place on CPU processing time reduced, you are given more to work with elsewhere on your VR project.

Six Degrees of Freedom (6DOF)

Definition:

A system that provides six degrees of freedom tracks an object’s position and rotation in three dimensions. The three positional axes combined with the three rotational axes total six “degrees” which can be freely controlled.

The opportunity:

There is a significant difference between what you can do with 3DOF rotational tracking and full 6DOF tracking. As an example, the original Wii controller only tracked rotation, which forced game developers to use control 'metaphors' for things like throwing a ball or swinging a tennis racquet. On the other hand, the HTC Vive and Oculus Touch controllers can be precisely controlled in space, giving users a sense of where their hands are, delivering more nuance and enabling presence.

SLAM (Simultaneous Localization and Mapping)

Definition:

Simultaneous localization and mapping is the process whereby a map is generated and updated by an agent – perhaps a vehicle – moving through that space, at the same time that such an agent is tracked in the space. There are many distinct approaches at present, and the technology is emerging as crucial to self-driving cars, domestic robots, and AR applications.

The opportunity:

Establishing SLAM technologies and the algorithms that power them has vast potential to influence the evolution of AR, offering a means for multiple practical applications, as well as in games and other entertainment forms.

Spatial Audio (or 3D Audio)

Definition:

Spatial audio provides a method through which to build and place audio assets so that – from the VR user’s perspective – a given sound originates from a particular position in a 3D scene. This is like surround-sound in a home theater setup or at the cinema, and very important to presence and immersion in VR.

The opportunity:

Sound is one of the essential components to creating an immersive VR experience. Spatial sound allows you to hear sound all around you and also tracks the sound when you move your head, just like in real life. That means a VR developer can – as well as offering more realism – use sound to direct or guide a player, as well as for more innovative mechanics distinct to VR.

Stereoscopy

Definition:

The reproduction of the effects of binocular vision by photographic or other graphic means. In other words, recreating the experience humans get from seeing the real world with two eyes. Typically stereoscopic approaches provide two distinct images of the same scene: one for the user’s left eye and one for their right eye. This would happen, for example, through the left and right lens of an HMD. The user’s brain then combines the two images to build a 3D scene with depth and perspective, just as is the case with what our left and right eyes see – and how those images are compiled – in reality.

The opportunity:

VR content that is more immersive, offers more presence and provides a more distinct experience from that seen on traditional flat displays.

Stereo Instancing

Definition:

An evolution of single-pass rendering, and the latest of several rendering optimization approaches that strive to help developers ensure a smoother experience in VR, where frame-time budgets are far tighter than in traditional game development.

The opportunity:

A way for developers to save CPU processing times, and free up power they can use elsewhere.

Tracked Pose Driver

Definition:

A built-in, cross-platform driver component that simplifies setting up tracking of player movements and peripherals by matching a real-world device or object’s position and rotation to its pose – the location of the corresponding virtual object.

The opportunity:

Delivering convincing, realistic, and responsive tracking is much less demanding for creators. For players, immersion and presence are significantly supported.

Tracking

Definition:

Tracking is crucial for a fully immersive VR experience. Essentially, by tracking the position of the VR HMD – or a peripheral such as a specialized controller – this method tells the computer where a user is looking and what they are doing so that it can accurately and appropriately draw the virtual world around them. The more precise the tracking, the more comfortable the VR experience will be. See also: Motion Tracking, Positional Tracking, and Gaze Tracking.

The opportunity:

Innovative ways to control a game, and – once more – increased immersion and presence. Quality tracking also provides a countermeasure to cyber sickness.

Uncanny Valley

Definition:

Initially coined by robotics professor Masahiro Mori in the 1970s, the uncanny valley describes a phenomenon that frames how humans relate to physical or digital objects that take on a human or human-like form. The more an object looks like a human, the more a real human will feel a positive, engaged response to that object. However, at the point an object almost looks like a photorealistic human – but not quite – there is a drop off in the positive response from the viewer. That drop – as seen on a simple line graph – is the eponymous 'valley', and can be felt when we find a robot or computer animated character creepy or unsettling because it almost looks real, but isn’t convincing.

In many cases, a physical or digital human form that is less than realistic can be more engaging than one that is almost entirely convincing.

The opportunity:

The uncanny valley effect in VR and other XR forms can do much to break immersion and presence. As a result, content creators often opt to design low-fidelity human characters rather than high-fidelity ones that pursue realism, to avoid provoking a negative response from viewers.

Vestibular System

Definition:

A network of canals in the inner ear that effectively serve as motion detectors, allowing us to balance and understand our movement. Conflicts of the visual and vestibular systems – or “vestibular mismatch” – are at the heart of why we feel cyber sickness and motion sickness.

The opportunity:

The vestibular system is so fundamental to how we experience VR, AR and MR content – and to why such content can thrive or fail – that a basic understanding will help you make better content and give users a more convincing, immersive experience.

Virtual Reality

Definition:

As virtual reality has evolved and found different uses in different sectors, several different definitions have emerged, most of which significantly overlap with one another. Discrepancies exist. The following elements, however, are near universal in framing what VR offers:

  • Computer-generated stereo visuals which entirely surround the user, entirely replacing the real world environment around them. Many believe this definition rightly excludes 360 video from true VR.
  • Content is consumed and experienced from a viewer-centric perspective.
  • Real-time user interaction within the virtual environment is possible, whether through detailed interactions, or simply being able to look around within the experience. Here the real-time element means response comes within a particular time interval that is specific to the application or field.

The opportunity:

A high level of VR immersion is achieved by engaging your two most prominent senses, vision and hearing, by using a VR headset and headphones. The VR headset wraps the virtual world or experience nearly to the edge of your natural field of view. When you look around, you experience the environment the same as you do when you look around in real life. Headphones amplify the experience by blocking the noise around you, while allowing you to hear the sounds within the VR experience. When you move your head, the sounds within the VR environment move around you as they would in real life.

3D engines like Unity have made it possible to create and deliver highly polished VR content. Such solutions make creating virtual reality experiences a much more accessible pursuit, meaning the display method is becoming more common. That means there is an opportunity to master VR content creation to engage a growing audience.

Virtual Reality/Augmented Reality Programming

Definition:

Programming for VR and AR is comparable to programming for other display methods and more traditional types of content. C++ and C# are particularly popular forms of programming for AR and VR, reflecting how established development tools have adapted to the arrival of VR and AR while maintaining established conventions of coding.

The opportunity:

Your existing programming skills – or those of your team – are ready for VR and AR development.

Volumetric Video

Definition:

A limitation of even the most beautiful 3D 360 video is the challenge of allowing the user to move through the world with agency and freedom. To date, most 360 videos limit the user to adopting and following the original position of the camera at the time it was used to capture the video. Volumetric video moves to address this limitation by capturing the volumetric data of a space being filmed, and for each frame. That data can then be used to present the scene as if it is a rendered graphical scene, meaning the user can walk around the video.

The opportunity:

Live action stereoscopic 360 video content you can move around in. That may make real some of the wilder predictions for what VR in the future could be.

VR Installation

Definition:

VR installations are virtual reality experiences specific to a location or site. They can be one-time creations – seen in the likes of art installations – or duplicated at multiple locations, such as amusement parks. They are often used for marketing; for example, promoting a film at Comic-Con, or advertising a brand at a music festival. Often in such a context they are supported with elaborate physical sets and physical effects such as generated wind or moving floors. Retailers may also implement temporary or permanent VR installations on site, allowing consumers to, for example, test the experience of a new car not stocked at the showroom.

The opportunity:

VR installations offer consumers a chance to sample high-end virtual reality without investing in any equipment. They also provide a distinct opportunity for creators, as more brands look to employ VR as a marketing tool.

WebAR

Definition:

An open standard that makes it possible to experience AR in a browser, rather than having to download apps. Comparable to WebVR.

The opportunity:

WebAR should be particularly important to mobile, where websites can deliver AR experiences via smartphone browsers, and where developers can provide simple web apps offering AR content. This should significantly democratize both the creation of AR content and access to it for users. It could also provide a convenient testing capacity for AR developers.

WebVR

Definition:

An open standard that provides a means through which to experience VR via a browser, instead of having to download specialized apps first.

The opportunity:

VR using only a headset and web browser, rather than introducing the cost of high-end computing hardware. That results in more accessible VR, and larger potential audiences for content creators. As with WebAR, this method could in some cases provide developers with a convenient testing option.

XR

Definition:

Technology-mediated experiences that combine virtual and real-world environments and realities. Here the “X” can be seen as a placeholder for V(R), A(R), or M(R), though it also represents an undefined or variable quality/quantity. XR covers the hardware, software, methods, and experience that make virtual reality, mixed reality, augmented reality, cinematic reality and others a reality. Most definitions of XR encompass platforms and content where the user can take digital objects into reality, or, conversely, see physical objects as present in a digital scene.

XR experiences include those where users generate new forms of reality by bringing digital objects into the physical world, and those that bring physical-world objects into the digital world.

XR is generally used as an umbrella term and is frequently used as a casual shorthand to group technologies such as VR, AR, and MR together.

The opportunity:

Exploring and understanding XR broadly – instead of focusing on particular environments – should allow creators to remain flexible and evolve with the types of XR that emerge, rather than being stifled by committing to a single form.