Virtual Reality and Artificial Intelligence

This week I did a “lockdown lecture”, part of a series of talks from academics in Goldsmiths, Department of Computing. They are a way of keeping in touch with our current students, prospective students and the wider public during the COVID-19 lockdown. It was good to make contact with so many people (almost 250!) and there was a lot of really active discussion in the chat, so for me it was a great experience.

The title of the talk was Virtual Reality and Artificial Intelligence, two of my main interests. I often talk about the two separately, but it was a chance to talk about how AI (or more exactly Machine Learning) can be used to make better VR experiences.

I thought I would write a little post summarising the key ideas; you can watch the whole video for more info. There are lots of ways that you can use machine learning in VR, but my talk focused on three in particular: Content Creation, Embodied Interaction and Virtual Humans.

Content Creation

One of the most popular uses of machine learning in games, and increasingly in VR, is procedural content generation. It’s not an area I’ve worked on much, but I’ve been lucky to work with PhD students Sokol Murturi and Rob Homewood, who are researching content generation.

Content in games and VR can mean a lot of things, but typically we are talking about environments, characters and other graphical objects or textures. This content is usually created by artists working on it by hand in tools like Maya or 3D Studio Max. This can result in great content, but it takes a lot of work. Using algorithms to automatically generate content (“Procedural Content Generation”) can reduce a lot of the work, allowing small, independent teams to produce big environments and content-rich experiences. More importantly, it makes it possible to generate bespoke environments for every player.

This is what Minecraft does. It is probably the most popular and visible game based on procedural content at the moment, and that is what lets it easily create a new world for every player. Minecraft doesn’t use machine learning (as far as I know), but machine learning based procedural generation can help create more complex worlds (like the evolutionary worlds of William Latham and Lance Armstrong shown in the middle panel above), and allow more input from creators into the generation process (like the work of Sokol Murturi on the right).
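To make the idea of a bespoke world for every player a bit more concrete, here is a minimal sketch of seeded procedural generation. It is not how Minecraft works and it uses no machine learning; it only illustrates that a small algorithm plus a per-player seed can stand in for hand-made content. The function name `generate_heightmap` and all of its parameters are made up for this example.

```python
import random


def generate_heightmap(seed, width=16, depth=16, max_height=8):
    """Return a depth x width grid of integer terrain heights for one seed."""
    # A private generator: the same seed always rebuilds the same world.
    rng = random.Random(seed)

    # Start from independent random heights for every cell...
    raw = [[rng.uniform(0, max_height) for _ in range(width)]
           for _ in range(depth)]

    # ...then average each cell with its neighbours so that the noise
    # turns into smoother hills and valleys.
    terrain = []
    for z in range(depth):
        row = []
        for x in range(width):
            neighbours = [raw[nz][nx]
                          for nz in range(max(0, z - 1), min(depth, z + 2))
                          for nx in range(max(0, x - 1), min(width, x + 2))]
            row.append(round(sum(neighbours) / len(neighbours)))
        terrain.append(row)
    return terrain


# Hypothetical usage: one seed per player gives each player their own world.
world_for_alice = generate_heightmap(seed=1)
world_for_bob = generate_heightmap(seed=2)
```

Because the world is a function of the seed, nothing needs to be stored except the seed itself. A machine learning based generator would replace the hand-written smoothing rule with a model learned from examples or guided by a creator.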

Embodied Interaction

This is one of my key areas of research. VR isn’t like interacting with a computer or smartphone screen. It is about interacting with a virtual world in the same way we interact with the real world. That means we use our full body movements. We reach out to grab things, we walk, we crouch to see under objects. This kind of embodied interaction is one of the things that makes VR a uniquely powerful experience.

The trouble is that the way we use our bodies is very complex. Not only that, we often do things subconsciously: we know how to ride a bike, or hit a ball, but that doesn’t mean we can explain to someone else, in words, how to do so. This is called “Tacit Knowledge” and it makes it very difficult to program computers to recognise movements, because we ourselves don’t know the rules that define these movements.

This is where machine learning comes in. Using machine learning we can design movement interaction systems by giving examples of movement rather than by coding. We design interactions by moving. This can make it possible to design much more natural interactions.
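As a rough illustration of designing by example rather than by rules, the sketch below implements a tiny nearest-neighbour gesture recogniser. It is an assumption-laden toy, not the method used in any particular project: the class `GestureRecogniser`, the `features` function and the sample hand positions are all invented for this example, and a real system would use far richer features than the net displacement of one hand.

```python
import math


def features(movement):
    """Summarise a movement, given as a list of (x, y, z) hand positions,
    by the net displacement from its first to its last position."""
    (x0, y0, z0), (x1, y1, z1) = movement[0], movement[-1]
    return (x1 - x0, y1 - y0, z1 - z0)


class GestureRecogniser:
    """Recognises movements by comparing them with recorded examples."""

    def __init__(self):
        self.examples = []  # list of (feature vector, label) pairs

    def add_example(self, movement, label):
        # "Designing by moving": the designer performs the movement
        # and simply names it. No rules are written.
        self.examples.append((features(movement), label))

    def classify(self, movement):
        # Return the label of the closest recorded example.
        f = features(movement)
        nearest = min(self.examples, key=lambda ex: math.dist(f, ex[0]))
        return nearest[1]


# Hypothetical usage with made-up hand positions (in metres).
recogniser = GestureRecogniser()
recogniser.add_example([(0, 1.0, 0), (0, 1.2, 0.3), (0, 1.3, 0.6)], "reach")
recogniser.add_example([(0, 1.5, 0), (0, 1.0, 0), (0, 0.6, 0)], "crouch")

result = recogniser.classify([(0.1, 1.1, 0), (0.1, 1.3, 0.5)])
print(result)  # prints "reach"
```

If the recogniser gets a movement wrong, the designer can fix it by recording another example instead of rewriting code, which is the core of the interactive machine learning workflow.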

I’ve discussed this issue in more detail in the post above. It is also the basis of a new project, 4i: Immersive Interaction Design for Indie developers with Interactive machine learning.

Virtual Humans

Everything I have just said about our body movements applies even more so to our interactions with other people. Our body language is vital to face-to-face interactions. It adds a lot of emotional nuance to our conversation and manages how we talk, for example how we can fluidly take turns between people.

But it is almost entirely subconscious: we don’t consciously know what our bodies are doing most of the time because we are concentrating on the words. Even the world’s top social psychologists and neuroscientists don’t fully understand how body language works. Not only that, but body language is fundamentally interactive: we are constantly responding to what other people are doing. Simply pre-recording animations is not enough; the body language needs to respond in real time.

Again, machine learning can help. In my talks I like to show the diagram above, adapted from a very old paper of mine. We can capture an interaction between two people, typically a conversation. This can use technologies like motion capture, facial recognition, gaze tracking and the humble microphone to capture body language and speech. If we capture two people at the same time we can synchronise the data and learn a model that maps from the behaviour of one person to that of the other. That means we can create models that know how to respond to another person’s body language and can be used to drive the behaviour of virtual characters.
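The simplest possible version of "learn a model that maps from the behaviour of one person to that of the other" is a straight-line fit between one feature of each person. The sketch below does exactly that with ordinary least squares. It is only an illustration under heavy assumptions: the feature names (speaker loudness, listener nodding), the numbers and the functions `fit_response_model` and `respond` are all invented here, and real models of body language use many features, time dependencies and much more expressive learning methods.

```python
def fit_response_model(speaker, listener):
    """Fit listener ~ slope * speaker + intercept by least squares.

    Both arguments are equal-length lists sampled at the same moments,
    i.e. the synchronised recordings of the two people.
    """
    n = len(speaker)
    mean_s = sum(speaker) / n
    mean_l = sum(listener) / n
    covariance = sum((s - mean_s) * (l - mean_l)
                     for s, l in zip(speaker, listener))
    variance = sum((s - mean_s) ** 2 for s in speaker)
    slope = covariance / variance
    intercept = mean_l - slope * mean_s
    return slope, intercept


def respond(model, speaker_value):
    """Predict the virtual character's behaviour for a new observation."""
    slope, intercept = model
    return slope * speaker_value + intercept


# Made-up synchronised data: speaker loudness and listener nodding amount.
speaker_loudness = [0.0, 1.0, 2.0, 3.0]
listener_nodding = [0.5, 1.0, 1.5, 2.0]

model = fit_response_model(speaker_loudness, listener_nodding)

# At run time the real person plays the speaker and the learned model
# drives the virtual character that plays the listener.
nod_amount = respond(model, 4.0)
```

The important design point is the pairing of the data: because both people were captured at the same time, every sample of one person's behaviour comes with the other person's actual response, which is what makes the mapping learnable.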

That idea covers a lot of my research over the years, most recently our collaborative project with Maze Theory and Dream Reality Interactive on socially engaging characters for a new game based on Peaky Blinders. Cristina Dobre has been doing most of the machine learning work on that.

VR and AI: a lot of potential

I really enjoyed giving my lockdown lecture to such an enthusiastic audience. I am glad they are interested in the potential of AI and machine learning for VR, because I think we are only starting to explore the possibilities. The three topics I covered are only a small part of what is possible and I’m really looking forward to what other people come up with in the next few years.