The technology to build the holodeck

When I visited my elderly mother in Germany recently, I realized that it might be the last time I would see her in the cozy little house she had called home for more than twenty years. So I did what anyone would do: I took out my phone and shot lots of pictures of the place to preserve as many memories as possible: the warm fireplace; the bookshelves filled with familiar books; the old garden bench, signed by everyone who attended a special birthday celebration many years ago.

Then I tried something else. I opened Scaniverse, a 3D scanning app from Pokémon Go maker Niantic, and captured some of these things as 3D objects, crouching down and tiptoeing around them while slowly moving my phone to record every angle and inch. The results, while somewhat imperfect, are still impressive. When I opened the scans later on my phone and my VR headset, I was able to view the weathered garden bench from all angles, as if I were standing right in front of it. The experience moved me in ways I never expected.

All of this is made possible by Gaussian splatting, a novel 3D capture method invented less than two years ago that has taken the tech industry by storm. Niantic and Google are both using it to build their respective mapping products; Snap has added support for splats (the colloquial name for objects captured with Gaussian splatting) to its Lens Studio developer platform; and Meta wants to use Gaussian splatting to create a metaverse that looks just like the real world.

Tech companies are fascinated by Gaussian splatting because of its ability to capture three-dimensional objects photorealistically and recreate them digitally. It could soon allow anyone to scan an entire room, and it is already changing the way creatives in Hollywood and beyond record 3D video. Combined with generative AI, it has the potential not only to preserve existing spaces but also to transport us into entirely new 3D worlds.

“This is a huge game changer,” said AR/VR expert and investor Tipatat Chennavasin. As co-founder and general partner of the Venture Reality Fund, Chennavasin has a financial stake in the technology's success. As a geek and former 3D artist, he has simply fallen in love with it, comparing it to Star Trek's holodeck, the holographic 3D simulation that lets crew members step into real and imaginary spaces. “We are starting to achieve photorealistic holodecks.”

Building a 3D map of the world, one bit at a time

Capturing 3D objects with a mobile phone is nothing new. However, most previous approaches relied on polygons, and if you've ever used a mobile AR app, you've seen those cyberpunk-looking meshes of triangles.

Polygon mesh-based 3D capture and reconstruction is good enough for basic objects with flat surfaces, but it struggles with detailed textures and complex lighting. Objects captured this way often look plasticky and unrealistic, and humans captured in 3D always seem to be wearing far too much hair gel rather than individual strands of hair. "It was promising at the time, but there were always huge limitations," Chennavasin said.

All of that changed in the summer of 2023, when a group of European scientists published a paper on what they called "3D Gaussian splatting." They solved the problem by abandoning meshes altogether and instead capturing 3D objects as collections of blurry, translucent blobs (also called Gaussians).

Each of these blobs is captured with precise information about its color, position, scale, rotation, and transparency level. Combine millions of blobs and you get a detailed picture of the 3D object, one that, thanks to all of that additional data, also describes how it looks from any given angle. Using machine learning, the researchers were able to capture more detailed objects with higher fidelity and render them in real time without the need for heavy-duty graphics hardware.
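To make that concrete, here is a minimal sketch, in Python, of the attributes each blob carries and of how overlapping, semi-transparent blobs could combine into a single pixel's color. The names are illustrative rather than any tool's actual format, and the sketch leaves out the projection and per-pixel Gaussian falloff that a real splat renderer applies.

```python
from dataclasses import dataclass

@dataclass
class GaussianSplat:
    # One translucent blob; a captured scene is made of millions of these.
    # Field names are illustrative, not the on-disk format of any specific tool.
    position: tuple[float, float, float]         # where the blob sits in 3D space
    rotation: tuple[float, float, float, float]  # orientation, as a quaternion
    scale: tuple[float, float, float]            # how far the blob stretches along each axis
    color: tuple[float, float, float]            # RGB, each 0..1
    opacity: float                               # transparency, 0 (invisible) to 1 (solid)

def blend_pixel(splats_front_to_back):
    """Alpha-blend the splats overlapping one pixel, nearest first.

    This only shows how color and opacity combine; a real renderer also
    projects each blob onto the screen and weights it by its Gaussian falloff.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for s in splats_front_to_back:
        weight = transmittance * s.opacity
        for i in range(3):
            color[i] += weight * s.color[i]
        transmittance *= 1.0 - s.opacity
        if transmittance < 1e-4:  # pixel is effectively opaque; later splats can't be seen
            break
    return tuple(color)

# Example: a mostly solid red blob in front of a half-transparent blue one.
front = GaussianSplat((0, 0, 1), (0, 0, 0, 1), (0.1, 0.1, 0.1), (1, 0, 0), 0.8)
back = GaussianSplat((0, 0, 2), (0, 0, 0, 1), (0.1, 0.1, 0.1), (0, 0, 1), 0.5)
print(blend_pixel([front, back]))  # mostly red, with a little blue showing through
```

The key point is that each blob is tiny and simple on its own; the photorealism comes from blending millions of them in depth order.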

The results immediately shocked experts in the field. "We finally have the opportunity to have true 3D that's photorealistic," Chennavasin said. "This is the JPEG moment for spatial computing."

Brian McClendon, senior vice president of engineering at Niantic, believes that Gaussian splats are the most profound advancement in 3D graphics in more than 30 years. "We think this is a fundamental change," he said.

"We think this is a fundamental change."

McClendon says Gaussian splatting will democratize 3D capture, and Niantic wants to be at the forefront of that change. Having acquired the Scaniverse app in 2021, Niantic added Gaussian splatting as a capture technology last year. In August, it launched a new version of Scaniverse that puts splats front and center. In October, the company open-sourced its splat file format. And in December, Scaniverse expanded to VR, allowing users to view Gaussian splats on Meta's Quest headsets.

Niantic has its own reasons for pushing splats. Scaniverse began as an app for capturing personal mementos and other items, but Niantic now encourages people to scan statues, fountains, and other public attractions. The company sees these scans as a key component of the 3D map of the world it is building - the same map that underpins Pokémon Go, Peridot, and future geospatial AR games and experiences. “We're very focused on mapping, scanning and reconstructing the outdoors,” McClendon said.

"We have hundreds of thousands of these (types of scans) in Scaniverse now," McClendon said. “Hopefully we’ll hit a million soon.”

Splats are changing 3D video capture

Gaussian splats aren't just for capturing static content. Computer vision startup Gracia AI has been using the technology to record volumetric 3D videos that can be viewed on the Meta Quest headset. One clip shows a chef preparing a meal, and viewers can watch the action from all angles in virtual reality, even zooming in to watch his knife slice into a glistening piece of raw salmon.

Gracia recorded the video in a professional 3D capture studio, using an array of 40 cameras pointing at the chef from every angle. This is how professionals have been recording holographic content for AR and VR experiences for years - but once again, the transition from polygons to Gaussian blobs makes all the difference.

Previously, 3D video capture presented a series of visual challenges, resulting in a strict dress code for the people being captured: no busy patterns, nothing translucent, nothing loose and dangling that could lead to strange artifacts on film. When Microsoft captured David Attenborough this way a few years ago, it even had to glue his collar to his shirt and use copious amounts of hairspray to avoid any loose ends that might disrupt the capture.

“It’s amazing how much creative flexibility you can have using Gaussian splats.”

With Gaussian splats, all of these limitations disappear. "There are no limits on clothing, and there are no limits on hair," said Gracia co-founder and CEO Georgii Vysotskii, whose company counts Chennavasin's Venture Reality Fund among its investors. And while previous generations of volumetric video capture required large amounts of light to eliminate shadows, Gracia has been able to record scenes in almost total darkness. "You can leave all the shadows and use artistic lighting," Vysotskii said. “It’s amazing how much creative flexibility you can have using Gaussian splats.”

That’s not to say there aren’t still challenges. Currently, Gaussian splat clips still require 9GB of data per minute of video - too much for streaming or for anything other than a brief tech demo. Vysotskii said the company is working to get that down to 2-3GB per minute, with 180-degree stereoscopic VR video potentially requiring as little as 1GB of data per minute. He envisions these kinds of clips eventually replacing the recordings of trainers in VR workout apps like Supernatural, or powering professional educational content, because they allow users to view instructions from all angles.
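For a rough sense of why 9GB per minute is too much to stream, here is a quick back-of-the-envelope conversion of those figures into bitrates, assuming decimal gigabytes and a constant data rate:

```python
# Back-of-the-envelope conversion of the data rates quoted above into
# streaming bitrates (assumes 1 GB = 8,000 megabits and a constant rate).
def gb_per_minute_to_mbps(gb_per_min: float) -> float:
    return gb_per_min * 8_000 / 60

for label, rate in [("current clips", 9.0), ("stated target", 2.5), ("180-degree stereo", 1.0)]:
    print(f"{label}: {rate} GB/min ≈ {gb_per_minute_to_mbps(rate):.0f} Mbps")

# current clips: 9.0 GB/min ≈ 1200 Mbps
# stated target: 2.5 GB/min ≈ 333 Mbps
# 180-degree stereo: 1.0 GB/min ≈ 133 Mbps
```

Even the lower targets sit well above the bitrate of a typical 4K video stream, which illustrates why shrinking those files matters so much.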

Meta's ambitious plans for Gaussian splats

Meta has built one of the most ambitious Gaussian splat demos to date. At this fall's Meta Connect conference, the company launched Hyperscape, an app for the Meta Quest headset that lets users explore photorealistic 3D renderings of real spaces. The app launched with scans of six spaces, including five artist studios and a conference room on Meta's campus that once housed Mark Zuckerberg's office.

Hyperscape lets you move freely around these spaces, and the visual fidelity makes doing so a mesmerizing experience. You can browse the many oddities in mixed-media artist Dianne Hoffman's San Francisco studio, including countless dolls and a box labeled "Snake Skin and Seashells." You can marvel at visual artist Daniel Arsham's extensive Porsche collection, and even look at the ferns and trees outside Zuckerberg's old office window. The renderings feel so authentic that Meta felt it necessary to add a warning not to lean on any of the furniture depicted.

For now, Hyperscape is little more than a tech demo. However, Meta has big plans for Gaussian splats, as Mark Rabkin, Meta's VP of Horizon OS and Quest, told me at Meta Connect this fall. "Gaussian splats are already running on our engine, which is basically the Horizon engine," Rabkin said, referring to Meta's social VR platform. "So technically the path to making it work in the real world is quite short."

Meta envisions splats as yet another tool for VR creators to build immersive worlds and experiences in Horizon Worlds. The company even plans to eventually let anyone scan their own home and upload a digital copy of it to the metaverse. "Of course," said Rabkin. "That's what we're working towards."

"Is there a way they can scale it up? I don't know."

It's unclear how long that work will take; whether Horizon Worlds will still exist in its current form by then is another question entirely. Meta declined to be interviewed for this article, but Niantic's McClendon warned against underestimating the complexity of building a scanning tool like Hyperscape.

"They've basically produced perfect vision," McClendon said. He suggested that Meta would likely do multiple scans of each room, and likely do a lot of manual editing and cleanup. Because the resulting scans are too large to be processed in real time on the device, Meta renders them in the cloud and streams them directly to the headset.

“It’s not scalable, but it looks really good,” McClendon said. "Is there a way they can scale it up? I don't know."

A clear shot at the holodeck

Gaussian splatting technology is advancing rapidly. McClendon told me that new scientific papers on the topic are being published at a pace reminiscent of generative AI research. “Papers are published so quickly now,” he said. "The excitement is real." Chennavasin said the technology described in those papers gets implemented quickly, “or becomes a startup.”

One area ripe for breakthroughs is the combination of splats and artificial intelligence. Generative AI can improve the capture and rendering of Gaussian splats, potentially allowing companies like Gracia AI to capture video with fewer cameras. At the same time, more people capturing 3D objects and scenes will significantly increase the amount of high-quality training data available for generative 3D video models.

"It didn't happen overnight. But it's clear now."

All of this points to a future in which ordinary people can generate realistic 3D spaces via AI prompts, Gaussian splatting, or a mixture of the two, and then step into those spaces with VR headsets or AR glasses.

“The killer app for XR is multiplayer holodecks,” Chennavasin said, “created with generative AI and Gaussian splats at a visual fidelity that is almost indistinguishable from reality. It's not happening overnight. But it's clearly in sight now.”

A future this tangible raises the question: if you had a holodeck, what would you visit first? An authentic recreation of a faraway place you haven't had the chance to visit yet? A famous recording studio, museum, or library? Or perhaps fantasy worlds: medieval castles, dungeons, or Marvel movie sets?

For me, it might just be my mom’s cozy little house and that rickety garden bench.