VR is Dead. Or is it?

• December 9, 2016

This is a reprint of a LinkedIn article I published on Dec. 4.

[Image: a bulky professional multi-camera VR capture rig]

Your mother is not going to buy and use this camera system.

Recently Shaw Walters took to LinkedIn to write a brief article about why he thought “VR is Dead”, or at least stillborn. He comes at this from the point of view of storytelling, and how VR isolates the viewer. There is nothing to disagree with in the latter point — strapping on a headset does isolate the viewer. The storytelling part, however, requires more thoughtful analysis. Beyond both of those, there are many applications of VR besides mere entertainment; Shaw touched on one briefly in his article — volumetric capture and display — but did not fully expand upon its applications. In addition, “VR” as a term encompasses many different things to different people, namely:

• 360 degree images (panoramas)

• 360 degree videos (what some mistakenly call VR)

• 360 degree stereoscopic videos (my definition of true VR)

• Augmented reality (this is not VR)

• Volumetric (this is definitely not VR)

I will attempt to go through each one, because each has a different impact on storytelling and UX, and each certainly has applications beyond entertainment. My thesis is that VR is not, in fact, dead — while agreeing with Shaw’s premise that true VR is a stepping stone to something bigger.

360 degree images (panoramas)

These have been around for quite a long time, going back even to analog film capture systems. In the past, creating one required multiple photos from a single camera to be stitched, exposure-matched, and more; the advent of good stitching software and elaborate hardware capture systems has made this much easier. Panoramas have always been popular as a way to capture a moment in time in its totality from a single perspective — even when they were difficult and technically challenging to create. Importantly, displaying and rendering this content doesn’t require a headset. Facebook, for example, has fully embraced the format on its timeline, as have other social networks (Twitter, Instagram, the list goes on). One recent change that helped spur this is the advent of easy-to-use single-shot 360 cameras such as the 360Fly and Ricoh Theta, which has resulted in a veritable explosion of 360 content on these social networks. Although the experience of a 360 photo is certainly more immersive with a VR headset, a desktop with a mouse to look around, or a mobile phone you physically move around, is plenty to experience the totality of the 360 view. Just as a single data point: Ricoh’s new 360 camera business contributed significantly to its Camera and Leasing unit, which did about $700M in the most recent fiscal year, and the addition of those product lines apparently increased that unit’s operating profit by over 100% year over year. This implies unit shipments in the millions. In my estimation, this format is not going anywhere — it will continue to grow, especially as higher quality capture devices become available, as they inevitably will.

360 degree videos (what some mistakenly call VR)

This is a medium in its own right — but again, it’s not VR because, like a panorama, it does not require strapping on a headset in order to fully experience it. This medium does, however, have a significant impact on storytelling. No longer does the cinematographer or casual shooter get to direct the viewer’s attention by pointing the camera in a certain direction and framing the shot. Lighting, gaffing, and crew positioning are complicated because the camera sees everything, in 360 degrees, including the crew. (Enter a big rock, stage right.) None of this means that storytelling is impossible, though. For example, I could imagine a long continuous single-shot sequence like the famous and intense street battle in Alfonso Cuarón’s adaptation of P.D. James’s “Children of Men” being shot in 360 successfully — the action is what directs your attention, and the 360 might even add suspense and heighten the intended disorientation and anxiety. Other tools can be brought to bear as well, such as spatial audio — the direction of someone’s monologue or the crack of a rifle can immediately draw a viewer’s attention to a particular direction. However, this medium does preclude the cinematographer’s basic tools of the trade such as framing, soft focus, and shallow depth of field, and it even complicates later tasks such as color grading. It is less a passive-consumption medium than an interactive one, and it requires a new storytelling “language” that the typical cinematographer does not yet possess. Nevertheless, the aforementioned 360 camera systems — which shoot video as well — are democratizing the creation of this kind of content by the masses, and the social media networks are now supporting the medium. Brand advertisers are following right behind in adoption and adaptation. For these reasons, this type of “VR” is not dead, either — I expect continued adoption and growth in cameras, platforms, and content with 360 video.

360 degree stereoscopic videos (my definition of true VR)

This medium is where Shaw’s opinions and mine converge, somewhat. Unlike the previous two, this medium does require the viewer to strap on a headset — which at best makes them look like Geordi from Star Trek: The Next Generation, and at worst like a Cylon from Battlestar Galactica. Furthermore, this medium’s display technology is a single-location experience — seated, standing in place, or at best in a room with no tables and chairs to trip over. What drives this requirement is the need to present a different image to each eye for a stereoscopic experience. There’s nothing Facebook’s timeline can do to accomplish that on a webpage, so headgear like the Oculus will always be required. This will never be a medium experienced “on-the-go”. Creating content in this medium is also fraught with difficulty — all of the foregoing rules and limitations for “simple” 360 video still apply. If creating computer-generated imagery, the rules of the road are well known. With live action, much less so. If filmmakers had difficulty doing *good* stereoscopic 3D with normal cameras, doing so with 360 camera rigs is ten times as difficult (if not more). Nevertheless, I think we will see good VR content produced in the next few years, but it will be a high-end niche — the realm of VR “cafes”, amusement parks, and the high-end gaming crowd. Google Cardboard presents an interesting counterpoint, but I think its widespread adoption will be limited to branding campaigns and real-estate tours, not entertainment. As much as Mark Zuckerberg talks up 360 social experiences with avatars and such — I’m not buying it. So for this one I call it definitely not dead, but a small and growing niche.

Augmented reality (this is not VR)

Most assume that augmented reality (AR) requires a headset of some sort — it does not, as evidenced by applications like Aurasma for mobile devices, with Pokémon Go being a recent (if frivolous) example. However, my belief is that discreet eyewear is where this will eventually lead — everyone will wear it if it’s stylish and unobtrusive and adds value to their daily lives, whether through entertainment and amusement, timely information, or learning. Microsoft’s HoloLens is a window to the future, in my opinion. However, it is not the present, because it is a) too expensive, b) too clunky, at least for now, and finally c) not high-resolution enough to subtend a significant amount of the viewer’s field of view. What about “We-have-a-secret-but-we’re-not-telling-you” Magic Leap? I can’t really comment on them because I don’t know anything — they won’t tell me. One of the biggest problems with eyewear AR, though, is this: how do you show black when you’re staring at a bright object in real life? Anyone who solves that problem effectively, and who can put the solution in an otherwise ordinary-looking pair of reading glasses, will create a massive market for this medium. What about storytelling in AR? This is 360, so a lot of the same rules for 360 video apply. It’s stereoscopic, too, so all of the usual difficulties of 360 stereoscopic apply as well, at least if one wants live-action AR. At the very least, it will require the creation of a new language for storytelling. That language may be different, but that doesn’t mean it won’t be created.

Volumetric (this is definitely not VR)

Shaw touched on the capture of multiple images to create 3D models; in the entertainment space we view these tools primarily in the context of visual effects. They can certainly assist with the creation of 360 and VR (and AR) content, but in and of themselves tools such as multi-image modeling and motion capture are not VR. Nevertheless, I predict (which is easy, because I’m seeing it in action now) an explosion in the availability and proliferation of these tools, and not just for VFX and entertainment. In the commercial space, applications such as mapping, construction site management, cell tower inspection, and infrastructure inspection (think bridges, pipelines, and power lines) are just now utilizing the power of volumetric capture. Using tools such as Pix4D, DroneDeploy, and Sketchfab, they turn multiple high-resolution photos or video into volumetric models that are accurately calibrated for distances, thereby creating new and exciting markets and applications for data that previously was not possible to create.
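To make “accurately calibrated for distances” concrete: the number these survey tools ultimately trade on is the ground sampling distance (GSD), i.e. how much real-world ground each pixel covers. Here is a minimal back-of-the-envelope sketch using the standard photogrammetry formula; the camera numbers are illustrative placeholders, not any specific model.

# Back-of-the-envelope ground sampling distance (GSD): how many centimeters
# of ground each pixel covers. Standard formula: GSD = (Sw * H * 100) / (F * imgW),
# with sensor width Sw and focal length F in mm, and altitude H in meters.
def gsd_cm_per_px(sensor_width_mm, focal_length_mm, altitude_m, image_width_px):
    return sensor_width_mm * altitude_m * 100.0 / (focal_length_mm * image_width_px)

# Illustrative numbers only: a 13.2 mm wide sensor, 8.8 mm lens,
# 4000-pixel-wide frames, flown at 60 m.
print(f"{gsd_cm_per_px(13.2, 8.8, 60.0, 4000):.2f} cm/px")  # ~2.25 cm/px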

Conclusion

Is VR dead? No. Depending on how you define VR, it’s just getting started. 360 images and video are entrenched already, with a healthy ecosystem of capture devices, content, and platforms to distribute and display them. Strap-on VR headsets will be a niche, but a growing one. AR is the inevitable future, and volumetric capture is already finding applications beyond media and entertainment today.


DJI Inspire, 3D GIS Survey Tests with DroneDeploy.com and GemVR

• October 20, 2016

In this video, we take the DJI Inspire for a couple of test flights to gather 4K photos for a 3D geographic survey. Along for the ride was a 360Fly VR “witness camera” to document things as we went along. We applied image stabilization and color/exposure balancing in post to make the 360 video a bit easier to watch, as the 360 camera was not on a gimbal. DroneDeploy.com is a very cool service that allows for automatic flight planning and survey capture; it then processes the captured imagery in the cloud to produce a myriad of GIS data such as elevation maps, stitched ultra-high-resolution orthophotos, and 3D maps.

Aerial Stereoscopic 360 Video And Images For True Live-Action Virtual Reality (Part 3 of 3)

• October 17, 2016

To round out this three-part post, I will describe some of the workflow that went into creating the video below:

You can also view this on Vrideo.com if you have an Oculus Rift or similar stereo VR headgear, here:

http://www.vrideo.com/watch/bedpkTNA

Now, to be clear, I expected that there would be no parallax along the 90-degree yaw angle, and that there would be eye-rips due to the gimbaled camera being in the way of the back view; I used the gimbal camera for FPV monitoring while I flew. There was no global shutter nor synchronized shutter, either. The main purpose of this test flight was to check out weight and balance, ensure safe flight, and test GemVR’s image enhancement and matching functions.

That being said, some interesting things were learned in producing the video above.

First, juggling multiple camera systems (and a drone) that all want to communicate on 2.4 GHz is a challenge. I had to tune the channel management on the DJI Inspire to get everything to play nice together. Next is the fact that this camera system (the 360Fly) and the others I’ve used (the Ricoh Theta, for example) have their signals drowned out by the DJI (and their range is rather limited, too), so your only choice is to put the cameras into recording mode manually before the flight.

This in turn makes synchronizing the footage from the two cameras challenging — in retrospect, I should have used a clapboard. Instead, I just used Premiere to eyeball it and match as best I could. Additionally, since the two cameras are not on a single, synced global shutter, there are some eye-rips during fast motion. Also, the B-roll at the beginning of the video had to be re-projected into equirectangular space — for that I used Kolor AutoPano Video.
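For what it’s worth, a clapboard-free alternative exists whenever both cameras record audio: cross-correlate the two soundtracks and shift one clip by the lag at the correlation peak. A minimal sketch, assuming mono WAV files already extracted at a common sample rate (e.g. via ffmpeg -i cam_a.mp4 -ac 1 -ar 48000 cam_a.wav); the file names are placeholders.

# Estimate the sync offset between two cameras from their soundtracks.
import numpy as np
from scipy.io import wavfile
from scipy.signal import correlate

rate_a, a = wavfile.read("cam_a.wav")
rate_b, b = wavfile.read("cam_b.wav")
assert rate_a == rate_b, "resample first if the rates differ"

n = min(len(a), len(b))
a = a[:n].astype(np.float64)
b = b[:n].astype(np.float64)

# Full cross-correlation; the position of the peak gives the lag in samples.
xcorr = correlate(a - a.mean(), b - b.mean(), mode="full")
lag_samples = int(xcorr.argmax()) - (n - 1)
print(f"shift camera B by {lag_samples / rate_a:+.3f} s to match camera A")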

The images for the two camera views were color-corrected and matched by GemVR’s magic software, which was the real purpose of the test…

No image stabilization was performed on the above video, but Kolor’s AutoPano Video would probably do a good job of that, especially since it is stereo-3D-aware.

Now that we’ve proven out a safe flight operation and pattern with witness camera(s), the next step will be to use four 360 cameras such as the Ricoh Theta S to achieve full parallax along every direction of yaw, with a rig like the one below:

[Image: four-camera mounting rig]

The key here is that the stereo parallax will come from disparity measurement, since all the cameras are at 90-degree offsets; but because each camera covers more than 180 degrees, there should be enough overlap to create stereo 3D as long as there aren’t a lot of nearby objects causing large disocclusions. This is basically what quite a few others are doing, such as Facebook with their Surround 360 camera rig, but this is perhaps a little more ambitious since it’s a much lighter-weight rig (able to be flown on a DJI Inspire, for example).
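As a sketch of what “disparity measurement” means in practice, here is a minimal OpenCV block-matching example on an already-rectified left/right pair. This is not the full pipeline (a fisheye rig like the one above would need calibration and rectification first), and the file names are placeholders.

# Compute a disparity map from a rectified stereo pair with semi-global
# block matching; larger disparity = closer object.
import cv2

left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=64,  # search range; must be divisible by 16
    blockSize=7,
)
# compute() returns fixed-point values scaled by 16
disparity = matcher.compute(left, right).astype("float32") / 16.0

# Normalize to 8 bits for a quick visual sanity check
preview = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX)
cv2.imwrite("disparity_preview.png", preview.astype("uint8"))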

Also, for an example of a “simple” monoscopic 360 witness camera on the DJI Inspire, you can see a pretty cool waterfall shoot here:

http://www.vrideo.com/watch/FqRkahU

This setup seemed to be pretty effective, aside from the FPV gimbal being “in the way” — but the 4K footage from the camera on the gimbal was actually the payload of that particular flight; the 360Fly was just along for the ride as a witness camera.

I’ll post again on this subject when we have an update about the 4-camera Ricoh rig. We’re also working with a couple of new innovative VR camera companies, which could turn out to be very useful for aerial VR 360 shooting. Stay tuned!

Aerial Stereoscopic 360 Video And Images For True Live-Action Virtual Reality (Part 2 of 3)

• August 3, 2016

To continue this three-part series: in the first part we covered the Ricoh Theta S camera system; in this second part we’ll show how we married two of them together to form a stereoscopic VR rig, along with the post-processing steps needed. In the third part we’ll cover what we did to take aerial VR 360 with it on a DJI Inspire 1 drone.

First, our camera “accessory tool bag” is quite elaborate:

[Image: the camera accessory tool bag]

The most important thing to note here is that this is a customized extension of a Fairburn Multiplate Kit (available at http://www.mio3d.com) — Sean Fairburn puts together some great, very clever and well-thought-out kit. It is awesome for when you want to marry things together that normally wouldn’t go together, and when you want to experiment. We also added some tools, step rings, a bunch of Israeli/articulating arms, and a focus chart. The focus chart is useful since we’re shooting at very high resolution — 5376×2688, more than 4K for each camera. We used a Fairburn 3×1 multiplate (the little plate at the bottom left) for what you’ll see next… and yes, it all barely fits in the case (except for the digital micrometer, which has its own case):

[Image: the kit packed into its case]

To set this rig up you need the 3×1 Fairburn multiplate, two 1/4-20 screws (provided in the kit, amongst many other things you’ll never think you need until you need them), and a Phillips screwdriver to put it all together, arranged like so:

[Image: the two Theta S cameras mounted on the 3×1 multiplate]

Now, with the mounting plate you can twist this directly onto any standard tripod mount or any other 1/4-20 thread — in this case, to show how light and flexible the rig is, we mounted it on the selfie stick (Smatree® SmaPole Q3) mentioned in Part 1:

[Image: the dual-camera rig mounted on the selfie stick]

Taking stereoscopic VR photos or video with this rig requires two phones or tablets, each connected to one of the two cameras’ WiFi access points. I’d suggest using two of the exact same model of phone so the app performance and latency are identical — because you’re going to need to hit the shutter button at the exact same time on both, whether doing timelapse still images at 16 Mpel or video at 2 Mpel. Mismatches can be corrected in post-production… up to a point. It’s better to match the settings and the shutter timing as closely as possible, as in the sketch below.
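One possible way to tighten that up: the Theta S speaks the Open Spherical Camera (OSC) HTTP API over its WiFi access point, so the shutters can in principle be fired from a script rather than two thumbs. A sketch follows, with the big caveat that each camera serves its API at 192.168.1.1 on its own access point, so reaching two at once needs two network interfaces; the second URL below is a placeholder for however you route to it, and depending on firmware you may first need a camera.startSession call.

# Fire two Theta shutters as near-simultaneously as a script allows,
# using the OSC API's camera.takePicture command.
import threading
import requests

CAMERAS = ["http://192.168.1.1", "http://192.168.2.1"]  # second URL is hypothetical

def take_picture(base_url):
    # Newer firmware (OSC API 2.1) needs no session; older firmware may
    # require camera.startSession first and a sessionId parameter here.
    requests.post(f"{base_url}/osc/commands/execute",
                  json={"name": "camera.takePicture"}, timeout=10)

barrier = threading.Barrier(len(CAMERAS))  # line both threads up first

def worker(url):
    barrier.wait()
    take_picture(url)

threads = [threading.Thread(target=worker, args=(u,)) for u in CAMERAS]
for t in threads:
    t.start()
for t in threads:
    t.join()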

Another interesting property of this rig is that it’s fairly forgiving of alignment — the individual cameras take care of most of the stitching and camera removal. However, lining up the cameras carefully with a caliper/micrometer pays off, in that the amount of “other camera” each camera sees can be minimized, which makes “rig removal” or inpainting in post-production much easier:

[Image: close-up of the aligned cameras]

One note about the monopod arrangement here — it’s a little top-heavy, so when setting the monopod up, make sure it’s on a flat, stable surface. We made the mistake of attempting a midnight moonless-night shoot, putting the monopod and plate rig in an elevated flower pot. The result was tragically predictable — the monopod tumbled about 8 feet to the ground and landed with one of the cameras head-first, taking the full impact. Thankfully the optics didn’t get scratched up, so we were able to perform a little microsurgery and put Humpty Dumpty back together again:

Here is “first light” from the rig. You can view this on the GemVR channel on YouTube (along with other experiments), or the GemVR channel on VRideo.com if you have an Oculus Rift and the VRideo app. You don’t need a Google Cardboard or an Oculus Rift — if you view it with just a computer or phone it will be monoscopic (but still cool). Be sure to set the YouTube resolution setting to the highest allowable:

Now, two problems with this — you can still see the plate and mounting, plus the cameras are “seeing each other”. One can use Photoshop CC, which has a great in-painting function, or a free tool called G’MIC (http://www.gmic.eu). Since the rig removal occurs at the bottom of the equirectangular projection(s), it’s best to convert to a cube map projection before inpainting — we used PTGui (http://www.ptgui.com):

[Image: PTGui screenshot]

Which netted the following cube-map projections:

[Images: left- and right-eye cube-map projections]

Once in cube map form, rig removal and inpainting is easy (we used Photoshop CC’s context-aware inpainting in this case).  Note the bracket, tripod kickstand legs, and shadow have magically disappeared:

[Images: inpainted left- and right-eye cube-map projections]
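If you’d rather script this projection step than run it through PTGui, the resampling itself is compact. Below is a minimal sketch that extracts just the nadir (“down”) face, the one the rig lives in, from an equirectangular frame; nearest-neighbor sampling keeps it short, and the file names are placeholders.

# Pull the "down" cube face out of an equirectangular image so the rig
# can be inpainted on an undistorted view.
import numpy as np
from PIL import Image

equi = np.asarray(Image.open("equi_frame.jpg"))
H, W = equi.shape[:2]
S = 1024  # output face size in pixels

# Grid over the face, mapped to 3D rays pointing straight down (y = -1)
a, b = np.meshgrid(np.linspace(-1, 1, S), np.linspace(-1, 1, S))
x, y, z = a, -np.ones_like(a), b
r = np.sqrt(x**2 + y**2 + z**2)

lon = np.arctan2(x, z)  # longitude in [-pi, pi]
lat = np.arcsin(y / r)  # latitude; spans the bottom of the sphere

# Map angles to equirectangular pixel coordinates and sample
u = ((lon / (2 * np.pi) + 0.5) * (W - 1)).astype(int)
v = ((0.5 - lat / np.pi) * (H - 1)).astype(int)
Image.fromarray(equi[v, u]).save("down_face.png")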

For video, there’s good news: the mask for the rig removal does not change from frame to frame, so one can automate this process using ImageMagick and G’MIC (which we’ll demonstrate in Part 3). Then we go back to equirectangular projection — PTGui won’t do this easily, so for this step we used Pano2VR (available at http://www.ggnome.com):

[Image: Pano2VR screenshot]

Next, we create a stacked stereo (top-and-bottom, or TaB) video from these single left-eye and right-eye frames with the following ffmpeg commands, run in succession:

ffmpeg.exe -loop 1 -i left\left_inpainted_equi.jpg -t 15 -pix_fmt yuv420p -vcodec rawvideo Left.avi
ffmpeg.exe -loop 1 -i right\right_inpainted_equi.jpg -t 15 -pix_fmt yuv420p -vcodec rawvideo Right.avi

ffmpeg.exe -y -r 0.405 -i Left.avi -r 0.405 -i Right.avi -filter_complex "[0:v][1:v]vstack[v]" -map "[v]" -s 5376x5376 -pix_fmt yuv420p -vcodec rawvideo stacked.avi

This next one encodes it at a resolution and file format that Youtube will understand:

ffmpeg.exe -y -r 0.405 -i stacked.avi -s 4096x2160 -pix_fmt yuv420p -vcodec libx264 -profile:v high -level 52 -crf 18 stacked.mp4

The final step is to use Google’s free “Spatial Media Metadata Injector”, available here. This tool lets you mark the video both as 360-degree and as stereoscopic.

[Image: Spatial Media Metadata Injector screenshot]
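If you’re scripting the whole pipeline, the same injector also ships as a Python module in Google’s spatial-media repository (github.com/google/spatial-media), so this step needn’t be manual. A sketch; the flag names are taken from that repo’s documented CLI, so verify against the version you download:

# Inject 360 + top-bottom stereo metadata from a script instead of the GUI.
import subprocess

subprocess.run(
    ["python", "spatialmedia", "-i", "--stereo=top-bottom",
     "stacked.mp4", "stacked_injected.mp4"],
    check=True,
)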

Now it’s ready to upload to YouTube, VRideo, and other services.  Here’s the final result:

YouTube (works with Google Cardboard, obviously):

https://www.youtube.com/watch?v=s_Y7k7y5ZqE

VRideo (works with Oculus Rift):

http://www.vrideo.com/watch/bbwj73ts

There are a few eye-rips here and there, but it’s not too bad for a first-pass shot.

In the third and final part of the series, we’ll go over how we mated this rig to a DJI Inspire 1 Raw drone in order to capture aerial stereoscopic VR. In that case the “rig removal” includes the entire drone platform, which requires some elaborate inpainting!

Aerial Stereoscopic 360 Video And Images For True Live-Action Virtual Reality (Part 1 of 3)

• July 16, 2016

In a previous post, I described in a three-part series the research project we did a few years ago (2011) to capture stereoscopic 3D video at A2Zlogix. Over at GemVR we’ve been experimenting with 360 degree video (including stereoscopic) using camera systems such as the Ricoh Theta S, and marrying them to drone platforms for aerial stereoscopic 360 degree video experiences worthy of today’s best VR viewing platforms. At only around $350 each, even using two Ricoh Thetas for stereoscopic capture is now approachable — if only you could get the stereoscopic part to work well. In this first post of the series, we’ll talk about the Ricoh Theta camera we’re using, its strong points, and some of its quirks.

To cut to the chase: if you have a Google Cardboard (or Oculus Rift or similar) you can experience acrophobia below — the first time I viewed this, I had to grab onto a table. We may be the first company to achieve aerial full-stereoscopic live-action 360 degree video — this is harder than it sounds. When your stereoscopic VR rig weighs 50 kg, you need a Sikorsky helicopter, not a drone. We did this with a DJI Inspire 1 Raw and two Ricoh Theta S’s, with the mounting plate included coming to around 325 grams in total. You can see the result here:

This was just a first aerodynamic and load/moment test to make sure that the Theta S’s “bunny ears”

[Image: the Theta S “bunny ears” mounted on the Inspire]

didn’t unduly affect the control dynamics of the aircraft, so forgive the eye-rips if you view this with an Oculus Rift or a Google Cardboard. It’s also not full-motion video — we had the cameras set to interval recording mode, shooting every 8 seconds — but it gives you time to look around a bit before the frame changes.

But first let’s go over the basic camera system in this Part 1 of 3.

The Ricoh Theta S is a pretty nifty little camera, smaller than an iPhone 5, and capable of shooting 360 degree photos and videos. To do so it employs two fisheye lenses on opposite sides of the camera body, each shooting a hemispherical fisheye image with more than 180 degrees field-of-view. The camera is also brain-dead simple to use, as it stitches its own images — normally the hardest challenge with 360 video or photography, requiring either expensive software or, in the case of free software, a good deal of finesse and expertise to create a worthy final result.

For conducting shoots like virtual tours, it’s very well thought out, since the camera is almost completely controlled via its own WiFi access point. This means that you can set the camera up on a selfie stick or monopod, duck behind a bush (or a door in a different room), and push the shutter button in the app on your phone — ensuring that you’re not obstructing the shot. Since the camera captures the whole environment, one has to be mindful of the photographer being in the shot! The WiFi control effectively alleviates that problem in many situations. Very clever.

The camera is very small, and I’ve found the most flexible and convenient mounting platform to be a Smatree® SmaPole Q3 monopod with its accompanying micro-tripod (what some these days would call a selfie-stick with a kickstand), seen here:

[Image: the Theta S on the SmaPole Q3 monopod]

A quick turn of the telescoping rod allows the monopod to be extended to eye height, which is usually the ideal shooting position. This very simple setup allows for some pretty cool 360 photography at 12 megapixels (final images are 5376×2688). Still images are stored as equirectangular projections, stitched automatically by the camera. Here’s an example:

[Image: 360 night-sky photo, equirectangular projection]

This image points out another feature of the camera: it can shoot HDR images — or to be very precise, automatically shoot bracketed exposures. Meantime, you can let the camera select its own shutter speed or provide your own. To get a good sky exposure I used a shutter speed of 60 seconds to capture the above image; the main ISO ended up at 1600. For indoor photographs this is pretty important, since it’s common to have a dark side of the room and an outside window or windows that would typically be completely blown out/overexposed without a bracketed shot.
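If you want to combine those brackets yourself rather than let the camera do it, OpenCV ships a Mertens exposure-fusion implementation that blends them without any true HDR/tone-mapping step. A minimal sketch; the file names stand in for the camera’s bracketed outputs.

# Fuse bracketed exposures into one balanced frame (Mertens fusion).
import cv2
import numpy as np

brackets = [cv2.imread(p) for p in ("dark.jpg", "mid.jpg", "bright.jpg")]
fused = cv2.createMergeMertens().process(brackets)  # float result, roughly [0, 1]
cv2.imwrite("fused.jpg", np.clip(fused * 255, 0, 255).astype(np.uint8))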

Once you’ve taken the shot, it’s pretty much ready to upload to your Facebook timeline or any other platform that supports 360. Ricoh provides a free Flickr-like service called theta360.com which can host the images (or video). This brings us to an important matter — playback and reconstruction. Shooting 360 is all well and good, but how do you show it?

There are many ways to do this, but Facebook has a pretty clean workflow. All you have to do is upload the 360 photo to your timeline — the metadata in the photo tells Facebook that it’s a 360 equirectangular image, and it knows to embed the right kind of JavaScript player to render it.
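That metadata is the XMP “GPano” block defined by Google’s photo-sphere spec, and you can inspect or stamp it yourself with exiftool. A sketch; the tag names follow the GPano spec as exiftool exposes them, so verify the write against your target platform:

# Inspect, then set, the GPano tags that tell players an image is 360.
import subprocess

# List whatever GPano metadata the camera already embedded:
subprocess.run(["exiftool", "-XMP-GPano:all", "pano.jpg"], check=True)

# Stamp a plain equirectangular JPEG so players treat it as a 360 photo:
subprocess.run(
    ["exiftool",
     "-XMP-GPano:ProjectionType=equirectangular",
     "-XMP-GPano:UsePanoramaViewer=true",
     "pano.jpg"],
    check=True,
)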

You can see a less over-exposed playback of the image above here, hosted on theta360.com (be sure to look up!):

[Embedded 360 image: “Milky Way nightshot, 360 degrees. ISO 1600, shutter speed 60 seconds. Liking this camera more and more… #theta360”]

We’ll go over combining two Theta S’s on a Fairburn Multiplate in Part 2.

Digital Representations of Light/Sound Fields for Immersive Media (jpeg.org)

• June 16, 2016

Today I ran into a very interesting newly published technical report from the JPEG organization that deals with coding and representing light and sound fields (read: virtual reality and more).  It’s a very dense report, but if you’re interested in VR/AR, lightfield, or multichannel object-based audio, it’s worth a glance.

It reminds me of some of the work I did back in the day with MPEG-4 BIFS, abusing it to efficiently encode VRML…

The publication from JPEG.org is here: https://jpeg.org/downloads/jpegpleno/wg1n72033_20160603_report_jahg_light_sound_fields.pdf


eleVRant: 360 Stereo Consumer Cameras?

• June 12, 2016

Over at eleVRant, I discovered a very well-thought-out article about the possibilities for 360 degree stereoscopic camera arrangements. While cameras like the Kodak Pixpro and the Ricoh Theta S are becoming mainstream monoscopic 360 degree cameras (which is cool enough), this article explores what would be involved in doing the same in stereoscopic 3D. Have a read here: http://elevr.com/elevrant-360-stereo-consumer-cameras/