Playable Environments: Video Manipulation in Space and Time
Ricci, Elisa
2022-01-01
Abstract
We present Playable Environments, a new representation for interactive video generation and manipulation in space and time. Given a single image at inference time, our framework lets the user move objects in 3D while generating a video by providing a sequence of desired actions. The actions are learnt in an unsupervised manner, and the camera can be controlled to obtain the desired viewpoint. Our method builds an environment state for each frame, which can be manipulated by our proposed action module and decoded back to image space with volumetric rendering. To support diverse object appearances, we extend neural radiance fields with style-based modulation. Our method trains on a collection of monocular videos and requires only estimated camera parameters and 2D object locations. To set a challenging benchmark, we introduce two large-scale video datasets with significant camera movements. As our experiments show, playable environments enable several creative applications not attainable by prior video synthesis works, including playable 3D video generation, stylization, and manipulation. Project page: willi-menapace.github.io/playable-environments-website.
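The abstract states that the environment state is decoded back to image space with volumetric rendering. As a rough illustration only, and not the authors' implementation, the sketch below shows the standard compositing rule commonly used with neural radiance fields: samples along a camera ray are blended according to their density and the light still unabsorbed when the ray reaches them. The function name `composite_ray` and its arguments are hypothetical, and the style-based modulation and action module described above are not modelled here.

```python
import math


def composite_ray(densities, colors, deltas):
    """Blend samples along one ray with the standard radiance-field quadrature.

    densities: non-negative volume density at each sample (hypothetical input)
    colors:    RGB triple at each sample, ordered from near to far
    deltas:    distance between consecutive samples along the ray
    Returns the composited RGB pixel and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        # Opacity contributed by this segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for c in range(3):
            pixel[c] += weight * rgb[c]
        # Light that survives this segment continues to the next sample.
        transmittance *= 1.0 - alpha
    return pixel, transmittance


# Usage: an empty sample followed by a fully opaque green one.
pixel, remaining = composite_ray(
    densities=[0.0, 1e9],
    colors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    deltas=[1.0, 1.0],
)
```

In this usage example the first sample has zero density and contributes nothing, so the pixel takes the colour of the opaque sample behind it.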