Deep Learning

Unified Camera Positional Encoding for Controlled Video Generation

Camera-controlled text-to-video generation, now with intrinsics, distortion and orientation control!

PanFlow: Decoupled Motion Control for Panoramic Video Generation

PanFlow is a framework for controllable 360° panoramic video generation that decouples motion input into two interpretable components: rotation flow and derotated flow.

PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

We present PanSplat, a generalizable, feed-forward approach to panorama synthesis that efficiently supports resolutions up to 4K (2048 × 4096).

Taming Stable Diffusion for Text to 360° Panorama Image Generation

We introduce PanFusion, a novel dual-branch diffusion model that generates a 360° panorama image from a text prompt.

DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization

We propose a novel method for panoramic 3D scene understanding that recovers the 3D room layout and the shape, pose, position, and semantic category of each object from a single full-view panorama image.

Holistic 3D Scene Understanding from a Single Image with Implicit Representation

We present a new pipeline that takes a single image as input, estimates the layout and object poses, and then reconstructs the scene with a signed distance function (SDF) representation.

Deep Learning Server System

A deep learning system for small research teams, with multiple GPU servers, centralized network storage, server and user management, a 10 GbE network, and a per-server SSD cache.