We present PanSplat, a generalizable, feed-forward approach that efficiently supports resolution up to 4K (2048 × 4096).
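To make the 2048 × 4096 figure concrete, the following is a minimal sketch that assumes the panorama is stored in the common equirectangular layout (2:1 aspect ratio, columns spanning 360 degrees of longitude, rows spanning 180 degrees of latitude). The source does not state the projection or any axis convention, so the function name `pixel_to_ray` and the choice of +z as the forward direction are illustrative assumptions only.

```python
import math

HEIGHT, WIDTH = 2048, 4096  # the 4K panorama resolution quoted above


def pixel_to_ray(row, col, height=HEIGHT, width=WIDTH):
    """Map a pixel center to a unit viewing direction (x, y, z).

    Assumed convention (not from the source): longitude grows with the
    column index, latitude is +pi/2 at the top row, and +z is "forward".
    """
    lon = (col + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (row + 0.5) / height * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


# Angular footprint of one pixel column, in degrees.
degrees_per_column = 360.0 / WIDTH
```

Under these assumptions each pixel column covers a little under 0.09 degrees of longitude, and the pixel nearest the image center maps to a direction that is almost exactly forward.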
We introduce a novel dual-branch diffusion model named PanFusion to generate a 360-degree image from a text prompt.
We propose a novel method for panoramic 3D scene understanding that recovers the 3D room layout and the shape, pose, position, and semantic category of each object from a single full-view panorama image.
We present a new pipeline that takes a single image as input, estimates the layout and object poses, and then reconstructs the scene using a signed distance function (SDF) representation.
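As a reminder of what an SDF representation encodes, here is a minimal sketch using analytic primitives: the function returns a negative value inside a surface, zero on it, and a positive value outside, and a scene made of several objects can be expressed as the pointwise minimum of the per-object distances. This is not the pipeline's learned reconstruction; the primitives, the toy scene, and all function names are hypothetical and chosen only for illustration.

```python
import math


def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere."""
    return math.dist(p, center) - radius


def box_sdf(p, center, half_extents):
    """Signed distance from point p to an axis-aligned box."""
    q = [abs(pi - ci) - hi for pi, ci, hi in zip(p, center, half_extents)]
    # Distance to the box for points outside it.
    outside = math.sqrt(sum(max(qi, 0.0) ** 2 for qi in q))
    # Negative depth for points inside it (zero otherwise).
    inside = min(max(q), 0.0)
    return outside + inside


def scene_sdf(p):
    """Union of two hypothetical objects: the minimum of their SDFs."""
    return min(
        sphere_sdf(p, (0.0, 0.0, 0.0), 1.0),
        box_sdf(p, (3.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    )
```

Evaluating `scene_sdf` on a dense grid and extracting the zero level set (for example with marching cubes) is one common way to turn such a representation into a mesh.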
We describe a deep learning system for small research teams, comprising multiple GPU servers, centralized network storage, server and user management, a 10 GbE network, and a per-server SSD cache.