We present PanSplat, a generalizable, feed-forward approach that efficiently supports resolutions up to 4K (2048 × 4096).
We introduce a novel dual-branch diffusion model named PanFusion to generate a 360-degree image from a text prompt.
We propose a novel method for panoramic 3D scene understanding that recovers the 3D room layout and the shape, pose, position, and semantic category of each object from a single full-view panorama image.
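
For orientation, the 2048 × 4096 resolution quoted for PanSplat has the 2:1 aspect ratio that is typical of equirectangular panoramas, in which columns cover 360 degrees of longitude and rows cover 180 degrees of latitude. The statements above do not name the projection, so the equirectangular layout is an assumption here, as are the function names `pixel_to_direction` and `degrees_per_pixel` and the axis convention (y up, z forward at the image center). Under those assumptions, a minimal sketch of how a pixel maps to a viewing direction, and of the angular size of one pixel, might look like this:

```python
import math


def pixel_to_direction(row, col, height=2048, width=4096):
    """Map a pixel center to a unit viewing direction.

    Assumes an equirectangular layout (not stated in the source):
    columns span longitude [-pi, pi), rows span latitude [pi/2, -pi/2].
    Axis convention (also an assumption): y is up, z is forward.
    """
    # Longitude grows left to right; the image center column faces +z.
    lon = ((col + 0.5) / width - 0.5) * 2.0 * math.pi
    # Latitude is +pi/2 at the top row and -pi/2 at the bottom row.
    lat = (0.5 - (row + 0.5) / height) * math.pi

    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def degrees_per_pixel(height=2048, width=4096):
    """Return the (horizontal, vertical) angular size of one pixel in degrees."""
    return 360.0 / width, 180.0 / height


if __name__ == "__main__":
    print(degrees_per_pixel())
    print(pixel_to_direction(1024, 2048))
```

With a 2:1 image the horizontal and vertical angular steps are equal, so each pixel covers the same angle in both directions.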