Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene

Shubham Tulsiani; Saurabh Gupta; David F. Fouhey; Alexei A. Efros; Jitendra Malik

doi:10.1109/cvpr.2018.00039

Abstract

The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces as well as a set of objects represented in terms of shape and pose. We propose a convolutional neural network-based approach to predict this representation and benchmark it on a large dataset of indoor scenes. Our experiments evaluate a number of practical design questions, demonstrate that we can infer this representation, and quantitatively and qualitatively demonstrate its merits compared to alternate representations.
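
To make the factored representation concrete, the following is a minimal sketch of how a scene could be stored as one layout plus a list of objects, each with a shape and a pose. The abstract does not specify any encoding, so every name and choice here is an assumption made for illustration: the class names `ObjectFactor` and `SceneFactors`, the voxel-center shape encoding, the pose parameterization (yaw, per-axis scale, translation), and the layout as a per-pixel depth grid. It is not the authors' implementation.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ObjectFactor:
    """One object: a shape in its own canonical frame plus a pose in the scene."""
    # Assumed shape encoding: centers of occupied voxels in the canonical frame.
    shape_voxels: List[Vec3]
    # Assumed pose encoding: rotation about the vertical (y) axis in radians,
    # per-axis scale, and translation.
    yaw: float = 0.0
    scale: Vec3 = (1.0, 1.0, 1.0)
    translation: Vec3 = (0.0, 0.0, 0.0)

    def voxels_in_scene(self) -> List[Vec3]:
        """Apply scale, then rotation about y, then translation to each voxel."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        sx, sy, sz = self.scale
        tx, ty, tz = self.translation
        placed = []
        for x, y, z in self.shape_voxels:
            x, y, z = x * sx, y * sy, z * sz
            xr = c * x + s * z
            zr = -s * x + c * z
            placed.append((xr + tx, y + ty, zr + tz))
        return placed


@dataclass
class SceneFactors:
    """A scene factored into enclosing surfaces (layout) and a set of objects."""
    # Assumed layout encoding: per-pixel depth of the enclosing surfaces,
    # as if the objects were removed from the scene.
    layout_depth: List[List[float]]
    objects: List[ObjectFactor] = field(default_factory=list)


# Usage: a single-voxel object, stretched along x, turned 90 degrees, pushed back 5 units.
chair = ObjectFactor(
    shape_voxels=[(1.0, 0.0, 0.0)],
    yaw=math.pi / 2,
    scale=(2.0, 1.0, 1.0),
    translation=(0.0, 0.0, 5.0),
)
scene = SceneFactors(layout_depth=[[4.0, 4.0], [4.0, 4.0]], objects=[chair])
```

Keeping shape in a canonical frame and pose as a separate transform means the same shape can be moved or rotated without changing its geometry, which is the practical sense in which the scene is "factored".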
