Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans

Ainaz Eftekhar; Alexander F. Sax; Roman Bachmann; Jitendra Malik; Amir Zamir

doi:10.48550/arxiv.2110.04994

Abstract


This paper introduces a pipeline to parametrically sample and render multi-task vision datasets from comprehensive 3D scans from the real world. Changing the sampling parameters allows one to "steer" the generated datasets to emphasize specific information. In addition to enabling interesting lines of research, we show the tooling and generated data suffice to train robust vision models. Common architectures trained on a generated starter dataset reached state-of-the-art performance on multiple common vision tasks and benchmarks, despite having seen no benchmark or non-pipeline data. The depth estimation network outperforms MiDaS, and the surface normal estimation network is the first to achieve human-level performance for in-the-wild surface normal estimation, at least according to one metric on the OASIS benchmark. The Dockerized pipeline with CLI, the (mostly Python) code, PyTorch dataloaders for the generated data, the generated starter dataset, download scripts, and other utilities are available through our project website, https://omnidata.vision.

