We present a framework for learning single-view shape and pose prediction without using direct supervision for either. Our approach leverages multi-view observations from unknown poses as a supervisory signal during training. Our proposed training setup enforces geometric consistency between the independently predicted shape and pose from two views of the same instance. We consequently learn to predict shape in an emergent canonical (view-agnostic) frame along with a corresponding pose predictor. We show empirical and qualitative results using the ShapeNet dataset and observe performance that is encouragingly competitive with previous techniques that rely on stronger forms of supervision. We also demonstrate the applicability of our framework in a realistic setting that is beyond the scope of existing techniques: using a training dataset composed of online product images where the underlying shape and pose are unknown.
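
The cross-view consistency idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that the predicted shape is a point cloud in the canonical frame, that pose is a single rotation about the vertical axis, and that the second view supplies a binary silhouette under orthographic projection. The function names (`rotation_about_y`, `project_to_mask`, `consistency_loss`) are hypothetical, and the rasterization here is not differentiable, so an actual training setup would need a differentiable rendering step in its place.

```python
import numpy as np

def rotation_about_y(theta):
    """Rotation matrix for an angle theta (radians) about the vertical (y) axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project_to_mask(points, rotation, resolution=16):
    """Rotate canonical-frame points into the camera frame and
    orthographically project them onto a binary silhouette."""
    cam = points @ rotation.T
    # Map x, y from [-1, 1] to pixel indices; clip to stay inside the image.
    ij = ((cam[:, :2] + 1.0) * 0.5 * resolution).astype(int)
    ij = np.clip(ij, 0, resolution - 1)
    mask = np.zeros((resolution, resolution))
    mask[ij[:, 1], ij[:, 0]] = 1.0
    return mask

def consistency_loss(shape_from_view_a, pose_from_view_b, mask_b, resolution=16):
    """Mean squared difference between the silhouette observed in view B and
    the shape predicted from view A rendered under the pose predicted from view B."""
    rendered = project_to_mask(shape_from_view_a, pose_from_view_b, resolution)
    return float(np.mean((rendered - mask_b) ** 2))

# Toy data (assumed): an elongated box of points standing in for a predicted shape.
rng = np.random.default_rng(0)
shape_a = rng.uniform(-1, 1, size=(500, 3)) * np.array([0.8, 0.2, 0.1])
true_pose_b = rotation_about_y(0.3)
mask_b = project_to_mask(shape_a, true_pose_b)

loss_consistent = consistency_loss(shape_a, true_pose_b, mask_b)
loss_inconsistent = consistency_loss(shape_a, rotation_about_y(0.3 + np.pi / 2), mask_b)
```

In this toy setting the loss is zero only when the shape and pose predictions jointly explain the second view, which is the property the training setup relies on: neither the shape nor the pose is supervised directly, yet an inconsistent pair is penalized.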