Feature Space Transfer as Data Augmentation for Few-shot Classification and Single-view Reconstruction

Bo Liu1, Xudong Wang2, Roland Kwitt4


UC San Diego1, UC Berkeley2, Microsoft3, University of Salzburg4

Overview


We consider the problem of data augmentation in feature space. A new architecture, denoted the FeATure TransfEr Network (FATTEN), is proposed for modeling the feature trajectories induced by variations of object pose. This architecture exploits a parametrization of the pose manifold in terms of pose and appearance, leading to a deep encoder/decoder network in which the encoder factors into an appearance predictor and a pose predictor. Unlike previous attempts at trajectory transfer, FATTEN can be efficiently trained end-to-end, with no need to train separate feature transfer functions. This is realized by supplying the decoder with information about a target pose and by using a multi-task loss that penalizes category and pose mismatches. As a result, FATTEN discourages discontinuous or non-smooth trajectories that fail to capture the structure of the pose manifold, and generalizes well on object recognition tasks involving large pose variation. For few-shot recognition, meta-learning is used to further stabilize the model when applied to unseen classes. Experimental results on the synthetic ModelNet database show that FATTEN successfully learns to map source features to target features of a desired pose while preserving class identity. Most notably, by using feature space transfer for data augmentation (w.r.t. pose and depth) on SUN-RGBD objects, we demonstrate considerable performance improvements on one/few-shot object recognition in a transfer learning setup, compared to current state-of-the-art methods. The method is also applied to single-view reconstruction: by augmenting shape codes in terms of pose, it boosts the performance of an autoencoder-based reconstruction method.
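The encoder/decoder factorization and multi-task loss described above can be sketched as follows. This is a minimal, illustrative NumPy sketch under stated assumptions, not the authors' implementation: all dimensions, layer shapes, loss terms, and names (`forward`, `multitask_loss`, discrete one-hot poses) are hypothetical choices for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(d_in, d_out):
    # Parameters of a simple dense layer (weights, bias).
    return rng.standard_normal((d_in, d_out)) * 0.01, np.zeros(d_out)

def forward(x, target_pose_onehot, params):
    """One FATTEN-style forward pass (sketch).

    The encoder factors a source feature x into an appearance code and a
    pose prediction; the decoder maps the appearance code, concatenated
    with an encoding of the desired target pose, back to feature space.
    """
    (Wa, ba), (Wp, bp), (Wd, bd) = params
    appearance = np.maximum(x @ Wa + ba, 0)   # appearance branch (ReLU)
    pose_logits = x @ Wp + bp                 # pose-predictor branch
    z = np.concatenate([appearance, target_pose_onehot], axis=-1)
    target_feature = z @ Wd + bd              # decoded feature at target pose
    return target_feature, pose_logits

def multitask_loss(y_hat, y_true, pose_logits, pose_true):
    # Feature-reconstruction term plus a pose-mismatch term (the class
    # term of the paper's multi-task loss is omitted for brevity).
    mse = np.mean((y_hat - y_true) ** 2)
    logp = pose_logits - np.log(np.exp(pose_logits).sum(-1, keepdims=True))
    ce = -np.mean((pose_true * logp).sum(-1))
    return mse + ce

# Toy dimensions: 64-d features, 12 discrete poses, 32-d appearance code.
D, P, A = 64, 12, 32
params = (linear(D, A), linear(D, P), linear(A + P, D))

x = rng.standard_normal((5, D))               # batch of 5 source features
tgt = np.eye(P)[rng.integers(0, P, size=5)]   # desired target poses
y_hat, pose_logits = forward(x, tgt, params)
loss = multitask_loss(y_hat, rng.standard_normal((5, D)), pose_logits, tgt)
```

Transferred features `y_hat` could then be appended to a few-shot class's training set, which is the data-augmentation use of the model.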

Paper

Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence.

Paper

Published in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

arXiv

Repository

BibTeX

Models



Architecture: Feature transfer for few-shot recognition.


Architecture: Feature transfer for single-view reconstruction.

Code

Training, evaluation, and deployment code is available on GitHub.

Results



Retrieval results on ModelNet.


Single-view reconstruction results on ShapeNet.

Authors



Bo Liu

UC San Diego

Xudong Wang

UC Berkeley

Roland Kwitt

University of Salzburg

Acknowledgements

Bo Liu and Nuno Vasconcelos were partially supported by NSF awards IIS-1637941 and IIS-1924937, and by NVIDIA GPU donations.