
Active Recall Networks for Multiperspectivity Learning through Shared Latent Space Optimization

Introduction

Given the large amount of unlabeled data available for training neural networks, it is desirable to design a network architecture and training paradigm that make the most of the latent space representation.  By viewing the latent space from multiple perspectives, through adversarial learning and autoencoding, data requirements can be reduced and learning ability improves across domains.  The goal of the proposed work is not to train exhaustively, but to train with multiperspectivity.  We propose a new neural network architecture called the Active Recall Network (ARN) for learning with fewer labels by optimizing the latent space.  The architecture learns latent space features of unlabeled data using a fused framework of an autoencoder and a generative adversarial network.  Variations in the latent space representations are captured and modeled by the network's generation, discrimination, and reconstruction strategies, using both unlabeled and labeled data.  Performance evaluations of the proposed ARN architecture on popular benchmark datasets demonstrate promising results in terms of generative capability and latent space effectiveness.  Through the multiple perspectives embedded in ARN, we envision that this architecture will be highly versatile across applications that require learning with fewer labels, as shown in Figure 1.

 

Figure 1.  Architecture for multiperspectivity learning.  Latent space representations become richer and more generalizable through the unsupervised/supervised training and the discriminative/generative abilities of the network.

 

Goals/Objectives

  • A new deep learning architecture, the active recall network, based on a shared latent space and capable of generalizing from limited labeled data.
  • Exhibits strong stabilization and convergence characteristics.
  • Improved training and representation learning in unsupervised environments.
  • Effective reduction of labeled data requirements through supervised active learning and novelty detection.
  • Effective domain adaptation capability of ARN due to latent space regularization.
  • Inherent lifelong learning capabilities with the possibility of online training.

 

Methodology

The active recall network (ARN) combines the cost functions of an autoencoder (AE) and a generative adversarial network (GAN) in a fused network architecture.  Figure 2 shows the framework of the proposed neural network architecture.

 

Figure 2.  Active Recall Network Architecture.  Both the AE and GAN architectures share encoder, decoder, and latent space weights for unsupervised training.

 

It should be noted that the shared latent space is not created by combining the encoder and generator into a single module; rather, each projects onto the same latent space.  The ARN uses the AE loss function to minimize the reconstruction error, and the GAN loss function to drive the minimax game in which the discriminator maximizes, and the generator minimizes, the divergence between the real and generated data distributions.  The ARN shares the latent space representation and the encoder/decoder architecture so that a single latent space is optimized for both regression and classification.
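To make the fused objective concrete, the following PyTorch sketch trains a small ARN-style model on flattened 28x28 images with a 10-dimensional latent space.  The layer sizes, optimizers, and equal weighting of the reconstruction and adversarial terms are illustrative assumptions rather than the exact ARN configuration.

    import torch
    import torch.nn as nn

    # Illustrative module sizes for 28x28 grayscale images and a 10-dimensional latent
    # space; the actual ARN layer configuration is not reproduced here.
    encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
    decoder = nn.Sequential(nn.Linear(10, 256), nn.ReLU(), nn.Linear(256, 784), nn.Sigmoid())
    discriminator = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

    bce = nn.BCEWithLogitsLoss()
    mse = nn.MSELoss()
    opt_ae = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=2e-4)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

    def arn_step(x):
        """One unsupervised ARN-style update on a batch x of shape (B, 1, 28, 28)."""
        b = x.size(0)
        # Autoencoder perspective: encode and reconstruct the real batch.
        x_rec = decoder(encoder(x)).view_as(x)
        # GAN perspective: the decoder doubles as the generator from sampled latents.
        x_gen = decoder(torch.randn(b, 10)).view_as(x)

        # Discriminator update: separate real images from generated ones.
        opt_d.zero_grad()
        loss_d = bce(discriminator(x), torch.ones(b, 1)) + bce(discriminator(x_gen.detach()), torch.zeros(b, 1))
        loss_d.backward()
        opt_d.step()

        # Shared encoder/decoder update: reconstruction error plus adversarial (generator) loss.
        opt_ae.zero_grad()
        loss_rec = mse(x_rec, x)
        loss_adv = bce(discriminator(x_gen), torch.ones(b, 1))
        (loss_rec + loss_adv).backward()
        opt_ae.step()
        return loss_rec.item(), loss_adv.item(), loss_d.item()

A weighting coefficient between the reconstruction and adversarial terms is a common design choice when fusing the two objectives and can be tuned per dataset.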

 

Results/Evaluations

We have trained the active recall network on several datasets: the CIFAR-10 multi-class image dataset (Figure 3), the MNIST handwritten digit dataset (Figure 4), and the CelebA celebrity faces dataset (Figure 5).  We have evaluated the generative performance and the latent space encoding of the proposed architecture.  The network performs well in generative settings and supports latent space optimization for classification (using a latent space of dimension 10), as shown in Table 1 for MNIST classification.
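As a rough sketch of how the 10-dimensional latent space can be reused for classification, a small classifier head can be fit on the frozen encoder outputs.  The classifier, labeled-data budget, and training details behind Table 1 are not reproduced here, so the head and hyperparameters below are assumptions for illustration.

    import torch
    import torch.nn as nn
    from torchvision import datasets, transforms

    # Assumes `encoder` is the trained ARN encoder from the sketch above (latent size 10).
    train_set = datasets.MNIST("data", train=True, download=True, transform=transforms.ToTensor())
    loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

    head = nn.Linear(10, 10)                  # 10-D latent features -> 10 digit classes
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()

    encoder.eval()
    for x, y in loader:
        with torch.no_grad():                 # latent space stays frozen; only the head is trained
            z = encoder(x)
        loss = ce(head(z), y)
        opt.zero_grad(); loss.backward(); opt.step()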

 

Figure 3.  Random generations from the CIFAR-10 dataset

 

Figure 4.  Random generations from the MNIST dataset

Figure 5.  Random generations from the CelebA dataset

 

Table 1.  MNIST classification using various autoencoder strategies for feature extraction and classification with a latent space of 10 (LS 10).


 

Future Work

The flexibility of the ARN architecture and its shared latent space lead to natural extensions in several directions.  Different loss functions, such as the Wasserstein distance, can be used to improve the learning capabilities of the ARN.  For active learning, the network can be updated with newly labeled data, drawn from real samples and even from generated samples.  For domain adaptation, the network can be adversarially regularized with the discriminators and extended using CycleGAN-style concepts.  For lifelong learning, elastic weight consolidation (EWC) and generative memory replay can be used to incrementally learn new information.  Finally, for multitask learning, the shared latent space provides knowledge common to all tasks, so that they can be optimized jointly.
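As one concrete possibility for the lifelong learning extension, a standard elastic weight consolidation penalty could be added to the ARN losses when moving to a new task.  The sketch below assumes the Fisher information and the previous parameter values have already been estimated after the prior task; the function name and dictionary layout are illustrative rather than the exact scheme planned for the ARN.

    import torch

    def ewc_penalty(model, fisher, old_params, lam=1.0):
        """Standard EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta_i_old)^2.

        `fisher` and `old_params` are dicts keyed by parameter name, estimated after
        training on the previous task; this layout is illustrative.
        """
        penalty = torch.zeros(())
        for name, p in model.named_parameters():
            if name in fisher:
                penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
        return 0.5 * lam * penalty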

CONTACT

Vision Lab, Dr. Vijayan Asari, Director

Kettering Laboratories
300 College Park
Dayton, Ohio 45469-0232
937-229-1779
Email