Abstract:
Recently, Hinton et al. derived a way to perform fast, greedy learning of
deep belief networks (DBNs) one layer at a time, with the top two layers
forming an undirected bipartite graph (an associative memory).
The learning procedure consists of training a stack of Restricted
Boltzmann Machines (RBMs), each having only one layer of latent (hidden)
feature detectors. The important aspect of this layer-wise training
procedure is that each extra layer increases a variational lower bound on
the log probability of the data. The greedy layer-by-layer training can be
repeated several times to learn a deep, hierarchical model in which each
layer of features captures strong high-order correlations between the
activities of features in the layer below.
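The stacking procedure described above can be illustrated with a minimal
NumPy sketch. It is not the speaker's implementation: it assumes binary
units, full-batch updates, and one step of contrastive divergence (CD-1) as
the RBM learning rule, none of which is specified in this abstract. The
function names (train_rbm, train_stack, encode) and all hyperparameters are
hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.1, rng=None):
    """Train one binary RBM with CD-1 (an assumed learning rule)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data.
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one Gibbs step down to the visibles and back up.
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n_samples
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_vis, b_hid

def train_stack(data, layer_sizes, **kwargs):
    """Greedily train one RBM per layer, bottom to top."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b_vis, b_hid = train_rbm(x, n_hidden, **kwargs)
        layers.append((W, b_vis, b_hid))
        # Hidden activations of this RBM become the data for the next one.
        x = sigmoid(x @ W + b_hid)
    return layers

def encode(data, layers):
    """Propagate data upward through the trained stack."""
    x = data
    for W, _, b_hid in layers:
        x = sigmoid(x @ W + b_hid)
    return x
```

Calling train_stack(data, [500, 250, 30]) on a binary data matrix would, under
these assumptions, train three RBMs in sequence; the layer sizes here are
arbitrary and chosen only for illustration.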
We will discuss three ideas based on greedily learning a hierarchy of
features:
1. Nonlinear Dimensionality Reduction:
The DBN framework allows us to make nonlinear autoencoders work
considerably better than widely used methods such as PCA, SVD, and LLE.
2. Learning a Semantic Address Space (SAS) for Fast Document Retrieval:
The DBN framework allows us to build a model that can learn to map
documents into "semantic" binary codes. Using these codes as memory
addresses, we can learn a Semantic Address Space, so a document can be
mapped to a memory address in such a way that a small Hamming ball around
that memory address contains semantically similar documents. This
representation allows us to retrieve a short-list of semantically similar
documents from very large document sets in time independent of the number
of documents.
3. Learning a Nonlinear Similarity Measure:
The DBN framework can also be used to efficiently learn a nonlinear
transformation from the input space to a low-dimensional feature space in
which K-nearest neighbour classification performs well. This can be viewed
as a nonlinear extension of Neighbourhood Components Analysis (NCA).
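The Hamming-ball lookup in item 2 can be sketched in plain Python. This is
only an illustration under stated assumptions: the binary codes are taken as
given (here they are hand-picked integers, not codes learned by a DBN), and
the helper names hamming_ball, build_index, and retrieve are hypothetical.

```python
from itertools import combinations

def hamming_ball(address, n_bits, radius):
    """Yield every n_bits-bit address within `radius` bit flips of `address`."""
    for r in range(radius + 1):
        for positions in combinations(range(n_bits), r):
            flipped = address
            for p in positions:
                flipped ^= 1 << p  # flip bit p
            yield flipped

def build_index(codes):
    """Map each binary code (an int) to the document ids stored at that address."""
    index = {}
    for doc_id, code in codes.items():
        index.setdefault(code, []).append(doc_id)
    return index

def retrieve(index, query_code, n_bits, radius):
    """Collect the documents stored at every address in the Hamming ball."""
    shortlist = []
    for address in hamming_ball(query_code, n_bits, radius):
        shortlist.extend(index.get(address, []))
    return shortlist

# Hypothetical 4-bit codes for four documents.
codes = {"doc_a": 0b0000, "doc_b": 0b0001, "doc_c": 0b0111, "doc_d": 0b0011}
index = build_index(codes)
shortlist = retrieve(index, 0b0000, n_bits=4, radius=1)
```

The number of addresses probed is the sum of C(n_bits, r) for r from 0 to
radius, which depends only on the code length and the radius. This is why the
lookup cost does not grow with the size of the document collection.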
Time permitting, I will briefly mention how RBMs can be successfully applied
in the collaborative filtering domain, in particular to the Netflix data
set.