Probabilistic Inference Group (PIGS) - Archive
Suggested topics
- Amos: Sequential Monte Carlo update. Quick summaries of the salient papers on this SMC page would be worth doing to get everyone up to speed. I'll probably arrange this for my session on 21 Aug. Charles: I agree, and suggest this paper if it hasn't been done already: P. Del Moral, A. Doucet and A. Jasra, "Sequential Monte Carlo Samplers", J. Royal Statist. Soc. B, vol. 68, no. 3, pp. 411-436, 2006.
- NIPS2006 workshop on Dynamical Systems, Stochastic Processes and Bayesian Inference.
- Compressed sensing, see http://www.dsp.ece.rice.edu/cs/
- Further look at deep belief networks, including the work on human motion, which gives a good demo of conditional deep belief network models.
- Submodular functions and optimization (generalizes convexity to functions on sets); see e.g. http://www.mlpedia.org/index.php?title=Submodular_function and the toy sketch after this list.
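Since submodularity may be unfamiliar, here is a toy sketch (our own illustration, not taken from the link above) of the canonical example: greedy maximization of a monotone submodular coverage function, where the diminishing-returns property guarantees a (1 - 1/e)-approximation (Nemhauser, Wolsey and Fisher, 1978).

```python
# Toy illustration of submodularity: the set-coverage function f(S) = |union of the
# chosen sets| has diminishing returns, and greedy selection achieves a
# (1 - 1/e)-approximation for monotone submodular maximization under a
# cardinality constraint (Nemhauser, Wolsey and Fisher, 1978).

def coverage(selected, sets):
    """f(S): number of ground-set elements covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def greedy_max_coverage(sets, k):
    """Pick k sets, each time adding the one with the largest marginal gain."""
    selected = []
    for _ in range(k):
        gains = {i: coverage(selected + [i], sets) - coverage(selected, sets)
                 for i in range(len(sets)) if i not in selected}
        selected.append(max(gains, key=gains.get))
    return selected

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
print(greedy_max_coverage(sets, 2))  # [0, 2]: covers all of {1, ..., 6}
```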
Meetings in 2009
Tue 12 December (Nicolas Le Roux)
Talk by Nicolas Le Roux (Microsoft Research). Please see title and abstract below:
How overconfidence slows you down, a learning story.
Abstract: Nowadays, for many tasks such as object recognition or language modeling, data is plentiful. As such, the most important challenge has become finding a way to use all the available data rather than dealing with small training sets. In this setting, coined "large-scale learning" by Bottou and Bousquet, learning and optimization become different, and powerful optimization algorithms are suboptimal learning algorithms. While many have considered only optimization algorithms (or approximations thereof) to perform learning, I will show how designing a proper learning algorithm that makes use of the covariance of the gradients can yield faster, more robust convergence. I will also show that this covariance matrix is not an approximation of the Hessian, and that the two matrices can be combined in a principled and efficient way.
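As a purely illustrative toy of the kind of update the abstract hints at (this sketch is ours, not the speaker's algorithm), one can shrink each coordinate's SGD step by the signal-to-noise ratio of its gradient estimates, so that directions where the gradient is mostly noise move cautiously:

```python
import numpy as np

# Toy sketch (not the speaker's method): scale each coordinate's SGD step by the
# signal-to-noise ratio of its gradient, estimated from running moments. Coordinates
# whose gradients are mostly noise (variance large relative to the squared mean)
# take small steps; consistently-signed gradients take near-full steps.

def sgd_with_gradient_covariance(grad_fn, w, lr=0.1, beta=0.9, steps=1000, eps=1e-8):
    m = np.zeros_like(w)   # running mean of gradients
    v = np.zeros_like(w)   # running mean of squared gradients
    for _ in range(steps):
        g = grad_fn(w)                       # stochastic gradient
        m = beta * m + (1 - beta) * g
        v = beta * v + (1 - beta) * g ** 2
        var = np.maximum(v - m ** 2, 0.0)    # diagonal gradient-covariance estimate
        snr = m ** 2 / (m ** 2 + var + eps)  # in [0, 1]: trust in the mean direction
        w = w - lr * snr * m
    return w

# Noisy quadratic: minimize 0.5 * ||w||^2 from noisy gradients; the second
# coordinate's gradient is much noisier, so its effective step is much smaller.
rng = np.random.default_rng(0)
grad_fn = lambda w: w + rng.normal(scale=[0.1, 3.0], size=2)
print(sgd_with_gradient_covariance(grad_fn, np.array([5.0, 5.0])))
```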
Tue 17 November (Jakub Piatkowski)
We will discuss the following paper:
Wed 11 November (Maurizio Filippone)
Talk by Maurizio Filippone. Please see title, abstract and related material below:
Information Theoretic Novelty Detection
In this talk, we present a novel approach to online change detection problems when the training sample size is small. The proposed method is based on estimating the expected information content of a new data point under the null hypothesis that it has been generated from the same distribution as the training data. In the case of the Gaussian distribution, our approach is analytically tractable and closely related to classical statistical tests, since the expected information content is independent of the statistics of the generating distribution. Such a test naturally takes into account the variability of the statistics due to the finite-sample effect, and thus it allows one to control the false-positive rate even when only a small training set is available. We then discuss two extensions of the presented method. In the first, we propose an approximation scheme to evaluate the information content of a new data point when the generating distribution is a mixture of Gaussians. In the second, we study the extension to autoregressive time series with Gaussian noise, thus removing the i.i.d. assumption. Experiments conducted on synthetic and real data sets show that our method maintains good overall accuracy while significantly improving control over the false-positive rate.
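For the univariate Gaussian case, the finite-sample correction the abstract alludes to recovers a classical result: under the null hypothesis, the standardized residual of a new point follows a Student-t distribution, so the false-positive rate can be controlled exactly even for tiny training sets. A minimal sketch (our paraphrase, not the authors' code):

```python
import numpy as np
from scipy import stats

# Univariate Gaussian sketch of finite-sample novelty detection. Under the null
# hypothesis that x_new comes from the same Gaussian as the n training points,
# t = (x_new - xbar) / (s * sqrt(1 + 1/n)) follows a Student-t distribution with
# n - 1 degrees of freedom, so thresholding its tail probability controls the
# false-positive rate exactly, even when n is small.

def novelty_pvalue(train, x_new):
    n = len(train)
    xbar, s = np.mean(train), np.std(train, ddof=1)
    t = (x_new - xbar) / (s * np.sqrt(1.0 + 1.0 / n))
    return 2.0 * stats.t.sf(abs(t), df=n - 1)   # two-sided p-value

rng = np.random.default_rng(1)
train = rng.normal(size=10)          # small training sample
print(novelty_pvalue(train, 0.3))    # typical point: large p-value
print(novelty_pvalue(train, 6.0))    # outlier: tiny p-value, flag as novel
```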
Part of the material covered in the talk can be found here:
Tue 3 November (David Reichert)
We will discuss the following papers:
Tue 20 October (Jyri Kivinen)
Jyri will present some of his joint work on statistical modeling of natural images and scenes using a hierarchical nonparametric Bayesian framework (J. J. Kivinen, E. B. Sudderth (Brown), and M. I. Jordan (UC Berkeley)). Please see abstract and background papers below:
I will begin by describing the tree-structured latent variable model the framework employs to generate pyramidally organized multiscale image features and to couple dependencies between them. I will then describe an extension using Hierarchical Dirichlet Processes to learn data-driven, global statistical image models of unbounded complexity. Finally, I will describe effective learning algorithms, using Markov chain Monte Carlo methods and belief propagation, for categorizing images of novel scenes and for denoising them in a transfer-learning-based algorithm.
Fri 16 October (Michalis Titsias)
Title: Variational Inference for Large Datasets in Gaussian Processes
Gaussian processes (GPs) are stochastic processes over real-valued functions that can be used for Bayesian non-linear regression and classification problems. GPs also naturally arise as the solutions of linear stochastic differential equations. However, when the amount of observed or training data is large, the evaluation of posterior GPs is intractable because the computations scale as O(n^3), where n is the number of training examples. Therefore, for large datasets we need to consider approximate or sparse inference methods. In this talk we discuss sparse approximations for GPs based on inducing/support variables and the standard variational inference methodology. We apply this to regression, binary classification and large systems of linear stochastic differential equations.
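As a rough numpy illustration of the inducing-variable idea (a DTC-style predictor with hyperparameters and inducing locations fixed by hand, rather than the talk's full variational treatment, which also optimizes them through a lower bound):

```python
import numpy as np

# Sparse GP regression with m inducing points (DTC-style predictive equations,
# which share their form with Titsias' variational posterior). The only large
# solve is m x m, so the cost is O(n m^2) rather than the O(n^3) of an exact GP.

def rbf(A, B, ell=1.0, sf2=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def sparse_gp_predict(X, y, Z, Xs, noise=0.1):
    """Predictive mean and variance at test inputs Xs, given inducing inputs Z."""
    Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))
    Kuf, Ksu = rbf(Z, X), rbf(Xs, Z)
    Sigma = np.linalg.inv(Kuu + Kuf @ Kuf.T / noise)   # m x m
    mean = Ksu @ Sigma @ Kuf @ y / noise
    var = (rbf(Xs, Xs).diagonal()
           - np.einsum('ij,jk,ik->i', Ksu, np.linalg.inv(Kuu) - Sigma, Ksu))
    return mean, var

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, (500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
Z = np.linspace(-3, 3, 15)[:, None]     # 15 inducing points summarize 500 data
Xs = np.linspace(-3, 3, 5)[:, None]
mean, var = sparse_gp_predict(X, y, Z, Xs)
print(np.c_[Xs, mean, var])
```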
Tue 6 October (Michael Dewar)
We will discuss Variational Inference in Markov Jump Processes using the following papers:
Tue 22 September (Jan Antolik)
Joint session with the DevCompNeuro journal club.
Short description of the talk:
The main goal of the project I'm working on is to predict which image has been presented to an animal based on the activity profile of a group of cells (~50) obtained via two-photon imaging. The system would learn this prediction from recordings of pairs of images and activity profiles. Instead of directly reconstructing the image, the goal is to be able to tell, from a large set of images, which one was presented.
I have so far applied several simpler approaches to the problem, including simple linear perceptrons, multi-layer NNs with back-propagation and, notably, the 'Gaussian pyramid model', which worked when applied to an analogous problem with fMRI data in a study by Kay et al. (2008). I have also tried several approaches to directly determine the receptive fields of the neurons.
So far these techniques haven't worked. My main aim with this presentation is to get some brainstorming going and perhaps learn from the real machine learning people about the latest approaches to fitting non-linear models. I would be particularly interested in learning about methods for training recurrent NNs, as it appears that a lot of the neural response is due to lateral interactions as opposed to the feed-forward receptive field structure.
Kay NN, Naselaris T, Prenger RJ and Gallant JL (2008): Identifying natural images from human brain activity. Nature 452:352-355.
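For concreteness, the identification strategy of Kay et al. (2008) can be sketched schematically: fit an encoding model from image features to measured responses, then identify a held-out stimulus by matching observed and predicted activity. In the sketch below everything is simulated; random features stand in for the Gabor pyramid and ridge regression for the fitted encoding model.

```python
import numpy as np

# Schematic identification-by-encoding-model in the spirit of Kay et al. (2008):
# (1) fit a regularized linear map from image features to cell responses,
# (2) for a held-out response, predict the response to every candidate image and
#     pick the image whose prediction correlates best with the observation.

rng = np.random.default_rng(3)
n_imgs, n_feat, n_cells = 200, 40, 50
F = rng.normal(size=(n_imgs, n_feat))        # image features (Gabor-pyramid stand-in)
W_true = rng.normal(size=(n_feat, n_cells))
R = F @ W_true + 0.5 * rng.normal(size=(n_imgs, n_cells))  # simulated responses

train, test = np.arange(150), np.arange(150, 200)
lam = 10.0                                   # ridge penalty
W = np.linalg.solve(F[train].T @ F[train] + lam * np.eye(n_feat),
                    F[train].T @ R[train])

pred = F @ W                                 # predicted response for every image
correct = 0
for i in test:
    scores = [np.corrcoef(R[i], pred[j])[0, 1] for j in range(n_imgs)]
    correct += int(np.argmax(scores) == i)
print(f"identified {correct}/{len(test)} images out of {n_imgs} candidates")
```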
Tue 25 August (Edwin Bonilla)
Tue 28 July (UAI session)
Brief discussions on the following UAI 2009 papers:
UAI 2009 proceedings at
http://www.cs.mcgill.ca/~uai2009/proceedings.html
Tue 14 July (Kian Ming Chai)
We will discuss the following paper:
Tue 30 June (Amos Storkey)
We will discuss Deep Boltzmann Machines using the paper:
- R. Salakhutdinov and G. E. Hinton: Deep Boltzmann Machines. To appear in Artificial Intelligence and Statistics (2009).
If there is enough time, Amos will also give a basic introduction to Martingales using:
Tue 23 June (2nd ICML session)
Brief discussions on the following ICML 2009 papers:
Note: ICML 2009 proceedings at
http://www.cs.mcgill.ca/~icml2009/abstracts.html
Tue 16 June (ICML session)
Brief discussions on the following ICML 2009 papers:
Tue 9 June (Michael Dewar)
We will discuss Hierarchical HMMs using the paper:
Note: There is an extended version of the paper:
Tue 26 May (Athina Spiliopoulou)
We will discuss two variants from the RBM/DBN literature using the papers:
Tue 21 April (Chris Williams)
We will discuss multi-armed bandits and Gittins indices. This is a simple setting in which the exploration-exploitation tradeoff arises, and for which there is an optimal Bayesian solution. A small numerical sketch of the Gittins index follows the references below.
The papers
J. C. Gittins, Bandit Processes and Dynamic Allocation Indices, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 41, No. 2. (1979), pp. 148-177.
J. C. Gittins, D. M. Jones, A Dynamic Allocation Index for the Discounted Multiarmed Bandit Problem, Biometrika, Vol. 66, No. 3. (1979), pp. 561-565.
are available via
http://en.wikipedia.org/wiki/Gittins_index
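To make the "optimal Bayesian solution" concrete, here is a rough numerical sketch (our own, not from the papers above) of the Gittins index of a Bernoulli arm with a Beta posterior. It uses the calibration characterization: the index is the known success probability lam of a "standard" arm that makes playing and retiring equally attractive, with the infinite-horizon dynamic programme truncated at a finite depth.

```python
# Approximate Gittins index of a Bernoulli arm with a Beta(a, b) posterior, by
# calibration against a standard arm paying a known rate lam: bisect on lam for
# the indifference point between playing the unknown arm and retiring to lam.

GAMMA, N = 0.9, 150   # discount factor; truncate after N posterior observations

def value(a, b, lam, cache):
    """Optimal discounted value: keep playing the Beta(a, b) arm, or retire to lam."""
    if (a, b) in cache:
        return cache[(a, b)]
    retire = lam / (1 - GAMMA)
    if a + b >= N:                       # horizon reached: retire
        v = retire
    else:
        p = a / (a + b)                  # posterior mean success probability
        play = p * (1 + GAMMA * value(a + 1, b, lam, cache)) \
             + (1 - p) * GAMMA * value(a, b + 1, lam, cache)
        v = max(retire, play)
    cache[(a, b)] = v
    return v

def gittins_index(a, b, tol=1e-4):
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        lam = (lo + hi) / 2
        if value(a, b, lam, {}) > lam / (1 - GAMMA) + 1e-12:
            lo = lam                     # arm still worth playing: index is higher
        else:
            hi = lam
    return (lo + hi) / 2

print(gittins_index(1, 1))  # uniform prior: index exceeds the mean 0.5 (exploration bonus)
```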
Tue 7 April 2009 (Jakub Piatkowski)
Tue 24 March 2009 (Nicolas Heess)
Tue 10 March 2009 (Andrew Dai)
Tue 24 February 2009 (Edwin Bonilla)
Tue 10 February 2009 (Kian Ming Chai)
Tue 27 January 2009 (Amos Storkey)
Tue 13 January 2009 (NIPS)
Note: NIPS 21 preproceedings at
http://books.nips.cc/nips21.html
- NH: I. Murray, R. Salakhutdinov: Evaluating probabilities under high-dimensional latent variable models
- NH: I. Sutskever, G. Hinton, G. Taylor: The Recurrent Temporal Restricted Boltzmann Machine
- EB: Reducing statistical dependencies in natural signals using radial Gaussianization (Siwei Lyu, Eero Simoncelli)
- EB: Sparse Convolved Gaussian Processes for Multi-output Regression (M. Alvarez, N. Lawrence)
- CW: Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes. Erik Sudderth, Michael Jordan
- CW: The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction. Fabian H. Sinz, Matthias Bethge
- CW: Bayesian Exponential Family PCA. Shakir Mohamed, Katherine Heller, Zoubin Ghahramani. See also: Guo, Y. and Schuurmans, D. (2008), Efficient global optimization for exponential family PCA and low-rank matrix factorization, in Allerton Conference on Communication, Control, and Computing, http://www.cs.ualberta.ca/~dale/papers/allerton08.pdf
- MD: Using Bayesian Dynamic Systems for Motion Template Libraries. Silvia Chiappa, Jens Kober, Jan Peters
- MD: Nonparametric Bayesian Learning of Switching Linear Dynamical Systems. Emily Fox, Erik Sudderth, Michael Jordan, Alan Willsky
- DR: Cascaded Classification Models: Combining Models for Holistic Scene Understanding. Geremy Heitz, Stephen Gould, Ashutosh Saxena, Daphne Koller
Some other NIPS 21 papers CW found to be of interest:
- The Infinite Factorial Hidden Markov Model, Jurgen Van Gael, Yee Whye Teh, Zoubin Ghahramani
- Deep Learning with Kernel Regularization for Visual Recognition. Kai Yu, Wei Xu, Yihong Gong