
Probabilistic Inference Group (PIGS)

The Probabilistic Inference Group (PIGS) is a paper discussion group with meetings held fortnightly. The group focuses on probabilistic and information-theoretic approaches to machine learning problems. Meetings are generally held every Monday at 11am in room 4.31/4.33 of the Informatics Forum, though previous bookings mean some meetings will be in 2.33. Announcements are made through the PIGS mailing list.

Academic Year 2013-2014

We have moved to a themed, team-based approach to PIGS meetings. These meetings will now happen weekly in 4.31/4.33 at 11am.

Spectral Learning


Mon 31st March Spectral learning for LDS, Byron Boots' thesis (2012), Chapter 2: A Spectral Learning Algorithm for Constant-Covariance Kalman Filters (Konstantinos, Chris W)

Mon 7th April Spectral learning for MoG, Hsu and Kakade (2012), Learning mixtures of spherical Gaussians: moment methods and spectral decompositions (Benigno, Partha)

Mon 14th April Spectral learning for LDA, Anandkumar et al. (2013), A Spectral Algorithm for Latent Dirichlet Allocation (Iain, Krzysztof)

Mon 21st April Spectral learning for (HMM or PCFG or ?) (TBD)

Additional resources


Organiser: Matt


Mon 3rd March Koller and Friedman (2009) Probabilistic Graphical Models, Chapter 21: Causality, part 1 (Boris, Chris W)

Mon 10th March Koller and Friedman (2009) Probabilistic Graphical Models, Chapter 21: Causality, part 2 (Agamemnon, Amos)

Mon 17th March Schölkopf et al. (2012) On Causal and Anticausal Learning (Iain, Matt)

Mon 24th March Winn (2012) Causality with Gates (Amos, Zhanxing).


Possible topics

Active Learning

Mon 25 November: Burr Settles (2010) Active Learning Literature Survey. This is an introductory paper on Active Learning. We aim to cover the material in the first three chapters broadly, though unevenly. (Guido, Kira)

Mon 2 December: Gaussian Processes tutorial (Iain), slides

Mon 13 January: Agamemnon and Amos will read Information-based objective functions for active data selection, David J.C. MacKay Neural Computation 4, 589--603 (1992)

Mon 20 January: NIPS postcards.

Mon 27 January: Srinivas et al., "Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting", IEEE Transactions on Information Theory, vol. 58, no. 5, pp. 3250-3265, 2012 (possibly also looking at Snoek et al., Practical Bayesian Optimization of Machine Learning Algorithms). (Jari will lead)

Mon 3 February: Self-Paced Learning for Latent Variable Models, by Packer, Kumar and Koller. Relates to curriculum learning rather than active learning per se. (Chris L. and Partha)

Mon 10 February: Ziyu Wang, Masrour Zoghi, Frank Hutter, David Matheson and Nando de Freitas, Bayesian Optimization in High Dimensions via Random Embeddings (Guido and Pavlos).

Online Stochastic Descent and Stochastic Optimization

Mon 7 October: Leon Bottou (2010) Large-Scale Machine Learning with Stochastic Gradient Descent. This is an introductory paper to Stochastic Gradient Descent. For those wanting a little more detail on online learning methods, Sebastian Bubeck's lecture notes may be helpful: Introduction to Online Optimization. (Amos, Konstantinos)

Mon 14 October: Non-stationary loss and adaptive learning rates: No more pesky learning rates (Beni, Jinli). PLEASE NOTE THIS WILL BE IN 2.33. An extension of that work, which will not be presented: Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients.

Mon 21 October: Ahn, Korattikara and Welling (2012) Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring. (Guido, Matt)
A hybrid of stochastic gradient descent and Langevin dynamics based MCMC sampling, for learning and sampling from the posterior over model parameters using only small mini-batches of the dataset on each update.
Useful resources:
Welling and Teh (2011), Bayesian Learning via Stochastic Gradient Langevin Dynamics - precursor to the paper we'll cover. It explains how a stochastic (mini-batch) estimate of the log-likelihood gradient can be used with a Langevin dynamics based update to construct a Markov chain that converges to the posterior over parameters (video presentation of paper)
Roberts and Tweedie (1996), Exponential convergence of Langevin distributions and their discrete approximations - describes the Metropolis-adjusted Langevin algorithm (MALA) for unbiased sampling using discretised Langevin dynamics
Video presentation by Max Welling explaining SGLD and SGFS
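
As a rough illustration of the stochastic gradient Langevin dynamics (SGLD) update described in the Welling and Teh resource above, here is a minimal sketch. It is not taken from any of the listed papers' code. The toy model (inferring the mean of a Gaussian with known variance under a zero-mean Gaussian prior), the function name sgld_gaussian_mean and all default parameter values are assumptions chosen for illustration, and a constant step size is used for simplicity rather than a decreasing schedule.

```python
import numpy as np


def sgld_gaussian_mean(data, n_steps=5000, batch_size=10, step_size=1e-3,
                       prior_var=10.0, noise_var=1.0, seed=0):
    """Sample the posterior over a Gaussian mean with SGLD (toy example).

    Assumed model: theta ~ N(0, prior_var), x_i ~ N(theta, noise_var).
    Each step uses a mini-batch gradient estimate plus injected Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    n_total = len(data)
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Draw a mini-batch without replacement.
        batch = rng.choice(data, size=batch_size, replace=False)
        # Gradient of the log prior.
        grad_log_prior = -theta / prior_var
        # Mini-batch estimate of the full-data log-likelihood gradient,
        # rescaled by n_total / batch_size.
        grad_log_lik = (n_total / batch_size) * np.sum(batch - theta) / noise_var
        # Langevin update: half-step along the gradient plus N(0, step_size) noise.
        noise = rng.normal(0.0, np.sqrt(step_size))
        theta = theta + 0.5 * step_size * (grad_log_prior + grad_log_lik) + noise
        samples[t] = theta
    return samples
```

With a constant step size the chain only approximates the posterior; the mini-batch gradient noise adds some extra variance on top of the injected noise, which is one of the issues the papers above discuss.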

Mon 28 October: Le Roux, Manzagol and Bengio (2007) Topmoumoute Online Natural Gradient Algorithm. This combines online learning with the idea of natural gradient. (Jeff, Mihai) PLEASE NOTE THIS WILL BE IN 2.33

Mon 4 November: Schmidt, Le Roux and Bach (2013) Minimizing Finite Sums with the Stochastic Average Gradient. This provides interesting theoretical results on the right scheduling process for online learning methods. (Zhanxing, Boris)

Mon 11 November: Discussion meeting covering a) potential research directions relating to stochastic gradients, and practical suggestions on how and when to use them; b) choosing people for the next PIGS theme.

Mon 18 November: We will discuss the practical decisions around using stochastic online methods. This will involve a brief review of the suggestions and empirical issues discussed in the following papers. We suggest a cursory look at the papers - we will not spend much time on any theoretical analyses this time.

For future reference, the meetings on 9 Dec, 23 June 2014 and 30 June 2014 will be in 2.33.

Other papers:

Duchi, Hazan and Singer (2010) Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. The subgradient is particularly important in general settings, e.g. where we have potential non-differentiability. This is something that can be combined naturally with online learning. (?,?)
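
To make the point about subgradients and online learning concrete, here is a minimal sketch of a diagonal adaptive subgradient (AdaGrad-style) update applied to a non-differentiable loss. It is an illustration under stated assumptions, not the authors' implementation. The choice of absolute-error regression as the loss, the function name adagrad_absolute_loss and the default step size are all assumptions.

```python
import numpy as np


def adagrad_absolute_loss(xs, ys, step_size=0.5, eps=1e-8):
    """Online regression with the absolute loss |w.x - y| (toy example).

    The loss is not differentiable where the residual is zero, so a
    subgradient is used. Each coordinate gets its own step size, scaled by
    the accumulated squared subgradients for that coordinate.
    """
    n_features = xs.shape[1]
    w = np.zeros(n_features)
    grad_sq_sum = np.zeros(n_features)
    for x, y in zip(xs, ys):  # one example at a time (online setting)
        residual = w @ x - y
        # A valid subgradient of |residual| with respect to w
        # (np.sign returns 0 at the kink, which lies in the subdifferential).
        g = np.sign(residual) * x
        # Accumulate squared subgradients per coordinate.
        grad_sq_sum += g ** 2
        # Per-coordinate adaptive step.
        w -= step_size * g / (np.sqrt(grad_sq_sum) + eps)
    return w
```

Calling adagrad_absolute_loss(xs, ys) with xs of shape (n_examples, n_features) and ys of shape (n_examples,) returns the final weight vector after a single pass over the data.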

Feng Niu, Benjamin Recht, Christopher Ré and Stephen J. Wright (2011) Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent. This covers the issue of large-scale parallelisation of the methods, which is pretty important in many settings.

Past meetings

21 May: Zhanxing Zhu

Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses
P. Loh and M.J. Wainwright

7 May: Konstantinos Georgatzis

A Spectral Algorithm for Latent Dirichlet Allocation
Animashree Anandkumar, Dean P. Foster, Daniel Hsu, Sham M. Kakade, Yi-Kai Liu

23 April: Daniel Trejo Baños

Dual-Space Analysis of the Sparse Linear Model
David Wipf, Yi Wu

9 April: Yichuan Zhang

A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function
Pedro A. Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, Daniel A. Braun

26 March: Peter Orchard

Structure Discovery in Nonparametric Regression through Compositional Kernel Search
David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani

Gaussian Process Covariance Kernels for Pattern Discovery and Extrapolation
Andrew Gordon Wilson, Ryan Prescott Adams

12 March: Iain Murray

Hoifung Poon and Pedro Domingos
Sum-Product Networks: A New Deep Architecture

26 February: Ioan Stanculescu

James Bergstra, Yoshua Bengio
Random Search for Hyper-Parameter Optimization

Jasper Snoek, Hugo Larochelle, Ryan Adams
Practical Bayesian Optimization of Machine Learning Algorithms

12 February: Jinli Hu

R. Gens, P. Domingos. Discriminative Learning of Sum-Product Networks

29 January: Botond Cseke

S. C. Kou, Benjamin P. Olding, Martin Lysy & Jun S. Liu: A Multiresolution Method for Parameter Estimation of Diffusion Processes

15 January: NIPS 2012 review

We will review the proceedings of NIPS 2012. If you would like to discuss a paper, please edit this section to include the title together with your initials.

AJS: Learning from the Wisdom of Crowds by Minimax Entropy. Dengyong Zhou, John Platt, Sumit Basu, Yi Mao.

SL: MCMC for continuous-time discrete-state systems. Vinayak A Rao, Yee Whye Teh.

KG: Spectral learning of linear dynamics from generalised-linear observations with application to neural population data. Lars Buesing, Jakob H. Macke, Maneesh Sahani.

YZ: Modelling Reciprocating Relationships with Hawkes Process. Charles Blundell, Katherine A. Heller.

27 November: Amos Storkey

"Reconceiving Machine Learning", Bob Williamson et al.

"Machine Learning that Matters", Kiri Wagstaff

13 November: Ali Eslami

"Improving neural networks by preventing co-adaptation of feature detectors"
G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever and R. R. Salakhutdinov

"Multiresolution Gaussian Processes"
E. B. Fox and D. B. Dunson
NIPS 2012

30 October: Simon Lyons

The scaled unscented transformation
S.J. Julier
Proceedings of the American Control Conference (2002)

New extension of the Kalman filter to nonlinear systems
S.J. Julier, J.K. Uhlmann
Signal Processing, Sensor Fusion, and Target Recognition (1997)


Please fill in your name or contact Krzysztof Geras if you would like to present and discuss papers in a specific research area of machine learning.

Paper Recommendations

Please add papers you consider appropriate for PIGS. Please also add thematic categories that are not covered.

Deep learning and Energy Based Models
Variational Methods
Sampling Methods
Dynamical Systems
Information Theory

Other Journal Clubs

Previous meetings

16 October: Krzysztof Geras

No Unbiased Estimator of the Variance of K-Fold Cross-Validation (JMLR 2004)
Yoshua Bengio, Yves Grandvalet

2 October: Benigno Uria

Memisevic, R., Hinton, G. (2010)
Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines.
Neural Computation, June 2010, Vol. 22, No. 6: 1473-1492.

18 September: Peter Orchard

Lightning-speed Structure Learning of Nonlinear Continuous Networks

Copula Network Classifiers (CNCs)

4 September: Guido Sanguinetti

P. Diggle et al, Geostatistical inference under preferential sampling,
JRSS C (Appl. Stat.) 59 (2010)

21 August: Chris Williams

Bayesian Model Checking and Model Diagnostics - Hal S. Stern and Sandip Sinharay
Handbook of Statistics, Vol. 25 (2005)


Induction and deduction in Bayesian data analysis, Andrew Gelman, RMM vol 2, 2011 67-78

19th July: Ioan Stanculescu

Ioan will present the following papers:

Bayesian Conditional Cointegration - Chris Bracegirdle and David Barber

State-Space Inference and Learning with Gaussian Processes - Ryan Turner, Marc Deisenroth and Carl Rasmussen

19th July: Yichuan Zhang

Yichuan will present the following paper:

Accelerated Adaptive Markov Chain for Partition Function Computation - S. Ermon, C. P. Gomes, A. Sabharwal, B. Selman (NIPS 2011)

15th May: AISTATS review

Please add your initials below together with a link to the paper you wish to present.

IS: Approximate Inference in Additive Factorial HMMs with Application to Energy Disaggregation - J. Zico Kolter, Tommi Jaakkola

CS: Lightning-speed Structure Learning of Nonlinear Continuous Networks - Gal Elidan

KG: Learning from Weak Teachers - Ruth Urner, Shai Ben-David and Ohad Shamir

PO: Causality with Gates - John Winn

CW: Deep Boltzmann Machines as Feed-Forward Hierarchies -- Montavon, Braun, Mueller

AS: Classifier Cascade for Minimizing Feature Evaluation Cost -- Minmin Chen, Zhixiang Xu, Kilian Weinberger, Olivier Chapelle, Dor Kedem

1st May: Andrea Ocone

Andrea will discuss the following paper:

Designing attractive models via automated identification of chaotic and oscillatory dynamical regimes: Silk D, Kirk PD, Barnes CP, Toni T, Rose A, Moon S, Dallman MJ, Stumpf

17th April: Simon Lyons

Simon will discuss the following papers:

Bayesian Compressive Sensing: S. Ji, Y. Xue, L. Carin

Bayesian Compressive Sensing Via Belief Propagation: D. Baron, S. Sarvotham, R. G. Baraniuk

3rd April: Botond Cseke

Botond will discuss the following papers:

Opper, Paquet, Winther: Improving on Expectation Propagation

Opper, Paquet, Winther: Cumulant expansions for improved inference with EP in discrete Bayesian networks

20 March: Chris Williams

Chris will discuss the following paper:

Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection
Gavin Brown, Adam Pocock, Ming-Jie Zhao, Mikel Luján;
13(Jan):27−66, 2012.

6 March: Charles Sutton

Charles will discuss the following paper:

A Spectral Algorithm for Learning Hidden Markov Models
Daniel Hsu, Sham M. Kakade, Tong Zhang

21 February: Iain Murray

Iain will discuss the following paper:

Statistical Tests for Optimization Efficiency: L. Boyles, A. Korattikara, D. Ramanan, M. Welling (NIPS 2011)

7 February: Jono Millin

Jono will present the following papers:

Bayesian Bias Mitigation for Crowdsourcing: Fabian L. Wauthier, Michael I. Jordan. Proceedings of NIPS (2011)

A Collaborative Mechanism for Crowdsourcing Prediction Problems: Jacob D. Abernethy, Rafael M. Frongillo. Proceedings of NIPS (2011)

24 January: Amos Storkey

Amos will present the following paper:

Sparse Bayesian Multi-Task Learning, C. Archambeau, S. Guo, O. Zoeter, NIPS 2011.
Amos will discuss this paper with reference to other Bayesian approaches to sparsity. For example:
Bayesian Inference and Optimal Design in the Sparse Linear Model, M. Seeger, JMLR 2008

10 January: NIPS Review

CW: Object Detection with Grammar Models - Ross B. Girshick, Pedro Felzenszwalb, David Mcallester

DR: Learning to Learn with Compound HD Models - Ruslan R. Salakhutdinov, Josh Tenenbaum, Antonio Torralba

BU: Selecting Receptive Fields in Deep Networks - Adam Coates, Andrew Y. Ng

PO: Variational Gaussian Process Dynamical Systems - Andreas C. Damianou, Michalis Titsias, Neil D. Lawrence

(mention) Sparse Inverse Covariance Estimation Using Quadratic Approximation - Cho-Jui Hsieh, Matyas A. Sustik, Inderjit S. Dhillon, Pradeep K. Ravikumar

CS: Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent - Benjamin Recht, Christopher Re, Stephen Wright, Feng Niu. I also liked Learning unbelievable probabilities - Xaq S. Pitkow, Yashar Ahmadian, Ken D. Miller

KG: Co-Training for Domain Adaptation - Minmin Chen, Kilian Q. Weinberger, John Blitzer

29 November: Peter Orchard

15 November: Guido Sanguinetti

1 November: Ioan Stanculescu

We will discuss the following papers:

18 October: Ali Eslami

We will discuss the following papers:

4th October: UAI review

We will review the proceedings of UAI 2011. If you would like to discuss a paper, please edit this section to include the title together with your initials.

SL: Pitman-Yor Diffusion Trees - Knowles, Ghahramani

AJS: Sum Product Networks - Poon, Domingos

NH: Bregman divergence as general framework to estimate unnormalized statistical models - Gutmann, Hirayama

CKIW: Conditional Restricted Boltzmann Machines for Structured Output Prediction -- Mnih, Larochelle, Hinton

DR: Classification of Sets using Restricted Boltzmann Machines -- Louradour, Larochelle

20th September: Matthias Bethge

  • Guest lecture by Matthias Bethge.

Tue August 23rd - Shahzad Asif

Tue 9th August - Yichuan Zhang

Tue 26 July - "ICML 2011"

Tue 12 July - Athina Spiliopoulou

We will discuss the following paper:

Tue 28 June - Andrea Ocone

We will discuss the following papers:

Tue 14 June - talk by Michalis Titsias

Title: Sparse Variational Inference for Multi-Task Learning

Tue 17 May - "AISTATS 2011"

Please add your paper nominations together with your initials.

Tue 3 May - Ronald Begg

We will discuss the following paper:

Tue 19 April - Simon Lyons

We will discuss the following papers:

Tue 5 April - Frank Dondelinger

We will discuss the following paper:

Tue 22 March - Grigorios Skolidis

We will discuss the following paper:

Tue 8 March - Botond Cseke

We will discuss the following paper:

Tue 22 February - Chris Williams

We will discuss the following paper:

Tue 8 February - David Reichert

We will discuss the following paper:

  • Olivier Breuleux, Yoshua Bengio, and Pascal Vincent, Neural Computation (in press): Quickly Generating Representative Samples from an RBM-Derived Process
If there is time, we will also discuss:

Tue 25 January - Charles Sutton

We will discuss the following paper

Tue 11 January - "NIPS 2010 highlights"

(*) DR: Due to my short notice decision to submit something to ICANN, I probably won't have time to look at the paper, so it's up for grabs.

Leave at bottom of list:

CW: There are lots of other papers I like that I hope someone will choose, e.g. Structured Determinantal Point Processes by Alex Kulesza, Ben Taskar; Tree-Structured Stick Breaking for Hierarchical Data, Ryan Adams, Zoubin Ghahramani, Michael Jordan; The Multidimensional Wisdom of Crowds, Peter Welinder, Steve Branson, Serge Belongie, Pietro Perona; Self-Paced Learning for Latent Variable Models, M. Pawan Kumar, Benjamin Packer, Daphne Koller; Learning Convolutional Feature Hierarchies for Visual Recognition, Kavukcuoglu et al; Divisive Normalization: Justification and Effectiveness as Efficient Coding Transform, Siwei Lyu; Global seismic monitoring as probabilistic inference, Nimar Arora, Stuart Russell, Paul Kidwell, Erik Sudderth (application); Energy Disaggregation via Discriminative Sparse Coding, J. Zico Kolter, Siddharth Batra, Andrew Ng (application)

IM: There are some other papers I could say a sentence or two about: "Tree-Structured Stick Breaking for Hierarchical Data" by Ryan Adams, Zoubin Ghahramani, Michael Jordan; "Global seismic monitoring as probabilistic inference" by Nimar Arora, Stuart Russell, Paul Kidwell, Erik Sudderth; "Label Embedding Trees for Large Multi-Class Tasks" by Samy Bengio, Jason Weston, David Grangier; Comparing "Movement extraction by detecting dynamics switches and repetitions" by Silvia Chiappa, Jan Peters and "Mixture of time-warped trajectory models for movement decoding" by Elaine Corbett, Eric Perreault, Konrad Koerding; "Self-Paced Learning for Latent Variable Models" by M. Pawan Kumar, Benjamin Packer, Daphne Koller.

Meetings in 2010

Meetings in 2009

Meetings in 2008

Meetings in 2007

Earlier meetings (2002-2006) on old website

Topic attachments
Attachment | Size | Date | Who | Comment
zip (file name not shown) | 4178.0 K | 15 Jul 2008 - 12:53 | Main.s0565918 |
latent-models-covariance.pdf | 276.1 K | 20 Jul 2007 - 13:38 | Main.s9810791 | Latent models for cross-covariance (PIGS 24th July 2007)
Topic revision: r361 - 28 Mar 2014 - 11:59:58 - Main.s1058681