NIPS 2016 Review, Days 0 & 1

A Lab41 Perspective

Karl N.
Gab41

--

Good morning, fellow machine learners. A few of us from Lab41 recently jumped the pond over to Barcelona, Spain, to see what machine learning and artificial intelligence stuff we could glean from eager minds. We found former colleagues, former students, and more deep learning algorithms than the number of cat pictures on the internet.

NIPS 2016 in Barcelona. Picture at Park Güell

Conference Overview

The organizers came out before the keynote (Yann LeCun) to introduce us to NIPS 2016. They pulled together some statistics from the tags that submissions self-identified with. According to the picture below, the distribution of papers is heavy-tailed, and the spread of topics makes for a rich problem set. That’s a first-order statement, since there seems to be high correlation between topics in a given paper. (I’m sure large-scale learning can be applied to computer vision, and they’re using deep learning to do it.) Still, there’s a wide variety of things to see here.

Ever the scientists, the two organizers justified their choices on the program committee by maintaining that they want to grow the number of submissions while decreasing bias and variance. They treated it as a problem with unknown ground truth of what the “best papers” really were.

High Profile Talks

Keynote — Yann LeCun gave the keynote on “Predictive Learning”, an ambiguous title for a talk he must have presented a million times by now. It’s the one where he makes an analogy to parts of a cake: unsupervised learning is the filling of the cake (the big kahuna), supervised learning is the frosting, and reinforcement learning is the cherry on top. It seems like he’s been burned by this analogy, and he was apologetic, saying he’ll make it up (to DeepMind, maybe?) because reinforcement learning, if aided by unsupervised approaches, is also a big chunk of where research should lie. He pretty much gave an overview of what he views as the important topics. Among other big ideas, he said that Generative Adversarial Networks were the most important innovation in machine learning in recent times, and he credited Ian Goodfellow. High praise from one of the elders of Deep Learning.

Intelligent Biosphere — Google DeepMind’s invited talk on the Intelligent Biosphere kicked off the first day. Drew Purves from DeepMind was the speaker, and it was a refreshing insight into how AI can be used for social good. The premise is that AI can help Nature, and, on a less intuitive note, Nature can help AI. On the former, if you think hard about it, you’ll see that statistics and machine learning can help policy makers produce less waste in farming and agriculture, among other efficiency-saving measures. Beyond efficiency gains, there are breakthroughs to be made that could aid humanity in new ways. On the latter, he drew distinctions between natural versus artificial and real versus simulated. The natural world is scale-less, cyclic on many levels, fuzzy, and just plain hard to work with; everything we train, on the other hand, has been carefully scoped. We can take cues from Nature to build better simulations of the real world.

There was a whole bunch of other stuff, but the big idea was how to make sure machine learning algorithms apply to the real and the natural, and so they were proud to introduce their real-world simulator. And…their slides are, by far, the most stylish.

The Masterful Artistry of DeepMind Graphics

One fun note is that Purves pointed to two high school students getting a head start on things. They had started playing around with TensorFlow, and it’s encouraging to see the future talent coming up. I was slightly disappointed that there wasn’t any applause, but Purves moved on quickly and needed to wrap up.

Best Paper Award: Value Iteration Networks — The best paper award went to Aviv Tamar, from Berkeley’s Artificial Intelligence Research Laboratory, who talked about Value Iteration Networks. It was a new look at reinforcement learning: he wanted to build a neural network that can learn to plan a policy rather than follow a totally reactive one. He also wanted it to be model-free, and what better way to do that than to use CNNs? His paper is on arXiv and will appear in the proceedings.
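
For the curious, here is a toy numpy sketch of the core idea as I understood it: value iteration unrolled as a recurrent convolution followed by a max over action channels. The filters below are hand-made stand-ins for weights the network would actually learn end-to-end, so treat this as an illustration rather than the authors’ implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def value_iteration_module(reward, kernels, iterations=20):
    # Recurrent convolution plus a channel-wise max, loosely mimicking VIN's planning module.
    value = np.zeros_like(reward)
    for _ in range(iterations):
        # One "Bellman backup" per iteration: each action's Q-map is a convolution
        # of the reward-plus-value map with that action's filter.
        q = np.stack([convolve2d(reward + value, k, mode="same") for k in kernels])
        value = q.max(axis=0)  # max over actions, as in classical value iteration
    return value

# Toy 8x8 grid world with a single rewarding cell and four hand-made "move" filters.
reward = np.zeros((8, 8))
reward[6, 6] = 1.0
kernels = []
for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
    k = np.zeros((3, 3))
    k[1 + dr, 1 + dc] = 0.9  # 0.9 plays the role of a discount factor
    kernels.append(k)
print(value_iteration_module(reward, kernels).round(2))
```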

General Trends & Some Specific Talks

One big theme I noticed: people were using either batch normalization or layer normalization; it has become something of a mantra. At the posters I also noticed something called Weight Normalization; I’ll report on that tomorrow when it’s presented in an oral.
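
For anyone who hasn’t kept the two straight, here’s a bare-bones numpy sketch of the difference (learnable gain and bias terms omitted): batch norm normalizes each feature across the batch, while layer norm normalizes each example across its features.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # x: (batch, features); normalize each feature across the batch dimension.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each example across its own features; no dependence on the batch.
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.default_rng(0).normal(size=(4, 8))
print(batch_norm(x).std(axis=0))  # ~1 per feature
print(layer_norm(x).std(axis=1))  # ~1 per example
```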

The themes of the conference were:

  1. Generative methods or GANs, all day, every day. It’s official, as declared by Yann LeCun: GANs are the best thing to happen in twenty years. The number of arXiv papers on the subject is staggering. Even the physicist in tomorrow’s talk is enamored with GANs.
  2. Reinforcement learning is not really just a cherry on top. Both DeepMind and OpenAI are doubling down on real-environment simulators. These will get us closer to true AI working in true environments, which is a theme of a couple of the highest-profile invited talks.
  3. Human-inspired computing hasn’t gone away, and chief among these ideas is the philosophy of short-term and long-term memory. Despite our personal difficulties working with memory networks, the community is surging forward with the idea that, like humans, computers need to store things in local memory (to stretch an analogy: registers) but should also be able to reach far back into their past (again: a hard drive). Or maybe this was just my impression after listening to the keynote, since LeCun is at FAIR and they’re into that.

On more specific themes, I happen to be working on an audio project currently, so maybe my thoughts are a bit skewed on which things are most interesting. If you have alternative views of what was going on in the conference, please do send me a note.

Time Series and Recurrent Networks

Who knew that one-dimensional signals would give us so much trouble? Time series and audio were very present at this year’s NIPS, with the hard problems, of course, being non-stationary signals.

Aapo Hyvärinen (Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA) was first in the unsupervised track with nonlinear ICA, where he asserted that independent components in nonlinear systems are difficult to obtain because the inverse problem (of a deep learning function) is ill-posed. He proposes Time-Contrastive Learning, which seems like another “something”-2vec at first glance: chop a non-stationary series into segments, train a network to classify which segment each sample came from, and use the learned hidden features as estimates of the independent components.
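
A toy sketch of that segment-discrimination trick, just to make the idea concrete. The data and network here are made up, and the real method also applies a final linear ICA step to the learned hidden features, so consider this a rough illustration only.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy non-stationary series: the variance of each channel changes from segment to segment.
rng = np.random.default_rng(0)
n_segments, seg_len, dim = 8, 200, 3
x = np.concatenate([rng.normal(scale=rng.uniform(0.5, 2.0, dim), size=(seg_len, dim))
                    for _ in range(n_segments)])
labels = np.repeat(np.arange(n_segments), seg_len)

# Train a classifier to tell which segment each sample came from; in time-contrastive
# learning the last hidden layer of this network is what estimates the nonlinear sources.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500).fit(x, labels)
print("segment classification accuracy:", clf.score(x, labels))
```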

One talk, Using Fast Weights to Attend to the Recent Past, was on fast and slow weights for recurrent networks. The slow weights carry slowly varying, longer-term information, while the fast weights learn rapidly and decay rapidly, storing specific temporary information. The solution? Add an inner loop that attends to the fast stuff.
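
The core update, as I understood it, looks roughly like this. It’s a simplified sketch: the paper also uses layer normalization and explicit slow-weight and input matrices, which I fold into the hidden state here.

```python
import numpy as np

def fast_weight_step(A, h, lam=0.95, eta=0.5, inner_steps=3):
    # Decay the fast weight matrix and write the current hidden state into it (outer product).
    A = lam * A + eta * np.outer(h, h)
    # The hidden state then "attends to the recent past" by iterating through A a few times.
    hs = h.copy()
    for _ in range(inner_steps):
        hs = np.tanh(h + A @ hs)  # slow-weight and input contributions are folded into h here
    return A, hs

A = np.zeros((32, 32))
h = np.random.default_rng(0).standard_normal(32)
A, h = fast_weight_step(A, h)
```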

Other works dealt with compressing neural networks and providing approximate bounds on them (à la Supervised Learning Through the Lens of Compression). The concept was simple: you add an epsilon to your hypothesis and say that you’re compressing well enough if you match the truth plus epsilon instead of just the truth.

Phased LSTMs (PLSTMs) — As if LSTMs weren’t complicated enough, we’d like to put more stuff in there. It acts something like a sampler, but one sampling at different phases. They built this to deal with really long sequences, sampling at regular frequencies (think wavelets: low, medium, and high frequencies) by putting a time gate inside the LSTM.
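
The time gate is the new ingredient: each unit gets a learnable period and only opens for a small fraction of it. Here is a hedged numpy sketch from my notes (the parameter names are mine); the gate value then multiplies the usual cell and hidden updates.

```python
import numpy as np

def time_gate(t, tau, shift, r_on=0.05, alpha=1e-3):
    # Phase of timestamp t within this unit's oscillation of period tau.
    phi = ((t - shift) % tau) / tau
    # Gate ramps open, ramps closed, and otherwise "leaks" a tiny alpha * phi.
    return np.where(phi < 0.5 * r_on, 2.0 * phi / r_on,
           np.where(phi < r_on, 2.0 - 2.0 * phi / r_on, alpha * phi))

t = np.linspace(0, 10, 1000)
k = time_gate(t, tau=3.0, shift=0.0)  # mostly ~0, briefly opening once per period
```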

SRNNs — This one got some flak from Li Deng and others: use recurrent neural networks (RNNs) in combination with state space models (SSMs). RNNs are good at long-term dependencies, and SSMs are good at modeling uncertainty. So, let’s put ’em together and model non-stationary stuff.
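
A back-of-the-envelope sketch of the combination, with randomly initialized weights standing in for trained parameters. This is my rough reading of the idea rather than the paper’s architecture: a deterministic recurrent state runs alongside a stochastic Gaussian latent that conditions on it.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_d, dim_z, dim_x = 8, 4, 2
# Random weights purely for illustration; a real model would learn these.
Wd, Wx = 0.1 * rng.normal(size=(dim_d, dim_d)), 0.1 * rng.normal(size=(dim_d, dim_x))
Wz, Wdz = 0.1 * rng.normal(size=(dim_z, dim_z)), 0.1 * rng.normal(size=(dim_z, dim_d))

def srnn_step(d_prev, z_prev, x_prev, sigma=0.1):
    # Deterministic RNN path: carries the long-range dependencies.
    d = np.tanh(Wd @ d_prev + Wx @ x_prev)
    # Stochastic state-space path: a Gaussian latent conditioned on the RNN state,
    # which is where the model expresses uncertainty.
    z_mean = np.tanh(Wz @ z_prev + Wdz @ d)
    z = z_mean + sigma * rng.standard_normal(dim_z)
    return d, z

d, z = np.zeros(dim_d), np.zeros(dim_z)
for x in rng.normal(size=(5, dim_x)):  # five steps of a toy input sequence
    d, z = srnn_step(d, z, x)
```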

Unsupervised Learning

Lots of GANs here. Also, Ian Goodfellow gave a shout-out to Plug and Play Generative Networks, which I saw at the posters. The idea is to synthesize images, but to do it with a prior so the results don’t look so creepy. Here are some of the images.

One of the practical papers on seeding k-means and improving upon k-means++ was Fast and Provably Good Seedings for k-Means by Olivier Bachem. The idea is to use MCMC to jump around the data efficiently and seed k-means with good initial centers.
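
Roughly, instead of the full D² pass over the data that k-means++ makes for every new center, you run a short Metropolis-style chain over sampled candidates. A simplified sketch follows; it mirrors the earlier k-MC² flavor with a uniform proposal, whereas the NIPS paper’s “assumption-free” version uses a smarter proposal distribution.

```python
import numpy as np

def mcmc_seeding(X, k, chain_len=100, seed=0):
    rng = np.random.default_rng(seed)
    # First center uniformly at random, as in k-means++.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # A short Markov chain replaces the full D^2 sweep over the data set.
        x = X[rng.integers(len(X))]
        dx = min(np.sum((x - c) ** 2) for c in centers)
        for _ in range(chain_len):
            y = X[rng.integers(len(X))]
            dy = min(np.sum((y - c) ** 2) for c in centers)
            # Metropolis-style acceptance: favor candidates far from existing centers.
            if dx == 0 or rng.random() < dy / dx:
                x, dx = y, dy
        centers.append(x)
    return np.array(centers)

X = np.random.default_rng(1).normal(size=(1000, 2))
init_centers = mcmc_seeding(X, k=5)
```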

General Attendance Logistics

In a discussion with another conference-goer, I learned that European conferences have a reputation for cutting off registration hard. I contrasted that with my experience at ICML in New York, where they booked an entirely different venue to handle the overflow of conference participants. Upsides and downsides to each.

I don’t know if it’s now commonplace, but the WiFi was intermittent, which is frustrating. On the positive side, the conference app, “Whova”, is neat, and its forum is worth checking out. Note that “Whova” doesn’t actually need the internet, which mitigates the fact that not everyone got hard-copy programs. The poster session was packed; it was clear that this was one of the largest conferences I’ve been to in a while!

I personally attended the Nonstationary Time Series and the Generative Adversarial Networks tutorials, which were so crowded that people stood along the sides. On this point, I must apologize to the twenty people I had to step over to get to my seat. I promise that I would have been just fine sitting on the sides, but twenty minutes after the lecture started, a security guard kicked everyone off the edges and made us all find seats. For good reason, though: it’s good that there’s strict adherence to the fire codes.

Ian Goodfellow delivered the GANs tutorial. So…Las Fallas is a celebration in Spain in March, but there were some fireworks today. I’m not going to gossip, but there was some contention about who came up with what first. Goodfellow shut it down pretty hard and quick.

Did I miss something?

Please let us know if there’s anything I missed, since I was severely jet-lagged; if you attended, send us the other fun things you saw at NIPS via e-mail (kni@iqt.org). Tune in tomorrow for the second blog post on NIPS 2016. It’s a blast out here in Barcelona, and we love sharing what we’re seeing!
