Recent and upcoming research

What graph neural networks cannot learn

My recent work studies the capacity limits of graph neural networks (GNNs) that fall within the message-passing framework.

What graph neural networks cannot learn: depth vs width (ICLR 2020)

Two main results are derived:

  • First, GNNs are shown to be Turing universal under sufficient conditions on their construction.
  • Second, GNNs are shown to lose a significant portion of their power when their depth and width are restricted.

The proposed impossibility statements concern an array of decision, optimization, and estimation problems involving graphs (e.g., deciding whether a graph contains a cycle, estimating its diameter, or identifying a minimum cut). Strikingly, several of these problems are shown to be impossible unless the product of a GNN’s depth and width exceeds the graph size. This dependence remains significant even for seemingly simple tasks and even when approximate solutions suffice.
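
To make the two quantities in this statement concrete, here is a minimal message-passing sketch in NumPy: the width is the dimension of the node states and the depth is the number of rounds. This is a generic illustration of the model class, not the exact construction analyzed in the paper.

    import numpy as np

    def message_passing(A, X, Ws, sigma=np.tanh):
        # A  : (n, n) adjacency matrix of the graph
        # X  : (n, w) node states; w is the network's width
        # Ws : list of (w, w) weight matrices; their number is the depth
        H = X
        for W in Ws:
            # every node aggregates its own state and its neighbors' states,
            # then applies a learned linear map and a non-linearity
            H = sigma((A + np.eye(A.shape[0])) @ H @ W)
        return H

    # toy run: n = 6 nodes arranged in a cycle, depth d = 3, width w = 4
    n, d, w = 6, 3, 4
    A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
    X = np.random.randn(n, w)
    Ws = [np.random.randn(w, w) / np.sqrt(w) for _ in range(d)]
    print(message_passing(A, X, Ws).shape)         # (6, 4)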

For an in-depth exposition, see this blog post.

On the relationship between self-attention and convolution

Self-attention has become the de facto building block of modern NLP models. But this success is not restricted to text: transformer-based architectures have been shown to be competitive with state-of-the-art ResNets on computer vision tasks.

Together with Jean-Baptiste Cordonnier and Martin Jaggi, I attempted to understand the relationship between self-attention and convolution.

On the Relationship between Self-Attention and Convolutional Layers (ICLR 2020)

Our work shows that multi-head self-attention not only has the capacity to express a convolutional layer but, intriguingly, also tends to do so in practice.
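
The intuition can be illustrated with a small NumPy sketch: if each attention head attends (hard-wired, for simplicity) to a fixed relative shift, then summing the heads reproduces a convolution whose kernel is given by the per-head value projections. Roughly speaking, in the paper such shifts are realized softly through learned relative positional encodings; the hard-coded version below is only a hypothetical toy.

    import numpy as np

    rng = np.random.default_rng(0)
    T, d = 10, 4                        # sequence length and feature dimension
    shifts = [-1, 0, 1]                 # one head per relative position (kernel size 3)
    X = rng.standard_normal((T, d))
    # per-head value/output projections; these become the convolution kernel
    W = [rng.standard_normal((d, d)) for _ in shifts]

    def hard_shift_attention(X, shifts, W):
        T = X.shape[0]
        out = np.zeros_like(X)
        for s, Wh in zip(shifts, W):
            # hard attention: the query at position i attends only to position i + s
            A = np.zeros((T, T))
            i = np.arange(T)
            ok = (i + s >= 0) & (i + s < T)
            A[i[ok], i[ok] + s] = 1.0
            out += A @ X @ Wh           # sum over heads
        return out

    def conv1d(X, shifts, W):
        out = np.zeros_like(X)
        for i in range(X.shape[0]):
            for s, Wh in zip(shifts, W):
                if 0 <= i + s < X.shape[0]:
                    out[i] += X[i + s] @ Wh
        return out

    print(np.allclose(hard_shift_attention(X, shifts, W), conv1d(X, shifts, W)))  # True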

For more information, see this blog post.

Modeling paths with graph neural networks

Jean-Baptiste Cordonnier and I worked on the problem of path inference: given a path prefix, i.e., a partially observed sequence of nodes in a graph, we aim to predict which nodes are in the missing suffix.
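
As a toy illustration of the task (not of our model, which learns how to score candidate continuations with a graph neural network), the hypothetical baseline below extends a prefix greedily by preferring unvisited high-degree neighbors.

    # toy graph as an adjacency list (node -> neighbors); purely hypothetical
    graph = {
        0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 4},
        3: {1, 4, 5}, 4: {2, 3, 5}, 5: {3, 4},
    }

    def extrapolate(graph, prefix, steps):
        """Greedy baseline: extend the prefix by repeatedly moving to the
        unvisited neighbor of the last node with the highest degree."""
        path = list(prefix)
        for _ in range(steps):
            last, visited = path[-1], set(path)
            candidates = [v for v in graph[last] if v not in visited]
            if not candidates:
                break
            path.append(max(candidates, key=lambda v: len(graph[v])))
        return path

    print(extrapolate(graph, prefix=[0, 1, 3], steps=2))   # [0, 1, 3, 4, 2]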

Extrapolating paths with graph neural networks (IJCAI 2019)

For more information, see this blog post.

Review article on sampling techniques

Nicolas Tremblay and I recently delved into the very interesting literature on sampling methods for spectral clustering. Here is the pre-print that resulted from our labors:

Approximating Spectral Clustering via Sampling: a Review (chapter of the book “Sampling Techniques for Supervised or Unsupervised Tasks”)

In it, we discuss a lot of cool techniques from numerical linear algebra and machine learning (such as Nyström approximation, random Fourier features, coresets, landmarks, and sparsification), in addition to summarizing our recent findings on random graph filters and coarsening.
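
As a taste of the simplest of these ideas, here is a sketch of the Nyström approximation in NumPy: sample m landmark columns of the similarity matrix and reconstruct the rest from them. The uniform sampling and the RBF kernel below are illustrative assumptions only; the review discusses more refined sampling schemes.

    import numpy as np

    rng = np.random.default_rng(1)
    n, m = 500, 60                                 # n points, m sampled landmarks

    # hypothetical similarity matrix: RBF kernel on random 2-D points
    X = rng.standard_normal((n, 2))
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq)                                # full n x n kernel, for reference

    idx = rng.choice(n, size=m, replace=False)     # uniform column sampling
    C = K[:, idx]                                  # n x m sampled columns
    W = K[np.ix_(idx, idx)]                        # m x m block among the landmarks
    K_nys = C @ np.linalg.pinv(W) @ C.T            # Nyström estimate of K

    print(np.linalg.norm(K - K_nys) / np.linalg.norm(K))   # small relative error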

Graph reduction and coarsening

I have been thinking for some time now about how we can reduce the size of a graph without significantly altering its basic properties. In my latest work, I present coarsening algorithms that preserve the spectrum of a graph as well as its community structure.
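
To fix ideas, the sketch below shows the basic objects such algorithms manipulate: a cluster-membership matrix P and the coarse Laplacian P L Pᵀ obtained by contracting groups of nodes. The grouping here is picked by hand on a toy graph; the point of the paper is choosing and weighting the contractions so that spectral and cut properties are provably preserved.

    import numpy as np

    # toy graph: two 4-node cliques joined by a single edge (two clear communities)
    n = 8
    A = np.zeros((n, n))
    A[:4, :4] = 1
    A[4:, 4:] = 1
    np.fill_diagonal(A, 0)
    A[3, 4] = A[4, 3] = 1
    L = np.diag(A.sum(1)) - A                      # combinatorial Laplacian

    # coarsen by contracting node pairs; P is the 0/1 cluster-membership matrix
    clusters = [[0, 1], [2, 3], [4, 5], [6, 7]]
    P = np.zeros((len(clusters), n))
    for c, nodes in enumerate(clusters):
        P[c, nodes] = 1.0
    Lc = P @ L @ P.T                               # Laplacian of the contracted graph

    # the bottleneck between the two communities shows up as a small
    # non-trivial eigenvalue in both the original and the coarse spectrum
    print(np.round(np.sort(np.linalg.eigvalsh(L))[:3], 2))
    print(np.round(np.sort(np.linalg.eigvalsh(Lc))[:3], 2))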

Graph reduction with spectral and cut guarantees (JMLR 2019)

A demonstration can be found in this blog post.

The discrimination capacity of graph neural network layers

Our new work (with Youngjoo Seo and Nathanaël Perraudin) considers the discrimination capacity of aggregation functions, a key ingredient of graph neural networks that generalizes both convolution and pooling.

Discriminative structural graph classification

Taking a theoretical standpoint first, we argue that the maximum capacity of aggregation functions cannot be achieved when learning from finite samples (we derive a curse-of-dimensionality-type lower bound).
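
For readers unfamiliar with the term, "discrimination capacity" refers to how many distinct multisets of neighbor features an aggregation function can tell apart. The toy example below uses the standard sum/mean/max aggregators (not the new function proposed in the paper) to show that this choice matters.

    import numpy as np

    # two neighborhoods given as multisets of scalar node features
    # (hypothetical values, chosen so that some aggregators cannot separate them)
    neighborhood_a = np.array([1.0, 1.0, 2.0, 2.0])
    neighborhood_b = np.array([1.0, 2.0])

    for name, agg in {"sum": np.sum, "mean": np.mean, "max": np.max}.items():
        same = np.isclose(agg(neighborhood_a), agg(neighborhood_b))
        print(f"{name:4s}: {agg(neighborhood_a):.1f} vs {agg(neighborhood_b):.1f}"
              f"  -> indistinguishable: {same}")
    # mean and max collapse the two neighborhoods while sum separates them:
    # different aggregators can tell apart different families of multisets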

Inspired by our findings, we design a graph neural network for structural graph classification that aims not for maximum discrimination power but for discriminative graph representations that generalize well. This is achieved by introducing a new type of aggregation function with fewer parameters and by exploiting randomization during training to combat overfitting. Both choices are shown to yield benefits in practice.