**What graph neural networks cannot learn**

My recent work studies the capacity limits of graph neural networks (GNNs) that fall within the message-passing framework.

What graph neural networks cannot learn: depth vs width (to appear in ICLR 2020)

Two main results are derived:

*First, GNNs are shown to be Turing universal under sufficient conditions on their construction.* *Second, GNNs can lose a significant portion of their power when their depth and width are restricted.*

The proposed impossibility statements concern an array of *decision*, *optimization*, and *estimation* problems involving graphs (e.g., does my graph contain a cycle, estimate the diameter, or identify the minimum cut). Strikingly, several of these problems are deemed impossible unless the product of a GNN’s depth and width exceeds the graph size. This dependence remains significant even for tasks that appear simple or when considering approximation.
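
To give a sense of the form these statements take (informally, and glossing over problem-specific constants and logarithmic factors), they can be read as a lower bound on the product of depth and width: for graphs with $n$ nodes,

$$d \cdot w = \Omega(n),$$

where $d$ is the number of message-passing rounds (depth) and $w$ the size of the state each node maintains (width).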

For an in-depth exposition, see *this blog-post*.

**On the relationship between self-attention and convolution**

Self-attention has become the de facto building block for NLP. But this success is not restricted to text—transformer-based architectures have been shown to be competitive with state-of-the-art ResNets on computer vision tasks.

Together with Jean-Baptiste Cordonnier and Martin Jaggi, we attempted to understand the relationship between self-attention and convolution.

On the Relationship between Self-Attention and Convolutional Layers (to appear in ICLR 2020)

Our work shows that multi-head self-attention not only has the capacity to express a CNN layer but, intriguingly, also tends to do so in practice.
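
To make the expressivity direction of this result tangible, here is a toy numpy sketch (an illustration of the general idea, not the paper's construction or code): if each attention head attends, with a hard one-hot attention pattern, to one fixed relative pixel shift and uses the corresponding kernel slice as its value projection, then summing the heads reproduces a K×K convolution. Softmax attention with relative positional encodings can approximate such hard patterns arbitrarily well. All names below are illustrative.

```python
import numpy as np

H, W, C_in, C_out, K = 6, 6, 3, 4, 3                       # image size, channels, kernel size
rng = np.random.default_rng(0)
X = rng.standard_normal((H * W, C_in))                     # pixels flattened into "tokens"
kernel = rng.standard_normal((K, K, C_in, C_out))          # a 3x3 convolution kernel
shifts = [(dy, dx) for dy in range(-1, 2) for dx in range(-1, 2)]  # one head per shift

def conv2d(X, kernel):
    """Reference KxK convolution with zero padding."""
    Xim = X.reshape(H, W, C_in)
    out = np.zeros((H, W, C_out))
    for y in range(H):
        for x in range(W):
            for (dy, dx), Wk in zip(shifts, kernel.reshape(K * K, C_in, C_out)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    out[y, x] += Xim[yy, xx] @ Wk
    return out.reshape(H * W, C_out)

def attention_emulating_conv(X, kernel):
    """One head per relative shift: hard (one-hot) attention to that shift,
    value projection equal to the matching kernel slice; heads are summed."""
    out = np.zeros((H * W, C_out))
    for (dy, dx), Wk in zip(shifts, kernel.reshape(K * K, C_in, C_out)):
        A = np.zeros((H * W, H * W))           # this head's attention matrix
        for y in range(H):
            for x in range(W):
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    A[y * W + x, yy * W + xx] = 1.0
        out += A @ X @ Wk                      # attend, then project the values
    return out

assert np.allclose(conv2d(X, kernel), attention_emulating_conv(X, kernel))
```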

For more information, see *this blog-post*.

**Modeling paths with graph neural networks**

Jean-Baptiste Cordonnier and I worked on the problem of path inference: given a path prefix, i.e., a partially observed sequence of nodes in a graph, we aim to predict which nodes are in the missing suffix.

Extrapolating paths with graph neural networks (IJCAI 2019)

For more information, see *this blog-post*.

**Review article on sampling techniques**

Nicolas Tremblay and I recently dived into the very interesting literature on sampling methods for spectral clustering. Here is the pre-print of our labors:

Approximating Spectral Clustering via Sampling: a Review (chapter of the book “*Sampling Techniques for Supervised or Unsupervised Tasks*“)

In it, we discuss a lot of cool techniques from numerical linear algebra and machine learning (such as *Nyström approximation*, *random Fourier features*, *coresets*, *landmarks*, and *sparsification*), in addition to summarizing our recent discoveries with *random graph filters* and *coarsening*.
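
For a flavour of what these methods look like, here is a minimal numpy sketch of the generic (textbook) Nyström approximation, not the specific algorithms analysed in the chapter: an n×n kernel matrix is approximated from m ≪ n sampled columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 500, 50
points = rng.standard_normal((n, 2))

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel between two point sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf_kernel(points, points)                  # full kernel, only for comparison

idx = rng.choice(n, size=m, replace=False)      # uniform landmark sampling
C = rbf_kernel(points, points[idx])             # n x m sampled columns
W = C[idx, :]                                   # m x m block among the landmarks
K_approx = C @ np.linalg.pinv(W) @ C.T          # Nyström estimate: C W^+ C^T

rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(f"relative Frobenius error with {m}/{n} landmarks: {rel_err:.3f}")
```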

**Graph reduction and coarsening**

I have been thinking for some time now about *how we can reduce the size of a graph without significantly altering its basic properties*. In my latest work, I present coarsening algorithms that preserve the *spectrum of a graph* as well as its *community structure*.

Graph reduction with spectral and cut guarantees (JMLR 2019)

A demonstration can be found in *this blog-post*.
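
For a rough sense of the object being manipulated, here is a toy numpy/networkx sketch of coarsening by contracting groups of nodes into supernodes through a membership matrix P, so that the coarse Laplacian is P L Pᵀ. The partition below is arbitrary and the construction generic; the point of the paper is how to choose the contraction sets (and their weights) so that spectral and cut properties are provably preserved.

```python
import numpy as np
import networkx as nx   # assumed available, just to get a small example graph

G = nx.karate_club_graph()
L = nx.laplacian_matrix(G).toarray().astype(float)
n = L.shape[0]

# A fixed (arbitrary) partition of the 34 nodes into seven contraction sets.
groups = [list(range(i, min(i + 5, n))) for i in range(0, n, 5)]

# Membership matrix: row j of P selects the nodes merged into supernode j.
P = np.zeros((len(groups), n))
for j, grp in enumerate(groups):
    P[j, grp] = 1.0

# Coarse Laplacian: off-diagonal entries are minus the total edge weight
# between two supernodes, so this is the Laplacian of the contracted graph.
L_coarse = P @ L @ P.T

# Sanity check: rows of a Laplacian sum to zero, and so do rows of L_coarse.
assert np.allclose(L_coarse.sum(axis=1), 0.0)
print(L_coarse.shape)   # (7, 7), a much smaller graph
```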

**The discrimination capacity of graph neural network layers**

Our new work (with Youngjoo Seo & Nathanaël Perraudin) considers the discrimination capacity of aggregation functions—a key ingredient of graph neural networks, generalizing convolution and pooling.

Discriminative structural graph classification

Taking a theoretical standpoint first, we argue that the maximum capacity of aggregation functions cannot be achieved when learning from finite samples (we derive a curse-of-dimensionality-type lower bound).

Inspired by our findings, we design a graph neural network for structural graph classification that aims not for maximum discrimination power, but to learn discriminative graph representations that generalize well. This is achieved by introducing a new type of aggregation function with fewer parameters and by exploiting randomization during training to combat overfitting. These choices are shown to yield benefits in practice.