Coordinate independent CNNs on Riemannian manifolds
An introduction to equivariant & coordinate independent CNNs – Part 5

This is the last post in my series on equivariant deep learning and coordinate independent CNNs.
- Part 1: Equivariant neural networks – what, why and how ?
- Part 2: Convolutional networks & translation equivariance
- Part 3: Equivariant CNNs & G-steerable kernels
- Part 4: Data gauging, covariance and equivariance
- Part 5: Coordinate independent CNNs on Riemannian manifolds
In this post we investigate how convolutional networks are generalized to the differential geometric setting, that is, to process feature fields on manifolds (curved spaces). There are numerous applications of such networks, for instance, to classify, segment or deform meshes, or to predict physical quantities like the wall shear stress on an artery surface or tensor fields in curved spacetime. A differential geometric formulation furthermore establishes a unified framework for more classical models like spherical and Euclidean CNNs.
Manifolds do not, in general, come with a canonical choice of coordinates. CNNs are therefore naturally formulated as a gauge field theory, where the gauge freedom is given by choices of local reference frames, relative to which features and network layers are expressed.
Demanding the coordinate independence (gauge independence) of such CNNs leads inevitably to their equivariance under local gauge transformations.
These gauge equivariance requirements correspond exactly to the $G$-steerability of convolution kernels known from Euclidean steerable CNNs, as discussed in the third post.
This post is structured in the following five sections, which cover:
- An intuitive introduction from an engineering viewpoint. It identifies the gauge freedom of choosing reference frames with an ambiguity in aligning convolution kernels on manifolds.
- Coordinate independent feature spaces, which may be represented in arbitrary gauges, and are characterized by their transformation laws when transforming frames.
- The necessity for the gauge equivariance of neural network layers.
- The global isometry equivariance of these operations.
- Applications on different manifolds and with various equivariance properties.
This post’s content is more thoroughly covered in our book Equivariant and Coordinate Independent CNNs, specifically in part II (simplified formulation), part III (fiber bundle formulation), and part IV (applications).
To define CNNs on manifolds, one needs to come up with a reasonable definition of convolution operations. As discussed in the second post of this series, convolutions on Euclidean spaces can be defined as those linear maps that share synapse weights across space, i.e. apply the same kernel at each location.
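In formulas (as in the second post, and up to the usual convolution vs. correlation sign convention), one and the same kernel $K$ is applied at every point $x$:

$$ (K \star f)(x) \;=\; \int_{\mathbb{R}^d} K(v)\, f(x + v)\; dv\,. $$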
Kernel alignments as gauge freedom
As it turns out, finding a consistent definition of spatial weight sharing on manifolds is quite tricky. The central issue is the following:
The geometric alignment of convolution kernels on manifolds is inherently ambiguous.
For instance, on the monkey’s head below, it is unclear in which rotation a given kernel should be applied.

The specific level of ambiguity depends on the manifold’s geometry. For example, the Möbius strip allows kernels to be aligned along the strip’s circular direction, disambiguating rotations. However, as the strip is twisted, it is a non-orientable manifold: it is one-sided and does not admit a globally consistent notion of left and right. This implies that the kernels’ reflections remain ambiguous.

As a third example, consider Euclidean vector spaces $\mathbb{R}^d$. Their translational symmetry allows one to align kernels consistently over the whole space, such that no ambiguity remains at all.

In each of these examples, kernel alignments are specified up to transformations in some matrix group $G$: rotations $\mathrm{SO}(2)$ on the monkey’s head, reflections on the Möbius strip, and the trivial group $\{e\}$ on Euclidean spaces.
Steerable kernels as gauge independent operations
To remain general, we consider arbitrary Riemannian manifolds, equipped with any additional structure that may reduce the ambiguity of kernel alignments.
Given the context of steerable CNNs from the third post, an obvious solution is to use $G$-steerable kernels, whose responses are by design insensitive to the remaining ambiguity of alignments.
Here we have the slightly different situation of passive transformations between reference frames instead of active transformations of signals; however, as only the relative alignment of kernel and signal matters, the same steerability constraint arises.
In short, coordinate independent CNNs are just neural networks which apply $G$-steerable kernels (and other gauge equivariant operations) relative to the frames of the manifold's $G$-structure.
As the visualizations above already suggest, applying a $G$-steerable kernel relative to any of the ambiguous alignments yields responses that are related by the output field type's transformation law, and which therefore describe the same coordinate independent feature.

Any geometric quantities, in particular feature vectors, are required to be coordinate independent, that is, expressible relative to arbitrary reference frames.
To explain what I mean with “coordinate independent feature spaces”, this section discusses
- tangent spaces and their frames of reference,
- $G$-structures as bundles of geometrically preferred frames, and
- coordinate independent feature vectors.
Tangent vectors and reference frames
The idea of coordinate independence is best illustrated by the example of tangent vectors.
Note how transformations of reference frames and tangent vector coefficients are coupled to each other. This is what is meant when saying that they are associated to each other.
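Concretely, in one common convention (a sketch of the coupling, not the book's full bundle-theoretic definition), a gauge transformation $g \in G$ acts on a frame $(e_1, \dots, e_d)$ of $T_pM$ and on coefficient vectors $v \in \mathbb{R}^d$ in a coupled way, such that the abstract tangent vector itself stays put:

$$ \tilde{e}_i \;=\; \sum\nolimits_j e_j\, (g^{-1})_{ji}\,, \qquad \tilde{v} \;=\; g\,v \qquad \Longrightarrow \qquad \sum\nolimits_i \tilde{v}^i\, \tilde{e}_i \;=\; \sum\nolimits_i v^i\, e_i\,. $$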
A more formal mathematical description would consider the whole tangent bundle $TM$ instead of a single tangent space $T_pM$, since this allows one to capture concepts like the continuity or smoothness of tangent and feature fields. Gauges (local bundle trivializations) then correspond to smooth frame fields (gauge fields) on local neighborhoods $U \subseteq M$.

If one picks a coordinate chart, its coordinate bases induce such a frame field. Gauge transformations between chart-induced frame fields are given by the Jacobians of the chart transition maps.
$G$-structures
Recall how the manifold’s mathematical structure reduced the ambiguity of kernel alignments, such that transformations between alignments take values in some subgroup $G$ of the general linear group $\mathrm{GL}(d)$.
A priori, a smooth manifold has no additional structure that would prefer any reference frame over another.
One therefore considers the sets of all reference frames of the tangent spaces, whose elements are mutually related by arbitrary invertible linear transformations in $\mathrm{GL}(d)$.

Additional structure on smooth manifolds allows one to restrict attention to specific subsets of frames.
For instance, a Riemannian metric allows one to measure distances and angles,
and hence to single out orthonormal frames, which are mutually related by rotations and reflections in the orthogonal group $\mathrm{O}(d)$.
| additional structure | preferred frames | structure group $G$ |
---|---|---
| smooth structure only | any frames | $\mathrm{GL}(d)$ |
| orientation | right-handed frames | $\mathrm{GL}^+(d)$ |
| volume form | unit-volume frames | $\mathrm{SL}(d)$ |
| Riemannian metric | orthonormal frames | $\mathrm{O}(d)$ |
| pseudo-Riemannian metric | Lorentz frames | $\mathrm{O}(1,d-1)$ |
| metric + orientation | right-handed orthonormal frames | $\mathrm{SO}(d)$ |
| parallelization | frame field (unique frames) | $\{e\}$ |
The graphics below give a visual intuition for some of these $G$-structures.

As explained below, each of these $G$-structures comes with its own class of coordinate independent CNNs, whose kernels are required to be steerable w.r.t. the respective structure group $G$.
The manifold’s topology may obstruct the existence of a continuous $G$-structure if the structure group is too small. By the hairy ball theorem, for instance, the 2-sphere does not admit any continuous frame field, and hence no $\{e\}$-structure.
Any CNN on such a manifold will necessarily have to be gauge equivariant w.r.t. a sufficiently large structure group $G$.
Coordinate independent feature vectors
Feature vectors on a manifold with a $G$-structure are, just like tangent vectors, expressed relative to reference frames. Under gauge transformations $g$, their coefficients transform according to a group representation $\rho(g)$, which is called the field type.

This construction allows one to model scalar, vector, tensor, or more general feature fields:
feature field | field type |
---|---|
scalar field | trivial representation |
vector field | standard representation |
tensor field | tensor representation |
irrep field | irreducible representation |
regular feature field | regular representation |
For a geometric interpretation and specific examples of feature field types, have a look at the examples given in the third post on Euclidean steerable CNNs. In fact, the feature fields introduced here are the differential geometric generalization of the fields discussed there.
Overall, we have the following synchronized transformation laws under gauge transformations $g$:
- frames transform according to a right action of $G$,
- tangent vector coefficients get left-multiplied by $g$ itself, and
- feature vector coefficients transform according to the representation $\rho(g)$.
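In the convention sketched above, these three coupled transformation laws read

$$ e \,\mapsto\, e \triangleleft g\,, \qquad v \,\mapsto\, g\,v\,, \qquad f \,\mapsto\, \rho(g)\,f\,, $$

where $e \triangleleft g$ denotes the right action of $g \in G$ on frames.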
Furthermore, these objects have by construction compatible parallel transporters and isometry pushforwards (global group actions).
Coordinate independent CNNs are built from layers that
are 1) coordinate independent
and 2) share synapse weights between spatial locations (synapse weights referring e.g. to kernels, biases or nonlinearities).
Together, these two requirements enforce the shared weights' steerability,
that is, their equivariance under gauge transformations $g \in G$.
Kernels in geodesic normal coordinates
In contrast to Euclidean spaces, the local geometry of a Riemannian manifold might vary from point to point. It is therefore not immediately clear how convolution kernels should be defined on it and how they could be shared between different locations. A common solution is to define kernels as usual on flat Euclidean space and to apply them on tangent spaces instead of the manifold itself.
To match the kernel with feature fields, it needs to be projected to the manifold, for which we leverage the Riemannian exponential map. Equivalently, one can think of this as pulling the feature field back from the manifold to the tangent spaces. Expressed in a gauge, this corresponds to applying the kernel in geodesic normal coordinates.
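Schematically, and suppressing the identification of the tangent space $T_pM$ with $\mathbb{R}^d$ via the chosen frame, the resulting convolution takes the form

$$ (K \star f)(p) \;=\; \int_{\mathbb{R}^d} K(v)\, \rho_{\mathrm{in}}\big(g_{p \leftarrow \exp_p v}\big)\, f\big(\exp_p v\big)\, dv\,, $$

where $g_{p \leftarrow \exp_p v}$ is the gauge expression of the parallel transporter that moves features back along geodesics to $p$. This is a simplified sketch of the GM-convolution defined in our book, which should be consulted for the precise formulation.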
Gauge equivariance
The equivariance requirement on kernels follows by the same logic as discussed in the previous post: a priori, one has to distinguish between the covariance of coordinatized kernels and the equivariance of shared kernels.

G-covariance:
Assume that we are given a coordinate free kernel, defined without reference to any gauge. Its coordinatizations relative to different frames differ in general, however, they transform covariantly into each other under gauge transformations.

G-equivariance:
In the case of convolutions, there is no initial kernel that would single out a preferred alignment; instead, one and the same array of kernel coefficients is shared over all frames of the $G$-structure.
Aligning this shared kernel relative to $G$-related frames yields coordinate independent responses if and only if the kernel is $G$-steerable.
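For an input field of type $\rho_{\mathrm{in}}$ and an output field of type $\rho_{\mathrm{out}}$, this is the same $G$-steerability constraint that already appeared for Euclidean steerable CNNs in the third post:

$$ K(g\,v) \;=\; \tfrac{1}{|\det g|}\, \rho_{\mathrm{out}}(g)\, K(v)\, \rho_{\mathrm{in}}(g)^{-1} \qquad \textup{for all}\;\; g \in G,\; v \in \mathbb{R}^d\,. $$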
Example:
To get an intuition for the role of steerable kernels, let's consider the example of a reflection group structure.
As discussed in the third post, two possible field types of the reflection group are scalar fields, whose coefficients are invariant under gauge transformations, and sign-flip fields, whose coefficients flip their sign. A reflection-steerable kernel that maps scalar input fields to sign-flip output fields is necessarily antisymmetric.
Applying such an antisymmetric kernel relative to two reflected frames yields responses that differ exactly by a sign flip, which is precisely the transformation law required for sign-flip fields.

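The following minimal numpy sketch (my own illustration, not code from any of the referenced papers) verifies this numerically in one dimension: correlating a reflected scalar signal with an antisymmetric kernel and mapping the result back flips the sign of the response, exactly as a sign-flip field must transform.

```python
import numpy as np

# An antisymmetric kernel, k(-v) = -k(v), mapping scalar fields
# to sign-flip fields of the reflection group.
kernel = np.array([-1.0, 0.0, 1.0])
signal = np.random.default_rng(0).random(16)  # a scalar feature field

def correlate(s, k):
    """Cross-correlation, i.e. the kernel applied at every position."""
    return np.convolve(s, k[::-1], mode="same")

# Response in the original gauge.
out = correlate(signal, kernel)

# Response in the reflected gauge: express the signal relative to
# reflected frames, correlate, and map the result back.
out_reflected = correlate(signal[::-1], kernel)[::-1]

# The coefficients differ exactly by a sign flip, as demanded by
# the sign-flip field type.
assert np.allclose(out_reflected, -out)
```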
You can check analogous properties for any of the other pairs of field types and their steerable kernels from the third post. The difference here is that we are considering passive transformations of frames and kernel alignments instead of active transformations of signals. As only the relative alignment of kernels and signals matters, the behavior is ultimately equivalent.
Similar gauge equivariance constraints hold for any other operations whose weights are shared over the manifold, for instance bias summation or nonlinearities.
Classically, convolutional networks are those networks that are equivariant w.r.t. symmetries of the space they are operating on.
For instance, conventional CNNs on Euclidean spaces commute with translations, Euclidean steerable CNNs commute with affine groups, and spherical CNNs commute with rotations of the sphere.
The following section investigates the prerequisites for a layer’s isometry equivariance prior to convolutional weight sharing. The section thereafter applies the results to coordinate independent convolutions.
Isometry invariant kernel fields
Our main theorem regarding the isometry equivariance of coordinate independent CNNs establishes that a layer is isometry equivariant if and only if it exhibits both of the following properties, formalized after the list:
- weight sharing of kernels across isometry orbits (points related by the isometry action) and
- the kernels' steerability w.r.t. their respective stabilizer subgroup.
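Schematically, and suppressing gauge expressions and determinant factors, these two conditions state that the kernel field $K$ satisfies

$$ K_{\phi(p)} \;=\; \phi_*\, K_p \qquad \textup{and} \qquad K_p(h\,v) \;=\; \rho_{\mathrm{out}}(h)\, K_p(v)\, \rho_{\mathrm{in}}(h)^{-1} \quad \textup{for all}\;\; h \in \mathrm{Stab}_p $$

for any isometry $\phi$ and any point $p$ with stabilizer subgroup $\mathrm{Stab}_p$; this is a loose paraphrase of the theorem, not its precise statement.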
Two examples are shown below.
The first one considers the rotational isometries of an egg-shaped manifold,
whose orbits are rings at different heights and the north and south pole.
In principle, equivariance does not require weight sharing across the whole manifold, but just on the rings, allowing for different kernels on different rings.
The stabilizer subgroups on the rings are trivial, leaving the kernels themselves unconstrained.
The second example considers additionally the egg's reflection isometries. The orbits remain the same rings, but the stabilizer subgroups become non-trivial reflection groups, which constrain the kernels to be reflection-steerable.

A special case are manifolds like Euclidean spaces or the sphere, whose isometry groups act transitively, i.e. relate any two points (homogeneous spaces). The whole manifold forms then a single orbit, such that the kernel is necessarily shared globally:
Isometry equivariant linear layers on homogeneous spaces are necessarily convolutions.
Isometry equivariance of convolutions
Coordinate independent convolutions rely on specific convolutional kernel fields, constructed by sharing a single $G$-steerable kernel relative to all frames of the $G$-structure.
As a first example, consider again the rotation-symmetric egg from above.


The observation that convolutional kernel fields inherit the symmetries of their underlying $G$-structure holds in general:
Coordinate independent CNNs are equivariant w.r.t. those isometries that are symmetries of the $G$-structure.
This allows us to design equivariant convolutions on manifolds by designing $G$-structures with the desired symmetries.
Diffeomorphism and affine group equivariance
Beyond isometries, one could consider general diffeomorphisms.
Any operation that acts pointwise, for instance bias summation or nonlinearities, is trivially equivariant under arbitrary diffeomorphisms; spatially extended kernels, in contrast, generally break diffeomorphism equivariance.
Specifically on Euclidean spaces, the Riemannian exponential map does not only commute with isometries in the Euclidean group $\mathrm{E}(d)$, but with arbitrary affine transformations, which explains the affine group equivariance of Euclidean steerable CNNs.
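This is easy to verify: on $\mathbb{R}^d$, the exponential map is simply $\exp_x(v) = x + v$, such that any affine transformation $\phi(x) = Ax + b$ satisfies

$$ \phi\big(\exp_x(v)\big) \;=\; A(x+v) + b \;=\; \exp_{\phi(x)}(A\,v)\,, $$

i.e. applying a kernel in the tangent space at $x$ and transforming the result agrees with applying the kernel at the transformed point.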
Our gauge theoretic formulation of feature vector fields and network layers is quite general:
we were able to identify more than 100 models from the literature as specific instantiations of coordinate independent CNNs.
While the authors did not formulate their models in terms of gauge theory, they are recovered in our framework by specific choices of manifold, $G$-structure, and field types.

The next sections give a brief overview of the different model categories in this literature review.
As the networks’ local and global equivariance properties correspond to symmetries of the underlying $G$-structures, the reviewed models are naturally categorized by their choice of manifold and structure group.
Euclidean steerable CNNs
All of the models in the first 30 lines of the table are steerable CNNs on Euclidean spaces, that is, conventional convolutions with $G$-steerable kernels.




The major new insight in comparison to the classical formulation of equivariant CNNs is that coordinate independent CNNs do not only describe the models’ global equivariance w.r.t. affine groups, but additionally their local gauge equivariance w.r.t. the structure group $G$.
For more details on Euclidean steerable CNNs, have a look at the third post of this series.
Polar and hyperspherical convolutions on Euclidean spaces


Instead of using polar coordinates with an isometric radial part, one may use log-polar coordinates (line 32), whose frames scale exponentially with the radius.
This renders the convolutions additionally equivariant w.r.t. scaling transformations.
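The underlying trick is that the logarithmic radial coordinate turns scalings into translations, with respect to which convolutions are equivariant by default:

$$ (r, \phi) \,\mapsto\, (\log r,\, \phi)\,, \qquad\quad r \mapsto s \cdot r \;\;\Longleftrightarrow\;\; \log r \mapsto \log r + \log s\,. $$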

Higher-dimensional analogs of these architectures on $\mathbb{R}^d$ rely on hyperspherical coordinates.
Spherical CNNs
Spherical CNNs are relevant for processing omnidirectional images from 360° cameras, the cosmic microwave background, or climate patterns on the earth’s surface. They come in two main flavors:
- Fully rotation equivariant spherical CNNs, which are based on rotation-symmetric $\mathrm{SO}(2)$-structures of the sphere.
- Azimuthally rotation equivariant spherical CNNs, which are based on frame fields ($\{e\}$-structures) that are symmetric only under rotations around one fixed axis.
The spherical geometry may furthermore be approximated by that of an
icosahedron (lines 39,40).
An advantage of this approach is that the icosahedron is locally flat and allows for an efficient implementation via Euclidean convolutions on the five visualized charts.
The non-trivial topology and geometry manifest themselves in parallel transporters of features along the cut edges (colored chart borders).
Icosahedral CNNs appear again in the two flavors above, with correspondingly different global equivariance properties.
Möbius CNNs

Assuming the strip to be flat (i.e. to have zero curvature), such convolutions are conveniently implemented in isometric coordinate charts.
When splitting the strip into two charts as shown below, their transition maps will at one end be trivial and at the other end involve a reflection.
In an implementation, one can glue the two chart codomains at their trivial transition into a single array, and account for the reflective transition at the other end by a suitably gauge transforming padding operation.
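The following numpy sketch illustrates such a padding operation; the array layout (width, length) and the function mobius_pad are my own assumptions for illustration, not the actual github implementation.

```python
import numpy as np

def mobius_pad(f, pad, field_type="scalar"):
    """Pad a feature field on a flat Mobius strip along its circular axis.

    The field is stored as an array of shape (width, length), where the
    length axis closes up to a reflection of the width axis.
    """
    left = f[:, -pad:]    # wrap-around from the far end of the strip
    right = f[:, :pad]
    # The transition map reflects the width axis ...
    left, right = left[::-1], right[::-1]
    # ... and sign-flip fields additionally change their sign.
    if field_type == "sign-flip":
        left, right = -left, -right
    return np.concatenate([left, f, right], axis=1)

f = np.random.default_rng(0).random((8, 32))  # a scalar feature field
f_padded = mobius_pad(f, pad=2)  # ready for a standard Euclidean convolution
```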
If you are interested in learning more about Möbius convolutions, check out our implementation on github and the explanations and derivations in chapter 10 of our book.
General surfaces

An alternative approach is to address the ambiguity of kernel alignments on surfaces by computing alignments via some heuristic.
The examples in line 45 of the table above do this by aligning kernels along the principal curvature directions of the surface's embedding in $\mathbb{R}^3$.
The main points discussed in this post are:
- The geometric alignment of convolution kernels on a manifold is often inherently ambiguous. This ambiguity can be identified with the gauge freedom of choosing reference frames.
- The specific level of ambiguity depends on the manifold's mathematical structure. $G$-structures disambiguate frames up to $G$-valued gauge transformations.
- Feature vectors and other mathematical objects on the manifold should be $G$-covariant, i.e. expressible relative to any frame from the $G$-structure (coordinate independent). They transform according to a $G$-representation $\rho$, called the field type. Gauge transformations of frames, tangent and feature vector coefficients are synchronized, that is, their fiber bundles are $G$-associated.
- In order for the spatial weight sharing of a kernel to remain coordinate independent, the kernel is required to be $G$-steerable, i.e. equivariant under gauge transformations. The same holds for other shared operations like bias summation or nonlinearities.
- A layer is isometry equivariant iff its neural connectivity is invariant under isometry actions.
- For convolutions, this neural connectivity is given by a kernel field whose symmetries coincide by construction with those of the $G$-structure. Convolutions are therefore equivariant under those isometries that are symmetries of the $G$-structure.
While being somewhat abstract, our differential geometric formulation of coordinate independent CNNs in terms of fiber bundles is highly flexible and allows us to unify a wide range of related work in a common framework.
It even includes completely non-equivariant models like those in line 45 of the table above: they correspond in our framework to heuristically fixed $\{e\}$-structures (frame fields) without any symmetries.
Of course there are neural networks for processing feature fields that are not explained by our formulation of coordinate independent CNNs. Such models could, for instance, rely on spectral operations, involve multi-linear correlators of feature vectors, operate on renderings, be based on graph neural networks, or on stochastic PDEs like diffusion processes, to name but a few alternatives.
Importantly, these approaches are compatible with our definition of feature spaces in terms of associated fiber bundles.
An interesting extension would be to formulate a differential version of coordinate independent CNNs, replacing our spatially extended steerable kernels by steerable partial differential operators. As mentioned above, this would allow for diffeomorphism equivariant CNNs.
Lastly, I would like to mention gauge equivariant neural networks for lattice gauge theories in fundamental physics, for instance (Boyda et al., 2021) or (Katsman et al., 2021). The main difference to our work is that their gauge transformations operate in an “internal” quantum space instead of spatial dimensions. However, both are naturally formulated in terms of associated fiber bundles. Their models are furthermore spatially equivariant and are in this sense compatible with our gauge equivariant CNNs.
Image references
- Lizards and butterflies adapted under the Creative Commons Attribution 4.0 International license by courtesy of Twitter.
- Mesh segmentation rendering from Sidi et al. (2021).
- Artery wall stress rendering from Shiba et al. (2017).
- Spacetime visualization from WGBH.
- Cosmic microwave background adapted from Tegmark et al. (2023).
- Electron microscopy of neural tissue adapted from the ISBI 2012 EM segmentation challenge.
- Lobsters adapted under the Apache license 2.0 by courtesy of Google.
- Owl teacher adapted from Freepik