The program consists of two parts, both made up of online virtual talks that are streamed on Zoom and recorded, with additional discussion happening on the Zulip chat.
Subscribe to our calendar!
A more irregular schedule of deep dives into specific topics of Category Theory, taught by invited experts in the area; some of these topics already have applications to Machine Learning, while others have yet to find applications there.
The lectures are finished for the moment, but you can still check out their recordings!
We had weekly introductory lectures, where we taught the basics of category theory with a focus on applications to Machine Learning.
November 14 | Neural network layers as parametric spans - Recording link and Slides |
Pietro Vertechi | |
Properties such as composability and automatic differentiation have made artificial neural networks a pervasive tool in applications. Tackling more challenging problems has caused neural networks to become progressively more complex and thus harder to define from a mathematical perspective. In this talk, we will discuss a general definition of linear layer arising from a categorical framework based on the notions of integration theory and parametric spans. This definition generalizes and encompasses classical layers (e.g., dense, convolutional), while guaranteeing existence and computability of the layer's derivatives for backpropagation. |
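One way to make the span-based picture concrete (a minimal sketch under our own naming, not code from the talk): a linear layer can be given by a span A <- E -> B together with a weight for each element of E, and different choices of E recover the dense and convolutional cases.

```python
# Sketch: a "span layer" A <- E -> B with one weight per element of E.
import numpy as np

def span_layer(left, right, weights, x, size_b):
    """Linear map sending a signal x on A to a signal on B:
    (Lx)[b] = sum over e with right[e] == b of weights[e] * x[left[e]]."""
    y = np.zeros(size_b)
    for e, (a, b) in enumerate(zip(left, right)):
        y[b] += weights[e] * x[a]
    return y

# Dense layer: E = A x B, one independent weight per pair (a, b).
A, B = 3, 2
left  = [a for a in range(A) for _ in range(B)]
right = [b for _ in range(A) for b in range(B)]
W = np.random.randn(A, B)
x = np.random.randn(A)
assert np.allclose(span_layer(left, right, W.flatten(), x, B), x @ W)

# 1-D convolution (no padding): E = {(a, b) : a - b in {0, 1, 2}},
# with weights shared along the kernel offsets.
k = np.array([1.0, -2.0, 0.5])
n = 6
left  = [b + i for b in range(n - 2) for i in range(3)]
right = [b     for b in range(n - 2) for i in range(3)]
w = np.tile(k, n - 2)
x = np.random.randn(n)
assert np.allclose(span_layer(left, right, w, x, n - 2),
                   np.convolve(x, k[::-1], mode="valid"))
```

In this toy reading, the derivative of the output with respect to weights[e] is just x[left[e]], which is what makes the layer's derivatives available for backpropagation regardless of the choice of E.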
November 21 | Causal Model Abstraction & Grounding via Category Theory - Recording link and Slides |
Taco Cohen | |
Causal models are used in many areas of science to describe data generating processes and reason about the effect of changes to these processes (interventions). Causal models are typically highly abstracted representations of the underlying process, consisting of only a few carefully selected variables and the causal mechanisms between them. This simplifies causal reasoning, but the relation between the model and the underlying system is never described in mathematical terms, and this has led to considerable philosophical confusion. Furthermore, it has made it hard to understand how causal modeling relates to other fields such as physics (where systems are described by dynamical laws without reference to causes), dynamical systems, and agent-centric frameworks such as Markov Decision Processes (MDPs). In this talk we study this idea of abstraction from a categorical perspective, focusing on two questions in particular:
|
December 12 | Category Theory Inspired by LLMs - Recording link and Slides |
Tai-Danae Bradley | |
The success of today's large language models (LLMs) is striking, especially given that the training data consists of raw, unstructured text. In this talk, we'll see that category theory can provide a natural framework for investigating this passage from texts—and probability distributions on them—to a more semantically meaningful space. To motivate the mathematics involved, we will open with a basic, yet curious, analogy between linear algebra and category theory. We will then define a category of expressions in language enriched over the unit interval and afterwards pass to enriched copresheaves on that category. We will see that the latter setting has rich mathematical structure and comes with ready-made tools to begin exploring that structure. |
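A rough sketch of the kind of enrichment involved, reconstructed from the speaker's published work (details in the talk may differ): objects are expressions (strings), and the hom-object between two expressions lives in the unit interval.

```latex
% Sketch of a [0,1]-enriched category of expressions (hedged reconstruction).
% The monoidal poset ([0,1], \le, \times, 1) plays the role that Set plays
% for ordinary categories.
\[
  \mathcal{L}(x, y) \;=\;
  \begin{cases}
    \pi(y \mid x) & \text{if } x \text{ is a substring of } y,\\[2pt]
    0 & \text{otherwise,}
  \end{cases}
\]
% where \pi(y \mid x) is the probability that an expression containing x
% extends to y.  Enriched composition is the inequality
\[
  \mathcal{L}(y, z) \cdot \mathcal{L}(x, y) \;\le\; \mathcal{L}(x, z),
\]
% which holds because probabilities of successive extensions multiply.
% An enriched copresheaf is then a [0,1]-functor out of \mathcal{L},
% assigning each expression a value compatibly with these probabilities.
```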
March 20 | Introduction to Categorical Cybernetics - Recording link and Slides |
Jules Hedges | |
Categorical cybernetics is based on two things: (1) the abstract theory of categories of optics and related things, and (2) a whole bunch of specific examples.
These tend to arise in topics that historically were called "cybernetics" (before that term drifted beyond recognition) - AI, control theory, game theory, systems theory.
Specific examples of "things that compose optically" are derivatives (well known as backprop), exact and approximate Bayesian inverses, payoffs in game theory, values in control theory and reinforcement learning, updates of data (the original setting for lenses), and updates of state machines.
I'll do a gentle tour through these, emphasising their shared structure and the field we're developing to study it. The talk will cover material related to the paper Towards Foundations of Categorical Cybernetics |
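As a concrete illustration of "composing optically" (a minimal sketch with illustrative names, not code from the talk or the paper): a lens packages a forward map with a backward map, and sequential composition of lenses reproduces the chain rule of reverse-mode differentiation.

```python
# A lens (one flavour of optic): forward : A -> B, backward : (A, dB) -> dA.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Lens:
    forward: Callable[[Any], Any]           # get / forward pass
    backward: Callable[[Any, Any], Any]     # put / backward pass

    def __rshift__(self, other: "Lens") -> "Lens":
        """Sequential composition: run self then other; feedback flows back."""
        def fwd(a):
            return other.forward(self.forward(a))
        def bwd(a, db):
            return self.backward(a, other.backward(self.forward(a), db))
        return Lens(fwd, bwd)

# Reverse-mode differentiation as lens composition: each backward pass
# multiplies the incoming sensitivity by the local derivative.
square = Lens(lambda x: x * x, lambda x, dy: 2 * x * dy)
double = Lens(lambda x: 2 * x, lambda x, dy: 2 * dy)

f = square >> double                # f(x) = 2 * x^2
print(f.forward(3.0))               # 18.0
print(f.backward(3.0, 1.0))         # 12.0 == d/dx (2 x^2) at x = 3
```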
March 27 | Dynamic organizational systems: from deep learning to prediction markets - Recording link and Slides |
David Spivak | |
In training artificial neural networks (ANNs), both neurons and arbitrary populations of neurons can be seen to perform the same type of task. Indeed, at any given moment they provide a function A-->B, and given any input from A and loss signal on B, they do two things: provide an updated function A-->B and backpropagate a loss signal on A. Populations of neurons, which we called "Learners", can be put together in series or in parallel, forming a symmetric monoidal category. However, ANNs satisfy an additional property: there is a consistent method by which the functions update and errors backpropagate; namely, they all use gradient descent. The chain rule implies that the composite of gradient descenders is again a gradient descender. In this talk I will discuss a generalization called "dynamic organizational systems", which includes ANNs, prediction markets, Hebbian learning, and strategic games. It is founded on the category Poly of polynomial functors, which generalizes Lens. I will review the relevant background on Poly and then explain dynamic organizational systems as coherent procedures by which a network of component systems can rewire its network structure in response to the data flowing through it. I'll explain the ANN case, and possibly the prediction market case, time permitting. The talk will cover material related to the paper Dynamic categories, dynamic operads: From deep learning to prediction markets |
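A minimal sketch of the Learner picture described above (illustrative names and a one-parameter toy model, not the talk's construction in Poly): each component exposes a function, updates itself by gradient descent, and passes a loss signal backwards, and series composition is exactly the chain rule.

```python
from dataclasses import dataclass

@dataclass
class Learner:
    w: float                                 # parameter
    lr: float = 0.1

    def forward(self, a):                    # current function A -> B
        return self.w * a

    def step(self, a, db):
        """Gradient-descent update; returns the backpropagated signal on A."""
        da = self.w * db                     # signal passed back on A
        self.w -= self.lr * a * db           # d(w * a)/dw = a
        return da

def series(first: Learner, second: Learner):
    """One step of two learners composed in series on input a, loss signal dc."""
    def step(a, dc):
        b = first.forward(a)
        db = second.step(b, dc)              # second updates, passes signal back
        return first.step(a, db)             # first updates via the chain rule
    return step

f, g = Learner(w=2.0), Learner(w=-1.0)
step = series(f, g)
print(step(1.5, 1.0))   # loss signal on A; both f.w and g.w have been updated
```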
May 29 | Sheaves for AI - Recording link and Slides |
Thomas Gebhart | |
Many data-generating systems studied within machine learning derive global semantics from a collection of complex, local interactions among subsets of the system’s atomic elements. In order to properly learn representations of such systems, a machine learning algorithm must have the capacity to faithfully model these local interactions while also ensuring the resulting representations fuse properly into a consistent whole. In this talk, we will see that cellular sheaf theory offers an ideal algebro-topological framework for both reasoning about and implementing machine learning models on data which are subject to such local-to-global constraints over a topological space. We will introduce cellular sheaves from a categorical perspective before turning to a discussion of sheaf (co)homology as a semi-computable tool for implementing these categorical concepts. Finally, we will observe two practical applications of these ideas in the form of sheaf neural networks, a generalization of graph neural networks for processing sheaf-valued signals; and knowledge sheaves, a sheaf-theoretic reformulation of knowledge graph embedding. |
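A minimal sketch of the local-to-global machinery mentioned here (illustrative, with our own naming): a cellular sheaf on a graph, its coboundary, and the sheaf Laplacian whose kernel consists of the globally consistent vertex assignments.

```python
# A cellular sheaf on a graph assigns a vector space to each vertex and edge,
# plus a restriction map from each incident vertex into the edge space.
# The coboundary compares the two restrictions across each edge; its kernel
# is H^0 (consistent global assignments), and L = delta^T delta is the sheaf
# Laplacian used by sheaf neural networks (which recover graph neural
# networks when all restriction maps are identities).
import numpy as np

d = 2                                          # stalk dimension everywhere
edges = [(0, 1), (1, 2)]                       # a path graph on 3 vertices
n_v, n_e = 3, len(edges)

rng = np.random.default_rng(0)
# restriction[(v, e)] : F(v) -> F(e), one map per vertex-edge incidence
restriction = {(v, e): rng.standard_normal((d, d))
               for e, (u, w) in enumerate(edges) for v in (u, w)}

# Coboundary delta: (delta x)_e = F_{u<e} x_u - F_{w<e} x_w for e = (u, w)
delta = np.zeros((n_e * d, n_v * d))
for e, (u, w) in enumerate(edges):
    delta[e*d:(e+1)*d, u*d:(u+1)*d] = restriction[(u, e)]
    delta[e*d:(e+1)*d, w*d:(w+1)*d] = -restriction[(w, e)]

L = delta.T @ delta                            # sheaf Laplacian
# x is a global section iff L x = 0, i.e. the vertex data agree on every edge.
print(np.linalg.matrix_rank(L), "<=", n_v * d)
```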
Week of October 10 | Week 1: Why Category Theory? - Recording link and Slides |
Bruno Gavranović | |
By the end of this week you will:
|
Week of October 17 | Week 2: Essential building blocks: Categories and Functors - Recording link and Slides |
Petar Veličković | |
By the end of this week you will:
|
Week of October 24 | Week 3: Categorical Dataflow: Optics and Lenses as data structures for backpropagation - Recording link and Slides |
Bruno Gavranović | |
By the end of this week you will:
|
Week of October 31 | Week 4: Geometric Deep Learning & Naturality - Recording link and Slides |
Pim de Haan | |
By the end of this week you will:
|
Week of November 7 | Week 5: Monoids, Monads, Mappings, and LSTMs - Recording link and Slides |
Andrew Dudzik | |
By the end of this week you will:
|
Design by Mike Pierce |