Chapter 5

Basic Category Theory

…We know only a very few—and, therefore, very precious—schemes whose unifying powers cross many realms.”—Marvin Minsky.1

Categories, or an equivalent notion, have already been introduced as ologs, or equivalently, as database schemas. One can think of a category as a graph (as in Section 4.3) in which certain paths have been declared congruent. (Ologs demand an extra requirement that everything be readable in natural language, and this cannot be part of the mathematical definition of category.) The formal definition of category is given in Definition, but it will not appear obvious that it is equivalent to the graph + congruence notion of schema, found in Definition Once we know how different categories can be compared using functors (Definition, and how different schemas can be compared using schema mappings (Definition, we prove that the two notions are indeed equivalent (Theorem

5.1   Categories and functors

This section gives the standard definition of categories and functors. These, together with natural transformations (Section 5.3), form the backbone of category theory. It also gives several examples.

5.1.1   Categories

In everyday speech we think of a category as a kind of thing. A category consists of a collection of things, all of which are related in some way. In mathematics a category can also be construed as a collection of things and a type of relationship between pairs of such things. For this kind of thing-relationship duo to count as a category, we need to check two rules, which have the following flavor: every thing must be related to itself by simply being itself, and if one thing is related to another and the second is related to a third, then the first is related to the third. In a category the things are called objects and the relationships are called morphisms.

So far we have discussed things of various sorts, e.g., sets, monoids, graphs. In each case we discussed how such things should be appropriately compared as homomorphisms. In each case the things stand as the objects and the appropriate comparisons stand as the morphisms in the category. Here is the definition.

Definition A category C is defined as follows: One announces some constituents (A. objects, B. morphisms, C. identities, D. compositions) and shows that they conform to some laws (1. identity law, 2. associativity law). Specifically, one announces

A. a collection Ob(C), elements of which are called objects;

B. for every pair x, yOb(C), a set HomC(x,y)Set; it is called the hom-set from x to y; its elements are called morphisms from x to y;2

C. for every object xOb(C), a specified morphism, denoted idxHomC(x,x), and called the identity morphism on x;

D. for every three objects x,y,zOb(C), a function


called the composition formula.

Given objects x, y ∈ Ob(C), we can denote a morphism fHomC(x,y) by f: xy; we say that x is the domain of f and that y is the codomain of f. Given also g: yz, the composition formula is written using infix notation, so gf: xz means (g,f)HomC(x,z).

One must then show that the following category laws hold:

  1. For every x, y ∈ Ob(C) and every morphism f: xy, we have


  2. If w, x, y, z ∈ Ob(C) are any objects, and f: wx, g: xy, and h: yz are any morphisms, then the two ways to compose yield the same element in HomC(w,z):


Remark There is perhaps much that is unfamiliar about Definition, but there is also one thing that is strange about it. The objects Ob(C) of C are said to be a collection rather than a set. This is because we sometimes want to talk about the category of all sets, in which every possible set is an object, and if we try to say that the collection of sets is itself a set, we run into Russell’s paradox. Modeling this was a sticking point in the foundations of category theory, but it was eventually fixed by Grothendieck’s notion of expanding universes. Roughly, the idea is to choose some huge set κ (with certain properties making it a universe), to work entirely inside of it when possible, and to call anything in that world κ-small (or just small if κ is clear from context). When we need to look at κ itself, we choose an even bigger universe κ′ and work entirely within it.

A category in which the collection Ob(C) is a set (or a small set) is called a small category. From here on I do not take note of the difference; I refer to Ob(C) as a set. I do not think this will do any harm to scientists using category theory, at least not in the beginning phases of their learning.

Example (The category Set of sets). Chapters 2 and 3 were about the category of sets, denoted Set. The objects are the sets and the morphisms are the functions; and the current notation HomSet(X, Y) was used to refer to the set of functions XY. The composition formula ○ is given by function composition, and for every set X, the identity function idX: XX serves as the identity morphism for X ∈ Ob(Set). The two laws clearly hold, so Set is indeed a category.

Example (The category Fin of finite sets). Inside the category Set is a subcategory FinSet, called the category of finite sets. Whereas an object S ∈ Ob(Set) is a set that can have arbitrary cardinality, Fin is defined such that Ob(Fin) includes all (and only) those sets S having finitely many elements, i.e., |S| = n for some natural number n ∈ ℕ. Every object of Fin is an object of Set, but not vice versa.

Although Fin and Set have different collections of objects, their notions of morphism are in some sense the same. For any two finite sets S, S′ ∈ Ob(Fin), we can also think of S, S′ ∈ Ob(Set), and we have


That is, a morphism in Fin between finite sets S and S′ is simply a function f: SS′.

Example (The category Mon of monoids). Monoids were defined in Definition, and monoid homomorphisms in Definition Every monoid M(M,e,M) has an identity homomorphism idM:MM, given by the identity function idM: MM. To compose two monoid homomorphisms f:MM and g:MM, we compose their underlying functions f: MM′ and g: M′ → M″, and check that the result gf is a monoid homomorphism. Indeed,


gf(m1Mm2)=g(f(m1)Mf(m2))=gf(m1)M gf(m2).

It is clear that the two category laws (unit and associativity) hold, because monoid morphisms are special kinds of functions, and functions compose unitally and associatively. So Mon is a category.

Remark The following will be informal, but it can be formalized. Let’s define a questionable category to be the specification of A, B, C, D from Definition, without enforcing either of the category laws (1, 2). Suppose that Q is a questionable category and C is a category. If Q sits somehow inside of C, in the precise sense that

A. there is a function U:Ob(Q)Ob(C),

B. for all a,bOb(Q), we have an injection U:HomQ(a,b)HomC(U(a),U(b)),

C. for all aOb(Q), both Q and C have the same version of the identity on a, i.e., U(ida) = idU(a),

D. for all f: ab and g: bc in Q, both Q and C have the same version of composition gf, i.e., U(gf) = U(g) ○ U(f),

then Q is a category (no longer questionable).

This fact was used in Example for MonSet.


Suppose we set out to define a category Grp, having groups as objects and group homomorphisms as morphisms (see Definition Show that the rest of the conditions for Grp to be a category are satisfied.


Suppose we set out to define a category PrO, having preorders as objects and preorder homomorphisms as morphisms (see Definition Show (to the level of detail of Example that the rest of the conditions for PrO to be a category are satisfied.

Example (Noncategory 1). What is not a category? Two things can go wrong: either one fails to specify all the relevant constituents (A, B, C, D from Definition, or the constituents do not obey the category laws (1, 2).

Let G be the following graph:


Suppose we try to define a category G by faithfully recording vertices as objects and arrows as morphisms. Will that be a category?

Following that scheme, we put Ob(G)={a,b,c}. For all nine pairs of objects we need a hom-set. Since the only things we are calling morphisms are the arrows of G, we put


If we say we are done, the listener should object that we have given neither identities (C) nor a composition formula (D), and these are necessary constituents. Now we are at a loss: it is impossible to give identities under this scheme, because, e.g., HomG(a,a)=. So what we have for G is not a category.

Suppose we fix that problem, adding an element to each of the diagonals so that


But the listener still demands a composition formula. In particular, we need a function


but the domain is nonempty (it is {(f, g)}) and the codomain HomG(a,c)= is empty; there is no such function. In other words, to satisfy the listener we need to add a composite for the arrows f and g.

So again we must make a change, adding an element to make HomG(a,c)={h}. We can now say gf = h. Finally, this does the trick and we have a category with the following morphisms:


A computer could check this quickly, as can someone with good intuition for categories; for everyone else, it may be a painstaking process involving determining whether there is a unique composition formula for each of the 27 pairs of hom-sets and whether the associative law holds in the 81 necessary cases. Luckily this computation is sparse (lots of ∅’s).

If all the morphisms are drawn as arrows, the graph becomes:


Example (Noncategory 2). In this example, we make a faux category F with one object and many morphisms. The problem here is the composition formula.

Define F to have one object Ob(F)={}, and HomF(,)= . Define id= 1 ∈ ℕ. Define the composition formula ○: ℕ × ℕ → ℕ by the usual exponentiation function for natural numbers, mn = mn. This is a perfectly cromulent function, but it does not work right as a composition formula. Indeed, for the identity law to hold, we would need m1 = m = 1m, and one side of this is false. For the associativity law to hold, we would need (mn)p=m(np), but this is also not the case.

To fix this problem we must completely revamp the composition formula. It would work to use multiplication, mn = m * n. Then the identity law would read 1 * m = m = m * 1, and that holds; and the associativity law would read (m * n) * p = m * (n * p), and that holds.

Example (The category of preorders with joins). Suppose we are only interested in preorders (X, ⩽) for which every pair of elements has a join. We saw in Exercise that not all preorders have this property. However, we can create a category C in which every object does have this property. To begin, let’s put


for the set of objects. What about morphisms?

One option would be to put in no morphisms (other than identities) and to just consider this collection of objects as having no structure other than a set. In other words, we can take C to be the discrete category on the preceding set Ob(C)=C.

Another option, say, C with objects Ob(C)C, would be to put in exactly the same morphisms as in PrO: for any objects a, bC, we consider a and b as ordinary preorders and put HomC(a,b)HomPrO(a,b). The resulting category C of preorders with joins is called the full subcategory of PrO spanned by the preorders with joins.3

A third option, say, C with objects Ob(C)C, would stand out to a category theorist. That is, the conscientious modeler takes the choice about how we define objects as a clue to how we should define morphisms.


If you like joins so much, why don’t you marry them?

Morphisms are often billed as preserving all the structure we care about, so it is worth asking whether we want to enforce that constraint on morphisms. That is, suppose f: (X, ⩽X) → (Y, ⩽Y) is a morphism of preorders. We might want to condition the decision of whether to include f as a morphism in C on whether, for any join w = xx′ in X, it is the case that f(w) = f(x) ∨ f(x′) in Y. Concisely, we could define the morphisms in C by


One can check easily that the identity morphisms preserve joins and that compositions of join-preserving morphisms are join-preserving, so this version of homomorphisms makes C a well defined category.

These options are by no means comprehensive, and none of these options is better than any other. Which category to use is decided by whatever fits the situation being modeled.

Example (Category FLin of finite linear orders). We have a category PrO of preorders, and some of its objects are finite linear orders. Let FLin be the full subcategory of PrO spanned by the linear orders. That is, following Definition, given linear orders X, Y ∈ Ob(FLin), every morphism of preorders XY counts as a morphism in FLin:



Let FLin be the category of finite linear orders, defined in Example For n ∈ ℕ, let [n] be the linear order defined in Example What are the cardinalities of the following sets?

a. HomFLin([0], [3])

b. HomFLin([3], [0])

c. HomFLin([2], [3])

d. HomFLin([1], [n])

e. (Challenge) HomFLin([m], [n])

It turns out that the category FLin of linear orders is sufficiently rich that much of algebraic topology (the study of arbitrary spaces, such as Mobius strips and seven-dimensional spheres) can be understood in its terms. See Example

Example (Category of graphs). Graphs were defined in Definition and graph homomorphisms in Definition To see that these are sufficient to form a category is considered routine to a seasoned category theorist, so let’s see why.

Since a morphism from G = (V, A, src, tgt) to G = (V′, A′, src′, tgt′) involves two functions f0: VV′ and f1: AA′, the identity and composition formulas simply arise from the identity and composition formulas for sets. Associativity follow similarly. The only thing that needs to be checked is that the composition of two such morphisms, each satisfying (4.5), will itself satisfy (4.5). For completeness, we check that now.

Suppose that f=(f0,f1):GG and g=(g0,g1):GG are graph homomorphisms, where G=(V,A,src,tgt). Then in each diagram in (5.2)


the left-hand square commutes because f is a graph homomorphism and the right-hand square commutes because g is a graph homomorphism. Thus the whole rectangle commutes, meaning that gf is a graph homomorphism, as desired.

We denote the category of graphs and graph homomorphisms Grph.

Remark When one is struggling to understand basic definitions, notation, and style, a phase that naturally occurs when learning new mathematics (or any new language), the preceding example will probably appear long and tiring. I would say the reader has mastered the basics when the example seems straightforward. Around this time, I hope the reader will get a sense of the remarkable organizational potential of the categorical way of thinking.


Let F be a vector field defined on all of ℝ2. Recall that for two points x, x′ ∈ ℝ2, any curve C with endpoints x and x′, and any parameterization r: [a, b] → C, the line integral ∫C F(rdr returns a real number. It does not depend on r, except its orientation (direction). Therefore, if we think of C has having an orientation, say, going from x to x′, then ∫C F is a well defined real number. If C goes from x to x′, let’s write C: xx′. Define an equivalence relation ∼ on the set of oriented curves in ℝ2 by saying CC′ if

  • C and C′ start at the same point;
  • C and C′ end at the same point;
  • C F = ∫C′ F.

Suppose we try to make a category CF as follows. Put Ob(CF)=2, and for every pair of points x, x′ ∈ ℝ2, let HomCF(x,x)={C:xx}/~, where C: xx′ is an oriented curve and ∼ means “same line integral,” as explained.

Is there an identity morphism and a composition formula that will make CF into a category?


Yes. For every object x ∈ ℝ2, the constant curve at x serves as the identity on x. If C: xy and C′: yz are curves, their composition is given by joining them to get a curve xz.   Isomorphisms

In any category we have a notion of isomorphism between objects.

Definition Let C be a category, and let X,YOb(C) be objects. An isomorphism f from X to Y is a morphism f: XY in C such that there exists a morphism g: YX in C with


In this case we say that the morphism f is invertible and that g is the inverse of f. We may also say that the objects X and Y are isomorphic.

Example If C=Set is the category of sets, then Definition coincides precisely with the one given in Definition


Let C be a category, and let c ∈ Ob(C) be an object. Show that idc is an isomorphism.


We have a morphism idc: cc. To show it is an isomorphism we just need to find a morphism f: cc such that f ○ idc = idc and idcf = idc. Taking f = idc works.


Let C be a category, and let f: XY be a morphism. Suppose that both g: YX and g′: YX are inverses of f. Show that they are the same morphism, g = g′.


Suppose that G = (V, A, src, tgt) and G′ = (V′, A′, src′, tgt′) are graphs and that f = (f0, f1): GG′ is a graph homomorphism (as in Definition

a. If f is an isomorphism in Grph, does this imply that f0: VV′ and f1: AA′ are isomorphisms in Set?

b. If so, why; if not, show a counterexample (where f is an isomorphism but either f0 or f1 is not).


Suppose that G = (V, A, src, tgt) and G′ = (V′, A′, src′, tgt′) are graphs and that f = (f0, f1): GG′ is a graph homomorphism (as in Definition

a. If f0: VV′ and f1: AA′ are isomorphisms in Set, does this imply that f is an isomorphism in Grph?

b. If so, why; if not, show a counterexample (where f0 and f1 are isomorphisms but f is not).

Proposition Let C be a category, and letbe the relation on Ob(C) given by saying XY iff X and Y are isomorphic. Thenis an equivalence relation.

Proof. The proof of Proposition can be mimicked in this more general setting.   Another viewpoint on categories

Here is an alternative definition of category, using the work done in Chapter 2.


Suppose we begin our definition of category as follows.

A category C consists of a sequence (Ob(C),HomC,dom,cod,ids,comp), where

  • Ob(C) is a set;4
  • HomC is a set, and dom,cod:HomCOb(C) are functions;
  • ids: Ob(C)HomC is a function;
  • comp is a function as depicted in the commutative diagram (5.3)


a. Add to diagram (5.3) to express the fact that for any xOb(C), the morphism idx points from x to x.

b. Express the condition that composing a morphism f with an appropriate identity morphism yields f.


a. This is expressed by the equations: domids=idOb(C) and codids=idOb(C). One could express this with the diagram:


b. We have idHomC:HomCHomC and idscod:HomCHomC, and these commute over Ob(C), meaning that for any morphism f: AB, its codomain is the domain of idB. Thus a unique map


is induced (see Proposition Similarly there is a function


When we compose either of these morphisms with comp, we are taking the composition of a morphism and the identity (either on the domain or the codomain). Thus, the fact that composing any morphism with an identity morphism returns that morphism is expressed by asserting two path equivalences,


in the following diagram:


Example (Partial olog for a category). Diagram (5.4) is an olog that captures some of the essential structures of a category:


Missing from (5.4) is the notion of identity morphism (as an arrow from ⌜an object of C⌝ to ⌜a morphism in C⌝) and the associated path equivalences, as well as the identity and associativity laws. All of these can be added to the olog, at the expense of some clutter.

Remark Perhaps it is already clear that category theory is very interconnected. It may feel like everything relates to everything, and this feeling may intensify as you go on. However, the relationships between different notions are rigorously defined, not random. Moreover, almost everything presented in this book can be formalized in a proof system like Coq (the most obvious exceptions being things like the readability requirement of ologs and the modeling of scientific applications).

Whenever you feel cognitive vertigo, use the interplay between examples and formal definitions to solidify your understanding. Go through each example, making sure it conforms to the definitions or theorems it purports to exemplify.

5.1.2   Functors

A category C=(Ob(C),HomC,dom,cod,ids,comp), involves a set of objects, a set of morphisms, a notion of domains and codomains, a notion of identity morphisms, and a composition formula. For two categories to be comparable, these various components should be appropriately comparable.

Definition Let C and C be categories. A functor F from C to C, denoted F:CC, is defined as follows: One announces some constituents (A. on-objects part, B. on-morphisms part) and shows that they conform to some laws (1. preservation of identities, 2. preservation of composition). Specifically, one announces

A. a function Ob(F):Ob(C)Ob(C), sometimes denoted simply F:Ob(C)Ob(C);

B. for every pair of objects c,dOb(C), a function


sometimes denoted simply F:HomC(c,d)HomC(F(c),F(d)).

One must then show that the following functor laws hold:

  1. Identities are preserved by F, that is, for any object cOb(C), we have F(idc) = idF(c).
  2. Composition is preserved by F, that is, for any objects b,c,dOb(C) and morphisms g: bc and h: cd, we have F(hg) = F(h) ○ F(g).

Example (Monoids have underlying sets). Recall from Definition that if M = (M, e, ⋆) is a monoid, then M is a set. And recall from Definition that if f:MM is a monoid homomorphism, then f: MM′ is a function. Thus we can define a functor


The on-objects part of U sends every monoid to its underlying set, U(M)=M, and sends every monoid homomorphism to its underlying function U(f) = f. It is easy to check that the functor laws hold, so U is indeed a functor.

Given two monoids M=(M,e,) and M=(M,e,), there may be many functions from M to M′ that do not arise from monoid homomorphisms. In other words, U:HomMon(M,M)HomSet(M,M) may not be surjective. It is often useful to speak of such functions. For example, one could assign to every command in one video game V a command in another video game V′, but this may not work in accordance with the monoid laws when performing a sequence of commands. By being able to speak of M as a set or of M as a monoid, and understanding the relationship U between them, we can be clear about where we stand at all times in the discussion.

Example (Groups have underlying monoids). Recall that a group is just a monoid (M, e, ⋆) with the extra property that every element mM has an inverse m′ ⋆ m = e = mm′. Thus to every group we can assign its underlying monoid. Similarly, a group homomorphism is just a monoid homomorphism of its underlying monoids. This means that there is a functor


that sends every group or group homomorphism to its underlying monoid or monoid homomorphism. Identity and composition are preserved.

Application Suppose you are a scientist working with symmetries. But then suppose that the symmetry breaks somewhere, or you add some extra observable that is not reversible under the symmetry. You want to seamlessly relax the requirement that every action be reversible without changing anything else. You want to know how you can proceed, or what is allowed. The answer is to simply pass from the category of groups (or group actions) to the category of monoids (or monoid actions).

We can also reverse this change of perspective. Recall that Example discussed a monoid M controlling the actions of a video game character. The character position (P) could be moved up (u), moved down (d), or moved right (r). The path equivalences P.u.d = P and P.d.u = P imply that these two actions are mutually inverse, whereas moving right has no inverse. This, plus equivalences P.r.u = P.u.r and P.r.d = P.d.r, defined a monoid M.

Inside M is a submonoid G, which includes just upward and downward movement. It has one object, just like M, i.e., Ob(M) = {P} = Ob(G). But it has fewer morphisms. In fact, there is a monoid isomorphism G ≅ ℤ because we can assign to any movement in G the number of ups, e.g., P[u, u, u, u, u] is assigned the integer 5, P[d, d, d] is assigned the integer −3, and P[d, u, u, d, d, u] is assigned the integer 0 ∈ ℤ. But ℤ is a group, because every integer has an inverse.

The upshot is that we can use functors to compare groups and monoids.


Out of all our available actions, some are reversible.

Example Recall that we have a category Set of sets and a category Fin of finite sets. We said that Fin was a subcategory of Set. In fact, we can think of this subcategory relationship in terms of functors, just as we thought of the subset relationship in terms of functions in Example Recall that if we have a subset SS′, then every element sS is an element of S′, so we make a function f: SS′ such that f(s) = sS′.

To give a functor i: FinSet, we have to announce how it works on objects and how it works on morphisms. We begin by announcing a function i: Ob(Fin) → Ob(Set). By analogy with the preceding, we have a subset Ob(Fin) ⊆ Ob(Set). Hence every element s ∈ Ob(Fin) is an element of Ob(Set), so we put i(s) = s. We also have to announce, for each pair of objects s, s′ ∈ Ob(Fin), a function


But again, that is easy because we know by definition (see Example that these two sets are equal, HomFin(s, s′) = HomSet(s, s′). Hence we can simply take i to be the identity function on morphisms. It is evident that identities and compositions are preserved by i. Therefore, we have defined a functor i.

Remark Recall that any group is just a monoid, except that it has an extra property: every element has an inverse. Thus one can start with a group, “forget” the fact that it is a group and remember only that it is a monoid. Doing this is functorial—Example discussed it as a functor U: GrpMon. We say that U is a forgetful functor. There is also a forgetful functor MonSet and so GrpSet.


You can use a smartphone as a paperweight.

Colloquially, people often say things like, “Carol wears many hats” to mean that Carol acts in different roles, even though substantively she is somehow the same. The hat Carol currently wears is the analogous to the category, or context of interaction, that she is currently in.


A partial order is just a preorder with a special property. A linear order is just a partial order with a special property.

a. Is there a useful functor FLinPrO?

b. Is there a useful functor PrOFLin?

Proposition (Preorders to graphs). Let PrO be the category of preorders and Grph be the category of graphs. There is a functor P: PrOGrph such that for any preorder X=(X,), the graph P(X) has vertices X.

Proof. Given a preorder X=(X,X), we can make a graph F(X) with vertices X and an arrow xx′ whenever xX x′, as in Remark More precisely, the preorder ⩽X is a relation, i.e., a subset RXX×X, which we think of as a function i:RXX×X. Composing with projections π1, π2: X × XX gives


Then we put F(X)(X,RX,srcX,tgtX). This gives us a function F: Ob(PrO) → Ob(Grph).

Suppose now that f:XY is a preorder morphism, where Y=(Y,Y). This is a function f: XY such that for any (x, x′) ∈ X ×X, if xX x′, then f(x) ⩽ f(x′). But that is the same as saying that there exists a dotted arrow making the following diagram of sets commute


(Note that there cannot be two different dotted arrows making that diagram commute because RYY×Y is a monomorphism.) This commutative square is precisely what is needed for a graph homomorphism, as shown in Exercise Thus, we have defined F on objects and on morphisms. It is clear that F preserves identity and composition.


Proposition gave a functor P: PrOGrph.

a. Is every graph G ∈ Ob(Grph) in the image of P (or more precisely, is the function



b. If so, why; if not, name a graph not in the image.

c. Suppose that G′ and H′ are preorders with graph formats P(G′) = G and P(H′) = H. Is every graph homomorphism f: GH in the image of


In other words, does every graph homomorphism between G and H come from a preorder homomorphism between G′ and H′?

Remark There is a functor W: PrOSet sending (X, ⩽) to X. There is a functor T: GrphSet sending (V, A, src, tgt) to V. When we study the category of categories (see Section, it will be clear that Proposition can be summarized as a commutative triangle in Cat,



Recall from (2.3) that every function f: AB has an image, imf(A) ⊆ B. Use this idea and Example to construct a functor Im: GrphPrO such that for any graph G = (V, A, src, tgt), the vertices of G are the elements of Im(G). That is, find some ordering ⩽G, such that we have Im(G) = (V, ⩽G).


Suppose given an object G ∈ Ob(Grph), i.e., a graph G = (V, A, src, tgt). The source and target functions combine to give a function 〈src, tgt〉: AV × V. Its image is a subset RV × V, i.e., a binary relation. But R is not necessarily a preorder. We can remedy that by using the preorder R¯ generated by R, as in Example On objects we put Im(G) ≔ R¯. One way to understand this preorder is that it has as elements V, the vertices of G, and it has vv′ if and only if there exists a path from v to v′ in G.

Given a morphism f: GG′, we need to provide a preorder morphism Im(G) → Im(G′). The obvious choice is to use f0 (what f does on vertices), but we need to check that it preserves the order. This is clear because graph morphisms send paths to paths—if there was a path from v to v′ in G, there will be one from f(v) to f(v′). We need to check that Im(idG) = idIm(G), but this is straightforward.


In Exercise you constructed a functor Im: GrphPrO. What is the preorder Im(G) when G ∈ Ob(Grph) is the following graph?



Consider the functor Im: GrphPrO constructed in Exercise

a. Is every preorder XOb(PrO) in the image of Im (or more precisely, in the image of Ob(Im): Ob(Grph) → Ob(PrO))?

b. If so, why; if not, name a preorder not in the image.

c. Suppose that X,YOb(Grph) are graphs, with XIm(X) and YIm(Y) in the preorder format. Is every preorder morphism f:XY in the image of


In other words, does every preorder homomorphism between X and Y come from a graph homomorphism between X and Y?


We have functors P: PrOGrph and Im: GrphPrO.

a. What can you say about ImP: PrOPrO?

b. What can you say about PIm: GrphGrph?


Consider the functors P: PrOGrph and Im: GrphPrO. And consider the chain graph [n] of length n from Example and the linear order [n] of length n from Example To differentiate the two, let’s rename them for this exercise as [n]Grph ∈ Ob(Grph) and [n]PrO ∈ Ob(PrO). We see a similarity between [n]Grph and [n]PrO, and we might hope that the functors help formalize this similarity. That is, we might hope that one of the following hold:

P([ n ]PrO)?[ n ]GrphorIm([ n ]Grph)?[ n ]PrO.

Do either, both, or neither of these hold?

Remark In the course announcement for MIT’s 18-S996 course, I wrote the following:

It is often useful to focus one’s study by viewing an individual thing, or a group of things, as though it exists in isolation. However, the ability to rigorously change our point of view, seeing our object of study in a different context, often yields unexpected insights. Moreover, this ability to change perspective is indispensable for effectively communicating with and learning from others. It is the relationships between things, rather than the things in and by themselves, that are responsible for generating the rich variety of phenomena we observe in the physical, informational, and mathematical worlds.

This holds at many different levels. For example, one can study a group (in the sense of Definition in isolation, trying to understand its subgroups or its automorphisms, and this is mathematically interesting. But one can also view it as a quotient of something else, or as a subgroup of something else. One can view the group as a monoid and look at monoid homomorphisms to or from it. One can look at the group in the context of symmetries by seeing how it acts on sets. These changes of viewpoint are all clearly and formally expressible within category theory. We know how the different changes of viewpoint compose and how they fit together in a larger context.


a. Is the preceding quotation also true in your scientific discipline of expertise? How so?

b. Can you imagine a way that category theory can help catalogue the kinds of relationships or changes of viewpoint that exist in your discipline?

c. What kinds of structures that you use often deserve to be better formalized?

Example (Free monoids). Let G be a set. Definition defined a monoid List(G), called the free monoid on G. Given a function f: GG′, there is an induced function List(f): List(G) → List(G′), and this preserves the identity element [ ] and concatenation of lists, so List(f) is a monoid homomorphism. It is easy to check that List: SetMon is a functor.

Application Application discussed an isomorphism NucDNA ≅ NucRNA given by RNA transcription. Applying the functor List, we get a function


which will send sequences of DNA nucleotides to sequences of RNA nucleotides, and vice versa. This is performed by polymerases.


Let G = {1, 2, 3, 4, 5}, G′ = {a, b, c}, and let f: GG′ be given by the sequence (a, c, b, a, c).5 Then if L = [1, 1, 3, 5, 4, 5, 3, 2, 4, 1], what is List(f)(L)?


Use f to translate L, entry by entry:

List(f)( [ 1,1,3,5,4,5,3,2,4,1 ]=[ a,a,b,c,a,c,b,c,a,a ].

Remark (Questionable functor). Recall from Remark that a questionable category is defined to be a structure that looks like a category (objects, morphisms, identities, composition formula), but which is not required to satisfy any laws. Similarly, given categories (or questionable categories) C and D, we can define a questionable functor F: CD to consist of

A. a function Ob(F):Ob(C)Ob(C), sometimes denoted simply F:Ob(C)Ob(C);

B. for every pair of objects c, d ∈ Ob(C), a function


sometimes denoted simply F:HomC(c,d)HomC(F(c),F(d)).


We can rephrase the notion of functor in terms compatible with Exercise We begin by saying that a functor F:CC consists of two functions,

Ob(F):Ob(C)Ob(C)andHomF:HomCHomC ,

called the on-objects part and the on-morphisms part respectively. They must follow some rules, expressed by the commutativity of the following squares in Set:



a. In the right-hand diagram in (5.6), where does the (unlabeled) left-hand function come from? Hint: Use Exercise

Consider diagram (5.3); imagine it as though it were contained in a pane of glass. Then imagine a parallel pane of glass involving C in place of C everywhere.

b. Draw arrows from the C pane to the C pane, each labeled Ob(F), HomF, and so on, as appropriate.

c. If F is a functor, i.e., it satisfies (5.5) and (5.6), do all the squares in your drawing commute?

d. Does the definition of functor involve anything not captured in this setup?


a. We have HomF: HomC → HomC′, and since it commutes with dom and cod, we have the desired function, by Exercise

b. Let CPC=HomC×Ob(C)HomC denote the set of composable pairs of arrows in C (and similarly define CPC and CPF:CPCCPC). The two-pane diagram is a bit cluttered, but looks like this:


c. Yes.

d. No, this is all one needs: functions Ob(F):Ob(C)Ob(C) and HomF:HomCHomC such that all the squares commute.

Example (Paths-graph). Let G = (V, A, src, tgt) be a graph. We have a set PathG of paths in G, and functions src,¯tgt¯:PathGV. That information is enough to define a new graph,


Moreover, given a graph homomorphism f: GG′, every path in G is sent under f to a path in G′. So Paths: GrphGrph is a functor.


a. Consider the graph G from Example Draw the paths-graph Paths(G) for G.

b. Repeating part (a) for G′ from the same example would be hard, because the paths-graph Paths(G′) has infinitely many arrows. However, the graph homomorphism f: GG′ does induce a morphism of paths-graphs Paths(f): Paths(G) → Paths(G′). How does that act on the vertices and arrows of Paths(G)?

c. Given a graph homomorphism f: GG′ and two paths p: vw and q: wx in G, is it true that Paths(f) preserves the concatenation? Explain also what it means to say Paths(f) preserves the concatenation.


a. Here are G and Paths(G).


b. For the reader’s convenience, here is a copy of f: GG′:


By definition Paths(f) acts like f on the vertices, and arrow by arrow on paths. Here is the formal answer:


c. Yes, that is true. It means that f(p) ++f(q) = f(p ++q), where ++ denotes concatenation of paths.


Suppose that C and D are categories, c, c′ ∈ Ob(C) are objects, and F: CD is a functor. Suppose that c and c′ are isomorphic in C. Show that this implies that F(c) and F(c′) are isomorphic in D.


If c and c′ are isomorphic, that means there exists a morphism f: cc′ and a morphism f′: c′ → c in C, such that f′ ○ f = idc and ff′ = idc′. But then F(f): F(c) → F(c′) and F(f′): F(c′) → F(c) are mutually inverse morphisms between F(c) and F(c′). Indeed, since F preserves composition and identities, we have F(f′) ○ F(f) = F(f′ ○ f) = F(idc) = idF(c) and F(f) ○ F(f′) = F(ff′) = F (idc′) = idF(c′). So F(f) is an isomorphism, which means that F(c) and F(c′) are isomorphic in D.

Example For any graph G, we can assign its set of length 1 loops Eq(G) as in Exercise This assignment is functorial in that given a graph homomorphism GG′, there is an induced function Eq(G) → Eq(G′). Similarly, we can functorially assign the set of connected components of the graph, Coeq(G). In other words, Eq: GrphSet and Coeq: GrphSet are functors. The assignment of vertex set and arrow set are two more functors GrphSet.

Suppose you want to decide whether two graphs G and G′ are isomorphic. If the graphs have thousands of vertices and thousands of arrows, this could take a long time. However, the preceding functors, in combination with Exercise give us some things to try.

The first thing to do is to count the number of loops of each, because these numbers are generally small. If the number of loops in G is different than the number of loops in G′, then because functors preserve isomorphisms, G and G′ cannot be isomorphic. Similarly, one can count the number of connected components, again generally a small number. If the number of components in G is different than the number of components in G′, then GG′. Similarly, one can simply count the number of vertices or the number of arrows in G and G′. These are all isomorphism invariants.

All this is a bit like trying to decide if a number is prime by checking if it is even, if its digits add up to a multiple of 3, or if it ends in a 5; these tests do not determine the answer, but they offer some level of discernment.

Remark As mentioned, functors allow ideas in one domain to be rigorously imported to another. Example is a first taste. Because functors preserve isomorphisms, we can tell graphs apart by looking at them in a simpler category, Set, using various lenses (in that case, four). There is relatively simple theorem in Set that says that for different natural numbers m, n the sets m and n are never isomorphic. This theorem is transported via the four functors to four different theorems about telling graphs apart.   The category of categories

Recall from Remark that a small category C is one in which Ob(C) is a set. But everything said so far works whether or not C is small. The following definition gives more precision.

Proposition There exists a category, called the category of small categories and denoted Cat, in which the objects are the small categories and the morphisms are the functors,

HomCat(C,D)={ F:CD|F is a functor }.

That is, there are identity functors, functors can be composed, and the identity and associativity laws hold.

Proof. We follow Definition We have already specified Ob(Cat) and HomCat in the statement of the proposition. Given a small category C, there is an identity functor idC:CC that is identity on the set of objects and the set of morphisms. And given a functor F:CD and a functor G:DE, it is easy to check that GF:CE, defined by composition of functions Ob(G)Ob(F):Ob(C)Ob(E) and HomGHomF:HomCHomE (see Exercise, is a functor; thus we have a composition formula. For the same reasons, one can show that functors, as morphisms, obey the identity law and the composition law. Therefore, this specification of Cat satisfies the definition of being a category.

Example (Categories have underlying graphs). Suppose given a category in the notation is as in Exercise, C=(Ob(C),HomC,dom,cod,ids,comp). Then (Ob(C),HomC,dom,cod) is a graph, called the graph underlying C and denoted U(C)Ob(Grph). A functor F:CD induces a graph morphism U(F):U(C)U(D), as seen in (5.5). So we have a functor,


Example (Free category on a graph). Example discussed a functor Paths: GrphGrph that considered all the paths in a graph G as the arrows of a new graph Paths(G). In fact, Paths(G) could be construed as a category, denoted F(G) ∈ Ob(Cat) and called the free category generated by G.

The objects of the category F(G) are the vertices of G. For any two vertices v, v′, the hom-set HomF(G)(v, v′) is the set of paths in G from v to v′. The identity elements are given by the trivial paths, and the composition formula is given by concatenation of paths.

For the on-morphisms part of F, we need to see that a graph homomorphism f: GG′ induces a functor F(f): F(G) → F(G′). But this was shown in Exercise Thus we have a functor


called the free category functor.


Let G be the graph depicted


and let [1] ∈ Ob(Cat) denote the free category on G, i.e., [1] ≔ F(G), as in Example We call [1] the free arrow category.

a. What are the objects of [1]?

b. For every pair of objects in [1], write the hom-set.


a. Ob([1]) = {v0, v1}.

b. There are four pairs of objects, so the four hom-sets are:

Hom[ 1 ](υ0,υ0)={ idυ0 };Hom[ 1 ](υ0,υ1)={ e };Hom[ 1 ](υ1,υ0)=;Hom[ 1 ](υ1,υ1)={ idυ1 }.


Let G be the graph whose vertices are all U.S. cities and whose arrows are airplane flights connecting the cities. What idea is captured by the free category on G?


Let F: GrphCat denote the free category functor from Example, and let U: CatGrph denote the underlying graph functor from Example What is the composition UF: GrphGrph called?


Since F: GrphCat freely adds all paths, one can check that UF: GrphGrph is the construction that takes a graph and adds all paths; i.e., UF = Paths (see Example


Recall the graph G from Example Let C = F(G) be the free category on G.

a. What is HomC(v,x)?

b. What is HomC(x,v)?

Example (Discrete graphs, discrete categories). There is a functor Disc: SetGrph that sends a set S to the graph


where !: ∅ → S is the unique function. We call Disc(S) the discrete graph on the set S. It is clear that a function SS′ induces a morphism of discrete graphs. Now applying the free category functor F: GrphCat, we get the discrete category on the set S. This composition is also denoted Disc: SetCat.


Recall from (2.4) the definition of the set n for any natural number n ∈ ℕ, and let DnDisc(n) ∈ Ob(Cat) be the discrete category on the set n, as in Example

a. List all the morphisms in D4.

b. List all the functors D3D2.


Let C be a category. How many functors are there CD1, where D1Disc(1) is the discrete category on one element?


There is always one functor CD1. There is no choice about where to send objects (all go to the object 1), and there is no choice about where to send morphisms (all go to the morphism id1).

We sometimes refer to Disc(1) as the terminal category (see Section 6.1.3) and for simplicity denote it 1. Its unique object is denoted 1.


If someone said, “Ob is a functor from Cat to Set,” what might they mean?


They probably mean that there is a functor CatSet that sends a category C to its set of objects Ob(C). Since the speaker does not say what this functor, Ob, does on morphisms, he is suggesting it is obvious. A morphism in Cat is a functor F: CD, which includes an on-objects part by definition. In other words, it is indeed obvious what Ob(F): Ob(C) → Ob(D) should mean because this is given in the specification of F (see Definition It is not hard to check that Ob preserves identities and compositions, so it is indeed a functor.


If someone said, “Hom is a functor from Cat to Set, where by Hom I mean the mapping that takes C to the set HomC, as in Exercise,” what might they mean?


They probably mean that there is a functor CatSet that sends a category C to its set of morphisms HomC. Since the speaker does not indicate what this functor, Hom, does on morphisms, she is suggesting it is obvious. A morphism in Cat is a functor F:CD, which includes an on-morphisms part by definition. In other words, it is indeed obvious what Hom(F):Hom(C)Hom(D) should mean because this is given in the specification of F (see Definition It is easy to check that Hom preserves identities and compositions, so it is indeed a functor.

5.2   Common categories and functors from pure math

5.2.1   Monoids, groups, preorders, and graphs

We saw in Section 5.1.1 that there is a category Mon of monoids, a category Grp of groups, a category PrO of preorders, and a category Grph of graphs. This section shows that each monoid M, each group G, and each preorder P can be considered as its own category. If each object in Mon is a category, we might hope that each morphism in Mon is just a functor, and this is true. The same holds for Grp and PrO. We saw in Example how each graph can be regarded as giving a free category. Another perspective on graphs (i.e., graphs as functors) is discussed in Section   Monoids as categories

Example said that to olog a monoid, one should use only one box. And again Example said that a monoid action could be captured by only one table. These ideas are encapsulated by the understanding that a monoid is perfectly modeled as a category with one object.

Each monoid as a category with one object Let (M, e, ⋆) be a monoid. We consider it as a category M with one object, Ob(M) = {▲}, and


The identity morphism id serves as the monoid identity e, and the composition formula


is given by ⋆: M × MM. The associativity and identity laws for the monoid match precisely with the associativity and identity laws for categories.

If a monoid is a category with one object, is there any categorical way of phrasing the notion of monoid homomorphism? Suppose that M = (M, e, ⋆) and M′ = (M′, e′, ⋆′). We know that a monoid homomorphism is a function f : MM′ such that f(e) = e′ and such that for every pair m0, m1M, we have f(m0m1) = f(m0) ⋆′ f(m1). What is a functor MM′?

Each monoid homomorphism as a functor between one-object categories Say that Ob(M) = {▲} and Ob(M′) = {▲′}, and we know that HomM(,)=M and HomM(,)=M. A functor F:MM consists first of a function Ob(M) → Ob(M′), but these sets have only one element each, so there is nothing to say on that front: we must have F(▲) = ▲′. It also consists of a function HomMhomM, but that is just a function MM′. The identity and composition formulas for functors match precisely with the identity and composition formula for monoid homomorphisms. Thus a monoid homomorphism is nothing more than a functor between one-object categories.


A monoid is a category with one object. A monoid homomorphism is just a functor between one-object categories.

This is formalized in the following theorem.

Theorem There is a functor i : MonCat with the following properties:

  • For every monoid MOb(Mon), the category i(M)Ob(Cat) itself has exactly one object,

    |Ob(i(M)) |=1.

  • For every pair of monoids M,MOb(Mon), the function


    induced by the functor i, is a bijection.

Proof. This is basically the content of the preceding paragraphs. The functor i sends a monoid to the corresponding category with one object and i sends a monoid homomorphism to the corresponding functor. One can check that i preserves identities and compositions.

Theorem situates the theory of monoids very nicely within the world of categories. But we have other ways of thinking about monoids, namely, their actions on sets. It would greatly strengthen the story if we could subsume monoid actions within category theory also, and we can.

Each monoid action as a set-valued functor Recall from Definition that if (M, e, ⋆) is a monoid, an action consists of a set S and a function art such that art and art for all sS. How might we relate the notion of monoid actions to the notion of functors? Since monoids act on sets, one idea is to try asking what a functor F:MSet is; this idea will work.

The monoid-as-category M has only one object, ▲, so F provides one set, SF(▲) ∈ Ob(Set). It also provides a function HomF:HomM(,)HomSet(F(),F()), or more concisely, a function


By currying (see Proposition, this is the same as a function art. The first monoid action law, that art, becomes the law that functors preserve identities, HomF (id) = idS. The other monoid action law is equivalent to the composition law for functors.   Groups as categories

A group is just a monoid (M, e, ⋆) in which every element mM is invertible, meaning there exists some m′ ∈ M with mm′ = e = m′ ⋆ m. If a monoid is the same thing as a category M with one object, then a group must be a category with one object and with an additional property having to do with invertibility. The elements of M are the morphisms of the category M, so we need a notion of invertibility for morphisms. Luckily we have such a notion already, namely, isomorphism.


A group is a category G with one object, such that every morphism in G is an isomorphism. A group homomorphism is just a functor between such categories.

Theorem There is a functor i : GrpCat with the following properties:

  • For every group GOb(Grp), the category i(G)Ob(Cat) itself has exactly one object, and every morphism m in i(G) is an isomorphism.
  • For every pair of groups G,GOb(Grp), the function


    induced by the functor i, is a bijection.

Just as with monoids, an action of some group (G, e, ⋆) on a set S ∈ Ob(Set) is the same thing as a functor GSet sending the unique object of G to the set S.   A monoid and a group stationed at each object in any category

If a monoid is just a category with one object, we can locate monoids in any category C by focusing on one object in C. Similarly for groups.

Example (Endomorphism monoid). Let C be a category and xOb(C) an object. Let M=HomC(x,x). Note that for any two elements f, gM, we have fg : xx in M. Let M = (M, idx, ○). It is easy to check that M is a monoid; it is called the endomorphism monoid of x in C, denoted End(x).

Example (Automorphism group). Let C be a category and xOb(C) an object. Let G={fHomC(x,x)|f is an isomorphism}. Let G=(G,idx,). One can check that G is a group; it is called the automorphism group of x in C denoted Aut(x).


Let S = {1, 2, 3, 4} ∈ Ob(Set).

a. What is the automorphism group Aut(S) of S in Set, and how many elements does this group have?

b. What is the endomorphism monoid End(S) of S in Set, and how many elements does this monoid have?

c. Recall from Example that every group has an underlying monoid U(G). Is the endomorphism monoid of S the underlying monoid of the automorphism group of S? That is, is it the case that End(S) = U(Aut(S))?


Consider the following graph G, which has four vertices and eight arrows:


What is the automorphism group Aut(G) of G ∈ Ob(Grph) Hint: Every automorphism of G will induce an automorphism of the set {1, 2, 3, 4}; which ones will preserve the endpoints of arrows?


We use visual perception to guide us. The graph G has the shape of a square. Of the 4! different possible automorphisms of {1, 2, 3, 4}, only those preserving the square shape will be automorphisms of G. The group of automorphisms of G is called the dihedral group of order 8 (see Example It has eight elements,


where r means rotate the square clockwise 90°, and f means flip the square horizontally. For example, flipping the square vertically can be obtained by flipping horizontally and then rotating twice: fr2.   Preorders as categories

A preorder (X, ⩽) consists of a set X and a binary relation ⩽ that is reflexive and transitive. We can make from (X, ⩽) ∈ Ob(PrO) a category XOb(Cat) as follows. Define Ob(X)=X and for every two objects x, yX, define


To clarify: if xy, we assign HomX(x,y) to be the set containing only one element, namely, the string “xy.”6 If the pair (x, y) is not in relation ⩽, then we assign HomX(x,y) to be the empty set. The composition formula


is completely determined because either one of two possibilities occurs. One possibility is that the left-hand side is empty (if either xy or yz; in this case there is a unique function ○ as in (5.7)). The other possibility is that the left-hand side is not empty in case xy and yz, which implies xz, so the right-hand side has exactly one element “xz” in which case again there is a unique function ○ as in (5.7).

On the other hand, if C is a category having the property that for every pair of objects x,yOb(C), the set HomC(x,y) is either empty or has one element, then we can form a preorder out of C. Namely, take X=Ob(C) and say xy if there exists a morphism xy in C.

Proposition There is a functor i: PrOCat with the following properties for every preorder (X, ⩽):

  1. the category Xi(X,) has objects Ob(X)=X.
  2. For each pair of elements x,xOb(X), the set HomX(x,x) has at most one element.

Moreover, any category with property 2 is in the image of the functor i.

Proof. To specify a functor i : PrOCat, we need to say what it does on objects and on morphisms. To an object (X, ⩽) in PrO, we assign the category X with objects X and a unique morphism xx′ if xx′. To a morphism f : (X, ⩽X) → (Y, ⩽Y) of preorders, we must assign a functor i(f):XY. Again, to specify a functor, we need to say what it does on objects and morphisms of X. To an object xOb(X)=X, we assign the object f(x)Y=Ob(Y). Given a morphism f : xx′ in X, we know that xx′, so by Definition we have that f(x) ⩽ f(x′), and we assign to f the unique morphism f(x) → f(x′) in Y. To check that the rules of functors (preservation of identities and composition) are obeyed is routine.


A preorder is a category in which every hom-set has either 0 elements or 1 element. A preorder morphism is just a functor between such categories.


Suppose that C is a preorder (considered as a category). Let x,yOb(C) be objects such that xy and yx. Prove that there is an isomorphism xy in C.


Proposition stated that a preorder can be considered as a category P. Recall from Definition that a partial order is a preorder with an additional property. Phrase the defining property for partial orders in terms of isomorphisms in the category P.

Example The olog from Example depicted a partial order, call it P. In it we have

HomP(a diamond,a red card)={is}


HomP(a black queen,a card){isis}.

Both of these sets contain exactly one element; the name is not important. The set HomP(a 4,a 4 of diamonds)=.


Every linear order is a preorder with a special property. Using the categorical interpretation of preorders, can you phrase the property of being a linear order in terms of hom-sets?


Recall the functor P : PrOGrph from Proposition, the functors F : GrphCat and U : CatGrph from Example, and the functor i: PrOCat from Proposition

a. Do either of the following diagrams of categories commute?


b. We also gave a functor Im: GrphPrO in Exercise Does the following diagram of categories commute?


Proposition There is a unique functor R: CatPrO with the following properties:

  1. For each category C, the preorder (X,)R(C) has the same set of objects, X=Ob(C).
  2. For each pair of objects x,yOb(C), we have xy in R(C) if and only if the hom-set HomC(x,y) is nonempty.

Furthermore, if i: PrOCat is the inclusion from Proposition, we have Ri = idPrO.

Proof. Given a category C, we define a preorder R(C)(Ob(C),), where xy if and only if HomC(x,y). This is indeed a preorder because the identity law and composition law for a category ensure the reflexivity and transitivity properties of preorders hold. Given a functor F:CD (i.e., a morphism in Cat), we get Ob(F):Ob(C)Ob(C), and for R to be defined on morphisms, we need to check that this function preserves order. If xy in R(C), then there is a morphism g : xy in C, so there is a morphism F(g) : F(x) → F(y), which means F(x) ⩽ F(y) in C. It is straightforward to see now that R is a functor, and there was no other way to construct R satisfying the desired properties. It is also easy to see that Ri = idPrO.   Graphs as functors

Let C denote the category depicted as follows:


Then a functor G : GrInSet is the same thing as two sets G(Ar), G(Ve) and two functions G(src) : G(Ar) → G(Ve) and G(tgt) : G(Ar) → G(Ve). This is precisely what is needed for a graph; see Definition We call GrIn the graph-indexing category.


Consider the terminal category, 1, also known as the discrete category on one element (see Exercise Let GrIn be as in (5.8) and consider the functor i0 : 1GrIn sending the unique object of 1 to the object V e ∈ Ob(GrIn).

a. If G : GrInSet is a graph, what is the composite Gi0? It consists of only one set; in terms of the graph G, what set is it?

b. As an example, what set is it when G is the graph from Example

If a graph is a functor GrInSet, what is a graph homomorphism? Example shows that graph homomorphisms are homomorphisms between functors, which are called natural transformations. (Natural transformations are the highest-level structure in ordinary category theory.)

Example Let SGrIn be the category depicted as follows:


with the following composition formula:

ρρ=idA;  srcρ=tgt;  and tgtρ=src.

The idea here is that the morphism ρ: AA reverses arrows. The PED A[ρ, ρ] = A[ ] forces the fact that the reverse of the reverse of an arrow yields the original arrow. The PEDs A[ρ, src] = A[tgt] and A[ρ, tgt] = A[src] force the fact that when we reverse an arrow, its source and target switch roles.

This category SGrIn is the symmetric graph-indexing category. Just as any graph can be understood as a functor GrInSet, where GrIn is the graph-indexing category displayed in (5.8), any symmetric graph can be understood as a functor SGrInSet, where SGrIn is the category drawn in (5.9). Given a functor G : SGrInSet, we will have a set of arrows, a set of vertices, a source operation, a target operation, and a reverse-direction operation (ρ) that all behave as expected.

It is customary to draw the connections in a symmetric graph G as line segments rather than arrows between vertices. However, a better heuristic is to think that each connection between vertices in G consists of two arrows, one pointing in each direction.


In a symmetric graph, every arrow has an equal and opposite arrow.


Which of the following graphs are symmetric:

a. The graph G from (4.3)?

b. The graph G from Exercise

c. The graph G′ from (4.6)?

d. The graph Loop from (4.16), i.e., the graph having exactly one vertex and one arrow?

e. The graph G from Exercise


Let GrIn be the graph-indexing category shown in (5.8), and let SGrIn be the symmetric graph-indexing category displayed in (5.9).

a. How many functors are there of the form GrInSGrIn?

b. Is one more reasonable than the others? If so, call it i : GrInSGrIn, and write how it acts on objects and morphisms.

c. Choose a functor i : GrInSGrIn, the most reasonable one, if such a thing exists. seems most reasonable and call it i : GrInSGrIn. If a symmetric graph is a functor S : SGrInSet, you can compose with i to get a functor Si : GrInSet. This is a graph; what graph is it? What has changed?

Example Let C be a category, and consider the set of isomorphisms in C. Each isomorphism f : cc′ in C has an inverse as well as a domain (c) and a codomain (c′). Thus we can build a symmetric graph I(C):SGrInSet. Its vertices are the objects in C, and its arrows are the isomorphisms in C.

5.2.2   Database schemas present categories

Recall from Definition that a database schema (or schema, for short) consists of a graph together with a certain kind of equivalence relation, namely a congruence, on its paths. Section 5.4.1 defines a category Sch that has schemas as objects and appropriately modified graph homomorphisms as morphisms. Section 5.4.2 proves that the category of schemas is equivalent (in the sense of Definition to the category of categories,


The difference between schemas and categories is like the difference between monoid presentations, given by generators and relations as in Definition, and the monoids themselves. The same monoid has (infinitely) many different presentations, and so it is for categories: many different schemas can present the same category. Computer scientists may think of the schema as syntax and the category it presents as the corresponding semantics. A schema is a compact form and can be specified in finite space and time, whereas the category it generates can be infinite.


A database schema is a category presentation.

Section 5.4.2 formally shows how to turn a schema into a category (the category it presents). For now, it seems better not to be so formal, because the idea is fairly straightforward. Suppose given a schema S, which consists of a graph G = (V, A, src, tgt) equipped with a congruence ~ (see Definition It presents a category C defined as follows. The set of objects in C is defined to be the vertices V; the set of morphisms in C is defined to be the quotient Paths(G)/ ~; and the composition formula is given by concatenation of paths. The path equivalences making up ~ become commutative diagrams in C.

Example The following schema Loop has no path equivalence declarations. As a graph it has one vertex and one arrow.


The category it generates, however, is the free monoid on one generator, ℕ. It has one object s, but a morphism fn : ss for every natural number n ∈ ℕ, thought of as “how many times to go around the loop f.” Clearly, the schema is more compact than the infinite category it generates.


Consider the olog from Exercise, which says that for any father x, his youngest child’s father is x and his tallest child’s father is x. It is redrawn here as a schema S, which includes the desired path equivalence declarations, F[t, f] = F [ ] and F[y, f] = F [ ].


How many morphisms are there (total) in the category presented by S?


There are seven. Let S¯ be the category presented by S. We have

HomS¯(F,F)={F[]};  HomS¯(F,C)={F[t], F[y]};HomS¯(C,F)={C[f]};  HomS¯(C,C)={C[], C[f,t], C[f,y]}.

Given a child, the three morphisms CC respectively return the child herself, her tallest sibling (technically, her father’s tallest child), and her youngest sibling (technically, her father’s youngest child).


Suppose that G is a graph and that G is the schema generated by G with no PEDs. What is the relationship between the category generated by G and the free category F(G) ∈ Ob(Cat), as defined in Example


Let C=(G,) be a schema. A leaf table is an object cOb(C) with no outgoing arrows.

a. Express the condition of being a leaf table mathematically in three different languages: that of graphs (using symbols V, A, src, tgt), that of categories (using HomC, etc.), and that of tables (in terms of columns, tables, rows, etc.).

b. In the language of categories, is there a difference between a terminal object and a leaf table? Explain.   Instances on a schema C

If schemas are like categories, what are instances? Recall that an instance I on a schema S=(G,) assigns to each vertex v in G a set of rows, say, I(v) ∈ Ob(Set). And to every arrow a : vv′ in G the instance assigns a function I(a): I(v) → I(v′). The rule is that given two equivalent paths, their compositions must give the same function. Concisely, an instance is a functor I:SSet.

Example We have seen that a monoid is just a category M with one object and that a monoid action is a functor MSet. With database schemas as categories, M is a schema, and so an action becomes an instance of that schema. The monoid action table from Example was simply a manifestation of the database instance according to the Rules


Section discussed how each graph is a functor GrInSet for the graph-indexing category depicted here:


But now we know that if a graph is a set-valued functor, then we can consider GrIn as a database schema.

a. How many tables, and how many foreign key columns of each should there be (if unsure, consult Rules

b. Write the table view of graph G from Example

5.2.3   Spaces

Category theory was invented for use in algebraic topology, and in particular, to discuss natural transformations between certain functors. Section 5.3 discusses natural transformations more formally. It suffices now to say a natural transformation is some kind of morphism between functors. In the original use, Eilenberg and Mac Lane were interested in functors that connect topological spaces (e.g., shapes such as spheres) to algebraic systems (e.g., groups).

For example, there is a functor that assigns to each space X its group π1(X) of round-trip voyages (starting and ending at some chosen point xX), modulo some equivalence relation. There is another functor that assigns to every space its group H1(X) of ways to drop some (positive or negative) number of circles on X.

These two functors, π1 and H1 are related, but they are not equal. For example, when X is the figure-8 space (two circles joined at a point) the group π1(X) is much bigger than the group H1(X). Indeed, π1(X) includes information about the order and direction of loops traveled during the voyage, whereas the group H1(X) includes only information about how many times one goes around each loop. However, there is a natural transformation of functors π1H1, called the Hurewicz transformation, which takes π1’s voyage, counts how many times it went around each loop, and delivers that information to H1.

Example Given a set X, recall that ℙ(X) denotes the preorder of subsets of X. A topology on X is a choice of which subsets U ∈ ℙ(X) will be called open sets. To be a topology, these open sets must follow two rules. Namely, the union of any number of open sets must be considered to be an open set, and the intersection of any finite number of open sets must be considered open. One could say succinctly that a topology on X is a suborder Open(X) ⊆ ℙ(X) that is closed under taking finite meets and infinite joins.

A topological space is a pair (X, Open(X)), where X is a set and Open(X) is a topology on X. The elements of the set X are called points. A morphism of topological spaces (also called a continuous map) is a function f : XY such that for every V ∈ Open(Y), the preimage f−1(V) ∈ ℙ(X) is actually in Open(X), that is, such that there exists a dashed arrow making the following diagram commute:


The category of topological spaces, denoted Top, is the category having the preceding objects and morphisms.


a. Explain how looking at points gives a functor TopSet.

b. Does looking at open sets give a functor TopPrO?


a. A topological space (X, Open(X)) includes a set X ∈ Ob(Set) of points. A morphism (X, Open(X)) → (Y, Open(Y)) of spaces includes a function XY . Thus we have a functor TopSet, because the identity morphisms and compositions of morphisms in Top are sent to their counterparts in Set.

b. No. A morphism (X, Open(X)) → (Y, Open(Y)) includes a preorder morphism in the direction Open(Y) → Open(X), not the other way around. Definition shows that every category C has an opposite category Cop. Looking at open sets does give a functor Open: TopopPrO.

Example (Continuous dynamical systems). The set ℝ can be given a topology in a standard way.7 But (ℝ, 0, +) is also a monoid. Moreover, for every x ∈ ℝ, the monoid operation + : ℝ × ℝ → ℝ is continuous.8 So we say that R(,0,+) is a topological monoid, or that it is a monoid enriched in topological spaces.

Recall from Section that an action of R is a functor RSet. Imagine a functor a : RTop. Since R is a category with one object, this amounts to an object X ∈ Ob(Top), a space. And for every real number t ∈ ℝ, we obtain a continuous map a(t): XX. Further we can ask this a(t) to vary continuously as t moves around in ℝ. If we consider X as the set of states of some system and ℝ as the time line, we have modeled what is called a continuous dynamical system.

Example Recall (see Axler [3]) that a real vector space is a set X, elements of which are called vectors, which is closed under addition and scalar multiplication. For example, ℝ3 is a vector space. A linear transformation f from X to Y is a function f : XY that appropriately preserves addition and scalar multiplication. The category of real vector spaces, denoted Vect, has as objects the real vector spaces and as morphisms the linear transformations.

There is a functor VectGrp sending a vector space to its underlying group of vectors, where the group operation is addition of vectors and the group identity is the 0-vector.


Every vector space has vector subspaces, ordered by inclusion (the origin is inside of any line that is inside of certain planes, and all are inside of the whole space V). If you know about this topic, answer the following questions.

a. Does a linear transformation VV′ induce a morphism of these orders? In other words, is there a functor subspaces: VectPrO?

b. Would you guess that there is a nice functor VectTop? By “nice functor” I mean a substantive one. For example, there is a functor VectTop that sends every vector space to the empty topological space; if someone asked for a functor VectTop for their birthday, this functor would make them sad. Give a functor VectTop that would make them happy.

There is a functor | · |: VectSet sending every vector space X to its set |X| of vectors. A categorically nice way to understand this functor is as HomVect(,), which sends X to the set of linear transformations ℝ → X. Each linear transformation ℝ → X is completely determined by where it sends 1 ∈ ℝ, which can be any vector in X. Thus we get the bijection | X |HomVect(,X).


Suppose we think of Vect as a database schema, and we think of | · |: VectSet as an instance (see Section 4.5). Of course, the schema and the instance are both infinite, but let’s not worry about that.

a. Pick two objects x, y and two morphisms f, g : xy from Vect, actual vector spaces and linear transformations, and call this your subschema. Draw it as dots and arrows.

b. Write four rows in each table of the instance | · | on your subschema.   Groupoids

Groupoids are like groups except a groupoid can have more than one object.

Definition A groupoid is a category C such that every morphism is an isomorphism. If C and D are groupoids, a morphism of groupoids, denoted F:CD, is simply a functor. The category of groupoids is denoted Grpd.

Example There is a functor GrpdCat, sending a groupoid to its underlying category. There is also a functor GrpGrpd sending a group to itself as a groupoid with one object.

There is also a functor Core: CatGrpd, sending a category C to the largest groupoid inside C, called its core. That is, Ob(Core(C))=Ob(C) and

HomCore(C)(x,y)={fHomC(x,y)|f is an isomorphism}.

Application Let M be a material in some original state s0.9 Construct a category SM whose objects are the states of M (which are obtained by pulling on M in different ways, heating it up, and so on). Include a morphism from state s to state s′ for every physical transformation from s to s′. Physical transformations can be performed one after another, so we can compose morphisms, and perhaps we can agree this composition is associative. Note that there is a morphism is : s0s representing any physical transformation that can bring M from its initial state s0 to s.

The elastic deformation region of the material is the set of states s such that there exists an inverse ss0 to the morphism is. A transformation is irreversible if its representing morphism has no inverse. If a state s1 is not in the elastic deformation region, we can still talk about the region that is (inventing a term) elastically equivalent to s1. It is all the objects in SM that are isomorphic to s1. If we consider only elastic equivalences in SM, we are looking at a groupoid inside it, namely, the core Core(SM), as in Example

Example Alan Weinstein [45] explains groupoids in terms of tiling patterns on a bathroom floor. This is worth reading.

Example Let I = {x ∈ ℝ | 0 ⩽ x ⩽ 1} denote the unit interval. It can be given a topology in a standard way, as a subset of ℝ (see Example

For any topological space X, a path in X is a continuous map IX. Two paths are called homotopic if one can be continuously deformed to the other, where the deformation occurs completely within X.10 One can prove that being homotopic is an equivalence relation on paths.

Paths in X can be composed, one after the other, and the composition is associative (up to homotopy). Moreover, for any point xX, there is a trivial path (that stays at x). Finally every path is invertible (by traversing it backward) up to homotopy.

This all means that to any space X ∈ Ob(Top) we can associate a groupoid, called the fundamental groupoid of X and denoted Π1(X) ∈ Ob(Grpd). The objects of Π1(X) are the points of X; the morphisms in Π1(X) are the paths in X (up to homotopy). A continuous map f : XY can be composed with any path IX to give a path IY, and this preserves homotopy. So, in fact, Π1 : TopGrpd is a functor.


Let T denote the surface of a doughnut, i.e., a torus. Choose two points p, qT. Since Π1(T) is a groupoid, it is also a category. What would the hom-set HomΠ1(T)(p,q) represent?


Let U ⊆ ℝ2 be an open subset of the plane, and let F be an irrotational vector field on U (i.e., one with curl(F) = 0). Following Exercise, we have a category CF. If two curves C, C′ in U are homotopic, then they have the same line integral, ∫C F = ∫C F.

We also have a category Π1U, given by the fundamental groupoid, as in Example Both categories have the same objects, Ob(CF)=|U|=Ob(Π1U), the set of points in U.

a. Is there a functor CF?Π1U or a functor Π1U?CF that is identity on the underlying objects?

b. Let CFCF denote the subcategory with the same objects but only those morphisms corresponding to curves C with ∫C F = 0. Is CF a groupoid?

c. If F is a conservative vector field, what is CF?

d. If F is a conservative vector field, how does CF compare with Π1U?


Consider the set A of all (well-formed) arithmetic expressions that can be written with the symbols


For example, here are four different elements of A :

52, 527, 45+0, 50+3*(62).

We can say that an equivalence between two arithmetic expressions is a justification that they give the same final answer, e.g., 52 + 60 is equivalent to 10 * (5 + 6) + (2 + 0), which is equivalent to 10 * 11 + 2.

a. I have basically described a category G. What are its objects, and what are its morphisms?

b. Is G a groupoid?

5.2.4   Logic, set theory, and computer science   The category of propositions

Given a domain of discourse, a logical proposition is a statement that is evaluated in any model of that domain as either true or not always true, which the black-and-white thinker might dub “false.” For example, in the domain of real numbers we might have the proposition

For any real number x ∈ ℝ, there exists a real number y ∈ ℝ such that y > 3x.

That is true: for x = 22, we can offer y = 100. But the following proposition is not true:

Every integer x ∈ ℤ is divisible by 2 or 3.

It is true for the majority of integers, but not for all integers; thus it is dubbed false.

We say that one logical proposition P implies another proposition Q, denoted PQ, if for every model in which P is true, so is Q. There is a category Prop whose objects are logical propositions and whose morphisms are proofs that one statement implies another. Crudely, one might say that B holds at least as often as A if there is a morphism AB (meaning in any model for which A holds, so does B). So the proposition “xx” holds very seldom, and the proposition “x = x” holds very often.

Example We can repeat this idea for nonmathematical statements. Take the set of all possible statements that are verifiable by experiment as the objects of a category. Given two such statements, it may be that one implies the other (e.g., “If the speed of light is fixed, then there are relativistic effects”). Every statement implies itself (identity) and implication is transitive, so we have a category.

Let’s consider differences in proofs to be irrelevant, in which case the category Prop is simply a preorder (Prop, ⇒): either A implies B or it does not. Then it makes sense to discuss meets and joins. It turns out that meets are “and’s,” and joins are “or’s.” That is, given propositions A, B, the meet AB is defined to be a proposition that holds as often as possible subject to the constraint that it implies both A and B; the proposition “A holds and B holds” fits the bill. Similarly, the join AB is given by “A holds or B holds.”


Consider the set of possible laws (most likely an infinite set) that can be dictated to hold throughout a jurisdiction. Consider each law as a proposition (“such and such is the case”), i.e., as an object of the preorder Prop. Given a jurisdiction V, and a set of laws {1, 2, …, n} that are dictated to hold throughout V, we take their meet L(V) ≔ 12 ∧ ⋯ ∧ n and consider it to be the single law of the land V. Suppose that V is a jurisdiction and U is a subjurisdiction (e.g., U is a county and V is a state); write UV. Then any law dictated by the large jurisdiction (the state) must also hold throughout the small jurisdiction (the county). Let J be the set of jurisdictions, so that (J, ⊆) is a preorder.

a. If VU are jurisdictions, what is the relation in Prop between L(U) and L(V)?

b. Consider the preorder (J, ⊆) of jurisdictions. Is the law of the land a morphism of preorders JProp? That is, considering both J and Prop to be categories (by Proposition, we have a function L : Ob(J) → Ob(Prop); does L extend to a functor JProp.


This exercise is strangely tricky, so we go through it slowly.

a. Suppose that the proposition L(V) is true, i.e., we are in a model where all V’s laws are being followed. Does this imply that L(U) is true? Since VU, every law of U is a law of V (e.g., if one may not own slaves anywhere in the United States, one may not own slaves in Maine). So indeed L(U) is true; thus we have L(V) ⇒ L(U).

b. Yes, L extends to a preorder morphism L : JProp because if VU, then L(V) ⇒ L(U).


Take again the preorder (J, ⊆) of jurisdictions from Exercise and the idea that laws are propositions. But this time, let R(V) be the set of all possible laws (not just those dictated to hold) that are, in actuality, being respected, i.e., followed, by all people in V. This assigns to each jurisdiction a set. Does the “set of respected laws” function R : Ob(J) → Ob(Set) extend to a functor JSet?


If VU, then any law respected throughout U is respected throughout V, i.e., R(U) ⊆ R(V). In other words, R is contravariant (see Section 6.2.1), meaning it constitutes a functor R : JopSet. (Every law is being respected throughout the jurisdiction ∅, and physicists want to know what laws are being respected throughout the universe-as-jurisdiction.)   A categorical characterization of Set

The category Set of sets is fundamental in mathematics, but instead of thinking of it as something given or somehow special, it can be shown to merely be a category with certain properties, each of which can be phrased purely categorically. This was shown by Lawvere [23]. A very readable account is given in [26].   Categories in computer science

Computer science makes heavy use of trees, graphs, orders, lists, and monoids. All of these can be understood in the context of category theory, although it seems the categorical interpretation is rarely mentioned explicitly in computer science textbooks. However, categories are used explicitly in the theory of programming languages (PL). Researchers in that field attempt to understand the connection between what programs are supposed to do (their denotation) and what they actually cause to occur (their operation). Category theory provides a useful mathematical formalism in which to study this.

The kind of category most often considered by a PL researcher is known as a Cartesian closed category, or CCC, which means a category T that has products (like A × B in Set) and exponential objects (like BA in Set). So Set is an example of a CCC, but there are others that are more appropriate for actual computation. The objects in a PL person’s CCC represent the types of the programming language, types such as integers, strings, floats. The morphisms represent computable functions, e.g., length: strings→integers. The products allow one to discuss pairs (a, b), where a is of one type and b is of another type. Exponential objects allow one to consider computable functions as things that can be input to a function (e.g., given any computable function floats→integers, one can consistently multiply its results by 2 and get a new computable function floats→integers). Products are studied in Section and exponential objects in Section 5.3.2.

But category theory does not only offer a language for thinking about programs, it offers an unexpected tool called monads. The CCC model for types allows researchers only to discuss functions, leading to the notion of functional programming languages; however, not all things that a computer does are functions. For example, reading input and output, changing internal state, and so on, are operations that can be performed on a computer but that ruin the functional aspect of programs. Monads were found in 1991 by Moggi [33] to provide a powerful abstraction that opens the doors to such nonfunction operations without forcing the developer to leave the category-theoretic paradise. Monads are discussed in Section 7.3.

Section 5.2.2 showed that databases are well captured by the language of categories (this is formalized in Section 5.4). Databases are used in this book to bring clarity to concepts within standard category theory.

5.2.5   Categories applied in science

Categories are used throughout mathematics to relate various subjects as well as to draw out the essential structures within these subjects. For example, there is active research in categorifying classical theories like that of knots, links, and braids (Khovanov [21]). It is similarly applied in science to clarify complex subjects. Here are some very brief descriptions of scientific disciplines to which category theory is applied.

Quantum field theory was categorified by Atiyah [2] in the late 1980s, with much success (at least in producing interesting mathematics). In this domain, one takes a category in which an object is a reasonable space, called a manifold, and a morphism is a manifold connecting two manifolds, like a cylinder connecting two circles. Such connecting manifolds are called cobordisms and the category of manifolds and cobordisms is denoted Cob. Topological quantum field theory is the study of functors CobVect that assign a vector space to each manifold and a linear transformation of vector spaces to each cobordism.

Samson Abramsky [1] showed a relationship between database theory, category theory, and quantum physics. He used the notion of sheaves on a database (see Section 7.2.3) and the sheaf cohomology thereof, to derive Bell’s theorem, which roughly states that certain variables that can be observed locally do not extend to globally observable variables.

Information theory, invented in 1948 by Claude Shannon, is the study of how to ideally compress messages so that they can be sent quickly and accurately across a noisy channel.11 Its main quantity of interest is the number of bits necessary to encode a piece of information. For example, the amount of information in an English sentence can be greatly reduced. The fact that t’s are often followed by h’s, or that e’s are much more common than z’s, implies that letters are not being used as efficiently as possible. The amount of bits necessary to encode a message is called its entropy and has been linked to the commonly used notion of the same name in physics.

Baez, Fritz, and Leinster [7] show that entropy can be captured quite cleanly using category theory. They make a category FinProb whose objects are finite sets equipped with a probability measure, and whose morphisms are probability-preserving functions. They characterize information loss as a way to assign numbers to such morphisms, subject to certain explicit constraints. They then show that the entropy of an object in FinProb is the amount of information lost under the unique map to the singleton set {☺}. This approach explicates (by way of the explicit constraints for information loss functions) the essential idea of Shannon’s information theory, allowing it to be generalized to categories other than FinProb. Thus Baez and colleagues effectively categorified information theory.

Robert Rosen proposed in the 1970s that category theory could play a major role in biology. That is only now starting to be fleshed out. There is a categorical account of evolution and memory, called Memory Evolutive Systems [15]. There is also a paper [10] by Brown and Porter with applications to neuroscience.

5.3   Natural transformations

The Big 3 of category theory are categories, functors, and natural transformations. This section introduces the last of these, natural transformations. Category theory was originally invented to discuss natural transformations. These were sufficiently conceptually challenging that they required formalization and thus the invention of category theory. If we think of categories as domains (e.g., of discourse, interaction, comparability) and functors as translations between different domains, the natural transformations compare different translations.

Natural transformations can seem a bit abstruse at first, but hopefully some examples and exercises may help.

5.3.1   Definition and examples

Let’s begin with an example. There is a functor List: SetSet, which sends a set X to the set List(X) consisting of all lists whose entries are elements of X. Given a morphism f : XY, we can transform a list with entries in X into a list with entries in Y by applying f to each entry (see Exercise Call this process translating the list.

It may seem a strange thing to contemplate, but there is also a functor List○List: SetSet that sends a set X to the set of lists of lists in X. If X = {a, b, c}, then List ○ List(X) contains elements like [[a, b], [a, c, a, b, c], [c]] and [[ ]] and [[a], [ ], [a, a, a]]. We can naturally transform a list of lists into a list by concatenation. In other words, for any set X there is a function µX : List ○ List(X) → List(X), which sends that list of lists to [a, b, a, c, a, b, c, c] and [ ] and [a, a, a, a] respectively. In fact, even if we use a function f : XY to translate a list of X’s into a list of Y’s (or a list of lists of X’s into a list of lists of Y’s), the concatenation works correctly.


What does it mean to say that concatenation of lists is natural with respect to translation? It means that concatenating then translating is the same thing as translating then concatenating.

Let’s make this concrete. Let X = {a, b, c}, let Y = {1, 2, 3}, and let f : XY assign f(a) = 1, f(b) = 1, f(c) = 2. The naturality condition says the following for any list of lists of X’s, in particular, for [[a, b], [a, c, a, b, c], [c]] ∈ List ○ List(X):


The top right path is concatenating then translating, and the left bottom path is translating then concatenating, and one sees here that they do the same thing.

Here is how the preceding example fits with the terminology of Definition The categories C and D are both Set, the functor F:CD is List ○ List, and the functor G:CD is List. The natural transformation is µ : List○List → List. It can be depicted:


Definition Let C and D be categories, and let F:CD and G:CD be functors. A natural transformation α from F to G, denoted α : FG and depicted


is defined as follows. One announces some constituents (A. components) and shows that they conform to a law (1. naturality squares). Specifically, one announces

A. for each object XOb(C), a morphism αX : F (X) → G(X) in D, called the X-component of α.

One must then show that the following natural transformation law holds:

  1. For every morphism f : XY in C, the square (5.10), called the naturality square for f, must commute:


The set of natural transformations FG is denoted Nat(F, G).

Remark If we have two functors F, G:CD, providing a morphism αX : F(X) → G(X) for every object XOb(C) is called a questionably natural transformation. Once we check the commutativity of all the naturality squares, i.e., once we know it satisfies Definition, we drop the “questionably” part.

Example Consider the following categories C[1] and D[2]:


Consider the functors F, G : [1] → [2], where F(0) = A, F(1) = B, G(0) = A, and G(1) = C. It turns out that there is only one possible natural transformation FG; we call it α and explore its naturality square. The components of α : FG are shown in green. These components are α0 = idA : F(0) → G(0) and α1 = g : F(1) → G(1). The naturality square for p : 0 → 1 is shown twice below, once with notation following that in (5.10) and once in local notation:


It is clear that this diagram commutes, so the components α0 and α1 satisfy the law of Definition, making α a natural transformation.

Proposition Let C and D be categories, let F,G:CD be functors, and for every object cOb(C), let αc:F(c)G(c) be a morphism in D. Suppose given a path c0 f1 c1 f2  fn cn such that for each arrow fi in it, the following naturality square commutes:


Then the naturality square for the composite pfn ○ ⋯ ○ f2f1 : c0cn


also commutes. In particular, the naturality square commutes for every identity morphism idc.

Proof. When n = 0, we have a path of length 0 starting at each cOb(C). It vacuously satisfies the condition, so we need to see that its naturality square


commutes. But this is clear because functors preserve identities.

The rest of the proof follows by induction on n. Suppose q = fn−1 ○ ⋯ ○ f2f1 : c0cn−1 and p = fnq and that the naturality squares for q and for fn commute; we need only show that the naturality square for p commutes. That is, we assume the two small squares commute; it follows that the large rectangle does too, completing the proof.


Example Let C=D=[1] be the linear order of length 1, thought of as a category (by Proposition There are three functors CD, which we can write as (0, 0), (0, 1), and (1, 1); these are depicted left to right as follows:


These are just functors so far. What are the natural transformations say, α : (0, 0) → (0, 1)? To specify a natural transformation, we must specify a component for each object in C. In this case α0 : 0 → 0 and α1 : 0 → 1. There is only one possible choice: α0 = id0 and α1 = f. Now that we have chosen components, we need to check the naturality squares.

There are three morphisms in C, namely, id0, f, id1. By Proposition, we need only check the naturality square for f. We write it twice, once in abstract notation and once in concrete notation:


This commutes, so α is indeed a natural transformation.


With notation as in Example, we have three functors CD, namely, (0, 0), (0, 1), and (1, 1). How many natural transformations are there from F to G, i.e., what is the cardinality of Nat(F, G)

a. when F = (0, 0) and G = (1, 1)?

b. when F = (0, 0) and G = (0, 0)?

c. when F = (0, 1) and G = (0, 0)?

d. when F = (0, 1) and G = (1, 1)?


Let 1 denote the discrete category on one object, Ob(1) = {1}, and let Loop denote the category with one object Ob(Loop)={s} and HomLoop(s,s)= (see Example There is exactly one functor S:1¯Loop. Characterize the natural transformations α : SS.


Let [1] denote the free arrow category,


as in Exercise, and let Loop be as in Example

a. What are all the functors [1] → Loop?

b. For any two functors F, G : [1] → Loop, characterize the set Nat(F, G) of natural transformations FG.


Consider the functor List: SetSet sending a set X to the set List(X) of lists with entries in X. There is a natural transformation List○List → List given by concatenation.

a. If someone said, “Singleton lists give a natural transformation σ from idSet to List,” what might she mean? That is, for a set X, what component σX might she be suggesting?

b. Do these components satisfy the necessary naturality squares for functions f : XY? In other words, given your interpretation of what the person is saying, is she correct?


Let C and D be categories, and suppose that dOb(D) is a terminal object. Consider the constant functor {d}C:CD, which sends each object cOb(C) to d and each morphism in C to the identity morphism idd on d.

a. For any other functor F:CD, how many natural transformations are there F{d}C?

b. Let D=Set, and let d = {☺}, which is a terminal object in Set (see Exercise or Warning If C=[1] is the linear order of length 1, and F:CSet is any functor, what does it mean to give a natural transformation {d}CF?

Application Figure 4.2 showed a finite state machine on alphabet Σ = {a, b}, and Example shows its associated action table. Imagine this was your model for understanding the behavior of some system when acted on by commands a and b. Suppose a colleague tells you he has a more refined model that fits with the same data. His model has six states rather than three, but it is compatible. What might that mean?

Both the original state machine, X, the proposed model, Y, and their associated action tables are shown in Figure 5.1 (see page 247).

How are these models compatible? In the table for Y, if one removes the distinction between states 1A, 1B, 1C and between states 2A and 2B, then one returns with the table for X. The table for Y is more specific, but it is fully compatible with the table for X. The sense in which it is compatible is precisely the sense defined by there being a natural transformation.

Recall that M=(List(),[],++) is a monoid, and that a monoid is simply a category with one object, say, Ob(M)={} (see Section 5.2.1). With Σ = {a, b}, the monoid M can be visualized as follows:


Recall also that a state machine on M is simply a functor MSet. We thus have two such functors, X and Y. A natural transformation α : YX would consist of a component αm for every object mOb(M) such that certain diagrams commute. But M having only one object, we need only one function α : Y(▲) → X(▲), where Y(▲) is the set of (6) states of Y and X(▲) is the set of (3) states of X.

The states of Y have been named so as to make the function α particularly easy to guess.12 We need to check that two squares commute:


This can only be checked by going through and making sure that certain things match, as specified by (5.11); this is spelled out in detail. The columns that should match are those whose entries are written in blue. These correspond to the left bottom composites being matched with the top right composites in the naturality squares of (5.11).



To recap, scientists may often have the idea that two models Y and X are compatible, and such notions of compatibility may be broadly agreed upon. However, these notions can at the same time be challenging to explain to an outsider, e.g., a regulatory body or auditor, especially in more complex situations. On the other hand, it is unambiguous to simply claim “there is a natural transformation from Y to X.” If, in a given domain, the notion of natural transformation captures the essence of compatible models, it may bring clarity.


Let F:CD be a functor. Suppose someone said, “The identity on F is a natural transformation from F to itself.”

a. What might he mean?

b. What components is he suggesting?

c. Are the components natural?


a. He is certainly telling us about a natural transformation α : FF, and he seems to be telling us that it will somehow act like an identity.

b. To give a questionably natural transformation, we need to provide, for every cOb(C) a morphism αc : F(c) → F(c) in D. Since we have in mind the word identity, we could take αc ≔ idF(c) for all c. This is probably what the person means.

c. For α to be natural we need to check that the following square commutes for any f : cc′ in C:


It clearly does commute, so α is natural. This natural transformation α is usually denoted idF : FF.

Example Let [1] ∈ Ob(Cat) be the free arrow category described in Exercise, and let D be any category. To specify a functor F:[1]D requires the specification of two objects, F(v1),F(v2)Ob(D) and a morphism F(e): F(v1) → F(v2) in D . The identity and composition formulas are taken care of once that much is specified. To recap, a functor F:[1]D is the same thing as a morphism in D .

Thus, choosing two functors F,G:[1]D is precisely the same thing as choosing two morphisms in D . Let us call them f : a0a1 and g : b0b1, where we have f = F(e), a0 = F(v0), a1 = F(v1) and g = G(e), b0 = G(v0), b1 = G(v1).

A natural transformation α : FG consists of two components, i.e., morphisms αv0:a0b0 and αv1:a1b1, drawn as dashed lines:


The condition for α to be a natural transformation is that this square commutes.

In other words, a functor [1]D is a morphism in D and a natural transformation between two such functors is just a commutative square in D .

Example Recall that to any graph G we can associate the paths-graph Paths(G) (see Example This is a functor Paths: GrphGrph. There is also an identity functor idGrph : GrphGrph. A natural transformation η : idGrph → Paths would consist of a graph homomorphism ηG : idGrph(G) → Paths(G) for every graph G. But idGrph(G) = G by definition, so we need ηG : G → Paths(G). Recall that Paths(G) has the same vertices as G, and every arrow in G counts as a path (of length 1). So there is an obvious graph homomorphism from G to Paths(G). It is not hard to see that the necessary naturality squares commute.

Example For any graph G we can associate the paths-graph Paths(G), and can do that twice to yield a new graph Paths(Paths(G)). Let’s think through what a path of paths in G is. It is a head-to-tail sequence of arrows in Paths(G), meaning a head-to-tail sequence of paths in G. These composable sequences of paths (or “paths of paths”) are the individual arrows in Paths(Paths(G)). The vertices in Paths(G) and Paths(Paths(G)) are the same as those in G, and all source and target functions are as expected.

Clearly, given such a sequence of paths in G, we could compose them to one big path in G with the same endpoints. In other words, for every G ∈ Ob(Grph), there is graph homomorphism µG : Paths(Paths(G)) → Paths(G) that is called concatenation. In fact, this concatenation extends to a natural transformation


between functors GrphGrph. Example compared a graph to its paths-graph using a natural transformation idGrph → Paths; here we are making a similar kind of comparison.

Remark Example showed that there is a natural transformation comparing each graph to its paths-graph. There is a formal sense in which a category is nothing more than a kind of reverse mapping. That is, to specify a category is the same thing as to specify a graph G together with a graph homomorphism Paths(G) → G. The formalities involve monads (see Section 7.3).


Let X and Y be sets, and let h : XY. There is a functor CX : GrphSet that sends every graph to the set X and sends every morphism of graphs to the identity morphism idX : XX. This functor is called the constant functor at X. Similarly, there is a constant functor CY : GrphSet.

a. Use h to construct the components of a questionably natural transformation α : CXCY.

b. Is α natural?


For any graph (V, A, src, tgt) we can extract the set of arrows or the set of vertices. Since each morphism of graphs includes a function between their arrow sets and a function between their vertex sets, we actually have functors Ar : GrphSet and Ve : GrphSet.

a. If someone said, “Taking source vertices gives a natural transformation from Ar to Ve,” what questionably natural transformation might she be referring to?

b. Is she correct, i.e., is it natural?

c. If a different person, say, from a totally different city and in a totally different frame of mind, were to hear this and say, “Taking target vertices also gives a natural transformation from Ar to Ve,” would they also be correct?

Example (Graph homomorphisms are natural transformations). As discussed (see diagram (5.8)), there is a category GrIn for which a functor G : GrInSet is the same thing as a graph. Namely, we have


A natural transformation of two such functors α : GG′ involves two components, αAr : G(Ar) → G′(Ar) and αVe : G(Ve) → G′(Ve), and two naturality squares, one for src and one for tgt. This is precisely the same thing as a graph homomorphism, as defined in Definition

5.3.2   Vertical and horizontal composition

This section discusses two types of compositions for natural transformations. The terms vertical and horizontal are used to describe them; these terms come from the following pictures:


We use the symbol ○ to denote vertical composition, so we have βα : FH in the left-hand diagram. We use the symbol ◇ for horizontal composition, so we have γ2γ1 : F2F1G2G1 in the right-hand diagram. Of course, the actual arrangement of things on a page of text does not correlate with verticality or horizontality—these are just names. We define them more carefully in the following.   Vertical composition of natural transformations

The following proposition proves that functors and natural transformations (using vertical composition) form a category.

Proposition Let C and D be categories. There exists a category, called the category of functors from C to D and denoted Fun(C,D), whose objects are the functors CD and whose morphisms are the natural transformations,

HomFun(C,D)(F,G)={α:FG|α is a natural transformation}.

Under this setup, there are indeed identity natural transformations and a composition formula for natural transformations, so we have defined a questionable category Fun(C,D). The category laws hold, so it is indeed a category.

Proof. Exercise showed that for any functor F:CD, there is an identity natural transformation idF : FF (its component at cOb(C) is idF(c) : F(c) → F(c)).

Given a natural transformation α : FG and a natural transformation β : GH, we need a composite βα. We propose the transformation γ : FH having components βcαc for every cOb(C). To see that γ is indeed a natural transformation, one simply puts together naturality squares for α and β to get naturality squares for βα.

One proves the associativity and identity laws in Fun(C,D) using the fact that they hold in D.

Notation We sometimes denote the category Fun(C,D) by DC.

Example Recall from Exercise that there is a functor Ob: CatSet sending a category to its set of objects. And recall from Example that there is a functor SetDiscCat sending a set to the discrete category with that set of objects (all morphisms in Disc(S) are identity morphisms). Let P : CatCat be the composition P = Disc ○ Ob. Then P takes a category and makes a new category with the same objects but no morphisms. It is like crystal meth for categories.

Let idCat : CatCat be the identity functor. There is a natural transformation i : P → idCat. For any category C, the component iC:P(C)C is pretty easily understood. It is a morphism of categories, i.e., a functor. The two categories P(C) and C have the same set of objects, namely, Ob(C), so the functor is identity on objects; and P(C) has no nonidentity morphisms, so nothing else needs be specified.


Let art be the category with Ob(D)={A}, and HomD(A,A)={idA}. What is Fun(D,Set)? In particular, characterize the objects and the morphisms.

Notation Recall from Notation that if X is a set, we can represent an element xX as a function {}xX. Similarly, suppose that C is a category and cOb(C) is an object. There is a functor 1¯C that sends 1 ↦ c. We say that this functor represents cOb(C). We may denote it c:1¯C.


Let n ∈ ℕ, and let n be the set with n elements, considered as a discrete category.13 In other words, we write n to mean what should really be called Disc(n). Describe the category Fun(3, 2).

Example Let 1 denote the discrete category with one object (also known as the trivial monoid). For any category C, we investigate the category DFun(C,1¯). Its objects are functors C1¯. Such a functor F assigns to each object in C an object in 1, of which there is one; so there is no choice in what F does on objects. And there is only one morphism in 1, so there is no choice in what F does on morphisms. The upshot is that there is only one object in D, let’s call it F, so D is a monoid. What are its morphisms?

A morphism α : FF in D is a natural transformation of functors. For every cOb(C), we need a component αc : F (c) → F (c), which is a morphism 1 → 1 in 1. But there is only one morphism in 1, namely, id1, so there is no choice about what these components should be: they are all id1. The necessary naturality squares commute, so α is indeed a natural transformation. Thus the monoid D is the trivial monoid; that is, Fun(C,1¯)1¯ for any category C.


Let 0 represent the discrete category on 0 objects; it has no objects and no morphisms. Let C be any category.

a. What is Fun(0¯,C)?

b. What is Fun(C,0¯)?


Let [1] denote the free arrow category as in Exercise, and let GrIn be the graph-indexing category (see (5.8). Draw the underlying graph of the category Fun([1], GrIn).   Natural isomorphisms

Let C and D be categories. We have defined a category Fun(C,D) whose objects are functors CD and whose morphisms are natural transformations. What are the isomorphisms in this category?

Proposition (Natural isomorphism). Let C and D be categories, and let F,G:CD be functors. A natural transformation α : FG is an isomorphism in Fun(C,D) if and only if the component αc : F(c) → G(c) is an isomorphism for each object cOb(C). In this case α is called a natural isomorphism.

Proof. First, suppose that α is an isomorphism with inverse β : GF, and let βc : G(c) → F (c) denote its c component. We know that αβ = idG and βα = idF. Using the definitions of composition and identity given in Proposition, this means that for every cOb(C), we have αcβc = idG(c) and βcαc = idF(c); in other words, αc is an isomorphism.

Second, suppose that each αc is an isomorphism with inverse βc : G(c) → F(c). We need to see that these components assemble into a natural transformation, i.e., for every morphism h : cc′ in C, the right-hand square


commutes. We know that the left-hand square commutes because α is a natural transformation; each square is labeled with a ? or a ✓ accordingly. In the following diagram we want to show that the left-hand square commutes. We know that the middle square commutes.


To complete the proof we need only show that F(h) ○ βc = βc′G(h). This can be shown by a “diagram chase.” We go through it symbolically, for demonstration. The following three equalities come from the three check marks in the (5.14).



Recall from Application that a finite state machine on alphabet Σ can be understood as a functor MSet, where M=List(Σ) is the free monoid generated by Σ. That example also discussed how natural transformations provide a language for changing state machines. Describe what kinds of changes are made by natural isomorphisms.   Horizontal composition of natural transformations

Example (Whiskering). Suppose that M=List(a,b) and M=List(m,n,p) are free monoids, and let F:MM be given by sending [m] ↦ [a], [n] ↦ [b], and [p] ↦ [b, a, a]. An application of this might be if the sequence [b, a, a] were commonly used in practice and one wanted to add a new button just for that sequence.

Recall Application and Figure 5.1, which is reproduced here. Let X:MSet and Y:MSet be the functors, and let α : YX be the natural transformation.


We can compose X and Y with F as in the diagram below


to get functors YF and XF, both of type MSet. These would be as follows:14


The map α is what sent both State 1A and State 1B in Y to State 1 in X, and so on. We can see that the same α works now: the p columns of the tables respect that mapping; that is, they act like [b, a, a] or equivalently [n, m, m]. This is called whiskering. We used α : YX to get a natural transformation YFXF . It is a kind of horizontal composition of natural transformation.

Definition (Whiskering). Let B,C,D, and E be categories, let G1,G2:CD be functors, and let α : G1G2 be a natural transformation. Suppose that F:BC (resp. H:DE) is a functor as depicted here:


Then the prewhiskering of α by F, denoted αF : G1FG2F (resp. the post-whiskering of α by H, denoted Hα : HG1HG2),


is defined as follows.

For each bOb(B) the component (αF)b : G1F(b) → G2F(b) is defined to be αF(b) (resp. for each cOb(C), the component (Hα)c : HG1(c) → HG2(c) is defined to be H(αc)). Checking that the naturality squares commute (in each case) is straightforward.


Suppose given functors BFCGD, and let idG : GG be the identity natural isomorphism. Show that idGF = idGF.


By Definition, for each object bOb(B), the component (idGF)b is the identity morphism (idG)F(b) : G(F(b)) → G(F(b)). But there can be only one identity morphism, so (idG)F(b) = idGF(b) = idGF(b).

Definition (Horizontal composition of natural transformations). Let B, C, and D be categories, let F1,F2:BC and G1,G2:CD be functors, and let α : F1F2 and β : G1G2 be natural transformations, as depicted here:


By pre- and postwhiskering in one order or the other we get the following diagram:


It is straightforward to show that this diagram commutes, so we can take the composition to be the definition of the horizontal composition:


Remark Whiskering a natural transformation α with a functor F is the same thing as horizontally composing α with the identity natural transformation idF . This is true for both pre- and postwhiskering. For example, in the notation of Definition, we have

αF=αidF  and   Hα=idHα.

Theorem (Interchange).


Given a setup of categories, functors, and natural transformations as shown, we have


Proof. One need only observe that each square commutes in the following diagram, so taking either outer path to get (β2β1) ◇ (α2α1) yields the same morphism as taking the diagonal path, (β2α2) ○ (β1α1):



Suppose given categories, functors, and natural transformations as shown:


such that α : FF′ and β : GG′ are natural isomorphisms. Show that βα : GFG′ ○ F′ is a natural isomorphism.


Let α′ : F′ → α and β′ : G′ → G be the inverses of α and β respectively. To check that βα is an isomorphism, we use Theorem (and Exercise to see that


and similarly for the other order, (β′ ◇ α′) ○ (βα) = idGf.

5.3.3   The category of instances on a database schema

Section 5.2.2 showed that schemas are presentations of categories, and Section 5.4 shows that in fact the category of schemas is equivalent to the category of categories. This section therefore takes license to blur the distinction between schemas and categories.

If C is a schema, i.e., a category, then as discussed in Section, an instance on C is a functor I:CSet. But now we have a notion beyond categories and functors, namely, that of natural transformations. So we make the following definition.

Definition Let C be a schema (or category). The category of instances on C, denoted CSet, is Fun(C,Set). Its objects are C-instances (i.e., functors CSet), and its morphisms are natural transformations.

Remark One might object to Definition on the grounds that database instances should not be infinite. This is a reasonable perspective, and the definition can be modified easily to accommodate it. The subcategory Fin (see Example of finite sets can be substituted for Set in Definition One could define the category of finite instances on C as CFin=Fun(C,Fin). Almost all of the ideas in this book will make perfect sense in CFin.

Natural transformations should serve as some kind of morphism between instances on the same schema. How are we to interpret a natural transformation α : IJ between database instances I,J:CSet?

A first clue comes from Application There we considered the case of a monoid M, and we thought about a natural transformation between two functors X,Y:MSet, considered as different finite state machines. The notion of natural transformation captured the idea of one model being a refinement of another. This same kind of idea works for databases with more than one table (categories with more than one object). Let’s work it through slowly.

Example Consider the terminal schema, art. An instance is a functor 1Set, which represents a set (see Notation A natural transformation α : IJ is a function from set I to set J. In the standard table view, we might have I and J as shown here:


There are 343 natural transformations IJ. Perhaps some of them make more sense than others, e.g., we could hope that the numbers in I corresponded to the numbers after the hyphen in J or perhaps to what seems to be the date in January. Knowing something like this would reduce this to only a few options out of 343 possible mappings. But it could be that the rows in J correspond to batches, and all three grapes in I are part of the first batch on Jan-01.

The point is that the notion of natural transformation is a mathematical one; it has nothing to do with the kinds of associations we might find natural, unless we have found a categorical encoding for this intuition.


Recall the notion of set-indexed sets from Definition Let A be a set, and devise a schema A such that instances on A are A-indexed sets. Is our current notion of morphism between instances (i.e., natural transformations) well aligned with this definition of mapping of A-indexed sets?


Definition actually gives us the objects and morphisms of a category, say, the category of A-indexed sets, in that it tells us that the objects and morphisms are merely the A-indexed sets and the A-indexed functions. Let us denote the category of A-indexed sets ASet; this exercise is asking for a category A for which there is an isomorphism

ASet  Fun(A,Set).

And indeed there is. Let A=Disc(A) be the discrete category on A objects. Then a functor S:ASet is just a set S(a) for every aA, and a morphism SS′ is just a component fa : S(a) → S′(a) for each aA. These coincide exactly with the notions of A-indexed set and of mappings between them.

For a general schema (or category) C, let us think through what a morphism α : IJ between instances I,J:CSet is. For each object cOb(C), there is a component αc : I(c) → J(c). This means that just as in Example, there is for each table c a function from the rows in I’s manifestation of c to the rows in J’s manifestation of c. So to make a natural transformation, such a function has to be specified table by table. But then we have to contend with naturality squares, one for every arrow in C. Arrows in C correspond to foreign key columns in the database. The naturality requirement was already covered in Application (see especially how (5.11) is checked in (5.12) and (5.13)).

Example We saw in Section that graphs can be regarded as functors GSet, where GGrIn is the schema for graphs shown here:


A database instance I:GSet on G consists of two tables. Here is an example instance:


To discuss natural transformations, we need two instances. Here is another, J:GSet:


To give a natural transformation α : IJ, we give two components: one for Arrow and one for Vertex. We need to say where each vertex in I goes in J, and we need to say where each arrow in I goes in J. The naturality squares insist that if we specify that gj, for example, then we had better specify that wr and that xs. What a computer is very good at, but a human is fairly slow at, is checking that a given pair of components (arrows and vertices) really is natural.

There are 8000 ways to devise component functions αArrow and αVertex, but precisely six natural transformations, i.e., six graph homomorphisms, IJ; the other 7,994 are haphazard flingings of arrows to arrows and vertices to vertices without any regard to sources and targets. The six are briefly described now. The reader should look at the graph diagrams of I and J while following along.

Every vertex in I has to be sent to some vertex in J, so we think about where to send v and proceed from there.

  • If we try to send v? u, we fail because u touches no arrows, so there is nowhere for f to go. (0)
  • If we send vq, then f must map to i, and w must map to r, and both g and h must map to j, and x must map to s. (1)
  • If we send vr, then there are two choices for g times two choices for h. (4)
  • If we send vs, then there is one way to obtain a graph morphism. (1)
  • If we try to send v? t, we fail as before. (0)

Humans may follow the diagrams better than the tables, whereas computers probably understand the tables better.


If I,J:GSet, as in Example, how many natural transformations are there JI?


Let GrIn be the graph-indexing category, and let YA : GrInSet denote the following instance:


Let I : GrInSet be as in Example

a. How many natural transformations are there YAI?

b. With J as previously, how many natural transformations are there YAJ?

c. Do you have any conjecture about the way natural transformations YAX behave for arbitrary graphs X:GSet?


It is useful to see YA as a graph so we can visualize the graph morphisms YAI or YAJ.


a. A graph morphism YAI amounts to an arrow in graph I. In other words, there is a natural isomorphism


How does this works? What might g mean as a natural transformation YAI?

To give a questionably natural transformation α : YAI, we need to give a component αAr : {a} → {f, g, h} and a component αVe : {v0, v1} → {v, w, x}. Since we have g in mind, let’s put αAr(a) ≔ g. There are 32 choices for αVe, but only one is natural because the two morphisms src, tgt : ArVe demand two naturality equations,


In other words, once we choose αAr(a) to be g, the rest is forced on us. In the same way, we could have chosen αAr(a) to be any of f, g, h, which is why we said Nat(YA, I) ≅ {f, g, h}.

b. There are four, Nat(YA, J) ≅ {i, j, k, }.

In terms of databases, this notion of instance morphism α : IJ on a schema C is sometimes called a database homomorphism. It is related to what is known as provenance, in that it tells us how every row in I relates to a counterpart row in J. More precisely, for every table in C, the morphism α gives a mapping from the set of rows in I’s version of the table to J’s version of the table, such that all the foreign keys are respected. This notion of morphism has excellent formal properties, so projections, unions, and joins of tables (the typical database operations) would be predicted to be interesting by a category theorist who has no idea what a database is.15

5.3.4   Equivalence of categories

We have a category Cat of categories, and in every category there is a notion of isomorphism between objects: one morphism each way, such that each round-trip composition is the identity. An isomorphism in Cat, therefore, takes place between two categories, say, C and D: it is a functor F:CD and a functor G:DC such that GF=idC and FG=idD.

It turns out that categories are often similar enough to be considered equivalent without being isomorphic. For this reason, the notion of isomorphism is considered too strong to be useful for categories, akin to saying that two material samples are the same if there is an atom by atom matching, or that two words are the same if they are written in the same font and size, by the same person, in the same state of mind.

As reasonable as isomorphism is as a notion in most categories, it fails to be the right notion about categories. The reason is that in categories there are objects and morphisms, whereas when we talk about categories, we have categories and functors plus natural transformations. Natural transformations serve as mappings between mappings, and this is not part of the structure of an ordinary category. In cases where a category C does have such mappings between mappings, it is often best to take that extra structure into account, as we do for C=Cat. This whole subject leads to the study of 2-categories (or n-categories, or ∞-categories), not discussed in this book. See, for example, Leinster [25] for an introduction.

The purpose now is to explain this “good notion” of sameness for categories, namely, equivalence of categories, which appropriately takes natural transformations into account. Instead of functors going both ways with round-trips equal to identity, which is required in order to be an isomorphism of categories, equivalence of categories demands functors going both ways with roundtrips naturally isomorphic to identity.

Definition (Equivalence of categories). Let C and C be categories. A functor F:CC is called an equivalence of categories and denoted F:CC16 if there exists a functor F:CC and natural isomorphisms α:idC  FF and α:idC  FF. In this case we say that F and F′ are mutually inverse equivalences.

Suppose we are given functors F:CC and F:CC. We want to know something about the round-trips on C and on C; we want to know the same kind of information about each round-trip, so let’s concentrate on the C side. We want to know something about FF:CC, so let’s name it i:CC; we want to know that i is a natural isomorphism. That is, for every cOb(C), we want an isomorphism αc:c  i(c), and we want to know that these isomorphisms are picked carefully enough that given g : cc′ in C, the choice of isomorphisms for c and c′ are compatible:


To be an equivalence, the same has to hold for the other round-trip, i=FF:CC.


Let C and C be categories. Suppose that F:CC is an isomorphism of categories.

a. Is it an equivalence of categories?

b. If not, why? If so, what are the components of α and α′ (with notation as in Definition


a. Yes.

b. If a functor F:CC is an isomorphism of categories, then there exists a functor F:CC such that FF=idC and FF=idC. We might hope that F and F′ are mutually inverse equivalences of categories as well. We need natural transformations α:idCFF and α:idCFF. But since FF=idC and FF=idC, we can take α and α′ to be the identity transformations. Thus F and F′ are indeed mutually inverse equivalences of categories.

Example Let S be a set, and let S × SS × S be the complete relation on S, which is a preorder KS. Recall from Proposition that there is a functor i : PrOCat, and the resulting category i(KS) is called the indiscrete category on S; it has objects S and a single morphism between every pair of objects. Here is a diagram of K{1,2,3}:


It is easy check that K1, the indiscrete category on one element, is isomorphic to 1, the discrete category on one object, also known as the terminal category (see Exercise The category 1 consists of one object, its identity morphism, and nothing else. Let’s think about the difference between isomorphism and equivalence using KS ∈ Ob(Cat).

The only way that KS can be isomorphic to 1 is if S has one element.17 On the other hand, there is an equivalence of categories


for every set S ≠ ∅. So for example, K{1,2,3} from (5.15) is equivalent to the terminal category, 1.

In fact, there are many such equivalences, one for each element of S. To see this, let S be a nonempty set, and choose an element s0S. For every sS, there is a unique isomorphism ks:s  s0 in KS. Let F : KS1 be the only possible functor (see Exercise, and let F′ : 1KS represent the object s0. Note that F′ ○ F = id1 : 11 is the identity, but that FF′ : KSKS sends everything to s0. So F is not an isomorphism. We need to show that it is an equivalence.

Let α = id1, and define α:idKSFF by αs=ks. Note that αs is an isomorphism for each s ∈ Ob(KS) and that α′ is a natural transformation (hence, a natural isomorphism) because every possible square commutes in KS. This completes the proof, initiated in the preceding paragraph, that the category KS is equivalent to 1 for every nonempty set S and that this fact can be witnessed by any element s0S.

Example Consider the category FLin, described in Example, of finite nonempty linear orders. For every natural number n ∈ ℕ, let [n] ∈ Ob(FLin) denote the linear order shown in Example Define a category Δ whose objects are given by Ob(Δ) = {[n] | n ∈ ℕ} and with HomΔ([m], [n]) = HomFLin([m], [n]). The difference between FLin and Δ is only that objects in FLin may have odd labels, e.g.,

5      x      Sam

whereas objects in Δ all have standard labels, e.g.,

0      1      2

Clearly, FLin is a much larger category, and yet it feels as if it is pretty much the same as Δ. Actually, they are equivalent, FLinΔ. We will find functors F and F′ which witness this equivalence.

Let F′ : ΔFLin be the inclusion; and let F : FLinΔ send every finite nonempty linear order X ∈ Ob(FLin) to the object F(X) ≔ [n] ∈ Δ, where Ob(X) ≅ {0, 1, … , n}. For each such X, there is a unique isomorphism αX : X[n] , and these fit together into18 the required natural isomorphism idFLinF′ ○ F. The other natural isomorphism α′ : idΔFF′ is the identity.


Recall from Definition that a set X is called finite if there exists a natural number n ∈ ℕ and an isomorphism of sets Xn. Let Fin denote the category whose objects are the finite sets and whose morphisms are the functions. Let S denote the category whose objects are the sets n and whose morphisms are again the functions. The difference between Fin and S is that every object in S is one of these n’s, whereas every object in Fin is just isomorphic to one of these n’s.

For every object X ∈ Ob(Fin), there exists an isomorphism pX : Xm¯ for some unique object m¯Ob(S). Find an equivalence of categories FinS.


We say that two categories C and D are equivalent if there exists an equivalence of categories between them. Show that the relation of being equivalent is an equivalence relation on Ob(Cat).

Example Consider the group ℤ2 ≔ ({0, 1}, 0, +), where 1 + 1 = 0. As a category, ℤ2 has one object ▲ and two morphisms, namely, 0, 1, such that 0 is the identity. Since ℤ2 is a group, every morphism is an isomorphism.

Let C=1¯ be the terminal category, as in Exercise One might accidentally believe that C is equivalent to ℤ2, but this is not the case. The argument in favor of the accidental belief is that we have unique functors F:2C and F:C2 (and this is true); the round-trip FF:CC is the identity (and this is true); and for the round-trip F′ ○ F : ℤ2 → ℤ2 both morphisms in ℤ2 are isomorphisms, so any choice of morphism α: ▲ → F′ ○ F(▲) will be an isomorphism (and this is true). The problem is that whatever one does with α, one gets a questionably natural isomorphism, but it will never be natural.

When we round-trip F′ ○ F : ℤ2 → ℤ2, the image of 1: ▲ → ▲ is F′ ○ F(1) = 0 = id. So the naturality square for the morphism 1 looks like this:


where it is undecided whether α is to be 0 or 1. Unfortunately, neither choice works (i.e., for neither choice will the diagram commute) because x + 1 ≠ x + 0 in ℤ2.

Definition (Full and faithful functors). Let C and D be categories, and let F:CD be a functor. For any two objects c,cOb(C), there is a function


guaranteed by the definition of functor. We say that F is a full functor if HomF(c, c′) is surjective for every c,cOb(C). We say that F is a faithful functor if HomF(c, c′) is injective for every c, c′. We say that F is a fully faithful functor if HomF(c, c′) is bijective for every c, c′.


Let 1 and 2 be the discrete categories on one and two objects respectively. There is only one functor F : 21.

a. Is it full?

b. Is it faithful?


Let 0 denote the empty category, and let C be any category. There is a unique functor F:0¯C.

a. For general C, will F be full?

b. For general C, will F be faithful?

c. For general C, will F be an equivalence of categories?

Proposition Let C and C be categories, and let F:CC be an equivalence of categories. Then F is fully faithful.

Sketch of proof. Suppose F is an equivalence, so we can find a functor F:CC and natural isomorphisms α:idCFF and α:idCFF. We need to know that for any objects c,dOb(C), the map


is bijective. Consider the following diagram


One can check that HomC(αc1,αd) is bijective, so the vertical function is surjective by Exercise The fact that HomC((αFC)1,αFD) is bijective implies that the vertical function is injective. Thus we know that HomF′ (Fc, Fd) is bijective. This implies that HomF(c, d) is bijective as well.


Let ℤ2 be the group (as category) from Example Are there any fully faithful functors ℤ21?

5.4   Categories and schemas are equivalent, Cat ≃ Sch

Perhaps it is intuitively clear that schemas are somehow equivalent to categories. In fact, this is a reason that so much attention has been given to databases (and ologs). This section makes the equivalence between schemas and categories precise; it is proved in Section 5.4.2. The basic idea was laid out in Section 5.2.2.

5.4.1   The category Sch of schemas

Recall from Definition that a schema consists of a pair C(G,), where G = (V, A, src, tgt) is a graph and ≃ is a congruence, meaning a kind of equivalence relation on the paths in G (see Definition If we think of a schema as being analogous to a category, what in schema-land should fulfill the role of functors? That is, what are to be the morphisms in Sch?

Unfortunately, one’s first guess may give the wrong idea if we want an equivalence SchCat. Since an object in Sch is a graph with a congruence, one might imagine that a morphism CC in Sch should be a graph homomorphism (as in Definition that preserves the congruence. But graph homomorphisms require that arrows be sent to arrows, whereas we are more interested in paths than in individual arrows—the arrows are merely useful for presentation.

If instead we define morphisms between schemas to be maps that send paths in C to paths in C, subject to the requirements that path endpoints, path concatenations, and path equivalences are preserved, this will turn out to give the correct notion. In fact, since a path is a concatenation of its arrows, it is more concise to give a function F from the arrows of C to the paths of C. This is how we proceed.

Recall from Examples and the paths-graph functor Paths: GrphGrph, the paths of paths functor Paths ○ Paths: GrphGrph, and the natural transformations for any graph G,


The function ηG spells out the fact that every arrow in G counts as a path in G, and the function μG spells out the fact that a head-to-tail sequence of paths (a path of paths) in G can be concatenated to a single path in G.


Let [2] denote the graph 0e11e22, and let Loop denote the unique graph having one vertex and one arrow


a. Find a graph homomorphism f : [2] → Paths(Loop) that is injective on arrows (i.e., such that no two arrows in the graph [2] are sent by f to the same arrow in Paths(Loop)).

b. The graph [2] has six paths, so Paths([2]) has six arrows. What are the images of these arrows under the graph homomorphism Paths(f): Paths([2]) → Paths(Paths(Loop)), where f is the morphism you chose in part (a)?

c. Finally, using μLoop : Paths(Paths(Loop)) → Paths(Loop), a path of paths in Loop can be concatenated to a path. Write what the composite graph homomorphism


does to the six arrows in Paths([2]).

Before we look at the definition of schema morphism, let’s return to the original question. Given graphs G, G′ (underlying schemas C, C) we wanted a function from the paths in G to the paths in G′, but it was more concise to speak of a function from arrows in G to paths in G′. How do we get what we originally wanted from the concise version?

Given a graph homomorphism f : G → Paths(G′), we use (5.16) to form the following composition, denoted simply Pathsf : Paths(G) → Paths(G′):

Paths(G)Paths(f)Paths(Paths(G))μGPaths(G) (5.17)

This says that given a function from arrows in G to paths in G′, a path in G becomes a path of paths in G′, which can be concatenated to a path in G′.

Definition (Schema morphism). Let G = (V, A, src, tgt) and G′ = (V′, A′, src′, tgt′) be graphs, and let C=(G,G) and C=(G,G) be schemas. A schema morphism F from C to D, denoted F:CD, is a graph homomorphism 19


that satisfies the following condition for any paths p and q in G:

if  pG q  then  PathsF(p) G PathsF(q).(5.18)

Two schema morphisms E,F:CC are considered identical if they agree on vertices (i.e., E0 = F0) and if, for every arrow f in G, there is a path equivalence in G


We now define the category of schemas, denoted Sch, to be the category whose objects are schemas as in Definition and whose morphisms are schema morphisms, as in Definition The identity morphism on schema C=(G,G) is the schema morphism idCηG:GPaths(G), as defined in Equation (5.16). We need only understand how to compose schema morphisms F:CC and F:CC. On objects their composition is clear. Given an arrow in C, it is sent to a path in C; each arrow in that path is sent to a path in C. We then have a path of paths, which we can concatenate (via μG : Paths(Paths(G″)) → Paths(G″), as in (5.16)) to get a path in C as desired.


A schema morphism sends vertices to vertices, arrows to paths, and path equivalences to path equivalences.

Example Let [2] be the linear order graph of length 2, at the left, and let C denote the diagram at the right:


We impose on C the path equivalence declaration a[g, h] ≃a[i] and show that in this case C and [2] are isomorphic in Sch. There is a unique schema morphism F:[2]C such that 0 ↦a, 1 ↦b, 2 ↦c; it sends each arrow in [2] to a path of length 1 in C. And we have a schema morphism F:C[2], which reverses this mapping on vertices; note that F′ must send the arrow i in C to the path 0[f1, f2] in [2], which is okay. The round-trip F ′ ○ F : [2] → [2] is identity. The round-trip FF:CC may look like it is not the identity; indeed it sends vertices to themselves and sends i to the path a[g, h]. But according to Definition, this schema morphism is considered identical to idC because there is a path equivalence idC(i)=[i]a[g,h]a=FF(i).


Consider the schema [2] and the schema C pictured in (5.19); this time we do not impose any path equivalence declarations on C, so a[g, h] ≄ a[i] in the current version of C.

a. How many schema morphisms are there [2]C that send 0 to a?

b. How many schema morphisms are there C[2] that send a to 0?


Consider the graph Loop as follows:


and for any natural number n ∈ ℕ, let Ln denote the schema (Loop, ≃n), where ≃n is the PED fn+1fn. Then Ln is the “finite hierarchy of height n” schema of Example Let 1 denote the graph with one vertex and no arrows; consider it a schema.

a. Is 1 isomorphic to L1 in Sch?

b. Is 1 isomorphic to any (other) Ln?


a. No. The schema L1 is the graph Loop with the PED f2 = f, so there is still one nontrivial arrow in L1, namely, f1f0, whereas 1 has only the identity arrow.

b. Yes, there is an isomorphism of schemas 1L0, because ff0 = ids in L0.


Let Loop and Ln be schemas as defined in Exercise

a. What is the cardinality of the set HomSch(L3, L5)?

b. What is the cardinality of the set HomSch(L5, L3)? Hint: The cardinality of the set HomSch(L4, L9) is 8.

5.4.2   Proving the equivalence

This section proves the equivalence of categories, SchCat. We construct the two functors SchCat and CatSch and then prove that these are mutually inverse equivalences (see Theorem

Construction (From schema to category). We first define a functor L : SchCat. Let C=(G,) be a schema, where G = (V, A, src, tgt). Define L(C) to be the category with Ob(L(C))=V, and with HomL(C)(v1,v2)PathG(v,w)/, i.e., the set of paths in G modulo the path equivalence relation for C. The composition of morphisms is defined by concatenation of paths, and part (4) of Definition implies that such composition is well defined. We have thus defined L on objects of Sch.

Given a schema morphism F:CC, where C=( G,), we need to produce a functor L(F):L(C)L(C). The objects of L(C) and L(C) are the vertices of G and G′ respectively, and F provides the necessary function on objects. Diagram (5.17) provides a function PathsF : Paths(G) → Paths(G′) provides the requisite function for morphisms.

A morphism in L(C) is an equivalence class of paths in C. For any representative path p ∈ Paths(G), we have PathsF(p) ∈ Paths(G′), and if pq, then PathsF(p) ≃′ PathsF(q) by condition (5.18). Thus PathsF indeed provides us with a function HomL(C)HomL(C). This defines L on morphisms in Sch. It is clear that L preserves composition and identities, so it is a functor.

Construction (From category to schema). We first define a functor R : CatSch. Let C=(Ob(C),HomC,dom,cod,ids,comp) be a category (see Exercise Let R(C)=(G,), where G is the graph


and with ≃ defined as the congruence generated by the following path equivalence declarations: for any composable sequence of morphisms f1, f2, …, fn (with dom(fi+1) = cod(fi) for each 1 ⩽ in − 1), we put

[f1,f2,...,fn]dom(f1)[fnf2f1]dom(f1), (5.20)

equating a path of length n with a path of length 1. This defines R on objects of Cat.

A functor F:CD induces a schema morphism R(F):R(C)R(D), because vertices are sent to vertices, arrows are sent to arrows (as paths of length 1), and path equivalence is preserved by (7.17) and the fact that F preserves the composition formula. This defines R on morphisms in Cat. It is clear that R preserves compositions, so it is a functor.

Theorem The functors


are mutually inverse equivalences of categories.

Sketch of proof. It is clear that there is a natural isomorphism α:idCatLR; i.e., for any category C, there is an isomorphism CL(R(C)).

Before giving an isomorphism β:idSchRL, we look at R(L(G))( G,) for a schema G=(G,). Write G = (V, A, src, tgt) and G′ = (V ′, A′, src′, tgt′). On vertices we have V = V′. On arrows we have A′ = PathG/≃. The congruence ≃′ for R(L(G)) is imposed in (5.20). Under ≃′, every path of paths in G is made equivalent to its concatenation, considered as a single path of length 1 in G′.

There is a natural transformation β : idSchRL whose G component sends each arrow in G to a certain path of length 1 in G′. We need to see that βG has an inverse. But this is straightforward: every arrow f in RL(G) is an equivalence class of paths in G; choose any one, and have β−1 send f there; by Definition, any other choice will give the identical morphism of schemas. It is easy to show that each round-trip is equal to the identity (again up to the notion of equality of schema morphism given in Definition

Figure 5.1 Finite state machines X and Y with alphabet Σ = {a, b} and three states (left) or six states (right), and their associated action tables.


1In Society of Mind [32].

2The reason for the notation Hom and the word hom-set is that morphisms are often called homomorphisms, e.g., in group theory.

3Full subcategory will be defined in Definition

4See Remark

5See Exercise if there is any confusion about this.

6The name of this morphism is unimportant. What matters is that HomX(x,y) has exactly one element iff xy.

7The topology is given by saying that U ⊆ ℝ is open iff for every xU, there exists ϵ > 0 such that {y ∈ ℝ | |yx| < ϵ} ⊆ U}. One says, “U ⊆ ℝ is open if every point in U has an epsilon-neighborhood fully contained in U.”

8The topology on ℝ × ℝ is similar; a subset U ⊆ ℝ × ℝ is open if every point xU has an epsilon-neighborhood (a disk around x of some positive radius) fully contained in U.

9This example may be somewhat crude, in accordance with the crudeness of my understanding of materials science.

10 Let I × I = {(x, y) ∈ ℝ2 | 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1} denote the square. There are two inclusions i0, i1 : IS that put the interval inside the square at the left and right sides. Two paths f0, f1 : IX are homotopic if there exists a continuous map f : I × IX such that f0 = fi0 and f1 = fi1,

I      i1i0I×I  f    X

11The discipline called information theory, invented by Claude Shannon, is concerned only with ideal compression schemes. It does not pay attention to the content of the messages—what they mean—as Shannon says specifically in his seminal paper: “Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.” Thus I think the subject is badly named. It should be called compression theory or redundancy theory.

Information is inherently meaningful—that is its purpose—so a theory unconcerned with meaning is not really studying information per se. (The people who decide on speed limits for roads and highways may care about human health, but a study limited to understanding ideal speed limit schemes would not be called “human health theory.”)

Information theory is extremely important in a diverse array of fields, including computer science [28], neuroscience [5], [27], and physics [16]. I am not trying to denigrate the field; I only disagree with its name.

12The function α : Y (▲) → X(▲) makes the following assignments: State 0 ↦ State 0, State 1A ↦ State 1, State 1B ↦ State 1, State 1C ↦ State 1, State 2A ↦ State 2, State 2B ↦ State 2.

13When we have a functor, such as Disc : SetCat, we sometimes say, “Let S be a set, considered as a category.” This means that we want to take ideas and methods available in Cat and use them on the set S. Having the functor Disc, we use it to move S into Cat, as Disc(S) ∈ Ob(Cat), upon which we can use the intended methods. However, Disc(S) is bulky, e.g., Fun(Disc(3), Disc(2)) is harder to read than Fun(3, 2). So we abuse notation and write S instead of Disc(S), and talk about S as though it were still a set, e.g., discussing its elements rather than its objects. This kind of conceptual abbreviation is standard practice in mathematical discussion because it eases the mental burden, but when one says “Let S be an X considered as a Y,” the other may always ask, “How are you considering X’s to be Y’s?” and expect a functor.

14The p column comes from applying b, then a, then a, as specified by F.

15More precisely, given a functor between schemas F:CD, the pullback ΔF:DSetCSet, its left ΣF and its right adjoint ΠF constitute these important queries. See Section 7.1.4.

16The notation ≃ has already been used for equivalences of paths in a schema. I do not mean to equate these ideas; I am just reusing the symbol. Hopefully, no confusion will arise.

17One way to see this is that by Exercise, we have a functor Ob: CatSet, and we know by Exercise that functors preserve isomorphisms, so an isomorphism between categories must restrict to an isomorphism between their sets of objects. The only sets that are isomorphic to 1 have one element.

18The phrase “these fit together into” is shorthand for, and can be replaced by, “the naturality squares commute for these components, so together they constitute.”

19By Definition, a graph homomorphism F : G → Paths(G′) will consist of a vertex part F0 : VV′ and an arrows part F1 : E → Path(G′). See also Definition