Sheaves on my mind

So 5am rolls around, and I’m still not asleep.  Of course.  It’s not like I have to be up in 2.5 hours or anything.  My brain is conspiring against me.

Whilst rolling around in bed NOT SLEEPING, my thoughts turned to sheaves: just what are they?  I posted a bit ago (longer than I like) about sheaves, but it was kind of a lame post, caught up in the definition of the objects.  This is a way of remedying that, and a bit more of some dry definitions (you really can’t escape them, unfortunately).

Put simply, I think of a sheaf (of sets) as a means of “observing” or “exploring” a space X.  A topological space is a mathematical object, independent of our puny reality and ordinary means of observation.  Think of a scientist walking around “in/ on” X.  Let’s call him monsieur Faisceau.  Monsieur Faisceau walks around X, recording what he sees in every open set U \subseteq X.  He starts noticing patterns in his observations…like X happens to look red at every open set.  Being a good scientist and having dutifully recorded all of his observations, he sees that the observation of X looking red in every open set agrees “on overlaps”.  That is, whenever he compares the results of his observations, they agree.  He then deduces (and maybe even publishes a paper) that “X looks red”.

This is obviously a vast simplification of what scientists can do (i.e. serve us better than acting as locally constant functions).  Monsieur Faisceau could be taking temperature readings on every open set of X, i.e. compiling a family of continuous functions (or smooth, depending on what he wants/ if X is a smooth manifold) T_i : U_i \to \mathbb{R} that give the temperature of U_i at every point.  Suppose, for the sake of the argument, that \cup U_i = X.  Again, he notices that his results agree on overlaps.  In mathspeak, T_i |_{U_i \cap U_j} = T_j |_{U_i \cap U_j}, for any i,j (in whatever indexing set we’re dealing with).  Being a good lazy intellectual, he doesn’t like keeping the data of each function T_i, so he instead defines a function T : X \to \mathbb{R} via T(x) = T_i(x) if x \in U_i.  This is well defined, since T_i(x) = T_j(x) if x \in U_i and x \in U_j.  He then, of course, publishes a paper about his findings.
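Here is a minimal sketch of Monsieur Faisceau’s gluing step, assuming a finite toy space so that a “function” is just a Python dict.  The names glue, U1, U2, T1, T2 and the temperature values are made up for illustration.

```python
def glue(local_sections):
    """Glue functions given as {open_set: {point: value}} into one global dict.

    Raises ValueError if two local sections disagree on an overlap.
    """
    glued = {}
    for open_set, section in local_sections.items():
        for point in open_set:
            value = section[point]
            if point in glued and glued[point] != value:
                raise ValueError(f"sections disagree at {point!r}")
            glued[point] = value
    return glued


# Hypothetical data: X = {1, 2, 3, 4} covered by U_1 = {1, 2, 3} and U_2 = {3, 4}.
U1 = frozenset({1, 2, 3})
U2 = frozenset({3, 4})
T1 = {1: 20.0, 2: 21.5, 3: 19.0}
T2 = {3: 19.0, 4: 18.5}

# The readings agree on the overlap {3}, so they glue to a single T on X.
T = glue({U1: T1, U2: T2})
print(T)
```

If the two readings at the point 3 disagreed, glue would raise an error instead of quietly picking one, which is exactly the “agree on overlaps” requirement.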

There’s then a theorem that says specifying a family of sheaves on an open cover \{U_i\}_{i \in I} (i.e. F_i is a sheaf on U_i for each i) subject to some nice “gluing” conditions amounts to specifying a unique sheaf defined on all of X (see Hartshorne’s Algebraic Geometry, ex. 22 in section 2.1).  This can then be thought of as a collection of scientists who work together observing X.  They each work on their own open set, and talk to each other whenever they work on overlapping interests.  That is, scientist F_i specializes in studying the open set U_i, and scientist F_j specializes in studying U_j.  Thus they both specialize in studying U_i \cap U_j as well, and any observation made by one scientist on U_i \cap U_j is communicated “isomorphically” to the other scientist about that area.  Obviously, any scientist communicates with himself about his work without having to do any extra work.  Papers published about global results then amount to “joint work” by the scientists.

Another fundamental process to understand is “sheafification” (an AWESOME word, btw).  To understand this, we must understand why presheaves are so lacking.  These can be thought of as people with really poor memory who are walking around in X.  They’re just as smart as a scientist, and thus see all the same things when they look at each open set U \subseteq X.  They just have a hard time remembering their results/ observations, and can’t infer global data by “gluing” local data.  This puts the “forgetful functor” \textbf{Sh}(X) \to \textbf{Psh}(X) (now aptly named) in a new perspective: it takes a scientist and hits him in the head, rendering him unable to remember anything he sees.

Hopefully this makes some sense to other people.  I’ll return with sheafification soon.

Presheaves of Sets are (finitely) Bi-Complete

As the title says, I want to show that for any topological space X, the category of set-valued presheaves PSh(X) on X has all finite limits and co-limits.  


First, PSh(X) has both initial and terminal objects.  With a bit of thought, these are (obviously) the constant functors \textbf{0} and \textbf{1} (resp.), where for all open subsets U of X we have \textbf{0}(U) = \emptyset and \textbf{1}(U) = \{ *\}, and the restriction morphisms for 0 and 1 are just the identities.

Second, we need to show that PSh(X) has a pullback for every diagram A \overset{\varphi}{\to} C \overset{\psi}{\leftarrow} B and a pushout for every diagram A \overset{\alpha}{\leftarrow} C \overset{\beta}{\to} B.  


Indeed, let A \overset{\varphi}{\to} C \overset{\psi}{\leftarrow} B be a diagram of presheaves.  I claim that the presheaf defined via A \times_C B(U) = A(U) \times_{C(U)} B(U) for all U \subseteq X open is the pullback.  Basically, we just define the pullback “element-wise” on the source category.  The morphisms are a bit tricky though.  Indeed, given a map i: V \hookrightarrow U of open sets, how do we define the “restriction” map A \times_C B(i) : A \times_C B(U) \to A \times_C B(V)?  One should never diagram chase in public, but let me assure you that it really ends up just being the map guaranteed by the universal property of the pullback.  Concretely, it sends a pair (a,b) to (a|_V, b|_V), which lands in A(V) \times_{C(V)} B(V) because \varphi and \psi commute with restriction.
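To make the “element-wise” pullback concrete, here is a small sketch on a toy space with just two open sets V \subseteq U.  All of the sets and maps below (A_U, res_A, phi_U and so on) are invented for the example; the only thing being illustrated is that restricting each coordinate of a pair in the pullback over U gives a pair in the pullback over V.

```python
def fiber_product(A, B, f, g):
    """Pairs (a, b) with f[a] == g[b]: the pullback of finite sets."""
    return {(a, b) for a in A for b in B if f[a] == g[b]}


# Hypothetical presheaves A, B, C on two opens V ⊆ U, with restriction maps U -> V.
A_U, A_V, res_A = {"a0", "a1"}, {"p", "q"}, {"a0": "p", "a1": "q"}
B_U, B_V, res_B = {"b0", "b1", "b2"}, {"r", "s"}, {"b0": "r", "b1": "s", "b2": "s"}
# C has C(U) = C(V) = {0, 1} with the identity as restriction.

# Natural transformations phi: A -> C and psi: B -> C, given by their components.
phi_U, phi_V = {"a0": 0, "a1": 1}, {"p": 0, "q": 1}
psi_U, psi_V = {"b0": 0, "b1": 1, "b2": 1}, {"r": 0, "s": 1}

# Pullback computed separately over each open set.
P_U = fiber_product(A_U, B_U, phi_U, psi_U)
P_V = fiber_product(A_V, B_V, phi_V, psi_V)

# Induced restriction map: restrict each coordinate.
res_P = {(a, b): (res_A[a], res_B[b]) for (a, b) in P_U}

# It really lands in the pullback over V.
print(all(image in P_V for image in res_P.values()))
```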




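(picture for clarity).  It’s a bit clumsy, but one gets a pushout in the same manner, defining it element-wise as the quotient A(U) \sqcup B(U)/ \thicksim, where \thicksim is the equivalence relation generated by \alpha(c) \thicksim \beta(c) for all c \in C(U).  (Just asking for a single c with a = \alpha(c) and b = \beta(c) does not give an equivalence relation in general, so we have to take the one it generates.)  This is sometimes denoted as A \sqcup_C B.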
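For finite sets, the pushout quotient can be computed with a tiny union-find.  This is only a sketch over one open set, with made-up sets A, B, C and maps alpha, beta; it identifies alpha(c) with beta(c) for every c and then takes the equivalence classes that this generates.

```python
def pushout(A, B, C, alpha, beta):
    """Pushout of A <-alpha- C -beta-> B in finite sets, as a set of classes."""
    # Tag elements so that the union of A and B is disjoint.
    parent = {("A", a): ("A", a) for a in A}
    parent.update({("B", b): ("B", b) for b in B})

    def find(x):
        # Follow parent pointers up to the representative of x's class.
        while parent[x] != x:
            x = parent[x]
        return x

    # Glue alpha(c) to beta(c) for every c in C.
    for c in C:
        root_a, root_b = find(("A", alpha[c])), find(("B", beta[c]))
        if root_a != root_b:
            parent[root_a] = root_b

    classes = {}
    for x in parent:
        classes.setdefault(find(x), set()).add(x)
    return {frozenset(members) for members in classes.values()}


# Hypothetical example: both c and d hit 1 in A, but hit different points of B.
A, B, C = {1, 2}, {"x", "y"}, {"c", "d"}
alpha = {"c": 1, "d": 1}
beta = {"c": "x", "d": "y"}

result = pushout(A, B, C, alpha, beta)
print(result)
```

Note that x and y end up in the same class even though no single element of C maps to both; they are linked through 1.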


Since PSh(X) has a terminal object and pullbacks, it has all finite limits.  On the other hand, it has an initial object and pushouts, so it has all finite co-limits.  Easy.


It turns out that the category of sheaves on X, Sh(X), is finitely bi-complete as well, but showing it for PSh(X) is really easy so I decided to just do that one.
Next time, I want to look at the following case:  If there is a pair of adjoint functors F: \textbf{C} \to \textbf{D} and G : \textbf{D} \to \textbf{C} with F \dashv G, do we have a pair of adjoint functors between sheaves with values in C and sheaves with values in D on some topological space X?  I’ve only looked at the case where the adjunction is the forgetful functor and free abelian group functor and the case of presheaves, where this statement does in fact hold.  Until next time.  🙂

What are Sheaves, and why should I care?

Anyone who has done a bit of work in modern geometry (primarily with the notion of a (smooth) manifold) knows that we want objects to be “locally” trivial, or easy to study.  The global structure might be this crazy awesome geometric shape, but locally it’s going to look like boring old \mathbb{R}^n or something like that.  How much it’s supposed to “look like” \mathbb{R}^n depends on what you want to study.  For example, a smooth manifold M is a set together with an atlas of “smooth” charts, such that for any point p \in M, there is an open neighborhood U of p that is diffeomorphic to an open subset of \mathbb{R}^n.

The idea is that although the global structure of some object might be hard to study, local behavior should be easy.  Think of looking at say… a torus (doughnut).  For any point on the torus, if you look close enough, it looks pretty much flat.  Even though the global shape is decidedly not flat.

Think now of something like a smooth function on a smooth manifold M, say f: M \to \mathbb{R}.  We don’t really have to define f everywhere, we just have to know that f behaves smoothly with respect to the atlas of M.  That is, for any point p \in M, there is a neighborhood U \ni p, and chart \varphi: \mathbb{R}^n \to U, such that f \circ \varphi is a smooth, real-valued function.

Most people don’t go this deep down the rabbit hole, but there is a unifying principle behind extending local data to global data.  This is given by the notion of a “sheaf.”  Most of the time, people first encounter these things in an algebraic geometry or algebraic topology class, in the context of “cohomology with local coefficients” which are usually abelian groups or something similar.

First, presheaves (of abelian groups) on a topological space X.  A presheaf F on X consists of the data of:

  • For every open set U \subseteq X, an abelian group F(U).
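  • For every inclusion of open sets V \hookrightarrow U, a “restriction” homomorphism \rho_{UV} : F(U) \to F(V), such that \rho_{UU} is the identity and \rho_{VW} \circ \rho_{UV} = \rho_{UW} whenever W \subseteq V \subseteq U.  We write s|_V for \rho_{UV}(s).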
  • F(\emptyset) = 0, the trivial group.

A sheaf is all this, subject to a nice “gluing” condition.  That is:

  • For every open set U \subseteq X and open cover \{U_i\}_{i \in I} of U, if s \in F(U) is such that s|_{U_i} = 0 for all i \in I, then s = 0 \in F(U).
  • For every open set U \subseteq X and open cover \{U_i\}_{i \in I} of U, if s_i \in F(U_i) are sections such that for all i,j \in I we have s_i|_{U_i \cap U_j} = s_j|_{U_i \cap U_j}, then there exists a section s \in F(U) such that s|_{U_i} = s_i for all i \in I.

Note here that the former condition implies that the section s \in F(U) in the latter condition is unique.

That was a bit of a mouthful.  So complicated a definition.  I don’t really like this way of defining it, but it’s okay.

Let’s start again.  Let X be a topological space, and make the category \textbf{X} whose objects are the open sets of X and morphisms are those induced by the obvious poset structure.  Then a presheaf F of abelian groups is just a functor F: \textbf{X}^{op} \to \textbf{Ab}.  Simple!

F is a sheaf if, for every open set U and cover \{U_i\}_{i \in I},

F(U) \to \prod_{i \in I} F(U_i) \rightrightarrows \prod_{(i,j) \in I \times I} F(U_i \cap U_j)

is an equalizer diagram.
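On a finite toy example the equalizer condition can be checked by brute force.  The sketch below assumes F(U) is the set of all functions from U to \{0,1\} (sets rather than abelian groups, to keep it short), and the helper names sections, restrict and is_equalizer are my own.  It checks that F(U) maps bijectively onto the families (s_i) that agree on all overlaps.

```python
from itertools import product


def sections(U, values=(0, 1)):
    """All functions U -> values, each encoded as a frozenset of (point, value)."""
    points = sorted(U)
    return {
        frozenset(zip(points, choice))
        for choice in product(values, repeat=len(points))
    }


def restrict(s, V):
    """Restrict a section to the points lying in V."""
    return frozenset((x, v) for (x, v) in s if x in V)


def is_equalizer(U, cover):
    """Check that F(U) -> prod F(U_i) is a bijection onto compatible families."""
    compatible = {
        family
        for family in product(*(sections(Ui) for Ui in cover))
        if all(
            restrict(si, Ui & Uj) == restrict(sj, Ui & Uj)
            for si, Ui in zip(family, cover)
            for sj, Uj in zip(family, cover)
        )
    }
    images = [tuple(restrict(s, Ui) for Ui in cover) for s in sections(U)]
    injective = len(set(images)) == len(images)
    return injective and set(images) == compatible


# A genuine cover of U = {1, 2, 3} passes the check.
print(is_equalizer({1, 2, 3}, [{1, 2}, {2, 3}]))
# A family that misses the point 3 is not a cover, and the check fails.
print(is_equalizer({1, 2, 3}, [{1, 2}]))
```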

So now we have sheaves.  What are the maps?  Well, the sheaves are just functors, so the obvious choice is that they’re natural transformations of functors.  Hence, we have a category of sheaves!  Denote this by \textbf{Sh}_{\textbf{C}}(X) if the sheaves have values in a category C.  

Why should I care?

\textbf{Sh}_{\textbf{C}}(X) tends to retain a lot of the structure of the category C.  The most encountered example is that \textbf{Sh}_{\textbf{C}}(X) is an abelian category whenever C is (I’ll revisit these neat abelian categories in a later post.  They basically “behave like abelian groups” enough for us to do homological algebra.).   The example I want to pursue is that \textbf{Sh}_{\textbf{C}}(X) is a topos whenever C is (I’ll DEFINITELY do a post on these things later).

They come up everywhere in geometry.  Smooth functions on a smooth manifold?  Sheaf.  Continuous functions on a topological space?  Sheaf.  Measurable functions on a measure space?  Sheaf.   Regular functions on a variety? Sheaf.

I’m still learning this stuff, and I’m continually amazed at how pervasive the idea is.  Turns out that you can also define sheaves on a category by giving the category a certain “topology” called a “Grothendieck topology.”

Wherever there is the study of local vs. global behavior, there is sheaf theory.  Even in physics now, where one studies the structure of “quantum events” via covers of boolean reference frames, or where “locality and contextuality” is the cohomology of sheaves.  So. Fucking. Cool.

Until next time.

Universal Properties IV: Cones and a first look at Limits

Sorry for the delay since my last post (to those who actually read this…)

So I stumbled across a really nice way of looking at universal properties that is equivalent to specifying them as a terminal (or initial) object in a suitable comma category, but it has a much nicer “intuitive feel.”

Cones (and co-Cones)

Let C be some category.  Let \{d_i\}_{i \in I} be a collection of objects of C indexed by some set I, and let \{g_{ij}: d_i \to d_j\}_{i,j \in I} be a collection of morphisms in C (we do not require that there is a morphism for any two i,j, we only allow for the possibility of there being one).  We call this collection of objects and morphisms a diagram  in C.  

Let D be a diagram in C.  A cone on D is a C-object c and collection of morphisms f_i : c \to d_i, such that for all i,j \in I for which g_{ij} exists, g_{ij} \circ f_i = f_j.  If we have two such “cones,” (c, f_i) and (c', f'_i), on this diagram, a morphism of cones is a map h: c' \to c such that the appropriate diagram commutes, i.e. f_i \circ h = f'_i for all i (try to draw it! It’s a good idea to get an intuitive feel for these things).  It follows pretty quickly then that we have a category of cones over D, call it \textbf{C}_D.  A limit of the diagram D is then just a terminal object in \textbf{C}_D.

Similarly, a co-cone is just a cone with all the arrows reversed (i.e. an object c together with maps f_i : d_i \to c for each i, such that f_j \circ g_{ij} = f_i).  A co-limit of the diagram D is an initial object in the category of co-cones over D.


Some examples.  Say we’re working in the category R-Mod for some ring with unity R.

  • Pullbacks:  Let A,B,C be three R-modules, and consider the diagram A \overset{f}{\to} C \overset{g}{\leftarrow} B.  The limit of this diagram is then just the ordinary pullback (or fiber product), the module A \times_C B = \{ (a,b) | f(a) = g(b)\}
  • Products: Let A,B be R-modules.  Then the limit of the “diagram” consisting of just A and B and no morphisms between them is the product A \times B.
  • co-Products: consider the same diagram used for the product.  The co-limit of this is then the co-product of A and B, A \oplus B.
  • Terminal objects: are just the limit of the “empty diagram.”
  • Initial objects: are just the co-limit of the “empty diagram.”

and so on.
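To see the pullback formula in action, here is a quick sketch with R = \mathbb{Z} and the (hypothetical) choice A = \mathbb{Z}/4, B = \mathbb{Z}/6, C = \mathbb{Z}/2, with both maps given by reduction mod 2.  It lists the pairs (a,b) with f(a) = g(b) and checks that they are closed under componentwise addition, as a submodule of A \times B should be.

```python
from itertools import product

Z4, Z6 = range(4), range(6)


def f(a):
    """Reduction Z/4 -> Z/2."""
    return a % 2


def g(b):
    """Reduction Z/6 -> Z/2."""
    return b % 2


# The pullback A x_C B = {(a, b) | f(a) = g(b)}.
P = {(a, b) for a, b in product(Z4, Z6) if f(a) == g(b)}


def add(p, q):
    """Componentwise addition in Z/4 x Z/6."""
    return ((p[0] + q[0]) % 4, (p[1] + q[1]) % 6)


closed = all(add(p, q) in P for p in P for q in P)
print(len(P), closed)
```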

Having “Finite (co-) Limits”

Notice that all the above limits and co-limits were taken over a “finite” diagram.  That is, there were only finitely many objects and morphisms in each diagram.  Such (co-)limits are referred to as “finite” (co-)limits (I wonder why…).  It turns out that it’s a highly desirable property for a category to “have all (finite) (co-)limits.”  It took me a lonnnnnggg time to grok this.

Remember when we first started talking about universal properties?  When you specify that an object satisfies a certain universal property, it is unique up to unique isomorphism if it actually exists.  These objects don’t have to exist.  The property of having, say, all finite limits or co-limits means that whenever you specify a universal property for an object with a finite diagram, that object actually exists.  It’s a theorem (that I don’t currently know how to prove) that a category C with a terminal object and all pullbacks has all finite limits.  Dually, if C has an initial object and all pushouts, it has all finite co-limits. Is this so unreasonable?  Look at the list of examples again.

Back?  Good.  Suppose we’ve got all pullbacks and a terminal object, call it 1.  Then the product is just the limit of the pullback diagram A \to \textbf{1} \leftarrow B.  The equalizer of two parallel maps f,g : A \to B is the pullback of A \overset{(f,g)}{\to} B \times B \overset{\Delta}{\leftarrow} B, where \Delta is the diagonal map (pulling back f along g directly gives the pairs (a,a') with f(a) = g(a'), which is too big).  The kernel of f: A \to B (we’re still working with R-modules) is the pullback of A \overset{f}{\to} B \leftarrow 0, where 0 is the zero module.  Get the picture?
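Here is a rough sketch, in finite sets rather than R-modules, of how pullbacks plus a terminal object recover other limits.  The sets and maps are invented for the example.  The product is the pullback over a one-point set, and one standard way to get the equalizer is to pull the pairing (f,g) back along the diagonal.

```python
def pullback(A, B, f, g):
    """Pairs (a, b) with f(a) == g(b)."""
    return {(a, b) for a in A for b in B if f(a) == g(b)}


A = {0, 1, 2, 3}
B = {"x", "y"}


def to_point(_):
    """The unique map to the terminal (one-point) set."""
    return "*"


# Product A x B as the pullback of A -> 1 <- B.
product_AB = pullback(A, B, to_point, to_point)

# Two parallel maps f, g: A -> {0, 1}.
codomain = {0, 1}


def f(a):
    return a % 2


def g(a):
    return 0


# Equalizer: pull back (f, g): A -> codomain x codomain along the diagonal.
equalizer = {
    a
    for (a, b) in pullback(A, codomain, lambda a: (f(a), g(a)), lambda b: (b, b))
}
print(len(product_AB), equalizer)
```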

A pretty good thing to try here would be to find out how these are equivalent to universal properties.  So go try that.  🙂

Universal Properties III: Bringing it all together

So last time I mentioned that we could describe the kernel of a group homomorphism via a universal property.  For example, let \varphi: G \to H be a group homomorphism, and let D be the full subcategory of Grp consisting of all groups K such that for any group homomorphism f: K \to G, the composition \varphi \circ f = 0_H is the zero homomorphism from K to H.  Good.  Now if A is the category with one object, and S : \textbf{A} \to \textbf{Grp} is the functor with S(*) = G, and U: \textbf{D} \to \textbf{Grp} is the inclusion functor, then the terminal object in the comma category (U \downarrow G) is the kernel of \varphi!  Simple.
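The “terminal” part can be sanity-checked by brute force on a small example.  In this sketch (all choices are mine) \varphi : \mathbb{Z}/6 \to \mathbb{Z}/3 is reduction mod 3, K is its usual kernel with i_K the inclusion, and K' = \mathbb{Z}/2 maps into G by k \mapsto 3k.  We then count the maps g: K' \to K with i_K \circ g = i_{K'}.

```python
from itertools import product

G = range(6)


def phi(g):
    """The homomorphism Z/6 -> Z/3 given by reduction mod 3."""
    return g % 3


# The usual kernel, sitting inside G via the inclusion i_K.
K = sorted(g for g in G if phi(g) == 0)

# Another group K' = Z/2 with a map into G that phi also kills.
K_prime = [0, 1]


def i_K_prime(k):
    return (3 * k) % 6


killed = all(phi(i_K_prime(k)) == 0 for k in K_prime)

# Every function K' -> K, and those that factor i_K' through the inclusion.
candidates = [dict(zip(K_prime, images)) for images in product(K, repeat=len(K_prime))]
factorizations = [g for g in candidates if all(g[k] == i_K_prime(k) for k in K_prime)]
print(killed, factorizations)
```

Exactly one candidate survives, which is the uniqueness that makes the kernel terminal among such pairs.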

If you can understand all that, then it shouldn’t be too hard to see that the cokernel has a similar description.  Cokernels are a bit more annoying to deal with when we’re just talking about ordinary groups (the image of a homomorphism is not necessarily a normal subgroup of the target group).  Let’s then just restrict our attention to Ab, where things are much nicer.   Ordinarily, we would define the cokernel of a homomorphism \varphi: G \to H as the quotient group H/ Im(\varphi).    As before, let D be the full subcategory of Ab such that for any group C in D, we have that for any homomorphism f: H \to C, the composition f \circ \varphi = 0_C, the homomorphism that sends everything in G to 0 in C.  Let U : \textbf{D} \to \textbf{Ab} be the inclusion functor, and S: \textbf{A} \to \textbf{Ab} the functor from the category with one object with value S(*) = H.  Then the cokernel is the initial object in the comma category (H \downarrow U).

If we’re in a “nice” category, like the category Ab of abelian groups, then the image of a group homomorphism \varphi: G \to H has a particularly cool “set free” definition.  Recall that when we defined the cokernel of a homomorphism, the object is actually a pair (C,f), where f: H \to C is a homomorphism and C is the object that we normally think of as a cokernel.  Since f is a group homomorphism, we can ask “what is the kernel of f?”  It’s the image of \varphi! You should check this for yourself, but it’s pretty mechanical if you know the definition of what the cokernel is and have been following along.  This property is often expressed as “The image is the kernel of the cokernel of \varphi.”
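A tiny numerical check of “the image is the kernel of the cokernel,” under assumptions of my own choosing: \varphi : \mathbb{Z}/4 \to \mathbb{Z}/6 is g \mapsto 3g, whose image is \{0,3\}, so the cokernel \mathbb{Z}/6 / \{0,3\} can be realised as \mathbb{Z}/3 with quotient map h \mapsto h mod 3.

```python
G, H = range(4), range(6)


def phi(g):
    """The homomorphism Z/4 -> Z/6 sending g to 3g."""
    return (3 * g) % 6


image = {phi(g) for g in G}


def coker_map(h):
    """Quotient map H -> H / image, realised as Z/6 -> Z/3.

    This identification is specific to this example, where image = {0, 3}.
    """
    return h % 3


kernel_of_cokernel = {h for h in H if coker_map(h) == 0}
print(image, kernel_of_cokernel)
```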

Of course, there is a much more “involved” definition of the image of a morphism for when we don’t have things like kernels or cokernels to play with.  I don’t really like it as much, but it follows the same basic idea of being an initial object in a certain comma category.

If we’re back in Ab, and have the same group homomorphism \varphi: G \to H, what would the “cokernel of the kernel” be?  What would it mean for the “cokernel of the  kernel” and the “kernel of the cokernel” of \varphi to be isomorphic, and how does this relate to the first isomorphism theorem for groups?

Universal Properties II: Comma Categories

In my last post, I spent a good bit trying to get you interested in looking at universal properties.  Hopefully, you’ve read that post, and are still sufficiently interested to continue, because it’s only going to get harder before we see the light.

We left off at defining these special objects in some category C called “initial” and “terminal” objects.  Go read the previous post now if you need a refresher on what they are.

Back now?  Good.  The next object of study is called a comma category, a category that, in a sense, examines a particular category by looking at certain kinds of morphisms in it.  Take that with a grain of salt, please.  Formally, if we have three categories A,B, C, and functors S: \textbf{A} \to \textbf{C} and T: \textbf{B} \to \textbf{C}, the comma category (S \downarrow T) is the category where

  • The objects are triples (\alpha, \beta, f) with \alpha \in \text{Ob}(\textbf{A}),  \beta \in \text{Ob}(\textbf{B}), and f: S(\alpha) \to T(\beta) is a morphism in C.
  • The morphisms are pairs (g,h): (\alpha, \beta, f) \to (\alpha',\beta',f') with g: \alpha \to \alpha' in A and h: \beta \to \beta' in B, such that T(h) \circ f = f' \circ S(g).
  • Composition of morphisms is done component-wise.  Thus if (g,h) : (\alpha,\beta,f) \to (\alpha',\beta',f') and (g',h') : (\alpha',\beta',f') \to (\alpha'',\beta'',f''), then (g',h') \circ (g,h) := (g' \circ g, h' \circ h).

Now, as far as I’ve seen, one most often comes across comma category theory through a select few vast simplifications.

The Slice Category

Let A = 1, the category with only one object (usually denoted *) and one morphism, the identity map.  Then a functor from 1 to any other category C simply “picks out” an object of C.  That is, S : \textbf{1} \to \textbf{C} is uniquely determined by the image of *, say S(*) = X.

In the definition of a comma category, we need three categories.  Let C be any category, and suppose B = C.  Let Id_C: \textbf{C} \to \textbf{C} be the identity functor.  Then our three categories are 1, C, and C.  The comma category (S \downarrow Id_C) is most often written as X/ C, and is called the slice category and can be seen as the category of “objects of C ‘under’ X.”  (Be warned that many authors call X/ C the co-slice or “under” category and reserve “slice category” for \textbf{C}/X; I’ll stick with the names used here.)  Specifically:

  • The objects of X/C are triples (*, \beta, f), with \beta an object of C and f: X \to \beta a morphism in C (here S(*) = X and Id_C(\beta) = \beta).  The objects are usually simplified to (\beta,f), since * is the only object in 1.
  • The morphisms are F: (\beta, f) \to (\gamma, g), with F: \beta \to \gamma a morphism in C such that F \circ f = g.
  • Composition is defined in the only natural way (it’s a trivial exercise to check).

One can, of course, define the co-slice category which is the same as the slice category, except the directions of all the arrows are reversed.  These are the “objects ‘over’ X.”

Remember the category of “pointed topological spaces” from before?  It turns out that this is actually a comma category!  Let S : \textbf{1} \to \textbf{Top} be the functor with value S(*) = p, where p = \{pt\} is any singleton space.  Then the category p/ \textbf{Top} has objects (X, f) with X a topological space and f: p \to X an inclusion of a point into X.  We can then make the obvious identification (X,f) \cong (X, f(pt)).  The morphisms here are precisely the basepoint preserving ones.

Here’s another cool example: Let C be a category with an initial object x.  Then I want to show that x/ \textbf{C} is “isomorphic as a category” to C.  I haven’t yet defined what that means, sorry.  It just means that there are functors U: x/ \textbf{C} \to \textbf{C} and T: \textbf{C} \to x/ \textbf{C} such that T \circ U = Id_{x/\textbf{C}} and U \circ T= Id_{\textbf{C}}.  Anyway, let U: x/\textbf{C} \to \textbf{C} be the functor that sends each pair (\beta, f) to the C-object \beta, and each morphism h: (\beta, f) \to (\beta',f') to the map h: \beta \to \beta'.  This is another instance of a “forgetful functor,” by the way.

Since x is an initial object, for any other C-object \beta there is one and only one morphism f: x \to \beta.  With this in mind, we define T: \textbf{C} \to x/\textbf{C} via T(\beta) = (\beta,f: x \to \beta), and for any morphism g: \beta \to \beta', T(g) = g : (\beta, f) \to (\beta',f').  It’s then trivial to check that these functors compose to get the identity functors on both sides.  Therefore they are isomorphic.

Obviously the dual statement holds for categories C with a terminal object y and the co-slice category \textbf{C}/y.  (note: I owe these above cool examples to this fantastic post.  You should really visit this guy’s blog.)

Almost Slice Categories

Let’s step up the abstraction a bit.  Let C and D be two categories, and let U: \textbf{D} \to \textbf{C} be a functor.  Let S: \textbf{1} \to \textbf{C} be the functor that picks out a C-object X.  Then the comma category (S \downarrow U), written most often as (X \downarrow U), is the category of “morphisms from X to U” (so sayeth the wiki page).  You can think of these as (almost) slice categories, in that X is now “over” objects of the form U(\beta) for \beta an object in D instead of just all C-objects.

Remember the example of the kernel of a group homomorphism \varphi: G \to H? We can now almost talk about that whole business of “the largest group that is killed off by \varphi.”  Let D be the subcategory of Grp whose objects are groups K such that for any group homomorphism f: K \to G, the composition \varphi \circ f is the zero map to H.  The morphisms are simply those induced by the parent category Grp.

Then if we let S: \textbf{1} \to \textbf{Grp} pick out G, and U : \textbf{D} \to \textbf{Grp} be the functor that sends each object and morphism of D to itself, then the comma category (U \downarrow S), written (U \downarrow G), is the category that simply “pairs off” groups K and morphisms i_K : K \to G such that \varphi \circ i_K = 0_H.  (Note the order: the arrows now go into G rather than out of it.)

What would a terminal object be in (U \downarrow G)? 🙂  Try to find it!

Universal Properties: a Prelude

So I want to take some time to talk about universal properties.  I personally think they’re awesome because if you look hard enough, you start to see them everywhere in mathematics.  Especially in abstract algebra and algebraic geometry.  They admit a fairly intuitive explanation, but the actual details of their definition require a lot of work.  A lot.

Universal properties are used to define certain “special” objects in a category.  That is, a universal property, in a sense, picks out the “best possible” object in a category that satisfies a certain property.  Any other object that satisfies this property then has a morphism to/from (depending on the type of property) this “best possible” object.  The nicest thing is then that an object satisfying a universal property is unique up to unique isomorphism.  Yes, there are two “uniques” there.

For example (still working intuitively), let’s look at the kernel of a group homomorphism.  If \varphi: G \to H is a group homomorphism, then we usually define the kernel to be the normal subgroup \text{Ker}\varphi = \{ g \in G | \varphi(g) = 0\} of G.  Easy enough.  This also relies on the fact that groups are also sets.  Another way of looking at the kernel: It is the largest group K that is “killed off” by \varphi,  i.e. if i_K : K \to G is the inclusion homomorphism, then \varphi \circ i_K = 0_H, the map that sends everything in K to 0 in H.  The same idea holds for the cokernel of \varphi, and tons of other special objects that you’ve undoubtedly run into before.

The technical details of the universal property are in the “largest group such that…” part.  Here, this means that the kernel is a group K together with a homomorphism i_K : K \to G satisfying \varphi \circ i_K = 0_H, such that for any other group K' and homomorphism i_{K'} : K' \to G with \varphi \circ i_{K'} = 0_H, there is a unique homomorphism g: K' \to K such that i_{K'} = i_K \circ g.

Yes, that is a bit wordy, it’s not just you.  The point of this post (and probably the next as well), is to unravel that mess of words into something tractable.  The first step to understanding these properties is to understand initial and terminal objects.

Initial and Terminal Objects

These aren’t too bad.  Let C be some category.  An initial object in C is an object a such that for any other object b, there is a unique morphism a \to b.  A simple example to keep in mind is the empty set in the category Set.  Since the empty set is a subset of every set, it admits a unique map into any set (there’s only one way to send nothing to nothing!).

A terminal object in C is an object a such that for any other object b, there is a unique morphism b \to a.  Again, a simple example is any singleton set in the category of sets.  To see this, if X is any set, there is a unique map that sends everything in X to the singleton set.  I’m lying a tiny bit here (in what the “universal” object is), and I’ll tell you later what it was.  Is “the” singleton set unique?  Of course not.  It is, however, unique up to a unique isomorphism.
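If you don’t believe the counting, here is a brute-force sketch for finite sets.  The helper all_functions is my own; it lists every function between two finite sets as a dict, so we can just count them.

```python
from itertools import product


def all_functions(domain, codomain):
    """Every function domain -> codomain, each encoded as a dict."""
    domain = list(domain)
    return [
        dict(zip(domain, images))
        for images in product(list(codomain), repeat=len(domain))
    ]


# Exactly one map out of the empty set (the empty function): initial object.
from_empty = all_functions(set(), {1, 2, 3})
# Exactly one map into a singleton: terminal object.
to_singleton = all_functions({1, 2, 3}, {"*"})
# By contrast, nothing maps from a nonempty set into the empty set.
to_empty = all_functions({1, 2}, set())

print(len(from_empty), len(to_singleton), len(to_empty))
```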

The lesson here is that initial objects capture the idea of “the most efficient” or “the smallest such that” and that terminal objects capture the idea of  “the largest such that.”  The key, however, is to define them in the right category.

Functors and Natural Transformations!

Hello again!  Last time I got to talking about these mathematical things called “categories.”  If you’ve ever taken a class in higher math, whatever that means,  you should know by now that whenever we define a new mathematical object, the next step is to define what it means to talk about “functions” between them.  In abstract algebra, these are our “- homomorphisms;” in topology, our continuous functions, etc.  You get the idea.  What would it mean to then talk about a “structure-preserving function” between two categories, say C and D?  Remember, we have to deal with the objects AND the morphisms in both categories.  A “function” from C to D should therefore send objects of C to objects of D, and morphisms in C to those in D.  Obviously, these “functions” of objects and arrows can’t be completely independent of each other if they’re going to be useful in any way (i.e. preserve stuff like function composition and things like the identity map).

Cutting to the chase, these categorical functions are called functors.  You’ve actually probably seen these before (if you’re a math/ physics major/ person who has played with abstract algebra).  Let’s play around in groups (i.e. Grp).  Recall that a group is a set with a certain operation defined on it, and that group homomorphisms are set functions that respect the group operations of the domain and codomain.  Thus for each group G, we can associate the set U(G) which is just the underlying set of G.  Similarly, we can associate to every group homomorphism \varphi: G \to H the set function U(\varphi) : U(G) \to U(H).  For example, if we take G = \mathbb{Z}/3\mathbb{Z} = \{0,1,2\} (technically I should write these as cosets, but it’s all the same up to isomorphism anyway), the cyclic group of order 3, then U(G) = \{0,1,2\}.  Clearly, if f_1 : G \to H and f_2 : H \to K are two group homomorphisms, it follows that U(f_2 \circ f_1) = U(f_2) \circ U(f_1), so that U respects “compositions of arrows.” Also, remember those identity maps for every object?  If G is any group and 1_G: G \to G the identity homomorphism, U(1_G) = 1_{U(G)}, so that U respects the identity map.  Thus U is a pretty convenient thing.

Now specifically, a (covariant) functor F: \textbf{C} \to \textbf{D} associates to every object a of C an object F(a) of D, and to every morphism f: a \to b of C a morphism F(f): F(a) \to F(b) of D.  Furthermore, we need to have for every pair of composable morphisms f and g of C, F(f \circ g) = F(f) \circ F(g).   Lastly, for every object a in C we have F(1_a) = 1_{F(a)}.  We say that F is contravariant if it instead reverses arrows, so that F(f): F(b) \to F(a) and (for the same f and g in the last sentence) F(f \circ g) = F(g) \circ F(f).
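If you program, you have probably met a functor already.  As an illustration that is not from the examples in this post, the “list” construction sends a type to lists of that type and a function f to “apply f to every entry.”  The sketch below (names fmap and compose are my own) spot-checks the two functor laws on one input; it is a test on sample data, not a proof.

```python
def fmap(f):
    """Action of the list functor on a morphism: apply f to every entry."""
    return lambda xs: [f(x) for x in xs]


def compose(g, f):
    """Ordinary composition: (g . f)(x) = g(f(x))."""
    return lambda x: g(f(x))


def identity(x):
    return x


def f(n):
    return n + 1


def g(n):
    return 2 * n


xs = [1, 2, 3]

# F(g . f) versus F(g) . F(f)
lhs = fmap(compose(g, f))(xs)
rhs = compose(fmap(g), fmap(f))(xs)
# F(1_a) versus 1_{F(a)}
identity_ok = fmap(identity)(xs) == xs

print(lhs, rhs, identity_ok)
```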

Sooo now that we have this definition, it’s immediate that the “function” U : \textbf{Grp} \to \textbf{Set} is a covariant functor, which is called the underlying set functor or the forgetful functor (which is a much cooler name, let’s be honest).   One can, of course, do the same thing for the category of rings, of topological spaces, etc.

Some other interesting examples are

  • The fundamental group functor \pi_1: \textbf{Top}_\cdot \to \textbf{Grp} that sends each pointed topological space (X,x_0) to its fundamental group \pi_1(X,x_0).  If f: (X,x_0) \to (Y,y_0) is a continuous (basepoint preserving) function, then \pi_1(f) = f_* is just the pushforward map, i.e. it sends loops \gamma in X based at x_0 to loops f \circ \gamma in Y with basepoint y_0 = f(x_0).
  • The dual vector space functor D : \textbf{Vect}_F \to \textbf{Vect}_F that sends each vector space V (over some field F) to its algebraic dual \text{Hom}_F(V,F).  If \varphi : V \to W is a linear transformation, then D(\varphi) = \varphi^* : \text{Hom}_F(W,F) \to \text{Hom}_F(V,F) is the linear transformation that sends each functional f: W \to F to \varphi^*(f) = f \circ \varphi.  Note that this functor is contravariant: it reverses the direction of arrows.
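In coordinates, the dual of a linear map is given by the transpose matrix, so the arrow-reversing behaviour of the dual space functor shows up as (BA)^T = A^T B^T.  Here is a small sketch with plain lists of lists; the matrices A_mat and B_mat are arbitrary examples of mine.

```python
def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def transpose(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*X)]


# phi: F^3 -> F^2 and psi: F^2 -> F^4, as matrices acting on column vectors.
A_mat = [[1, 2, 0],
         [0, 1, 3]]
B_mat = [[1, 0],
         [2, 1],
         [0, 1],
         [1, 1]]

# D(psi . phi) versus D(phi) . D(psi): the order of composition flips.
lhs = transpose(matmul(B_mat, A_mat))
rhs = matmul(transpose(A_mat), transpose(B_mat))
print(lhs == rhs)
```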

You can even compose functors, if the source/ target categories match up.  As an example, if D is the dual space functor from above, the functor D^2 = D \circ D: \textbf{Vect}_F \to \textbf{Vect}_F sends each vector space to its double dual.

As mathematicians are wont to do, whenever we define some kind of mathematical object, the next step is (just about always) to define some concept of morphisms between them.  E.g. groups and group homomorphisms, smooth manifolds and smooth maps, topological spaces and continuous functions, etc.  So now suppose we have two functors F,G : \textbf{C} \to \textbf{D} between two categories C and D.  We say that \eta : F \rightsquigarrow G (always use the squiggle arrow, it’s much cooler) is a natural transformation if for every object a in Ob(C) there is a morphism \eta_a: F(a) \to G(a) in D such that for any morphism f: a \to b in C,

\eta_b \circ F(f) = G(f) \circ \eta_a

This is best shown with a commutative diagram (I guess I’ll have to do a post on those at some point too…).  Those maps \eta_a are called the components of the natural transformation.  If all the \eta_a are isomorphisms, then we say that \eta is a natural isomorphism.
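If the naturality equation feels slippery, a small computational sanity check may help.  The sketch below takes F to be a "list" functor and G to be a "maybe" functor (modelled here, as an assumption of this example, by tuples of length 0 or 1), with \eta given by "take the first entry if there is one".  All names (list_map, maybe_map, safe_head) are made up for the illustration, and a couple of sample inputs do not prove naturality.

```python
def list_map(f):
    """F(f): apply f to every entry of a list."""
    return lambda xs: [f(x) for x in xs]


def maybe_map(f):
    """G(f): apply f inside a 'maybe' value, modelled as a tuple of length 0 or 1."""
    return lambda m: tuple(f(x) for x in m)


def safe_head(xs):
    """Component eta_a: F(a) -> G(a). First entry wrapped in a 1-tuple, or () if empty."""
    return (xs[0],) if xs else ()


f = len  # an arrow f: str -> int

samples = [["sheaf", "stalk", "germ"], []]

# Naturality square: eta_b ∘ F(f) == G(f) ∘ eta_a, checked on each sample.
results = [
    (safe_head(list_map(f)(xs)), maybe_map(f)(safe_head(xs))) for xs in samples
]

print(results)  # [((5,), (5,)), ((), ())]
```

Notice that safe_head never looks at what the entries are, only at where they sit; that indifference to the object a is what makes the same recipe work for every set at once.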

So I feel obliged to make a quick remark here. People often say that abstract algebra (and category theory) are especially hard because there are just sooo many definitions to remember.  This is in fact the case. Suck it up.  Reading a sophisticated piece of work requires a large vocabulary, regardless of the discipline.

If you think about it, for any category C, there is an identity functor that just sends every object to itself and every morphism to itself.  Usually, people just write this as Id: \textbf{C} \to \textbf{C} or as I : \textbf{C} \to \textbf{C}.

Ever heard that any (finite dimensional) vector space is naturally isomorphic to its double dual?  What’s really going on here is that there is a natural isomorphism of functors \delta : Id \rightsquigarrow D^2 on the full subcategory (of all vector spaces over a field F) of finite dimensional vector spaces over that field.  Here, the component of the natural transformation at a vector space V, \delta_V : V \to V^{**}, is the linear transformation that sends each v \in V to the element v^{**} \in V^{**} defined by v^{**}(f) = f(v) (where f: V \to F is a linear functional).  Just a bit of work shows that, when V is finite dimensional, this is actually an isomorphism.
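Here is a rough sketch of the "evaluation" map and its naturality in Python, with vectors as tuples of numbers and functionals as ordinary Python functions.  The names dual and ev, and the particular maps phi and g, are assumptions made for this example; it checks the naturality square D^2(\varphi) \circ \delta_V = \delta_W \circ \varphi on a single vector and functional, nothing more.

```python
def dual(phi):
    """D(phi): send a functional f on the target of phi to f ∘ phi."""
    return lambda f: (lambda v: f(phi(v)))


def ev(v):
    """The double-dual element v^{**}: evaluate a functional at v."""
    return lambda f: f(v)


# A linear map phi: R^2 -> R^3 and a linear functional g on R^3.
phi = lambda v: (v[0] + v[1], 2 * v[0], v[1])
g = lambda w: w[0] - w[1] + 4 * w[2]

v = (3, 5)

# D^2(phi) is obtained by applying the dual construction twice.
double_dual_phi = dual(dual(phi))

left = double_dual_phi(ev(v))(g)   # (D^2(phi) ∘ delta_V)(v), evaluated at g
right = ev(phi(v))(g)              # (delta_W ∘ phi)(v), evaluated at g

print(left, right)  # 22 22
```

Both sides unwind to g(\varphi(v)), which is really the whole proof of naturality; no basis was chosen anywhere, and that is the content of the word "natural".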

Stepping up the abstraction (again):

So we’ve defined functors, natural transformations, and categories.  We can actually go pretty meta and define categories like “the category of (small) categories” (I’ll define small at some other time; it’s a technical condition that lets us get around things like Russell’s Paradox), and things like “the category of functors between two categories”, functors which are paired in a really useful way (an adjunction), etc.  There’s a ton to explore.

Hello & Here we go

Hello!  I’m a terrible writer, so I’m going to dive right in. My selfish goal is to gain a thorough understanding of category theory, but that road is not a straight shot.  It requires a great deal of knowledge and experience from all of mathematics to really grok many of the abstract methods employed.  This blog is my way of keeping track of the myriad of examples and topics that motivate these ideas.


Setting the setting (prologue?).  If I’m going to talk about category theory, I should probably say what a category “is”.  A category C, intuitively, consists of a collection of “objects” with similar properties and a collection of “arrows” (also often called “morphisms”) between objects.  For some examples, think of:

  • The category Set with objects being sets and arrows being functions between sets.
  • The category Grp with objects being groups and arrows being group homomorphisms.
  • The category Top with objects being topological spaces and arrows being continuous functions between topological spaces.
  • If R is a ring, then the category R-Mod is the category with objects left R-modules and arrows R-module homomorphisms. (This includes things like the category of vector spaces over a field.)
  • Any partially ordered set (C,\leq) can be turned into a category as well.  The objects of this new category are the elements of C, and for any two objects x and y of C, there is exactly one arrow from x to y if x \leq y, and no arrow otherwise.

and so on.  You get the idea.   Formally, C consists of a class Ob(C) of objects and a class Hom(C) of arrows, such that

  • Each arrow f has a unique source object a and final object b.  We write this as f: a \to b.
  • For any two objects a and b of C there is a set of arrows from a to b, called \text{Hom}_\textbf{C}(a,b).  If a' and b' are two objects with (a,b) \neq (a',b') (i.e. a \neq a' or b \neq b'), then \text{Hom}(a,b) and \text{Hom}(a',b') are disjoint.
  • For any three objects a,b and c of C, there is a binary operation \text{Hom}(a,b) \times \text{Hom}(b,c) \to \text{Hom}(a,c) called “composition of arrows/ morphisms.”
  • Arrow composition is associative.
  • For any object a, there is a morphism 1_a : a \to a such that for any arrow f: a \to b we have f \circ 1_a = f, and for any arrow g: b \to a we have 1_a \circ g = g.

So that’s a bit of a mouthful.  Unwinding all these criteria basically yields the above “intuitive” explanation.  The criteria concerning arrows simply axiomatize this intuition (i.e. arrows basically act like we think functions “should” act).
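As a sanity check on these axioms, here is a small sketch that builds the category coming from a partially ordered set and tests the criteria by brute force.  The poset chosen (the numbers 1, 2, 3, 6, 12 ordered by divisibility) and the encoding of the unique arrow x \to y as the pair (x, y) are assumptions made for this example, not anything canonical.

```python
from itertools import product

elements = [1, 2, 3, 6, 12]  # ordered by divisibility: x <= y iff x divides y


def hom(x, y):
    """Hom(x, y): one arrow, encoded as (x, y), if x divides y; otherwise empty."""
    return [(x, y)] if y % x == 0 else []


def identity(x):
    """The identity arrow 1_x, which exists because x divides x."""
    return (x, x)


def compose(g, f):
    """g ∘ f for f: x -> y and g: y -> z."""
    if f[1] != g[0]:
        raise ValueError("arrows are not composable")
    return (f[0], g[1])


arrows = [f for x, y in product(elements, repeat=2) for f in hom(x, y)]

# Composites land in the right Hom set (this is transitivity of the order).
closed = all(
    compose(g, f) in hom(f[0], g[1])
    for f in arrows for g in arrows if f[1] == g[0]
)

# Identity law: f ∘ 1_a = f = 1_b ∘ f for every f: a -> b.
unital = all(
    compose(f, identity(f[0])) == f == compose(identity(f[1]), f) for f in arrows
)

# Associativity of composition for every composable triple.
associative = all(
    compose(h, compose(g, f)) == compose(compose(h, g), f)
    for f in arrows for g in arrows for h in arrows
    if f[1] == g[0] and g[1] == h[0]
)

print(len(arrows), closed, unital, associative)  # 14 True True True
```

In a poset category the axioms are almost free: reflexivity hands you the identities, transitivity hands you composition, and associativity holds because there is at most one arrow between any two objects.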

We shall encounter many examples of categories in future posts.  The ones that will come up quite often (as they contain a host of interesting examples) are Ab, R-Mod, and Top (which are the categories of abelian groups, left R-modules, and topological spaces, respectively).