Measurable Counterexamples

There are many useful perspectives to take when trying to understand something. A common theme in really understanding mathematical ideas is to state them as theorems, and to understand intuitively why each assumption is needed for the conclusion to hold.

“Don’t just read it; fight it! Ask your own questions, look for your own examples, discover your own proofs. Is the hypothesis necessary? Is the converse true? What happens in the classical special case? What about the degenerate cases? Where does the proof use the hypothesis?” (Paul Halmos, “I want to be a mathematician”)

The particular aspect of understanding I want to address over several posts is mathematical understanding over time. Another pertinent quote is John von Neumann’s famous quip, “Young man, in mathematics you don’t understand things. You just get used to them.” Personal experience and conversation have told me that it can be many years after first seeing a concept in a lecture, a book, a paper, or Wikipedia before one understands it. Each exposure reveals new insights and deeper understanding. Sometimes a certain level of appreciation is sufficient to use a concept; other times it takes consistent use over time before the place of an idea in a grander scheme of theory becomes apparent. Much of this likely depends on an individual’s interests, capabilities, modes of learning, and current social and intellectual context.

Counterexamples play an incredibly important role in understanding mathematics. I won’t spend too much time belaboring the general concept, but one should be aware of how crucial these ideas can be in elucidating a general theory. There is a plethora of important classical counterexamples (e.g. the Weierstrass function), as well as many general texts on the subject (in Analysis, in Topology, etc.). These are worth studying for their own sake, but especially if you want to really grapple with these areas; in other words, consider reading these counterexamples closely if you want to do research in their respective areas. I will also tend not to distinguish between formal counterexamples and paradoxical results. They are different objects, but they often provide similar motivation for mathematical thinking.

For the rest of this post, I want to write a short proof of the existence of non-measurable subsets of the real line, and explain why their existence motivates some definitions. In my third year of University, I had my first real exposure to Measure Theory. This is an intriguing cornerstone of modern mathematics, underpinning a significant amount of modern Analysis and areas of Topology and Geometry. There are even very useful and interesting generalizations and connections with Algebra.

As a young student in Mathematics and Computer Science, however, I did not really understand why measures were so intentionally complicated. They are of course not the most complicated construction one usually sees in an undergraduate mathematics curriculum, but it was very hard to motivate many of the seemingly obscure definitions at that point in my education. Two repeated motivations my instructor (and later my Master’s Thesis advisor) used were “We have to carefully define measure, because not all sets are measurable” and “We want to be able to integrate functions which are not Riemann integrable.” The former was difficult for me to imagine, as it turns out that these sets are not often encountered. Even the proofs that they exist were really beyond my abilities (or attention span) at the time. The latter motivation, constructing the Lebesgue Integral, turns out to have very important and fundamental implications in areas I would later fall in love with, such as Functional Analysis.

Non-measurable subsets of $\mathbb{R}$

As a nice simple proof, I will construct an example of a Vitali Set, as well as recall some definitions of measure that we will need to prove the non-measurability of the construction. Much of this will mirror the Wiki article, but the material that I was influenced by the most was Real Analysis for Graduate Students. It is assumed that readers are familiar with Outer Measure, as well as the Carathéodory Theorem used to construct the one dimensional Lebesgue measure, and some basic definitions of measure. I will recall a few definitions and propositions to ease the discussion.

A brief remark about notation: ${2^X}$ is the power set of ${X}$, ${[0,\infty]}$ stands for the extended nonnegative real numbers (i.e. ${[0,\infty)}$ together with a point at infinity), and ${A^c}$ denotes the complement of the set ${A}$.

Definition 1 Let ${X}$ be a set. A ${\sigma}$-algebra ${\mathcal{A}}$ is a collection of subsets of ${X}$ such that,

1. ${\emptyset \in \mathcal{A}}$ and ${X \in \mathcal{A}}$;
2. if ${A \in \mathcal{A}}$ then ${A^c \in \mathcal{A}}$;
3. if ${A_i \in \mathcal{A}}$ for each ${i = 1, 2, \ldots}$, then ${\bigcup_{i=1}^{\infty}A_i \in \mathcal{A}}$ and ${\bigcap_{i=1}^{\infty}A_i \in \mathcal{A}}$.
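
As a quick illustration of Definition 1, here is a toy example; the set ${X = \{1,2,3\}}$ and the collections ${\mathcal{A}}$ and ${\mathcal{B}}$ below are my own choices for illustration. Consider

$\displaystyle \mathcal{A} = \{\emptyset, \{1\}, \{2,3\}, X\}.$

This collection contains ${\emptyset}$ and ${X}$, is closed under complements since ${\{1\}^c = \{2,3\}}$, and any union or intersection of its members is again one of the four sets, so it is a ${\sigma}$-algebra on ${X}$. By contrast,

$\displaystyle \mathcal{B} = \{\emptyset, \{1\}, \{2\}, X\}$

fails condition 2, because ${\{1\}^c = \{2,3\} \notin \mathcal{B}}$.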

Definition 2 Let ${X}$ be a set, and ${\mathcal{A} \subseteq 2^{X}}$ a ${\sigma}$-algebra. A measure on ${(X,\mathcal{A})}$ is a function ${\mu: \mathcal{A} \rightarrow [0,\infty]}$ such that,

1. ${\mu(\emptyset) = 0}$;
2. ${\mu(\bigcup_{i=1}^{\infty}A_i) = \sum_{i=1}^{\infty}\mu(A_i)}$ whenever ${A_1, A_2, \ldots \in \mathcal{A}}$ are pairwise disjoint, i.e. ${A_i\cap A_j = \emptyset}$ for ${i \neq j}$.

Definition 3 Let ${X}$ be a set. An outer measure is a function ${\mu^*}$ defined on the collection of all subsets of ${X}$ satisfying,

1. ${\mu^*(\emptyset) = 0}$;
2. if ${A \subseteq B}$ then ${\mu^*(A) \leq \mu^*(B)}$;
3. ${\mu^*(\bigcup_{i=1}^{\infty}A_i) \leq \sum_{i=1}^{\infty}\mu^*(A_i)}$ whenever ${A_i \in 2^X}$.

Definition 4 Let ${\mu^*}$ be an outer measure on a set ${X}$. A set ${A}$ is called ${\mu^*}$-measurable if

$\displaystyle \mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c) \ \ \ \ \ (1)$

for all ${E \subseteq X}$.
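
To see condition (1) fail in the smallest possible setting, here is a toy example; the set ${X = \{1,2\}}$ and the function ${\mu^*}$ below are assumptions made purely for illustration. Define

$\displaystyle \mu^*(\emptyset) = 0, \qquad \mu^*(\{1\}) = \mu^*(\{2\}) = \mu^*(X) = 1.$

This function is monotone and subadditive, so it satisfies Definition 3. Taking ${A = \{1\}}$ and ${E = X}$ in (1) gives

$\displaystyle \mu^*(E \cap A) + \mu^*(E \cap A^c) = \mu^*(\{1\}) + \mu^*(\{2\}) = 2 \neq 1 = \mu^*(E),$

so ${\{1\}}$ is not ${\mu^*}$-measurable, and by symmetry neither is ${\{2\}}$. Only ${\emptyset}$ and ${X}$ pass the test for every ${E \subseteq X}$.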

We can connect these ideas in the usual way to define a measure using the concept of outer measure and ${\mu^*}$-measurable sets.

Proposition 5 If ${\mu^*}$ is an outer measure on ${X}$, then the collection ${\mathcal{A}}$ of ${\mu^*}$-measurable sets is a ${\sigma}$-algebra. If ${\mu}$ is the restriction of ${\mu^*}$ to ${\mathcal{A}}$, then ${\mu}$ is a measure.

I won’t go through the entire motivation or discussion of the Lebesgue measure, but I will simply remind you of the outer measure we normally use to construct it (in many undergraduate and introductory Real Analysis texts at least). I will use ${m^*}$ to indicate the outer Lebesgue measure in the following discussion. I write ${\mathcal{C}}$ for the collection of all open intervals ${(a,b) \subseteq \mathbb{R}}$, and define a function on it by

$\displaystyle \ell((a,b)) = b - a. \ \ \ \ \ (2)$

We can define the outer measure as,

$\displaystyle m^*(E) = \inf \left \{\sum_{i=1}^\infty\ell(A_i) : A_i \in \mathcal{C}, E \subseteq \bigcup_{i=1}^\infty A_i \right \} \ \ \ \ \ (3)$

where the infimum is taken over all covers of ${E}$ as indicated. We can use the concept of outer measure to construct the Lebesgue measure, and carefully choose the ${\sigma}$-algebra so as to avoid sets which are not ${m^*}$-measurable. But why can’t we just take the above definition as the definition of the Lebesgue measure? Why is the restriction to a strictly smaller ${\sigma}$-algebra than ${2^{\mathbb{R}}}$ necessary? The answer is exactly that there exist sets which are not measurable with respect to the function defined in (3).
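
Before the theorem, a quick sanity check on (3): countable sets are negligible for ${m^*}$. The enumeration ${x_1, x_2, \ldots}$ and the tolerance ${\varepsilon}$ are notation introduced here for this sketch. Let ${E = \{x_1, x_2, \ldots\}}$ be countably infinite (the finite case is similar), fix ${\varepsilon > 0}$, and cover each point by the interval ${A_i = (x_i - \varepsilon 2^{-i-1}, x_i + \varepsilon 2^{-i-1})}$, so that ${\ell(A_i) = \varepsilon 2^{-i}}$. Then

$\displaystyle m^*(E) \leq \sum_{i=1}^\infty \ell(A_i) = \sum_{i=1}^\infty \varepsilon 2^{-i} = \varepsilon.$

Since ${\varepsilon}$ was arbitrary, ${m^*(E) = 0}$; in particular ${m^*(\mathbb{Q}) = 0}$. So a set that breaks additivity has to be considerably stranger than anything countable.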

Theorem 6 Let ${m^*}$, ${\mathcal{C}}$ and ${\ell}$ be defined as above. Then ${m^*}$ is not a measure on ${(\mathbb{R},2^{\mathbb{R}})}$.

Proof: Proceeding by contradiction, we assume ${m^*}$ is a measure. We define an equivalence relation on the interval ${[0,1]}$ by saying that ${x \sim y}$ if and only if ${x-y \in \mathbb{Q}}$. Next, we select a single representative element from each equivalence class (Note: to justify our ability to choose such representatives, we must invoke the axiom of choice). Call the collection of such elements ${A}$. Given a set ${B \subseteq \mathbb{R}}$ and an element ${x \in \mathbb{R}}$, we define ${B + x = \{y + x: y \in B\}}$. It is clear that ${\ell((a + y, b + y)) = b - a = \ell((a,b))}$ for each ${a,b,y \in \mathbb{R}}$ with ${a < b}$. So by the definition of ${m^*}$ we see that ${m^*(A + q) = m^*(A)}$ for each ${q \in \mathbb{Q}}$. If ${x,y \in A}$ are distinct, then there is no ${q \in \mathbb{Q}}$ such that ${x + q = y}$, since otherwise ${x}$ and ${y}$ would represent the same equivalence class. It follows that the sets ${A + q}$ are pairwise disjoint as ${q}$ ranges over ${\mathbb{Q}}$. Now carefully looking at the definition of ${A}$, we have

$\displaystyle [0,1] \subseteq \bigcup_{q \in [-1,1]\cap \mathbb{Q}}(A + q)$

where the union is over countably many rational ${q}$. Since ${m^*([0,1]) = 1}$, the monotonicity and countable subadditivity of ${m^*}$ give,

$\displaystyle 1 \leq \sum_{q \in [-1,1]\cap \mathbb{Q}}m^*(A + q). \ \ \ \ \ (4)$

Since ${m^*(A + q) = m^*(A)}$ for every ${q \in \mathbb{Q}}$, every term in this sum equals ${m^*(A)}$, so the inequality forces ${m^*(A) > 0}$. But,

$\displaystyle \bigcup_{q \in [-1,1]\cap \mathbb{Q}}(A + q) \subseteq [-1, 2] \ \ \ \ \ (5)$

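which, if ${m^*}$ were actually a measure, would imply, by countable additivity over the disjoint sets ${A + q}$ together with ${m^*([-1,2]) = 3}$, that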

$\displaystyle 3 \geq \sum_{q \in [-1,1]\cap \mathbb{Q}}m^*(A + q). \ \ \ \ \ (6)$

Thus, because ${m^*}$ is translation-invariant, the sum in (6) consists of countably infinitely many copies of the same term ${m^*(A) > 0}$ and is therefore infinite, contradicting the bound of ${3}$. Therefore, ${m^*}$ is not a measure on ${2^{\mathbb{R}}}$. $\Box$
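
Two steps in the proof are asserted as clear, so here is a sketch of why they hold; the symbols ${a}$, ${b}$ and ${r}$ are introduced here for illustration. For disjointness, suppose ${(A + q) \cap (A + r) \neq \emptyset}$ with ${q, r \in \mathbb{Q}}$. Then there are ${a, b \in A}$ with

$\displaystyle a + q = b + r \quad \Longrightarrow \quad a - b = r - q \in \mathbb{Q},$

so ${a}$ and ${b}$ lie in the same equivalence class. Since ${A}$ contains exactly one representative of each class, ${a = b}$ and hence ${q = r}$. For the covering of ${[0,1]}$, take ${x \in [0,1]}$ and let ${a \in A}$ be the representative of its class. Then

$\displaystyle q = x - a \in \mathbb{Q}, \qquad |q| \leq 1 \ \text{ since } x, a \in [0,1],$

so ${x = a + q \in A + q}$ with ${q \in [-1,1] \cap \mathbb{Q}}$. The same bounds, ${A \subseteq [0,1]}$ and ${|q| \leq 1}$, give the containment in (5).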