# Chapter 2: Analysis on the Real Line
This chapter lays the foundational groundwork for real analysis by exploring the fundamental properties of the real numbers, which are the bedrock of the subject. Real numbers, unlike the rational numbers we learn about in high school, possess a crucial property: ==completeness==. This single property allows us to define limits, continuity, and convergence, which are the cornerstones of calculus and analysis.
## 2.1 Properties of Real Numbers
The real numbers, $\mathbb{R}$, are more than just a collection of numbers. They form a mathematical structure called an ordered field. This means they obey a specific set of axioms that dictate how addition, multiplication, and order work. The field axioms establish the algebraic structure of the real numbers.
### Set of Real Numbers as an Ordered Field
The set $\mathbb{R}$ of real numbers, equipped with two operations, addition $(+)$ and multiplication $(*)$, satisfies the following ==field axioms==:
**A1.** $\forall x, y \in \mathbb{R}$, $x+y \in \mathbb{R}$.
**A2.** $\forall x, y \in \mathbb{R}$, $x+y=y+x$.
**A3.** $\forall x, y, z \in \mathbb{R}$, $x+(y+z)=(x+y)+z$.
**A4.** $\exists$ (unique) $0 \in \mathbb{R}$ such that $\forall x \in \mathbb{R}$, $x+0=0+x=x$.
**A5.** $\forall x \in \mathbb{R}$, $\exists(-x) \in \mathbb{R}$ such that $x+(-x)=0$.
**M1.** $\forall x,y \in \mathbb{R}$, $x*y \in \mathbb{R}$.
**M2.** $\forall x,y \in \mathbb{R}$, $x*y=y*x$.
**M3.** $\forall x, y, z \in \mathbb{R}$, $x *(y * z)=(x * y) * z$.
**M4.** $\exists$ (unique) $1 \in \mathbb{R}$ such that $\forall x \in \mathbb{R}$, $x*1=1*x=x$.
**M5.** $\forall x \in \mathbb{R} \backslash\{0\}$, $\exists x^{-1} \in \mathbb{R}$ such that $x * x^{-1}=1$.
**AM1.** $\forall x, y, z \in \mathbb{R},(x+y) * z=x * z+y * z$.
To simplify notation, when we write $xy$ we mean $x*y$.
In what follows, a few pieces of notation will be handy.
$$
\begin{aligned}
& \mathbb{R}_{+}=\{x \in \mathbb{R} \mid x \geqslant 0\}. \\
& \mathbb{R}_{++}=\{x \in \mathbb{R} \mid x>0\}.
\end{aligned}
$$
The field axioms alone aren't enough to capture all the properties of the real numbers. They also have a natural sense of order, which is why we add the order axioms. These axioms introduce the concept of *less than or equal to* $(\leq)$, allowing us to compare any two real numbers. The order axioms ensure that our ordering is consistent with the field operations. This ordered structure is what allows us to visualize real numbers on a continuous number line.
$\mathbb{R}$ is, in fact, an ==ordered field==. Let $\leq$ be a binary relation (a weak order) on $\mathbb{R}$. Then $\mathbb{R}$ satisfies the following axioms as well:
**O1.** $\forall x, y \in \mathbb{R}$, $x \leq y$ or $y \leq x$.
**O2.** $\forall x, y, z \in \mathbb{R}$, $x \leq y$ and $y \leq z \Rightarrow x \leq z$.
**O3.** $\forall x, y, z \in \mathbb{R}$, $x \leq y \Rightarrow x+z \leq y+z$.
**O4.** $\forall x,y \in \mathbb{R}$ and $\forall z \in \mathbb{R}_{+}$, $x \leq y \Rightarrow x z \leq y z$.
Finally, we use this order to define ==intervals== - contiguous subsets of real numbers - which are essential building blocks for more complex concepts in analysis. Given the order relation $\leq$ on $\mathbb{R}$ and $x,y \in \mathbb{R}$, we can define ==intervals of real numbers== that are *between* the numbers $x$ and $y$. There are four types of intervals given by:
(i) $[x,y] = \{z \in \mathbb{R} \mid x \leq z \leq y\}$,
(ii) $(x,y] = \{z \in \mathbb{R} \mid x < z \leq y\}$,
(iii) $[x,y) = \{z \in \mathbb{R} \mid x \leq z < y\}$, and
(iv) $(x,y) = \{z \in \mathbb{R} \mid x < z < y\}$.
The first one is called the ==closed interval== and the last one the ==open interval== of real numbers between $x$ and $y$. The second and the third intervals are half-open or half-closed intervals of real numbers between $x$ and $y$.
We assume that $+\infty$ and $-\infty$ are not part of the real line $\mathbb{R}$. When these are also included, we call the new set the ==extended real line==, denoted by $\overline{\mathbb{R}}$ and given by $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$.
### Bounded Sets of Real Numbers
While the real line is infinite, we often work with bounded sets - subsets of real numbers that don't stretch to infinity.
<span style="color: lime;">**Definition 2.1.**</span> Let $X$ be a set of real numbers, i.e., $X \subseteq \mathbb{R}$. A number $M_{X}$ is said to be an ==upper bound== of $X$ if for all $x \in X$, $x \leq M_{X}$. A number is called the ==maximum== of $X$, denoted by $\max(X)$, if it is an upper bound of $X$ and an element of $X$. A number is called the ==supremum== of $X$, denoted by $\sup(X)$, if it is the least of all upper bounds of $X$, i.e., $$\sup(X) = \min \{M_{X} \mid \forall x \in X, x \leq M_{X}\}.$$
<span style="color: lime;">**Definition 2.2.**</span> Let $X$ be a set of real numbers, i.e., $X \subseteq \mathbb{R}$. A number $m_{X}$ is said to be a ==lower bound== of $X$ if for all $x \in X$, $m_{X} \leq x$. A number is called the ==minimum== of $X$, denoted by $\min(X)$, if it is a lower bound of $X$ and an element of $X$. A number is called the ==infimum== of $X$, denoted by $\inf(X)$, if it is the greatest of all lower bounds of $X$, i.e., $$\inf(X) = \max\{m_{X} \mid \forall x \in X, m_{X} \leq x\}.$$
The set $\mathbb{R}$ of real numbers satisfies the ==completeness axiom==, i.e., every $\emptyset \subsetneq X \subseteq \mathbb{R}$ that is bounded above has a supremum in $\mathbb{R}$.
Since every real number is both an upper bound and a lower bound of $\emptyset$, we adopt the conventions $\sup(\emptyset) = -\infty$ and $\inf(\emptyset) = +\infty$. Similarly, since $+\infty$ is the only upper bound of $\mathbb{R}$ in $\overline{\mathbb{R}}$, $\sup(\mathbb{R}) = +\infty$, and since $-\infty$ is its only lower bound, $\inf(\mathbb{R}) = -\infty$.
The following theorem establishes a few important properties of suprema and infima.
<span style="color: mediumslateblue;">**Theorem 2.1.**</span> Let $X,Y \subseteq \mathbb{R}$. Then, the following hold:
(i) $\sup(\alpha X) = \alpha \sup(X)$ if $\alpha > 0$ and $\sup(\alpha X) = \alpha \inf(X)$ if $\alpha < 0$, where $\alpha X = \{\alpha x \in \mathbb{R} \mid x \in X\}$. Similarly, $\inf(\alpha X) = \alpha \inf(X)$ if $\alpha > 0$ and $\inf(\alpha X) = \alpha \sup(X)$ if $\alpha < 0$.
(ii) $\sup(X \cup Y) = \max\{\sup(X), \sup(Y)\}$ and $\inf(X \cup Y) = \min\{\inf(X), \inf(Y)\}$.
(iii) $\sup(X \cap Y) \leq \min\{\sup(X), \sup(Y)\}$ and $\inf(X \cap Y) \geq \max\{\inf(X), \inf(Y)\}$.
(iv) if $X \subseteq Y$, $\sup(X) \leq \sup(Y)$ and $\inf(X) \geq \inf(Y)$.
(v) $\sup(X+Y) \leq \sup(X)+\sup(Y)$ and $\inf(X+Y) \geq \inf(X)+\inf(Y)$ where $X+Y = \{x+y \in \mathbb{R} \mid x \in X \mbox{ and } y \in Y\}$.
(vi) For all $X$ bounded above, $\sup(X)=M_{X}^{*}$ if and only if $\forall x \in X$, $x \leq M_{X}^{*}$, and $\forall \varepsilon > 0$, $\exists x \in X$, $M_{X}^{*} - \varepsilon < x \leq M_{X}^{*}$. Similarly, for all $X$ bounded below, $\inf(X)=m_{X}^{*}$ if and only if $\forall x \in X$, $m_{X}^{*} \leq x$, and $\forall \varepsilon > 0$, $\exists x \in X$, $m_{X}^{*} \leq x < m_{X}^{*} + \varepsilon$.
*Proof.* We only prove the above claims for suprema; the claims for infima follow using symmetric arguments.
(i) Let $\alpha > 0$. First observe that $\forall x \in X$, $\alpha x \in \alpha X$, and for all upper bounds $M_{X}$ of $X$, $\alpha x \leq \alpha \sup(X) \leq \alpha M_{X}$. This means that $\alpha \sup(X)$ is an upper bound of $\alpha X$. Next, we show that $\alpha \sup(X)$ is the supremum of $\alpha X$. Suppose not, i.e., there is an upper bound $M_{\alpha X}$ of $\alpha X$ such that $\forall y \in \alpha X$, $y \leq M_{\alpha X} < \alpha \sup(X)$. Since $\forall y \in \alpha X$, $\exists x \in X$ such that $y=\alpha x$, we have $\alpha x \leq M_{\alpha X} < \alpha \sup(X)$, or $x \leq \frac{M_{\alpha X}}{\alpha} < \sup(X)$, contradicting the fact that $\sup(X)$ is the supremum of $X$. The case $\alpha < 0$ is analogous.
(ii) Since $\sup(X)$ is a supremum of $X$ and $\sup(Y)$ is a supremum of $Y$, we have $\forall x \in X$, $$x \leq \sup(X) \leq \max\{\sup(X),\sup(Y)\},$$ and $\forall y \in Y$, $$y \leq \sup(Y) \leq \max\{\sup(X),\sup(Y)\}.$$ This means that $\forall z \in X \cup Y$, $$z \leq \max\{\sup(X),\sup(Y)\},$$ and therefore, by the definition of supremum, $$\sup(X \cup Y) \leq \max\{\sup(X),\sup(Y)\}.$$ Next, we show the reverse inequality, $\max\{\sup(X),\sup(Y)\} \leq \sup(X \cup Y)$.
Since $\sup(X \cup Y)$ is an upper bound of $X \cup Y$, $\sup(X \cup Y)$ is an upper bound of both $X$ and $Y$. By definition, we have $\sup(X) \leq \sup(X \cup Y)$ and $\sup(Y) \leq \sup(X \cup Y)$. Therefore, $\max\{\sup(X),\sup(Y)\} \leq \sup(X \cup Y)$.
(iii) Given the definitions of $\sup(X)$ and $\sup(Y)$, we have $$\forall x \in X, x \leq \sup(X),$$ and $$\forall y \in Y, y \leq \sup(Y).$$ Therefore, $\forall z \in X \cap Y$, $z \leq \min\{\sup(X),\sup(Y)\}$. By definition of $\sup(X \cap Y)$, we have $\sup(X \cap Y) \leq \min\{\sup(X),\sup(Y)\}$.
(iv) Consider subsets $X$ and $Y$ of real numbers such that $X \subseteq Y$. Suppose, for contradiction, that $\sup(Y) < \sup(X)$. Since $X \subseteq Y$ and $\sup(Y)$ is the supremum of $Y$, we have $$\forall x \in X, (x \in Y) \land (x \leq \sup(Y) < \sup(X)),$$ which contradicts the fact that $\sup(X)$ is the least upper bound of $X$.
(v) By the definition of $\sup(X)$, $\sup(Y)$, and the set $X+Y$, we have $$\forall x \in X, \forall y \in Y, x+y \leq \sup(X)+y \leq \sup(X)+\sup(Y),$$ i.e., $\sup(X)+\sup(Y)$ is an upper bound of $X+Y$. Using the definition of $\sup(X+Y)$ (the fact that it is the least upper bound), we have $\sup(X+Y) \leq \sup(X)+\sup(Y)$, as required.
(vi) Let $X$ be bounded above. For the if part, consider a number $M_{X}^{*}$ such that $\forall x \in X$, $x \leq M_{X}^{*}$, and $\forall \varepsilon > 0$, $\exists x \in X$, $M_{X}^{*} - \varepsilon < x \leq M_{X}^{*}$. We show that $M_{X}^{*}$ is a supremum of $X$. Since $\forall x \in X$, $x \leq M_{X}^{*}$, $M_{X}^{*}$ is an upper bound. Next, assume that $M_{X}^{*}$ is not the least upper bound, i.e., say there is an upper bound $M_{X}$ of $X$ such that $M_{X} < M_{X}^{*}$. Choose $\varepsilon=M_{X}^{*}-M_{X}$. We know that $\exists x \in X$ such that $M_{X}^{*}-\varepsilon=M_{X} < x \leq M_{X}^{*}$ implying that $M_{X}$ is not an upper bound of $X$, a contradiction.
For the only-if part, assume that $M_{X}^{*}$ is a supremum. Since $M_{X}^{*}$ is an upper bound, we have $\forall x \in X$, $x \leq M_{X}^{*}$. Next, we show that $\forall \varepsilon > 0$, $\exists x \in X$, $M_{X}^{*} - \varepsilon < x \leq M_{X}^{*}$. Suppose not, i.e., $\exists \varepsilon > 0$, $\forall x \in X$, either $x \leq M_{X}^{*} - \varepsilon$ or $x > M_{X}^{*}$. Since $M_{X}^{*}$ is an upper bound, no $x \in X$ satisfies $x > M_{X}^{*}$. Therefore, $\forall x \in X$, $x \leq M_{X}^{*} - \varepsilon$. Since $\varepsilon > 0$, the number $M'_{X}=M_{X}^{*}-\varepsilon$ is an upper bound of $X$ with $M'_{X} < M_{X}^{*}$, implying that $M_{X}^{*}$ is not the least upper bound and hence not the supremum of $X$, a contradiction. $\blacksquare$
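Because suprema and infima of finite sets are just maxima and minima, parts of Theorem 2.1 can be sanity-checked numerically. Below is a minimal Python sketch; the sets `X`, `Y` and the scalar `alpha` are illustrative choices, not data from the text.
```python
# Spot-checking Theorem 2.1 on finite sets, where sup = max and inf = min.
X = {-3.0, 1.5, 4.0}
Y = {0.0, 2.0, 7.5}
alpha = -2.0

aX = {alpha * x for x in X}
assert max(aX) == alpha * min(X)           # (i): sup(aX) = alpha*inf(X), alpha < 0
assert min(aX) == alpha * max(X)           # (i): inf(aX) = alpha*sup(X), alpha < 0
assert max(X | Y) == max(max(X), max(Y))   # (ii): sup of a union
X_plus_Y = {x + y for x in X for y in Y}
assert max(X_plus_Y) <= max(X) + max(Y)    # (v): for finite sets equality even holds
print("All checks passed.")
```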
### Archimedean Property and Inductive Property of Real Numbers
This section builds on the foundational properties of the real numbers by exploring two key consequences of the completeness axiom. We will first prove the ==Archimedean property== of real numbers, which, while seemingly obvious, has profound implications.
<span style="color: mediumslateblue;">**Theorem 2.2.**</span> (Archimedean Property of $\mathbb{R}$) The set $\mathbb{N}$ of natural numbers has no upper bound.
*Proof.* Suppose not, i.e., $\exists$ a number $M_{\mathbb{N}}$ such that $\forall n \in \mathbb{N}$, $n \leq M_{\mathbb{N}}$. By the completeness axiom, $\mathbb{N}$ must have a supremum; let $M^{*}_{\mathbb{N}}$ be the supremum of $\mathbb{N}$. Since $M^{*}_{\mathbb{N}}-1$ is not an upper bound of $\mathbb{N}$, $\exists n \in \mathbb{N}$ such that $M^{*}_{\mathbb{N}}-1<n$, i.e., $n+1>M^{*}_{\mathbb{N}}$. Since $n+1 \in \mathbb{N}$, this contradicts the fact that $M^{*}_{\mathbb{N}}$ is an upper bound of $\mathbb{N}$. $\blacksquare$
Theorem 2.2 has some consequences that have a great impact on how we must think of the real numbers.
(i) No matter how large a real number $x$ is given, there is always a natural number $n$ larger.
(ii) Given any positive number $y$, no matter how large, and any positive number $x$, no matter how small, one can add $x$ to itself sufficiently many times so that the result exceeds $y$ (i.e., $nx>y$ for some $n \in \mathbb{N}$).
(iii) Given any positive number $x$, no matter how small, one can always find a fraction $1 / n$ with $n$ a natural number that is smaller (i.e., so that $\frac{1}{n}<x$ ).
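Consequence (iii) is easy to make concrete in a few lines of Python; the helper `archimedean_n` below is a hypothetical name introduced for illustration.
```python
import math

# Given x > 0, produce a natural number n with 1/n < x,
# mirroring consequence (iii) of the Archimedean property.
def archimedean_n(x: float) -> int:
    assert x > 0
    n = math.floor(1.0 / x) + 1   # n > 1/x, hence 1/n < x
    return n

for x in (2.0, 0.3, 1e-6):
    n = archimedean_n(x)
    print(f"x = {x}: n = {n}, 1/n = {1.0 / n:.2e} < x")
```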
The second key result, often called the ==Well-Ordering Principle for Integers==, guarantees that any non-empty set of integers that has a lower bound must have a smallest element. While this might seem intuitive for integers, its proof relies on the completeness of the real numbers and provides a solid logical foundation for many proofs in analysis.
<span style="color: mediumslateblue;">**Theorem 2.3.**</span> Every non-empty subset of integers that is bounded below has a smallest element.
*Proof.* Let $\emptyset \subsetneq X \subseteq \mathbb{Z}$ and let $X$ be bounded below. By the completeness property of $\mathbb{R}$, $X$ has an infimum. Let $m_{X}^{*}$ be the infimum of $X$. We show that $m_{X}^{*} \in X$. Suppose not, i.e., $m_{X}^{*} \notin X$. Since $m_{X}^{*}$ is an infimum of $X$ and $m_{X}^{*} \notin X$, $\exists x \in X$ such that $m_{X}^{*}<x<m_{X}^{*}+1$. Since $x$ is not a lower bound of $X$ (as $m_{X}^{*}$ is the greatest lower bound and $m_{X}^{*}<x$), $\exists y \in X$ such that $y<x$; since $m_{X}^{*} \notin X$, we have $m_{X}^{*}<y<x<m_{X}^{*}+1$, thereby implying that $0<x-y<1$. This contradicts the fact that $x$ and $y$ are distinct integers, which must differ by at least $1$. $\blacksquare$
<span style="color: mediumslateblue;">**Corollary 2.1.**</span> Every non-empty subset of the natural numbers $\mathbb{N}$ has a smallest element.
*Proof.* Note that $0$ is a lower bound of any non-empty subset $X$ of $\mathbb{N}$. Then, by <span style="color: mediumslateblue;">**Theorem 2.3.**</span>, $X$ has a smallest element. $\blacksquare$
<span style="color: mediumslateblue;">**Corollary 2.2.**</span> For all $x \in \mathbb{R}, \exists$ a unique $m \in \mathbb{Z}$ such that $m \leq x<m+1$.
*Proof.* Consider the set $Z_{x}=\{z \in \mathbb{Z} \mid x < z\}$. By the Archimedean property, $Z_{x}$ is non-empty, and it is bounded below by $x$. By Theorem 2.3, $Z_{x}$ has a smallest element, say $n$, and it is unique (left as an exercise). Since $n-1 \notin Z_{x}$, we have $n-1 \leq x < n$, so setting $m=n-1$ gives $m \leq x<m+1$. $\blacksquare$
### Triangle Inequality
In real analysis, we are often concerned with the distance between numbers. The concept of ==absolute value== allows us to measure this distance without worrying about direction. The absolute value of a number is simply its distance from zero on the number line.
For $x \in \mathbb{R}$, define the ==absolute value== of $x$, $|x|$, by
$$
|x|= \begin{cases}x & \text { if } x \geq 0 \\ -x & \text { if } x \leq 0\end{cases}
$$
Using this definition, we can prove one of the most fundamental and widely used inequalities in all of mathematics: the ==triangle inequality==.
<span style="color: mediumslateblue;">**Theorem 2.4.**</span> For all $x,y \in \mathbb{R}$, the following hold.
(i) $|x|=|-x|$.
(ii) $|xy|=|x||y|$.
(iii) (Triangle Inequality) $|x+y| \leq |x|+|y|$.
(iv) (Reverse Triangle Inequality) $||x|-|y|| \leq |x-y|$.
*Proof.* (i) First, consider $x \geq 0$, i.e., $-x \leq 0$. This means that $|-x| = -(-x) = x = |x|$. Next, consider $x \leq 0$, i.e., $-x \geq 0$. This means that $|-x| = -x = |x|$. Therefore, in both cases, $|-x| = |x|$.
(ii) First, consider the case where $x,y \geq 0$. Since $x,y \geq 0$, $xy \geq 0$. Therefore, $|xy|=xy=|x||y|$. Next, consider the case where $x,y \leq 0$. Since $x,y \leq 0$, $xy \geq 0$. Therefore, $|xy|=xy=(-x)(-y)=|x||y|$. Lastly, consider the case where $x \geq 0$ and $y \leq 0$ (the case where $x \leq 0$ and $y \geq 0$ is symmetric). Since $x \geq 0$ and $y \leq 0$, $xy \leq 0$. Therefore, $|xy|=-(xy)=(x)(-y)=|x||y|$.
(iii) First, observe that $\forall x \in \mathbb{R}$, $x \leq |x|$. Therefore, $\forall x,y \in \mathbb{R}$, we have $(x+y) \leq |x|+|y|$. Similarly, $-(x+y)=(-x)+(-y) \leq |-x|+|-y| = |x|+|y|$. Combining these two observations, by the definition of $|x+y|$, $|x+y| \leq |x|+|y|$.
(iv) Observe that $|x|=|(x-y)+y| \leq|x-y|+|y|$. Rearranging, we get $|x|-|y| \leq|x-y|$. Now, reverse the roles of $x$ and $y$ to get $|y|-|x| \leq |y-x| = |x-y|$, using (i). Combining the two inequalities with the definition of absolute value, $||x|-|y|| \leq |x-y|$. $\blacksquare$
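A quick randomized check of parts (iii) and (iv) using Python's built-in `abs`; an illustration, of course, not a substitute for the proof.
```python
import random

# Randomized check of the triangle and reverse triangle inequalities.
random.seed(0)
for _ in range(10_000):
    x = random.uniform(-100, 100)
    y = random.uniform(-100, 100)
    assert abs(x + y) <= abs(x) + abs(y)        # Theorem 2.4 (iii)
    assert abs(abs(x) - abs(y)) <= abs(x - y)   # Theorem 2.4 (iv)
print("10,000 random pairs satisfy both inequalities.")
```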
## 2.2. Sequences of Real Numbers
Let's start by thinking about sequences as ordered lists of numbers. Such a list may stop after a certain point or go on forever. If a sequence stops, we call it a ==finite sequence==. If it keeps going forever, we call it an ==infinite sequence==.
### Finite and Infinite Sequences: A Gentle Start
A finite sequence is something like: 1,3,5,7,9. Here, the sequence has 5 terms, and it stops. We can easily calculate the sum, average, or any specific term in the sequence because it's finite. But what if the sequence doesn't stop? This is where things start getting interesting.
Consider the sequence:
$$1, \dfrac{1}{2}, \dfrac{1}{4}, \dfrac{1}{8}, \dfrac{1}{16}, \ldots$$
This sequence goes on forever. We can see a pattern: each term is half of the previous one. This is an example of a ==geometric sequence==, where each term is a fixed multiple of the previous one. Now, let's think about adding up all the terms of our geometric sequence. If we just add a finite number of terms, we get something like:
$$S_{n} = 1 + \dfrac{1}{2} + \dfrac{1}{4} + \dfrac{1}{8} + \dfrac{1}{16} + \ldots + \dfrac{1}{2^{n}}.$$
This sum is easy to calculate when n is finite (and small), but what happens when we try to add up all the terms, forever? This is where we encounter the concept of an infinite series - a sum of infinitely many terms. We write it as:
$$S = 1 + \dfrac{1}{2} + \dfrac{1}{4} + \dfrac{1}{8} + \dfrac{1}{16} + \ldots$$
At first glance, it might seem impossible to add an infinite number of terms. But notice that each term in this series is getting smaller and smaller. So, can the sum approach a specific value?
Indeed, it can! There's a formula for the sum $S$ of an infinite geometric series where each term is multiplied by a fixed number $r$ (called the common ratio):
$$S=\dfrac{a}{1-r},$$ where $a$ is the first term; the formula is valid whenever $\lvert r \rvert < 1$. In our case, $a=1$ and $r=\dfrac{1}{2}$, and therefore, $S = 2$. This tells us that even though we're adding an infinite number of terms, the total sum is finite and equals 2.
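The convergence of the partial sums toward $2$ is easy to watch numerically; the printed checkpoints below are arbitrary choices.
```python
# Partial sums of the first n terms of the geometric series with a = 1,
# r = 1/2, compared against the closed form a / (1 - r) = 2.
a, r = 1.0, 0.5
partial_sum, term = 0.0, a
for n in range(1, 21):
    partial_sum += term
    term *= r
    if n in (1, 5, 10, 20):
        print(f"S_{n:>2} = {partial_sum:.10f}, gap to 2 = {2 - partial_sum:.2e}")
```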
Let us consider another (disguised) application of an infinite sequence. Consider the number $0.99999\ldots$ (the digit $9$ repeating forever). It looks like it's just a bit less than $1$, but it's actually equal to $1$. In common parlance, you would often have heard that $0.99999\ldots$ can be *approximated* as $1$. How do we establish this rigorously?
Let's write $x=0.99999\ldots$ and observe that $$10x = 9.9999\ldots$$ and therefore, $$9x = 10x-x = 9.9999\ldots-0.9999\ldots = 9.$$ Solving, we get $0.99999\ldots=x=1$! That is, $0.99999\ldots$ is not just close to $1$ - it actually equals $1$. (Strictly speaking, these manipulations are justified because $0.999\ldots$ denotes the limit of the convergent geometric series $\frac{9}{10}+\frac{9}{100}+\cdots$.)
Through these examples, we see that infinite sequences allow us to understand concepts that go beyond the finite. They help us answer questions like:
* What happens when you add infinitely many numbers?
* How do we approach numbers like $\pi$ or $e$, which can't be written as simple fractions?
* What does it mean for a number to be *infinitely close* to another number?
Infinite sequences open the door to calculus, real analysis, and a deeper understanding of mathematics. They allow us to deal with ideas of infinity, limits, and convergence - concepts that are foundational in both theoretical and applied mathematics.
### Convergence of a Sequence
Now that we've seen the fascinating ways in which infinite sequences differ from finite ones, we're ready to delve deeper into the formal definitions and concepts that underpin these ideas. Sequences, whether finite or infinite, are at the core of many mathematical discoveries, and understanding how they behave is crucial for grasping more advanced topics. However, infinite sequences, in particular, introduce us to the concept of convergence—how a sequence approaches a specific value as we progress through its terms. Before we explore convergence, let's start by defining what we mean by a sequence of real numbers and how we can describe its behaviour over time.
<span style="color: lime;">**Definition 2.3.**</span> A ==sequence of real numbers== is a function $s: \mathbb{N} \to \mathbb{R}$.
Notice that the natural numbers play a crucial role here to formalize the idea of the position of real numbers in a sequence. When represented as a function from naturals to reals, we formalize the idea of labelling positions like the first term, second term and so on of a sequence. From now on, for notational convenience, we will write a sequence as $\{s_{n}\}$, $\{t_{n}\}$, etc.
What do we mean when we say that a sequence *converges* to a real number? A rough idea is the following: as you keep moving forward in the sequence, the value of the terms of the sequence get closer and closer to this specific real number, eventually settling near it. Let us formalize this idea in the next definition.
<span style="color: lime;">**Definition 2.4.**</span> The sequence $\{s_{n}\}$ ==converges to a number $x \in \mathbb{R}$== if $\forall \varepsilon > 0$, $\exists N_{\varepsilon} \in \mathbb{N}$, $\forall n \geq N_{\varepsilon}$, we have $$|s_{n}-x| < \varepsilon.$$
Two observations about the above definition are in order. First, the natural number $N_{\varepsilon}$ depends on $\varepsilon$. For instance, if $\varepsilon$ is very small, then $N_{\varepsilon}$ might have to be very large. Second, once you find an $N_{\varepsilon}$, any $N'$ larger than $N_{\varepsilon}$ also works.
>**Example.** Let $\{s_{n}\}$ be a sequence such that $\forall n \in \mathbb{N}$, $$s_{n} = \dfrac{1+2+\ldots+n}{n^{2}}.$$ We will show that this sequence converges to $\dfrac{1}{2}$. Before we use the definition of convergent sequences, let us first analyse each term of the sequence. We know that $\forall n \in \mathbb{N}$, $$1+2+\ldots+n = \dfrac{n(n+1)}{2}.$$ Therefore, $\forall n \in \mathbb{N}$, $$s_{n} = \dfrac{n(n+1)}{2n^{2}} = \dfrac{(n+1)}{2n} = \dfrac{1}{2} + \dfrac{1}{2n},$$ which means that, *intuitively*, this sequence converges to $\dfrac{1}{2}$.
Next, let us rigorously establish this observation using the definition of convergent sequences. Fix an arbitrary $\varepsilon > 0$. We need to prove the existence of a natural number $N_{\varepsilon}$ so that every term in the sequence on and after the $N_{\varepsilon}^{th}$ term is closer to $\dfrac{1}{2}$ than $\varepsilon$. That is, $\forall n \geq N_{\varepsilon}$, $$\left\lvert \dfrac{1+2+\ldots+n}{n^{2}} - \dfrac{1}{2}\right\rvert < \varepsilon.$$ Since $\forall n \geq N_{\varepsilon}$, $$\left\lvert \dfrac{(n+1)}{2n} - \dfrac{1}{2}\right\rvert = \dfrac{1}{2n} \leq \dfrac{1}{2N_{\varepsilon}},$$ it suffices to choose any $$N_{\varepsilon} > \dfrac{1}{2\varepsilon}.$$
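The bound $N_{\varepsilon} > \frac{1}{2\varepsilon}$ can be tested mechanically; the helper `s` below is just an illustrative implementation of the sequence from this example.
```python
import math

# s_n = (1 + 2 + ... + n) / n^2; the text shows |s_n - 1/2| = 1/(2n),
# so any N_eps > 1/(2*eps) witnesses convergence to 1/2.
def s(n: int) -> float:
    return sum(range(1, n + 1)) / n**2

for eps in (0.1, 0.01, 0.001):
    N = math.floor(1 / (2 * eps)) + 1
    assert all(abs(s(n) - 0.5) < eps for n in range(N, N + 500))
    print(f"eps = {eps}: N_eps = {N}, |s_N - 1/2| = {abs(s(N) - 0.5):.6f}")
```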
Another way of thinking about a convergent sequence is in terms of its *tail*, i.e., the portion of the sequence starting from any point onward. If a sequence converges, its tail will also converge to the same real number. This implies that the first few terms of the sequence do not affect its overall convergence. The tail captures the long-term behavior, meaning that the ultimate value the sequence approaches is determined by what happens after those initial terms. Conversely, if the tail converges to a particular value, the entire sequence must converge to that same value, showing that the convergence is dictated by the tail and not influenced by the earlier terms.
<span style="color: mediumslateblue;">**Theorem 2.5.**</span> If a sequence $\{s_{n}\}$ is convergent then it converges to a unique number.
*Proof.* Assume that the sequence $\{s_{n}\}$ converges to both $x,y \in \mathbb{R}$ with $x \neq y$. Then, $\forall \varepsilon>0$, $\exists N_{\varepsilon} \in \mathbb{N}$, $\forall n \geq N_{\varepsilon}$, $$\lvert x-y \rvert = \lvert x-s_{n}+s_{n}-y \rvert \leq \lvert s_{n}-x \rvert + \lvert s_{n}-y \rvert < 2\varepsilon.$$ Since $\varepsilon$ is arbitrary, $\lvert x-y \rvert = 0$, i.e., $x=y$, a contradiction to the assumption that $x \neq y$. $\blacksquare$
What does it mean when a sequence doesn't converge? Let us start by considering three examples of non-converging sequences.
>**Example.** Consider the sequence $\{s_{n}\}$ such that $\forall n \in \mathbb{N}$, $s_{n}=n$. This sequence doesn't converge. To see this, say the sequence $\{s_{n}\}$ converges to an $x \in \mathbb{R}$. No matter how large $x$ is, by the Archimedean property of $\mathbb{R}$, one can always find $n \in \mathbb{N}$ such that $n > x+1$; every term from the $n^{th}$ onward then satisfies $s_{m}-x \geq 1$, so the sequence cannot converge to $x$. In fact, the sequence *diverges* to $\infty$.
>**Example.** Consider the sequence $\{s_{n}\}$ such that $\forall n \in \mathbb{N}$, $s_{n}=(-1)^{n}+\dfrac{1}{n}$. One can see that when $n$ is odd, $s_{n}$ is $-1+\dfrac{1}{n}$, and when $n$ is even, $s_{n}$ is $1+\dfrac{1}{n}$. If at all this sequence converges, then it must converge to either $-1$ or $1$. However, it converges to neither of these values: for any $\varepsilon<2$, one cannot find a natural number $N_{\varepsilon}$ so that all terms after the $N_{\varepsilon}^{th}$ term get closer than $\varepsilon$ to one of these numbers (why?).
>**Example.** It is a common misunderstanding that a non-convergent sequence must either diverge or oscillate. Consider the sequence $\{s_{n}\}$ such that $\forall n \in \mathbb{N}$, $$s_{n} = \left[\dfrac{n}{\sqrt{2}} - \left\lfloor\dfrac{n}{\sqrt{2}}\right\rfloor \right] - \dfrac{1}{2}.$$ One can show that the terms of this sequence lie in the interval $\left[-\dfrac{1}{2},\dfrac{1}{2}\right]$, so it does not diverge. However, as the diagram below depicts, it does not exhibit any periodic oscillating behaviour either. <iframe src="https://www.desmos.com/calculator/zpplcbn1oj?embed" width="200" height="200" style="border: 1px solid #ccc" frameborder=0></iframe>
If a sequence $\{s_{n}\}$ does not converge to $x \in \mathbb{R}$, then $\exists \varepsilon > 0$, $\forall N \in \mathbb{N}$, $\exists n_{N} \in \mathbb{N}$ with $n_{N} \geq N$ such that $$\lvert s_{n_{N}}-x \rvert \geq \varepsilon.$$ By the Archimedean property of $\mathbb{R}$, $\mathbb{N}$ has no upper bound, and therefore, the above inequality holds for infinitely many values of $n \in \mathbb{N}$. This leads us to the following definition.
<span style="color: lime;">**Definition 2.5.**</span> A sequence $\{s_{n}\}$ ==does not converge to== $x \in \mathbb{R}$ if and only if $\exists \varepsilon > 0$ such that $$\lvert s_{n}-x \rvert \geq \varepsilon$$ holds for infinitely many values of $n$.
Now that we have explored convergence of sequences in detail, let us study sequences that diverge to $\infty$.
<span style="color: lime;">**Definition 2.6.**</span> A sequence $\{s_{n}\}$ ==diverges to $\infty$== if $\forall M \in \mathbb{R}$, $\exists N_{M} \in \mathbb{N}$, $\forall n \geq N_{M}$, $s_{n} \geq M$.
Therefore, when a sequence diverges to $\infty$, no matter how far up the number line you go, the sequence will eventually and permanently surpass that point. Think of it as an unstoppable process—no matter how large a value $M$ you choose, the sequence will eventually get there and keep going beyond it, never looking back.
For instance, if the terms of a sequence *grow* at a fixed rate $\alpha > 1$ (i.e., $s_{n+1} \geq \alpha s_{n}$ with $s_{1} > 0$), then the sequence exhibits exponential growth and diverges to $\infty$.
<span style="color: lime;">**Definition 2.7.**</span> A sequence $\{s_{n}\}$ is ==bounded== if $\exists M \in \mathbb{R}$, $\forall n \in \mathbb{N}$, $\lvert s_{n} \rvert \leq M$.
The next theorem shouldn't come as a surprise: every convergent sequence is bounded. If a sequence converges, then after a certain point all its terms stay close to a finite value; what is left to be proved is that the finitely many terms before this "certain point" can also be tamed.
<span style="color: mediumslateblue;">**Theorem 2.6.**</span> Every convergent sequence $\{s_{n}\}$ is bounded.
*Proof.* Let the sequence $\{s_{n}\}$ converge to $x \in \mathbb{R}$. Fix an $\varepsilon >0$. Using the triangle inequality and the fact that the sequence $\{s_{n}\}$ converges to $x$, we know $\exists N_{\varepsilon} \in \mathbb{N}$, $\forall n \geq N_{\varepsilon}$, $$\lvert s_{n} \rvert = \lvert s_{n}-x+x \rvert \leq \lvert s_{n}-x \rvert + \lvert x \rvert < \varepsilon+|x|.$$ Therefore, for all $n \in \mathbb{N}$, $$\lvert s_{n} \rvert \leq \max\{\lvert s_{1} \rvert, \lvert s_{2} \rvert,\ldots, \lvert s_{N_{\varepsilon}-1} \rvert, |x|+\varepsilon\},$$ as required. $\blacksquare$
The next theorem states important algebraic properties of convergent sequences.
<span style="color: mediumslateblue;">**Theorem 2.7.**</span> Let $\{s_{n}\}$ and $\{t_{n}\}$ be sequences converging to $x \in \mathbb{R}$ and $y \in \mathbb{R}$, respectively. Then,
(i) for all $\alpha, \beta \in \mathbb{R}$, the sequence $\{\alpha s_{n}+\beta t_{n}\}$ converges to $\alpha x+\beta y$.
(ii) the sequence $\{s_{n}t_{n}\}$ converges to $xy$.
(iii) if for all $n \in \mathbb{N}$, $t_{n} \neq 0$, and $y \neq 0$, then $\left\{\dfrac{s_{n}}{t_{n}}\right\}$ converges to $\dfrac{x}{y}$.
*Proof.* (i) Choose an arbitrary $\varepsilon > 0$. Since the sequence $\{s_{n}\}$ converges to $x$, there exists $N_{1} \in \mathbb{N}$ such that $\forall n \geq N_{1}$, $\left\lvert s_{n}-x\right\rvert < \frac{\varepsilon}{2(\lvert \alpha \rvert+1)}$, and since the sequence $\{t_{n}\}$ converges to $y$, there exists $N_{2} \in \mathbb{N}$ such that $\forall n \geq N_{2}$, $\left\lvert t_{n}-y\right\rvert < \frac{\varepsilon}{2(\lvert \beta \rvert+1)}$ (the $+1$ in the denominators guards against $\alpha=0$ or $\beta=0$). Therefore, for all $n \geq \max\{N_{1},N_{2}\}$, we have $$\left\lvert (\alpha s_{n}+ \beta t_{n}) - (\alpha x+ \beta y)\right\rvert \leq \lvert \alpha \rvert \cdot \left\lvert s_{n}-x \right\rvert + \lvert \beta \rvert \cdot \left\lvert t_{n} - y \right\rvert < \dfrac{\varepsilon}{2} + \dfrac{\varepsilon}{2} = \varepsilon,$$ as required.
(ii) Choose an arbitrary $\varepsilon > 0$. Since the sequence $\{t_{n}\}$ converges to $y$, it is bounded (Theorem 2.6), so $\exists T > 0$, $\forall n \in \mathbb{N}$, $\lvert t_{n} \rvert \leq T$. Since the sequence $\{s_{n}\}$ converges to $x$, there exists $N_{1} \in \mathbb{N}$ such that $\forall n \geq N_{1}$, $\left\lvert s_{n}-x\right\rvert < \frac{\varepsilon}{2T}$, and since the sequence $\{t_{n}\}$ converges to $y$, there exists $N_{2} \in \mathbb{N}$ such that $\forall n \geq N_{2}$, $\left\lvert t_{n}-y\right\rvert < \frac{\varepsilon}{2(\lvert x \rvert+1)}$. Therefore, for all $n \geq \max\{N_{1},N_{2}\}$, we have $$\lvert s_{n}t_{n} - xy \rvert = \lvert s_{n}t_{n} - xt_{n} + xt_{n} - xy \rvert \leq \lvert t_{n} \rvert \cdot \lvert s_{n} - x \rvert + \lvert x \rvert \cdot \lvert t_{n} - y \rvert < \dfrac{\varepsilon}{2} + \dfrac{\varepsilon}{2} = \varepsilon,$$ as required.
(iii) We show that the sequence $\left\{\dfrac{1}{t_{n}}\right\}$ converges to $\dfrac{1}{y}$; if this is true, then we are done using (ii). Choose an arbitrary $\varepsilon > 0$. Since the sequence $\{t_{n}\}$ converges to $y$, there exists $N_{1} \in \mathbb{N}$ such that $\forall n \geq N_{1}$, $\lvert t_{n}-y \rvert < \frac{\lvert y \rvert}{2}$, and there exists $N_{2} \in \mathbb{N}$ such that $\forall n \geq N_{2}$, $\lvert t_{n}-y \rvert < \frac{\varepsilon \lvert y \rvert^{2}}{2}$. Therefore, $\forall n \geq N_{1}$, $\lvert y \rvert=\lvert y-t_{n}+t_{n} \rvert \leq \lvert y-t_{n} \rvert + \lvert t_{n} \rvert < \frac{\lvert y \rvert}{2}+ \lvert t_{n} \rvert$, i.e., $\lvert t_{n} \rvert > \frac{\lvert y \rvert}{2}$, and $\forall n \geq \max\{N_{1},N_{2}\}$, we have
$$\left\lvert \dfrac{1}{t_{n}} - \dfrac{1}{y} \right\rvert = \left\lvert \dfrac{y-t_{n}}{t_{n}y}\right\rvert = \dfrac{\lvert t_{n}-y \rvert}{\lvert t_{n} \rvert \lvert y \rvert} < \dfrac{\varepsilon \lvert y \rvert^{2}/2}{(\lvert y \rvert/2)\lvert y \rvert} =\varepsilon,$$ as required. $\blacksquare$
The following theorem establishes that if every term of one sequence is at most the corresponding term of another, then the limit of the former is at most the limit of the latter.
<span style="color: mediumslateblue;">**Theorem 2.8.**</span> Let $\{s_{n}\}$ and $\{t_{n}\}$ be sequences converging to $x \in \mathbb{R}$ and $y \in \mathbb{R}$. If for all $n \in \mathbb{N}$, $s_{n} \leq t_{n}$, then $x \leq y$.
*Proof.* Let the sequence $\{s_{n}\}$ converge to $x \in \mathbb{R}$ and the sequence $\{t_{n}\}$ converge to $y \in \mathbb{R}$. This means $\forall \varepsilon > 0$, $\exists N_{\varepsilon}^{1} \in \mathbb{N}$, $\forall n \geq N_{\varepsilon}^{1}$, $\lvert s_{n}-x \rvert < \varepsilon$, and $\forall \varepsilon > 0$, $\exists N_{\varepsilon}^{2} \in \mathbb{N}$, $\forall n \geq N_{\varepsilon}^{2}$, $\lvert t_{n}-y \rvert < \varepsilon$. Combining, we have $\forall \varepsilon > 0$, $\forall n \geq N_{\varepsilon} = \max\{N_{\varepsilon}^{1}, N_{\varepsilon}^{2}\}$, $\lvert s_{n}-x \rvert < \varepsilon$ and $\lvert t_{n}-y \rvert < \varepsilon$. Therefore, $\forall \varepsilon > 0$, $\forall n \geq N_{\varepsilon}$, $$0 \leq t_{n}-s_{n} < 2\varepsilon + y-x,$$ implying $\forall \varepsilon > 0$, $x-y < 2\varepsilon$, and hence $x \leq y$, completing the proof of the theorem. $\blacksquare$
<span style="color: mediumslateblue;">**Corollary 2.3.**</span> Let $\{s_{n}\}$ be a sequence converging to $x \in \mathbb{R}$. If there exist $\alpha, \beta \in \mathbb{R}$ such that $\forall n \in \mathbb{N}$, $\alpha \leq s_{n} \leq \beta$, then $\alpha \leq x \leq \beta$.
<span style="color: mediumslateblue;">**Theorem 2.9.**</span> Let $\{s_{n}\}$ and $\{t_{n}\}$ be sequences converging to $x \in \mathbb{R}$ and $y \in \mathbb{R}$, respectively. Further, assume that $x=y$ and there exists a sequence $\{r_{n}\}$ such that for all $n \in \mathbb{N}$, $s_{n} \leq r_{n} \leq t_{n}$. Then, the sequence $\{r_{n}\}$ also converges to $x(=y)$.
*Proof.* Assume for contradiction that the sequence $\{r_{n}\}$ does not converge to $x(=y)$. By Definition 2.5, $\exists \varepsilon > 0$ such that $\lvert r_{n}-x \rvert \geq \varepsilon$ for infinitely many $n$; hence at least one of the inequalities $r_{n} \geq x+\varepsilon$ or $r_{n} \leq x-\varepsilon$ holds for infinitely many $n$. In the first case, since $r_{n} \leq t_{n}$, we have $t_{n} \geq x+\varepsilon$ for infinitely many $n$, contradicting the fact that the sequence $\{t_{n}\}$ converges to $x$. In the second case, since $s_{n} \leq r_{n}$, we have $s_{n} \leq x-\varepsilon$ for infinitely many $n$, contradicting the fact that the sequence $\{s_{n}\}$ converges to $x$. $\blacksquare$
Next, we study a special class of sequences with nice convergence properties, which will come in handy in this course and beyond.
<span style="color: lime;">**Definition 2.8.**</span> A sequence $\{s_{n}\}$ is called ==non-decreasing== if $\forall n \in \mathbb{N}$, $s_{n} \leq s_{n+1}$. Similarly, a sequence $\{s_{n}\}$ is called ==non-increasing== if $\forall n \in \mathbb{N}$, $s_{n} \geq s_{n+1}$. A sequence is called ==monotonic== if it is either non-decreasing or non-increasing.
Recall that Theorem 2.6 showed that every convergent sequence is bounded. What about its converse? Of course, there are bounded sequences that are not convergent. However, can we identify a natural class of bounded sequences that are convergent?
<span style="color: mediumslateblue;">**Theorem 2.10.**</span> Suppose $\{s_{n}\}$ is a monotonic sequence. Then, the sequence $\{s_{n}\}$ converges if and only if it is bounded. In particular, if the sequence $\{s_{n}\}$ is non-decreasing then it converges to $\text{sup}\{s_{n}\}$ and if the sequence $\{s_{n}\}$ is non-increasing then it converges to $\text{inf}\{s_{n}\}$.
*Proof.* We prove this theorem for the case where $\{s_{n}\}$ is a non-decreasing and bounded sequence; one can establish the result for a non-increasing and bounded sequence using symmetric arguments, and the converse direction (convergent implies bounded) is Theorem 2.6. Since the sequence $\{s_{n}\}$ is bounded, $\sup\{s_{n}\}$ exists, and therefore, by Theorem 2.1 (vi), $\forall \varepsilon > 0$, $\exists N_{\varepsilon} \in \mathbb{N}$ such that $\sup\{s_{n}\}-\varepsilon < s_{N_{\varepsilon}} \leq \sup\{s_{n}\} < \sup\{s_{n}\}+\varepsilon$. Since the sequence $\{s_{n}\}$ is non-decreasing (and since $\sup\{s_{n}\}$ is an upper bound), $\forall n \geq N_{\varepsilon}$, we have $\sup\{s_{n}\}-\varepsilon < s_{N_{\varepsilon}} \leq s_{n} \leq \sup\{s_{n}\} < \sup\{s_{n}\}+\varepsilon$, implying that the sequence $\{s_{n}\}$ converges to $\sup\{s_{n}\}$. $\blacksquare$
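Theorem 2.10 is the workhorse behind many limit computations. As a numerical illustration (the recursion below is our own choice, not an example from the text), the sequence $s_{1}=\sqrt{2}$, $s_{n+1}=\sqrt{2+s_{n}}$ is non-decreasing and bounded above by $2$, so the theorem guarantees convergence to its supremum, which is $2$.
```python
import math

# A non-decreasing sequence bounded above by 2: s_{n+1} = sqrt(2 + s_n).
s = 0.0
for n in range(1, 31):
    s = math.sqrt(2.0 + s)
    if n in (1, 5, 10, 30):
        print(f"s_{n:>2} = {s:.12f}")   # climbs toward sup{s_n} = 2
```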
### Subsequences and Bolzano-Weierstrass Property
Imagine you're on a journey toward a destination, but the road ahead is full of twists and turns. As you navigate, you might find that not every road leads directly to where you want to go. However, if you carefully choose the right turns along the way—picking only those roads that steadily guide you in the correct direction—you'll eventually reach your destination.
These carefully chosen roads represent a subsequence. Even though the entire network of roads (the original sequence) might seem confusing or overwhelming, by focusing on just the right turns, you uncover a clear path that gets you exactly where you need to be. This approach reveals that, despite the complexity, there's always a way to find a direct route to your goal, highlighting the structure hidden within the winding paths.
<span style="color: lime;">**Definition 2.9.**</span> Let $s: \mathbb{N} \to \mathbb{R}$ be a sequence. A sequence $t: \mathbb{N} \to \mathbb{R}$ is called a ==subsequence== of the sequence $s: \mathbb{N} \to \mathbb{R}$ if $\forall k \in \mathbb{N}$, $\exists n_{k} \in \mathbb{N}$ such that $t_{k} = s_{n_{k}}$ and $\forall k,k' \in \mathbb{N}$, $k < k' \Rightarrow n_{k} < n_{k'}$.
Let us take a few examples to make our idea of a subsequence precise.
> **Example.** Consider the sequence $\{s_{n}\}$ where $\forall n \in \mathbb{N}$, $s_{n} = \dfrac{n}{n+1}$. Then the sequence $\{t_{k}\}$ where $\forall k \in \mathbb{N}$, $t_{k} = \frac{1}{2}$ is not a subsequence of the sequence $\{s_{n}\}$: the value $\frac{1}{2}$ is attained only at $n=1$, so it cannot appear at infinitely many increasing positions.
> **Example.** Consider the sequence $\{s_{n}\}$ where $\forall n \in \mathbb{N}$, $s_{n} = (-1)^{n}$. Then the sequence $\{t_{k}\}$ where $\forall k \in \mathbb{N}$, $t_{k} = 1$ is a subsequence of the sequence $\{s_{n}\}$: take $n_{k}=2k$.
The next result should not be surprising: any subsequence of a convergent sequence must also converge, and to the same limit.
<span style="color: mediumslateblue;">**Theorem 2.11.**</span> If a sequence $\{s_{n}\}$ converges to $x \in \mathbb{R}$, then every subsequence $\{s_{n_{k}}\}$ of the sequence $\{s_{n}\}$ also converges to $x$.
*Proof*. Since the sequence $\{s_{n}\}$ converges to $x \in \mathbb{R}$, $\forall \varepsilon > 0$, $\exists N \in \mathbb{N}$, $\forall n \geq N$, $\lvert s_{n}-x \rvert < \varepsilon$. Consider a subsequence $\{s_{n_{k}}\}$ of the sequence $\{s_{n}\}$. Since the indices are strictly increasing, i.e., $n_{k} < n_{k+1}$ for all $k \in \mathbb{N}$, a simple induction shows that $n_{k} \geq k$ for all $k \in \mathbb{N}$. Therefore, $\forall \varepsilon > 0$, $\forall k \geq N$, we have $n_{k} \geq N$ and hence $\lvert s_{n_{k}}-x \rvert < \varepsilon$, implying that the subsequence $\{s_{n_{k}}\}$ also converges to $x$. $\blacksquare$
The next important and powerful theorem is due to Bernard Bolzano and Karl Weierstrass, and states that every bounded sequence has a convergent subsequence. The proof is constructive, so let us first form an intuitive idea of the construction. All terms of a bounded sequence lie in some interval $[-M,M]$. Split this interval into two halves; at least one half must contain infinitely many terms of the sequence. Keep that half, split it again, and repeat. The nested intervals shrink by a factor of two at each stage, and picking one term of the sequence from each stage produces a subsequence whose terms are trapped in smaller and smaller intervals, and therefore converge.
<span style="color: mediumslateblue;">**Theorem 2.12.**</span> (Bolzano-Weierstrass) A bounded sequence has a convergent subsequence.
*Proof*. Let $\left\{s_{n}\right\}$ be a bounded sequence, i.e., $\exists M \in \mathbb{R}$, $\forall n \in \mathbb{N}$, $\lvert s_{n} \rvert \leq M$. The proof is trivial if the sequence has only a finite number of distinct elements, since in that case there is at least one value that is repeated infinitely often; hence we can take a subsequence all of whose entries are equal to that value. So assume $\left\{s_{n}\right\}$ has an infinite number of distinct points. Divide the interval $I=\left[-M,M\right]$ into two equal parts and look at which points of the sequence $\left\{s_{n}\right\}$ belong to each half. Since $\left\{s_{n}: n \in \mathbb{N}\right\}$ is an infinite set, one of the halves contains an infinite number of these points; call it $J_{1}$ and let $\mathbb{N}_{1}=\left\{n \in \mathbb{N}: s_{n} \in J_{1}\right\}$. Put $t_{1}=\inf \left\{s_{n}: {n} \in \mathbb{N}_1\right\}$, and let $n_{1}=\min \mathbb{N}_{1}$. We have the first element of our desired subsequence, $s_{n_{1}}$, and it satisfies $\left|t_{1}-s_{n_{1}}\right| \leq M$, the length of $J_{1}$.
To find the second element in the desired subsequence, we proceed in a similar way. Divide the interval $J_{1}$ into two equal parts, and let $J_{2}$ be a half interval that contains infinitely many of the points $\left\{s_{n}: n \in \mathbb{N}_1\right\}$. Put $\mathbb{N}_{2}=\left\{n \in \mathbb{N}_1: s_{n} \in J_{2}\right\}$, and let $t_{2}=\inf \left\{s_{n}: n \in \mathbb{N}_{2}\right\}$. Put $n_{2}=\min \left\{n \in \mathbb{N}_2 \setminus \left\{n_{1}\right\}\right\}$. Because $\mathbb{N}_{2} \subseteq \mathbb{N}_1$, we have that $$
t_{1} \leq t_{2},$$ and $$\left\lvert t_{2} - s_{n_{2}} \right\rvert \leq \dfrac{M}{2}.$$
Continuing this process, we obtain nested intervals $J_{1} \supseteq J_{2} \supseteq \cdots$; infinite sets of integers $\mathbb{N}_{1}, \mathbb{N}_{2}, \ldots$ with $\mathbb{N}_{1} \supseteq \mathbb{N}_{2} \supseteq \cdots$; and integers $n_{1}<n_{2}<\cdots$. We define $t_{k}=\inf \left\{s_{n}: n \in \mathbb{N}_{k}\right\}$. These satisfy:
$$
\begin{aligned}
& t_{k} \leq t_{k+1} \\
& \left|t_{k}-s_{n_{k}}\right| \leq \frac{M}{2^{k-1}}
\end{aligned}
$$
Now these conditions imply that $\left\{t_{k}\right\}$ is a non-decreasing sequence that is bounded above by $M$. By Theorem 2.10, there is a number $x$ such that the sequence $\left\{t_{k}\right\}$ converges to $x$. Fix an arbitrary $\varepsilon > 0$. Then $\exists N^{1} \in \mathbb{N}$, $\forall k \geq N^{1}$, $\lvert t_{k}-x \rvert < \frac{\varepsilon}{2}$. Moreover, by the Archimedean property, $\exists N^{2} \in \mathbb{N}$, $\frac{M}{2^{N^{2}-1}} < \frac{\varepsilon}{2}$. Therefore, $\forall k \geq N_{\varepsilon} = \max\{N^{1},N^{2}\}$,
$$
\begin{aligned}
\left|s_{n_{k}}-x\right| & =\left|s_{n_{k}}-t_{k}+t_{k}-x\right| \\
& \leq\left|s_{n_{k}}-t_{k}\right|+\left|t_{k}-x\right| \\
& \leq \dfrac{M}{2^{k-1}}+\left|t_{k}-x\right| \\
& < \dfrac{\varepsilon}{2}+\dfrac{\varepsilon}{2} = \varepsilon,
\end{aligned}
$$
implying that the subsequence $\{s_{n_{k}}\}$ of the sequence $\{s_{n}\}$ converges to $x$, completing the proof of the theorem. $\blacksquare$
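The bisection construction in the proof can be imitated on a computer, with one honest caveat: a program only ever sees finitely many terms, so "the half with infinitely many points" becomes "the more populated half of the sampled indices". The sketch below (all names are illustrative) applies this finite-sample version to $s_{n}=(-1)^{n}+\frac{1}{n}$ and prints the tail of the extracted subsequence, which settles near $-1$.
```python
# Finite-sample sketch of the bisection argument in Theorem 2.12.
def bw_subsequence(s, num_terms=10_000, stages=12):
    lo, hi = -2.0, 2.0                       # [-M, M] with M = 2 bounds s_n
    candidates = list(range(1, num_terms + 1))
    chosen = []
    for _ in range(stages):
        mid = (lo + hi) / 2.0
        left = [n for n in candidates if s(n) <= mid]
        right = [n for n in candidates if s(n) > mid]
        if len(left) >= len(right):          # keep a heavily populated half
            candidates, hi = left, mid
        else:
            candidates, lo = right, mid
        n_k = min(n for n in candidates if not chosen or n > chosen[-1])
        chosen.append(n_k)                   # one index per stage, increasing
    return chosen

s = lambda n: (-1) ** n + 1 / n
indices = bw_subsequence(s)
print([round(s(n), 6) for n in indices[-5:]])   # tail of the subsequence
```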
<span style="color: lime;">**Definition 2.10.**</span> A sequence $\{s_{n}\}$ is called a ==Cauchy sequence== or is said to satisfy ==Cauchy property== if $\forall \varepsilon > 0$, $\exists N_{\varepsilon} \in \mathbb{N}$, $\forall m,n \geq N_{\varepsilon}$, $\lvert s_{n}-s_{m} \rvert < \varepsilon$.
<span style="color: mediumslateblue;">**Theorem 2.13.**</span> A sequence is convergent if and only if it is a Cauchy sequence.
*Proof*. (If part) Let $\{s_{n}\}$ be a sequence satisfying the Cauchy property. First, we show that the sequence $\{s_{n}\}$ is bounded. To see this, observe that $\exists N \in \mathbb{N}$, $\forall n \geq N$, $\lvert s_{n} - s_{N} \rvert < 1$. Using the triangle inequality, we have $\forall n \geq N$, $\lvert s_{n} \rvert < \lvert s_{N} \rvert +1$, and the boundedness of the whole sequence follows using similar arguments as in the proof of Theorem 2.6.
Next, using the Bolzano-Weierstrass theorem, there exists a convergent subsequence $\{s_{n_{k}}\}$ of the sequence $\{s_{n}\}$; say the subsequence $\{s_{n_{k}}\}$ converges to $x \in \mathbb{R}$. We claim that the sequence $\{s_{n}\}$ also converges to $x$. Fix an arbitrary $\varepsilon > 0$. Since the sequence $\{s_{n}\}$ satisfies the Cauchy property, $\exists N_{1} \in \mathbb{N}$, $\forall n,m \geq N_{1}$, $$\lvert s_{n}-s_{m} \rvert < \dfrac{\varepsilon}{2}.$$ Since the subsequence $\{s_{n_{k}}\}$ converges to $x$, $\exists N_{2} \in \mathbb{N}$, $\forall k \geq N_{2}$, $$\lvert s_{n_{k}}-x \rvert < \dfrac{\varepsilon}{2}.$$ Choose $k \geq \max\{N_{1},N_{2}\}$, so that $n_{k} \geq N_{1}$ (recall $n_{k} \geq k$). Therefore, for all $n \geq N_{1}$, $$\lvert s_{n}-x \rvert = \lvert s_{n}-s_{n_{k}}+s_{n_{k}}-x \rvert \leq \lvert s_{n}-s_{n_{k}} \rvert + \lvert s_{n_{k}}-x \rvert < \dfrac{\varepsilon}{2} + \dfrac{\varepsilon}{2} = \varepsilon,$$ implying that the sequence $\{s_{n}\}$ converges to $x$.
(Only if part) Let the sequence $\{s_{n}\}$ converge to $x \in \mathbb{R}$. We show that $\{s_{n}\}$ is a Cauchy sequence. Fix an arbitrary $\varepsilon > 0$. Since the sequence $\{s_{n}\}$ converges to $x$, $\exists N \in \mathbb{N}$, $\forall n,m \geq N$, $$\lvert s_{n}-s_{m} \rvert \leq \lvert s_{n}-x \rvert + \lvert x-s_{m} \rvert < 2\varepsilon,$$ thereby implying that the sequence $\{s_{n}\}$ satisfies the Cauchy property. $\blacksquare$
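Numerically, the Cauchy property shows up as tail fluctuations that shrink as $N$ grows. The snippet below (an illustration, with a crude finite window standing in for the full tail) contrasts the convergent sequence from the earlier example with the non-convergent $(-1)^{n}$.
```python
# Largest pairwise gap among sampled tail terms, as a Cauchy-property proxy.
def max_tail_gap(seq, N, window=2_000):
    terms = [seq(n) for n in range(N, N + window)]
    return max(terms) - min(terms)

s = lambda n: (n + 1) / (2 * n)   # converges to 1/2, so gaps shrink
t = lambda n: (-1) ** n           # not Cauchy: the gap stays at 2
for N in (10, 100, 1000):
    print(f"N={N}: gap for s_n = {max_tail_gap(s, N):.6f}, "
          f"gap for t_n = {max_tail_gap(t, N):.1f}")
```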
## 2.3. Continuity of Functions
### Continuity: Definition(s) and Algebraic Properties
<span style="color: lime;">**Definition 2.11.**</span> (Sequential version) Let $X \subseteq \mathbb{R}$. A function $f: X \to \mathbb{R}$ is ==continuous at $x_{0} \in X$== if for all sequences $\{s_{n}\} \subseteq X$ that converge to $x_{0}$, the sequence $\{f(s_{n})\}$ converges to $f(x_{0})$. A function $f: X \to \mathbb{R}$ is ==continuous== if it is continuous at all $x \in X$.
<span style="color: lime;">**Definition 2.12.**</span> ($\varepsilon$-$\delta$ version) Let $X \subseteq \mathbb{R}$. A function $f: X \to \mathbb{R}$ is ==continuous at $x_{0} \in X$== if $\forall \varepsilon > 0$, $\exists \delta > 0$, $\forall x \in X$, $\left[\lvert x-x_{0} \rvert < \delta \Rightarrow \lvert f(x)-f(x_{0}) \rvert < \varepsilon \right]$.
> **Example**. Consider the function $f: \mathbb{R} \to \mathbb{R}$ such that $\forall x \in \mathbb{R}$, $f(x)=ax+b$ where $a,b \in \mathbb{R}$. We show that $f$ is continuous at every point $x_{0} \in \mathbb{R}$ using Definition 2.12. Pick an arbitrary $\varepsilon > 0$. If $a=0$, then $f$ is constant and any $\delta > 0$ works, so assume $a \neq 0$ and set $\delta = \frac{\varepsilon}{\lvert a \rvert}$. Observe that whenever $\lvert x-x_{0} \rvert < \delta$, $$\lvert f(x)-f(x_{0}) \rvert = \lvert ax+b-ax_{0}-b \rvert = \lvert a \rvert \lvert x - x_{0} \rvert < \lvert a \rvert \delta = \varepsilon,$$ as required.
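The recipe $\delta = \varepsilon/\lvert a \rvert$ can be checked mechanically; the constants and sampled points below are arbitrary illustrative choices within $(x_{0}-\delta, x_{0}+\delta)$.
```python
# Verifying delta = eps/|a| for f(x) = a*x + b at x0 (illustrative values).
a, b, x0, eps = 3.0, -1.0, 2.0, 1e-3
delta = eps / abs(a)
f = lambda x: a * x + b

for k in range(1, 100):
    x = x0 + (k / 100.0) * delta * (-1) ** k   # points with |x - x0| < delta
    assert abs(f(x) - f(x0)) < eps
print(f"delta = {delta} works for eps = {eps} at x0 = {x0}")
```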
> **Example**. Next, let us consider a slightly more involved example. Let $f: \mathbb{R} \to \mathbb{R}$ such that $$
f(x) =
\begin{cases}
x^2 & \text{if } x \geq 0, \\
x^3 & \text{if } x < 0.
\end{cases}
$$ We claim that this function is continuous at $x=0$: given $\varepsilon > 0$, choose $\delta = \min\{1, \varepsilon\}$; then $\lvert x \rvert < \delta$ implies $\lvert f(x)-f(0) \rvert \leq \lvert x \rvert < \delta \leq \varepsilon$, since $x^{2} \leq \lvert x \rvert$ and $\lvert x \rvert^{3} \leq \lvert x \rvert$ whenever $\lvert x \rvert < 1$. (In fact, this function is continuous at all $x_{0} \in \mathbb{R}$, and the continuity at values of $x_{0}$ other than $0$ can be established in a similar fashion as in the example above.)
> **Example**. Consider the function $f: \mathbb{R} \to \mathbb{R}$ such that $\forall x \in \mathbb{R}$, $f(x)=\lfloor x \rfloor$. First, we show that $f$ is discontinuous at $x=3$. In other words, we show that $\exists \varepsilon > 0$, $\forall \delta > 0$, $\exists x \in \mathbb{R}$ such that $\lvert x-3 \rvert < \delta$ and $\lvert f(x)-f(3) \rvert \geq \varepsilon$. For an arbitrary $\delta > 0$, note that if $x = 3-\frac{\delta}{2}$ then
\begin{align*}
\lvert x-3 \rvert &= \left\lvert 3-\frac{\delta}{2}-3 \right\rvert \\
&= \left\lvert -\frac{\delta}{2} \right\rvert \\
&= \frac{\delta}{2} \\
&< \delta.
\end{align*}
Moreover, $x = 3-\frac{\delta}{2} < 3$, so by the definition of the floor function, $f(x) = \lfloor x \rfloor \leq 2$ while $f(3)=3$. Therefore,
\begin{align*}
\lvert f(x)-f(3) \rvert &= 3 - \left\lfloor 3-\frac{\delta}{2} \right\rfloor \\
&\geq 3-2 \\
&= 1.
\end{align*}
Choosing $\varepsilon = 1$, we have exhibited, for every $\delta > 0$, a point $x$ with $\lvert x-3 \rvert < \delta$ and $\lvert f(x)-f(3) \rvert \geq \varepsilon$, as required.
Next, assume that the domain of the function is modified as follows: $f: \mathbb{Z} \to \mathbb{R}$ such that $\forall x \in \mathbb{Z}$, $f(x)=\lfloor x \rfloor$. Then, we claim that $f$ is continuous at all $x \in \mathbb{Z}$: for any $\varepsilon > 0$, choose $\delta = \frac{1}{2}$; the only point of $\mathbb{Z}$ within $\delta$ of $x$ is $x$ itself, so $\lvert f(y)-f(x) \rvert = 0 < \varepsilon$ for every admissible $y$.
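The jump of the floor function at $3$ is visible numerically: the witness $x = 3-\frac{\delta}{2}$ keeps $\lvert f(x)-f(3) \rvert$ at $1$ no matter how small $\delta$ gets.
```python
import math

# The point x = 3 - delta/2 is within delta of 3, yet floor(x) <= 2.
for delta in (1.0, 0.1, 1e-6):
    x = 3 - delta / 2
    print(f"delta = {delta}: |x - 3| = {abs(x - 3):.2e}, "
          f"|floor(x) - floor(3)| = {abs(math.floor(x) - math.floor(3))}")
# The gap stays at 1, so eps = 1 witnesses the discontinuity.
```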
<span style="color: mediumslateblue;">**Theorem 2.14.**</span> Definitions 2.11. and 2.12. are equivalent.
*Proof*. ($2.11. \Rightarrow 2.12.$) Let $X \subseteq \mathbb{R}$ and let $f: X \to \mathbb{R}$ be continuous at $x_{0} \in X$ in the sense of Definition 2.11, i.e., for all sequences $\{s_{n}\} \subseteq X$ that converge to $x_{0}$, the sequence $\{f(s_{n})\}$ converges to $f(x_{0})$. Assume for contradiction that Definition 2.12 fails, i.e., $\exists \varepsilon > 0$ such that $\forall \delta>0$, $\exists x \in X$ with $\lvert x-x_{0} \rvert < \delta$ and $\lvert f(x)-f(x_{0}) \rvert \geq \varepsilon$. Since this holds for all $\delta>0$, it holds in particular for $\delta = \frac{1}{n}$ for each $n \in \mathbb{N}$: pick $s_{n} \in X$ with $\lvert s_{n}-x_{0} \rvert < \frac{1}{n}$ and $\lvert f(s_{n})-f(x_{0}) \rvert \geq \varepsilon$. The sequence $\{s_{n}\}$ converges to $x_{0}$, yet the sequence $\{f(s_{n})\}$ does not converge to $f(x_{0})$, a contradiction.
($2.12. \Rightarrow 2.11.$) Let $X \subseteq \mathbb{R}$ and let $f: X \to \mathbb{R}$ be continuous at $x_{0} \in X$ in the sense of Definition 2.12, i.e., $\forall \varepsilon>0$, $\exists \delta > 0$, $\forall x \in X$, $\left[\lvert x-x_{0} \rvert < \delta \Rightarrow \lvert f(x)-f(x_{0}) \rvert < \varepsilon \right]$. Consider an arbitrary sequence $\{s_{n}\} \subseteq X$ that converges to $x_{0}$, i.e., $\forall \delta>0$, $\exists N \in \mathbb{N}$, $\forall n \geq N$, $\lvert s_{n}-x_{0} \rvert < \delta$. Fix $\varepsilon > 0$, take the $\delta$ given by continuity, and then the $N$ given by convergence for this $\delta$. Then $\forall n \geq N$, $\lvert f(s_{n})-f(x_{0}) \rvert < \varepsilon$, implying that the sequence $\{f(s_{n})\}$ converges to $f(x_{0})$, as required. $\blacksquare$
<span style="color: mediumslateblue;">**Theorem 2.15.**</span> Let $X \subseteq \mathbb{R}$. If $f: X \to \mathbb{R}$ and $g: X \to \mathbb{R}$ are functions continuous at $x_{0} \in X$, then the following hold:
(i) the function $f+g$ is continuous at $x_{0} \in X$.
(ii) $fg$ is continuous at $x_{0} \in X$.
(iii) If $\forall x \in X$, $g(x) \neq 0$ then $\frac{f}{g}$ is continuous at $x_{0}$.
*Proof*. Follows from Theorem 2.7 and the sequential definition of continuity (Definition 2.11). $\blacksquare$
<span style="color: lime;">**Definition 2.13.**</span> Let $X \subseteq \mathbb{R}$. A function $f: X \rightarrow \mathbb{R}$ is ==Lipschitz continuous on $X$== if $\exists M > 0$, $\forall x,y \in X$, $\lvert f(x)-f(y) \rvert \leq M \lvert x-y \rvert$.
>**Example.** Consider the function $f(x) = \sqrt{x}$ where $x \in [0, \infty)$. It is left to the reader to verify that the function is continuous at all points in its domain. However, this function is not Lipschitz continuous. Assume for contradiction that this function is Lipschitz continuous. This means that $\exists M > 0$, $\forall x,y \in [0,\infty)$, $\lvert f(x)-f(y) \rvert = \lvert \sqrt{x}-\sqrt{y} \rvert \leq M \lvert x-y \rvert$. Set $x = \frac{1}{n}$ for some $n \in \mathbb{N}$ and $y=0$. Then, $\exists M > 0$, $\forall n \in \mathbb{N}$, $$\left\lvert \frac{1}{\sqrt{n}} \right\rvert \leq M \left\lvert \frac{1}{n} \right\rvert \Leftrightarrow n \leq M^{2},$$ which fails for every $n > M^{2}$, a contradiction.
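The failure of Lipschitz continuity is the blow-up of the difference quotient near $0$: with $y=0$, the ratio $\lvert\sqrt{x}-\sqrt{0}\rvert/\lvert x-0\rvert = 1/\sqrt{x}$ eventually exceeds any candidate constant $M$. A quick numerical look:
```python
import math

# The quotient |sqrt(x) - sqrt(0)| / |x - 0| = 1/sqrt(x) is unbounded near 0,
# so no single M can serve as a Lipschitz constant on [0, infinity).
for n in (1, 100, 10_000, 1_000_000):
    x = 1.0 / n
    print(f"x = 1/{n}: quotient = {math.sqrt(x) / x:.1f}")   # equals sqrt(n)
```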
<span style="color: mediumslateblue;">**Theorem 2.16.**</span> Let $X \subseteq \mathbb{R}$. If a function $f: X \rightarrow \mathbb{R}$ is Lipschitz continuous on $X$, then it is continuous at all points in $X$.
*Proof*. Let $M$ be the constant from Definition 2.13. Fix an arbitrary $x_{0} \in X$ and an arbitrary $\varepsilon > 0$. We show that $f$ is continuous at $x_{0}$. Let $\delta_{\varepsilon} = \frac{\varepsilon}{M}$ and consider $x \in X$ such that $\lvert x - x_{0} \rvert < \delta_{\varepsilon}$. Since $f$ is Lipschitz continuous, we have $$\lvert f(x) - f(x_{0}) \rvert \leq M\lvert x - x_{0} \rvert < M\delta_{\varepsilon} = \varepsilon,$$ as required. $\blacksquare$
### Limit of a Function
<span style="color: lime;">**Definition 2.14.**</span> (Finite limits) Let $X \subseteq \mathbb{R}$. Consider a function $f: X \rightarrow \mathbb{R}$ and an element $x_{0} \in X$. A number $L \in \mathbb{R}$ is called ==the limit of the function $f$ at $x_0$==, denoted by $\lim_{x \rightarrow x_0} f(x)=L$, if $\forall \varepsilon>0, \exists \delta>0, \forall x \in X \setminus \left\{x_0\right\},\left[\left|x-x_0\right|<\delta \Rightarrow|f(x)-L|<\varepsilon\right]$.
<span style="color: lime;">**Definition 2.15.**</span> (Finite one-sided limits) Let $X \subseteq \mathbb{R}$. Consider a function $f: X \to \mathbb{R}$ and an element $x_{0} \in X$. A number $L \in \mathbb{R}$ is called ==the left-sided limit of the function $f$ at $x_{0}$==, denoted by $\lim_{x \rightarrow x_{0}^{-}} f(x) = L$, if $\forall \varepsilon > 0$, $\exists \delta > 0$, $\forall x \in X \setminus \{x_{0}\}$, $\left[ 0 < (x_{0}-x) < \delta \Rightarrow \lvert f(x)-L \rvert < \varepsilon \right]$. A number $L \in \mathbb{R}$ is called ==the right-sided limit of the function $f$ at $x_{0}$==, denoted by $\lim_{x \rightarrow x_{0}^{+}} f(x) = L$, if $\forall \varepsilon > 0$, $\exists \delta > 0$, $\forall x \in X \setminus \{x_{0}\}$, $\left[ 0 < (x-x_{0}) < \delta \Rightarrow \lvert f(x)-L \rvert < \varepsilon \right]$.
<span style="color: mediumslateblue;">**Theorem 2.17.**</span> Let $X \subseteq \mathbb{R}$. A function $f: X \to \mathbb{R}$ is continuous at $x_{0} \in X$ if and only if $\lim_{x \rightarrow x_{0}^{-}} f(x) = \lim_{x \rightarrow x_{0}^{+}} f(x) = \lim_{x \rightarrow x_{0}} f(x) = f(x_{0})$.
*Proof*. Similar to the proof of Theorem 2.14.
The finite limit of a function can equivalently be characterized using convergent sequences: $\lim_{x \rightarrow x_0} f(x)=L$ if and only if for all sequences $\{s_{n}\} \subseteq X \setminus \{x_{0}\}$ converging to $x_{0}$, the sequence $\{f(s_{n})\}$ converges to $L$; the proof mirrors that of Theorem 2.14.
<span style="color: mediumslateblue;">**Theorem 2.18.**</span> Let $X \subseteq \mathbb{R}$ and consider functions $f: X \to \mathbb{R}$ and $g: X \to \mathbb{R}$. Fix a point $x_{0} \in X$ and suppose the limits $\lim_{x \to x_{0}} f(x)$ and $\lim_{x \to x_{0}} g(x)$ exist and are finite. Then, the following hold:
(i) $\lim_{x \to x_{0}} (f+g)(x) = \lim_{x \to x_{0}} f(x)+\lim_{x \to x_{0}} g(x)$;
(ii) $\lim_{x \to x_{0}} (fg)(x) = \lim_{x \to x_{0}} f(x) \cdot\lim_{x \to x_{0}} g(x)$;
(iii) Suppose $\forall x \in X$, $g(x) \neq 0$, and $\lim_{x \to x_{0}} g(x) \neq 0$. Then, $\lim_{x \to x_{0}} \left(\frac{f}{g}\right)(x) = \frac{\lim_{x \to x_{0}} f(x)}{\lim_{x \to x_{0}} g(x)}$.
*Proof*. The result follows when using the sequential definition of limits and from Theorem 2.7. $\blacksquare$
<span style="color: lime;">**Definition 2.16.**</span> (Infinite limits) Let $X \subseteq \mathbb{R}$. Consider a function $f: X \rightarrow \mathbb{R}$ and an element $x_0 \in \mathbb{R}$. The ==function $f$ tends to $+\infty$ at $x_0$==, denoted by $\lim_{x \rightarrow x_0} f(x)=+\infty$ if $\forall M>0, \exists \delta>0, \forall x \in X \setminus \left\{x_0\right\},\left[\left|x-x_0\right|<\delta \Rightarrow f(x) \geq M\right]$. The ==function $f$ tends to $-\infty$ at $x_0$==, denoted by $\lim_{x \rightarrow x_0} f(x)=-\infty$ if $\forall M<0, \exists \delta>0, \forall x \in X \setminus \left\{x_0\right\},\left[\left|x-x_0\right|<\delta \Rightarrow f(x) \leq M\right].$
<span style="color: lime;">**Definition 2.17.**</span> (Infinite one-sided limits) Let $X \subseteq \mathbb{R}$. Consider a function $f: X \rightarrow \mathbb{R}$ and an element $x_0 \in \mathbb{R}$. The ==function $f$ approaches $+\infty$ at $x_0$ from the right==, denoted by $\lim_{x \rightarrow x_{0}^{+}} f(x)=+\infty$ if $\forall M>0, \exists \delta>0, \forall x \in X \setminus \left\{x_0\right\},\left[0 < (x-x_0) <\delta \Rightarrow f(x) \geq M\right]$. The ==function $f$ approaches $+\infty$ at $x_{0}$ from the left==, denoted by $\lim_{x \rightarrow x_{0}^{-}} f(x)=+\infty$ if $\forall M>0, \exists \delta>0, \forall x \in X \setminus \left\{x_{0}\right\},\left[0 < (x_{0}-x) < \delta \Rightarrow f(x) \geq M\right]$. One can define one-sided limits in a similar fashion when the function approaches $-\infty$ at $x_{0}$.
### Extreme and Intermediate Value Theorems
<span style="color: mediumslateblue;">**Theorem 2.19.**</span> (Extreme Value Theorem) Let $a,b \in \mathbb{R}$ with $a<b$ and recall that $\left[a,b\right] = \left\{x \in \mathbb{R} \mid a \leq x \leq b \right\}$. If $f: \left[a,b\right] \to \mathbb{R}$ is continuous then $\exists \underline{x}, \bar{x} \in \left[a,b\right]$, $\forall x \in \left[a,b\right]$, $f(\underline{x}) \leq f(x) \leq f(\bar{x})$.
*Proof*. Let $\alpha = \inf\left\{f(x) \mid x \in \left[a,b\right]\right\}$. (This infimum exists: if $\mbox{Im}(f)$ were unbounded below, we could pick $s_{n} \in \left[a,b\right]$ with $f(s_{n}) < -n$ for each $n$; by Theorem 2.8, some subsequence $\{s_{n_{k}}\}$ converges to a point $s \in \left[a,b\right]$, and the sequential definition of continuity would force $\{f(s_{n_{k}})\}$ to converge to $f(s)$, contradicting $f(s_{n_{k}}) < -n_{k}$.) We first construct a sequence $\{s_{n}\} \subseteq \left[a,b\right]$ such that $\{f(s_{n})\}$ converges to $\alpha$. By the definition of infimum, for each $n \in \mathbb{N}$, $\exists s_{n} \in \left[a,b\right]$, $\alpha \leq f(s_{n}) \leq \alpha+\frac{1}{n}$. By construction, the sequence $\{f(s_{n})\}$ converges to $\alpha$. Next, note that the sequence $\{s_{n}\}$ is bounded and therefore, by Theorem 2.8, has a convergent subsequence $\{s_{n_{k}}\}$. Define $\underline{x}$ as the limit of $\{s_{n_{k}}\}$ and note that, due to Corollary 2.3, $\underline{x} \in \left[a,b\right]$. Since $f$ is continuous, $f(\underline{x}) = \lim_{k \to \infty} f(s_{n_{k}}) = \alpha$. Using similar arguments, one can establish that there exists $\bar{x} \in \left[a,b\right]$ such that $f(\bar{x}) = \beta = \sup\left\{f(x) \mid x \in \left[a,b\right]\right\}$, completing the proof of the theorem. $\blacksquare$
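Numerically, the Extreme Value Theorem can be illustrated by sampling a continuous function on a fine grid; the grid only locates approximate extremizers, while the theorem guarantees exact ones. The function and grid size below are illustrative assumptions:

```python
import math

# A numerical illustration (not a proof) of the Extreme Value Theorem.
f = lambda x: x * math.sin(5.0 * x)   # an illustrative continuous function
a, b, n = 0.0, 2.0, 20001
xs = [a + i * (b - a) / (n - 1) for i in range(n)]
x_lo = min(xs, key=f)   # approximate minimizer (the theorem's underline-x)
x_hi = max(xs, key=f)   # approximate maximizer (the theorem's bar-x)
print(f"f({x_lo:.4f}) = {f(x_lo):.4f} <= f(x) <= f({x_hi:.4f}) = {f(x_hi):.4f}")
```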
<span style="color: mediumslateblue;">**Theorem 2.20.**</span> (Intermediate Value Theorem) Let $a,b \in \mathbb{R}$ with $a<b$ and consider a continuous function $f: \left[a,b\right] \to \mathbb{R}$ such that for some $\tau \in \mathbb{R}$, $f(a) < \tau < f(b)$. Then, there exists $x^{*} \in \left[a,b\right]$ such that $f(x^{*})=\tau$.
*Proof*. Let $X=\{x \in \left[a,b\right] \mid f(x) < \tau\}$. Since $a \in X$, $X \neq \emptyset$, and since $f(b) > \tau$, $b \notin X$. Let $x^{*} = \sup X$. As in the proof of Theorem 2.10, there is a sequence $\{s_{n}\} \subseteq X$ that converges to $x^{*}$. Since $f$ is continuous, $\{f(s_{n})\}$ converges to $f(x^{*})$; as $f(s_{n}) < \tau$ for every $n$, this implies $f(x^{*}) \leq \tau$. We claim that $f(x^{*}) = \tau$. Suppose not, i.e., $f(x^{*}) < \tau$; note that then $x^{*} < b$, since $f(b) > \tau$. Consider $\varepsilon > 0$ such that $\varepsilon < \tau-f(x^{*})$. Since $f$ is continuous, $\exists \delta > 0$, $\forall x \in \left[a,b\right]$, $\lvert x-x^{*} \rvert < \delta \Rightarrow \lvert f(x)-f(x^{*}) \rvert < \varepsilon$. Hence, any $x \in \left(x^{*}, \min\{x^{*}+\delta, b\}\right)$ satisfies $f(x) < f(x^{*})+\varepsilon < \tau$, i.e., $x \in X$ with $x > x^{*}$, a contradiction to the fact that $x^{*} = \sup X$. $\blacksquare$
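The Intermediate Value Theorem is exactly what justifies the bisection method for solving $f(x)=\tau$. Below is a minimal Python sketch; the function, target $\tau$, and tolerance are illustrative choices, and each halving step is licensed by the theorem applied to the remaining subinterval:

```python
# Bisection search, justified by the IVT. The loop maintains the invariant
# f(lo) < tau and f(hi) >= tau, so a solution stays trapped in [lo, hi].
def bisect(f, a, b, tau, tol=1e-10):
    """Approximate x* in [a, b] with f(x*) = tau, assuming f(a) < tau < f(b)."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < tau:
            lo = mid   # the IVT still applies on [mid, hi]
        else:
            hi = mid   # the IVT still applies on [lo, mid]
    return (lo + hi) / 2.0

f = lambda x: x ** 3 + x   # continuous (and increasing) on [0, 2]
print(bisect(f, 0.0, 2.0, tau=3.0))   # ~1.2134, where x^3 + x = 3
```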
## 2.4. Differentiation
<span style="color: lime;">**Definition 2.18.**</span> Let $a,b \in \mathbb{R}$ such that $a<b$. A function $f: \left[a,b\right] \rightarrow \mathbb{R}$ is ==differentiable at $x \in \left(a,b\right)$== if the following limit $$f'(x) = \lim_{t \rightarrow 0} \dfrac{f(x+t)-f(x)}{t}$$ exists. A function $f: \left[a,b\right] \rightarrow \mathbb{R}$ is ==differentiable at $a$== if the following limit $$f'(a) = \lim_{t \rightarrow 0^{+}} \dfrac{f(a+t)-f(a)}{t}$$ exists. A function $f: \left[a,b\right] \rightarrow \mathbb{R}$ is ==differentiable at $b$== if the following limit $$f'(b) = \lim_{t \rightarrow 0^{-}} \dfrac{f(b+t)-f(b)}{t}$$ exists.
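The three limits in Definition 2.18 can be approximated by difference quotients. A small sketch, with $f(x)=x^{2}$ on $[0,1]$ as an illustrative choice:

```python
# Interior and one-sided difference quotients per Definition 2.18, for the
# illustrative choice f(x) = x^2 on [a, b] = [0, 1].
f = lambda x: x * x
a, b = 0.0, 1.0
for t in (1e-2, 1e-4, 1e-6):
    interior = (f(0.5 + t) - f(0.5)) / t       # -> f'(0.5) = 1
    right_a  = (f(a + t) - f(a)) / t           # t -> 0+ at a: -> f'(a) = 0
    left_b   = (f(b - t) - f(b)) / (-t)        # increment -t < 0 at b: -> f'(b) = 2
    print(f"t={t:.0e}  {interior:.6f}  {right_a:.6f}  {left_b:.6f}")
```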
<span style="color: mediumslateblue;">**Theorem 2.21.**</span> Let $a,b \in \mathbb{R}$ such that $a<b$ and consider a function $f: \left[a,b\right] \rightarrow \mathbb{R}$ which is differentiable at $x \in \left[a,b\right]$. Then, $f$ is continuous at $x$.
*Proof*. We only show the claim when $x \in(a, b)$; similar arguments work when $x=a$ or $x=b$. Pick an (arbitrary) $x \in(a, b)$. We need to show that $f$ is continuous at $x$, i.e., $\forall \varepsilon > 0$, $\exists \delta > 0$, $$|y-x| < \delta \Rightarrow |f(y)-f(x)| < \varepsilon.$$ Equivalently, we must show that $$y \rightarrow x \Rightarrow (f(y)-f(x)) \rightarrow 0.$$ In other words, we establish that $$\lim_{y \rightarrow x}\lvert f(y)-f(x) \rvert =0.$$
Note that, for $y \neq x$,
\begin{align*}
\lvert f(y)-f(x) \rvert &= \frac{\lvert f(y)-f(x) \rvert}{\lvert y-x \rvert} \cdot \lvert y-x \rvert.
\end{align*}
Substituting $t = (y-x)$, we get
\begin{align*}
\lim_{y \rightarrow x}\lvert f(y)-f(x) \rvert &= \lim_{t \rightarrow 0}\frac{\lvert f(x+t)-f(x) \rvert}{\lvert t \rvert} \cdot \lvert t \rvert \\
&= \lim_{t \rightarrow 0}\left\lvert \frac{f(x+t)-f(x)}{t} \right\rvert \cdot \lvert t \rvert \\
&= \lvert f'(x) \rvert \cdot 0 \\
&= 0,
\end{align*}
where the third equality uses the existence of $f'(x)$ together with Theorem 2.18 (ii), as required. $\blacksquare$
<span style="color: mediumslateblue;">**Theorem 2.22.**</span> Let $a,b \in \mathbb{R}$ such that $a<b$ and consider functions $f,g: \left[a,b\right] \rightarrow \mathbb{R}$ which are differentiable at $x \in \left[a,b\right]$. Then, the following hold:
(i) $f+g$ is differentiable at $x$ and $(f+g)'(x) = f'(x)+g'(x)$.
(ii) $f\cdot g$ is differentiable at $x$ and $(f \cdot g)'(x) = f'(x) \cdot g(x) + g'(x) \cdot f(x)$.
(iii) when $g(y) \neq 0$ for all $y \in \left[a,b\right]$, $\frac{f}{g}$ is differentiable at $x$ and $$\left(\frac{f}{g}\right)'(x) = \dfrac{f'(x) \cdot g(x) - g'(x) \cdot f(x)}{g(x)^{2}}.$$
*Proof*. (i) Follows from Theorem 2.18 (i) applied to the difference quotients.
(ii) We assume $x \in(a, b)$; the endpoint cases follow from similar arguments. Put $h=f g$ and note that $h(x+t)-h(x)=[f(x+t)- f(x)] g(x+t)+f(x)[g(x+t)-g(x)]$. Dividing by $t$ gives
$$
\frac{h(x+t)-h(x)}{t}=\frac{f(x+t)-f(x)}{t} g(x+t)+f(x) \frac{g(x+t)-g(x)}{t}
$$
Now let $t \rightarrow 0$ and use the fact that $g$ is continuous at $x$ (by Theorem 2.21, $\displaystyle\lim_{t \to 0}g(x+t) = g(x)$) to obtain part (ii).
(iii) Let $h=\frac{f}{g}$. Observe that
$$\frac{h(x+t)-h(x)}{t} =\frac{1}{g(x+t) g(x)}
\cdot\left[\frac{f(x+t)-f(x)}{t} g(x)-f(x) \frac{g(x+t)-g(x)}{t}\right]$$
Letting $t \rightarrow 0$, and again using the continuity of $g$ at $x$ so that $g(x+t) \rightarrow g(x) \neq 0$, produces the result. $\blacksquare$
We also state the Chain Rule of differentiation without a proof.
<span style="color: mediumslateblue;">**Theorem 2.23.**</span> (Chain Rule). Let $f:(a, b) \rightarrow \mathbb{R}$ be differentiable at a point $x$ in $(a, b), f((a, b)) \subseteq(c, d)$, and let $g:(c, d) \rightarrow \mathbb{R}$ be a function that is differentiable at $f(x)$. If $h:(a, b) \rightarrow \mathbb{R}$ is defined by $h=g \circ f$, then $h$ is differentiable at $x$ and
$$
h^{\prime}(x)=g^{\prime}(f(x)) f^{\prime}(x)
$$
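Both the product rule and the Chain Rule can be checked numerically against difference quotients. In the sketch below, the functions, the sample point, and the step size are illustrative assumptions:

```python
import math

# A numerical check (not a proof) of Theorem 2.22 (ii) and Theorem 2.23.
def dq(h, x, t=1e-6):
    """Central difference quotient approximating h'(x)."""
    return (h(x + t) - h(x - t)) / (2.0 * t)

f, fp = math.sin, math.cos                      # f and its known derivative
g, gp = (lambda y: y * y), (lambda y: 2.0 * y)  # g and its known derivative

x = 0.7
prod = lambda y: f(y) * g(y)
comp = lambda y: g(f(y))                        # h = g o f
print(dq(prod, x), fp(x) * g(x) + gp(x) * f(x))   # product rule
print(dq(comp, x), gp(f(x)) * fp(x))              # Chain Rule
```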
## 2.5. Integration of Functions on the Real Line
In this section, we give a mathematical foundation for the theory of integration that the reader began in calculus.
### The Riemann Integral
Throughout this section, we work with a closed and bounded interval, $[a, b]$. A partition of $[a, b]$ is a finite, ordered subset of the form $P=\left\{a=x_{0}<x_{1}<\cdots<x_{n}=b\right\}$. We say that a partition $Q=\left\{a=y_{0}<y_{1}<\cdots<y_{n'}=b\right\}$ of $[a,b]$ is ==finer than the partition $P=\left\{a=x_{0}<x_{1}<\cdots<x_{n}=b\right\}$ of $[a,b]$== (or is a ==refinement== of $P$) if for all $1 \leq i \leq n'$, there exists $1 \leq j \leq n$ such that $[y_{i-1},y_{i}] \subseteq [x_{j-1},x_{j}]$. In other words, $Q$ adds points to $P$, and each subinterval of $[a, b]$ determined by two consecutive elements of $P$ is the union of one or more of the intervals determined by the elements of $Q$. For example, $P=\{0<0.2<0.4<0.7<1\}$ is a partition of $[0,1]$. Both $\{0<0.1<0.2<0.4<0.6<0.7<1\}$ and $\{0<0.2<0.4<0.6<0.7<0.9<1\}$ are refinements of $P$ (every partition is a refinement of itself). An observation that will be used often as we proceed is that if $P$ and $Q$ are two partitions of $[a, b]$, the partition $P \cup Q$ is a refinement of both $P$ and $Q$.
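These notions are easy to experiment with. A small Python sketch (the helper names are hypothetical conveniences) represents a partition as a sorted list of points and checks refinements:

```python
# Partitions of [a, b] as sorted lists of points.
def is_refinement(Q, P):
    """Q refines P exactly when every point of P also belongs to Q."""
    return set(P) <= set(Q)

def common_refinement(P, Q):
    """The partition P U Q, which refines both P and Q."""
    return sorted(set(P) | set(Q))

P = [0, 0.2, 0.4, 0.7, 1]
Q = [0, 0.2, 0.4, 0.6, 0.7, 0.9, 1]
R = common_refinement(P, Q)
print(is_refinement(Q, P), is_refinement(R, P), is_refinement(R, Q))  # True True True
```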
In most textbooks, what is introduced as the Riemann Integral is in fact the Darboux integral. Riemann's original idea was more intricate, and for pedagogical interest, we explore it first. Fix a bounded function $f: [a,b] \to \mathbb{R}$. Along with a partition $P$ of $[a,b]$, Riemann also considered ==representative values==, one in each subinterval of the partition, collected as $P^{*} = \left\{x_{1}^{*}, x_{2}^{*}, \ldots, x_{n}^{*}\right\}$ where $x_{i-1} \leq x_{i}^{*} \leq x_{i}$ for each $i$. The Riemann Sum of $f$ with respect to $P$ (and $P^{*}$, which we suppress in the notation), $R(f,P)$, is given by $$R(f,P) = \sum_{i=1}^{n} f(x_{i}^{*})(x_{i}-x_{i-1}),$$ and the ==mesh== of $P$ is $\lVert P \rVert=\max\{(x_{i}-x_{i-1}) \mid 1 \leq i \leq n \}$. Riemann then defines the integral of $f$ over $[a,b]$, $\int_{a}^{b}f$, as $$\int_{a}^{b}f = \lim_{\lVert P \rVert \to 0}R(f,P),$$ meaning that $\forall \varepsilon > 0$, $\exists \delta > 0$ such that every partition $P$ with $\lVert P \rVert < \delta$, together with any choice of representative values, satisfies $\lvert R(f,P) - \int_{a}^{b}f \rvert < \varepsilon$.
Let us consider an example to further enhance our understanding.
> Example. Let $0 \leq a < b$ and let $f: [a,b] \to \mathbb{R}$ such that $\forall x \in [a,b]$, $f(x)=x$. We know from elementary calculus that $\int_{a}^{b} f(x)\,dx = \frac{(b^{2}-a^{2})}{2}$. Given a partition $P$ of $[a,b]$ with representative values $P^{*}$, $$R(f,P) = \sum_{i=1}^{n}f(x_{i}^{*})(x_{i}-x_{i-1}) = \sum_{i=1}^{n} x_{i}^{*}(x_{i}-x_{i-1}).$$ We show that $$\lim_{\lVert P \rVert \to 0}R(f,P) = \frac{(b^{2}-a^{2})}{2}.$$ In other words, we show that $$\left\lvert \sum_{i=1}^{n} x_{i}^{*}(x_{i}-x_{i-1}) - \frac{(b^{2}-a^{2})}{2} \right\rvert \to 0$$ whenever $\lVert P \rVert \to 0$. Indeed,
\begin{align*}
\left\lvert \sum_{i=1}^{n} x_{i}^{*}(x_{i}-x_{i-1})\right. &- \left.\frac{(b^{2}-a^{2})}{2} \right\rvert \\
&= \frac{1}{2}\left\lvert \sum_{i=1}^{n} 2x_{i}^{*}(x_{i}-x_{i-1}) - \sum_{i=1}^{n}(x_{i}^{2}-x_{i-1}^{2}) \right\rvert \\
&= \frac{1}{2}\left\lvert \sum_{i=1}^{n} \left(2x_{i}^{*}-x_{i}-x_{i-1}\right)(x_{i}-x_{i-1}) \right\rvert \\
&\leq \frac{1}{2}\sum_{i=1}^{n}\left(\lvert x_{i}^{*}-x_{i} \rvert +\lvert x_{i}^{*}-x_{i-1} \rvert\right)(x_{i}-x_{i-1}) \\
&= \frac{1}{2}\sum_{i=1}^{n}(x_{i}-x_{i-1})^{2} \\
&\leq \frac{\lVert P \rVert}{2}\sum_{i=1}^{n}(x_{i}-x_{i-1}) \\
&\leq \lVert P \rVert (b-a),
\end{align*}
where we use the telescoping identity $b^{2}-a^{2} = \sum_{i=1}^{n}(x_{i}^{2}-x_{i-1}^{2})$, the factorization $x_{i}^{2}-x_{i-1}^{2}=(x_{i}+x_{i-1})(x_{i}-x_{i-1})$, and the fact that $x_{i-1} \leq x_{i}^{*} \leq x_{i}$ implies $\lvert x_{i}^{*}-x_{i} \rvert +\lvert x_{i}^{*}-x_{i-1} \rvert = x_{i}-x_{i-1}$.
In the above, choose $\lVert P \rVert < \frac{\varepsilon}{(b-a)}$ where $\varepsilon > 0$ is arbitrary and the required result follows.
In the above example, note that we used our knowledge of elementary calculus to evaluate an integral and then showed that the formula we know is indeed the limit of the Riemann sum.
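The computation above can also be replayed numerically. The sketch below uses uniform partitions of $[1,3]$ and left endpoints as representative values (both illustrative choices) and shows the error shrinking with $\lVert P \rVert$:

```python
# Riemann sums for f(x) = x on [1, 3] approach (b^2 - a^2)/2 as ||P|| -> 0.
f = lambda x: x
a, b = 1.0, 3.0
exact = (b * b - a * a) / 2.0
for n in (10, 100, 1000):
    xs = [a + i * (b - a) / n for i in range(n + 1)]            # partition P
    R = sum(f(xs[i]) * (xs[i + 1] - xs[i]) for i in range(n))   # left endpoints as x_i^*
    print(f"n={n:5d}  ||P|| = {(b - a) / n:.4f}  |R - exact| = {abs(R - exact):.2e}")
# The observed error stays below ||P||(b - a), matching the estimate above.
```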
Darboux presented a more elegant idea. Let $f:[a, b] \rightarrow \mathbb{R}$ be any bounded function and let $P=\left\{a=x_{0}<x_{1}<\cdots<x_{n}=b\right\}$ be a partition of $[a,b]$; then for $1 \leq j \leq n$ define $$
\begin{aligned}
M_{j}^{P} & =\sup \left\{f(x): x_{j-1} \leq x \leq x_j\right\} \\
m_{j}^{P} & =\inf \left\{f(x): x_{j-1} \leq x \leq x_j\right\}
\end{aligned}
$$
and let $$U(f,P)=\sum_{j=1}^{n} M_{j}^{P}\left(x_{j}-x_{j-1}\right) \text { and } L(f, P)=\sum_{j=1}^{n} m_{j}^{P}\left(x_{j}-x_{j-1}\right).$$
The term $U(f,P)$ is called the ==upper sum of $f$ with respect to the partition $P$== and $L(f,P)$ is called the ==lower sum of $f$ with respect to the partition $P$==.
> Example. Assume $f:[a, b] \rightarrow \mathbb{R}$ is a constant function: $f(x)=c$ for all $x$ in $[a, b]$. For any partition $P$ and with the notation as above, $m_{j}^{P}=c=M_{j}^{P}$ for each $j$. Thus for any partition $P, L(f,P)=c(b-a)=U(f,P)$.
> Example. Now define $f:[a, b] \rightarrow \mathbb{R}$ by $$
f(x)= \begin{cases}1 & \text { if } x \text { is a rational number } \\ 0 & \text { if } x \text { is an irrational number. }\end{cases}$$ Since both the rational and the irrational numbers are dense in $\mathbb{R}$, for any partition $P=\left\{a=x_0<x_1<\cdots<x_n=b\right\}$ and $1 \leq j \leq n$, $m_j^{P}=0$ and $M_j^{P}=1$. This implies that $L(f, P)=0$ and $U(f, P)=b-a$.
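For monotone functions, the infimum and supremum over each subinterval sit at its endpoints, so upper and lower sums can be computed exactly. A short sketch with the illustrative choice $f(x)=x^{2}$ on $[0,1]$:

```python
# Exact Darboux sums for an increasing f: on [x_{j-1}, x_j] the inf is
# f(x_{j-1}) and the sup is f(x_j). Both sums pinch the familiar value 1/3.
f = lambda x: x * x
a, b = 0.0, 1.0
for n in (10, 100, 1000):
    xs = [a + j * (b - a) / n for j in range(n + 1)]
    L = sum(f(xs[j - 1]) * (xs[j] - xs[j - 1]) for j in range(1, n + 1))  # m_j^P terms
    U = sum(f(xs[j]) * (xs[j] - xs[j - 1]) for j in range(1, n + 1))      # M_j^P terms
    print(f"n={n:5d}  L = {L:.6f}  U = {U:.6f}  U - L = {U - L:.2e}")
```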
Of course, we'll see many other examples, but not until we develop a little more of the theory. Before we start we might call attention to Exercise 1, which can often be used to derive a result for the lower sum from a result for the upper sum.
<span style="color: mediumslateblue;">**Theorem 2.24.**</span> If $f$ is a bounded function on $[a, b]$, i.e., there exist $m, M \in \mathbb{R}$ such that $\forall x \in [a, b]$, $m \leq f(x) \leq M$, and $P$ and $Q$ are partitions of $[a, b]$, then the following hold:
(i) If $Q$ is a refinement of $P$, then $$L(f, P) \leq L(f, Q) \leq U(f, Q) \leq U(f, P).$$
(ii) If $Q$ is a refinement of $P$, then $$0 \leq U(f,Q)-L(f,Q) \leq U(f,P)-L(f,P).$$
(iii) $m(b-a) \leq L(f,P) \leq U(f,Q) \leq M(b-a)$.
*Proof*. (i) Let $P=\left\{a=x_{0}<x_{1}<\cdots<x_{n}=b\right\}$ be a partition of $[a,b]$ and let the partition $Q=\left\{a=y_{0}<y_{1}<\cdots<y_{n'}=b\right\}$ of $[a,b]$ be a refinement of $P$. Clearly, we can obtain $Q$ from $P$ by successively adding points and hence, it is sufficient to prove the result when $Q$ is obtained by adding one point to $P$. Without loss of generality, assume that the added point $x^{*}$ lies in the first subinterval, i.e., $$Q=\left\{a=y_{0}=x_{0}<y_{1}=x^{*}<y_{2}=x_{1}<\cdots<y_{n+1}=x_{n}=b \right\}.$$
Let $$m_{1}^{Q}=\inf \left\{f(x) \mid y_{0}=x_{0} \leq x \leq x^{*}=y_{1}\right\},$$ and $$m_{2}^{Q}=\inf\left\{f(x) \mid y_{1}=x^{*} \leq x \leq y_{2}=x_{1}\right\}.$$ Note that $m_{1}^{P}=\min \left\{m_{1}^{Q}, m_{2}^{Q}\right\}$. Thus, $$m_{1}^{P}\left(x_{1}-x_{0}\right) \leq m_{1}^{Q}\left(x^*-x_0\right)+m_{2}^{Q}\left(x_1-x^*\right),$$ and since the remaining terms of the two lower sums coincide, $L(f, P) \leq L(f, Q)$. The proof that $U(f, Q) \leq U(f, P)$ follows from analogous arguments.
(ii) This is immediate from (i).
(iii) Let $P=\left\{a=x_{0}<x_{1}<\cdots<x_{n}=b\right\}$ be a partition of $[a,b]$. Since $\forall x \in [a, b]$, $m \leq f(x) \leq M$, $\forall j=1,\ldots,n$, $$m \leq m_{j}^{P} = \inf \left\{f(x): x_{j-1} \leq x \leq x_j\right\} \leq M_{j}^{P} = \sup \left\{f(x): x_{j-1} \leq x \leq x_j\right\} \leq M.$$ Summing over $j$, we get $$\begin{align*}m(b-a) &= \sum_{j=1}^{n}m\left(x_{j}-x_{j-1}\right) \\
&\leq \sum_{j=1}^{n}m_{j}^{P}\left(x_{j}-x_{j-1}\right) = L(f,P) \\
&\leq \sum_{j=1}^{n}M_{j}^{P}\left(x_{j}-x_{j-1}\right) = U(f,P) \\
&\leq \sum_{j=1}^{n}M\left(x_{j}-x_{j-1}\right) = M(b-a),
\end{align*}$$ as required.
Next, since $P \cup Q$ is a refinement of both $P$ and $Q$, part (i) of this theorem gives $$m(b-a) \leq L(f,P) \leq L(f,P \cup Q) \leq U(f,P \cup Q) \leq U(f,Q) \leq M(b-a),$$ as required. $\blacksquare$
Using Theorem 2.24 (iii), we know that $$\sup \{L(f,P) \mid P \text { is a partition of }[a, b]\} \leq \inf \{U(f,P) \mid P \text { is a partition of }[a, b]\}.$$ We can use this observation to define the Riemann integral of a bounded function $f:[a, b] \rightarrow \mathbb{R}$.
<span style="color: lime;">**Definition 2.19.**</span> A bounded function $f:[a, b] \rightarrow \mathbb{R}$ is ==Riemann integrable== if $$\sup \{L(f,P): P \text { is a partition of }[a, b]\} = \inf \{U(f,P): P \text { is a partition of }[a, b]\}.$$ Moreover, the Riemann integral of this function, denoted by $\int_{a}^{b} f =\int_{a}^{b} f(x)dx$ is given by $$\begin{aligned}
\int_a^b f & =\int_a^b f(x) d x \\
& =\sup \{L(f,P) \mid P \text { is a partition of }[a, b]\}\\
& =\inf \{U(f,P) \mid P \text { is a partition of }[a, b]\}
\end{aligned}$$ The set of all Riemann integrable functions on $[a, b]$ is denoted by $\mathcal{R}[a, b]$.
Unless otherwise mentioned, we will use the notation $\int_a^b f$ rather than $\int_{a}^{b} f(x) dx$ as seen in calculus. The use of the notation involving $d x$ will be limited to those occasions where there might be some confusion as to the variable of integration. This will be our practice partly because the $x$ is redundant, but mainly to emphasize that we are integrating a function. The notation for a function is $f$, while $f(x)$ is the value of the function at the point $x$ in its domain.
The next result gives a necessary and sufficient condition for integrability that is more convenient than the definition and that will usually be employed when we prove results about integrability.
<span style="color: mediumslateblue;">**Theorem 2.25.**</span> If $f$ is a bounded function on $[a,b]$, then $f$ is Riemann integrable if and only if $\forall \varepsilon>0$, there exists a partition $P$ of $[a,b]$ such that $U(f, P)-L(f, P) < \varepsilon$. Moreover, $\int_a^b f$ is the unique number such that $$L(f, Q) \leq \int_{a}^{b} f \leq U(f, Q)$$ for every refinement $Q$ of such a partition $P$.
*Proof*. (If part) Define $$
\begin{aligned}
L & =\sup \{L(f, P): P \text { is a partition of }[a, b]\} \\
U & =\inf \{U(f, Q): Q \text { is a partition of }[a, b]\}
\end{aligned}$$
Using Theorem 2.24 (iii), we have $L \leq U$. Assume that $\forall \varepsilon>0$, there exists a partition $P$ of $[a,b]$ with $U(f,P)-L(f,P)<\varepsilon$. By the definitions of $U$ and $L$, $0 \leq U-L \leq U(f,P)-L(f,P) < \varepsilon$. Since the choice of $\varepsilon$ was arbitrary, $U = L$, i.e., $f \in \mathcal{R}[a, b]$.
(Only if) Assume $f$ is Riemann integrable, so that $L = U = \int_a^b f$, and consider an arbitrary $\varepsilon>0$. Choose partitions $P_{1}, P_{2}$ of $[a,b]$ such that $0 \leq L-L\left(f,P_{1}\right)<\frac{\varepsilon}{2}$ and $0 \leq U\left(f, P_{2}\right)-U<\frac{\varepsilon}{2}$. If we put $P=P_{1} \cup P_{2}$, then $P$ is a refinement of both $P_{1}$ and $P_{2}$, so Theorem 2.24 (i) implies that $$U(f,P)-L(f,P) \leq U(f,P_{2})-L(f,P_{1}) < \frac{\varepsilon}{2}+\frac{\varepsilon}{2} = \varepsilon,$$ where we used $L=U$. If $Q$ is a refinement of $P$, then Theorem 2.24 (ii) implies that $$U(f, Q)-L(f, Q) \leq U(f, P)-L(f, P)<\varepsilon.$$ Since $\varepsilon$ was arbitrary, there can be only one number between $L(f, Q)$ and $U(f, Q)$ for every such refinement $Q$; since $L(f,Q) \leq L = \int_a^b f = U \leq U(f,Q)$ always holds, this unique number must be $\int_a^b f$. $\blacksquare$
The uniqueness part of the last theorem is there mainly for emphasis, since when a function is integrable there can be no other number between all the lower and upper sums. There is, however, some small benefit in using only partitions $Q$ that are refinements of $P$, as seen in Corollary 2.4 below.
<span style="color: mediumslateblue;">**Corollary 2.4.**</span> If $f$ is a bounded function on $[a, b]$, then $f$ is Riemann integrable if and only if there is a sequence of partitions $\left\{P_k\right\}$ of $[a, b]$ such that each $P_{k+1}$ is a refinement of $P_k$ and $U\left(f, P_k\right)-L\left(f, P_k\right) \rightarrow 0$ as $k \rightarrow \infty$. When this happens we have that
$$
\int_a^b f=\lim _{k \rightarrow \infty} U\left(f, P_k\right)=\lim _{k \rightarrow \infty} L\left(f, P_k\right)
$$
### The Fundamental Theorem of Calculus
We present a fairly general version of the Fundamental Theorem of Calculus.
<span style="color: mediumslateblue;">**Theorem 2.26.**</span> (Fundamental Theorem of Calculus) If $f$ is a bounded Riemann integrable function on $[a, b]$ and $F:[a, b] \rightarrow \mathbb{R}$ is defined by $$F(x)=\int_{a}^{x} f(t) dt$$
then $F$ is a continuous function. If $f$ is continuous at a point $c$ in $[a, b]$, then $F$ is differentiable at $c$ and $F^{\prime}(c)=f(c)$.
*Proof*. We assume that $c \in (a,b)$; the proof of the result when $c=a$ or $c=b$ follows from similar arguments. Since $f$ is bounded, say $\lvert f(x) \rvert \leq M$ for all $x$ in $[a, b]$; then for $a \leq x \leq y \leq b$,
$$
\begin{aligned}
\left\lvert F(y)-F(x) \right\rvert & =\left\lvert \int_{a}^{y} f-\int_{a}^{x} f\right\rvert \\
& =\left\lvert \int_{x}^{y} f \right\rvert \\
& \leq M\lvert y-x \rvert.
\end{aligned}
$$
Therefore, $F$ is Lipschitz continuous and, in particular, continuous.
Now assume $f$ is continuous at $c$, which we assume is an interior point of the interval. If $\varepsilon>0$, there is a $\delta>0$ such that $|f(x)-f(c)|<\varepsilon$ when $|x-c|<\delta$ and $x \in [a,b]$. Thus, when $0<|t|<\delta$ and $c+t \in [a,b]$,
$$
\begin{aligned}
\left|\frac{F(c+t)-F(c)}{t}-f(c)\right| & =\left|\left(\frac{1}{t} \int_c^{c+t} f\right)-f(c)\right| \\
& =\left|\frac{1}{t} \int_c^{c+t}[f(x)-f(c)] d x\right| \\
& \leq \frac{1}{|t|} \varepsilon|t| \\
& =\varepsilon.
\end{aligned}
$$
By definition, $F^{\prime}(c)$ exists and equals $f(c)$. When $c$ is one of the endpoints of the interval, the proof is similar and left to the reader (Exercise 1). $\blacksquare$
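As a closing numerical sketch of the theorem, one can build $F$ from Riemann sums and compare a difference quotient of $F$ against $f$; the integrand, base point, and test points below are illustrative assumptions:

```python
import math

# Approximate F(x) = integral of f from a to x by midpoint Riemann sums,
# then check that a difference quotient of F matches f.
f = math.cos
a = 0.0

def F(x, n=20000):
    """Midpoint-rule approximation of the integral of f from a to x."""
    h = (x - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

for c in (0.5, 1.0, 1.5):
    t = 1e-4
    dq = (F(c + t) - F(c)) / t      # forward difference quotient of F at c
    print(f"c = {c}  F'(c) ~ {dq:.5f}  f(c) = {f(c):.5f}")
```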