
Measure and Integration: First Steps

We are all made of the same stuff, Dewey.

James K. Peterson
Department of Biological Sciences
and
Department of Mathematical Sciences
Clemson University
email: [email protected]

© James K. Peterson, Version January 11, 2009
Gneural Gnome Press

April 20, 2009


Dedication

I dedicate this work to the students who have learned this material in its various preliminary versions and to my family who have listened to my ideas in the living room and over dinner for many years. I hope that this text helps inspire all my students to consider the study of abstraction in mathematics as an indispensable tool in their own work.


Abstract

This book introduces graduate students in mathematics to concepts from measure theory and also to the abstract way of looking at the world. We feel that this is a most important skill to have when your life’s work will involve quantitative modeling to gain insight into the real world.


Acknowledgements

Since Jim regained the ability to run in 2009, he is less grumpy than usual. Running on the local trails in the forest helps with the stress levels and counteracts Jim’s most important philosophical view: “There is always room for a donut!”

I wish to thank all my students for helping me by listening to what I say in my lectures, finding my typographical errors and my other mistakes. I am always hopeful that my efforts help my students to some extent and also impart some of my enthusiasm for the subject. Of course, the reality is that I have been damaging students for years by forcing them to learn these abstract things. This is why I tell people at parties I am a roofer or electrician. If I am identified as a mathematician, it could go badly given the terrible things I inflict on the students in my classes. Who knows whom they have told about my tortuous methods. Hence, anonymity is best.

Still, by writing these notes, I have gone public. Sigh. This could be bad. So before I am taken out with a very public hit, I, of course, want to thank my family for all their help in many small and big ways. It is amazing to me that I have been teaching this material since my children were in grade school and now my youngest is in her second year of college!

I would like to thank all the students who have used the various iterations of these notes as they have evolved from handwritten to the typed version here. There is still more to do, but we are getting closer!

I am very grateful in particular to the students of the Spring 2009 semester who took MTHSC 822 at Clemson University and helped me rewrite and rewrite these notes to find typographical errors, mistakes and poor wording. It is still a work in progress and all errors are ultimately my responsibility, but the text is much better due to their aid.


History

Based On:

Handwritten Notes For Measure and Integration

MTHSC 822

1995 - 2006

Spring 2009


Table Of Contents

Title Page
Dedication
Abstract
Acknowledgements
History
Table Of Contents

I Introductory Matter

1 Introduction
    1.1 The Analysis Courses
        1.1.1 Senior Level Analysis
        1.1.2 The Graduate Analysis Courses
        1.1.3 More Advanced Courses
    1.2 Teaching The Measure and Integration Course

II Classical Riemann Integration

2 Riemann Overview
    2.1 Continuity
    2.2 Differentiability
    2.3 Integration
        2.3.1 A Riemann Sum Example
        2.3.2 The Riemann Integral As A Limit
        2.3.3 The Fundamental Theorem Of Calculus
        2.3.4 The Cauchy Fundamental Theorem Of Calculus
        2.3.5 Applications
        2.3.6 Simple Substitution Techniques
    2.4 Handling Jumps
        2.4.1 Removable Discontinuity
        2.4.2 Jump Discontinuity
        2.4.3 Homework

3 Bounded Variation
    3.1 Partitions
        3.1.1 Homework
    3.2 Monotone
        3.2.1 Worked Out Example
        3.2.2 Homework
    3.3 Bounded Variation
        3.3.1 Homework
    3.4 Total Variation
    3.5 Continuous Also

4 Riemann Integration
    4.1 Definition
    4.2 Existence
    4.3 Properties
    4.4 Riemann Integrable?
    4.5 More Properties
    4.6 Fundamental Theorem
        4.6.1 Homework
    4.7 Substitution
    4.8 Same Integral?

5 Further Riemann Results
    5.1 Limit Interchange
    5.2 Riemann Integrable?
    5.3 Content Zero

III Riemann-Stieltjes Integrals

6 Riemann-Stieltjes
    6.1 Properties
    6.2 Step Integrators
    6.3 Monotone Integrators
    6.4 Equivalence Theorem
    6.5 Further Properties
    6.6 Bounded Variation Integrators

7 Further Riemann-Stieltjes
    7.1 Fundamental Theorem
    7.2 Existence
    7.3 Computations
    7.4 Homework

IV Abstract Measure Theory One

8 Measurability
    8.1 Examples
    8.2 Borel Sigma Algebra
        8.2.1 Homework
    8.3 Extended Borel Sigma Algebra
    8.4 Measurable Functions
        8.4.1 Examples
    8.5 Properties
    8.6 Extended Valued
    8.7 Extended Properties
    8.8 Continuous Compositions
        8.8.1 The Composition With Finite Measurable Functions
        8.8.2 The Approximation Of Non-negative Measurable Functions
        8.8.3 Continuous Functions of Extended Valued Measurable Functions
    8.9 Homework

9 Abstract Integration
    9.1 Properties
    9.2 Integration
    9.3 Equality a.e.
    9.4 Convergence Theorems
    9.5 Extended Integrands
    9.6 Summable Properties
    9.7 The DCT
    9.8 Homework
    9.9 Alternate Integration
        9.9.1 Homework

10 The Lp Spaces
    10.1 The General Lp spaces
    10.2 The World Of Counting Measure
    10.3 Essentially Bounded Functions
    10.4 The Hilbert Space L2
    10.5 Homework

V Constructing Measures

11 Building Measures
    11.1 Via Outer Measure
    11.2 Via Metric Outer Measure
    11.3 Building Outer Measure
    11.4 Examples
    11.5 Homework

12 Lebesgue Measure
    12.1 Outer Measure
    12.2 LOM Is MOM
    12.3 Approximation Results
        12.3.1 Approximating Measurable Sets
        12.3.2 Approximating Measurable Functions
    12.4 Non Measurable Sets
        12.4.1 Exercises
    12.5 Metric Spaces
        12.5.1 Exercises

13 Cantor Sets
    13.1 Generalized
    13.2 Representation
    13.3 Cantor Function
    13.4 Consequences

14 Lebesgue-Stieltjes Measure
    14.1 Lebesgue-Stieltjes
    14.2 Properties
    14.3 Homework

VI Abstract Measure Theory Two

15 Convergence Modes
    15.1 Extracting Subsequences
    15.2 Egoroff’s Theorem
    15.3 Vitali’s Theorem
    15.4 Summary
    15.5 Homework

16 Decomposing Measures
    16.1 Jordan Decomposition
    16.2 Hahn Decomposition
    16.3 Variation
    16.4 Absolute Continuity
    16.5 Radon-Nikodym
    16.6 Lebesgue Decomposition
    16.7 Homework

17 Connections To Riemann Integration

18 Differentiation
    18.1 Absolutely Continuous Functions
    18.2 LS and AC
    18.3 Bounded Variation Derivatives

VII Summing It All Up

19 Summing It All Up

VIII References

References

IX Detailed Indices

Index

X Glossary Of Terms

Glossary

XI Appendix: Undergraduate Analysis Examinations

A Advanced Calculus I
    A-1 Course Structure
    A-2 Study Guide
    A-3 Exams Version A
        A-3.1 Exam 1A
        A-3.2 Exam 2A
        A-3.3 Exam 3A
        A-3.4 Final A
    A-4 Exams Version B
        A-4.1 Exam 1B
        A-4.2 Exam 2B
        A-4.3 Exam 3B
        A-4.4 Final B
    A-5 Exams Version C
        A-5.1 Exam 1C
        A-5.2 Exam 2C
        A-5.3 Exam 3C
        A-5.4 Final C

B Advanced Calculus II
    B-1 MTHSC 454
    B-2 Course Structure
    B-3 Exams Version A
        B-3.1 Exam 1A
        B-3.2 Exam 2A
        B-3.3 Exam 3A
        B-3.4 Final A
    B-4 Exams Version B
        B-4.1 Exam 1B
        B-4.2 Exam 2B
        B-4.3 Exam 3B

XII Appendix: Linear Analysis Examinations

C Linear Analysis I
    C-1 Course Structure
    C-2 Exams Version A
        C-2.1 Exam 1A
        C-2.2 Exam 2A
        C-2.3 Exam 3A
    C-3 Exams Version B
        C-3.1 Exam 1B
        C-3.2 Exam 2B
        C-3.3 Final B

Part I

Introductory Matter


Chapter 1

Introduction

We believe that all students who are seriously interested in mathematics at the Master’s and Doctoral level should have a passion for analysis even if it is not the primary focus of their own research interests. So you should all understand that my own passion for the subject will shine through in the notes that follow! And, it goes without saying that we assume that you are all mature mathematically and eager and interested in the material! Now, the present course focuses on the topics of Measure and Integration from a very abstract point of view, but it is very helpful to place this course into its proper context. Also, for those of you who are preparing to take the qualifying examination in analysis, the overview below will help you see why all this material fits together into a very interesting web of ideas.

1.1 The Analysis Courses

In outline form, these courses would cover the following material using textbooks equivalent to the ones

listed below:

(A): Undergraduate Analysis, text Advanced Calculus: An Introduction to Analysis, by Watson Fulks.

Here these are MTHSC 453 and MTHSC 454.

(B): Introduction to Abstract Spaces, text Introductory Functional Analysis with Applications, by Erwin Kreyszig. Here this is MTHSC 821.

(C): Measure Theory and Abstract Integration, texts General Theory of Functions and Integration, by

Angus Taylor and Real Analysis, by Royden and, of course, the volume of notes you are currently

reading! Here this is MTHSC 822.

In addition, a nice book that organizes the many interesting examples and counterexamples in this area is good to have on your shelf. We recommend the text Counterexamples in Analysis by Gelbaum and Olmsted. There are thus essentially five courses required to teach you enough of the concepts of mathematical analysis to enable you to read technical literature (such as engineering, control, physics, mathematics, statistics and so forth) at the beginning research level. Here are some more details about these courses.

1.1.1 Senior Level Analysis

Typically, this is a full two semester sequence that discusses thoroughly what we would call the analysis

of functions of a real variable. Here, this is the sequence MTHSC 453-454. This two semester sequence

covers the following:

Advanced Calculus I: MTHSC 453: This course studies sequences and functions whose domain is simply the real line. There are, of course, many complicated ideas, but everything we do here involves things that act on real numbers to produce real numbers. If we call these things that act on other things OPERATORS, we see that this course is really about real–valued operators on real numbers. This course invests a lot of time in learning how to be precise with the notion of convergence of sequences of objects, that happen to be real numbers, to other numbers.

1. Basic Logic, Inequalities for Real Numbers, Functions

2. Sequences of Real Numbers, Convergence of Sequences

3. Subsequences and the Bolzano–Weierstrass Theorem

4. Cauchy Sequences

5. Continuity of Functions

6. Consequences of Continuity

7. Uniform Continuity

8. Differentiability of Functions

9. Consequences of Differentiability

10. Taylor Series Approximations

Advanced Calculus II: MTHSC 454: In this course, we rapidly become more abstract. First, we develop

carefully the concept of the Riemann Integral. We show that although differentiation is intellectually

quite a different type of limit process, it is intimately connected with the Riemann integral. Also,

for the first time, we begin to explore the idea that we could have sequences of objects other than

real numbers. We study carefully their convergence properties. We learn about two fundamental

concepts: pointwise and uniform convergence of sequences of objects called functions. We are

beginning to see the need to think about sets of objects, such as functions, and how to define the

notions of convergence and so forth in this setting.

1. The Riemann Integral

2. Sequences of Functions

3. Uniform Convergence of Sequences of Functions

4. Series of Functions
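
To make the difference between pointwise and uniform convergence concrete, here is a short LaTeX sketch of a standard example. The sequence $f_n(x) = x^n$ on $[0,1]$ is our own choice of illustration and is not taken from the course syllabus above.

```latex
% Illustrative example (our choice): f_n(x) = x^n on [0,1].
% The cases environment assumes the amsmath package is loaded.
\[
  f_n(x) = x^n, \qquad
  f(x) = \lim_{n \to \infty} f_n(x) =
  \begin{cases}
    0, & 0 \le x < 1, \\
    1, & x = 1.
  \end{cases}
\]
% Pointwise convergence: for each fixed x, the numbers f_n(x) converge to f(x).
% Uniform convergence would require the sup-distance to go to zero, but
\[
  \sup_{x \in [0,1]} | f_n(x) - f(x) |
  = \sup_{0 \le x < 1} x^n = 1 \quad \text{for every } n,
\]
% so f_n converges to f pointwise but not uniformly on [0,1].
```

Each $f_n$ is continuous while the limit $f$ is not, which is consistent with the fact that a uniform limit of continuous functions must be continuous.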



1.1.2 The Graduate Analysis Courses

There are three basic courses here. First, linear analysis (MTHSC 821), then measure and integration (MTHSC 822) and finally, functional analysis (MTHSC 927). MTHSC 821 is the core analysis course and all Master’s students at Clemson University must take it. Also, at Clemson University, MTHSC 821 and MTHSC 822 form the two courses which we test prospective Ph.D. students on as part of the analysis preliminary examination (see www.ces.clemson.edu/~petersj/prelims.html for details). The content of these courses must also fit within a web of other responsibilities. Many students are typically weak in abstraction coming in, so if we teach the material too fast, we lose them. Now if 20 students take MTHSC 821, usually 15 (or 75%) are already committed to an M.S. program which emphasizes Operations Research, Statistics, Algebra/Combinatorics or Computation in addition to applied Analysis. Hence, currently, there are only about 5 students in MTHSC 821 who might be interested in an M.S. specialization in analysis. The other students typically either don’t like analysis at all and are only there because they have to be, or they like analysis but it is part of their studies in number theory, partial differential equations for the Computation area and so forth. Either way, the students will not continue to study analysis for a degree specialization. However, we think it is important for all students to both know and appreciate this material.

Traditionally, there are several ways to go.

The Cynical Approach: Nothing you can do will make students who don’t like analysis change their mind. So teach the material hard and fast and target the 2-3 students who can benefit. The rest will come along for the ride and leave the course convinced that analysis is just like they thought: too hard and too complicated. If you take this approach, you can pick about any book you like. Most books are too abstract for our students and so are very hard for them to read. But the 2-3 students who can benefit from material at this level will be happy with the book. We admit this is not our style although some think it is a good way to find the really bright analysis students.

The Enthusiastic “maybe I can get them interested anyway” Approach: This is the alternative we prefer. The instructor scours the available literature in order to make up notes to lead the students “gently” into the required abstract way of thinking. We haven’t had much luck finding a published book for this so, as is our preferred plan of action, we type up notes such as the ones you have in your hand. These notes start out handwritten and slowly mature into the typed versions. We believe it is important to actively try to get all the students interested but, of course, this is never completely successful. However, we still think there is great value in this approach and it is the one we have been trying for many years.

Introductory Linear Analysis: MTHSC 821: Our constraints for MTHSC 821 content are that we get

the students adequately exposed to a more abstract way of thinking about the world. We generally

cover

• metric spaces.

• vector spaces with a norm.

• vector spaces with an inner product.

It doesn’t sound like much but there is a lot of material in here the students haven’t seen. For

example, we typically focus a lot on how we are really talking about sets of objects with some



additional structure. A set plus a way to measure distance between objects gives a metric space; if

we can add and scale objects, we get a vector space; if we have a vector space and add the structure

that allows us to project a vector to a subspace, we get an inner product space. We also mention

we could have a set of objects and define one operation to obtain a group or if we define a special

collection of sets we call open, we get a topological space and so forth. If we work hard, we can

help open their minds to the fact that each of the many sub disciplines in the Mathematical Sciences

focuses on special structure we add to a set to help us solve problems in that arena.

There are lots of ways to cover the important material in these topic areas and even many ways to decide on exactly what is important from metric, normed and inner product spaces. So there is that kind of freedom, but not so much freedom that you can decide to drop, say, inner product spaces. For example, we could use Sturm-Liouville systems as an example when the discussion turns to eigenvalues of operators. It is nice to use projection theorems in an inner product setting as a big finishing application, but remember the students are weak in background; e.g., their knowledge of ordinary differential equations and Calculus in $\Re^n$ is normally weak. So we are limited in our coverage of the completeness of an orthonormal sequence in an inner product space in many respects. If you look carefully at that material, you need to cover some elementary versions of the Hahn-Banach theorem to do it right. However, we run out of time to cover such advanced topics. The trade off seems to be between thorough coverage of a small number of topics or rapid coverage of many topics superficially. We like the former approach, but it can be done other ways. We believe this course is about teaching the students about the abstract way of thinking about problems and hence, we feel there is great value in teaching very, very carefully the basics of this material.

Also, this MTHSC 821 material is a nice prerequisite for partial differential equations (MTHSC 826) and ordinary differential equations (MTHSC 825) as well as statistics and probability courses.

This course takes a huge amount of time for lecture preparation and student interaction in your

office, so when we teach this material, we slow down in our research output!

In more detail, in MTHSC 821, we now begin to rephrase all of our knowledge about convergence of sequences of objects in a much more general setting.

1. Metric Spaces: A set of objects and a way of measuring distance between objects which satisfies certain special properties. This function is called a metric and its properties were chosen to mimic the properties that the absolute value function has on the real line. We learn to understand convergence of objects in a general metric space. It is really important to note that there is NO additional structure imposed on this set of objects; no linear structure (i.e. vector space structure), no notion of a special set of elements called a basis which we can use to represent arbitrary elements of the set. The metric in a sense generalizes the notion of distance between numbers. We can’t really measure the size of an object by itself, so we do not yet have a way of generalizing the idea of size or length.

A fundamentally important concept now emerges: the notion of completeness and how it is related to our choice of metric on a set of objects. We learn a clever way of constructing an abstract representation of the completion of any metric space, but at this time, we have no practical way of seeing this representation.

2. Normed Spaces: We add linear structure to the set of objects and a way of measuring the magnitude of an object; that is, there is now an operation we think of as addition and another operation which allows us to scale objects and a special function called a norm whose value for a given object can be thought of as the object’s magnitude. We then develop what we mean by convergence in this setting. Since we have a vector space structure, we can now begin to talk about a special subset of objects called a basis which can be used to find a useful way of representing an arbitrary object in the space.

Another most important concept now emerges: the cardinality of this basis may be finite or infinite. We begin to explore the consequences of a space being finite versus infinite dimensional.

3. Inner Product Spaces: To a set of objects with vector space structure, we add a function called an inner product which generalizes the notion of dot product of vectors. This has the extremely important consequence of allowing the inner product of two objects to be zero even though neither object is zero. Hence, we can develop an abstract notion of the orthogonality of two objects. This leads to the idea of a basis for the set of objects in which all the elements are mutually orthogonal. We then finally can learn how to build representations of arbitrary objects efficiently.

4. Completions: We learn how to complete an arbitrary metric, normed or inner product space in

an abstract way, but we know very little about the practical representations of such completions.

5. Linear Operators: We study a little about functions whose domain is one set of objects and

whose range is another. These functions are typically called operators. We learn a little about

them here.

6. Linear Functionals: We begin to learn the special role that real-valued functions acting on

objects play in analysis. These types of functions are called linear functionals and learning

how to characterize them is the first step in learning how to use them. We just barely begin to

learn about this here.
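
For reference, the three layers of structure described in items 1 through 3 can be written out as follows. This is a sketch using the standard textbook axioms, assuming real scalars; the symbols $d$, $\|\cdot\|$ and $\langle \cdot, \cdot \rangle$ are our notation and may differ from what is used in MTHSC 821.

```latex
% Metric space (X, d): for all x, y, z in X
\[
  d(x,y) \ge 0, \quad d(x,y) = 0 \iff x = y, \quad
  d(x,y) = d(y,x), \quad d(x,z) \le d(x,y) + d(y,z).
\]
% Normed space (V, ||.||) over the reals: for all x, y in V and real scalars a
\[
  \|x\| \ge 0, \quad \|x\| = 0 \iff x = 0, \quad
  \|a x\| = |a| \, \|x\|, \quad \|x + y\| \le \|x\| + \|y\|.
\]
% Real inner product space (V, <.,.>): for all x, y, z in V and real scalars a, b
\[
  \langle x, y \rangle = \langle y, x \rangle, \quad
  \langle a x + b y, z \rangle = a \langle x, z \rangle + b \langle y, z \rangle, \quad
  \langle x, x \rangle \ge 0, \quad \langle x, x \rangle = 0 \iff x = 0.
\]
% Each structure induces the previous one:
\[
  \|x\| = \sqrt{\langle x, x \rangle}, \qquad d(x,y) = \|x - y\|.
\]
% Orthogonality: x and y are called orthogonal when <x, y> = 0.
```

Reading the last display from left to right shows why every inner product space is a normed space and every normed space is a metric space, while the converse directions can fail.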

Measure Theory: MTHSC 822: This course generalizes the notion of integration to a very abstract setting. The set of notes you are reading is a textbook for this material. Roughly speaking, we first realize that the Riemann integral is a linear mapping from the space of Riemann integrable (and hence bounded) real valued functions on a compact interval into the reals which has a number of interesting properties. We then study how we can generalize such mappings so that they can be applied to arbitrary sets X, a special collection of subsets of X called a sigma-algebra and a new type of mapping called a measure which on $\Re$ generalizes our usual notion of the length of an interval. In this class, we discuss the following:

1. The Riemann Integral

2. Measures on a sigma-algebra S in the set X and integration with respect to the measure.


3. Measures specialized to sigma-algebras on the set $\Re^n$ and integration with respect to these measures. The canonical example of this is Lebesgue measure on $\Re^n$.

4. Differentiation and Integration in these abstract settings and their connections.
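
As a preview of the objects named above, here is a sketch of the standard definitions of a sigma-algebra and a measure. The precise formulations used later in these notes may differ in detail; the symbol $\mu$ for Lebesgue measure is our assumption.

```latex
% S is a sigma-algebra of subsets of X when
\[
  X \in S, \qquad
  A \in S \implies X \setminus A \in S, \qquad
  A_1, A_2, \ldots \in S \implies \bigcup_{n=1}^{\infty} A_n \in S.
\]
% A measure is a map \nu : S \to [0, \infty] with
\[
  \nu(\emptyset) = 0, \qquad
  \nu\Bigl( \bigcup_{n=1}^{\infty} A_n \Bigr) = \sum_{n=1}^{\infty} \nu(A_n)
  \quad \text{for pairwise disjoint } A_n \in S.
\]
% On \Re, Lebesgue measure \mu (our notation) extends the length of an interval:
\[
  \mu([a,b]) = b - a, \qquad a \le b.
\]
```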

1.1.3 More Advanced Courses

It is also recommended that students consider taking a course in what is called Functional Analysis. Here

that is called MTHSC 927. While not part of the qualifying examination, in this course, we can finally

develop in a careful way the necessary tools to work with linear operators, weak convergence and so forth.

This is a huge area of mathematics, so there are many possible ways to design an introductory course. A

typical such course would cover:

1. The Open Mapping and Closed Graph Theorems.

2. An Introduction to General Operator Theory.

3. Topological Vector Spaces and Distributions.

4. An Introduction to the Spectral Theory of Linear Operators; this is the study of the eigenvalues and

eigenobjects for a given linear operator–lots of applications here!

5. Some advanced topic using these ideas: possibilities include

(a) Existence Theory of Boundary Value Problems.

(b) Existence Theory for Integral Equations.

(c) Existence Theory in Control.

1.2 Teaching The Measure and Integration Course

So now that you have seen how the analysis courses all fit together, it is time for the main course. So roll

up your sleeves and prepare to work! Let’s start with a few more details on what this course on Measure

and Integration will cover.

In this course, we assume mathematical maturity and we tend to follow The Enthusiastic “maybe

I can get them interested anyway” Approach in lecturing (so, be warned)! It is difficult to decide where

to start in this course. There is usually a reasonable fraction of you who have never seen an adequate

treatment of Riemann Integration. For example, not everyone may have seen the equivalent of MTHSC

454 where Riemann integration is carefully discussed. We therefore have several versions of this course.

We have divided the material into blocks as follows: We believe there are a lot of advantages in treating

integration abstractly. So, if we covered the Lebesgue integral on $\Re$ right away, we could take advantage of

a lot of the special structure $\Re$ has which we don’t have in general. It is better for long term intellectual

development to see measure and integration approached without using such special structure. Also, all of

the standard theorems we want to do are just as easy to prove in the abstract setting, so why specialize to

$\Re$? So we tend to do abstract measure stuff first. The core material for Block 1 is as follows:



1. abstract measure ν on a sigma-algebra S of subsets of a universe X.

2. measurable functions with respect to a measure ν; these are also called random variables when ν is

a probability measure.

3. integration $\int f \, d\nu$

4. convergence results: monotone convergence theorem, dominated convergence theorem etc.

Then we develop the Lebesgue Integral in $\Re^n$ via outer measures as the great example of a nontrivial

measure. So Block 2 of material is thus

1. outer measures in $\Re^n$

2. Caratheodory conditions for measurable sets

3. construction of the Lebesgue sigma algebra

4. connections to Borel sets

To fill out the course, we pick topics from the following

1. Riemann and Riemann-Stieltjes integration. This would go before Block 1 if we do it. Call it Block

Riemann.

2. Decomposition of measures – I love this material so this is after Block 2. Call it Block Decomposition.

3. Connection to Riemann integration via absolute continuity of functions. This is actually hard stuff

and takes about 3 weeks to cover nicely. Call it Block Riemann and Lebesgue. If this is done

without Block Riemann, you have to do a quick review of Riemann stuff so they can follow the

proofs.

4. Fubini type theorems. This would go after Block 2. Call this Block Fubini.

5. Differentiation via the Vitali approach. This is pretty hard too. Call this Differentiation.

6. Treatment of the usual $L^p$ spaces. Call this Block Lp.

7. More convergence stuff like convergence in measure, $L^p$ convergence implies convergence of a

subsequence pointwise etc. These are hard theorems and to do them right requires a lot of time. Call

this More Convergence.

We have taught this in at least the following ways. And always, lots of homework and projects, as we

believe only hands on work really makes this stuff sink in.

Way 1: Block Riemann, Block 1, Block 2 and Block Decomposition.



Way 2: Block 1, Block 2, Block Decomposition and Block Riemann and Lebesgue.

Way 3: Block 1, Block 2, Block Decomposition and Differentiation.

Way 4: Block 1, Block 2, Block Lp, Block More Convergence and Block Decomposition.

Way 5: Block 1, Block 2, Block Fubini, Block More Convergence and Block Decomposition.

So as you can see it will be an interesting ride!


Part II

Classical Riemann Integration


Chapter 2

An Overview Of Riemann Integration

In this Chapter, we will give you a quick overview of Riemann integration. There are few real proofs but

it is useful to have a quick tour before we get on with the job of extending this material to a more abstract

setting. Much of this material can be found in a good Calculus book although the more advanced stuff

requires that you look at a book on beginning real analysis such as (Fulks (3) 1978).

2.1 Continuity

A function f defined on an interval [a, b] where a and b are finite numbers can be quite strange. For

example, here is a legitimate function f defined on [−1, 1]:

\[ f(t) = \begin{cases} 1 & \text{if } t \text{ is a rational number} \\ -1 & \text{if } t \text{ is an irrational number} \end{cases} \tag{2.1} \]

We will assume you know what rational and irrational numbers are! Now, this function is horribly odd: it

is not possible to graph it at all. But we can understand it intellectually. This function is not differentiable

or continuous at any point! We naturally want to study and use functions that are much better behaved

than this. Since continuity and differentiability are pointwise concepts, to do anything useful for biological

modeling, we usually want functions that satisfy the requirements for continuity and differentiability at all

points in an entire interval. For example, the function f defined by

\[ f(t) = 2t^3 + 32\,t + 16 \]


is defined at all real numbers t and is even continuous and differentiable at each such t. Recall the definition

of continuity of a function f at a point p in its domain.

Definition 2.1.1 (Continuity Of A Function At A Point: ε − δ Version).
f is said to be continuous at a point p in its domain if given any positive tolerance ε there is a positive restriction δ on

the values of t so that $|f(t) - f(p)| < \epsilon$ if t is in the domain of f and t satisfies $|t - p| < \delta$.

Now, this definition is written in the very formal language of mathematics. We require such precision

so that we can be absolutely clear as to what we mean. However, if we are willing to bury some of this

detail, this can be rephrased as

Definition 2.1.2 (Continuity Of A Function At A Point: Limit Version).
f is said to be continuous at a point p in its domain if several conditions hold:

1. f is actually defined at p

2. The limit as t approaches p of f exists

3. The value of the limit above matches the value f(p).

This is usually stated more succinctly as f(p) exists and $\lim_{t \to p} f(t) = f(p)$, but both ways of

saying it mean the same. You should be able to understand that our polynomial $f(t) = 2t^3 + 32\,t + 16$ is

continuous at all t values using these definitions. That is, you should have studied the ideas of limits and

continuity at this level of abstraction in your past exposure to this material.

If a function is continuous at a point p, the next question we can ask is about its differentiability.

2.2 Differentiability

Recall the definition of differentiability of a function f at a point p.

Definition 2.2.1 (Differentiability of A Function At A Point).
f is said to be differentiable at a point p in its domain if the limit as t approaches p, $t \neq p$, of the quotients $\frac{f(t) - f(p)}{t - p}$ exists. When this limit exists, the value of this limit is denoted by a number of possible symbols: $f'(p)$ or $\frac{df}{dt}(p)$. This can also be phrased in terms of the right and left hand limits

\[ f'(p^+) = \lim_{t \to p^+} \frac{f(t) - f(p)}{t - p} \quad \text{and} \quad f'(p^-) = \lim_{t \to p^-} \frac{f(t) - f(p)}{t - p}. \]

If both exist and match at p, then $f'(p)$ exists and the value of the derivative is the common value.

A fundamental consequence of the existence of a derivative of a function at a point t is that it must also be

continuous there. Remember that as it will be important in a lot of things later. We state this as Theorem

2.2.1


Theorem 2.2.1 (Differentiability Implies Continuity).
Let f be a function which is differentiable at a point t in its domain. Then f is also continuous at t.

You should have discussed this idea carefully in your previous classes so that you know about secant lines

and the way they approximate the value of the limit in Definition 2.2.1 when it exists.

2.3 Integration

You should also have been exposed to the idea of the integration of a function f. There are two intellectually

separate ideas here:

1. The idea of a Primitive or antiderivative of a function f. This is any function F which is differentiable

and satisfies $F'(t) = f(t)$ at all points in the domain of f. Normally, the domain of f is a

finite interval of the form [a, b], although it could also be an infinite interval like all of $\Re$ or $[1, \infty)$

and so on. Note that an antiderivative does not require any understanding of the process of Riemann

integration at all – only what differentiation is!

2. The idea of the Riemann integral of a function. You should have been exposed to this in your first

Calculus course and perhaps a bit more rigorously in your undergraduate second semester analysis

course.

Let’s review what Riemann Integration involves. First, we start with a bounded function f on a finite

interval [a, b]. This kind of function f need not be continuous! Then select a finite number of points from

the interval [a, b], $\{x_0, x_1, \ldots, x_{n-1}, x_n\}$. We don’t know how many points there are, so a different

selection from the interval could give us more or fewer points. But for convenience, we will just

call the last point $x_n$ and the first point $x_0$. These points are not arbitrary – $x_0$ is always a, $x_n$ is always b

and they are ordered like this:

\[ x_0 = a < x_1 < x_2 < \ldots < x_{n-1} < x_n = b \]

The collection of points from the interval [a, b] is called a Partition of [a, b] and is denoted by some

letter – here we will use the letter π. So if we say π is a partition of [a, b], we know it will have n + 1

points in it, they will be labeled from $x_0$ to $x_n$ and they will be ordered left to right with strict inequalities.

But, we will not know what value the positive integer n actually is. The simplest Partition π is the two

point partition $\{a, b\}$. Note these things also:

1. Each partition of n + 1 points determines n subintervals of [a, b].

2. The lengths of these subintervals always add up to the length of [a, b] itself, b − a.

3. These subintervals can be represented as

\[ [x_0, x_1], \; [x_1, x_2], \; \ldots, \; [x_{n-1}, x_n] \]

or more abstractly as $[x_i, x_{i+1}]$ where the index i ranges from 0 to n − 1.

4. The length of each subinterval is $x_{i+1} - x_i$ for the indices i in the range 0 to n − 1.

Now from each subinterval $[x_i, x_{i+1}]$ determined by the Partition π, select any point you want and call

it $s_i$. This will give us the points $s_0$ from $[x_0, x_1]$, $s_1$ from $[x_1, x_2]$ and so on up to the last point, $s_{n-1}$

from $[x_{n-1}, x_n]$. At each of these points, we can evaluate the function f to get the value $f(s_i)$. Call these

points an Evaluation Set for the partition π. Let’s denote such an evaluation set by the letter σ. Note

there are many such evaluation sets that can be chosen from a given partition π. We will leave it up to you

to remember that whenever we use the symbol σ, it is associated with some partition.

If the function f was nice enough to be positive always and continuous, then the product $f(s_i) \times (x_{i+1} - x_i)$

can be interpreted as the area of a rectangle. Then, if we add up all these rectangle areas

we get a sum which is useful enough to be given a special name: the Riemann sum for the function f

associated with the Partition π and our choice of evaluation set $\sigma = \{s_0, \ldots, s_{n-1}\}$. This sum is represented

by the symbol $S(f, \pi, \sigma)$ where the things inside the parentheses are there to remind us that this

sum depends on our choice of the function f, the partition π and the evaluation set σ. So formally, we

have the definition

Definition 2.3.1 (Riemann Sum).
The Riemann sum for the bounded function f, the partition π and the evaluation set $\sigma = \{s_0, \ldots, s_{n-1}\}$ from $\pi = \{x_0, x_1, \ldots, x_{n-1}, x_n\}$ is defined by

\[ S(f, \pi, \sigma) = \sum_{i=0}^{n-1} f(s_i) \, (x_{i+1} - x_i) \]

It is pretty misleading to write the Riemann sum this way as it can make us think that the n is always the same when in fact it can change value each time we select a different partition π. So many of us write the definition this way instead

\[ S(f, \pi, \sigma) = \sum_{i \in \pi} f(s_i) \, (x_{i+1} - x_i) = \sum_{\pi} f(s_i) \, (x_{i+1} - x_i) \]

and we just remember that the choice of π will determine the size of n.

2.3.1 A Riemann Sum Example

Let’s look at an example of all this. In Figure 2.1, we see the graph of a typical function which is always

positive on some finite interval [a, b].

Next, let’s set the interval to be [1, 6] and compute the Riemann Sum for a particular choice of Partition

π and evaluation set σ. This is shown in Figure 2.2.

We can also interpret the Riemann sum as an approximation to the area under the curve as shown in

Figure 2.1. This is shown in Figure 2.3.

[Figure 2.1: The Area Under The Curve f. A generic curve f on the interval [a, b] which is always positive. Note the area under this curve is the shaded region.]

2.3.2 The Riemann Integral As A Limit

We can construct many different Riemann Sums for a given function f . To define the Riemann Integral of

f , we only need a few more things:

1. Each partition π has a maximum subinterval length – let’s use the symbol $\|\pi\|$ to denote this

length. We read the symbol $\|\pi\|$ as the norm or gauge of π.

2. Each partition π and evaluation set σ determines the number $S(f, \pi, \sigma)$ by a simple calculation.

3. So if we took a collection of partitions $\pi_1$, $\pi_2$ and so on with associated evaluation sets $\sigma_1$, $\sigma_2$ etc.,

we would construct a sequence of real numbers $S(f, \pi_1, \sigma_1), S(f, \pi_2, \sigma_2), \ldots, S(f, \pi_n, \sigma_n), \ldots$.

Let’s assume the norm of the partition $\pi_n$ gets smaller all the time; i.e. $\lim_{n \to \infty} \|\pi_n\| = 0$. We

could then ask if this sequence of numbers converges to something.

What if the sequence of Riemann sums we construct above converged to the same number I no matter

what sequence of partitions whose norm goes to zero and associated evaluation sets we chose? Then, we

would have that the value of this limit is independent of the choices above. This is indeed what we mean

by the Riemann Integral of f on the interval [a, b].

The partition is $\pi = \{1.0, 1.5, 2.6, 3.8, 4.3, 5.6, 6.0\}$. Hence, we have subinterval lengths of $x_1 - x_0 = 0.5$, $x_2 - x_1 = 1.1$, $x_3 - x_2 = 1.2$, $x_4 - x_3 = 0.5$, $x_5 - x_4 = 1.3$ and $x_6 - x_5 = 0.4$, giving $\|\pi\| = 1.3$. Thus,

\[ S(f, \pi, \sigma) = \sum_{i=0}^{5} f(s_i) \, (x_{i+1} - x_i) \]

For the evaluation set $\sigma = \{1.1, 1.8, 3.0, 4.1, 5.3, 5.8\}$ shown in red in Figure 2.2, we would find the Riemann sum is

\[ S(f, \pi, \sigma) = f(1.1) \times 0.5 + f(1.8) \times 1.1 + f(3.0) \times 1.2 + f(4.1) \times 0.5 + f(5.3) \times 1.3 + f(5.8) \times 0.4 \]

Of course, since our picture shows a generic f, we can’t actually put in the function values $f(s_i)$!

Figure 2.2: A Simple Riemann Sum
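The Riemann sum of Figure 2.2 can be evaluated once a concrete function is supplied. The following is a minimal sketch: the function `f` below is a hypothetical stand-in (the text only shows a generic positive curve), and the helper names `riemann_sum` and `norm` are chosen here for illustration.

```python
def riemann_sum(f, partition, evaluation):
    """Compute S(f, pi, sigma) = sum of f(s_i) * (x_{i+1} - x_i)."""
    if len(evaluation) != len(partition) - 1:
        raise ValueError("need exactly one evaluation point per subinterval")
    total = 0.0
    for i, s in enumerate(evaluation):
        left, right = partition[i], partition[i + 1]
        if not (left <= s <= right):
            raise ValueError("evaluation point outside its subinterval")
        total += f(s) * (right - left)
    return total


def norm(partition):
    """Return the length of the longest subinterval, written ||pi|| in the text."""
    return max(right - left for left, right in zip(partition, partition[1:]))


# Partition and evaluation set taken from Figure 2.2.
partition = [1.0, 1.5, 2.6, 3.8, 4.3, 5.6, 6.0]
evaluation = [1.1, 1.8, 3.0, 4.1, 5.3, 5.8]


def f(t):
    # Hypothetical positive function, assumed only for this illustration.
    return 1.0 + 0.1 * t * t


print("norm of the partition:", norm(partition))
print("Riemann sum:", riemann_sum(f, partition, evaluation))
```

Choosing a different evaluation set from the same partition generally changes the value of the sum, which is why the symbol S(f, π, σ) records both π and σ.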

Definition 2.3.2 (Riemann Integrability Of A Bounded Function).
Let f be a bounded function on the finite interval [a, b]. If there is a number I so that

\[ \lim_{n \to \infty} S(f, \pi_n, \sigma_n) = I \]

no matter what sequence of partitions $\{\pi_n\}$ with associated sequence of evaluation sets $\{\sigma_n\}$ we choose as long as $\lim_{n \to \infty} \|\pi_n\| = 0$, we will say that the Riemann Integral of f on [a, b] exists and equals the value I.

The value I is dependent on the choice of f and interval [a, b]. So we often denote this value by $I(f, [a, b])$ or more simply as $I(f, a, b)$. Historically, the idea of the Riemann integral was developed using area approximation as an application, so the summing nature of the Riemann Sum was denoted by the 16th century letter S, an elongated or stretched letter S which looked like what we call the integral sign $\int$. Hence, the common notation for the Riemann Integral of f on [a, b], when this value exists, is $\int_a^b f$. We usually want to remember what the independent variable of f is also and we want to remind ourselves that this value is obtained as we let the norm of the partitions go to zero. The symbol dx for the independent variable x is used as a reminder that $x_{i+1} - x_i$ is going to zero as the norm of the partitions goes to zero. So it has been very convenient to add to the symbol $\int_a^b f$ this information and use the augmented symbol $\int_a^b f(x) \, dx$ instead. Hence, if the independent variable was t instead of x, we would use $\int_a^b f(t) \, dt$. Since for a function f, the name we give to the independent variable is a matter of personal choice, we see that the choice of variable name we use in the symbol $\int_a^b f(t) \, dt$ is very arbitrary. Hence, it is common to refer to the independent variable we use in the symbol $\int_a^b f(t) \, dt$ as the dummy variable of integration.

[Figure 2.3: The Riemann Sum As An Approximate Area. The partition is $\pi = \{1.0, 1.5, 2.6, 3.8, 4.3, 5.6, 6.0\}$.]

We need a few more facts. We shall prove later the following things are true about the Riemann Integral

of a bounded function. First, we know when a bounded function actually has a Riemann integral from

Theorem 2.3.1.



Theorem 2.3.1 (Existence Of The Riemann Integral).
Let f be a bounded function on the finite interval [a, b]. Then the Riemann integral of f on [a, b], $\int_a^b f(t) \, dt$, exists if either

1. f is continuous on [a, b], or

2. f is continuous except at a finite number of points on [a, b].

Further, if f and g are both Riemann integrable on [a, b] and they match at all but a finite number of points, then their Riemann integrals match; i.e. $\int_a^b f(t) \, dt$ equals $\int_a^b g(t) \, dt$.

The function given by Equation 2.1 is bounded but continuous nowhere on [−1, 1] and it is indeed

possible to prove it does not have a Riemann integral on that interval. However, most of the functions we

want to work with do have a lot of smoothness, i.e. continuity and even differentiability on the intervals

we are interested in. Hence, Theorem 2.3.1 will apply. Here are some examples:

1. If f(t) is $t^2$ on the interval [−2, 4], then $\int_{-2}^{4} f(t) \, dt$ does exist as f is continuous on this interval.

2. If g was defined by

\[ g(t) = \begin{cases} t^2 & -2 \le t < 1 \text{ and } 1 < t \le 4 \\ 5 & t = 1 \end{cases} \]

we see g is not continuous at only one point and so it is Riemann integrable on [−2, 4]. Moreover,

since f and g are both integrable and match at all but one point, their Riemann integrals are equal.
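The two examples above can be explored numerically. The sketch below is only an illustration under stated assumptions: it uses uniform partitions of [−2, 4] with right endpoints as the evaluation set (one particular choice among many), and the helper name `right_endpoint_sum` is introduced here. With an even number of subintervals the point t = 1 is one of the evaluation points, so the sums for g really do see the altered value 5.

```python
def f(t):
    return t * t


def g(t):
    # Same as f except at the single point t = 1, where g takes the value 5.
    return 5.0 if t == 1.0 else t * t


def right_endpoint_sum(func, a, b, n):
    """Riemann sum on the uniform partition of [a, b] with n subintervals,
    using the right endpoint of each subinterval as the evaluation point."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        s_i = a + (b - a) * (i + 1) / n
        total += func(s_i) * width
    return total


for n in (10, 100, 1000, 10000):
    sum_f = right_endpoint_sum(f, -2.0, 4.0, n)
    sum_g = right_endpoint_sum(g, -2.0, 4.0, n)
    # The two sums differ only through the one rectangle sitting at t = 1.
    print(n, sum_f, sum_g, sum_g - sum_f)
```

The gap between the two sums is (5 − 1) times the subinterval width, that is 24/n, so it shrinks to zero as the norm of the partition does, while both sums approach the common value 24.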

However, with that said, in this course, we want to relax the smoothness requirements on the functions

f we work with and define a more general type of integral for this less restricted class of functions.

2.3.3 The Fundamental Theorem Of Calculus

There is a big connection between the idea of the antiderivative of a function f and its Riemann integral.

For a positive function f on the finite interval [a, b], we can construct the area under the curve function

$F(x) = \int_a^x f(t) \, dt$ where for convenience we choose an x in the open interval (a, b). We show F(x) and

F(x + h) for a small positive h in Figure 2.4. Let’s look at the difference in these areas:

\[ \begin{aligned} F(x + h) - F(x) &= \int_a^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \\ &= \int_a^x f(t) \, dt + \int_x^{x+h} f(t) \, dt - \int_a^x f(t) \, dt \\ &= \int_x^{x+h} f(t) \, dt \end{aligned} \]



where we have used standard properties of the Riemann integral to write the first integral as two pieces

and then do a subtraction. Now divide this difference by the change in x which is h. We find

\[ \frac{F(x + h) - F(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t) \, dt \tag{2.2} \]

The difference in area, $\int_x^{x+h} f(t) \, dt$, is the second shaded area in Figure 2.4. Clearly, we have

\[ F(x + h) - F(x) = \int_x^{x+h} f(t) \, dt \tag{2.3} \]

We know that f is bounded on [a, b]; hence, there is a number B so that $f(t) \le B$ for all t in [a, b]. Thus,

using Equation 2.3, we see

\[ F(x + h) - F(x) \le \int_x^{x+h} B \, dt = B \, h \tag{2.4} \]

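From this we can see that

\[ \lim_{h \to 0} \left( F(x + h) - F(x) \right) \le \lim_{h \to 0} B \, h = 0 \]

Since f is positive, the difference F(x + h) − F(x) is also at least 0 for positive h. We conclude that F is continuous at each x in [a, b] as

\[ \lim_{h \to 0} \left( F(x + h) - F(x) \right) = 0 \]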

It seems that the new function F we construct by integrating the function f in this manner, always builds

a new function that is continuous. Is F differentiable at x? If f is continuous at x, then given a positive ε,

there is a positive δ so that

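\[ f(x) - \epsilon < f(t) < f(x) + \epsilon \quad \text{if } x - \delta < t < x + \delta \]

and t is in [a, b]. So, if h is positive and less than δ, we have

\[ \frac{1}{h} \int_x^{x+h} (f(x) - \epsilon) \, dt < \frac{F(x + h) - F(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t) \, dt < \frac{1}{h} \int_x^{x+h} (f(x) + \epsilon) \, dt \]

This is easily evaluated to give

\[ f(x) - \epsilon < \frac{F(x + h) - F(x)}{h} = \frac{1}{h} \int_x^{x+h} f(t) \, dt < f(x) + \epsilon \]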

if h is less than δ. This shows that

\[ \lim_{h \to 0^+} \frac{F(x + h) - F(x)}{h} = f(x) \]

You should be able to believe that a similar argument would work for negative values of h: i.e.,

\[ \lim_{h \to 0^-} \frac{F(x + h) - F(x)}{h} = f(x) \]

This tells us that $F'(x)$ exists and equals f(x) as long as f is continuous at x as

\[ \begin{aligned} F'(x^+) &= \lim_{h \to 0^+} \frac{F(x + h) - F(x)}{h} = f(x) \\ F'(x^-) &= \lim_{h \to 0^-} \frac{F(x + h) - F(x)}{h} = f(x) \end{aligned} \]

This relationship is called the Fundamental Theorem of Calculus. The same sort of argument works for x

equals a or b but we only need to look at the derivative from one side. We will prove this sort of theorem

using fairly relaxed assumptions on f for the interval [a, b] in the later Chapters. Even if we just consider

the world of Riemann Integration, we only need to assume that f is Riemann Integrable on [a, b] which

allows for jumps in the function.

Theorem 2.3.2 (Fundamental Theorem Of Calculus).
Let f be Riemann Integrable on [a, b]. Then the function F defined on [a, b] by $F(x) = \int_a^x f(t) \, dt$ satisfies

1. F is continuous on all of [a, b]

2. F is differentiable at each point x in [a, b] where f is continuous and $F'(x) = f(x)$.
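Both conclusions of Theorem 2.3.2 can be watched numerically on a function with a jump. The following sketch makes several assumptions that are not in the text: the step function f on [−1, 1] is a hypothetical example, the integral is approximated with a midpoint rule (`midpoint_integral` is a name introduced here), and the integration is split at the known jump t = 0 so each piece is smooth.

```python
def midpoint_integral(func, a, b, n=2000):
    """Approximate the integral of func over [a, b] with n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


def f(t):
    # Hypothetical example: bounded on [-1, 1] with a single jump at t = 0.
    return -1.0 if t < 0 else 1.0


def F(x):
    """F(x) = integral of f from -1 to x, for x in [-1, 1], split at the jump."""
    if x <= 0:
        return midpoint_integral(f, -1.0, x)
    return midpoint_integral(f, -1.0, 0.0) + midpoint_integral(f, 0.0, x)


def quotient(x, h):
    """Difference quotient (F(x + h) - F(x)) / h."""
    return (F(x + h) - F(x)) / h


print("F just left and right of 0:", F(-1e-6), F(1e-6))
print("quotient at x =  0.5:", quotient(0.5, 1e-3))
print("quotient at x = -0.5:", quotient(-0.5, 1e-3))
print("right and left quotients at 0:", quotient(0.0, 1e-3), quotient(0.0, -1e-3))
```

For this f the function F works out to |x| − 1, so F is continuous everywhere, its difference quotients match f away from 0, and at the jump the right and left quotients disagree, which is consistent with the theorem promising differentiability only where f is continuous.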

Using the same f as before, suppose G was defined on [a, b] as follows

\[ G(x) = \int_x^b f(t) \, dt. \]

Note that

\[ \begin{aligned} F(x) + G(x) &= \int_a^x f(t) \, dt + \int_x^b f(t) \, dt \\ &= \int_a^b f(t) \, dt. \end{aligned} \]

[Figure 2.4: The Function F(x). A generic curve f on the interval [a, b] which is always positive, with the points x and x + h marked. We let F(x) be the area under this curve from a to x. This is indicated by the shaded region.]

Since the Fundamental Theorem of Calculus tells us F is differentiable at each point where f is continuous, we see $G(x) = \int_a^b f(t) \, dt - F(x)$

must also be differentiable there. It follows that

\[ G'(x) = -F'(x) = -f(x). \]

Let’s state this as a variant of the Fundamental Theorem of Calculus, the Reversed Fundamental Theorem

of Calculus so to speak.



Theorem 2.3.3 (Fundamental Theorem Of Calculus Reversed).
Let f be Riemann Integrable on [a, b]. Then the function F defined on [a, b] by $F(x) = \int_x^b f(t) \, dt$ satisfies

1. F is continuous on all of [a, b]

2. F is differentiable at each point x in [a, b] where f is continuous and $F'(x) = -f(x)$.

2.3.4 The Cauchy Fundamental Theorem Of Calculus

We can use the Fundamental Theorem of Calculus to learn how to evaluate many Riemann integrals. Let

G be an antiderivative of the function f on [a, b]. Then, by definition, G′(x) = f(x) and so we know G

is continuous at each x. But we still don’t know that f itself is continuous. However, if we assume f is

continuous, then if we define F on [a, b] by

\[ F(x) = f(a) + \int_a^x f(t) \, dt, \]

the Fundamental Theorem of Calculus, Theorem 2.3.2, is applicable. Thus, $F'(x) = f(x)$ at each point. But that means $F' = G' = f$ at each point. Functions whose derivatives are the same on an interval must differ by a constant. Call this constant C. We thus have $F(x) = G(x) + C$. So, we have

\[ \begin{aligned} F(b) &= f(a) + \int_a^b f(t) \, dt = G(b) + C \\ F(a) &= f(a) + \int_a^a f(t) \, dt = G(a) + C \end{aligned} \]

But $\int_a^a f(t) \, dt$ is zero, so we conclude after some rewriting

\[ \begin{aligned} G(b) &= f(a) + \int_a^b f(t) \, dt - C \\ G(a) &= f(a) - C \end{aligned} \]

And after subtracting, we find the important result

\[ G(b) - G(a) = \int_a^b f(t) \, dt \]



This is huge! This is what tells us how to integrate many functions. For example, if $f(t) = t^3$, we can guess the antiderivatives have the form $t^4/4 + C$ for an arbitrary constant C. Thus, since $f(t) = t^3$ is continuous, the result above applies. We can therefore calculate Riemann integrals like these:

1. \[ \int_1^3 t^3 \, dt = \frac{t^4}{4} \Big|_1^3 = \frac{3^4}{4} - \frac{1^4}{4} = \frac{80}{4} \]

2. \[ \int_{-2}^{4} t^3 \, dt = \frac{t^4}{4} \Big|_{-2}^{4} = \frac{4^4}{4} - \frac{(-2)^4}{4} = \frac{256}{4} - \frac{16}{4} = \frac{240}{4} \]
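The two evaluations above can be checked mechanically. This is a small sketch, not part of the original development: it evaluates G(b) − G(a) exactly with rational arithmetic for the antiderivative G(t) = t⁴/4, and compares against a midpoint Riemann sum (the helper names `cauchy_ftc` and `midpoint_sum` are introduced here for illustration).

```python
from fractions import Fraction


def G(t):
    """An antiderivative of f(t) = t^3, evaluated exactly."""
    return Fraction(t) ** 4 / 4


def cauchy_ftc(a, b):
    """Evaluate the integral of t^3 over [a, b] as G(b) - G(a)."""
    return G(b) - G(a)


def midpoint_sum(func, a, b, n):
    """Riemann sum on a uniform partition using subinterval midpoints."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


print(cauchy_ftc(1, 3))    # 20, that is 80/4
print(cauchy_ftc(-2, 4))   # 60, that is 240/4
print(midpoint_sum(lambda t: t ** 3, 1.0, 3.0, 1000))
print(midpoint_sum(lambda t: t ** 3, -2.0, 4.0, 1000))
```

The Riemann sums only approximate the integrals, while the antiderivative gives the exact values 20 and 60 with a single subtraction.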

Let’s formalize this as a theorem called the Cauchy Fundamental Theorem of Calculus. All we really

need to prove this result is that f is Riemann integrable on [a, b], which is true if our function f is continuous.

Theorem 2.3.4 (Cauchy Fundamental Theorem Of Calculus).
Let G be any antiderivative of the Riemann integrable function f on the interval [a, b]. Then $G(b) - G(a) = \int_a^b f(t) \, dt$.

2.3.5 Applications

With the Cauchy Fundamental Theorem of Calculus under our belt, we can guess a lot of antiderivatives

and from that know how to evaluate many Riemann integrals. Let’s get started.

1. It is easy to guess the antiderivative of a power of t as we have already mentioned. We know the

antiderivatives of the following are easy to figure out:

(a) If $f(t) = t^5$, then the antiderivative of f is any function of the form $F(t) = t^6/6 + C$ where

C can be any constant.

(b) If $f(t) = t^{-5}$, it is still easy to guess the antiderivative which is $F(t) = t^{-4}/(-4) + C$,

where C is an arbitrary constant.



The common symbol for the antiderivative of f has evolved to be $\int f$ because of the close connection

between the antiderivative of f and the Riemann integral of f which is given in the Cauchy

Fundamental Theorem of Calculus, Theorem 2.3.4. The usual Riemann integral, $\int_a^b f(t) \, dt$, of f on

[a, b] computes a definite value – hence, the symbol $\int_a^b f(t) \, dt$ is usually referred to as the definite

integral of f on [a, b] to contrast it with the family of functions represented by the antiderivative $\int f$.

Since the antiderivatives are arbitrary up to a constant, most of us refer to the antiderivative as the

indefinite integral of f . Also, we hardly ever say “let’s find the antiderivative of f” – instead, we

just say, “let’s integrate f”. We will begin using this shorthand now! We can state these results as

Theorem 2.3.5.

Theorem 2.3.5 (Antiderivatives Of Simple Powers).
If p is any power other than −1, then the antiderivative of $f(t) = t^p$ is $F(t) = t^{p+1}/(p+1) + C$. This is also expressed as $\int t^p \, dt = t^{p+1}/(p+1) + C$.

2. The Riemann integral of the function f on [a, b] can also be easily computed. We state this as Theorem 2.3.6.

Theorem 2.3.6 (Definite Integrals Of Simple Powers).
If p is any power other than −1, then the definite integral of $f(t) = t^p$ on [a, b] is

\[ \int_a^b t^p \, dt = \frac{t^{p+1}}{p+1} \Big|_a^b \]

3. The simple trigonometric functions sin(t) and cos(t) also have straightforward antiderivatives as

shown in Theorem 2.3.7.

Theorem 2.3.7 (Antiderivatives of Simple Trigonometric Functions).

(a) The antiderivative of sin(t) equals − cos(t) + C

(b) The antiderivative of cos(t) equals sin(t) + C

4. The definite integrals of the sin and cos functions are then:

Theorem 2.3.8 (Definite Integrals Of Simple Trigonometric Functions).

(a) $\int_a^b \sin(t) \, dt$ is $-\cos(t) \Big|_a^b$

(b) $\int_a^b \cos(t) \, dt$ is $\sin(t) \Big|_a^b$

2.3.6 Simple Substitution Techniques

We can use the tools above to figure out how to integrate many functions that seem complicated but instead

are just disguised versions of simple power function integrations. Let’s go through some in great detail.

Exercise 2.3.1. Compute $\int (t^2 + 1) \, 2t \, dt$.

Solution 2.3.1. When you look at this integral, you should train yourself to see the simpler integral $\int u \, du$ where $u(t) = t^2 + 1$. Here are the steps:

1. We make the change of variable $u(t) = t^2 + 1$. Now differentiate both sides to see $u'(t) = 2t$. Thus, we have

\[ \int (t^2 + 1) \, 2t \, dt = \int u(t) \, u'(t) \, dt \]

2. Now recall the chain rule for powers of functions; we know

\[ \left( (u(t))^2 \right)' = 2 \, u(t) \, u'(t) \]

Thus,

\[ u(t) \, u'(t) = \frac{1}{2} \left( (u(t))^2 \right)' \]

This then tells us that

\[ \begin{aligned} \int (t^2 + 1) \, 2t \, dt &= \int u(t) \, u'(t) \, dt \\ &= \int \frac{1}{2} \left( (u(t))^2 \right)' dt \end{aligned} \]

Now, the notation $\int \left( (u(t))^2 \right)' dt$ is just our way of asking for the antiderivative of the function behind the integral sign. Here, that function is $(u^2)'$. This antiderivative is, of course, just $u^2$! Plugging that into the original problem, we find

\[ \begin{aligned} \int (t^2 + 1) \, 2t \, dt &= \int u(t) \, u'(t) \, dt \\ &= \int \frac{1}{2} \left( (u(t))^2 \right)' dt \\ &= \frac{1}{2} u^2(t) + C \\ &= \frac{1}{2} (t^2 + 1)^2 + C \end{aligned} \]

Whew!! That was awfully complicated looking. Let’s do it again in a bit more streamlined fashion.

Note all of the steps we go through below are the same as the longer version above, but since we write

less detail down, it is much more compact. You need to get very good at understanding and doing all these

steps!! Here is the second version:

Solution 2.3.2.

1. We make the change of variable $u(t) = t^2 + 1$. But we write this more simply as $u = t^2 + 1$ so that the dependence of u on t is implied rather than explicitly stated. This simplifies our notation already! Now differentiate both sides to see $u'(t) = 2t$. We will write this as $du = 2t \, dt$, again hiding the t variable, using the fact that $\frac{du}{dt} = 2t$ can be written in its differential form (you should have seen this idea in your first Calculus course). Thus, we have

\[ \int (t^2 + 1) \, 2t \, dt = \int u \, du \]

2. The antiderivative of u is $u^2/2 + C$ and so we have

\[ \begin{aligned} \int (t^2 + 1) \, 2t \, dt &= \int u \, du \\ &= \frac{1}{2} u^2 + C \\ &= \frac{1}{2} (t^2 + 1)^2 + C \end{aligned} \]

Now let’s try one a bit harder:


Exercise 2.3.2. Compute $\int (t^2 + 1)^3 \, 4t \, dt$.

Solution 2.3.3. When you look at this integral, again you should train yourself to see the simpler integral

$2 \int u^3 \, du$ where $u(t) = t^2 + 1$. Here are the steps: first, the detailed version

1. We make the change of variable $u(t) = t^2 + 1$. Now differentiate both sides to see $u'(t) = 2t$. Thus, we have

\[ \int (t^2 + 1)^3 \, 4t \, dt = 2 \int u^3(t) \, u'(t) \, dt \]

2. Now recall the chain rule for powers of functions; we know

\[ \left( (u(t))^4 \right)' = 4 \, u^3(t) \, u'(t) \]

Thus,

\[ 2 \, u^3(t) \, u'(t) = 2 \cdot \frac{1}{4} \left( (u(t))^4 \right)' \]

This then tells us that

\[ \begin{aligned} \int (t^2 + 1)^3 \, 4t \, dt &= 2 \int u^3(t) \, u'(t) \, dt \\ &= \int \frac{1}{2} \left( (u(t))^4 \right)' dt \end{aligned} \]

Now, the notation $\int \left( (u(t))^4 \right)' dt$ is just our way of asking for the antiderivative of the function behind the integral sign. Here, that function is $(u^4)'$. This antiderivative is, of course, just $u^4$! Plugging that into the original problem, we find

\[ \begin{aligned} \int (t^2 + 1)^3 \, 4t \, dt &= 2 \int u^3(t) \, u'(t) \, dt \\ &= \frac{1}{2} u^4(t) + C \\ &= \frac{1}{2} (t^2 + 1)^4 + C \end{aligned} \]

Again, this was awfully complicated looking. The streamlined version is as follows:

1. We make the change of variable $u(t) = t^2 + 1$. Now differentiate both sides to see $u'(t) = 2t$ and write this as $du = 2t \, dt$. Thus, we have

\[ \int (t^2 + 1)^3 \, 4t \, dt = 2 \int u^3 \, du \]

2. The antiderivative of $u^3$ is $u^4/4 + C$ and so we have

\[ \begin{aligned} \int (t^2 + 1)^3 \, 4t \, dt &= 2 \int u^3 \, du \\ &= \frac{1}{2} u^4 + C \\ &= \frac{1}{2} (t^2 + 1)^4 + C \end{aligned} \]

Now let’s do one the short way only.

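Exercise 2.3.3. Compute $\int \sqrt{t^2 + 1} \; 3t \, dt$.

Solution. This is the simpler integral $\frac{3}{2} \int u^{1/2} \, du$ where $u(t) = t^2 + 1$. Here are the steps: we know $du = 2t \, dt$. Thus

\[ \begin{aligned} \int \sqrt{t^2 + 1} \; 3t \, dt &= \frac{3}{2} \int u^{\frac{1}{2}} \, du \\ &= \frac{3}{2} \cdot \frac{1}{\frac{3}{2}} \, u^{\frac{3}{2}} + C \\ &= \frac{3}{2} \cdot \frac{2}{3} \, (t^2 + 1)^{\frac{3}{2}} + C \end{aligned} \]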


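Exercise 2.3.4. Compute $\int \sin(t^2 + 1) \, 5t \, dt$.

Solution. This is the simpler integral $\frac{5}{2} \int \sin(u) \, du$ where $u(t) = t^2 + 1$. Here are the steps: we know $du = 2t \, dt$. Thus

\[ \begin{aligned} \int \sin(t^2 + 1) \, 5t \, dt &= \frac{5}{2} \int \sin(u) \, du \\ &= \frac{5}{2} (-\cos(u)) + C \\ &= -\frac{5}{2} \cos(t^2 + 1) + C \end{aligned} \]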
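A quick way to gain confidence in a substitution answer is to differentiate it and compare with the original integrand. The sketch below does this numerically for the four indefinite integrals worked so far; the central difference quotient and the sample points are choices made here for illustration, so agreement is only approximate.

```python
import math


def derivative(func, t, h=1e-5):
    """Central difference approximation to func'(t)."""
    return (func(t + h) - func(t - h)) / (2.0 * h)


# Each entry: (label, integrand, proposed antiderivative with C = 0).
candidates = [
    ("(t^2+1) 2t",
     lambda t: (t * t + 1) * 2 * t,
     lambda t: 0.5 * (t * t + 1) ** 2),
    ("(t^2+1)^3 4t",
     lambda t: (t * t + 1) ** 3 * 4 * t,
     lambda t: 0.5 * (t * t + 1) ** 4),
    ("sqrt(t^2+1) 3t",
     lambda t: math.sqrt(t * t + 1) * 3 * t,
     lambda t: (t * t + 1) ** 1.5),
    ("sin(t^2+1) 5t",
     lambda t: math.sin(t * t + 1) * 5 * t,
     lambda t: -2.5 * math.cos(t * t + 1)),
]


def max_mismatch(integrand, antiderivative, points):
    """Largest gap between the numerical derivative and the integrand."""
    return max(abs(derivative(antiderivative, t) - integrand(t)) for t in points)


sample_points = [-1.5, -0.5, 0.0, 0.7, 2.0]
for label, integrand, antiderivative in candidates:
    print(label, max_mismatch(integrand, antiderivative, sample_points))
```

A mismatch near zero at every sample point is what we expect when the antiderivative is right; a wrong constant factor shows up immediately as a large gap.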

Now let’s do a definite integral:

Exercise 2.3.5. Compute $\int_1^5 (t^2 + 2t + 1)^2 \, (t + 1) \, dt$.

Solution. This is the simpler integral $\frac{1}{2} \int u^2 \, du$ where $u(t) = t^2 + 2t + 1$. Here are the steps: we know $du = (2t + 2) \, dt$. Thus

\[ \int_1^5 (t^2 + 2t + 1)^2 \, (t + 1) \, dt = \frac{1}{2} \int_{t=1}^{t=5} u^2 \, du \]

where we label the bottom and top limit of the integral in terms of the t variable to remind ourselves that the original integration was with respect to t. Then,

\[ \begin{aligned} \frac{1}{2} \int_{t=1}^{t=5} u^2 \, du &= \frac{1}{2} \, \frac{u^3}{3} \Big|_{t=1}^{t=5} \\ &= \frac{1}{2} \cdot \frac{1}{3} \, (t^2 + 2t + 1)^3 \Big|_1^5 \\ &= \frac{1}{6} \left( 36^3 - 4^3 \right) \end{aligned} \]

since u(5) = 36 and u(1) = 4. We will prove general substitution theorems for Riemann Integrable functions later. But it is really just

an application of the chain rule!
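The definite integral in Exercise 2.3.5 can be cross-checked in two independent ways. This sketch is an illustration only: it evaluates the substitution formula exactly with rational arithmetic at the endpoint values of u, and compares with a midpoint Riemann sum of the original integrand (the helper name `midpoint_sum` and the number of subintervals are choices made here).

```python
from fractions import Fraction


def u(t):
    return t * t + 2 * t + 1


def integrand(t):
    return u(t) ** 2 * (t + 1)


# Substitution result: (1/2) * u^3 / 3 evaluated between t = 1 and t = 5.
exact = Fraction(u(5) ** 3 - u(1) ** 3, 6)


def midpoint_sum(func, a, b, n):
    """Riemann sum on a uniform partition using subinterval midpoints."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


approx = midpoint_sum(integrand, 1.0, 5.0, 20000)
print("u(1), u(5):", u(1), u(5))
print("substitution value:", exact, "which is about", float(exact))
print("midpoint Riemann sum:", approx)
```

The two numbers agree closely, which is a useful sanity check that the endpoint values of u, here 4 and 36, were carried through the substitution correctly.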

2.4 The Riemann Integral of Functions With Jumps

Now let’s look at the Riemann integral of functions which have points of discontinuity.

2.4.1 Removable Discontinuity

Consider the function f defined on [−2, 5] by

\[ f(t) = \begin{cases} 2t & -2 \le t < 0 \\ 1 & t = 0 \\ (1/5) t^2 & 0 < t \le 5 \end{cases} \]

Let’s calculate $F(t) = \int_{-2}^{t} f(s) \, ds$. This will have to be done in several parts because of the way f

is defined.

1. On the interval [−2, 0], note that f is continuous except at one point, t = 0. Hence, f is Riemann

integrable by Theorem 2.3.1. Also, the function 2t is continuous on this interval and so is also

Riemann integrable. Then since f on [−2, 0] and 2t match at all but one point on [−2, 0], their

Riemann integrals must match. Hence, if t is in [−2, 0], we compute F as follows:

\[ \begin{aligned} F(t) &= \int_{-2}^{t} f(s) \, ds \\ &= \int_{-2}^{t} 2s \, ds \\ &= s^2 \Big|_{-2}^{t} \\ &= t^2 - (-2)^2 = t^2 - 4 \end{aligned} \]

2. On the interval [0, 5], note that f is continuous except at one point, t = 0. Hence, f is Riemann

integrable by Theorem 2.3.1. Also, the function $(1/5) t^2$ is continuous on this interval and is therefore also Riemann integrable. Then since f on [0, 5] and $(1/5) t^2$ match at all but one point on [0, 5], their Riemann integrals must match. Hence, if t is in [0, 5], we compute F as follows:

\[ \begin{aligned} F(t) &= \int_{-2}^{t} f(s) \, ds \\ &= \int_{-2}^{0} f(s) \, ds + \int_{0}^{t} f(s) \, ds \\ &= \int_{-2}^{0} 2s \, ds + \int_{0}^{t} (1/5) s^2 \, ds \\ &= s^2 \Big|_{-2}^{0} + (1/15) s^3 \Big|_{0}^{t} \\ &= -4 + t^3/15 \end{aligned} \]

Thus, we have found that

$$F(t) = \begin{cases} t^2 - 4, & -2 \le t < 0 \\ t^3/15 - 4, & 0 < t \le 5 \end{cases}$$

Note, we didn’t define F at t = 0 yet. Since f is Riemann Integrable on $[-2, 5]$, we know from the Fundamental Theorem of Calculus, Theorem 2.3.2, that F must be continuous. Let’s check. F is clearly continuous on either side of 0 and we note that $\lim_{t \to 0^-} F(t)$, which is $F(0^-)$, is $-4$, which is exactly the value of $F(0^+)$. Hence, F is indeed continuous at 0 and we can write

$$F(t) = \begin{cases} t^2 - 4, & -2 \le t \le 0 \\ t^3/15 - 4, & 0 \le t \le 5 \end{cases}$$

What about the differentiability of F ? The Fundamental Theorem of Calculus guarantees that F has a

derivative at each point where f is continuous and at those points F ′(t) = f(t). Hence, we know this is

true at all t except 0. Note at those t, we find

$$F'(t) = \begin{cases} 2t, & -2 \le t < 0 \\ (1/5)\,t^2, & 0 < t \le 5 \end{cases}$$

which is exactly what we expect. Also, note $F'(0^-) = 0$ and $F'(0^+) = 0$ as well. Hence, since the right and left hand derivatives match, we see $F'(0)$ does exist and has the value 0. But this is not the same as $f(0) = 1$. Note, F is not the antiderivative of f on $[-2, 5]$ because of this mismatch.
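The closed form for F in the removable discontinuity example can be checked numerically. The sketch below is only an illustration: the midpoint rule and the step size used for the one-sided difference quotients are our choices, not part of the text. It compares the closed form with a numerical integral of f and estimates the one-sided derivatives of F at 0.

```python
def f(t):
    """The integrand of Section 2.4.1 on [-2, 5]."""
    if t < 0:
        return 2.0 * t
    if t == 0:
        return 1.0
    return t * t / 5.0

def F(t):
    """Closed form derived in the text."""
    return t * t - 4.0 if t <= 0 else t ** 3 / 15.0 - 4.0

def midpoint_integral(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def F_numeric(t):
    """Integrate f from -2 to t, splitting at the bad point t = 0."""
    if t <= 0:
        return midpoint_integral(f, -2.0, t)
    return midpoint_integral(f, -2.0, 0.0) + midpoint_integral(f, 0.0, t)

def one_sided_quotients(G, c, h):
    """Left and right difference quotients of G at c with step h."""
    return (G(c) - G(c - h)) / h, (G(c + h) - G(c)) / h

left, right = one_sided_quotients(F, 0.0, 1e-4)
print(F(3.0), F_numeric(3.0), left, right)
```

Both difference quotients are close to 0, in line with $F'(0) = 0$, even though $f(0) = 1$: changing f at a single point changes neither F nor its derivative.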

2.4.2 Jump Discontinuity

Now consider the function f defined on $[-2, 5]$ by

$$f(t) = \begin{cases} 2t, & -2 \le t < 0 \\ 1, & t = 0 \\ 2 + (1/5)\,t^2, & 0 < t \le 5 \end{cases}$$

Let’s calculate $F(t) = \int_{-2}^{t} f(s)\, ds$. Again, this will have to be done in several parts because of the way f is defined.



1. On the interval $[-2, 0]$, note that f is continuous except at one point, t = 0. Hence, f is Riemann integrable by Theorem 2.3.1. Also, the function 2t is continuous on this interval and hence is also Riemann integrable. Then since f on $[-2, 0]$ and 2t match at all but one point on $[-2, 0]$, their Riemann integrals must match. Hence, if t is in $[-2, 0]$, we compute F as follows:

$$F(t) = \int_{-2}^{t} f(s)\, ds = \int_{-2}^{t} 2s\, ds = s^2 \Big|_{-2}^{t} = t^2 - (-2)^2 = t^2 - 4$$

2. On the interval [0, 5], note that f is continuous except at one point, t = 0. Hence, f is Riemann integrable by Theorem 2.3.1. Also, the function $2 + (1/5)t^2$ is continuous on this interval and so is also Riemann integrable. Then since f on [0, 5] and $2 + (1/5)t^2$ match at all but one point on [0, 5], their Riemann integrals must match. Hence, if t is in [0, 5], we compute F as follows:

$$\begin{aligned}
F(t) &= \int_{-2}^{t} f(s)\, ds \\
&= \int_{-2}^{0} f(s)\, ds + \int_{0}^{t} f(s)\, ds \\
&= \int_{-2}^{0} 2s\, ds + \int_{0}^{t} \left(2 + (1/5)\,s^2\right) ds \\
&= s^2 \Big|_{-2}^{0} + \left(2s + (1/15)\,s^3\right) \Big|_{0}^{t} \\
&= -4 + 2t + t^3/15
\end{aligned}$$

Thus, we have found that

$$F(t) = \begin{cases} t^2 - 4, & -2 \le t < 0 \\ -4 + 2t + t^3/15, & 0 < t \le 5 \end{cases}$$

As before, we didn’t define F at t = 0 yet. Since f is Riemann Integrable on $[-2, 5]$, we know from the Fundamental Theorem of Calculus, Theorem 2.3.2, that F must be continuous. F is clearly continuous on either side of 0 and we note that $\lim_{t \to 0^-} F(t)$, which is $F(0^-)$, is $-4$, which is exactly the value of $F(0^+)$. Hence, F is indeed continuous at 0 and we can write

$$F(t) = \begin{cases} t^2 - 4, & -2 \le t \le 0 \\ -4 + 2t + t^3/15, & 0 \le t \le 5 \end{cases}$$

What about the differentiability of F? The Fundamental Theorem of Calculus guarantees that F has a derivative at each point where f is continuous and at those points $F'(t) = f(t)$. Hence, we know this is true at all t except 0. Note at those t, we find

$$F'(t) = \begin{cases} 2t, & -2 \le t < 0 \\ 2 + (1/5)\,t^2, & 0 < t \le 5 \end{cases}$$

which is exactly what we expect. However, when we look at the one sided derivatives, we find $F'(0^-) = 0$ and $F'(0^+) = 2$. Hence, since the right and left hand derivatives do not match, we see $F'(0)$ does not exist. Finally, note F is not the antiderivative of f on $[-2, 5]$ because of this mismatch.
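The mismatch of the one-sided derivatives in the jump example can also be seen numerically. This is a minimal sketch using the closed form for F from this section; the step size `h` is an arbitrary choice of ours.

```python
def F(t):
    """Closed form of F for the jump example on [-2, 5]."""
    return t * t - 4.0 if t <= 0 else -4.0 + 2.0 * t + t ** 3 / 15.0

def one_sided_quotients(G, c, h):
    """Left and right difference quotients of G at c with step h."""
    left = (G(c) - G(c - h)) / h
    right = (G(c + h) - G(c)) / h
    return left, right

h = 1e-6
left, right = one_sided_quotients(F, 0.0, h)
print(left, right)  # left is near 0, right is near 2
```

The left quotient tracks the left limit of f at 0 and the right quotient tracks the right limit, so F has a corner at 0 exactly because f jumps there.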

2.4.3 Homework

Exercise 2.4.1. Compute $F(t) = \int_{-3}^{t} f(s)\, ds$ for

$$f(t) = \begin{cases} 3t, & -3 \le t < 0 \\ 6, & t = 0 \\ (1/6)\,t^2, & 0 < t \le 6 \end{cases}$$

1. Graph f and F carefully labeling all interesting points.

2. Verify that F is continuous and differentiable at all points but $F'(0)$ does not match $f(0)$ and so F is not the antiderivative of f on $[-3, 6]$.

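Exercise 2.4.2. Compute $F(t) = \int_{2}^{t} f(s)\, ds$ for

$$f(t) = \begin{cases} -2t, & 2 \le t < 5 \\ 12, & t = 5 \\ 3t - 25, & 5 < t \le 10 \end{cases}$$

1. Graph f and F carefully labeling all interesting points.

2. Verify that F is continuous and differentiable at all points but $F'(5)$ does not match $f(5)$ and so F is not the antiderivative of f on [2, 10].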

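Exercise 2.4.3. Compute $F(t) = \int_{-3}^{t} f(s)\, ds$ for

$$f(t) = \begin{cases} 3t, & -3 \le t < 0 \\ 6, & t = 0 \\ (1/6)\,t^2 + 2, & 0 < t \le 6 \end{cases}$$

1. Graph f and F carefully labeling all interesting points.

2. Verify that F is continuous and differentiable at all points except 0 and so F is not the antiderivative of f on $[-3, 6]$.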



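Exercise 2.4.4. Compute $F(t) = \int_{2}^{t} f(s)\, ds$ for

$$f(t) = \begin{cases} -2t, & 2 \le t < 5 \\ 12, & t = 5 \\ 3t, & 5 < t \le 10 \end{cases}$$

1. Graph f and F carefully labeling all interesting points.

2. Verify that F is continuous and differentiable at all points except 5 and so F is not the antiderivative of f on [2, 10].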


Chapter 3

Functions Of Bounded Variation

Now that we have seen a quick overview of what Riemann Integration entails, let’s go back and look at it very carefully. This will enable us to extend it to a more general form of integration called Riemann-Stieltjes. From what we already know about Riemann integrals, the Riemann integral is a mapping φ which is linear and whose domain is some subspace of the vector space of all bounded functions. Let B[a, b] denote this vector space, which is a normed linear space using the usual infinity norm. The set of all Riemann Integrable functions can be denoted by the symbol RI[a, b] and we know it is a subspace of B[a, b]. We also know that the subspace C[a, b] of all continuous functions on [a, b] is contained in RI[a, b]. In fact, if PC[a, b] is the set of all functions on [a, b] that are piecewise continuous, then PC[a, b] is also a vector subspace contained in RI[a, b]. Hence, we know $\phi : RI[a, b] \subseteq B[a, b] \to \mathbb{R}$ is a linear functional on the subspace RI[a, b]. Also, if f is not zero, then

$$\frac{\left| \int_a^b f(t)\, dt \right|}{\| f \|_\infty} \le \frac{\int_a^b | f(t) |\, dt}{\| f \|_\infty} \le \frac{\int_a^b \| f \|_\infty\, dt}{\| f \|_\infty} = b - a$$

Thus, we see that $\| \phi \|_{op}$ is finite and φ is a bounded linear functional on a subspace of B[a, b] if we use the infinity norm on RI[a, b]. But of course, we can choose other norms. There are clearly many functions in B[a, b] that do not fit nicely into the development process for the Riemann Integral. So let NI[a, b] denote a new subspace of functions which contains RI[a, b]. We know that the Riemann integral satisfies an important idea in analysis called limit interchange. That is, if a sequence of functions $f_n$ from RI[a, b] converges in infinity norm to f, then the following facts hold:

1. f is also in RI[a, b]

2. the classic limit interchange holds:

$$\lim_{n \to \infty} \int_a^b f_n(t)\, dt = \int_a^b \left( \lim_{n \to \infty} f_n(t) \right) dt$$

We can say this more abstractly as this: if $f_n \to f$ in $\| \cdot \|_\infty$ in RI[a, b], then f remains in RI[a, b] and

$$\lim_{n \to \infty} \phi(f_n) = \phi\left( \lim_{n \to \infty} f_n \right)$$

But if we wanted to extend φ to the larger subspace NI[a, b] in such a way that it remained a bounded linear functional, we would also want to know what kind of sequence convergence we should use in order for the interchange ideas to work. There are lots of questions:

1. Do we need to impose a norm on our larger subspace NI[a, b]?

2. Can we characterize the subspace NI[a, b] in some fashion?

3. If the extension is called $\hat{\phi}$, we want to make sure that $\hat{\phi}$ is exactly φ when we restrict our attention to functions in RI[a, b]

Also, do we have to develop integration only on finite intervals [a, b] of $\mathbb{R}$? How do we even extend traditional Riemann integration to unbounded intervals of $\mathbb{R}$? All of these questions will be answered in the upcoming chapters, but first we will see how far we can go with the traditional Riemann approach. We will also see where the Riemann integral approach breaks down and makes us start to think of more general tools so that we can get our work done.
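The operator norm estimate from the start of this chapter, that the absolute value of the integral of f over [a, b] is at most (b − a) times the infinity norm of f, has an exact discrete analogue that is easy to test. In the sketch below, both the integral and the sup norm are replaced by their values on the midpoint nodes of a uniform grid; this sampling is our simplification, so the code illustrates the inequality rather than proving it. The names `midpoint_nodes` and `check_bound` are hypothetical.

```python
import math

def midpoint_nodes(a, b, n):
    """Midpoints of the n equal subintervals of [a, b]."""
    h = (b - a) / n
    return [a + (k + 0.5) * h for k in range(n)]

def check_bound(f, a, b, n=1000):
    """Return (|midpoint sum|, (b - a) * sampled sup norm) for f on [a, b]."""
    values = [f(x) for x in midpoint_nodes(a, b, n)]
    integral = (b - a) / n * sum(values)
    sup_norm = max(abs(v) for v in values)
    return abs(integral), (b - a) * sup_norm

a, b = -1.0, 2.0
results = [check_bound(g, a, b) for g in (math.sin, math.exp, lambda x: x * x - 1.0)]
for lhs, rhs in results:
    print(lhs, "<=", rhs)
```

In each row the left number is no larger than the right one, which is the discrete version of the statement that φ is bounded with operator norm at most b − a.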

3.1 Partitions

Definition 3.1.1 (Partition). A partition of the finite interval [a, b] is a finite collection of points, $\{x_0, \ldots, x_n\}$, ordered so that $a = x_0 < x_1 < \cdots < x_n = b$. We denote the partition by π and call each point $x_i$ a partition point. For each $j = 1, \ldots, n$, we let $\Delta x_j = x_j - x_{j-1}$. The collection of all finite partitions of [a, b] is denoted Π[a, b].

Definition 3.1.2 (Partition Refinements). The partition $\pi_1 = \{y_0, \ldots, y_m\}$ is said to be a refinement of the partition $\pi_2 = \{x_0, \ldots, x_n\}$ if every partition point $x_j \in \pi_2$ is also in $\pi_1$. If this is the case, then we write $\pi_2 \preceq \pi_1$, and we say that $\pi_1$ is finer than $\pi_2$ or $\pi_2$ is coarser than $\pi_1$.


Definition 3.1.3 (Common Refinement). Given $\pi_1, \pi_2 \in \Pi[a, b]$, there is a partition $\pi_3 \in \Pi[a, b]$ which is formed by taking the union of $\pi_1$ and $\pi_2$ and using common points only once. We call this partition the common refinement of $\pi_1$ and $\pi_2$ and denote it by $\pi_3 = \pi_1 \vee \pi_2$.

Comment 3.1.1. The relation $\preceq$ is a partial ordering of Π[a, b]. It is not a total ordering, since not all partitions are comparable. There is a coarsest partition, also called the trivial partition. It is given by $\pi_0 = \{a, b\}$. We may also consider uniform partitions of order k. Let $h = (b - a)/k$. Then $\pi = \{x_0 = a,\ x_0 + h,\ x_0 + 2h,\ \ldots,\ x_{k-1} = x_0 + (k - 1)h,\ x_k = b\}$.

Proposition 3.1.1 (Refinements and Common Refinements). If $\pi_1, \pi_2 \in \Pi[a, b]$, then $\pi_1 \preceq \pi_2$ if and only if $\pi_1 \vee \pi_2 = \pi_2$.

Proof 3.1.1. If $\pi_1 \preceq \pi_2$, then $\pi_1 = \{x_0, \ldots, x_p\} \subset \{y_0, \ldots, y_q\} = \pi_2$. Thus, $\pi_1 \cup \pi_2 = \pi_2$, and we have $\pi_1 \vee \pi_2 = \pi_2$. Conversely, suppose $\pi_1 \vee \pi_2 = \pi_2$. By definition, every point of $\pi_1$ is also a point of $\pi_1 \vee \pi_2 = \pi_2$. So, $\pi_1 \preceq \pi_2$.

Definition 3.1.4 (The Gauge or Norm of a Partition). For $\pi = \{x_0, \ldots, x_p\} \in \Pi[a, b]$, we define the gauge of π, denoted $\|\pi\|$, by $\|\pi\| = \max\{\Delta x_j : 1 \le j \le p\}$.
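The definitions of this section translate directly into a few lines of code. The following sketch represents a partition as a sorted tuple of numbers; this representation and the function names are our own choices for illustration. It checks Proposition 3.1.1 on a small example and shows that passing to a common refinement never increases the gauge.

```python
def common_refinement(p1, p2):
    """Union of the two point sets, each common point used once, sorted."""
    return tuple(sorted(set(p1) | set(p2)))

def is_coarser(p1, p2):
    """True when every point of p1 is also a point of p2 (p2 refines p1)."""
    return set(p1) <= set(p2)

def gauge(p):
    """Length of the longest subinterval of the partition p."""
    return max(right - left for left, right in zip(p, p[1:]))

p1 = (0.0, 1.0, 2.0, 4.0)
p2 = (0.0, 1.0, 2.0, 3.0, 4.0)   # refines p1
p3 = (0.0, 0.5, 4.0)             # not comparable with p1

print(is_coarser(p1, p2), common_refinement(p1, p2) == p2)
print(is_coarser(p1, p3), is_coarser(p3, p1))
print(gauge(p1), gauge(common_refinement(p1, p3)))
```

Here `p1` and `p3` are not comparable, which is a concrete instance of the remark that the refinement relation is only a partial ordering.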

3.1.1 Homework

Exercise 3.1.1. Prove that the relation $\preceq$ is a partial ordering of Π[a, b].

Exercise 3.1.2. Fix $\pi_1 \in \Pi[a, b]$. The set $C(\pi_1) = \{\pi \in \Pi[a, b] : \pi_1 \preceq \pi\}$ is called the core determined by $\pi_1$. It is the set of all partitions of [a, b] that contain (or are finer than) $\pi_1$.

1. Prove that if $\pi_1 \preceq \pi_2$, then $C(\pi_2) \subset C(\pi_1)$.

2. Prove that if $\|\pi_1\| < \epsilon$, then $\|\pi\| < \epsilon$ for all $\pi \in C(\pi_1)$.

3. Prove that if $\|\pi_1\| < \epsilon$ and $\pi_2 \in \Pi[a, b]$, then $\|\pi_1 \vee \pi_2\| < \epsilon$.

3.2 Monotone Functions

In our investigations of how monotone functions behave, we will need two fundamental facts about infi-

mum and supremum of a set of numbers which are given in Lemma 3.2.1 and Lemma 3.2.2.



Lemma 3.2.1 (The Infimum Tolerance Lemma). Let S be a nonempty set of numbers that is bounded below. Then given any tolerance $\epsilon > 0$, there is at least one element s in S so that

$$\inf(S) \le s < \inf(S) + \epsilon$$

Proof 3.2.1. This is an easy proof by contradiction. Assume there is some $\epsilon > 0$ so that no matter what s from S we choose, we have

$$s \ge \inf(S) + \epsilon$$

This says that $\inf(S) + \epsilon$ is a lower bound for S and so, since $\inf(S)$ is the greatest lower bound, $\inf(S)$ must be bigger than or equal to this lower bound. But this is clearly not possible because $\epsilon > 0$. So the assumption that such a tolerance ε exists is wrong and the conclusion follows.

and

Lemma 3.2.2 (The Supremum Tolerance Lemma). Let T be a nonempty set of numbers that is bounded above. Then given any tolerance $\epsilon > 0$, there is at least one element t in T so that

$$\sup(T) - \epsilon < t \le \sup(T)$$

Proof 3.2.2. This again is an easy proof by contradiction and we include it for completeness. Assume there is some $\epsilon > 0$ so that no matter what t from T we choose, we have

$$t \le \sup(T) - \epsilon$$

This says that $\sup(T) - \epsilon$ is an upper bound for T and so, since $\sup(T)$ is the least upper bound, $\sup(T)$ must be less than or equal to this upper bound. But this is clearly not possible because $\epsilon > 0$. So the assumption that such a tolerance ε exists is wrong and the conclusion must follow.

We are now in a position to discuss carefully monotone functions and other functions built from them.

We follow discussions in (Douglas (2) 1996) at various places.

Definition 3.2.1 (Monotone Functions). A real-valued function $f : [a, b] \to \mathbb{R}$ is said to be increasing (respectively, strictly increasing) if $x_1, x_2 \in [a, b]$, $x_1 < x_2 \Rightarrow f(x_1) \le f(x_2)$ (respectively, $f(x_1) < f(x_2)$). Similar definitions hold for decreasing and strictly decreasing functions.

Theorem 3.2.3 (A Monotone Function Estimate). Let f be increasing on [a, b], and let $\pi = \{x_0, \ldots, x_p\}$ be in Π[a, b]. For any $c \in [a, b]$, define

$$f(c^+) = \lim_{x \to c^+} f(x) \quad \text{and} \quad f(c^-) = \lim_{x \to c^-} f(x),$$

where we define $f(a^-) = f(a)$ and $f(b^+) = f(b)$. Then

$$\sum_{j=0}^{p} \left[ f(x_j^+) - f(x_j^-) \right] \le f(b) - f(a).$$

Proof 3.2.3. First, we note that $f(x^+)$ and $f(x^-)$ always exist. The proof of this is straightforward. For $x \in (a, b]$, let $T_x = \{f(y) : a \le y < x\}$. Then $T_x$ is bounded above by $f(x)$, since f is monotone increasing. Hence, $T_x$ has a well-defined supremum. Let $\epsilon > 0$ be given. Then, using the Supremum Tolerance Lemma, Lemma 3.2.2, there is a $y^* \in [a, x)$ such that $\sup T_x - \epsilon < f(y^*) \le \sup T_x$. For any $y \in (y^*, x)$, we have $f(y^*) \le f(y)$ since f is increasing. Thus, $0 \le \sup T_x - f(y) \le \sup T_x - f(y^*) < \epsilon$ for $y \in (y^*, x)$. Let $\delta = (x - y^*)/2$. Then, if $0 < x - y < \delta$, $\sup T_x - f(y) < \epsilon$. Since ε was arbitrary, this shows that $\lim_{y \to x^-} f(y) = \sup T_x$. The proof for $f(x^+)$ is similar, using the Infimum Tolerance Lemma, Lemma 3.2.1. You should be able to see that $f(x^-)$ is less than or equal to $f(x^+)$ for all x. We will define $f(a^-) = f(a)$ and $f(b^+) = f(b)$ since f is not defined prior to a or after b.

To prove the stated result holds, first choose an arbitrary $y_j \in (x_j, x_{j+1})$ for each $j = 0, \ldots, p - 1$. Then, since f is increasing, for each $j = 1, \ldots, p - 1$, we have $f(y_{j-1}) \le f(x_j^-) \le f(x_j^+) \le f(y_j)$. Thus,

$$f(x_j^+) - f(x_j^-) \le f(y_j) - f(y_{j-1}). \tag{3.1}$$

We also have $f(a) \le f(a^+) \le f(y_0)$ and $f(y_{p-1}) \le f(b^-) \le f(b)$. Thus, it follows that

$$\begin{aligned}
\sum_{j=0}^{p} \left( f(x_j^+) - f(x_j^-) \right) &= f(x_0^+) - f(x_0^-) + \sum_{j=1}^{p-1} \left[ f(x_j^+) - f(x_j^-) \right] + f(x_p^+) - f(x_p^-) \\
&\le f(a^+) - f(a^-) + \sum_{j=1}^{p-1} \left[ f(y_j) - f(y_{j-1}) \right] + f(b^+) - f(b^-)
\end{aligned}$$

using Equation 3.1 and replacing $x_0$ by a and $x_p$ by b. We then note the sum on the right hand side collapses to $f(y_{p-1}) - f(y_0)$. Finally, since $f(a^-) = f(a)$ and $f(b^+) = f(b)$, we obtain

$$\begin{aligned}
\sum_{j=0}^{p} \left( f(x_j^+) - f(x_j^-) \right) &\le f(a^+) - f(a) + f(y_{p-1}) - f(y_0) + f(b) - f(b^-) \\
&\le f(y_0) - f(a) + f(y_{p-1}) - f(b^-) + f(b) - f(y_0) \\
&= f(b) - f(a) + f(y_{p-1}) - f(b^-).
\end{aligned}$$

But $f(y_{p-1}) - f(b^-) \le 0$, so

$$\sum_{j=0}^{p} \left( f(x_j^+) - f(x_j^-) \right) \le f(b) - f(a).$$

Theorem 3.2.4 (A Monotone Function Has A Countable Number of Discontinuities). If f is monotone on [a, b], the set of discontinuities of f is countable.

Proof 3.2.4. For concreteness, we assume f is monotone increasing. The decreasing case is shown similarly. Since f is monotone increasing, the only types of discontinuities it can have are jump discontinuities. If $x \in [a, b]$ is a point of discontinuity, then the size of the jump is given by $f(x^+) - f(x^-)$. Define $D_k = \{x \in (a, b) : f(x^+) - f(x^-) > 1/k\}$, for each integer $k \ge 1$. We want to show that $D_k$ is finite.

Select any finite subset S of $D_k$ and label the points in S by $x_1, \ldots, x_p$ with $x_1 < x_2 < \cdots < x_p$. If we add the points $x_0 = a$ and $x_{p+1} = b$, these points determine a partition π. Hence, by Theorem 3.2.3, we know that

$$\sum_{j=1}^{p} \left[ f(x_j^+) - f(x_j^-) \right] \le \sum_{\pi} \left[ f(x_j^+) - f(x_j^-) \right] \le f(b) - f(a).$$

But each jump satisfies $f(x_j^+) - f(x_j^-) > 1/k$ and there are a total of p such points in S. Thus, we must have

$$p/k < \sum_{j=1}^{p} \left[ f(x_j^+) - f(x_j^-) \right] \le f(b) - f(a).$$

Hence, $p/k < f(b) - f(a)$, implying that $p < k[f(b) - f(a)]$. Thus, the cardinality of S is bounded above by the fixed constant $k[f(b) - f(a)]$. Let N be the first positive integer bigger than or equal to $k[f(b) - f(a)]$. If the cardinality of $D_k$ were infinite, then there would be a subset T of $D_k$ with cardinality N + 1. The argument above would then tell us that $N + 1 \le k[f(b) - f(a)] \le N$, giving a contradiction. Thus, $D_k$ must be a finite set. This means that $D = \cup_{k=1}^{\infty} D_k$ is countable also.

Finally, if x is a point where f is not continuous, then $f(x^+) - f(x^-) > 0$. Hence, there is a positive integer $k_0$ so that $f(x^+) - f(x^-) > 1/k_0$. This means x is in $D_{k_0}$ and so is in D.
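To see the counting argument of Proof 3.2.4 on a concrete case, consider a hypothetical increasing pure jump function on [0, 1] whose n-th jump has size $2^{-n}$ (located, say, at $x_n = 1 - 1/(n+1)$), so that its total rise is the sum of the jumps. This example function, the truncation at 50 jumps, and the names below are our assumptions, not part of the text. The sketch counts the jumps larger than 1/k and compares the count with the bound $k[f(b) - f(a)]$.

```python
def jump_size(n):
    """Size of the n-th jump of the hypothetical pure jump function."""
    return 2.0 ** (-n)

N_MAX = 50  # truncation: only the first 50 jumps are modelled
total_rise = sum(jump_size(n) for n in range(1, N_MAX + 1))  # plays f(b) - f(a)

def size_of_D_k(k):
    """Number of jumps whose size exceeds 1/k."""
    return sum(1 for n in range(1, N_MAX + 1) if jump_size(n) > 1.0 / k)

counts = {k: size_of_D_k(k) for k in (1, 2, 10, 1000)}
for k, count in counts.items():
    print(k, count, k * total_rise)
```

Each set of large jumps is finite and its size stays below $k[f(b) - f(a)]$, even though the function has infinitely many discontinuities in total.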



Definition 3.2.2 (The Discontinuity Set Of A Monotone Function). Let f be monotone increasing on [a, b]. We will let S denote the set of discontinuities of f on [a, b]. We know this set is countable by Theorem 3.2.4 so we can label it as $S = \{x_j\}$. Define functions u and v on [a, b] by

$$u(x) = \begin{cases} 0, & x = a \\ f(x) - f(x^-), & x \in (a, b] \end{cases}
\qquad
v(x) = \begin{cases} f(x^+) - f(x), & x \in [a, b) \\ 0, & x = b \end{cases}$$

In Figure 3.1, we show a monotone increasing function with several jumps. You should be able to

compute u and v easily at these jumps.

There are several very important points to make about these functions u and v which are listed below.

Comment 3.2.1.

1. Note that u(x) is the left-hand jump of f at x ∈ (a, b] and v(x) is the right-hand jump of f at

x ∈ [a, b) .

2. Both u and v are non-negative functions and u(x) + v(x) = f(x+)− f(x−) is the total jump in f

at x, for x ∈ (a, b) .

3. Moreover, f is continuous at x from the left if and only if u(x) = 0, and f is continuous from the

right at x if and only if v(x) = 0 .

4. Finally, f is continuous on [a, b] if and only if u(x) = v(x) = 0 on [a, b] .

Now, let $S_0$ be any finite subset of S. From Theorem 3.2.3, we have

$$\sum_{x \in S_0} \left[ f(x^+) - f(x^-) \right] \le f(b) - f(a)$$

This implies

$$\sum_{x \in S_0} \left[ u(x) + v(x) \right] \le f(b) - f(a), \quad \text{that is,} \quad \sum_{x \in S_0} u(x) + \sum_{x \in S_0} v(x) \le f(b) - f(a).$$

The above tells us that the set of numbers we get by evaluating this sum over finite subsets of S is bounded above by the number $f(b) - f(a)$. Hence, $\sum_{j=1}^{n} u(x_j)$ and $\sum_{j=1}^{n} v(x_j)$ are bounded above by $f(b) - f(a)$ for all n. Thus, these sets of numbers have a finite supremum. But u and v are non-negative functions, so these sums form monotonically increasing sequences. Hence, these sequences converge to their supremum which we label as $\sum_{j=1}^{\infty} u(x_j)$ and $\sum_{j=1}^{\infty} v(x_j)$.

[Figure 3.1: The Function F(x). A generic curve f on the interval [a, b] which is always positive. We show four points of discontinuity $x_1$, $x_2$, $x_3$ and $x_4$, with $x_4 = b$. Note $u(x_1) = 0.5$, $u(x_2) = 0.5$, $u(x_3) = 1$ and $u(x_4) = 1$. Also, we see $v(x_1) = 0.5$, $v(x_2) = 0.5$, $v(x_3) = 1$ and $v(x_4) = 0$.]

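Now, consider a nonempty subset, T, of [a, b], and suppose $F \subset S \cap T$ is finite. Then, by the arguments already presented, we know that

$$\sum_{x_j \in F} u(x_j) + \sum_{x_j \in F} v(x_j) \le f(b) - f(a). \tag{3.2}$$

This implies

$$\sum_{x_j \in F} u(x_j) \le f(b) - f(a) \quad \text{and} \quad \sum_{x_j \in F} v(x_j) \le f(b) - f(a).$$

From this, it follows that

$$\sum_{x_j \in S \cap T} u(x_j) = \sup \left\{ \sum_{x_j \in F} u(x_j) : F \subset S \cap T,\ F \text{ finite} \right\}.$$

Likewise, we also have

$$\sum_{x_j \in S \cap T} v(x_j) = \sup \left\{ \sum_{x_j \in F} v(x_j) : F \subset S \cap T,\ F \text{ finite} \right\}.$$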

Definition 3.2.3 (The Saltus Function Associated With A Monotone Function). For $x, y \in [a, b]$ with $x < y$, define

$$S[x, y] = S \cap [x, y], \quad S[x, y) = S \cap [x, y), \quad S(x, y] = S \cap (x, y] \quad \text{and} \quad S(x, y) = S \cap (x, y)$$

Then, define the function $S_f : [a, b] \to \mathbb{R}$ by

$$S_f(x) = \begin{cases} f(a), & x = a \\ f(a) + \sum_{x_j \in S(a, x]} u(x_j) + \sum_{x_j \in S[a, x)} v(x_j), & a < x \le b \end{cases}$$

We call $S_f$ the Saltus Function associated with the monotone increasing function f.

Intuitively, $S_f(x)$ is the sum of all of the jumps (i.e. discontinuities) up to and including the left-hand jump at x. In essence, it is a generalization of the idea of a step function.

Theorem 3.2.5 (Properties of The Saltus Function). Let $f : [a, b] \to \mathbb{R}$ be monotone increasing. Then

1. $S_f$ is monotone increasing on [a, b];

2. if $x < y$, with $x, y \in [a, b]$, then $0 \le S_f(y) - S_f(x) \le f(y) - f(x)$;

3. $S_f$ is continuous on $S^c \cap [a, b]$ where $S^c$ is the complement of the set S.

Proof 3.2.5. Suppose $x < y$. Then

$$\begin{aligned}
S_f(y) - S_f(x) &= \sum_{x_j \in S(a, y]} u(x_j) - \sum_{x_j \in S(a, x]} u(x_j) + \sum_{x_j \in S[a, y)} v(x_j) - \sum_{x_j \in S[a, x)} v(x_j) \\
&= \sum_{x_j \in S(x, y]} u(x_j) + \sum_{x_j \in S[x, y)} v(x_j) \\
&\ge 0.
\end{aligned}$$

This proves the first statement. Now, suppose $x, y \in [a, b]$ with $x < y$. Let F be a subset of [a, b] that consists of a finite number of points of the form $F = \{x_0 = x, x_1, \ldots, x_p = y\}$, such that $x = x_0 < x_1 < \cdots < x_p = y$. In other words, F is a partition of [x, y]. Then, by Equation 3.2 we know

$$\sum_{x_j \in F \cap S(x, y]} u(x_j) + \sum_{x_j \in F \cap S[x, y)} v(x_j) \le f(y) - f(x).$$

Taking the supremum of the left-hand side over all such sets, F, we obtain

$$\sum_{x_j \in S(x, y]} u(x_j) + \sum_{x_j \in S[x, y)} v(x_j) \le f(y) - f(x).$$

But by the remarks made in the first part of this proof, this sum is exactly $S_f(y) - S_f(x)$. We conclude that $S_f(y) - S_f(x) \le f(y) - f(x)$ as desired.

Finally, let x be a point in $S^c \cap [a, b]$. Then f is continuous at x, so, given $\epsilon > 0$, there is a $\delta > 0$ such that $y \in [a, b]$ and $|x - y| < \delta \Rightarrow |f(x) - f(y)| < \epsilon$. But by the second part of this proof, we have $|S_f(x) - S_f(y)| \le |f(y) - f(x)| < \epsilon$. Thus, $S_f$ is continuous at x.

So, why do we care about Sf? The function Sf measures, in a sense, the degree to which f fails to be

continuous. If we subtract Sf from f , we would be subtracting its discontinuities, resulting in a continuous

function that behaves similarly to f .

Definition 3.2.4 (The Continuous Part of A Monotone Function). Define $f_c : [a, b] \to \mathbb{R}$ by $f_c(x) = f(x) - S_f(x)$.

Theorem 3.2.6 (Properties of $f_c$).

1. $f_c$ is monotone on [a, b].

2. $f_c$ is continuous also.

Proof 3.2.6. The proof that $f_c$ is monotone is left to you as an exercise with this generous hint:

Hint 3.2.1. Note if $x < y$ in [a, b], then

$$f_c(y) - f_c(x) = \left( f(y) - f(x) \right) - \left( S_f(y) - S_f(x) \right).$$

The right hand side is non negative by Theorem 3.2.5.

To prove $f_c$ is continuous is a bit tricky. We will do most of the proof but leave a few parts for you to fill in.

Pick any x in [a, b) and any positive ε. Since $f(x^+)$ exists, there is a positive δ so that $0 \le f(y) - f(x^+) < \epsilon$ if $x < y < x + \delta$. Thus, for such y,

$$\begin{aligned}
f_c(y) - f_c(x) &= \left[ f(y) - S_f(y) \right] - \left[ f(x) - S_f(x) \right] \\
&= f(y) - \left( \sum_{x_j \in S(a, y]} u(x_j) + \sum_{x_j \in S[a, y)} v(x_j) \right) - f(x) + \left( \sum_{x_j \in S(a, x]} u(x_j) + \sum_{x_j \in S[a, x)} v(x_j) \right).
\end{aligned}$$

Recall, $S(a, y] = S(a, x] \cup S(x, y]$ and $S[a, y) = S[a, x) \cup S[x, y)$. So,

$$f_c(y) - f_c(x) = f(y) - \left( \sum_{x_j \in S(x, y]} u(x_j) + \sum_{x_j \in S[x, y)} v(x_j) \right) - f(x)$$

Now, the argument reduces to two cases:

1. if y and x are points of discontinuity, we get

$$\begin{aligned}
f_c(y) - f_c(x) &= f(y) - u(y) - \left( \sum_{x_j \in S(x, y)} u(x_j) + \sum_{x_j \in S(x, y)} v(x_j) \right) - f(x) - v(x) \\
&= f(y) - \left( f(y) - f(y^-) \right) - \left( \sum_{x_j \in S(x, y)} u(x_j) + \sum_{x_j \in S(x, y)} v(x_j) \right) - f(x) - \left( f(x^+) - f(x) \right) \\
&\le f(y^-) - f(x^+) \\
&\le f(y) - f(x^+) < \epsilon
\end{aligned}$$

2. if either x or y (or both) is not a point of discontinuity, a similar argument holds.

Thus, we see $f_c$ is continuous from the right at this x. Now use a similar argument to show continuity from the left at x. Together, these arguments show $f_c$ is continuous at x.



3.2.1 Worked Out Example

Let’s define f on [0, 2] by

$$f(x) = \begin{cases} -2, & x = 0 \\ x^3, & 0 < x < 1 \\ 9/8, & x = 1 \\ x^4/4 + 1, & 1 < x < 2 \\ 7, & x = 2 \end{cases}$$

1. Find u and v

2. Find $S_f$

3. Find $f_c$

4. Following the discussion in Section 2.4 explain how to compute the Riemann Integral of f and find its value (yes, this is in the careful rigorous section and so this problem is a bit out of place, but we will be dotting all of our i’s and crossing all of our t’s soon enough!)

Solution 3.2.1. First, note $f(0^-) = -2$, $f(0) = -2$ and $f(0^+) = 0$ and so 0 is a point of discontinuity. Further, $f(1^-) = 1$, $f(1) = 9/8$ and $f(1^+) = 5/4$ giving another point of discontinuity at 1. Finally, since $f(2^-) = 5$, $f(2) = 7$ and $f(2^+) = 7$, there is a third point of discontinuity at 2. So, the set of discontinuities of f is $S = \{0, 1, 2\}$. Thus,

$$S(0, x] = \begin{cases} \emptyset, & 0 < x < 1 \\ \{1\}, & 1 \le x < 2 \\ \{1, 2\}, & x = 2 \end{cases}
\quad \text{and} \quad
S[0, x) = \begin{cases} \{0\}, & 0 < x \le 1 \\ \{0, 1\}, & 1 < x \le 2 \end{cases}$$

Also,

$$u(x) = \begin{cases} 0, & x = 0 \\ 0, & 0 < x < 1 \\ 9/8 - 1 = 1/8, & x = 1 \\ 0, & 1 < x < 2 \\ 7 - 5 = 2, & x = 2 \end{cases}
\quad \text{and} \quad
v(x) = \begin{cases} 0 - (-2) = 2, & x = 0 \\ 0, & 0 < x < 1 \\ 5/4 - 9/8 = 1/8, & x = 1 \\ 0, & 1 < x < 2 \\ 0, & x = 2 \end{cases}$$

Now, here

$$S_f(x) = \begin{cases} f(0) = -2, & x = 0 \\ f(0) + \sum_{x_j \in S(0, x]} u(x_j) + \sum_{x_j \in S[0, x)} v(x_j), & 0 < x \le 2 \end{cases}$$



Thus,

$$S_f(x) = \begin{cases} -2, & x = 0 \\ -2 + v(0) = -2 + 2 = 0, & 0 < x < 1 \\ -2 + u(1) + v(0) = -2 + 1/8 + 2 = 1/8, & x = 1 \\ -2 + u(1) + v(0) + v(1) = -2 + 1/8 + 2 + 1/8 = 1/4, & 1 < x < 2 \\ -2 + u(1) + u(2) + v(0) + v(1) = -2 + 1/8 + 2 + 2 + 1/8 = 9/4, & x = 2 \end{cases}$$

So, $S_f$ is the nice step function and $f_c = f - S_f$ gives

$$S_f(x) = \begin{cases} -2, & x = 0 \\ 0, & 0 < x < 1 \\ 1/8, & x = 1 \\ 1/4, & 1 < x < 2 \\ 9/4, & x = 2 \end{cases}
\quad \text{and} \quad
f_c(x) = \begin{cases} -2 - (-2) = 0, & x = 0 \\ x^3 - 0 = x^3, & 0 < x < 1 \\ 9/8 - 1/8 = 1, & x = 1 \\ x^4/4 + 1 - 1/4 = x^4/4 + 3/4, & 1 < x < 2 \\ 7 - 9/4 = 19/4, & x = 2 \end{cases}$$

We see $f_c$ is continuous on [0, 2]. Finally, we can compute the Riemann integral of f on [0, 2]. Let’s calculate $F(t) = \int_0^t f(x)\, dx$. This will have to be done in several parts because of the way f is defined.

1. On the interval [0, 1], note that f is continuous except at two points, x = 0 and x = 1. Hence, f is Riemann integrable by Theorem 2.3.1. Also, the function $x^3$ is continuous on this interval and so is also Riemann integrable. Then since f on [0, 1] and $x^3$ match at all but two points on [0, 1], their Riemann integrals must match. Hence, if t is in [0, 1], we compute F as follows:

$$F(t) = \int_0^t f(x)\, dx = \int_0^t x^3\, dx = \frac{x^4}{4} \Big|_0^t = t^4/4$$

2. On the interval [1, 2], note that f is continuous except at the two points, x = 1 and x = 2. Hence, f is Riemann integrable by Theorem 2.3.1. Also, the function $1 + x^4/4$ is continuous on this interval and so is also Riemann integrable. Then since f on [1, 2] and $1 + x^4/4$ match at all but two points on [1, 2], their Riemann integrals must match. Hence, if t is in [1, 2], we compute F as follows:

$$\begin{aligned}
F(t) &= \int_0^t f(x)\, dx \\
&= \int_0^1 f(x)\, dx + \int_1^t f(x)\, dx \\
&= \int_0^1 x^3\, dx + \int_1^t (1 + x^4/4)\, dx \\
&= \frac{x^4}{4} \Big|_0^1 + \left( x + \frac{x^5}{20} \right) \Big|_1^t \\
&= 1/4 + (t + t^5/20) - (1 + 1/20) \\
&= t^5/20 + t - 4/5
\end{aligned}$$

Thus, we have found that

$$F(t) = \begin{cases} t^4/4, & 0 \le t \le 1 \\ t^5/20 + t - 4/5, & 1 \le t \le 2 \end{cases}$$

In particular, the Riemann integral of f on [0, 2] is $F(2) = 32/20 + 2 - 4/5 = 14/5$.

Note, we know from the Fundamental Theorem of Calculus, Theorem 2.3.2, that F must be continuous. To check this at an interesting point such as t = 1, note F is clearly continuous on either side of 1 and we note that $\lim_{t \to 1^-} F(t)$, which is $F(1^-)$, is 1/4, which is exactly the value of $F(1^+)$. Hence, F is indeed continuous at 1!

What about the differentiability of F? The Fundamental Theorem of Calculus guarantees that F has a derivative at each point where f is continuous and at those points $F'(t) = f(t)$. Hence, we know this is true at all t except 0, 1 and 2 because these are points of discontinuity of f. $F'$ is nicely defined at 0 and 2 as a one sided derivative and at all other t save 1 by

$$F'(t) = \begin{cases} t^3, & 0 \le t < 1 \\ t^4/4 + 1, & 1 < t \le 2 \end{cases}$$

However, when we look at the one sided derivatives, we find $F'(0^+) = 0 \ne f(0) = -2$, $F'(2^-) = 5 \ne f(2) = 7$, and $F'(1^-) = 1$ and $F'(1^+) = 5/4$, giving that $F'(1)$ does not even exist. Thus, note F is not the antiderivative of f on [0, 2] because of this mismatch.
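The jump functions u and v, the saltus function $S_f$ and the continuous part $f_c$ of the worked example in Section 3.2.1 can be recomputed with exact rational arithmetic. In this sketch the one-sided limits at 0, 1 and 2 are entered by hand from the formulas for f, and the dictionary names `left_limit` and `right_limit` are ours; the code then applies Definitions 3.2.2 to 3.2.4 as written.

```python
from fractions import Fraction as Fr

def f(x):
    """The function of the worked example on [0, 2]; x should be a Fraction."""
    if x == 0:
        return Fr(-2)
    if x < 1:
        return x ** 3
    if x == 1:
        return Fr(9, 8)
    if x < 2:
        return x ** 4 / 4 + 1
    return Fr(7)

# One-sided limits at the three discontinuities, read off from the formulas,
# with the conventions f(0-) = f(0) and f(2+) = f(2).
left_limit = {Fr(0): Fr(-2), Fr(1): Fr(1), Fr(2): Fr(5)}
right_limit = {Fr(0): Fr(0), Fr(1): Fr(5, 4), Fr(2): Fr(7)}
S = sorted(left_limit)  # the discontinuity set {0, 1, 2}

def u(x):
    """Left-hand jump f(x) - f(x-); zero away from S."""
    return f(x) - left_limit[x] if x in left_limit else Fr(0)

def v(x):
    """Right-hand jump f(x+) - f(x); zero away from S."""
    return right_limit[x] - f(x) if x in right_limit else Fr(0)

def saltus(x):
    """S_f(x) = f(0) + sum of u over S(0, x] + sum of v over S[0, x)."""
    if x == 0:
        return f(Fr(0))
    return (f(Fr(0))
            + sum(u(s) for s in S if 0 < s <= x)
            + sum(v(s) for s in S if 0 <= s < x))

def f_c(x):
    """Continuous part f - S_f."""
    return f(x) - saltus(x)

for x in (Fr(0), Fr(1, 2), Fr(1), Fr(3, 2), Fr(2)):
    print(x, saltus(x), f_c(x))
```

The printed values of the saltus function are −2, 0, 1/8, 1/4 and 9/4, and the values of the continuous part at 1 and 2 agree with the one-sided limits of its formulas there, as continuity requires.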

3.2.2 Homework

Exercise 3.2.1. Prove fc is monotone.

Exercise 3.2.2. Let’s define f on [0, 2] by

$$f(x) = \begin{cases} -1, & x = 0 \\ x^2, & 0 < x < 1 \\ 7/4, & x = 1 \\ \sqrt{x + 3}, & 1 < x < 2 \\ 3, & x = 2 \end{cases}$$

1. Find u and v

2. Find $S_f$

3. Find $f_c$

4. Do a nice graph of u, v, f, $f_c$ and $S_f$

5. Following the discussion in Section 2.4 explain how to compute the Riemann Integral of f and find its value (yes, this is in the careful rigorous section and so this problem is a bit out of place, but we will be dotting all of our i’s and crossing all of our t’s soon enough!)

3.3 Functions of Bounded Variation

The next important topic for us is to consider the class of functions of bounded variation. We will develop this classically here, but in later chapters, we will define similar concepts using abstract measures. We are going to find out that functions of bounded variation can also be represented as the difference of two increasing functions and that their classical derivative exists everywhere except on a set of measure zero (yes, that idea is not defined yet, but I believe in teasers!). Let’s get on with it.

Definition 3.3.1 (Functions Of Bounded Variation). Let $f : [a, b] \to \mathbb{R}$ and let $\pi \in \Pi[a, b]$ be given by $\pi = \{x_0 = a, x_1, \ldots, x_p = b\}$. Define $\Delta f_j = f(x_j) - f(x_{j-1})$ for $1 \le j \le p$. If there exists a positive real number, M, such that

$$\sum_{\pi} | \Delta f_j | \le M$$

for all $\pi \in \Pi[a, b]$, then we say that f is of bounded variation on [a, b]. The set of all functions of bounded variation on the interval [a, b] is denoted by the symbol BV[a, b].

Comment 3.3.1.

1. Note saying a function f is of bounded variation is equivalent to saying the set $\left\{ \sum_{\pi} | \Delta f_j | : \pi \in \Pi[a, b] \right\}$ is bounded, and, therefore, has a supremum.

2. Also, if f is of bounded variation on [a, b], then, for any $x \in (a, b)$, the set $\{a, x, b\}$ is a partition of [a, b]. Hence, there exists $M > 0$ such that $| f(x) - f(a) | + | f(b) - f(x) | \le M$. But this implies

$$| f(x) | - | f(a) | \le | f(x) - f(a) | + | f(b) - f(x) | \le M$$

This tells us that $| f(x) | \le | f(a) | + M$. Since our choice of x in [a, b] was arbitrary, this shows that f is bounded, i.e. $\| f \|_\infty < \infty$.

We can state the comments above formally as Theorem 3.3.1.

Theorem 3.3.1 (Functions Of Bounded Variation Are Bounded). If f is of bounded variation on [a, b], then f is bounded on [a, b].

Theorem 3.3.2 (Monotone Functions Are Of Bounded Variation). If f is monotone on [a, b], then $f \in BV[a, b]$.

Proof 3.3.1. As usual, we assume, for concreteness, that f is monotone increasing. Let $\pi \in \Pi[a, b]$. Hence, we can write $\pi = \{x_0 = a, x_1, \ldots, x_{p-1}, x_p = b\}$. Then

$$\sum_{\pi} | \Delta f_j | = \sum_{\pi} | f(x_j) - f(x_{j-1}) |.$$

Since f is monotone increasing, the absolute value signs are unnecessary, so that

$$\sum_{\pi} | \Delta f_j | = \sum_{\pi} \Delta f_j = \sum_{\pi} \left( f(x_j) - f(x_{j-1}) \right).$$

But this is a telescoping sum, so

$$\sum_{\pi} \Delta f_j = f(x_p) - f(x_0) = f(b) - f(a).$$

Since the partition π was arbitrary, it follows that $\sum_{\pi} \Delta f_j \le f(b) - f(a)$ for all $\pi \in \Pi[a, b]$. This implies that $f \in BV[a, b]$, for if $f(b) > f(a)$, then we can simply let $M = f(b) - f(a)$. If $f(b) = f(a)$, then f must be constant, and we can let $M = f(b) - f(a) + 1 = 1$. In either case, $f \in BV[a, b]$.
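The telescoping argument in Proof 3.3.1 says that for an increasing function every partition sum equals $f(b) - f(a)$, no matter how the partition points are placed. The sketch below tests this for the exponential function on [0, 2] with randomly generated partitions; the choice of function, interval, seed and the helper names are ours.

```python
import math
import random

def variation_sum(f, partition):
    """Sum of |f(x_j) - f(x_{j-1})| over consecutive partition points."""
    return sum(abs(f(right) - f(left))
               for left, right in zip(partition, partition[1:]))

def random_partition(a, b, interior_points, rng):
    """Partition of [a, b] with randomly placed interior points.

    Repeated points are extremely unlikely and would only add zero terms.
    """
    interior = sorted(rng.uniform(a, b) for _ in range(interior_points))
    return [a] + interior + [b]

rng = random.Random(0)  # fixed seed so that the run is repeatable
a, b = 0.0, 2.0
rise = math.exp(b) - math.exp(a)
sums = [variation_sum(math.exp, random_partition(a, b, n, rng))
        for n in (1, 10, 100)]
print(rise, sums)
```

Up to rounding error, all three sums coincide with the total rise of the function, so the supremum over partitions is attained by every partition.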

Theorem 3.3.3 (Bounded Differentiable Implies Bounded Variation). Suppose $f \in C[a, b]$, f is differentiable on (a, b), and $\| f' \|_\infty < \infty$. Then $f \in BV[a, b]$.

Proof 3.3.2. Let $\pi \in \Pi[a, b]$ so that $\pi = \{x_0 = a, x_1, \ldots, x_p = b\}$. On each subinterval $[x_{j-1}, x_j]$, for $1 \le j \le p$, the hypotheses of the Mean Value Theorem are satisfied. Hence, there is a point, $y_j \in (x_{j-1}, x_j)$, with $\Delta f_j = f(x_j) - f(x_{j-1}) = f'(y_j)\, \Delta x_j$. So, we have

$$| \Delta f_j | = | f'(y_j) |\, \Delta x_j \le B\, \Delta x_j,$$

where B is the bound on $f'$ that we assume exists by hypothesis. Thus, for any $\pi \in \Pi[a, b]$, we have

$$\sum_{\pi} | \Delta f_j | \le B \sum_{\pi} \Delta x_j = B(b - a) < \infty.$$

Therefore, $f \in BV[a, b]$.
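The bound $B(b - a)$ from Proof 3.3.2 can be observed on the sine function, whose derivative is bounded by B = 1. The sketch below evaluates the partition sums for uniform partitions of [0, 10] of increasing size; the interval and the partition sizes are arbitrary choices of ours.

```python
import math

def variation_sum(f, partition):
    """Sum of |f(x_j) - f(x_{j-1})| over consecutive partition points."""
    return sum(abs(f(right) - f(left))
               for left, right in zip(partition, partition[1:]))

def uniform_partition(a, b, k):
    """Uniform partition of [a, b] with k subintervals."""
    h = (b - a) / k
    return [a + j * h for j in range(k)] + [b]

a, b = 0.0, 10.0
B = 1.0  # bound on |cos|, the derivative of sin
sums = [variation_sum(math.sin, uniform_partition(a, b, k))
        for k in (1, 10, 100, 1000)]
print(sums, B * (b - a))
```

The sums grow as the partition is refined, because finer partitions see more of the oscillation, but they never exceed $B(b - a) = 10$.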



Definition 3.3.2 (The Total Variation Of A Function Of Bounded Variation). Let $f \in BV[a, b]$. The real number

$$V(f; a, b) = \sup \left\{ \sum_{\pi} | \Delta f_j | : \pi \in \Pi[a, b] \right\}$$

is called the Total Variation of f on [a, b].

Note that this number always exists if f ∈ BV [a, b].

Comment 3.3.2. For any f ∈ BV [a, b], we clearly have V (f ; a, b) = V (−f ; a, b) and V (f ; a, b) ≥ 0.

Moreover, we also see that V (f ; a, b) = 0 if and only if f is constant on [a, b].

Theorem 3.3.4 (Functions Of Bounded Variation Are Closed Under Addition). If f and g are in BV[a, b], then so are $f \pm g$, and $V(f \pm g; a, b) \le V(f; a, b) + V(g; a, b)$.

Proof 3.3.3. Let $\pi \in \Pi[a, b]$, so that $\pi = \{x_0 = a, x_1, \ldots, x_p = b\}$. Consider $f + g$ first. We have, for each $1 \le j \le p$,

$$\begin{aligned}
| \Delta (f + g)_j | &= | (f + g)(x_j) - (f + g)(x_{j-1}) | \\
&\le | f(x_j) - f(x_{j-1}) | + | g(x_j) - g(x_{j-1}) | \\
&= | \Delta f_j | + | \Delta g_j |.
\end{aligned}$$

This implies that, for any $\pi \in \Pi[a, b]$,

$$\sum_{\pi} | \Delta (f + g)_j | \le \sum_{\pi} | \Delta f_j | + \sum_{\pi} | \Delta g_j |.$$

The two sums on the right-hand side are bounded by $V(f; a, b)$ and $V(g; a, b)$, respectively. Since $\pi \in \Pi[a, b]$ was arbitrary, we have

$$V(f + g; a, b) \le V(f; a, b) + V(g; a, b).$$

This shows that $f + g \in BV[a, b]$ and proves the desired inequality for that case. Since $V(-g; a, b) = V(g; a, b)$, we also have

$$V(f - g; a, b) \le V(f; a, b) + V(-g; a, b) = V(f; a, b) + V(g; a, b),$$

which proves that $f - g \in BV[a, b]$.

Theorem 3.3.5 (Products Of Functions Of Bounded Variation Are Of Bounded Variation). If $f, g \in BV[a, b]$, then $fg \in BV[a, b]$ and $V(fg; a, b) \le \| g \|_\infty V(f; a, b) + \| f \|_\infty V(g; a, b)$.

Proof 3.3.4. By Theorem 3.3.1, we know that f and g are bounded. Hence, the numbers $\| f \|_\infty$ and $\| g \|_\infty$ exist and are finite. Let $h = fg$, and let $\pi = \{x_0 = a, x_1, \ldots, x_p = b\}$ be any partition. Then

$$\begin{aligned}
| \Delta h_j | &= | f(x_j) g(x_j) - f(x_{j-1}) g(x_{j-1}) | \\
&= | f(x_j) g(x_j) - g(x_j) f(x_{j-1}) + g(x_j) f(x_{j-1}) - f(x_{j-1}) g(x_{j-1}) | \\
&\le | g(x_j) |\, | \Delta f_j | + | f(x_{j-1}) |\, | \Delta g_j | \\
&\le \| g \|_\infty | \Delta f_j | + \| f \|_\infty | \Delta g_j |
\end{aligned}$$

Thus,

$$\begin{aligned}
\sum_{\pi} | \Delta h_j | &\le \| g \|_\infty \sum_{\pi} | \Delta f_j | + \| f \|_\infty \sum_{\pi} | \Delta g_j | \\
&\le \| g \|_\infty V(f; a, b) + \| f \|_\infty V(g; a, b)
\end{aligned}$$

Since π was arbitrary, we see the right hand side is an upper bound for all the partition sums and hence, the supremum of all these sums must also be less than or equal to the right hand side. Thus,

$$V(fg; a, b) \le \| g \|_\infty V(f; a, b) + \| f \|_\infty V(g; a, b)$$

Comment 3.3.3. Note that we have verified that BV [a, b] is a commutative algebra (i.e. a ring) of func-

tions with an identity, since the constant function f = 1 is of bounded variation.

It is natural to ask, then, what the units are in this algebra. That is, what functions have multiplicative

inverses?

Theorem 3.3.6 (Inverses Of Functions Of Bounded Variation). Let f be in BV [a, b], and assume that there is a positive m such that | f(x) | ≥ m > 0 for all x ∈ [a, b]. Then 1/f ∈ BV [a, b] and V (1/f ; a, b) ≤ (1/m^2) V (f ; a, b).

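Proof 3.3.5. Let π = {x0 = a, x1, . . . , xp = b} be any partition. Then

\[ \left| \Delta\!\left( \frac{1}{f} \right)_j \right| = \left| \frac{1}{f(x_j)} - \frac{1}{f(x_{j-1})} \right| = \left| \frac{f(x_{j-1}) - f(x_j)}{f(x_j) f(x_{j-1})} \right| = \frac{| \Delta f_j |}{| f(x_j) | \, | f(x_{j-1}) |} \le \frac{| \Delta f_j |}{m^2}. \]

Thus, we have

\[ \sum_{\pi} \left| \Delta\!\left( \frac{1}{f} \right)_j \right| \le \frac{1}{m^2} \sum_{\pi} | \Delta f_j | \le \frac{1}{m^2} V(f;a,b), \]

implying that V (1/f ; a, b) ≤ (1/m^2) V (f ; a, b).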

Comment 3.3.4.

1. Any polynomial, p, is in BV [a, b], and p is a unit if none of its zeros occur in the interval.

2. Any rational function p/q, where p and q are polynomials, is in BV [a, b] as long as none of the zeros of q occur in the interval.

3. e^x ∈ BV [a, b]. In fact, e^{u(x)} ∈ BV [a, b] if u(x) is monotone or has a bounded derivative.

4. sin x and cos x are in BV [a, b] by Theorem 3.3.3.

5. tan x ∈ BV [a, b] if [a, b] does not contain any point of the form (2k + 1)π/2 for k ∈ Z, by Theorem 3.3.6.

6. The function

\[ f(x) = \begin{cases} \sin(1/x), & 0 < x \le 1 \\ 0, & x = 0 \end{cases} \]

is not in BV [0, 1]. To see this, choose partition points x0, . . . , xp by x0 = 0, xp = 1, and

\[ x_j = \frac{2}{\pi (2p - 2j + 1)}, \quad 1 \le j \le p - 1. \]

Then

\[ \Delta f_1 = \sin\left( \frac{\pi (2p - 1)}{2} \right) = \pm 1, \]

\[ \Delta f_2 = \sin\left( \frac{\pi (2p - 3)}{2} \right) - \sin\left( \frac{\pi (2p - 1)}{2} \right) = \pm 2, \]

and continuing, we find

\[ \Delta f_{p-1} = \sin\left( \frac{\pi (2p - 2(p-1) + 1)}{2} \right) - \sin\left( \frac{\pi (2p - 2(p-2) + 1)}{2} \right) = \pm 2, \]

\[ \Delta f_p = \sin 1 - \sin(3\pi/2) = \sin 1 + 1. \]

Thus,

\[ \sum_{\pi} | \Delta f_j | = | \Delta f_1 | + \sum_{j=2}^{p-1} | \Delta f_j | + \sin 1 + 1 = 1 + 2(p - 2) + \sin 1 + 1 = 2(p - 1) + \sin 1. \]

Hence, we can make the value of this sum as large as we desire and so this function is not of bounded variation.
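The growth of these partition sums is easy to observe numerically. The following is a minimal sketch; the helper names `example_partition` and `variation_sum` are ours, and floating-point evaluation of sin(1/x) is assumed to be accurate enough at these points, which sit where the sine is ±1.

```python
import math


def f(x):
    """The function from item 6: sin(1/x) on (0, 1], and 0 at x = 0."""
    return 0.0 if x == 0 else math.sin(1.0 / x)


def example_partition(p):
    """x_0 = 0, x_j = 2 / (pi (2p - 2j + 1)) for 1 <= j <= p - 1, x_p = 1."""
    points = [0.0]
    points += [2.0 / (math.pi * (2 * p - 2 * j + 1)) for j in range(1, p)]
    points.append(1.0)
    return points


def variation_sum(f, points):
    """Return the partition sum of |f(x_j) - f(x_{j-1})| over the given points."""
    return sum(abs(f(points[j]) - f(points[j - 1])) for j in range(1, len(points)))


results = {}
for p in (2, 5, 50, 500):
    results[p] = variation_sum(f, example_partition(p))
    predicted = 2 * (p - 1) + math.sin(1.0)
    print(f"p = {p:4d}: partition sum = {results[p]:.6f}, 2(p-1) + sin 1 = {predicted:.6f}")
```

The computed sums match 2(p − 1) + sin 1 and grow without bound as p increases.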

3.3.1 Homework

Exercise 3.3.1. Prove that if f is of bounded variation on the finite interval [a, b], then α f is also of

bounded variation for any scalar α. Do this proof using the partition approach.

Exercise 3.3.2. Prove that if f and g are of bounded variation on the finite interval [a, b], then α f + β g

is also of bounded variation for any scalars α and β. Do this proof using the partition approach. Note,

these two exercises essentially show BV [a, b] is a vector space.

Exercise 3.3.3. Prove BV [a, b] is a complete normed linear space with norm || · || defined by

|| f || = | f(a) | + V (f ; a, b).

Exercise 3.3.4. Define f on [0, 1] by

\[ f(x) = \begin{cases} x^2 \cos(x^{-2}), & x \ne 0, \ x \in [0,1] \\ 0, & x = 0 \end{cases} \]

Prove that f is differentiable on [0, 1] but is not of bounded variation. This is a nice example of something

we will see later. This f is a function which is continuous but not absolutely continuous.

3.4 The Total Variation Function

Theorem 3.4.1 (The Total Variation Is Additive On Intervals). If f ∈ BV [a, b] and c ∈ [a, b], then f ∈ BV [a, c], f ∈ BV [c, b], and V (f ; a, b) = V (f ; a, c) + V (f ; c, b). That is, the total variation, V , is additive on intervals.

Proof 3.4.1. The case c = a or c = b is easy, so we assume c ∈ (a, b). Let π1 ∈ Π[a, c] and π2 ∈ Π[c, b] with π1 = {x0 = a, x1, . . . , xp = c} and π2 = {y0 = c, y1, . . . , yq = b}. Then π1 ∨ π2 is a partition of [a, b] and we know

\[ \sum_{\pi_1 \vee \pi_2} | \Delta f_j | = \sum_{\pi_1} | \Delta f_j | + \sum_{\pi_2} | \Delta f_j | \le V(f;a,b). \]

Dropping the π2 term, and noting that π1 ∈ Π[a, c] was arbitrary, we see that

\[ \sup_{\pi_1 \in \Pi[a,c]} \sum_{\pi_1} | \Delta f_j | \le V(f;a,b), \]

which implies that V (f ; a, c) ≤ V (f ; a, b) < ∞. Thus, f ∈ BV [a, c]. A similar argument shows that V (f ; c, b) ≤ V (f ; a, b), so f ∈ BV [c, b].

Finally, since both π1 and π2 were arbitrary and we know that

\[ \sum_{\pi_1} | \Delta f_j | + \sum_{\pi_2} | \Delta f_j | \le V(f;a,b), \]

we see that V (f ; a, c) + V (f ; c, b) ≤ V (f ; a, b).

Now we will establish the reverse inequality. Let π ∈ Π[a, b], so that π = {x0 = a, x1, . . . , xp = b}. First, assume that c is a partition point of π, so that $c = x_{k_0}$ for some $k_0$. Thus, $\pi = \{x_0, \ldots, x_{k_0}\} \cup \{x_{k_0}, \ldots, x_p\}$. Let $\pi_1 = \{x_0, \ldots, x_{k_0}\} \in \Pi[a,c]$ and let $\pi_2 = \{x_{k_0}, \ldots, x_p\} \in \Pi[c,b]$. From the first part of our proof, we know that f ∈ BV [a, c] and f ∈ BV [c, b], so

\[ \sum_{\pi} | \Delta f_j | = \sum_{\pi_1} | \Delta f_j | + \sum_{\pi_2} | \Delta f_j | \le V(f;a,c) + V(f;c,b). \]

For the other case, suppose c is not a partition point of π. Then c must lie inside one of the subintervals. That is, $c \in (x_{k_0 - 1}, x_{k_0})$ for some $k_0$. Let $\pi' = \{x_0, \ldots, x_{k_0 - 1}, c, x_{k_0}, \ldots, x_p\}$ be a new partition of [a, b]. Then π′ refines π and contains c. Apply our previous argument to conclude that

\[ \sum_{\pi'} | \Delta f_j | \le V(f;a,c) + V(f;c,b). \]

Finally, we note that

\[ \sum_{\pi} | \Delta f_j | \le \sum_{\pi'} | \Delta f_j |, \]

since

\[ | f(x_{k_0}) - f(x_{k_0 - 1}) | \le | f(x_{k_0}) - f(c) | + | f(c) - f(x_{k_0 - 1}) | . \]

Thus, in either case we have

\[ \sum_{\pi} | \Delta f_j | \le V(f;a,c) + V(f;c,b). \]

Since π was arbitrary, it follows that V (f ; a, b) ≤ V (f ; a, c) + V (f ; c, b). Combining these two inequalities, we see the result is established.
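The two facts about partition sums used in this proof can be observed on a small example. This is only a sketch: the function sin, the interval [0, 4], the point c = π/2, and the particular partition points are illustrative choices of ours.

```python
import math


def variation_sum(f, points):
    """Return the partition sum of |f(x_j) - f(x_{j-1})| over the given points."""
    return sum(abs(f(points[j]) - f(points[j - 1])) for j in range(1, len(points)))


f = math.sin
a, c, b = 0.0, math.pi / 2, 4.0

# Fact 1: the sum over pi1 v pi2 splits into the sum over pi1 plus the sum over pi2.
pi1 = [a, 0.3, 0.7, c]
pi2 = [c, 2.5, 3.2, b]
joined = pi1 + pi2[1:]          # the point c is listed only once
split_gap = variation_sum(f, joined) - (variation_sum(f, pi1) + variation_sum(f, pi2))
print("difference between joined sum and split sums:", split_gap)

# Fact 2: inserting c into a partition that misses it cannot decrease the sum.
pi = [a, 0.5, 2.0, b]
refined = sorted(pi + [c])
coarse_sum = variation_sum(f, pi)
refined_sum = variation_sum(f, refined)
print("sum before inserting c:", coarse_sum)
print("sum after inserting c: ", refined_sum)
```

Here the refined sum is strictly larger because sin turns around at π/2, which the coarse partition does not see.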



Definition 3.4.1 (The Variation Function Of a Function f Of Bounded Variation). Let f ∈ BV [a, b]. The Variation Function of f , or simply the Variation of f , is the function Vf : [a, b] → ℜ defined by

\[ V_f(x) = \begin{cases} 0, & x = a \\ V(f;a,x), & a < x \le b \end{cases} \]

Theorem 3.4.2 (Vf and Vf − f Are Monotone For a Function f of Bounded Variation). If f ∈ BV [a, b], then the functions Vf and Vf − f are monotone increasing on [a, b].

Proof 3.4.2. Pick x1, x2 ∈ [a, b] with x1 < x2. By Theorem 3.4.1, f ∈ BV [a, x1] and f ∈ BV [a, x2]. Apply this same theorem to the interval [a, x2] = [a, x1] ∪ [x1, x2] to conclude that f ∈ BV [x1, x2]. Thus

Vf (x2) = V (f ; a, x2) = V (f ; a, x1) + V (f ;x1, x2) = Vf (x1) + V (f ;x1, x2).

It follows that Vf (x2) − Vf (x1) = V (f ;x1, x2) ≥ 0, so Vf is monotone increasing. Now, consider

(Vf − f)(x2)− (Vf − f)(x1). We have

(Vf − f)(x2)− (Vf − f)(x1) = Vf (x2)− Vf (x1)− (f(x2)− f(x1))

= V (f ; a, x2)− V (f ; a, x1)− (f(x2)− f(x1))

= V (f ;x1, x2)− (f(x2)− f(x1)).

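But {x1, x2} is the trivial partition of [x1, x2], so

\[ f(x_2) - f(x_1) \le | f(x_2) - f(x_1) | = \sum_{\{x_1, x_2\}} | \Delta f_j | \le \sup_{\pi \in \Pi[x_1,x_2]} \sum_{\pi} | \Delta f_j | = V(f;x_1,x_2). \]

Thus, V (f ; x1, x2) − (f(x2) − f(x1)) ≥ 0, implying that Vf − f is monotone increasing.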

Theorem 3.4.3 (A Function Of Bounded Variation Is The Difference of Two Increasing Functions). Every f ∈ BV [a, b] can be written as the difference of two monotone increasing functions on [a, b]. In other words,

\[ BV[a,b] = \{ f : [a,b] \to \Re \mid \exists\, u, v : [a,b] \to \Re, \ u, v \text{ monotone increasing}, \ f = u - v \}. \]


Proof 3.4.3. If f = u−v, where u and v are monotone increasing, then u and v are of bounded variation.

Since BV [a, b] is an algebra, it follows that f ∈ BV [a, b].

Conversely, suppose f ∈ BV [a, b], and let u = Vf and v = Vf − f . Then u and v are monotone

increasing and u− v = f .
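A discrete analogue of this decomposition can be computed on a grid. The sketch below is an assumption-laden illustration, not the construction itself: it replaces Vf by the running partition sum over a uniform grid (which in general is only a lower bound for Vf at each grid point), and the function sin(3x) on [0, 2] and the name `grid_decomposition` are our own choices.

```python
import math


def grid_decomposition(f, a, b, n):
    """Approximate u = V_f and v = V_f - f on a uniform grid of n + 1 points.

    u[k] is the running sum of |f(x_j) - f(x_{j-1})| for j <= k, a grid
    stand-in for V_f(x_k); v[k] = u[k] - f(x_k).
    """
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    fs = [f(x) for x in xs]
    u = [0.0]
    for k in range(1, n + 1):
        u.append(u[-1] + abs(fs[k] - fs[k - 1]))
    v = [u[k] - fs[k] for k in range(n + 1)]
    return xs, fs, u, v


xs, fs, u, v = grid_decomposition(lambda x: math.sin(3.0 * x), 0.0, 2.0, 400)

u_increasing = all(u[k] >= u[k - 1] for k in range(1, len(u)))
v_increasing = all(v[k] >= v[k - 1] - 1e-12 for k in range(1, len(v)))
max_error = max(abs((u[k] - v[k]) - fs[k]) for k in range(len(xs)))

print("u nondecreasing on the grid:", u_increasing)
print("v nondecreasing on the grid:", v_increasing)
print("max |(u - v) - f| on the grid:", max_error)
```

On the grid, each increment of v is |Δf| − Δf ≥ 0, which mirrors the argument of Theorem 3.4.2.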

Comment 3.4.1. Theorem 3.4.3 tells us if f is of bounded variation on [a, b], then f = u − v where u and v are monotone increasing. Thus, we can also use the Saltus decomposition of u and v to conclude

\[ f = (u_c + S_u) - (v_c + S_v) = (u_c - v_c) + (S_u - S_v). \]

The first term is the difference of two continuous functions of bounded variation and the second term is the

difference of Saltus functions. This is essentially another form of decomposition theorem for a function of

bounded variation.

3.5 Continuous Functions of Bounded Variation

Theorem 3.5.1 (Functions Of Bounded Variation Always Possess Right and Left Hand Limits). Let f ∈ BV [a, b]. Then the limit f(x+) exists for all x ∈ [a, b) and the limit f(x−) exists for all x ∈ (a, b].

Proof 3.5.1. By Theorem 3.4.2, Vf and Vf − f are monotone increasing. So Vf (x+) and (Vf − f)(x+) both exist. Hence,

\begin{align*}
f(x^+) &= \lim_{y \to x^+} f(y) \\
&= \lim_{y \to x^+} \left[ V_f(y) - (V_f - f)(y) \right] \\
&= \lim_{y \to x^+} V_f(y) - \lim_{y \to x^+} (V_f - f)(y) \\
&= V_f(x^+) - (V_f - f)(x^+).
\end{align*}

So, f(x+) exists. A similar argument shows that f(x−) exists.

Theorem 3.5.2 (Functions Of Bounded Variation Have Countable Discontinuity Sets). If f ∈ BV [a, b], then the set of discontinuities of f is countable.

Proof 3.5.2. By Theorem 3.4.3, f = u − v where u and v are monotone increasing. Since a monotone function has at most countably many discontinuities, the sets S1 = {x ∈ [a, b] | u is not continuous at x} and S2 = {x ∈ [a, b] | v is not continuous at x} are countable. The union of these sets contains every point of possible discontinuity of f , so the set of discontinuities of f is countable.

Theorem 3.5.3 (A Function Of Bounded Variation Is Continuous If and Only If Vf Is Continuous). Let f ∈ BV [a, b]. Then f is continuous at c ∈ [a, b] if and only if Vf is continuous at c.

Proof 3.5.3. The cases c = a and c = b are easier, so we will only prove the case where c ∈ (a, b).

First, suppose f is continuous at c. We will prove separately that Vf is continuous from the right at c and

from the left at c.

Let ε > 0 be given. Since f is continuous at c, there is a positive δ such that if x is in (c − δ, c + δ) ⊂ [a, b], then | f(x) − f(c) | < ε/2. Now,

\[ V(f;c,b) = \sup_{\pi \in \Pi[c,b]} \sum_{\pi} | \Delta f_j | . \]

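So, there is a partition π0 such that

\[ V(f;c,b) - \frac{\varepsilon}{2} < \sum_{\pi_0} | \Delta f_j | \le V(f;c,b). \quad (*) \]

If π0′ is any refinement of π0, we see that

\[ \sum_{\pi_0} | \Delta f_j | \le \sum_{\pi_0'} | \Delta f_j |, \]

since adding points to π0 cannot decrease the sum. Thus,

\[ V(f;c,b) - \frac{\varepsilon}{2} < \sum_{\pi_0} | \Delta f_j | \le \sum_{\pi_0'} | \Delta f_j | \le V(f;c,b) \]

for any refinement π0′ of π0. Now, choose a partition, π1, which refines π0 and satisfies || π1 || < δ. Then

\[ V(f;c,b) - \frac{\varepsilon}{2} < \sum_{\pi_1} | \Delta f_j | \le V(f;c,b). \quad (**) \]

So, if π1 = {x0 = c, x1, . . . , xp}, then | x1 − x0 | < δ. Thus, we have | x1 − c | < δ. It follows that | f(x1) − f(c) | < ε/2. From Equation ∗∗, we then have

\begin{align*}
V(f;c,b) - \frac{\varepsilon}{2} &< \sum_{\pi_1} | \Delta f_j | \\
&= | f(x_1) - f(c) | + \sum_{\text{rest of } \pi_1} | \Delta f_j | \\
&< \frac{\varepsilon}{2} + \sum_{\text{rest of } \pi_1} | \Delta f_j | \\
&\le \frac{\varepsilon}{2} + V(f;x_1,b).
\end{align*}

So, we see that

\[ V(f;c,b) - \frac{\varepsilon}{2} < \frac{\varepsilon}{2} + V(f;x_1,b), \]

which implies that

\[ V(f;c,b) - V(f;x_1,b) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]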

But V (f ; c, b)− V (f ;x1, b) = V (f ; c, x1) which is the same as Vf (x1)− Vf (c). Thus, we have

Vf (x1)− Vf (c) < ε.

Now Vf is monotone and hence we have shown that if x ∈ (c, x1),

Vf (x)− Vf (c) ≤ Vf (x1)− Vf (c) < ε.

Since ε > 0 was arbitrary, this verifies the right continuity of Vf at c.

The argument for left continuity is similar. We can find a partition π1 of [a, c] with partition points

x0 = a, x1, . . . , xp−1, xp = c such that || π1 ||< δ and

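\begin{align*}
V(f;a,c) - \frac{\varepsilon}{2} &< | f(c) - f(x_{p-1}) | + \sum_{\text{rest of } \pi_1} | \Delta f_j | \\
&\le | f(c) - f(x_{p-1}) | + V(f;a,x_{p-1}).
\end{align*}

Since || π1 || < δ, we see as before that | f(c) − f(xp−1) | < ε/2. Thus,

\[ V(f;a,c) - \frac{\varepsilon}{2} < \frac{\varepsilon}{2} + V(f;a,x_{p-1}), \]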

and it follows that

V (f ; a, c)− V (f ; a, xp−1) < ε,

or

Vf (c)− Vf (xp−1) < ε.



Since Vf is monotone, we then have for any x in (xp−1, c) that

Vf (c) − Vf (x) ≤ Vf (c) − Vf (xp−1) < ε

which shows the left continuity of Vf at c. Hence, Vf is continuous at c.

Conversely, suppose Vf is continuous at c ∈ (a, b). Given ε > 0, there is a positive δ such that

(c− δ, c+ δ) ⊂ [a, b] and | Vf (x)− Vf (c) |< ε for all x ∈ (c− δ, c+ δ). Pick any x ∈ (c, c+ δ). Then

{c, x} is a trivial partition of [c, x]. Hence

0 ≤| f(x)− f(c) |≤ V (f ; c, x) = V (f ; a, x)− V (f ; a, c)

or

0 ≤| f(x)− f(c) |≤ Vf (x)− Vf (c) < ε.

Hence, it follows that f is continuous from the right. A similar argument shows that f is continuous from

the left.

We immediately have this corollary.

Theorem 3.5.4 (f Continuous and Bounded Variation If and Only If Vf and Vf − f Are Continuous and Increasing). f ∈ C[a, b] ∩ BV [a, b] if and only if Vf and Vf − f are monotone increasing and continuous.


Chapter 4

The Theory Of Riemann Integration

We will now develop the theory of the Riemann Integral for a bounded function f on the interval [a, b]. We followed the development of this material in (Fulks (3) 1978) closely at times, although Fulks does not

cover some of the sections very well.

4.1 Defining The Riemann Integral

Definition 4.1.1 (The Riemann Sum). Let f ∈ B[a, b], and let π ∈ Π[a, b] be given by π = {x0 = a, x1, . . . , xp = b}. Define σ = {s1, . . . , sp}, where sj ∈ [xj−1, xj ] for 1 ≤ j ≤ p. We call σ an evaluation set, and we denote this by σ ⊂ π. The Riemann Sum determined by the partition π and the evaluation set σ is defined by

\[ S(f,\pi,\sigma) = \sum_{\pi} f(s_j) \Delta x_j . \]
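A Riemann sum is straightforward to compute once a partition and an evaluation set are fixed. The sketch below is one possible rendering; the function name `riemann_sum`, the list representation of π and σ, and the example f(x) = x² on [0, 1] are our own illustrative choices.

```python
def riemann_sum(f, partition, evaluation):
    """Compute S(f, pi, sigma) = sum of f(s_j) * (x_j - x_{j-1}).

    partition  : increasing list [x_0, ..., x_p]
    evaluation : list [s_1, ..., s_p] with s_j in [x_{j-1}, x_j]
    """
    if len(evaluation) != len(partition) - 1:
        raise ValueError("need exactly one evaluation point per subinterval")
    total = 0.0
    for j in range(1, len(partition)):
        x_prev, x_j = partition[j - 1], partition[j]
        s_j = evaluation[j - 1]
        if not (x_prev <= s_j <= x_j):
            raise ValueError("evaluation point outside its subinterval")
        total += f(s_j) * (x_j - x_prev)
    return total


f = lambda x: x * x
partition = [0.0, 0.25, 0.5, 1.0]

left_sum = riemann_sum(f, partition, partition[:-1])    # s_j = x_{j-1}
right_sum = riemann_sum(f, partition, partition[1:])    # s_j = x_j
print("left endpoint evaluation set: ", left_sum)
print("right endpoint evaluation set:", right_sum)
```

Different evaluation sets on the same partition give different sums; integrability is about these values agreeing in the limit of refinements.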


Definition 4.1.2 (Riemann Integrability Of a Bounded f ). We say f ∈ B[a, b] is Riemann Integrable on [a, b] if there exists a real number, I , such that for every

ε > 0 there is a partition, π0 ∈ Π[a, b] such that

| S(f,π,σ)− I |< ε

for any refinement, π, of π0 and any evaluation set, σ ⊂ π. We denote this value, I , by

I ≡ RI(f ; a, b)

We denote the set of Riemann integrable functions on [a, b] by RI[a, b]. Also, it is readily seen that the number RI(f ; a, b) in the definition above, when it exists, is unique. So we can speak of the Riemann Integral of a function, f . We also have the following conventions.

1. RI(f ; a, b) = −RI(f ; b, a)

2. RI(f ; a, a) = 0

3. f is called the integrand.

Theorem 4.1.1 (RI[a, b] Is A Vector Space and RI(f ; a, b) Is A Linear Mapping). RI[a, b] is a vector space over ℜ and the mapping IR : RI[a, b] → ℜ defined by

IR(f) = RI(f ; a, b)

is a linear mapping.

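Proof 4.1.1. Let f1, f2 ∈ RI[a, b], and let α, β ∈ ℜ. For any π ∈ Π[a, b] and σ ⊂ π, we have

\begin{align*}
S(\alpha f_1 + \beta f_2, \pi, \sigma) &= \sum_{\pi} (\alpha f_1 + \beta f_2)(s_j) \Delta x_j \\
&= \alpha \sum_{\pi} f_1(s_j) \Delta x_j + \beta \sum_{\pi} f_2(s_j) \Delta x_j \\
&= \alpha S(f_1, \pi, \sigma) + \beta S(f_2, \pi, \sigma).
\end{align*}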

Since f1 is Riemann integrable, given ε > 0, there is a real number I1 = RI(f1; a, b) and a partition π1 ∈ Π[a, b] such that

\[ | S(f_1, \pi, \sigma) - I_1 | < \frac{\varepsilon}{2(| \alpha | + 1)} \quad (*) \]

for all refinements, π, of π1, and all σ ⊂ π.

Likewise, since f2 is Riemann integrable, there is a real number I2 = RI(f2; a, b) and a partition π2 ∈ Π[a, b] such that

\[ | S(f_2, \pi, \sigma) - I_2 | < \frac{\varepsilon}{2(| \beta | + 1)} \quad (**) \]

for all refinements, π, of π2, and all σ ⊂ π.

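Let π0 = π1 ∨ π2. Then π0 is a refinement of both π1 and π2. So, for any refinement, π, of π0, and any σ ⊂ π, both Equation ∗ and Equation ∗∗ are valid. Hence,

\[ | S(f_1, \pi, \sigma) - I_1 | < \frac{\varepsilon}{2(| \alpha | + 1)}, \qquad | S(f_2, \pi, \sigma) - I_2 | < \frac{\varepsilon}{2(| \beta | + 1)}. \]

Thus, for any refinement π of π0 and any σ ⊂ π, it follows that

\begin{align*}
| S(\alpha f_1 + \beta f_2, \pi, \sigma) - (\alpha I_1 + \beta I_2) | &= | \alpha S(f_1, \pi, \sigma) + \beta S(f_2, \pi, \sigma) - \alpha I_1 - \beta I_2 | \\
&\le | \alpha | \, | S(f_1, \pi, \sigma) - I_1 | + | \beta | \, | S(f_2, \pi, \sigma) - I_2 | \\
&< | \alpha | \frac{\varepsilon}{2(| \alpha | + 1)} + | \beta | \frac{\varepsilon}{2(| \beta | + 1)} < \varepsilon.
\end{align*}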

This shows that αf1 + βf2 is Riemann integrable and that the value of the integral RI(αf1 + βf2; a, b) is

given by αRI(f1; a, b) + βRI(f2; a, b). It then follows immediately that IR is a linear mapping.
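The identity S(αf1 + βf2, π, σ) = αS(f1, π, σ) + βS(f2, π, σ), which drives this proof, holds for every partition and evaluation set, so it can be spot-checked numerically. The sketch below uses our own illustrative choices throughout: the helper `riemann_sum`, the functions sin and x², the scalars, and a randomly drawn evaluation set with a fixed seed.

```python
import math
import random


def riemann_sum(f, partition, evaluation):
    """S(f, pi, sigma): sum of f(s_j) * (x_j - x_{j-1}) over the partition."""
    return sum(f(evaluation[j - 1]) * (partition[j] - partition[j - 1])
               for j in range(1, len(partition)))


random.seed(0)  # fixed seed so the run is repeatable
partition = [0.0, 0.1, 0.4, 0.75, 1.0]
# One evaluation point drawn from each subinterval [x_{j-1}, x_j].
evaluation = [random.uniform(partition[j - 1], partition[j])
              for j in range(1, len(partition))]

f1 = math.sin
f2 = lambda x: x * x
alpha, beta = 2.5, -1.5
combination = lambda x: alpha * f1(x) + beta * f2(x)

lhs = riemann_sum(combination, partition, evaluation)
rhs = (alpha * riemann_sum(f1, partition, evaluation)
       + beta * riemann_sum(f2, partition, evaluation))
print("S(alpha f1 + beta f2) =", lhs)
print("alpha S(f1) + beta S(f2) =", rhs)
```

The two numbers agree up to floating-point rounding, independent of which evaluation set is drawn.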

Theorem 4.1.2 (Fundamental Riemann Integral Estimates). Let f ∈ RI[a, b]. Let $m = \inf_{x \in [a,b]} f(x)$ and let $M = \sup_{x \in [a,b]} f(x)$. Then

m(b− a) ≤ RI(f ; a, b) ≤M(b− a).

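Proof 4.1.2. If π ∈ Π[a, b], then for all σ ⊂ π, we see that

\[ \sum_{\pi} m \, \Delta x_j \le \sum_{\pi} f(s_j) \Delta x_j \le \sum_{\pi} M \, \Delta x_j . \]

But $\sum_{\pi} \Delta x_j = b - a$, so

\[ m(b - a) \le \sum_{\pi} f(s_j) \Delta x_j \le M(b - a), \]

or

\[ m(b - a) \le S(f,\pi,\sigma) \le M(b - a), \]

for any partition π and any σ ⊂ π.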

Now, let ε > 0 be given. Then there exist π0 ∈ Π[a, b] such that for any refinement, π, of π0 and any

σ ⊂ π,

RI(f ; a, b)− ε < S(f,π,σ) < RI(f ; a, b) + ε.

Hence, for any such refinement, π, and any σ ⊂ π, we have

m(b− a) ≤ S(f,π,σ) < RI(f ; a, b) + ε

and

M(b− a) ≥ S(f,π,σ) > RI(f ; a, b)− ε.

Since ε > 0 is arbitrary, it follows that

m(b− a) ≤ RI(f ; a, b) ≤M(b− a).

Theorem 4.1.3 (The Riemann Integral Is Order Preserving). The Riemann integral is order preserving. That is, if f, f1, f2 ∈ RI[a, b], then

(i)

f ≥ 0⇒ RI(f ; a, b) ≥ 0;

(ii)

f1 ≤ f2 ⇒ RI(f1; a, b) ≤ RI(f2; a, b).

Proof 4.1.3. If f ≥ 0 on [a, b], then $\inf_x f(x) = m \ge 0$. Hence, by Theorem 4.1.2,

\[ \int_a^b f(x) \, dx \ge m(b - a) \ge 0. \]

This proves the first assertion. To prove (ii), let f = f2 − f1. Then f ≥ 0, and the second result follows

from the first.


4.2 The Existence of the Riemann Integral

Although we have a definition for what it means for a bounded function to be Riemann integrable, we still

do not actually know that RI[a, b] is nonempty! In this section, we will show how we prove that the set of

Riemann integrable functions is quite rich and varied.

Definition 4.2.1 (Darboux Upper and Lower Sums). Let f ∈ B[a, b]. Let π ∈ Π[a, b] be given by π = {x0 = a, x1, . . . , xp = b}. Define

\[ m_j = \inf_{x_{j-1} \le x \le x_j} f(x), \quad 1 \le j \le p, \]

and

\[ M_j = \sup_{x_{j-1} \le x \le x_j} f(x), \quad 1 \le j \le p. \]

We define the Lower Darboux Sum by

\[ L(f,\pi) = \sum_{\pi} m_j \Delta x_j \]

and the Upper Darboux Sum by

\[ U(f,\pi) = \sum_{\pi} M_j \Delta x_j . \]

Comment 4.2.1.

1. It is straightforward to see that

L(f,π) ≤ S(f,π,σ) ≤ U(f,π)

for all π ∈ Π[a, b].

2. We also have

\[ U(f,\pi) - L(f,\pi) = \sum_{\pi} (M_j - m_j) \Delta x_j . \]
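For a general bounded function the numbers mj and Mj cannot be read off from finitely many samples, but for a monotone increasing function they are attained at the endpoints: mj = f(xj−1) and Mj = f(xj). Under that assumption the Darboux sums can be computed exactly. The sketch below restricts itself to that case; the name `darboux_sums_increasing` and the example f(x) = x² on [0, 1] are illustrative choices of ours.

```python
def darboux_sums_increasing(f, partition):
    """Return (L(f, pi), U(f, pi)) for a monotone increasing f.

    Assumes f is increasing, so on [x_{j-1}, x_j] the infimum is f(x_{j-1})
    and the supremum is f(x_j). This is NOT valid for general bounded f.
    """
    lower = 0.0
    upper = 0.0
    for j in range(1, len(partition)):
        dx = partition[j] - partition[j - 1]
        lower += f(partition[j - 1]) * dx
        upper += f(partition[j]) * dx
    return lower, upper


f = lambda x: x * x            # increasing on [0, 1]
partition = [0.0, 0.25, 0.5, 1.0]

lower, upper = darboux_sums_increasing(f, partition)

# A Riemann sum with midpoints as the evaluation set, for comparison.
midpoint_sum = sum(f((partition[j - 1] + partition[j]) / 2) * (partition[j] - partition[j - 1])
                   for j in range(1, len(partition)))

# The gap U - L computed term by term as the sum of (M_j - m_j) * dx_j.
gap = sum((f(partition[j]) - f(partition[j - 1])) * (partition[j] - partition[j - 1])
          for j in range(1, len(partition)))

print("L(f, pi) =", lower)
print("S(f, pi, midpoints) =", midpoint_sum)
print("U(f, pi) =", upper)
print("U - L =", upper - lower, " sum of (M_j - m_j) dx_j =", gap)
```

Both observations of Comment 4.2.1 are visible here: the Riemann sum sits between the lower and upper sums, and the gap equals the sum of (Mj − mj)∆xj.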

Theorem 4.2.1 (π ⪯ π′ Implies L(f,π) ≤ L(f,π′) and U(f,π) ≥ U(f,π′)). If π ⪯ π′, that is, if π′ refines π, then L(f,π) ≤ L(f,π′) and U(f,π) ≥ U(f,π′).

Proof 4.2.1. The general result is established by induction on the number of points added. It is actually

quite an involved induction. Here are some of the details:



Step 1 We prove the proposition for inserting points z1, . . . , zq into one subinterval of π. The argument

consists of

1. The Basis Step where we prove the proposition for the insertion of a single point into one

subinterval.

2. The Induction Step where we assume the proposition holds for the insertion of q points into

one subinterval and then we show the proposition still holds if an additional point is inserted.

3. With the Induction Step verified, the Principle of Mathematical Induction then tells us that the

proposition is true for any refinement of π which places points into one subinterval of π.

Basis:

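Proof . Let π ∈ Π[a, b] be given by {x0 = a, x1, . . . , xp = b}. Suppose we form the refinement, π′, by adding a single point x′ to π in the interior of the subinterval $[x_{k_0 - 1}, x_{k_0}]$. Let

\[ m' = \inf_{[x_{k_0 - 1}, x']} f(x), \qquad m'' = \inf_{[x', x_{k_0}]} f(x). \]

Note that $m_{k_0} = \min\{m', m''\}$ and

\begin{align*}
m_{k_0} \Delta x_{k_0} &= m_{k_0} (x_{k_0} - x_{k_0 - 1}) \\
&= m_{k_0} (x_{k_0} - x') + m_{k_0} (x' - x_{k_0 - 1}) \\
&\le m'' (x_{k_0} - x') + m' (x' - x_{k_0 - 1}) \\
&= m'' \Delta x'' + m' \Delta x',
\end{align*}

where $\Delta x'' = x_{k_0} - x'$ and $\Delta x' = x' - x_{k_0 - 1}$. It follows that

\begin{align*}
L(f,\pi') &= \sum_{j \ne k_0} m_j \Delta x_j + m' \Delta x' + m'' \Delta x'' \\
&\ge \sum_{j \ne k_0} m_j \Delta x_j + m_{k_0} \Delta x_{k_0} \\
&= L(f,\pi).
\end{align*}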

Induction:



Proof . We assume that q points {z1, . . . , zq} have been inserted into the subinterval $[x_{k_0 - 1}, x_{k_0}]$. Let π′ denote the resulting refinement of π. We assume that

\[ L(f,\pi) \le L(f,\pi'). \]

Let the additional point added to this subinterval be called x′ and call π′′ the resulting refinement of π′. We know that π′ has broken $[x_{k_0 - 1}, x_{k_0}]$ into q + 1 pieces. For convenience of notation, let’s label these q + 1 subintervals as $[y_{j-1}, y_j]$ where $y_0$ is $x_{k_0 - 1}$ and $y_{q+1}$ is $x_{k_0}$ and the $y_j$ values in between are the original $z_i$ points for appropriate indices. The new point x′ is thus added to one of these q + 1 pieces, call it $[y_{j_0 - 1}, y_{j_0}]$ for some index $j_0$. This interval plays the role of the original subinterval in the proof of the Basis Step. An argument similar to that in the proof of the Basis Step then shows us that

\[ L(f,\pi') \le L(f,\pi''). \]

Combining with the first inequality from the Induction hypothesis, we establish the result. Thus, the Induction Step is proved.

Step 2 Next, we allow the insertion of a finite number of points into a finite number of subintervals of π.

The induction is now on the number of subintervals.

1. The Basis Step where we prove the proposition for the insertion of points into one subinterval.

2. The Induction Step where we assume the proposition holds for the insertion of points into q

subintervals and then we show the proposition still holds if an additional subinterval has points

inserted.

3. With the Induction Step verified, the Principle of Mathematical Induction then tells us that the

proposition is true for any refinement of π which places points into any number of subintervals

of π.

Basis

Proof . Step 1 above gives us the Basis Step for this proposition.

Induction

Proof . We assume the result holds for q subintervals and show it also holds when one more subinterval is added. Specifically, let π′ be the refinement that results from adding points to q subintervals of π. Then the Induction hypothesis tells us that

\[ L(f,\pi) \le L(f,\pi'). \]

Let π′′ denote the new refinement of π which results from adding more points into one more subinterval of π. Then π′′ is also a refinement of π′ where all the new points are added to one subinterval of π′. Thus, Step 1 holds for the pair (π′, π′′). We see

\[ L(f,\pi') \le L(f,\pi'') \]

and the desired result follows immediately.

A similar argument establishes the result for upper sums.
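The effect of refinement on lower and upper sums can be watched along a chain of partitions. As a hedged illustration only, the sketch below again restricts to a monotone increasing function so that the infimum and supremum on each subinterval are the endpoint values; the helper names, the function x³ on [0, 1], and the dyadic partitions are our own choices.

```python
def darboux_sums_increasing(f, partition):
    """(L, U) for a monotone increasing f: inf and sup are the endpoint values."""
    lower = sum(f(partition[j - 1]) * (partition[j] - partition[j - 1])
                for j in range(1, len(partition)))
    upper = sum(f(partition[j]) * (partition[j] - partition[j - 1])
                for j in range(1, len(partition)))
    return lower, upper


def dyadic_partition(n):
    """Uniform partition of [0, 1] into n pieces; doubling n refines it."""
    return [k / n for k in range(n + 1)]


f = lambda x: x ** 3           # increasing on [0, 1]

lowers, uppers = [], []
for n in (1, 2, 4, 8, 16):
    lower, upper = darboux_sums_increasing(f, dyadic_partition(n))
    lowers.append(lower)
    uppers.append(upper)
    print(f"n = {n:2d}: L = {lower:.6f}, U = {upper:.6f}")
```

Along the chain the lower sums rise and the upper sums fall, and every lower sum lies below every upper sum, which is the content of the next theorem.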

Theorem 4.2.2 (L(f,π1) ≤ U(f,π2)). Let π1 and π2 be any two partitions in Π[a, b]. Then L(f,π1) ≤ U(f,π2).

Proof 4.2.6. Let π = π1 ∨ π2 be the common refinement of π1 and π2. Then, by the previous result, we

have

L(f,π1) ≤ L(f,π) ≤ U(f,π) ≤ U(f,π2).

Theorem 4.2.2 then allows us to define a new type of integrability for the bounded function f . We

begin by looking at the infimum of the upper sums and the supremum of the lower sums for a given

bounded function f .

Theorem 4.2.3 (The Upper And Lower Darboux Integral Are Finite). Let f ∈ B[a, b]. Let $\mathcal{L} = \{ L(f,\pi) \mid \pi \in \Pi[a,b] \}$ and $\mathcal{U} = \{ U(f,\pi) \mid \pi \in \Pi[a,b] \}$. Define L(f) = sup $\mathcal{L}$ and U(f) = inf $\mathcal{U}$. Then L(f) and U(f) are both finite. Moreover, L(f) ≤ U(f).

Proof 4.2.7. By Theorem 4.2.2, the set $\mathcal{L}$ is bounded above by any upper sum for f . Hence, it has a finite supremum and so sup $\mathcal{L}$ is finite. Also, again by Theorem 4.2.2, the set $\mathcal{U}$ is bounded below by any lower sum for f . Hence, inf $\mathcal{U}$ is finite. Finally, since L(f) ≤ U(f,π) and U(f) ≥ L(f,π) for all π, by definition of the infimum and supremum of a set of numbers, we must have L(f) ≤ U(f).

Definition 4.2.2 (Darboux Lower And Upper Integrals). Let f be in B[a, b]. The Lower Darboux Integral of f is defined to be the finite number L(f) = sup $\mathcal{L}$, and the Upper Darboux Integral of f is the finite number U(f) = inf $\mathcal{U}$.

We can then define what is meant by a bounded function being Darboux Integrable on [a, b].



Definition 4.2.3 (Darboux Integrability). Let f be in B[a, b]. We say f is Darboux Integrable on [a, b] if L(f) = U(f). The common value is then called the Darboux Integral of f on [a, b] and is denoted by the symbol DI(f ; a, b).

Comment 4.2.2. Not all bounded functions are Darboux Integrable. Consider the function f : [0, 1] → ℜ defined by

\[ f(t) = \begin{cases} 1, & t \in [0,1] \text{ and is rational} \\ -1, & t \in [0,1] \text{ and is irrational} \end{cases} \]

You should be able to see that for any partition of [0, 1], the infimum of f on any subinterval is always −1 as any subinterval contains irrational numbers. Similarly, any subinterval contains rational numbers and so the supremum of f on a subinterval is 1. Thus U(f,π) = 1 and L(f,π) = −1 for any partition π of [0, 1]. It follows that L(f) = −1 and U(f) = 1. Thus, f is bounded but not Darboux Integrable.

Definition 4.2.4 (Riemann’s Criterion for Integrability). Let f ∈ B[a, b]. We say that Riemann’s Criterion holds for f if for every positive ε there exists a π0 ∈ Π[a, b] such that U(f,π) − L(f,π) < ε for any refinement, π, of π0.

Theorem 4.2.4 (The Riemann Integral Equivalence Theorem). Let f ∈ B[a, b]. Then the following are equivalent.

(i) f ∈ RI[a, b].

(ii) f satisfies Riemann’s Criterion.

(iii) f is Darboux Integrable, i.e., L(f) = U(f), and RI(f ; a, b) = DI(f ; a, b).

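Proof 4.2.8. (i) ⇒ (ii)

Proof . Assume f ∈ RI[a, b], and let ε > 0 be given. Let IR be the Riemann integral of f over [a, b]. Choose π0 ∈ Π[a, b] such that | S(f,π,σ) − IR | < ε/3 for any refinement, π, of π0 and any σ ⊂ π. Let π be any such refinement, denoted by π = {x0 = a, x1, . . . , xp = b}, and let mj , Mj be defined as usual. Using the Infimum and Supremum Tolerance Lemmas, we can conclude that, for each j = 1, . . . , p, there exist sj , tj ∈ [xj−1, xj ] such that

\[ M_j - \frac{\varepsilon}{6(b - a)} < f(s_j) \le M_j, \qquad m_j \le f(t_j) < m_j + \frac{\varepsilon}{6(b - a)}. \]

It follows that

\[ f(s_j) - f(t_j) > M_j - \frac{\varepsilon}{6(b - a)} - m_j - \frac{\varepsilon}{6(b - a)}. \]

Thus, we have

\[ M_j - m_j - \frac{\varepsilon}{3(b - a)} < f(s_j) - f(t_j). \]

Multiply this inequality by ∆xj to obtain

\[ (M_j - m_j) \Delta x_j - \frac{\varepsilon}{3(b - a)} \Delta x_j < \left( f(s_j) - f(t_j) \right) \Delta x_j . \]

Now, sum over π to obtain

\[ U(f,\pi) - L(f,\pi) = \sum_{\pi} (M_j - m_j) \Delta x_j < \frac{\varepsilon}{3(b - a)} \sum_{\pi} \Delta x_j + \sum_{\pi} \left( f(s_j) - f(t_j) \right) \Delta x_j . \]

This simplifies to

\[ \sum_{\pi} (M_j - m_j) \Delta x_j - \frac{\varepsilon}{3} < \sum_{\pi} \left( f(s_j) - f(t_j) \right) \Delta x_j . \quad (*) \]

Now, we have

\begin{align*}
\left| \sum_{\pi} \left( f(s_j) - f(t_j) \right) \Delta x_j \right| &= \left| \sum_{\pi} f(s_j) \Delta x_j - \sum_{\pi} f(t_j) \Delta x_j \right| \\
&= \left| \sum_{\pi} f(s_j) \Delta x_j - I_R + I_R - \sum_{\pi} f(t_j) \Delta x_j \right| \\
&\le \left| \sum_{\pi} f(s_j) \Delta x_j - I_R \right| + \left| \sum_{\pi} f(t_j) \Delta x_j - I_R \right| \\
&= | S(f,\pi,\sigma_s) - I_R | + | S(f,\pi,\sigma_t) - I_R |,
\end{align*}

where σs = {s1, . . . , sp} and σt = {t1, . . . , tp} are evaluation sets of π. Now, by our choice of partition π, we know

\[ | S(f,\pi,\sigma_s) - I_R | < \frac{\varepsilon}{3}, \qquad | S(f,\pi,\sigma_t) - I_R | < \frac{\varepsilon}{3}. \]

Thus, we can conclude that

\[ \left| \sum_{\pi} \left( f(s_j) - f(t_j) \right) \Delta x_j \right| < \frac{2\varepsilon}{3}. \]

Applying this to the inequality in Equation ∗, we obtain

\[ \sum_{\pi} (M_j - m_j) \Delta x_j < \varepsilon. \]

Now, π was an arbitrary refinement of π0, and ε > 0 was also arbitrary. So this shows that f satisfies Riemann’s criterion.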

(ii)⇒ (iii)

Proof . Now, assume that f satisfies Riemann’s criterion, and let ε > 0 be given. Then there is a partition,

π0 ∈ Π[a, b] such that U(f,π)− L(f,π) < ε for any refinement, π, of π0. Thus, by the definition of the

upper and lower Darboux integrals, we have

U(f) ≤ U(f,π) < L(f,π) + ε ≤ L(f) + ε.

Since ε is arbitrary, this shows that U(f) ≤ L(f). The reverse inequality has already been established.

Thus, we see that U(f) = L(f).

(iii)⇒ (i)

Proof . Finally, assume f is Darboux integrable, which means L(f) = U(f). Let ID denote the value of the

Darboux integral. We will show that f is also Riemann integrable according to the definition and that the

value of the integral is ID.

Let ε > 0 be given. Now, recall that

\[ I_D = L(f) = \sup_{\pi} L(f,\pi) = U(f) = \inf_{\pi} U(f,\pi). \]

Hence, by the Supremum Tolerance Lemma, there exists π1 ∈ Π[a, b] such that

ID − ε = L(f)− ε < L(f,π1) ≤ L(f) = ID

and by the Infimum Tolerance Lemma, there exists π2 ∈ Π[a, b] such that

ID = U(f) ≤ U(f,π2) < U(f) + ε = ID + ε.

Let π0 = π1 ∨ π2 be the common refinement of π1 and π2. Now, let π be any refinement of π0, and let

σ ⊂ π be any evaluation set. Then we have

ID − ε < L(f,π1) ≤ L(f,π0) ≤ L(f,π) ≤ S(f,π,σ) ≤ U(f,π) ≤ U(f,π0) ≤ U(f,π2) < ID + ε.


Thus, it follows that

ID − ε < S(f,π,σ) < ID + ε.

Since the tolerance ε > 0 was arbitrary, we have shown that for every ε > 0 there is a partition π0 such that, for any refinement, π, of π0 and any evaluation set σ ⊂ π, we have

| S(f,π,σ) − ID | < ε.

This shows that f is Riemann Integrable and the value of the integral is ID.

Comment 4.2.3. By Theorem 4.2.4, we now know that the Darboux and Riemann integrals are equivalent. Hence, it is no longer necessary to use a different notation for these two different approaches to what we call integration. From now on, we will use this notation

\[ RI(f;a,b) \equiv DI(f;a,b) \equiv \int f(t) \, dt \]

where the (t) in the new integration symbol refers to the name we wish to use for the independent variable

and dt is a mnemonic to remind us that the || π || is approaching zero as we choose progressively finer

partitions of [a, b]. This is, of course, not very rigorous notation. A better notation would be

RI(f ; a, b) ≡ DI(f ; a, b) ≡ I(f ; a, b)

where the symbol I denotes that we are interested in computing the integral of f using the equivalent

approach of Riemann or Darboux. Indeed, the notation I(f ; a, b) does not require the uncomfortable lack

of rigor that the symbol dt implies. However, for historical reasons, the symbol ∫ f(t) dt will be used.

Also, the use of the ∫ f(t) dt allows us to very efficiently apply the integration techniques of substitution

and so forth as we have shown in Chapter 2.

4.3 Properties Of The Riemann Integral

We can now prove a series of properties of the Riemann integral.



Theorem 4.3.1 (Properties Of The Riemann Integral). Let f, g ∈ RI[a, b]. Then

(i) | f | ∈ RI[a, b];

(ii) \[ \left| \int_a^b f(x) \, dx \right| \le \int_a^b | f | \, dx; \]

(iii) f+ = max{f, 0} ∈ RI[a, b];

(iv) f− = max{−f, 0} ∈ RI[a, b];

(v) \begin{align*}
\int_a^b f(x) \, dx &= \int_a^b [f^+(x) - f^-(x)] \, dx = \int_a^b f^+(x) \, dx - \int_a^b f^-(x) \, dx, \\
\int_a^b | f(x) | \, dx &= \int_a^b [f^+(x) + f^-(x)] \, dx = \int_a^b f^+(x) \, dx + \int_a^b f^-(x) \, dx;
\end{align*}

(vi) f^2 ∈ RI[a, b];

(vii) fg ∈ RI[a, b];

(viii) If there exist m, M such that 0 < m ≤ | f | ≤ M , then 1/f ∈ RI[a, b].

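Proof 4.3.1. (i)

Proof . Note that, given a partition π = {x0 = a, x1, . . . , xp = b}, for each j = 1, . . . , p the supremum over ordered pairs can be computed as an iterated supremum in either order:

\begin{align*}
\sup_{x,y \in [x_{j-1},x_j]} (f(x) - f(y)) &= \sup_{y \in [x_{j-1},x_j]} \ \sup_{x \in [x_{j-1},x_j]} (f(x) - f(y)) \\
&= \sup_{x \in [x_{j-1},x_j]} \ \sup_{y \in [x_{j-1},x_j]} (f(x) - f(y)).
\end{align*}

Thus,

\begin{align*}
\sup_{x,y \in [x_{j-1},x_j]} (f(x) - f(y)) &= \sup_{y \in [x_{j-1},x_j]} \ \sup_{x \in [x_{j-1},x_j]} (f(x) - f(y)) \\
&= \sup_{y \in [x_{j-1},x_j]} (M_j - f(y)) \\
&= M_j + \sup_{y \in [x_{j-1},x_j]} (-f(y)) \\
&= M_j - \inf_{y \in [x_{j-1},x_j]} f(y) \\
&= M_j - m_j .
\end{align*}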

Now, let m′j and M′j be defined by

\[ m'_j = \inf_{[x_{j-1},x_j]} | f(x) |, \qquad M'_j = \sup_{[x_{j-1},x_j]} | f(x) | . \]

Then, arguing as we did earlier, we find

\[ M'_j - m'_j = \sup_{x,y \in [x_{j-1},x_j]} \left( | f(x) | - | f(y) | \right). \]

Claim: $\sup_{x,y} | f(x) - f(y) | = M_j - m_j$, where the supremum is over x, y ∈ [xj−1, xj ].

To see this is true, note

\[ | f(x) - f(y) | = \begin{cases} f(x) - f(y), & f(x) \ge f(y) \\ f(y) - f(x), & f(x) < f(y) \end{cases} \]

In either case, we have | f(x) − f(y) | ≤ Mj − mj for all x, y, implying that $\sup_{x,y} | f(x) - f(y) | \le M_j - m_j$.

To see the reverse inequality holds, we first note that if Mj = mj , the reverse inequality holds trivially as $\sup_{x,y} | f(x) - f(y) | \ge 0 = M_j - m_j$. Hence, we may assume without loss of generality that the gap Mj − mj is positive.

Then, given 0 < ε < (1/2)(Mj − mj), there exist s, t ∈ [xj−1, xj ] such that Mj − ε/2 < f(s) and mj + ε/2 > f(t), so that f(s) − f(t) > Mj − mj − ε. By our choice of ε, these terms are positive and so we also have | f(s) − f(t) | > Mj − mj − ε. It follows that

\[ \sup_{x,y \in [x_{j-1},x_j]} | f(x) - f(y) | \ge | f(s) - f(t) | > M_j - m_j - \varepsilon. \]

Since we can make ε arbitrarily small, this implies that

\[ \sup_{x,y \in [x_{j-1},x_j]} | f(x) - f(y) | \ge M_j - m_j . \]

This establishes the reverse inequality and proves the claim ♦.

Thus, for each j = 1, . . . , p, we have

\[ M_j - m_j = \sup_{x,y \in [x_{j-1},x_j]} | f(x) - f(y) | . \]


So, since | f(x) | − | f(y) | ≤ | f(x) − f(y) | for all x, y, it follows that M′j − m′j ≤ Mj − mj , implying that $\sum_{\pi} (M'_j - m'_j) \Delta x_j \le \sum_{\pi} (M_j - m_j) \Delta x_j$. Since f is integrable by hypothesis, Theorem 4.2.4 tells us f satisfies the Riemann criterion, and this inequality then shows the Riemann criterion must also hold for | f |. Hence, | f | is Riemann integrable.

The other results now follow easily.

(ii)

Proof . We have f ≤ | f | and f ≥ − | f |, so that

\[ \int_a^b f(x) \, dx \le \int_a^b | f(x) | \, dx, \qquad \int_a^b f(x) \, dx \ge - \int_a^b | f(x) | \, dx, \]

from which it follows that

\[ - \int_a^b | f(x) | \, dx \le \int_a^b f(x) \, dx \le \int_a^b | f(x) | \, dx \]

and so

\[ \left| \int_a^b f \right| \le \int_a^b | f | . \]

(iii) and (iv)

Proof . This follows from the facts that f+ = (1/2)(| f | + f) and f− = (1/2)(| f | − f) and the Riemann integral is a linear mapping.

(v)

Proof . This follows from the facts that f = f+ − f− and | f | = f+ + f− and the linearity of the integral.

(vi)

Proof . Note that, since f is bounded, there exists K > 0 such that | f(x) | ≤ K for all x ∈ [a, b]. Consequently, for all x, y ∈ [a, b], we have | (f(x))^2 − (f(y))^2 | ≤ 2K | f(x) − f(y) |. Thus, the integrability of f and the Riemann criterion imply that f^2 is integrable.
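The passage from the pointwise bound to the Riemann criterion for f^2 can be spelled out as follows. In this sketch, $M''_j$ and $m''_j$ are notation introduced only here for the supremum and infimum of f^2 on [xj−1, xj ]; the second line applies the Claim from the proof of (i) to both f^2 and f.

```latex
\begin{align*}
| f(x)^2 - f(y)^2 | &= | f(x) + f(y) | \, | f(x) - f(y) | \le 2K \, | f(x) - f(y) |, \\
M''_j - m''_j &= \sup_{x,y \in [x_{j-1},x_j]} | f(x)^2 - f(y)^2 |
  \le 2K \sup_{x,y \in [x_{j-1},x_j]} | f(x) - f(y) | = 2K (M_j - m_j), \\
U(f^2,\pi) - L(f^2,\pi) &= \sum_{\pi} (M''_j - m''_j) \Delta x_j
  \le 2K \sum_{\pi} (M_j - m_j) \Delta x_j = 2K \left( U(f,\pi) - L(f,\pi) \right).
\end{align*}
```

So if π0 is chosen with U(f,π) − L(f,π) < ε/(2K) for every refinement π of π0, the same π0 witnesses the Riemann criterion for f^2 with tolerance ε.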

(vii)

Proof . To prove that fg is integrable when f and g are, simply note that

\[ fg = \frac{1}{2} \left( (f + g)^2 - f^2 - g^2 \right). \]

Property (vi) and the linearity of the integral then imply fg is integrable.


(viii)

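Proof . Suppose f ∈ RI[a, b] and there exist M, m > 0 such that m ≤ | f(x) | ≤ M for all x ∈ [a, b]. Note that

\[ \frac{1}{f(x)} - \frac{1}{f(y)} = \frac{f(y) - f(x)}{f(x) f(y)}. \]

Let π = {x0 = a, x1, . . . , xp = b} be a partition of [a, b], and define

\[ M'_j = \sup_{[x_{j-1},x_j]} \frac{1}{f(x)}, \qquad m'_j = \inf_{[x_{j-1},x_j]} \frac{1}{f(x)}. \]

Then we have

\begin{align*}
M'_j - m'_j &= \sup_{x,y \in [x_{j-1},x_j]} \frac{f(y) - f(x)}{f(x) f(y)} \\
&\le \sup_{x,y \in [x_{j-1},x_j]} \frac{| f(y) - f(x) |}{| f(x) | \, | f(y) |} \\
&\le \frac{1}{m^2} \sup_{x,y \in [x_{j-1},x_j]} | f(y) - f(x) | \\
&\le \frac{M_j - m_j}{m^2}.
\end{align*}

Since f ∈ RI[a, b], given ε > 0 there is a partition π0 such that U(f,π) − L(f,π) < m^2 ε for any refinement, π, of π0. Hence, the previous inequality implies that, for any such refinement, we have

\begin{align*}
U\!\left( \frac{1}{f}, \pi \right) - L\!\left( \frac{1}{f}, \pi \right) &= \sum_{\pi} (M'_j - m'_j) \Delta x_j \\
&\le \frac{1}{m^2} \sum_{\pi} (M_j - m_j) \Delta x_j \\
&= \frac{1}{m^2} \left( U(f,\pi) - L(f,\pi) \right) < \frac{m^2 \varepsilon}{m^2} = \varepsilon.
\end{align*}

Thus 1/f satisfies the Riemann Criterion and hence it is integrable.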


4.4 What Functions Are Riemann Integrable?

Now we need to show that the set RI[a, b] is nonempty. We begin by showing that all continuous functions

on [a, b] will be Riemann Integrable.

Theorem 4.4.1 (Continuous Implies Riemann Integrable). If f ∈ C[a, b], then f ∈ RI[a, b].

Proof 4.4.1. Since f is continuous on a compact set, it is uniformly continuous. Hence, given ε > 0, there is a δ > 0 such that x, y ∈ [a, b], | x − y | < δ ⇒ | f(x) − f(y) | < ε/(b − a). Let π0 be a partition such that || π0 || < δ, and let π = {x0 = a, x1, . . . , xp = b} be any refinement of π0. Then π also satisfies || π || < δ. Since f is continuous on each subinterval [xj−1, xj ], f attains its supremum, Mj , and infimum, mj , at points sj and tj , respectively. That is, f(sj) = Mj and f(tj) = mj for each j = 1, . . . , p. Thus, the uniform continuity of f on each subinterval implies that, for each j,

\[ M_j - m_j = | f(s_j) - f(t_j) | < \frac{\varepsilon}{b - a}. \]

Thus, we have

\[ U(f,\pi) - L(f,\pi) = \sum_{\pi} (M_j - m_j) \Delta x_j < \frac{\varepsilon}{b - a} \sum_{\pi} \Delta x_j = \varepsilon. \]

Since π was an arbitrary refinement of π0, it follows that f satisfies Riemann’s criterion. Hence, f ∈ RI[a, b].

Theorem 4.4.2 (Constant Functions Are Riemann Integrable). If f : [a, b] → ℝ is a constant function, f(t) = c for all t in [a, b], then f is Riemann Integrable on [a, b] and $\int_a^b f(t)\,dt = c(b-a)$.

Proof 4.4.2. For any partition π of [a, b], since f is a constant, all the individual $m_j$'s and $M_j$'s associated with π take on the value c. Hence, U(f,π) − L(f,π) = 0 always. It follows immediately that f satisfies the Riemann Criterion and hence is Riemann Integrable. Finally, since f is integrable, by Theorem 4.1.2, we have

\[ c(b-a) \le RI(f; a, b) \le c(b-a). \]

Thus, $\int_a^b f(t)\,dt = c(b-a)$.

Theorem 4.4.3 (Monotone Implies Riemann Integrable). If f is monotone on [a, b], then f ∈ RI[a, b].

Proof 4.4.3. As usual, for concreteness, we assume that f is monotone increasing. We also assume f(b) > f(a), for if not, then f is constant and must be integrable by Theorem 4.4.2. Let ε > 0 be given, and let π0 be a partition of [a, b] such that ||π0|| < ε/(f(b) − f(a)). Let π = {x0 = a, x1, . . . , xp = b} be any refinement of π0. Then π also satisfies ||π|| < ε/(f(b) − f(a)). Thus, for each j = 1, . . . , p, we have

\[ \Delta x_j < \frac{\varepsilon}{f(b) - f(a)}. \]

Since f is increasing, we also know that $M_j = f(x_j)$ and $m_j = f(x_{j-1})$ for each j. Hence,

\[
\begin{aligned}
U(f,\pi) - L(f,\pi) &= \sum_\pi (M_j - m_j)\,\Delta x_j \\
&= \sum_\pi [f(x_j) - f(x_{j-1})]\,\Delta x_j \\
&< \frac{\varepsilon}{f(b) - f(a)} \sum_\pi [f(x_j) - f(x_{j-1})].
\end{aligned}
\]

But this last sum is telescoping and sums to f(b) − f(a). So, we have

\[ U(f,\pi) - L(f,\pi) < \frac{\varepsilon}{f(b) - f(a)}\,(f(b) - f(a)) = \varepsilon. \]

Thus, f satisfies Riemann's criterion.
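The telescoping estimate in this proof can be checked numerically. The sketch below is only an illustration, not part of the proof: it assumes a uniform partition and uses the sample function f(x) = x^3 on [0, 2] (both choices, and the helper name `upper_lower_gap`, are ours). For an increasing function on a uniform partition the gap U − L should equal (f(b) − f(a))(b − a)/n exactly.

```python
from fractions import Fraction


def upper_lower_gap(f, a, b, n):
    """Return U(f, pi) - L(f, pi) for an increasing f on the uniform
    partition of [a, b] with n subintervals.

    For increasing f we have M_j = f(x_j) and m_j = f(x_{j-1}), so no
    search for suprema or infima is needed.
    """
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    points = [a + j * dx for j in range(n + 1)]
    return sum((f(points[j]) - f(points[j - 1])) * dx for j in range(1, n + 1))


def cube(x):
    # Sample increasing function (an assumption made for illustration).
    return x ** 3


# The sum telescopes, so the gap is (f(2) - f(0)) * (2 - 0) / n = 16 / n.
gaps = {n: upper_lower_gap(cube, 0, 2, n) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(n, gap)
```

Exact rational arithmetic is used so that the telescoping identity holds without rounding error.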

Theorem 4.4.4 (Bounded Variation Implies Riemann Integrable). If f ∈ BV[a, b], then f ∈ RI[a, b].

Proof 4.4.4. Since f is of bounded variation, there are functions u and v, defined on [a, b] and both

monotone increasing, such that f = u − v. Hence, by the linearity of the integral and the previous

theorem, f ∈ RI[a, b].

4.5 Further Properties of the Riemann Integral

We first want to establish the familiar summation property of the Riemann integral over an interval [a, b] =[a, c] ∪ [c, b]. Most of the technical work for this result is done in the following Lemma.



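Lemma 4.5.1 (The Upper And Lower Darboux Integral Is Additive On Intervals). Let f ∈ B[a, b] and let c ∈ (a, b). Let

\[ \underline{\int_a^b} f(x)\,dx = L(f) \quad\text{and}\quad \overline{\int_a^b} f(x)\,dx = U(f) \]

denote the lower and upper Darboux integrals of f on [a, b], respectively. Then we have

\[
\begin{aligned}
\underline{\int_a^b} f(x)\,dx &= \underline{\int_a^c} f(x)\,dx + \underline{\int_c^b} f(x)\,dx, \\
\overline{\int_a^b} f(x)\,dx &= \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx.
\end{aligned}
\]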

Proof 4.5.1. We prove the result for the upper integrals as the lower integral case is similar. Let π ∈ Π[a, b] be given by π = {x0 = a, x1, . . . , xp = b}. We first assume that c is a partition point of π. Thus, there is some index 1 ≤ k0 ≤ p − 1 such that $x_{k_0} = c$. For any interval [α, β], let $U_\alpha^\beta(f,\pi)$ denote the upper sum of f for the partition π over [α, β]. Now, we can rewrite π as $\pi = \{x_0, x_1, \ldots, x_{k_0}\} \cup \{x_{k_0}, x_{k_0+1}, \ldots, x_p\}$. Let $\pi_1 = \{x_0, \ldots, x_{k_0}\}$ and $\pi_2 = \{x_{k_0}, \ldots, x_p\}$. Then π1 ∈ Π[a, c], π2 ∈ Π[c, b], and

\[ U_a^b(f,\pi) = U_a^c(f,\pi_1) + U_c^b(f,\pi_2) \ge \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx, \]

by the definition of the upper integral as the infimum of the upper sums. Now, if c is not in π, then we can refine π by adding c, obtaining the partition $\pi' = \{x_0, x_1, \ldots, x_{k_0}, c, x_{k_0+1}, \ldots, x_p\}$. Splitting up π′ at c as we did before into π1 and π2, we see that π′ = π1 ∨ π2 where $\pi_1 = \{x_0, \ldots, x_{k_0}, c\}$ and $\pi_2 = \{c, x_{k_0+1}, \ldots, x_p\}$. Thus, by our properties of upper sums, we see that

\[ U_a^b(f,\pi) \ge U_a^b(f,\pi') = U_a^c(f,\pi_1) + U_c^b(f,\pi_2) \ge \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx. \]

Combining both cases, we can conclude that for any partition π ∈ Π[a, b], we have

\[ U_a^b(f,\pi) \ge \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx, \]

which implies that

\[ \overline{\int_a^b} f(x)\,dx \ge \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx. \]


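Now we want to show the reverse inequality. Let ε > 0 be given. By the definition of the upper integral, there exist π1 ∈ Π[a, c] and π2 ∈ Π[c, b] such that

\[
\begin{aligned}
U_a^c(f,\pi_1) &< \overline{\int_a^c} f(x)\,dx + \frac{\varepsilon}{2}, \\
U_c^b(f,\pi_2) &< \overline{\int_c^b} f(x)\,dx + \frac{\varepsilon}{2}.
\end{aligned}
\]

Let π = π1 ∪ π2 ∈ Π[a, b]. It follows that

\[ U_a^b(f,\pi) = U_a^c(f,\pi_1) + U_c^b(f,\pi_2) < \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx + \varepsilon. \]

But, by definition, we have

\[ \overline{\int_a^b} f(x)\,dx \le U_a^b(f,\pi) \]

for all π. Hence, we see that

\[ \overline{\int_a^b} f(x)\,dx < \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx + \varepsilon. \]

Since ε was arbitrary, this proves the reverse inequality we wanted. We can conclude, then, that

\[ \overline{\int_a^b} f(x)\,dx = \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx. \]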

Theorem 4.5.2 (The Riemann Integral Exists On Subintervals). If f ∈ RI[a, b] and c ∈ (a, b), then f ∈ RI[a, c] and f ∈ RI[c, b].

Proof 4.5.2. Let ε > 0 be given. Then there is a partition π0 ∈ Π[a, b] such that $U_a^b(f,\pi) - L_a^b(f,\pi) < \varepsilon$ for any refinement, π, of π0. Let π0 be given by π0 = {x0 = a, x1, . . . , xp = b}. Define π′0 = π0 ∪ {c}, so there is some index k0 such that $x_{k_0} \le c \le x_{k_0+1}$. Let $\pi_1 = \{x_0, \ldots, x_{k_0}, c\}$ and $\pi_2 = \{c, x_{k_0+1}, \ldots, x_p\}$. Then π1 ∈ Π[a, c] and π2 ∈ Π[c, b]. Let π′1 be a refinement of π1. Then π′1 ∪ π2 is a refinement of π0, and it follows that

\[
\begin{aligned}
U_a^c(f,\pi'_1) - L_a^c(f,\pi'_1) &= \sum_{\pi'_1} (M_j - m_j)\,\Delta x_j \\
&\le \sum_{\pi'_1 \cup \pi_2} (M_j - m_j)\,\Delta x_j \\
&\le U_a^b(f,\pi'_1 \cup \pi_2) - L_a^b(f,\pi'_1 \cup \pi_2).
\end{aligned}
\]

But, since π′1 ∪ π2 refines π0, we have

\[ U_a^b(f,\pi'_1 \cup \pi_2) - L_a^b(f,\pi'_1 \cup \pi_2) < \varepsilon, \]

implying that

\[ U_a^c(f,\pi'_1) - L_a^c(f,\pi'_1) < \varepsilon \]

for all refinements, π′1, of π1. Thus, f satisfies Riemann's criterion on [a, c], and f ∈ RI[a, c]. The proof on [c, b] is done in exactly the same way.

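Theorem 4.5.3 (The Riemann Integral Is Additive On Subintervals). If f ∈ RI[a, b] and c ∈ (a, b), then

\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]

Proof 4.5.3. Since f ∈ RI[a, b], we know that

\[ \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx. \]

Further, we also know that f ∈ RI[a, c] and f ∈ RI[c, b] for any c ∈ (a, b). Thus,

\[
\begin{aligned}
\underline{\int_a^c} f(x)\,dx &= \overline{\int_a^c} f(x)\,dx, \\
\underline{\int_c^b} f(x)\,dx &= \overline{\int_c^b} f(x)\,dx.
\end{aligned}
\]

So, applying Lemma 4.5.1, we conclude that, for any c ∈ (a, b),

\[ \int_a^b f(x)\,dx = \overline{\int_a^b} f(x)\,dx = \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]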

4.6 The Fundamental Theorem Of Calculus

The next result is the well-known Fundamental Theorem of Calculus.


Theorem 4.6.1 (The Fundamental Theorem Of Calculus). Let f ∈ RI[a, b]. Define F : [a, b] → ℝ by

\[ F(x) = \int_a^x f(t)\,dt. \]

Then

(i) F ∈ BV [a, b];

(ii) F ∈ C[a, b];

(iii) if f is continuous at c ∈ [a, b], then F is differentiable at c and F ′(c) = f(c).

Proof 4.6.1. First, note that f ∈ RI[a, b] ⇒ f ∈ RI[a, x] for all x ∈ [a, b], by our previous results. Hence, F is well-defined. We will prove the results in order.

(i)

Proof. Let π ∈ Π[a, b] be given by π = {x0 = a, x1, . . . , xp = b}. Then the fact that $f \in RI[a, x_j]$ implies that $f \in RI[x_{j-1}, x_j]$ for each j = 1, . . . , p. Thus, we have

\[ m_j\,\Delta x_j \le \int_{x_{j-1}}^{x_j} f(t)\,dt \le M_j\,\Delta x_j. \]

This implies that, for each j, we have

\[ \left| \int_{x_{j-1}}^{x_j} f(t)\,dt \right| \le \|f\|_\infty\,\Delta x_j. \]

Thus,

\[
\begin{aligned}
|\Delta F_j| &= |F(x_j) - F(x_{j-1})| \\
&= \left| \int_a^{x_j} f(t)\,dt - \int_a^{x_{j-1}} f(t)\,dt \right| \\
&= \left| \int_{x_{j-1}}^{x_j} f(t)\,dt \right| \\
&\le \|f\|_\infty\,\Delta x_j.
\end{aligned}
\]

Summing over π, we obtain

\[ \sum_\pi |\Delta F_j| \le \|f\|_\infty \sum_\pi \Delta x_j = (b-a)\,\|f\|_\infty < \infty. \]

Since the partition π was arbitrary, we conclude that F ∈ BV[a, b].

(ii)

Proof. Now, let x, y ∈ [a, b] be such that x < y. Then

\[ \inf_{[x,y]} f(t)\,(y - x) \le \int_x^y f(t)\,dt \le \sup_{[x,y]} f(t)\,(y - x), \]

which implies that

\[ |F(y) - F(x)| = \left| \int_x^y f(t)\,dt \right| \le \|f\|_\infty\,(y - x). \]

A similar argument shows that if y, x ∈ [a, b] satisfy y < x, then

\[ |F(y) - F(x)| = \left| \int_x^y f(t)\,dt \right| \le \|f\|_\infty\,(x - y). \]

Let ε > 0 be given. Then if

\[ |x - y| < \frac{\varepsilon}{\|f\|_\infty + 1}, \]

we have

\[ |F(y) - F(x)| \le \|f\|_\infty\,|y - x| < \frac{\|f\|_\infty}{\|f\|_\infty + 1}\,\varepsilon < \varepsilon. \]

Thus, F is continuous at x and, consequently, on [a, b].

(iii)

Proof. Finally, assume f is continuous at c ∈ [a, b], and let ε > 0 be given. Then there exists δ > 0 such that x ∈ (c − δ, c + δ) ∩ [a, b] implies |f(x) − f(c)| < ε/2. Pick h ∈ ℝ such that 0 < |h| < δ and c + h ∈ [a, b]. Let's assume, for concreteness, that h > 0. Define

\[ m = \inf_{[c,c+h]} f(t) \quad\text{and}\quad M = \sup_{[c,c+h]} f(t). \]

If c ≤ x ≤ c + h, then we have x ∈ (c − δ, c + δ) ∩ [a, b] and −ε/2 < f(x) − f(c) < ε/2. That is,

\[ f(c) - \frac{\varepsilon}{2} < f(x) < f(c) + \frac{\varepsilon}{2} \quad \forall\, x \in [c, c+h]. \]

Hence, m ≥ f(c) − ε/2 and M ≤ f(c) + ε/2. Now, we also know that

\[ m h \le \int_c^{c+h} f(t)\,dt \le M h. \]

Thus, we have

\[ \frac{F(c+h) - F(c)}{h} = \frac{\int_a^{c+h} f(t)\,dt - \int_a^c f(t)\,dt}{h} = \frac{\int_c^{c+h} f(t)\,dt}{h}. \]

Combining inequalities, we find

\[ f(c) - \frac{\varepsilon}{2} \le m \le \frac{F(c+h) - F(c)}{h} \le M \le f(c) + \frac{\varepsilon}{2}, \]

yielding

\[ \left| \frac{F(c+h) - F(c)}{h} - f(c) \right| \le \frac{\varepsilon}{2} < \varepsilon \]

whenever 0 < h < δ and c + h ∈ [a, b].

The case where h < 0 is handled in exactly the same way. Thus, since ε was arbitrary, this shows that F is differentiable at c and F′(c) = f(c). Note that if c = a or c = b, we need only consider the definition of the derivative from one side.
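Part (iii) can be illustrated numerically. The following is a minimal sketch, not a proof: it assumes f = cos and c = 1 as sample data, and it approximates the integral by a midpoint Riemann sum (the helper names `riemann_integral` and `difference_quotient` are ours). The difference quotients of F should approach f(c) as h shrinks.

```python
import math


def riemann_integral(f, a, b, n=2000):
    """Midpoint Riemann sum of f over [a, b] with n uniform subintervals.

    Works for b < a as well, since dx is then negative.
    """
    dx = (b - a) / n
    return sum(f(a + (j + 0.5) * dx) for j in range(n)) * dx


def difference_quotient(f, c, h):
    # (F(c + h) - F(c)) / h equals (1/h) times the integral of f over [c, c + h].
    return riemann_integral(f, c, c + h) / h


c = 1.0
quotients = {h: difference_quotient(math.cos, c, h) for h in (0.1, 0.01, 0.001)}
for h, q in quotients.items():
    print(h, q, abs(q - math.cos(c)))
```

The printed errors shrink roughly in proportion to h, which is what the squeeze between m and M in the proof predicts for a smooth integrand.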

Comment 4.6.1. We call F (x) the indefinite integral of f . F is always better behaved than f , since

integration is a smoothing operation. We can see that f need not be continuous, but, as long as it is

integrable, F is always continuous.

The next result is one of the many mean value theorems in the theory of integration. It is a more

general form of the standard mean value theorem given in beginning calculus classes.

Theorem 4.6.2 (The Mean Value Theorem For Riemann Integrals). Let f ∈ C[a, b], and let g ≥ 0 be integrable on [a, b]. Then there is a point, c ∈ [a, b], such that

\[ \int_a^b f(x)g(x)\,dx = f(c) \int_a^b g(x)\,dx. \]

Proof 4.6.5. Since f is continuous, it is also integrable. Hence, fg is integrable. Let m and M denote the infimum and supremum of f on [a, b], respectively. Then mg(x) ≤ f(x)g(x) ≤ Mg(x) for all x ∈ [a, b]. Since the integral preserves order, we have

\[ m \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M \int_a^b g(x)\,dx. \]

If the integral of g on [a, b] is 0, then this shows that the integral of fg will also be 0. Hence, in this case, we can choose any c ∈ [a, b] and the desired result will follow. If the integral of g is not 0, then it must be positive, since g ≥ 0. Hence, we have, in this case,

\[ m \le \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx} \le M. \]

Now, f is continuous on the compact interval [a, b], so it attains the values M and m at some points. Hence, by the intermediate value theorem, there must be some c ∈ [a, b] such that

\[ f(c) = \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx}. \]

This implies the desired result.
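As a concrete instance of the theorem, the sketch below locates the point c numerically. It is illustrative only and rests on assumptions we add: f(x) = x and g(x) = x^2 on [0, 1] are sample data, f is taken to be increasing so that bisection applies, and integrals are approximated by midpoint sums. For this data the ratio of integrals is (1/4)/(1/3) = 3/4, so c should be close to 0.75.

```python
def midpoint_integral(h, a, b, n=20000):
    """Midpoint Riemann sum of h over [a, b] with n uniform subintervals."""
    dx = (b - a) / n
    return sum(h(a + (j + 0.5) * dx) for j in range(n)) * dx


def mean_value_point(f, g, a, b, tol=1e-12):
    """Find c in [a, b] with f(c) * int(g) = int(f * g).

    Assumes f is continuous and increasing on [a, b] and int(g) > 0,
    so that plain bisection on f(c) = target is valid.
    """
    target = midpoint_integral(lambda x: f(x) * g(x), a, b) / midpoint_integral(g, a, b)
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


c = mean_value_point(lambda x: x, lambda x: x * x, 0.0, 1.0)
print(c)
```

Bisection is used here only because the sample f is monotone; for a general continuous f the intermediate value theorem still guarantees c exists, but a different root finder would be needed.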

The next result is another standard mean value theorem from basic calculus. It is a direct consequence

of the previous theorem, by simply letting g(x) = 1 for all x ∈ [a, b]. This result can be interpreted as

stating that integration is an averaging process.

Theorem 4.6.3 (Average Value For Riemann Integrals). If f ∈ C[a, b], then there is a point c ∈ [a, b] such that

\[ \frac{1}{b-a} \int_a^b f(x)\,dx = f(c). \]

The next result is the standard means for calculating definite integrals in basic calculus. We start with

a definition.

Definition 4.6.1 (The Antiderivative of f). Let f : [a, b] → ℝ be a bounded function. Let G : [a, b] → ℝ be such that G′ exists on [a, b] and

G′(x) = f(x) for all x ∈ [a, b]. Such a function is called an antiderivative or a primitive of f .

Comment 4.6.2. The idea of an antiderivative is intellectually distinct from the Riemann integral of a

bounded function f . Consider the following function f defined on [−1, 1].

\[ f(x) = \begin{cases} x^2 \sin(1/x^2), & x \ne 0,\ x \in [-1, 1] \\ 0, & x = 0 \end{cases} \]

It is easy to see that $x^2 \sin(1/x^2)$ has a removable discontinuity at 0, which the choice f(0) = 0 removes. Moreover, f is even differentiable on [−1, 1] with derivative

\[ f'(x) = \begin{cases} 2x \sin(1/x^2) - (2/x) \cos(1/x^2), & x \ne 0,\ x \in [-1, 1] \\ 0, & x = 0 \end{cases} \]

Note f′ is not bounded on [−1, 1] and hence it can not be Riemann Integrable. Now to connect this to the idea of antiderivatives, just relabel the functions. Let g be defined by

\[ g(x) = \begin{cases} 2x \sin(1/x^2) - (2/x) \cos(1/x^2), & x \ne 0,\ x \in [-1, 1] \\ 0, & x = 0 \end{cases} \]

then define G by

\[ G(x) = \begin{cases} x^2 \sin(1/x^2), & x \ne 0,\ x \in [-1, 1] \\ 0, & x = 0 \end{cases} \]


We see that G is the antiderivative of g even though g itself does not have a Riemann integral. Again, the

point is that the idea of the antiderivative of a function is intellectually distinct from that of being Riemann

integrable.

Theorem 4.6.4 (Cauchy's Fundamental Theorem). Let f : [a, b] → ℝ be integrable. Let G : [a, b] → ℝ be any antiderivative of f. Then

\[ \int_a^b f(t)\,dt = G(t)\Big|_a^b = G(b) - G(a). \]

Proof 4.6.6. Since G′ exists on [a, b], G must be continuous on [a, b]. Let ε > 0 be given. Since f is integrable, there is a partition π0 ∈ Π[a, b] such that for any refinement, π, of π0 and any evaluation set σ ⊂ π, we have

\[ \left| S(f,\pi,\sigma) - \int_a^b f(x)\,dx \right| < \varepsilon. \]

Let π be any refinement of π0, given by π = {x0 = a, x1, . . . , xp = b}. The Mean Value Theorem for differentiable functions then tells us that there is an $s_j \in (x_{j-1}, x_j)$ such that $G(x_j) - G(x_{j-1}) = G'(s_j)\,\Delta x_j$. Since G′ = f, we have $G(x_j) - G(x_{j-1}) = f(s_j)\,\Delta x_j$ for each j = 1, . . . , p. The set of points, {s1, . . . , sp}, is thus an evaluation set associated with π. Hence,

\[ \sum_\pi [G(x_j) - G(x_{j-1})] = \sum_\pi G'(s_j)\,\Delta x_j = \sum_\pi f(s_j)\,\Delta x_j. \]

The first sum on the left is a collapsing sum, hence we have

\[ G(b) - G(a) = S(f,\pi,\{s_1, \ldots, s_p\}). \]

We conclude

\[ \left| G(b) - G(a) - \int_a^b f(x)\,dx \right| < \varepsilon. \]

Since ε was arbitrary, this implies the desired result.

Comment 4.6.3. Most functions do not have closed form, or analytically obtainable, antiderivatives. So, the previous theorem is of no practical help in such cases.


Theorem 4.6.5 (The Recapture Theorem). If f is differentiable on [a, b], and if f′ ∈ RI[a, b], then

\[ \int_a^x f'(t)\,dt = f(x) - f(a). \]

Proof 4.6.7. f is an antiderivative of f′. Now apply Cauchy's Fundamental Theorem 4.6.4 on [a, x].

Another way to evaluate Riemann integrals is to directly approximate them using an appropriate se-

quence of partitions. Theorem 4.6.6 is a fundamental tool that tells us when and why such approximations

will work.

Theorem 4.6.6 (Approximation Of The Riemann Integral). If f ∈ RI[a, b], then given any sequence of partitions {πn} with any associated sequence of evaluation sets {σn} that satisfies ||πn|| → 0, we have

\[ \lim_{n\to\infty} S(f,\pi_n,\sigma_n) = \int_a^b f(x)\,dx. \]

Proof 4.6.8. Since f is integrable, given a positive ε, there is a partition π0 so that

\[ \left| S(f,\pi,\sigma) - \int_a^b f(x)\,dx \right| < \varepsilon/2, \quad \pi_0 \preceq \pi,\ \sigma \subseteq \pi. \qquad (*) \]

Let the partition π0 be {x0, x1, . . . , xP} and let ξ be defined to be the smallest $\Delta x_j$ from π0. We may assume ||f||∞ > 0, since otherwise f is identically zero and there is nothing to prove. Then since the norm of the partitions πn goes to zero, there is a positive integer N so that for all n > N

\[ \|\pi_n\| < \min\left( \xi,\ \frac{\varepsilon}{4P\,\|f\|_\infty} \right). \qquad (**) \]

Now pick any n > N and label the points of πn as {y0, y1, . . . , yQ}. We see that the points in πn are close enough together so that at most one point of π0 lies in the interior of any subinterval $[y_{j-1}, y_j]$ from πn. This follows from our choice of ξ. So the intervals of πn split into two pieces: those containing a point of π0 in their interior and those that do not. Let B be the first collection of intervals and A the second. Note there are P + 1 points in π0, two of which are the endpoints a and b, and so there are at most P subintervals in B. Now consider the common refinement πn ∨ π0. The points in the common refinement match πn except on the subintervals from B. Let $[y_{j-1}, y_j]$ be such a subinterval and let $\gamma_j$ denote the point from π0 which is in this subinterval. Let's define an evaluation set σ for this refinement πn ∨ π0 as follows.

1. If we are in the subintervals labeled A, we choose as our evaluation point the evaluation point $s_j$ from σn that is already in this subinterval. Here, the length of the subinterval will be denoted by $\delta_j(A)$, which equals $y_j - y_{j-1}$ for appropriate indices.

2. If we are in the subintervals labeled B, we have two intervals to consider as $[y_{j-1}, y_j] = [y_{j-1}, \gamma_j] \cup [\gamma_j, y_j]$. Choose the evaluation point $\gamma_j$ for both $[y_{j-1}, \gamma_j]$ and $[\gamma_j, y_j]$. Here, the length of the subintervals will be denoted by $\delta_j(B)$. Note that $\delta_j(B) = \gamma_j - y_{j-1}$ or $y_j - \gamma_j$.

Then we have

\[
\begin{aligned}
S(f,\pi_n \vee \pi_0,\sigma) &= \sum_A f(s_j)\,\delta_j(A) + \sum_B f(\gamma_j)\,\delta_j(B) \\
&= \sum_A f(s_j)(y_j - y_{j-1}) + \sum_B \bigl( f(\gamma_j)(y_j - \gamma_j) + f(\gamma_j)(\gamma_j - y_{j-1}) \bigr) \\
&= \sum_A f(s_j)(y_j - y_{j-1}) + \sum_B f(\gamma_j)(y_j - y_{j-1}).
\end{aligned}
\]

Thus, since the Riemann sums over πn and πn ∨ π0 with these choices of evaluation sets match on A, we have using Equation (∗∗) that

\[
\begin{aligned}
| S(f,\pi_n,\sigma_n) - S(f,\pi_n \vee \pi_0,\sigma) | &= \Bigl| \sum_B \bigl( f(s_j) - f(\gamma_j) \bigr)(y_j - y_{j-1}) \Bigr| \\
&\le \sum_B \bigl( |f(s_j)| + |f(\gamma_j)| \bigr)(y_j - y_{j-1}) \\
&\le P \cdot 2\,\|f\|_\infty\,\|\pi_n\| \\
&< P \cdot 2\,\|f\|_\infty\,\frac{\varepsilon}{4P\,\|f\|_\infty} = \varepsilon/2.
\end{aligned}
\]

We conclude that for our special evaluation set σ for the refinement πn ∨ π0 that

\[
\begin{aligned}
\left| S(f,\pi_n,\sigma_n) - \int_a^b f(x)\,dx \right| &= \left| S(f,\pi_n,\sigma_n) - S(f,\pi_n \vee \pi_0,\sigma) + S(f,\pi_n \vee \pi_0,\sigma) - \int_a^b f(x)\,dx \right| \\
&\le | S(f,\pi_n,\sigma_n) - S(f,\pi_n \vee \pi_0,\sigma) | + \left| S(f,\pi_n \vee \pi_0,\sigma) - \int_a^b f(x)\,dx \right| \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon
\end{aligned}
\]

using Equation (∗) as πn ∨ π0 refines π0. Since we can do this analysis for any n > N, we see we have shown the desired result.

4.6.1 Homework

Exercise 4.6.1. Let $f(x) = x^2$ on the interval [−1, 3]. Use Theorem 4.6.6 to prove that $\int_{-1}^{3} f(x)\,dx = 28/3$.

Hint 4.6.1. We know f is Riemann integrable because it is continuous and so this theorem can be applied. Use the uniform approximations $x_i = -1 + 4i/n$ for i = 0 to i = n to define partitions πn. Then using left or right hand endpoints on each subinterval to define the evaluation set σn, you can prove directly that $\int_{-1}^{3} x^2\,dx = \lim S(f,\pi_n,\sigma_n) = 28/3$. Make sure you tell me all the reasoning involved.
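Before writing the proof, it can help to see the sums in the hint converge. The sketch below is a numerical sanity check only, not a solution to the exercise: it evaluates the right-endpoint Riemann sums for the uniform partitions of the hint in exact rational arithmetic (the function name `right_endpoint_sum` is ours).

```python
from fractions import Fraction


def right_endpoint_sum(n):
    """Riemann sum of f(x) = x^2 on [-1, 3] for the uniform partition
    x_i = -1 + 4i/n, using right-hand endpoints as the evaluation set."""
    dx = Fraction(4, n)
    return sum((-1 + i * dx) ** 2 * dx for i in range(1, n + 1))


target = Fraction(28, 3)
for n in (1, 10, 100, 1000):
    s = right_endpoint_sum(n)
    print(n, s, float(s - target))
```

The printed differences shrink like 16/n, consistent with the limit 28/3 claimed in the exercise.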


Exercise 4.6.2. If f is continuous, evaluate

\[ \lim_{x\to a} \frac{x}{x-a} \int_a^x f(t)\,dt. \]

Exercise 4.6.3. Prove if f is continuous on [a, b] and $\int_a^b f(x)g(x)\,dx = 0$ for all choices of integrable g, then f is identically 0.

4.7 Substitution Type Results

Using the Fundamental Theorem of Calculus, we can derive many useful tools.

Theorem 4.7.1 (Integration By Parts). Assume u : [a, b] → ℝ and v : [a, b] → ℝ are differentiable on [a, b] and u′ and v′ are integrable. Then

\[ \int_a^x u(t)v'(t)\,dt = u(t)v(t)\Big|_a^x - \int_a^x v(t)u'(t)\,dt. \]

Proof 4.7.1. Since u and v are differentiable on [a, b], they are also continuous and hence, integrable. Now apply the product rule for differentiation to obtain

\[ (u(t)v(t))' = u'(t)v(t) + u(t)v'(t). \]

By Theorem 4.3.1, we know products of integrable functions are integrable. Also, the integral is linear. Hence, the integral of both sides of the equation above is defined. We obtain

\[ \int_a^x (u(t)v(t))'\,dt = \int_a^x u'(t)v(t)\,dt + \int_a^x u(t)v'(t)\,dt. \]

Since (uv)′ is integrable, we can apply the Recapture Theorem to see

\[ u(t)v(t)\Big|_a^x = \int_a^x u'(t)v(t)\,dt + \int_a^x u(t)v'(t)\,dt. \]

This is the desired result.


Theorem 4.7.2 (Substitution In Riemann Integration). Let f be continuous on [c, d] and u be continuously differentiable on [a, b] with u(a) = c and u(b) = d. Then

\[ \int_c^d f(u)\,du = \int_a^b f(u(t))\,u'(t)\,dt. \]

Proof 4.7.2. Let F be defined on [c, d] by $F(u) = \int_c^u f(t)\,dt$. Then since f is continuous, F is continuous and differentiable on [c, d] by the Fundamental Theorem of Calculus. We know F′(u) = f(u) and so

\[ F'(u(t)) = f(u(t)), \quad a \le t \le b, \]

implying

\[ F'(u(t))\,u'(t) = f(u(t))\,u'(t), \quad a \le t \le b. \]

By the Chain Rule for differentiation, we also know

\[ (F \circ u)'(t) = F'(u(t))\,u'(t), \quad a \le t \le b, \]

and hence $(F \circ u)'(t) = f(u(t))\,u'(t)$ on [a, b].

Now define g on [a, b] by

\[ g(t) = (f \circ u)(t)\,u'(t) = f(u(t))\,u'(t) = (F \circ u)'(t). \]

Since g is continuous, g is integrable on [a, b]. Now define G on [a, b] by $G(t) = (F \circ u)(t)$. Then G′(t) = f(u(t))u′(t) = g(t) on [a, b] and G′ is integrable. Now, apply the Cauchy Fundamental Theorem of Calculus to G to find

\[ \int_a^b g(t)\,dt = G(b) - G(a) \]

or

\[
\begin{aligned}
\int_a^b f(u(t))\,u'(t)\,dt &= F(u(b)) - F(u(a)) \\
&= \int_c^{u(b)=d} f(t)\,dt - \int_c^{u(a)=c} f(t)\,dt \\
&= \int_c^d f(t)\,dt.
\end{aligned}
\]
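The substitution formula can be checked numerically on a simple case. This is a hedged sketch with sample data we chose for illustration: f = cos, u(t) = t^2 on [a, b] = [0, 1], so that c = u(0) = 0 and d = u(1) = 1, and both sides should be close to sin(1). Integrals are approximated by midpoint sums.

```python
import math


def midpoint_integral(h, a, b, n=20000):
    """Midpoint Riemann sum of h over [a, b] with n uniform subintervals."""
    dx = (b - a) / n
    return sum(h(a + (j + 0.5) * dx) for j in range(n)) * dx


f = math.cos


def u(t):
    return t * t      # sample substitution, continuously differentiable


def du(t):
    return 2.0 * t    # its derivative u'(t)


a, b = 0.0, 1.0
c, d = u(a), u(b)

lhs = midpoint_integral(f, c, d)                            # integral of f(u) du over [c, d]
rhs = midpoint_integral(lambda t: f(u(t)) * du(t), a, b)    # integral of f(u(t)) u'(t) dt over [a, b]
print(lhs, rhs, math.sin(1.0))
```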



Theorem 4.7.3 (Leibnitz's Rule). Let f be continuous on [a, b], u : [c, d] → [a, b] be differentiable on [c, d] and v : [c, d] → [a, b] be differentiable on [c, d]. Then

\[ \left( \int_{u(x)}^{v(x)} f(t)\,dt \right)' = f(v(x))\,v'(x) - f(u(x))\,u'(x). \]

Proof 4.7.3. Let F be defined on [a, b] by $F(y) = \int_a^y f(t)\,dt$. Since f is continuous, F is also continuous and moreover, F is differentiable with F′(y) = f(y). Since v is differentiable on [c, d], we can use the Chain Rule to find

\[ (F \circ v)'(x) = F'(v(x))\,v'(x) = f(v(x))\,v'(x). \]

This says

\[ \left( \int_a^{v(x)} f(t)\,dt \right)' = f(v(x))\,v'(x). \]

Next, define G on [a, b] by $G(y) = \int_y^b f(t)\,dt = \int_a^b f(t)\,dt - \int_a^y f(t)\,dt$. Apply the Fundamental Theorem of Calculus to conclude

\[ G'(y) = -\left( \int_a^y f(t)\,dt \right)' = -f(y). \]

Again, apply the Chain Rule to see

\[ (G \circ u)'(x) = G'(u(x))\,u'(x) = -f(u(x))\,u'(x). \]

We conclude

\[ \left( \int_{u(x)}^b f(t)\,dt \right)' = -f(u(x))\,u'(x). \]

Now combine these results as follows:

\[ \int_a^b f(t)\,dt = \int_a^{v(x)} f(t)\,dt + \int_{v(x)}^{u(x)} f(t)\,dt + \int_{u(x)}^b f(t)\,dt \]

or

\[ (F \circ v)(x) + (G \circ u)(x) - \int_a^b f(t)\,dt = -\int_{v(x)}^{u(x)} f(t)\,dt = \int_{u(x)}^{v(x)} f(t)\,dt. \]

Then, differentiate both sides to obtain

\[ (F \circ v)'(x) + (G \circ u)'(x) = f(v(x))\,v'(x) - f(u(x))\,u'(x) = \left( \int_{u(x)}^{v(x)} f(t)\,dt \right)' \]

which is the desired result.
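Leibnitz's Rule can be sanity-checked with a finite difference. The sketch below is illustrative only and uses sample data we picked: f(t) = t^2, u(x) = x, v(x) = x^2 and the point x0 = 1.5. The integral is approximated by a midpoint sum and the derivative by a central difference, so agreement is only up to discretization error.

```python
def midpoint_integral(h, a, b, n=2000):
    """Midpoint Riemann sum of h over [a, b] with n uniform subintervals."""
    dx = (b - a) / n
    return sum(h(a + (j + 0.5) * dx) for j in range(n)) * dx


def f(t):
    return t * t


def u(x):
    return x          # lower limit, with u'(x) = 1


def v(x):
    return x * x      # upper limit, with v'(x) = 2x


def H(x):
    # H(x) is the integral of f from u(x) to v(x).
    return midpoint_integral(f, u(x), v(x))


def central_difference(func, x, h=1e-4):
    return (func(x + h) - func(x - h)) / (2.0 * h)


x0 = 1.5
lhs = central_difference(H, x0)                    # numerical derivative of H at x0
rhs = f(v(x0)) * (2.0 * x0) - f(u(x0)) * 1.0       # f(v) v' - f(u) u'
print(lhs, rhs)
```

For this data the integral is (x^6 − x^3)/3, whose derivative 2x^5 − x^2 equals 12.9375 at x0 = 1.5.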

4.8 When Do Two Functions Have The Same Integral?

The last results in this chapter seek to find conditions under which the integrals of two functions, f and g,

are equal.

Lemma 4.8.1 (f Zero On (a, b) Implies Zero Riemann Integral). Let f ∈ B[a, b], with f(x) = 0 on (a, b). Then f is integrable on [a, b] and

\[ \int_a^b f(x)\,dx = 0. \]

Proof 4.8.1. If f is identically 0, then the result follows easily. Now, assume f(a) ≠ 0 and f(x) = 0 on (a, b]. Let ε > 0 be given, and let δ > 0 satisfy

\[ \delta < \frac{\varepsilon}{|f(a)|}. \]

Let π0 ∈ Π[a, b] be any partition such that ||π0|| < δ. Let π = {x0 = a, x1, . . . , xp} be any refinement of π0. Then $U(f,\pi) = \max(f(a), 0)\,\Delta x_1$ and $L(f,\pi) = \min(f(a), 0)\,\Delta x_1$. Hence, we have

\[ U(f,\pi) - L(f,\pi) = [\max(f(a), 0) - \min(f(a), 0)]\,\Delta x_1 = |f(a)|\,\Delta x_1. \]

But

\[ |f(a)|\,\Delta x_1 < |f(a)|\,\delta < |f(a)|\,\frac{\varepsilon}{|f(a)|} = \varepsilon. \]

Hence, if π is any refinement of π0, we have U(f,π) − L(f,π) < ε. This shows that f ∈ RI[a, b]. Further, we have

\[ U(f,\pi) = \max(f(a), 0)\,\Delta x_1 \ \Rightarrow\ U(f) = \inf_\pi U(f,\pi) = 0, \]

since we can make $\Delta x_1$ as small as we wish. Likewise, we also see that $L(f) = \sup_\pi L(f,\pi) = 0$, implying that

\[ U(f) = L(f) = \int_a^b f(x)\,dx = 0. \]

The case where f(b) ≠ 0 and f(x) = 0 on [a, b) is handled in the same way. So, assume that f(a), f(b) ≠ 0 and f(x) = 0 for x ∈ (a, b). Let ε > 0 be given, and choose δ > 0 such that

\[ \delta < \frac{\varepsilon}{2 \max\{ |f(a)|, |f(b)| \}}. \]

Let π0 be a partition of [a, b] such that ||π0|| < δ, and let π be any refinement of π0. Then

\[
\begin{aligned}
U(f,\pi) &= \max(f(a), 0)\,\Delta x_1 + \max(f(b), 0)\,\Delta x_p, \\
L(f,\pi) &= \min(f(a), 0)\,\Delta x_1 + \min(f(b), 0)\,\Delta x_p.
\end{aligned}
\]

It follows that

\[
\begin{aligned}
U(f,\pi) - L(f,\pi) &= [\max(f(a), 0) - \min(f(a), 0)]\,\Delta x_1 + [\max(f(b), 0) - \min(f(b), 0)]\,\Delta x_p \\
&= |f(a)|\,\Delta x_1 + |f(b)|\,\Delta x_p \\
&< |f(a)|\,\delta + |f(b)|\,\delta \\
&< \varepsilon.
\end{aligned}
\]

Since we can make $\Delta x_1$ and $\Delta x_p$ as small as we wish, we see

\[ \int_a^b f(x)\,dx = 0. \]

Lemma 4.8.2 (f = g on (a, b) Implies Riemann Integrals Match). Let f, g ∈ RI[a, b] with f(x) = g(x) on (a, b). Then

\[ \int_a^b f(x)\,dx = \int_a^b g(x)\,dx. \]

Proof 4.8.2. Let h = f − g, and apply the previous lemma.



Theorem 4.8.3 (Two Riemann Integrable Functions Match At All But Finitely Many Points Implies Integrals Match). Let f, g ∈ RI[a, b], and assume that f = g except at finitely many points c1, . . . , ck. Then

\[ \int_a^b f(x)\,dx = \int_a^b g(x)\,dx. \]

Proof 4.8.3. We may re-index the points {c1, . . . , ck}, if necessary, so that c1 < c2 < · · · < ck. Set $c_0 = a$ and $c_{k+1} = b$; if c1 = a or ck = b, the corresponding degenerate interval contributes nothing. Then apply Lemma 4.8.2 on the intervals $[c_{j-1}, c_j]$ for j = 1, . . . , k + 1. This shows

\[ \int_{c_{j-1}}^{c_j} f(t)\,dt = \int_{c_{j-1}}^{c_j} g(t)\,dt. \]

Then, since

\[ \int_a^b f(t)\,dt = \sum_{j=1}^{k+1} \int_{c_{j-1}}^{c_j} f(t)\,dt \]

and the same decomposition holds for g, the result follows.

Theorem 4.8.4 (f Bounded and Continuous At All But One Point Implies f is Riemann Integrable). If f is bounded on [a, b] and continuous except at one point c in [a, b], then f is Riemann integrable.

Proof 4.8.4. For convenience, we will assume that c is an interior point, i.e. c is in (a, b). We will show that f satisfies the Riemann Criterion and so it is Riemann integrable. Let ε > 0 be given. Since f is bounded on [a, b], there is a real number M > 0 so that |f(x)| < M for all x in [a, b]. We may assume ε is small enough that a < c − ε/(6M) and c + ε/(6M) < b. We know f is continuous on [a, c − ε/(6M)] and f is continuous on [c + ε/(6M), b]. Thus, f is integrable on both of these intervals and f satisfies the Riemann Criterion on both intervals. For this ε there is a partition π0 of [a, c − ε/(6M)] so that

\[ U(f,P) - L(f,P) < \varepsilon/6, \quad \text{if } \pi_0 \preceq P, \]

and there is a partition π1 of [c + ε/(6M), b] so that

\[ U(f,Q) - L(f,Q) < \varepsilon/6, \quad \text{if } \pi_1 \preceq Q. \]

Let π2 be the partition of [a, b] we get by combining π0 with π1; it contains the points c − ε/(6M) and c + ε/(6M), which bound a middle subinterval of length ε/(3M). Then, we see

\[
\begin{aligned}
U(f,\pi_2) - L(f,\pi_2) &= U(f,\pi_0) - L(f,\pi_0) + \Bigl( \sup_{x \in I_c} f(x) - \inf_{x \in I_c} f(x) \Bigr) \frac{\varepsilon}{3M} + U(f,\pi_1) - L(f,\pi_1) \\
&< \varepsilon/6 + 2M\,\frac{\varepsilon}{3M} + \varepsilon/6 = \varepsilon,
\end{aligned}
\]

where $I_c = [c - \varepsilon/(6M),\, c + \varepsilon/(6M)]$. Then if $\pi_2 \preceq \pi$ on [a, b], we have

\[ U(f,\pi) - L(f,\pi) < \varepsilon. \]

This shows f satisfies the Riemann criterion and hence is integrable if the discontinuity c is interior to [a, b]. The argument at c = a and c = b is similar but a bit simpler as it only needs to be done from one side. Hence, we conclude f is integrable on [a, b] in all cases.

It is then easy to extend this result to a function f which is bounded and continuous on [a, b] except at

a finite number of points x1, x2, . . . , xk for some positive integer k. We state this as Theorem 4.8.5.

Theorem 4.8.5 (f Bounded and Continuous At All But Finitely Many Points Implies f is Riemann Integrable). If f is bounded on [a, b] and continuous except at finitely many points x1, x2, . . . , xk in [a, b], then f is Riemann integrable.

Proof 4.8.5. We may assume without loss of generality that the points of discontinuity are ordered as a < x1 < x2 < . . . < xk < b. Then f is continuous except at x1 on [a, x1] and hence by Theorem 4.8.4 f is integrable on [a, x1]. Now apply this argument on each of the subintervals $[x_{j-1}, x_j]$ in turn (splitting each at its midpoint so that every piece contains a single discontinuity) and on [xk, b]. Combining the partitions supplied by the Riemann Criterion on each piece shows f is integrable on [a, b].

Chapter 5

Further Riemann Integration Results

In this chapter, we will explore certain aspects of Riemann Integration that are more subtle. We begin with

a limit interchange theorem. A good reference for this is (Fulks (3) 1978).

5.1 The Limit Interchange Theorem for Riemann Integration

Suppose you knew that the sequence of functions {xn} contained in RI[a, b] converged uniformly to the function x on [a, b]. Is it true that $\int_a^b x(t)\,dt = \lim_{n\to\infty} \int_a^b x_n(t)\,dt$? The answer to this question is yes, and it is our Theorem 5.1.1.

Theorem 5.1.1 (The Riemann Integral Limit Interchange Theorem). Let {xn} be a sequence of Riemann Integrable functions on [a, b] which converge uniformly to the function x on [a, b]. Then x is also Riemann Integrable on [a, b] and

\[ \int_a^b x(t)\,dt = \lim_{n\to\infty} \int_a^b x_n(t)\,dt. \]

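Proof 5.1.1. First, we show that x is Riemann integrable on [a, b]. Let ε > 0 be given. Then since xn converges uniformly to x on [a, b],

\[ \exists\, N \ni |x_n(t) - x(t)| < \frac{\varepsilon}{5(b-a)} \quad \forall\, n > N,\ t \in [a, b]. \qquad (\alpha) \]

Fix any n1 > N. Then since $x_{n_1}$ is integrable,

\[ \exists\, \pi_0 \in \Pi[a, b] \ni U(x_{n_1},\pi) - L(x_{n_1},\pi) < \frac{\varepsilon}{5} \quad \forall\, \pi_0 \preceq \pi. \qquad (\beta) \]

Since xn converges uniformly to x on [a, b], you should be able to show that x is bounded on [a, b]. Hence, for any refinement π = {x0, x1, . . . , xp} of π0, we can define

\[
\begin{aligned}
M_j &= \sup_{[x_{j-1},x_j]} x(t), & M_j^1 &= \sup_{[x_{j-1},x_j]} x_{n_1}(t), \\
m_j &= \inf_{[x_{j-1},x_j]} x(t), & m_j^1 &= \inf_{[x_{j-1},x_j]} x_{n_1}(t).
\end{aligned}
\]

Using the Infimum and Supremum Tolerance Lemma, there are points $s_j$ and $t_j$ in $[x_{j-1}, x_j]$ so that

\[ M_j - \frac{\varepsilon}{5(b-a)} < x(s_j) \le M_j \qquad (\gamma) \]

and

\[ m_j \le x(t_j) < m_j + \frac{\varepsilon}{5(b-a)}. \qquad (\xi) \]

Thus,

\[ U(x,\pi) - L(x,\pi) = \sum_\pi (M_j - m_j)\,\Delta x_j. \]

The term on the right hand side can be rewritten using the standard add and subtract trick as

\[ \sum_\pi \Bigl( M_j - x(s_j) + x(s_j) - x_{n_1}(s_j) + x_{n_1}(s_j) - x_{n_1}(t_j) + x_{n_1}(t_j) - x(t_j) + x(t_j) - m_j \Bigr) \Delta x_j. \]

We can then overestimate this term using the triangle inequality to find

\[
\begin{aligned}
U(x,\pi) - L(x,\pi) \le{} & \sum_\pi (M_j - x(s_j))\,\Delta x_j + \sum_\pi (x(s_j) - x_{n_1}(s_j))\,\Delta x_j + \sum_\pi (x_{n_1}(s_j) - x_{n_1}(t_j))\,\Delta x_j \\
& + \sum_\pi (x_{n_1}(t_j) - x(t_j))\,\Delta x_j + \sum_\pi (x(t_j) - m_j)\,\Delta x_j.
\end{aligned}
\]

The first term can be estimated by Equation γ and the fifth term by Equation ξ to give

\[
\begin{aligned}
U(x,\pi) - L(x,\pi) <{} & \frac{\varepsilon}{5(b-a)} \sum_\pi \Delta x_j + \sum_\pi (x(s_j) - x_{n_1}(s_j))\,\Delta x_j + \sum_\pi (x_{n_1}(s_j) - x_{n_1}(t_j))\,\Delta x_j \\
& + \sum_\pi (x_{n_1}(t_j) - x(t_j))\,\Delta x_j + \frac{\varepsilon}{5(b-a)} \sum_\pi \Delta x_j.
\end{aligned}
\]

Thus,

\[
\begin{aligned}
U(x,\pi) - L(x,\pi) <{} & \frac{2\varepsilon}{5} + \sum_\pi (x(s_j) - x_{n_1}(s_j))\,\Delta x_j \\
& + \sum_\pi (x_{n_1}(s_j) - x_{n_1}(t_j))\,\Delta x_j + \sum_\pi (x_{n_1}(t_j) - x(t_j))\,\Delta x_j.
\end{aligned}
\]

Now apply the estimate from Equation α to the first and third sums of the equation above to conclude

\[ U(x,\pi) - L(x,\pi) < \frac{4\varepsilon}{5} + \sum_\pi (x_{n_1}(s_j) - x_{n_1}(t_j))\,\Delta x_j. \]

Finally, note

\[ | x_{n_1}(s_j) - x_{n_1}(t_j) | \le M_j^1 - m_j^1 \]

and so

\[ \sum_\pi (x_{n_1}(s_j) - x_{n_1}(t_j))\,\Delta x_j \le \sum_\pi (M_j^1 - m_j^1)\,\Delta x_j < \varepsilon/5 \]

by Equation β. Thus, U(x,π) − L(x,π) < ε. Since the partition π refining π0 was arbitrary, we see x satisfies the Riemann Criterion and hence, is Riemann integrable on [a, b].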

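It remains to show the limit interchange portion of the theorem. Since xn converges uniformly to x, given a positive ε, there is an integer N so that

\[ \sup_{a \le t \le b} | x_n(t) - x(t) | < \varepsilon/(b-a), \quad \text{if } n > N. \qquad (\zeta) \]

Now for any n > N, we have

\[
\begin{aligned}
\left| \int_a^b x(t)\,dt - \int_a^b x_n(t)\,dt \right| &= \left| \int_a^b \bigl( x(t) - x_n(t) \bigr)\,dt \right| \\
&\le \int_a^b \bigl| x(t) - x_n(t) \bigr|\,dt \\
&\le \int_a^b \sup_{a \le t \le b} | x_n(t) - x(t) |\,dt \\
&< \int_a^b \varepsilon/(b-a)\,dt \\
&= \varepsilon
\end{aligned}
\]

using Equation ζ. This says $\lim \int_a^b x_n(t)\,dt = \int_a^b x(t)\,dt$.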
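The final estimate says the integrals differ by at most (b − a) times the uniform distance between xn and x. The sketch below illustrates this on a sample sequence of our choosing, x_n(t) = sqrt(t^2 + 1/n) on [−1, 1], which converges uniformly to x(t) = |t| with uniform distance 1/sqrt(n); the integral of the limit is 1. Integrals are approximated by midpoint sums, so the comparison is numerical only.

```python
import math


def midpoint_integral(h, a, b, n=20000):
    """Midpoint Riemann sum of h over [a, b] with n uniform subintervals."""
    dx = (b - a) / n
    return sum(h(a + (j + 0.5) * dx) for j in range(n)) * dx


def x_n(n):
    # Sample sequence (an assumption for illustration): smooth approximations of |t|.
    return lambda t: math.sqrt(t * t + 1.0 / n)


def sup_distance(n, samples=2001):
    """Approximate sup over [-1, 1] of |x_n(t) - |t|| on a uniform grid containing t = 0."""
    f = x_n(n)
    grid = (-1.0 + 2.0 * i / (samples - 1) for i in range(samples))
    return max(abs(f(t) - abs(t)) for t in grid)


a, b = -1.0, 1.0
limit_integral = 1.0  # integral of |t| over [-1, 1]
results = {n: midpoint_integral(x_n(n), a, b) for n in (10, 1000, 100000)}
for n, value in results.items():
    print(n, value, abs(value - limit_integral), (b - a) * sup_distance(n))
```

In each row the third column stays below the fourth, matching the bound obtained from Equation ζ.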

The next result is indispensable in modern analysis. Fundamentally, it states that a continuous real-

valued function defined on a compact set can be uniformly approximated by a smooth function. This is

used throughout analysis to prove results about various functions. We can often verify a property of a

continuous function, f , by proving an analogous property of a smooth function that is uniformly close to

f. We will only prove the result for a closed finite interval in ℝ. The general result for a compact subset of a more general set called a Topological Space is a modification of this proof which is actually not that much more difficult, but that is another story. We follow the development of (Simmons (5) 1963) for this proof.


Theorem 5.1.2 (Weierstrass Approximation Theorem). Let f be a continuous real-valued function defined on [0, 1]. For any ε > 0, there is a polynomial, p, such that |f(t) − p(t)| < ε for all t ∈ [0, 1], that is ||p − f||∞ < ε.

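Proof 5.1.2. We first derive some equalities. We will denote the interval [0, 1] by I. By the binomial theorem, for any x ∈ I, we have

\[ \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} = (x + 1 - x)^n = 1. \qquad (\alpha) \]

Differentiating both sides of Equation α, we get

\[
\begin{aligned}
0 &= \sum_{k=0}^n \binom{n}{k} \Bigl( k x^{k-1} (1-x)^{n-k} - x^k (n-k) (1-x)^{n-k-1} \Bigr) \\
&= \sum_{k=0}^n \binom{n}{k} x^{k-1} (1-x)^{n-k-1} \Bigl( k(1-x) - x(n-k) \Bigr) \\
&= \sum_{k=0}^n \binom{n}{k} x^{k-1} (1-x)^{n-k-1} (k - nx).
\end{aligned}
\]

Now, multiply through by x(1 − x), to find

\[ 0 = \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} (k - nx). \]

Differentiating again, we obtain

\[ 0 = \sum_{k=0}^n \binom{n}{k} \frac{d}{dx} \Bigl( x^k (1-x)^{n-k} (k - nx) \Bigr). \]

This leads to a series of simplifications. It is pretty messy and many texts do not show the details, but we think it is instructive.

\[
\begin{aligned}
0 &= \sum_{k=0}^n \binom{n}{k} \Bigl[ -n x^k (1-x)^{n-k} + (k - nx) \Bigl( (k-n) x^k (1-x)^{n-k-1} + k x^{k-1} (1-x)^{n-k} \Bigr) \Bigr] \\
&= \sum_{k=0}^n \binom{n}{k} \Bigl[ -n x^k (1-x)^{n-k} + (k - nx) (1-x)^{n-k-1} x^{k-1} \Bigl( (k-n)x + k(1-x) \Bigr) \Bigr] \\
&= \sum_{k=0}^n \binom{n}{k} \Bigl( -n x^k (1-x)^{n-k} + (k - nx)^2 (1-x)^{n-k-1} x^{k-1} \Bigr) \\
&= -n \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} + \sum_{k=0}^n \binom{n}{k} (k - nx)^2 x^{k-1} (1-x)^{n-k-1}.
\end{aligned}
\]

Thus, since the first sum is 1, we have

\[ n = \sum_{k=0}^n \binom{n}{k} (k - nx)^2 x^{k-1} (1-x)^{n-k-1} \]

and multiplying through by x(1 − x), and then dividing by $n^2$, we have

\[
\begin{aligned}
n x (1-x) &= \sum_{k=0}^n \binom{n}{k} (k - nx)^2 x^k (1-x)^{n-k}, \\
\frac{x(1-x)}{n} &= \sum_{k=0}^n \binom{n}{k} \left( \frac{k - nx}{n} \right)^2 x^k (1-x)^{n-k}.
\end{aligned}
\]

This last equality then leads to the identity

\[ \sum_{k=0}^n \binom{n}{k} \left( x - \frac{k}{n} \right)^2 x^k (1-x)^{n-k} = \frac{x(1-x)}{n}. \qquad (\beta) \]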

We now define the nth order Bernstein Polynomial associated with f by

Bn(x) =n∑k=0

(n

k

)xk(1− x)n−kf

(kn

).

Note that

f(x)−Bn(x) =n∑k=0

(n

k

)xk(1− x)n−k

[f(x)− f

(kn

)].

Also note that f(0)−Bn(0) = f(1)−Bn(1) = 0, so f and Bn match at the endpoints. It follows that

| f(x)−Bn(x) | ≤n∑k=0

(n

k

)xk(1− x)n−k

∣∣∣f(x)− f(kn

)∣∣∣. (γ)

Now, f is uniformly continuous on I since it is continuous. So, given ε > 0, there is a δ > 0 such that $|x - \frac{k}{n}| < \delta \Rightarrow |f(x) - f(\frac{k}{n})| < \frac{\varepsilon}{2}$. Consider x to be fixed in [0, 1]. The sum in Equation γ has only n + 1 terms, so we can split this sum up as follows. Let K_1, K_2 be a partition of the index set {0, 1, ..., n} such that $k \in K_1 \Rightarrow |x - \frac{k}{n}| < \delta$ and $k \in K_2 \Rightarrow |x - \frac{k}{n}| \ge \delta$. Then

\[ | f(x) - B_n(x) | \le \sum_{k \in K_1} \binom{n}{k} x^k (1-x)^{n-k} \Big| f(x) - f\Big( \frac{k}{n} \Big) \Big| + \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \Big| f(x) - f\Big( \frac{k}{n} \Big) \Big|, \]


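which implies

\begin{align*}
| f(x) - B_n(x) | &\le \frac{\varepsilon}{2} \sum_{k \in K_1} \binom{n}{k} x^k (1-x)^{n-k} + \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \Big| f(x) - f\Big( \frac{k}{n} \Big) \Big| \\
&\le \frac{\varepsilon}{2} + \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \Big| f(x) - f\Big( \frac{k}{n} \Big) \Big|,
\end{align*}

where the last step uses the fact that the sum over K_1 is at most the full sum in Equation α, which is 1. Now, f is bounded on I, so there is a real number M > 0 such that |f(x)| ≤ M for all x ∈ I. Hence

\[ \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \Big| f(x) - f\Big( \frac{k}{n} \Big) \Big| \le 2M \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k}. \]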

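Since $k \in K_2 \Rightarrow |x - \frac{k}{n}| \ge \delta$, using Equation β, we have

\[ \delta^2 \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \le \sum_{k \in K_2} \binom{n}{k} \Big( x - \frac{k}{n} \Big)^2 x^k (1-x)^{n-k} \le \frac{x(1-x)}{n}. \]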

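This implies that

\[ \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \le \frac{x(1-x)}{\delta^2 n}, \]

and so combining inequalities

\[ 2M \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \le \frac{2M x(1-x)}{\delta^2 n}. \]

We conclude then that

\[ \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \Big| f(x) - f\Big( \frac{k}{n} \Big) \Big| \le \frac{2M x(1-x)}{\delta^2 n}. \]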

Now, the maximum value of x(1 − x) on I is 1/4, so

\[ \sum_{k \in K_2} \binom{n}{k} x^k (1-x)^{n-k} \Big| f(x) - f\Big( \frac{k}{n} \Big) \Big| \le \frac{M}{2 \delta^2 n}. \]

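Finally, choose n so that $n > \frac{M}{\delta^2 \varepsilon}$. Then $\frac{M}{n \delta^2} < \varepsilon$, which implies $\frac{M}{2 n \delta^2} < \frac{\varepsilon}{2}$. So, Equation γ becomes

\[ | f(x) - B_n(x) | \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

Note that the choice of n, and hence the polynomial B_n, does not depend on x ∈ I, since n only depends on M, δ, and ε, all of which, in turn, are independent of x ∈ I. So, B_n is the desired polynomial, as it is uniformly within ε of f.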

Comment 5.1.1. A change of variable translates this result to any closed interval [a, b].
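To see the theorem and the change of variable in Comment 5.1.1 at work, here is a small numerical sketch. It is only an illustration under our own choices: the helper names `bernstein` and `sup_error`, the test function √x on [0, 4], and the sampling grid are all assumptions made here, and the supremum is only approximated on a finite grid.

```python
from math import comb, sqrt


def bernstein(f, n, a=0.0, b=1.0):
    """Return the n-th Bernstein polynomial of f, transplanted from [0, 1] to [a, b]."""
    # Sample f at the n + 1 equally spaced nodes a + (b - a) k / n.
    values = [f(a + (b - a) * k / n) for k in range(n + 1)]

    def B(x):
        t = (x - a) / (b - a)  # change of variable mapping [a, b] onto [0, 1]
        return sum(
            comb(n, k) * t**k * (1 - t) ** (n - k) * values[k]
            for k in range(n + 1)
        )

    return B


def sup_error(f, p, a, b, samples=2001):
    """Approximate sup |f - p| on [a, b] by sampling a uniform grid."""
    grid = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - p(x)) for x in grid)


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        print(n, sup_error(sqrt, bernstein(sqrt, n, 0.0, 4.0), 0.0, 4.0))
```

The printed errors shrink as n grows, though slowly; the proof above only promises convergence, not a fast rate.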


5.2 Showing Functions Are Riemann Integrable

We already know that continuous functions, monotone functions and functions of bounded variation are

classes of functions which are Riemann Integrable on the interval [a, b]. A good reference for some of

the material in this section is (Douglas (2) 1996) although it is mostly in problems and not in the text!

Hence, since f(x) = √x is continuous on [0, M] for any positive M, we know f is Riemann integrable on this interval. What about the composition √g where g is just known to be non negative and Riemann integrable on [a, b]? If g were continuous, since compositions of continuous functions are also continuous, we would have immediately that √g is Riemann Integrable. However, it is not so easy to handle this case.

Let’s try this approach. Using Theorem 5.1.2, we know given a finite interval [c, d], there is a sequence of polynomials p_n(x) which converge uniformly to √x on [c, d]. Of course, the polynomials in this sequence will change if we change the interval [c, d], but you get the idea. To apply this here, note that since g is Riemann Integrable on [a, b], g must be bounded. Since we assume g is non negative, we know that there is a positive number M so that g(x) is in [0, M] for all x in [a, b]. Thus, there is a sequence of polynomials p_n which converge uniformly to √· on [0, M].

Next, using Theorem 4.3.1, we know a polynomial in g is also Riemann integrable on [a, b] (g² = g · g so it is integrable and so on). Hence, p_n(g) = p_n ∘ g is Riemann integrable on [a, b]. Then given ε > 0, we know there is a positive N so that

| p_n(u) − √u | < ε, if n > N and u ∈ [0, M].

Thus, in particular, since g(x) ∈ [0, M], we have

| p_n(g(x)) − √(g(x)) | < ε, if n > N and x ∈ [a, b].

We have therefore proved that p_n ∘ g converges uniformly to √g on [a, b]. Then by Theorem 5.1.1, we see √g is Riemann integrable on [a, b].

If you think about it a bit, you should be able to see that this type of argument would work for any f which is continuous and g that is Riemann integrable. We state this as Theorem 5.2.1.

Theorem 5.2.1 (f Continuous and g Riemann Integrable Implies f ∘ g is Riemann Integrable).
If f is continuous on g([a, b]) where g is Riemann Integrable on [a, b], then f ∘ g is Riemann Integrable on [a, b].

Proof 5.2.1.

Exercise 5.2.1. This proof is for you.

In general, the composition of Riemann Integrable functions is not Riemann integrable. Here is the standard counterexample. This great example comes from (Douglas (2) 1996). Define f on [0, 1] by

\[ f(y) = \begin{cases} 1 & \text{if } y = 0 \\ 0 & \text{if } 0 < y \le 1 \end{cases} \]

and g on [0, 1] by

\[ g(x) = \begin{cases} 1 & \text{if } x = 0 \\ 1/p & \text{if } x = p/q,\ (p, q) = 1,\ x \in (0, 1] \text{ and } x \text{ is rational} \\ 0 & \text{if } x \in (0, 1] \text{ and } x \text{ is irrational} \end{cases} \]

We see immediately that f is integrable on [0, 1] by Theorem 4.8.4. We can show that g is also Riemann integrable on [0, 1], but we will leave this as an exercise.

Exercise 5.2.2.

1. Show g is continuous at each irrational point in [0, 1] and discontinuous at all rational points in [0, 1].

2. Show g is Riemann integrable on [0, 1] with value $\int_0^1 g(x)\,dx = 0$.

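Now f ∘ g becomes

\[ f(g(x)) = \begin{cases} f(1) & \text{if } x = 0 \\ f(1/p) & \text{if } x = p/q,\ (p, q) = 1,\ x \in (0, 1] \text{ and } x \text{ rational} \\ f(0) & \text{if } 0 < x \le 1 \text{ and } x \text{ irrational} \end{cases} = \begin{cases} 0 & \text{if } x = 0 \\ 0 & \text{if } x \text{ rational} \in (0, 1] \\ 1 & \text{if } x \text{ irrational} \in (0, 1] \end{cases} \]

Note that f(1) = 0 because 0 < 1 ≤ 1, so f ∘ g vanishes at every rational point of [0, 1], including x = 0, and equals 1 at every irrational point. The function f ∘ g above is not Riemann integrable as U(f ∘ g) = 1 and L(f ∘ g) = 0. Thus, we have found two Riemann integrable functions whose composition is not Riemann integrable!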
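The case analysis for the composition can be traced mechanically. The sketch below is only an illustration: a floating point number cannot tell rationals from irrationals, so we model a rational point as a `Fraction` and use a hypothetical marker `IRRATIONAL` to stand for any irrational point of (0, 1]. In particular it shows f(g(0)) = f(1) = 0.

```python
from fractions import Fraction

IRRATIONAL = "irrational"  # hypothetical marker for an arbitrary irrational point of (0, 1]


def f(y):
    """f(y) = 1 if y = 0, and 0 if 0 < y <= 1."""
    return 1 if y == 0 else 0


def g(x):
    """g as defined in the text; x is a Fraction in [0, 1] or the IRRATIONAL marker."""
    if x == IRRATIONAL:
        return Fraction(0)
    if x == 0:
        return Fraction(1)
    # Fraction stores x = p/q in lowest terms, so the numerator is p and g(x) = 1/p.
    return Fraction(1, x.numerator)


def f_of_g(x):
    """The composition f(g(x))."""
    return f(g(x))


if __name__ == "__main__":
    for point in (Fraction(0), Fraction(1, 2), Fraction(6, 8), Fraction(1), IRRATIONAL):
        print(point, g(point), f_of_g(point))
```

Since every subinterval of a partition contains both rational and irrational points, the values 0 and 1 both occur on it, which is why every upper sum is 1 and every lower sum is 0.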

5.3 Sets Of Content Zero

We already know the length of the finite interval [a, b] is b− a and we exploit this to develop the Riemann

integral when we compute lower, upper and Riemann sums for a given partition. We also know that the set

of discontinuities of a monotone function is countable. We have seen that bounded functions with a finite

number of discontinuities are integrable and in the last section, we saw a function which was discontinuous

on a countably infinite set and still was integrable! Hence, the fact that a function is integrable should

imply something about its discontinuity set. However, the concept of length doesn’t seem to apply as there

are no intervals in these discontinuity sets. With that in mind, let’s introduce a new notion: the content of

a set. We will follow the development of a set of content zero as it is done in (Sagan (4) 1974).

Definition 5.3.1 (Sets Of Content Zero).
A subset S of ℝ is said to have content zero if and only if given any positive ε we can find a sequence of bounded open intervals J_n = (a_n, b_n), either finite in number or infinite, so that

\[ S \subseteq \bigcup_n J_n, \]

with the total length

\[ \sum_n (b_n - a_n) < \varepsilon. \]

If the sequence only has a finite number of intervals, the union and sum are written from 1 to N where N is the number of intervals and if there are infinitely many intervals, the sum and union are written from 1 to ∞.

Comment 5.3.1.

1. A single point c in ℝ has content zero because c ∈ (c − ε/4, c + ε/4) for all positive ε, and this interval has length ε/2 < ε.

2. A finite number of points S = {c_1, . . . , c_k} in ℝ has content zero: for any positive ε, let B_i = (c_i − ε/(4k), c_i + ε/(4k)), so that c_i ∈ B_i. Thus, $S \subseteq \cup_{i=1}^{k} B_i$ and the total length of these intervals is ε/2, which is smaller than ε.

3. The rational numbers have content zero also. Let {c_i} be any enumeration of the rationals. Let $B_i = (c_i - \varepsilon/2^{i+2},\ c_i + \varepsilon/2^{i+2})$ for any positive ε. Then Q is contained in the union of these intervals and the total length is $\frac{\varepsilon}{2} \sum_{i=1}^{\infty} 1/2^i = \varepsilon/2$, which is smaller than ε.

4. Finite unions of sets of content zero also have content zero.

5. Subsets of sets of content zero also have content zero.
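The covering argument for the rationals can be carried out concretely for the first few terms of an enumeration. The sketch below is a finite illustration only: it enumerates rationals in [0, 1] by increasing denominator (one of many possible enumerations) and, as an assumption made here so the total stays strictly below ε, centers the i-th interval at the i-th rational with radius ε/2^(i+2). The helper names are our own.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """Return the first `count` rationals of [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:  # skip values already listed, e.g. 2/4 = 1/2
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def cover(points, eps):
    """Interval i (counting from 1) is centered at points[i-1] with radius eps / 2**(i+2)."""
    return [
        (c - eps / 2 ** (i + 2), c + eps / 2 ** (i + 2))
        for i, c in enumerate(points, start=1)
    ]


if __name__ == "__main__":
    eps = Fraction(1, 10)
    pts = rationals_in_unit_interval(50)
    intervals = cover(pts, eps)
    total = sum(b - a for a, b in intervals)
    print("total length:", float(total), "which is below eps =", float(eps))
```

However many points are covered, the total length stays below ε/2, because the lengths form a geometric series.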

Hence, the function g above is continuous on [0, 1] except on a set of content zero. We make this more

formal with a definition.

Definition 5.3.2 (Continuous Almost Everywhere).
The function f defined on the interval [a, b] is said to be continuous almost everywhere if the set of

discontinuities of f has content zero. We abbreviate the phrase almost everywhere by writing a.e.

We are now ready to prove an important theorem which is known as the Riemann - Lebesgue Lemma.

This is also called Lebesgue’s Criterion For the Riemann Integrability of Bounded Functions . We

follow the proof given in (Sagan (4) 1974).



Theorem 5.3.1 (Riemann - Lebesgue Lemma).

(i) f ∈ B[a, b] and continuous a.e. implies f ∈ RI[a, b].

(ii) f ∈ RI[a, b] implies f is continuous a.e.

Proof 5.3.1. The proof of this result is fairly complicated. So grab a cup of coffee, a pencil and prepare

for a long battle!

(i):

Proof. We will prove this by showing that for any positive ε, we can find a partition π_0 so that the Riemann Criterion is satisfied. First, since f is bounded, there are numbers m and M so that m ≤ f(x) ≤ M for all x in [a, b]. If m and M were the same, then f would be constant and it would therefore be continuous. In this case, we know f is integrable. So we can assume without loss of generality that M − m > 0. Let D denote the set of points in [a, b] where f is not continuous. By assumption, the content of D is zero. Hence, given a positive ε there is a sequence of bounded open intervals J_n = (a_n, b_n) (we will assume without loss of generality that there are infinitely many such intervals) so that

\[ D \subseteq \bigcup_n J_n, \qquad \sum_n (b_n - a_n) < \frac{\varepsilon}{2(M - m)}. \]

Now if x is from [a, b], x is either in D or in the complement of D, D^C. Of course, if x ∈ D^C, then f is continuous at x. The set

\[ E = [a, b] \cap \Big( \bigcup_n J_n \Big)^C \]

is compact and so f must be uniformly continuous on E. Hence, for the ε chosen, there is a δ > 0 so that

\[ | f(y) - f(x) | < \frac{\varepsilon}{8(b - a)}, \tag{$*$} \]

if x ∈ E and y ∈ (x − δ, x + δ) ∩ E. Next, note that

\[ O = \{ J_n,\ B_{\delta/2}(x) \mid x \in E \} \]

is an open cover of [a, b] and hence must have a finite sub cover. Call this finite sub cover O′ and label its members as follows:

\[ O' = \{ J_{n_1}, \ldots, J_{n_r},\ B_{\delta/2}(x_1), \ldots, B_{\delta/2}(x_s) \} \]

Then it is also true that [a, b] is contained in the union of the sets in

\[ O'' = \{ J_{n_1}, \ldots, J_{n_r},\ B_{\delta/2}(x_1) \cap E, \ldots, B_{\delta/2}(x_s) \cap E \} \]

All of the intervals in O′′ have endpoints. Throw out any duplicates and arrange these endpoints in increasing order in [a, b] and label them as $y_1, \ldots, y_{p-1}$. Then, let

\[ \pi_0 = \{ y_0 = a, y_1, y_2, \ldots, y_{p-1}, y_p = b \} \]

be the partition formed by these points. Recall where the points $y_j$ come from. The endpoints of the $B_{\delta/2}(x_i) \cap E$ sets are not in any of the intervals $J_{n_k}$. So suppose two successive points $y_{j-1}$ and $y_j$ satisfied $y_{j-1}$ is in an interval $J_{n_k}$ and the next point $y_j$ was an endpoint of a $B_{\delta/2}(x_i) \cap E$ set which is also inside $J_{n_k}$. By our construction, this can not happen as all of the $B_{\delta/2}(x_i) \cap E$ are disjoint from the $J_{n_k}$ sets. Hence, the next point $y_j$ either must be in the set $J_{n_k}$ also or it must be outside. If $y_{j-1}$ is inside and $y_j$ is outside, this is also a contradiction as this would give us a third point, call it z temporarily, so that

\[ y_{j-1} < z < y_j \]

with z a new distinct endpoint of the finite cover O′′. Since we have already ordered these points, this third point is not a possibility. Thus, we see $(y_{j-1}, y_j)$ is in some $J_{n_k}$ or neither of the points is in any $J_{n_k}$. Hence, we have shown that given the way the points $y_j$ were chosen, either $(y_{j-1}, y_j)$ is inside some interval $J_{n_q}$ or its closure $[y_{j-1}, y_j]$ lies in none of the $J_{n_q}$ for any 1 ≤ q ≤ r. In the second case, $(y_{j-1}, y_j)$ lies in some $B_{\delta/2}(x_i)$. Note this set uses the radius δ/2 and so we can say the closed interval $[y_{j-1}, y_j]$ must be contained in some $B_{\delta}(x_i)$.

Now we separate the index set {1, 2, . . . , p} into two disjoint sets. We define A_1 to be the set of all indices j so that $(y_{j-1}, y_j)$ is contained in some $J_{n_k}$. Then we set A_2 to be the complement of A_1 in the entire index set, i.e. A_2 = {1, 2, . . . , p} − A_1. Note, by our earlier remarks, if j is in A_2, $[y_{j-1}, y_j]$ is contained in some $B_{\delta}(x_i) \cap E$. Thus,

\begin{align*}
U(f, \pi_0) - L(f, \pi_0) &= \sum_{j=1}^{p} (M_j - m_j) \Delta y_j \\
&= \sum_{j \in A_1} (M_j - m_j) \Delta y_j + \sum_{j \in A_2} (M_j - m_j) \Delta y_j
\end{align*}

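Let’s work with the first sum: we have

\begin{align*}
\sum_{j \in A_1} (M_j - m_j) \Delta y_j &\le (M - m) \sum_{j \in A_1} \Delta y_j \\
&< (M - m) \frac{\varepsilon}{2(M - m)} = \frac{\varepsilon}{2}
\end{align*}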

Now if j is in A_2, then $[y_{j-1}, y_j]$ is contained in some $B_{\delta}(x_i) \cap E$. So any two points u and v in $[y_{j-1}, y_j]$ satisfy $|u - x_i| < \delta$ and $|v - x_i| < \delta$. Since these points are this close, the uniform continuity condition, Equation ∗, holds. Therefore

\[ | f(u) - f(v) | \le | f(u) - f(x_i) | + | f(v) - f(x_i) | < \frac{\varepsilon}{4(b - a)}. \]

This holds for any u and v in $[y_{j-1}, y_j]$. In particular, we can use the Supremum and Infimum Tolerance Lemma to choose $u_j$ and $v_j$ so that

\[ M_j - \frac{\varepsilon}{8(b - a)} < f(u_j), \qquad m_j + \frac{\varepsilon}{8(b - a)} > f(v_j). \]

It then follows that

\[ M_j - m_j < f(u_j) - f(v_j) + \frac{\varepsilon}{4(b - a)}. \]
Now, we can finally estimate the second summation term. We have

∑j∈A2

(Mj −mj

)∆yj <

∑j∈A2

(| f(uj)− f(vj) | +ε/(4(b− a))

)∆yj

<∑j∈A2

(| f(uj)− f(vj) |

)∆yj + ε/(4(b− a))

∑j∈A2

∆yj

< ε/(4(b− a))∑j∈A2

∆yj + ε/(4(b− a))∑j∈A2

∆yj

< ε/2

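Combining our estimates, we have

\begin{align*}
U(f, \pi_0) - L(f, \pi_0) &= \sum_{j \in A_1} (M_j - m_j) \Delta y_j + \sum_{j \in A_2} (M_j - m_j) \Delta y_j \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}

Any partition π that refines π_0 will also satisfy U(f, π) − L(f, π) < ε. Hence, f satisfies the Riemann Criterion and so f is integrable.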

(ii):

Proof. We begin by noting that f is discontinuous at a point x in [a, b] if and only if there is a positive integer m so that

\[ \forall \delta > 0,\ \exists y \in (x - \delta, x + \delta) \cap [a, b] \text{ such that } | f(y) - f(x) | \ge 1/m. \]

This allows us to define some interesting sets. Define the set E_m by

\[ E_m = \{ x \in [a, b] \mid \forall \delta > 0\ \exists y \in (x - \delta, x + \delta) \cap [a, b] \text{ such that } | f(y) - f(x) | \ge 1/m \}. \]

Then, the set of discontinuities of f, D, can be expressed as $D = \cup_{m=1}^{\infty} E_m$.

Now let π = {x_0, x_1, . . . , x_n} be any partition of [a, b]. Then, given any positive integer m, the open subinterval $(x_{k-1}, x_k)$ either intersects E_m or it does not. Define

\begin{align*}
A_1 &= \{ k \in \{1, \ldots, n\} \mid (x_{k-1}, x_k) \cap E_m \ne \emptyset \}, \\
A_2 &= \{ k \in \{1, \ldots, n\} \mid (x_{k-1}, x_k) \cap E_m = \emptyset \}.
\end{align*}

By construction, we have A_1 ∩ A_2 = ∅ and A_1 ∪ A_2 = {1, . . . , n}.

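We assume f is integrable on [a, b]. So, by the Riemann Criterion, given ε > 0, and a positive integer m, there is a partition π_0 such that

\[ U(f, \pi) - L(f, \pi) < \frac{\varepsilon}{2m}, \quad \forall\ \pi_0 \preceq \pi. \tag{$**$} \]

It follows that if π_0 = {y_0, y_1, . . . , y_n}, then

\begin{align*}
U(f, \pi_0) - L(f, \pi_0) &= \sum_{k=1}^{n} (M_k - m_k) \Delta y_k \\
&= \sum_{k \in A_1} (M_k - m_k) \Delta y_k + \sum_{k \in A_2} (M_k - m_k) \Delta y_k
\end{align*}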

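If k is in A_1, then by definition, there is a point $u_k$ in $E_m \cap (y_{k-1}, y_k)$ and a point $v_k$ in $(y_{k-1}, y_k)$ so that $| f(u_k) - f(v_k) | \ge 1/m$. Also, since $u_k$ and $v_k$ are both in $(y_{k-1}, y_k)$,

\[ M_k - m_k \ge | f(u_k) - f(v_k) |. \]

Thus,

\[ \sum_{k \in A_1} (M_k - m_k) \Delta y_k \ge \sum_{k \in A_1} | f(u_k) - f(v_k) | \Delta y_k \ge \frac{1}{m} \sum_{k \in A_1} \Delta y_k. \]

Also, the second term, $\sum_{k \in A_2} (M_k - m_k) \Delta y_k$, is non-negative and so using Equation ∗∗, we find

\[ \frac{\varepsilon}{2m} > U(f, \pi_0) - L(f, \pi_0) \ge \frac{1}{m} \sum_{k \in A_1} \Delta y_k, \]

which implies $\sum_{k \in A_1} \Delta y_k < \varepsilon/2$.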

The partition π_0 divides [a, b] as follows:

\[ [a, b] = \Big( \bigcup_{k \in A_1} (y_{k-1}, y_k) \Big) \cup \Big( \bigcup_{k \in A_2} (y_{k-1}, y_k) \Big) \cup \{ y_0, \ldots, y_n \} = C_1 \cup C_2 \cup \pi_0 \]

By the definition of the index set A_2, we know E_m does not intersect C_2. Hence, we can say

\[ E_m = \big( C_1 \cap E_m \big) \cup \big( E_m \cap \pi_0 \big) \]

Therefore, we have $C_1 \cap E_m \subseteq \cup_{k \in A_1} (y_{k-1}, y_k)$ with $\sum_{k \in A_1} \Delta y_k < \varepsilon/2$. Since ε is arbitrary, we see C_1 ∩ E_m has content zero. The other set E_m ∩ π_0 consists of finitely many points and so it also has content zero by Comment 5.3.1, which follows Definition 5.3.1. This shows that E_m has content zero since it is the union of two sets of content zero. We finish by noting D = ∪E_m also has content zero. The proof of this we leave as an exercise.

Exercise 5.3.1. Prove that if Fn ⊆ [a, b] has content zero for all n, then F = ∪Fn also has content zero.

Part III

Riemann - Stieltjes Integrals

Chapter 6

The Riemann-Stieltjes Integral

In classical analysis, the Riemann-Stieltjes integral was the first attempt to generalize the idea of the size,

or measure, of a subset of the real numbers. Instead of simply using the length of an interval as a measure,

we can use any function that satisfies the same properties as the length function.

Let f and g be any bounded functions on the finite interval [a, b]. If π is any partition of [a, b] and σ is any evaluation set, we can extend the notion of the Riemann sum S(f, π, σ) to the more general Riemann - Stieltjes sum as follows:

Definition 6.0.3 (The Riemann - Stieltjes Sum).
Let f, g ∈ B[a, b], π ∈ Π[a, b] and σ ⊆ π. Let the partition points in π be {x_0, x_1, . . . , x_p} and the evaluation points be {s_1, s_2, . . . , s_p} as usual. Define

\[ \Delta g_j = g(x_j) - g(x_{j-1}), \quad 1 \le j \le p, \]

and the Riemann - Stieltjes sum for integrand f and integrator g for partition π and evaluation set σ by

\[ S(f, g, \pi, \sigma) = \sum_{j \in \pi} f(s_j)\, \Delta g_j \]

This is also called the Riemann - Stieltjes sum for the function f with respect to the function g for partition π and evaluation set σ.
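The definition translates directly into a few lines of code. The following is a minimal sketch under our own naming (`rs_sum`, `uniform_partition`); partitions and evaluation sets are passed as plain lists, and floating point arithmetic is used, so results are approximate.

```python
def rs_sum(f, g, partition, evals):
    """S(f, g, pi, sigma) = sum_j f(s_j) * (g(x_j) - g(x_{j-1}))."""
    if len(evals) != len(partition) - 1:
        raise ValueError("need exactly one evaluation point per subinterval")
    total = 0.0
    for j in range(1, len(partition)):
        x_prev, x_j, s_j = partition[j - 1], partition[j], evals[j - 1]
        if not x_prev <= s_j <= x_j:
            raise ValueError("evaluation point outside its subinterval")
        total += f(s_j) * (g(x_j) - g(x_prev))  # f(s_j) * Delta g_j
    return total


def uniform_partition(a, b, p):
    """Partition of [a, b] into p subintervals of equal length."""
    return [a + (b - a) * j / p for j in range(p + 1)]


if __name__ == "__main__":
    xs = uniform_partition(0.0, 1.0, 1000)
    mids = [(u + v) / 2 for u, v in zip(xs, xs[1:])]
    # With integrator g(x) = x this is an ordinary Riemann sum for x^2 on [0, 1].
    print(rs_sum(lambda x: x * x, lambda x: x, xs, mids))
    # With integrator g(x) = x^2 and integrand f(x) = x the sums approach 2/3.
    print(rs_sum(lambda x: x, lambda x: x * x, xs, mids))
```

Taking g(x) = x recovers the Riemann sum of Definition 4.1.1, which is the comparison the text suggests.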

Of course, you should compare this definition to Definition 4.1.1 to see the differences! We can

then define the Riemann - Stieltjes integral of f with respect to g using language very similar to that of

Definition 4.1.2.

Definition 6.0.4 (The Riemann - Stieltjes Integral).
Let f, g ∈ B[a, b]. If there is a real number I so that for all positive ε, there is a partition π_0 ∈ Π[a, b] so that

\[ \big| S(f, g, \pi, \sigma) - I \big| < \varepsilon \]

for all partitions π that refine π_0 and evaluation sets σ from π, then we say f is Riemann - Stieltjes integrable with respect to g on [a, b]. We call the value I the Riemann - Stieltjes integral of f with respect to g on [a, b]. We use the symbol

\[ I = RS(f, g; a, b) \]

to denote this value. We call f the integrand and g the integrator.

As usual, there is the question of what pairs of functions (f, g) will turn out to have a finite Riemann - Stieltjes integral. The collection of the functions f from B[a, b] that are Riemann - Stieltjes integrable with respect to a given integrator g from B[a, b] is denoted by RS[g, a, b].

Comment 6.0.2. If g(x) = x on [a, b], then RS[g, a, b] = RI[a, b] and $RS(f, g; a, b) = \int_a^b f(x)\,dx$.

Comment 6.0.3. We will use the standard conventions: RS(f, g; a, b) = −RS(f, g; b, a) and RS(f, g; a, a) = 0.

6.1 Standard Properties Of The Riemann - Stieltjes Integral

We can easily prove the usual properties that we expect an integration type mapping to have.

Theorem 6.1.1 (The Linearity of the Riemann - Stieltjes Integral).
If f_1 and f_2 are in RS[g, a, b], then

(i) c_1 f_1 + c_2 f_2 ∈ RS[g, a, b], ∀ c_1, c_2 ∈ ℝ

(ii) RS(c_1 f_1 + c_2 f_2, g; a, b) = c_1 RS(f_1, g; a, b) + c_2 RS(f_2, g; a, b)

If f ∈ RS[g_1, a, b] and f ∈ RS[g_2, a, b] then

(i) f ∈ RS[c_1 g_1 + c_2 g_2, a, b], ∀ c_1, c_2 ∈ ℝ

(ii) RS(f, c_1 g_1 + c_2 g_2; a, b) = c_1 RS(f, g_1; a, b) + c_2 RS(f, g_2; a, b)

Proof 6.1.1.

Exercise 6.1.1. We leave these proofs to you as an exercise.

The proof of these statements is quite similar in spirit to those of Theorem 4.1.1. You should compare the techniques!

To give you a feel for the kind of partition arguments we use for Riemann - Stieltjes proofs (you will no doubt enjoy working out these details for yourselves in various exercises), we will go through the proof of the standard Integration By Parts formula in this context.

Theorem 6.1.2 (Riemann Stieltjes Integration By Parts).
If f ∈ RS[g, a, b], then g ∈ RS[f, a, b] and

\[ RS(g, f; a, b) = f(x) g(x) \Big|_a^b - RS(f, g; a, b) \]

Proof 6.1.2. Since f ∈ RS[g, a, b], there is a number I_f = RS(f, g; a, b) so that given a positive ε, there is a partition π_0 such that

\[ \big| S(f, g, \pi, \sigma) - I_f \big| < \varepsilon, \quad \pi_0 \preceq \pi,\ \sigma \subseteq \pi. \tag{$\alpha$} \]

For such a partition π and evaluation set σ ⊆ π, we have

\[ \pi = \{ x_0, x_1, \ldots, x_p \}, \qquad \sigma = \{ s_1, \ldots, s_p \} \]

and

\[ S(g, f, \pi, \sigma) = \sum_{\pi} g(s_j) \Delta f_j. \]

We can rewrite this as

\[ S(g, f, \pi, \sigma) = \sum_{\pi} g(s_j) f(x_j) - \sum_{\pi} g(s_j) f(x_{j-1}) \tag{$\beta$} \]

Also, we have the identity (it is a collapsing sum)

\[ \sum_{\pi} \Big( f(x_j) g(x_j) - f(x_{j-1}) g(x_{j-1}) \Big) = f(b) g(b) - f(a) g(a). \tag{$\gamma$} \]

Thus, using Equation β and Equation γ, we have

\begin{align*}
f(b) g(b) - f(a) g(a) - S(g, f, \pi, \sigma) &= \sum_{\pi} f(x_j) \Big( g(x_j) - g(s_j) \Big) \tag{$\xi$} \\
&\quad + \sum_{\pi} f(x_{j-1}) \Big( g(s_j) - g(x_{j-1}) \Big)
\end{align*}

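Since σ ⊆ π, we have the ordering

\[ a = x_0 \le s_1 \le x_1 \le s_2 \le x_2 \le \ldots \le x_{p-1} \le s_p \le x_p = b. \]

Hence, the points above are a refinement of π we will call π′. Relabel the points of π′ as

\[ \pi' = \{ y_0, y_1, \ldots, y_q \} \]

and note that the original points of π now form an evaluation set σ′ of π′: the point $x_{j-1}$ evaluates the subinterval $[x_{j-1}, s_j]$ and the point $x_j$ evaluates $[s_j, x_j]$. We can therefore rewrite Equation ξ as

\[ f(b) g(b) - f(a) g(a) - S(g, f, \pi, \sigma) = \sum_{\pi'} f(s'_j) \Delta g_j = S(f, g, \pi', \sigma'), \]

where $s'_j$ denotes the point of σ′ in $[y_{j-1}, y_j]$ and $\Delta g_j = g(y_j) - g(y_{j-1})$.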

Let I_g = f(b)g(b) − f(a)g(a) − I_f. Then since $\pi_0 \preceq \pi \preceq \pi'$, we can apply Equation α to conclude

\begin{align*}
\varepsilon &> \big| S(f, g, \pi', \sigma') - I_f \big| \\
&= \big| f(b) g(b) - f(a) g(a) - S(g, f, \pi, \sigma) - I_f \big| \\
&= \big| S(g, f, \pi, \sigma) - I_g \big|
\end{align*}

Since our choice of refinement π of π_0 and evaluation set σ was arbitrary, we have shown that g ∈ RS[f, a, b] with value

\[ RS(g, f; a, b) = f(x) g(x) \Big|_a^b - RS(f, g; a, b). \]
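The key step of this proof is a purely algebraic identity between two Riemann - Stieltjes sums, so it can be checked exactly on a small example. The sketch below is an illustration under our own choices: the helper names `rs_sum` and `refine_with_evals`, the polynomials f and g, and the particular partition are all hypothetical. Exact rational arithmetic is used so both sides can be compared with `==`.

```python
from fractions import Fraction


def rs_sum(f, g, partition, evals):
    """Riemann-Stieltjes sum: sum_j f(s_j) * (g(x_j) - g(x_{j-1}))."""
    return sum(
        f(s) * (g(x1) - g(x0))
        for x0, x1, s in zip(partition, partition[1:], evals)
    )


def refine_with_evals(partition, evals):
    """Build pi' = x0 <= s1 <= x1 <= ... <= sp <= xp and its evaluation set sigma'."""
    new_partition, new_evals = [partition[0]], []
    for x0, x1, s in zip(partition, partition[1:], evals):
        new_partition += [s, x1]
        new_evals += [x0, x1]  # x0 evaluates [x0, s]; x1 evaluates [s, x1]
    return new_partition, new_evals


if __name__ == "__main__":
    f = lambda x: x**3 - x
    g = lambda x: 2 * x * x + 1
    P = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]
    E = [Fraction(1, 8), Fraction(1, 2), Fraction(3, 4)]
    P2, E2 = refine_with_evals(P, E)
    lhs = f(P[-1]) * g(P[-1]) - f(P[0]) * g(P[0]) - rs_sum(g, f, P, E)
    rhs = rs_sum(f, g, P2, E2)
    print(lhs, rhs, lhs == rhs)
```

Note that an evaluation point may coincide with a partition point (here s_2 = x_2 = 1/2); the resulting degenerate subinterval of π′ simply contributes zero to the sum.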

6.2 Step Function Integrators

We now turn our attention to the question of what pairs of functions might have a Riemann - Stieltjes

integral. All we know so far is that if g(x) = x on [a, b] is labeled as g = id, then RS[id, a, b] = RI[a, b].

First, we need to define what we mean by a Step Function.

Definition 6.2.1 (Step Function).
We say g ∈ B[a, b] is a Step Function if g only has finitely many jump discontinuities on [a, b] and g is constant on the intervals between the jump discontinuities. Thus, we may assume there is a non negative integer p so that the jump discontinuities are ordered and labeled as

\[ c_0 < c_1 < c_2 < \ldots < c_p \]

and g is constant on each subinterval $(c_{k-1}, c_k)$ for 1 ≤ k ≤ p.

Comment 6.2.1. We can see $g(c_k^-)$ and $g(c_k^+)$ both exist and are finite with $g(c_k^-)$ the value g has on $(c_{k-1}, c_k)$ and $g(c_k^+)$ the value g has on $(c_k, c_{k+1})$. At the endpoints, $g(a^+)$ and $g(b^-)$ are also defined. The actual finite values g takes on at the points $c_j$ are completely arbitrary.

We can prove a variety of results about Riemann Stieltjes integrals with step function integrators.

Lemma 6.2.1 (One Jump Step Functions As Integrators One).
Let g ∈ B[a, b] be a step function having only one jump at some c in [a, b]. Let f ∈ B[a, b]. Then

f ∈ C[a, b] implies f ∈ RS[g, a, b] and

• If c ∈ (a, b), then RS(f, g; a, b) = f(c)[g(c+)− g(c−)].

• If c = a, then RS(f, g; a, b) = f(a)[g(a+)− g(a)].

• If c = b, then RS(f, g; a, b) = f(b)[g(b)− g(b−)].

Proof 6.2.1. We treat the case c ∈ (a, b) first. Let π be any partition of [a, b]. We will assume that c is a partition point of π because if not, we can use the argument we have used before to construct an appropriate refinement as done, for example, in the proof of Lemma 4.5.1. Letting the partition points be

\[ \pi = \{ x_0, x_1, \ldots, x_p \}, \]

we see there is a partition point $x_{k_0} = c$ with $k_0 \ne 0$ and $k_0 \ne p$. Hence, on $[x_{k_0 - 1}, x_{k_0}] = [x_{k_0 - 1}, c]$, $\Delta g_{k_0} = g(c) - g(x_{k_0 - 1})$. However, since g has a single jump at c, we see that the value $g(x_{k_0 - 1})$ must be $g(c^-)$. Thus, $\Delta g_{k_0} = g(c) - g(c^-)$. A similar argument shows that $\Delta g_{k_0 + 1} = g(c^+) - g(c)$. Further, since g has only one jump, all the other terms $\Delta g_k$ are zero. Hence, for any evaluation set σ in π, we have σ = {s_1, . . . , s_p} and

\begin{align*}
S(f, g, \pi, \sigma) &= f(s_{k_0}) \Delta g_{k_0} + f(s_{k_0 + 1}) \Delta g_{k_0 + 1} \\
&= f(s_{k_0}) \big( g(c) - g(c^-) \big) + f(s_{k_0 + 1}) \big( g(c^+) - g(c) \big) \\
&= \big( f(s_{k_0}) - f(c) + f(c) \big) \big( g(c) - g(c^-) \big) + \big( f(s_{k_0 + 1}) - f(c) + f(c) \big) \big( g(c^+) - g(c) \big)
\end{align*}

Thus, we obtain

\begin{align*}
S(f, g, \pi, \sigma) &= \big( f(s_{k_0}) - f(c) \big) \big( g(c) - g(c^-) \big) \\
&\quad + \big( f(s_{k_0 + 1}) - f(c) \big) \big( g(c^+) - g(c) \big) \tag{$\alpha$} \\
&\quad + f(c) \big( g(c^+) - g(c^-) \big)
\end{align*}

We know f is continuous at c. Let $A = \max\big( | g(c) - g(c^-) |,\ | g(c^+) - g(c) | \big)$. Then A > 0 because g has a jump at c. Since f is continuous at c, given ε > 0, there is a δ > 0, so that

\[ | f(x) - f(c) | < \frac{\varepsilon}{2A}, \quad x \in (c - \delta, c + \delta) \cap [a, b]. \tag{$\beta$} \]

In fact, since c is an interior point of [a, b], we can choose δ so small that (c − δ, c + δ) ⊆ [a, b]. Now, if π_0 is any partition with || π_0 || < δ containing c as a partition point, we can argue as we did in the prefatory remarks to this proof. Thus, there is an index k_0 so that

\[ [x_{k_0 - 1}, x_{k_0} = c] \subseteq (c - \delta, c], \qquad [c = x_{k_0}, x_{k_0 + 1}] \subseteq [c, c + \delta). \]

This implies that

\[ [x_{k_0 - 1}, x_{k_0 + 1}] \subseteq (c - \delta, c + \delta) \]

and so the evaluation points, labeled as usual, $s_{k_0}$ and $s_{k_0 + 1}$ are also in (c − δ, c + δ). Applying Equation β, we have

\[ | f(s_{k_0}) - f(c) | < \frac{\varepsilon}{2A}, \qquad | f(s_{k_0 + 1}) - f(c) | < \frac{\varepsilon}{2A}. \]

From Equation α, we then have

\begin{align*}
\Big| S(f, g, \pi, \sigma) - f(c) \big( g(c^+) - g(c^-) \big) \Big| &\le \Big| \big( f(s_{k_0}) - f(c) \big) \big( g(c) - g(c^-) \big) \Big| + \Big| \big( f(s_{k_0 + 1}) - f(c) \big) \big( g(c^+) - g(c) \big) \Big| \\
&< \frac{\varepsilon}{2A} \big| g(c) - g(c^-) \big| + \frac{\varepsilon}{2A} \big| g(c^+) - g(c) \big| \\
&\le \varepsilon
\end{align*}

Finally, if $\pi_0 \preceq \pi$, then || π || < δ also and the same argument shows that for any evaluation set σ ⊆ π, we have

\[ \Big| S(f, g, \pi, \sigma) - f(c) \big( g(c^+) - g(c^-) \big) \Big| < \varepsilon \]

This proves that f ∈ RS[g, a, b] and $RS(f, g; a, b) = f(c) \big( g(c^+) - g(c^-) \big)$. Now, if c = a or c = b, the arguments are quite similar, except one sided and we find $RS(f, g; a, b) = f(a) \big( g(a^+) - g(a) \big)$ or $RS(f, g; a, b) = f(b) \big( g(b) - g(b^-) \big)$.
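A numerical experiment makes the lemma concrete: for a one-jump integrator only the two subintervals touching c contribute, and the sums settle on f(c)(g(c+) − g(c−)) whatever value g takes at c itself. The sketch below is illustrative only; the jump location C = 0.5, the values 0, 5, 3 of g, the integrand cos, and the helper names are all our own assumptions.

```python
from math import cos

C = 0.5  # hypothetical jump location, chosen interior to [0, 1]


def g(x):
    """Step integrator with one jump at C: g(C-) = 0, g(C) = 5, g(C+) = 3."""
    if x < C:
        return 0.0
    if x == C:
        return 5.0  # the value at the jump itself is arbitrary
    return 3.0


def rs_sum(f, g, partition, evals):
    """Riemann-Stieltjes sum: sum_j f(s_j) * (g(x_j) - g(x_{j-1}))."""
    return sum(
        f(s) * (g(x1) - g(x0))
        for x0, x1, s in zip(partition, partition[1:], evals)
    )


def partition_through_c(p):
    """Uniform partition of [0, 1] with 2p subintervals, so C = 0.5 is a partition point."""
    return [j / (2 * p) for j in range(2 * p + 1)]


def midpoints(partition):
    return [(x0 + x1) / 2 for x0, x1 in zip(partition, partition[1:])]


if __name__ == "__main__":
    target = cos(C) * (3.0 - 0.0)  # f(c) * (g(c+) - g(c-))
    for p in (5, 50, 500, 5000):
        xs = partition_through_c(p)
        print(p, rs_sum(cos, g, xs, midpoints(xs)), target)
```

The arbitrary value g(C) = 5 enters both nonzero terms with opposite signs, and continuity of f at c is what makes its contribution cancel in the limit.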

Lemma 6.2.2 (One Jump Step Functions As Integrators Two).
Let g ∈ B[a, b] be a step function having only one jump at some c in [a, b]. Let f ∈ B[a, b]. If

c ∈ (a, b), f(c−) = f(c) and g(c+) = g(c), then f ∈ RS[g, a, b]. We can rephrase this as: if c

is an interior point, f is continuous from the left at c and g is continuous from the right at c, then

f ∈ RS[g, a, b] and

• If c ∈ (a, b), then RS(f, g; a, b) = f(c)[g(c)− g(c−)].

• If c = a, then RS(f, g; a, b) = f(a)[g(a)− g(a)] = 0.

• If c = b, then RS(f, g; a, b) = f(b)[g(b)− g(b−)].

Proof 6.2.2. To prove this result, we first use the initial arguments of Lemma 6.2.1 and then we note in this case, f is continuous from the left at c so $f(c^-) = f(c)$. Further, g is continuous from the right; hence, $g(c) = g(c^+)$. Thus, Equation α reduces to

\[ S(f, g, \pi, \sigma) = \big( f(s_{k_0}) - f(c) \big) \big( g(c) - g(c^-) \big) + f(c) \big( g(c) - g(c^-) \big) \tag{$\alpha'$} \]

Let $L = | g(c) - g(c^-) |$, which is positive because g has a jump at c and $g(c) = g(c^+)$. Then, given ε > 0, since f is continuous from the left, there is a δ > 0 so that

\[ | f(x) - f(c) | < \varepsilon / L, \quad x \in (c - \delta, c] \subseteq [a, b]. \]

As usual, we can restrict our attention to partitions that contain the point c. We continue to use $x_i$’s and $s_j$’s to represent points in these partitions and associated evaluation sets. Let π_0 be such a partition with $x_{k_0} = c$ and || π_0 || < δ. Let σ be any evaluation set of π_0. Then, we have

\[ [x_{k_0 - 1}, x_{k_0}] \subseteq (c - \delta, c] \]

and thus

\[ | f(s_{k_0}) - f(c) | < \varepsilon / L. \]

Hence,

\[ \Big| S(f, g, \pi_0, \sigma) - f(c) \big( g(c) - g(c^-) \big) \Big| = \big| f(s_{k_0}) - f(c) \big| \, \big| g(c) - g(c^-) \big| < \varepsilon. \]

Finally, just as in the previous proof, if $\pi_0 \preceq \pi$, then || π || < δ also and the same argument shows that for any evaluation set σ ⊆ π, we have

\[ \Big| S(f, g, \pi, \sigma) - f(c) \big( g(c) - g(c^-) \big) \Big| < \varepsilon \]

This proves that f ∈ RS[g, a, b] and $RS(f, g; a, b) = f(c) \big( g(c) - g(c^-) \big)$. Now, if c = a or c = b, the arguments are again similar, except one sided and we find $RS(f, g; a, b) = f(a) \big( g(a) - g(a) \big) = 0$ or $RS(f, g; a, b) = f(b) \big( g(b) - g(b^-) \big)$.

Lemma 6.2.3 (One Jump Step Functions As Integrators Three).
Let g ∈ B[a, b] be a step function having only one jump at some c in [a, b]. Let f ∈ B[a, b]. If

c ∈ (a, b), f(c+) = f(c) and g(c−) = g(c), then f ∈ RS[g, a, b]. We can rephrase this as: if c

is an interior point, f is continuous from the right at c and g is continuous from the left at c, then

f ∈ RS[g, a, b] and

• If c ∈ (a, b), then RS(f, g; a, b) = f(c)[g(c+)− g(c)].

• If c = a, then RS(f, g; a, b) = f(a)[g(a+)− g(a)].

• If c = b, then RS(f, g; a, b) = f(b)[g(b)− g(b)] = 0.

Proof 6.2.3. This is quite similar to the argument presented for Lemma 6.2.2. We find f ∈ RS[g, a, b] and $RS(f, g; a, b) = f(c) \big( g(c^+) - g(c) \big)$. Now, if c = a or c = b, the arguments are again similar, except one sided and we find $RS(f, g; a, b) = f(a) \big( g(a^+) - g(a) \big)$ or $RS(f, g; a, b) = f(b) \big( g(b) - g(b) \big) = 0$.

We can then generalize to a finite number of jumps.

Lemma 6.2.4 (Finite Jump Step Functions As Integrators).
Let g be a step function on [a, b] with jump discontinuities at

\[ a \le c_0, c_1, \ldots, c_{k-1}, c_k \le b. \]

Assume f ∈ B[a, b]. Then, if at each $c_j$ one of the following holds:

(i) f is continuous at $c_j$, or

(ii) f is left continuous at $c_j$ and g is right continuous at $c_j$, or

(iii) f is right continuous at $c_j$ and g is left continuous at $c_j$,

then f ∈ RS[g, a, b] and

\[ RS(f, g; a, b) = f(a) \big( g(a^+) - g(a) \big) + \sum_{j=0}^{k} f(c_j) \big( g(c_j^+) - g(c_j^-) \big) + f(b) \big( g(b) - g(b^-) \big). \]

Proof 6.2.4. Use Lemma 6.2.1, Lemma 6.2.2 and Lemma 6.2.3 repeatedly.

6.3 Monotone Integrators

The next step is to learn how to deal with integrators that are monotone functions. To do this, we extend

the notion of Darboux Upper and Lower Sums in the obvious way.

Definition 6.3.1 (Upper and Lower Riemann - Stieltjes Darboux Sums).
Let f ∈ B[a, b] and let g ∈ B[a, b] be monotone increasing. Let π be any partition of [a, b] with partition points

\[ \pi = \{ x_0, x_1, \ldots, x_p \} \]

as usual. Define

\[ M_j = \sup_{x \in [x_{j-1}, x_j]} f(x), \qquad m_j = \inf_{x \in [x_{j-1}, x_j]} f(x). \]

The Lower Riemann - Stieltjes Darboux Sum for f with respect to g on [a, b] for the partition π is

\[ L(f, g, \pi) = \sum_{\pi} m_j \Delta g_j \]

and the Upper Riemann - Stieltjes Darboux Sum for f with respect to g on [a, b] for the partition π is

\[ U(f, g, \pi) = \sum_{\pi} M_j \Delta g_j \]

Comment 6.3.1. It is clear that for any partition π and associated evaluation set σ, we have the usual inequality chain:

\[ L(f, g, \pi) \le S(f, g, \pi, \sigma) \le U(f, g, \pi) \]

The following theorems have proofs very similar to the ones we did for Theorem 4.2.1 and Theorem

4.2.2.

Theorem 6.3.1 ($\pi \preceq \pi'$ Implies L(f, g, π) ≤ L(f, g, π′) and U(f, g, π) ≥ U(f, g, π′)).
Assume g is a bounded monotone increasing function on [a, b] and f ∈ B[a, b]. Then if $\pi \preceq \pi'$, then L(f, g, π) ≤ L(f, g, π′) and U(f, g, π) ≥ U(f, g, π′).

Theorem 6.3.2 (L(f, g, π_1) ≤ U(f, g, π_2)).
Let π_1 and π_2 be any two partitions in Π[a, b]. Then L(f, g, π_1) ≤ U(f, g, π_2).

These two theorems allow us to prove the following

Theorem 6.3.3 (The Upper And Lower Riemann - Stieltjes Darboux Integral Are Finite).
Let f ∈ B[a, b] and let g be a bounded monotone increasing function on [a, b]. Let $\mathcal{U} = \{ L(f, g, \pi) \mid \pi \in \Pi[a, b] \}$ and $\mathcal{V} = \{ U(f, g, \pi) \mid \pi \in \Pi[a, b] \}$. Define $L(f, g) = \sup \mathcal{U}$, and $U(f, g) = \inf \mathcal{V}$. Then L(f, g) and U(f, g) are both finite. Moreover, L(f, g) ≤ U(f, g).
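The refinement inequalities can be observed exactly on a simple example. Computing suprema and infima of a general f is not something code can do, so the sketch below assumes f is increasing, in which case m_j = f(x_{j-1}) and M_j = f(x_j). The choices f(x) = x², g(x) = x³ on [0, 1], the helper names, and the use of uniform partitions are our own; the common value 3/5 comes from the standard reduction of ∫ x² d(x³) to ∫ 3x⁴ dx, which is used here only as a reference number.

```python
from fractions import Fraction


def lower_upper(f, g, partition):
    """Return (L(f, g, pi), U(f, g, pi)), ASSUMING f is increasing on [a, b]."""
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(x0) * (g(x1) - g(x0)) for x0, x1 in pairs)  # m_j = f(x_{j-1})
    upper = sum(f(x1) * (g(x1) - g(x0)) for x0, x1 in pairs)  # M_j = f(x_j)
    return lower, upper


def uniform(p):
    """Uniform partition of [0, 1] with p subintervals, as exact rationals."""
    return [Fraction(j, p) for j in range(p + 1)]


if __name__ == "__main__":
    f = lambda x: x * x
    g = lambda x: x**3
    for p in (4, 8, 16, 32):  # each partition refines the previous one
        lo, up = lower_upper(f, g, uniform(p))
        print(p, float(lo), float(up))
```

As the partitions are refined the lower sums increase and the upper sums decrease, squeezing the value 3/5 between them.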

We can then define upper and lower Riemann - Stieltjes integrals analogous to the way we defined the upper and lower Riemann integrals.

Definition 6.3.2 (Upper and Lower Riemann - Stieltjes Integrals).
Let f ∈ B[a, b] and g be a bounded, monotone increasing function on [a, b]. The Upper and Lower Riemann - Stieltjes integrals of f with respect to g are U(f, g) and L(f, g), respectively.

Thus, we can define the Riemann - Stieltjes Darboux integral of f ∈ B[a, b] with respect to the bounded monotone increasing integrator g.

Definition 6.3.3 (The Riemann - Stieltjes Darboux Integral).
Let f ∈ B[a, b] and g be a bounded, monotone increasing function on [a, b]. We say f is Riemann - Stieltjes Darboux integrable with respect to the integrator g if U(f, g) = L(f, g). We denote this common value by RSD(f, g; a, b).

6.4 The Riemann - Stieltjes Equivalence Theorem

The connection between the Riemann - Stieltjes and Riemann - Stieltjes Darboux integrals is obtained using an analog of the familiar Riemann Condition we have seen before in Definition 4.2.4.

Definition 6.4.1 (The Riemann - Stieltjes Criterion For Integrability).
Let f ∈ B[a, b] and g be a bounded monotone increasing function on [a, b]. We say the Riemann Condition or Criterion holds for f with respect to g if for every positive ε there is a partition π_0 of [a, b] so that

\[ U(f, g, \pi) - L(f, g, \pi) < \varepsilon, \quad \pi_0 \preceq \pi. \]

We can then prove an equivalence theorem for Riemann - Stieltjes and Riemann - Stieltjes Darboux integrability.

Theorem 6.4.1 (The Riemann Stieltjes Integral Equivalence Theorem).
Let f ∈ B[a, b] and g be a bounded monotone increasing function on [a, b]. Then the following are equivalent.

(i) f ∈ RS[g, a, b].

(ii) Riemann’s Criterion holds for f with respect to g.

(iii) f is Riemann - Stieltjes Darboux Integrable, i.e., L(f, g) = U(f, g), and RS(f, g; a, b) = RSD(f, g; a, b).

Proof 6.4.1. The arguments are essentially the same as presented in the proof of Theorem 4.2.4 and hence, you will be asked to go through the original proof and replace occurrences of $\Delta x_j$ with $\Delta g_j$ and b − a with g(b) − g(a).

Comment 6.4.1. We have been very careful to distinguish between Riemann - Stieltjes and Riemann - Stieltjes Darboux integrability. Since we now know they are equivalent, we can begin to use a common notation. Recall, the common notation for the Riemann integral is $\int_a^b f(x)\,dx$. We will now begin using the notation $\int_a^b f(x)\,dg(x)$ to denote the common value RS(f, g; a, b) = RSD(f, g; a, b). We thus know $\int_a^b f(x)\,dx$ is equivalent to the Riemann - Stieltjes integral of f with respect to the integrator g(x) = x. Hence, in this case, we could write g(x) = id(x) = x, where id is the identity function. We could then use the notation $\int_a^b f(x)\,dx = \int_a^b f(x)\,d\,id$. However, that is cumbersome. We can easily remember that the identity mapping is simply x itself. So replace d id by dx to obtain $\int_a^b f(x)\,dx$. The use of the (x) in these notations has always been helpful to allow us to handle substitution type rules, but it is certainly somewhat awkward. A reasonable change of notation would be to go to using boldface for the f and g in these integrals and write $\int_a^b \mathbf{f}\,d\mathbf{g}$ giving $\int_a^b \mathbf{f}\,d\mathbf{x}$ for the simpler Riemann integral.

You can see no matter what we do the symbolism becomes awkward. For example, suppose f(x) = sin(x²) on [0, π] and g(x) = x². Then, how do we write $\int_0^{\pi} \mathbf{f}\,d\mathbf{g}$? We will usually abuse our integral notation and write $\int_0^{\pi} \sin(x^2)\,d(x^2)$.

6.5 Further Properties Of The Riemann-Stieljes Integral

We can prove the following useful collection of facts about Riemann - Stieljes integrals.

Theorem 6.5.1 (Properties Of The Riemann Stieljes Integral). Let the integrator g be bounded and monotone increasing on [a, b]. Assume f, f1 and f2 are in RS[g, a, b]. Then

(i) | f | ∈ RS[g, a, b];

(ii)
\[ \left| \int_a^b f(x)\,dg(x) \right| \le \int_a^b |f(x)|\,dg(x); \]

(iii) $f^+ = \max\{f, 0\} \in RS[g, a, b]$;

(iv) $f^- = \max\{-f, 0\} \in RS[g, a, b]$;

(v)
\begin{align*}
\int_a^b f(x)\,dg(x) &= \int_a^b [f^+(x) - f^-(x)]\,dg(x) = \int_a^b f^+(x)\,dg(x) - \int_a^b f^-(x)\,dg(x), \\
\int_a^b |f(x)|\,dg(x) &= \int_a^b [f^+(x) + f^-(x)]\,dg(x) = \int_a^b f^+(x)\,dg(x) + \int_a^b f^-(x)\,dg(x);
\end{align*}

(vi) $f^2 \in RS[g, a, b]$;

(vii) $f_1 f_2 \in RS[g, a, b]$;

(viii) If there exists m such that 0 < m ≤ f(x) for all x in [a, b], then 1/f ∈ RS[g, a, b].

Proof 6.5.1. The arguments are straightforward modifications of the proof of Theorem 4.3.1, replacing b − a by g(b) − g(a) and ∆xj by ∆gj.

We can also easily prove the following fundamental estimate.

Theorem 6.5.2 (Fundamental Riemann Stieljes Integral Estimates). Let g be bounded and monotone increasing on [a, b] and let f ∈ RS[g, a, b]. Let $m = \inf_x f(x)$ and let $M = \sup_x f(x)$. Then

\[ m\,(g(b) - g(a)) \le \int_a^b f(x)\,dg(x) \le M\,(g(b) - g(a)). \]


In addition, Riemann - Stieljes integrals are also order preserving as we can modify the proof of

Theorem 4.1.3 quite easily.

Theorem 6.5.3 (The Riemann Stieljes Integral Is Order Preserving). Let g be bounded and monotone increasing on [a, b] and f, f1, f2 ∈ RS[g, a, b] with f1 ≤ f2 on [a, b]. Then the Riemann Stieljes integral is order preserving in the sense that

(i)
\[ f \ge 0 \;\Rightarrow\; \int_a^b f(x)\,dg(x) \ge 0; \]

(ii)
\[ f_1 \le f_2 \;\Rightarrow\; \int_a^b f_1(x)\,dg(x) \le \int_a^b f_2(x)\,dg(x). \]

We also want to establish the familiar summation property of the Riemann Stieljes integral over an

interval [a, b] = [a, c] ∪ [c, b]. We can modify the proof of the corresponding result in Lemma 4.5.1 as

usual to obtain Lemma 6.5.4.

Lemma 6.5.4 (The Upper And Lower Riemann - Stieljes Darboux Integral Is Additive On Intervals). Let g be bounded and monotone increasing on [a, b] and f ∈ B[a, b]. Let c ∈ (a, b). Let

\[ \underline{\int_a^b} f(x)\,dg(x) = L(f, g) \quad \text{and} \quad \overline{\int_a^b} f(x)\,dg(x) = U(f, g) \]

denote the lower and upper Riemann - Stieljes Darboux integrals of f with respect to g on [a, b], respectively. Then we have

\begin{align*}
\underline{\int_a^b} f(x)\,dg(x) &= \underline{\int_a^c} f(x)\,dg(x) + \underline{\int_c^b} f(x)\,dg(x), \\
\overline{\int_a^b} f(x)\,dg(x) &= \overline{\int_a^c} f(x)\,dg(x) + \overline{\int_c^b} f(x)\,dg(x).
\end{align*}

Lemma 6.5.4 allows us to prove that existence of the Riemann - Stieljes integral on [a, b] implies it also exists on subintervals of [a, b], and that the Riemann - Stieljes value is additive. The proofs are obvious modifications of the proofs of Theorem 4.5.2 and Theorem 4.5.3, respectively.

Theorem 6.5.5 (The Riemann Stieljes Integral Exists On Subintervals). Let g be bounded and monotone increasing on [a, b]. If f ∈ RS[g, a, b] and c ∈ (a, b), then f ∈ RS[g, a, c] and f ∈ RS[g, c, b].

Theorem 6.5.6 (The Riemann Stieljes Integral Is Additive On Subintervals). Let g be bounded and monotone increasing on [a, b]. If f ∈ RS[g, a, b] and c ∈ (a, b), then

\[ \int_a^b f(x)\,dg(x) = \int_a^c f(x)\,dg(x) + \int_c^b f(x)\,dg(x). \]

6.6 Bounded Variation Integrators

We now turn our attention to integrators which are of bounded variation. By Theorem 3.4.3, we know that

if g ∈ BV [a, b], then we can write g = u− v where u and v are monotone increasing on [a, b]. Note if h

is any other monotone increasing function on [a, b], we could also use the decomposition

g = (u+ h)− (v + h)

as well, so this representation is certainly not unique. We must be very careful when we extend the Riemann - Stieljes integral to bounded variation integrators. For example, even if f ∈ RS[g, a, b], it does not always follow that f ∈ RS[u, a, b] and/or f ∈ RS[v, a, b]! However, we can prove that this statement is true if we use a particular decomposition of g. Let $u(x) = V_g(x)$ and $v(x) = V_g(x) - g(x)$ be our decomposition of g. Then, we will be able to show f ∈ RS[g, a, b] implies $f \in RS[V_g, a, b]$ and $f \in RS[V_g - g, a, b]$.
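To make this particular decomposition concrete, here is a minimal Python sketch (not part of the original text) that approximates the variation function $V_g$ on a uniform grid and checks that $u = V_g$ and $v = V_g - g$ are both nondecreasing on that grid. The integrator g(x) = sin(x) on [0, 2π], the grid size and the helper name `variation_decomposition` are illustrative assumptions, and the grid sums only approximate $V_g$ from below.

```python
import math

def variation_decomposition(g, a, b, n):
    """Approximate u = V_g and v = V_g - g on a uniform grid of n subintervals.

    The running sum of |g(x_j) - g(x_{j-1})| is a lower approximation of the
    variation function V_g(x_k); it is exact only in the limit of fine grids.
    """
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    u = [0.0]  # V_g(a) = 0
    for k in range(1, n + 1):
        u.append(u[-1] + abs(g(xs[k]) - g(xs[k - 1])))
    v = [u[k] - g(xs[k]) for k in range(n + 1)]
    return xs, u, v

# Illustrative choice (an assumption): g(x) = sin(x) on [0, 2*pi].
xs, u, v = variation_decomposition(math.sin, 0.0, 2.0 * math.pi, 1000)

total_variation = u[-1]  # approximates V_g(b); the true value for sin is 4
u_increasing = all(u[k] <= u[k + 1] for k in range(len(u) - 1))
# Small tolerance absorbs floating point rounding in v = u - g.
v_increasing = all(v[k] <= v[k + 1] + 1e-12 for k in range(len(v) - 1))
print(total_variation, u_increasing, v_increasing)
```

Any other decomposition g = (u + h) − (v + h) with h monotone increasing would pass the same monotonicity checks, which is the non-uniqueness noted above.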

Theorem 6.6.1 (f Riemann Stieljes Integrable With Respect To g Of Bounded Variation Implies Integrable With Respect To $V_g$ and $V_g - g$). Let g ∈ BV[a, b] and f ∈ RS[g, a, b]. Then $f \in RS[V_g, a, b]$ and $f \in RS[V_g - g, a, b]$.

Proof 6.6.1. For convenience of notation, let $u = V_g$ and $v = V_g - g$. First, we show that f ∈ RS[u, a, b] by showing the Riemann - Stieljes Criterion holds for f with respect to u on [a, b]. Fix a positive ε. Then there is a partition π0 so that

\[ \left| S(f, g, \pi, \sigma) - \int_a^b f(x)\,dg(x) \right| < \epsilon \]

for all refinements π of π0 and evaluation sets σ of π. Thus, given two such evaluation sets σ1 and σ2 of a refinement π, we have

\begin{align*}
| S(f, g, \pi, \sigma_1) - S(f, g, \pi, \sigma_2) | &\le \left| S(f, g, \pi, \sigma_1) - \int_a^b f(x)\,dg(x) \right| + \left| S(f, g, \pi, \sigma_2) - \int_a^b f(x)\,dg(x) \right| \\
&< 2\epsilon.
\end{align*}

Hence, we know for $\sigma_1 = \{s_1, \dots, s_p\}$ and $\sigma_2 = \{s_1', \dots, s_p'\}$, that

\[ | S(f, g, \pi, \sigma_1) - S(f, g, \pi, \sigma_2) | < 2\epsilon. \qquad (\alpha) \]

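Now, $u(b) = V_g(b) = \sup_\pi \sum_\pi |\Delta g_j|$. Thus, by the Supremum Tolerance Lemma, there is a partition π1 so that

\[ u(b) - \epsilon < \sum_{\pi_1} |\Delta g_j| \le u(b). \]

Then if π refines π1, we have

\[ u(b) - \epsilon < \sum_{\pi_1} |\Delta g_j| \le \sum_{\pi} |\Delta g_j| \le u(b), \]

and so for all π with $\pi_1 \preceq \pi$,

\[ u(b) - \epsilon < \sum_{\pi} |\Delta g_j| \le u(b). \qquad (\beta) \]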

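Now let $\pi_2 = \pi_0 \vee \pi_1$ and choose any partition π that refines π2. Since $\Delta u_j \ge |\Delta g_j|$ on each subinterval, each term $\Delta u_j - |\Delta g_j|$ is nonnegative and so

\begin{align*}
\sum_\pi (M_j - m_j)\,(\Delta u_j - |\Delta g_j|) &\le \sum_\pi (|M_j| + |m_j|)\,(\Delta u_j - |\Delta g_j|) \\
&\le 2M \sum_\pi (\Delta u_j - |\Delta g_j|)
\end{align*}

where $M = \| f \|_\infty$. But the term $\sum_\pi \Delta u_j$ is a collapsing sum which becomes u(b) − u(a) = u(b) as u(a) = 0. We conclude

\[ \sum_\pi (M_j - m_j)\,(\Delta u_j - |\Delta g_j|) \le 2M \left( u(b) - \sum_\pi |\Delta g_j| \right). \]

Now by Equation β, for all refinements π of π2, we have

\[ u(b) - \sum_\pi |\Delta g_j| < \epsilon. \]

Hence,

\[ \sum_\pi (M_j - m_j)\,(\Delta u_j - |\Delta g_j|) \le 2M\epsilon. \qquad (\gamma) \]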

Next, for any refinement π of π2, let the partition points be $\{x_0, \dots, x_n\}$ as usual and define

\[ J^+(\pi) = \{ j \in \pi \mid \Delta g_j \ge 0 \}, \qquad J^-(\pi) = \{ j \in \pi \mid \Delta g_j < 0 \}. \]

By the Infimum and Supremum Tolerance Lemma, if $j \in J^+(\pi)$, there are points $s_j$ and $s_j'$ in $[x_{j-1}, x_j]$ so that

\[ m_j \le f(s_j') < m_j + \epsilon/2, \qquad M_j - \epsilon/2 < f(s_j) \le M_j. \]

It follows

\[ f(s_j) - f(s_j') > M_j - m_j - \epsilon, \quad j \in J^+(\pi). \qquad (\xi) \]

On the other hand, if $j \in J^-(\pi)$, we can find $s_j$ and $s_j'$ in $[x_{j-1}, x_j]$ so that

\[ m_j \le f(s_j) < m_j + \epsilon/2, \qquad M_j - \epsilon/2 < f(s_j') \le M_j. \]

This leads to

\[ f(s_j') - f(s_j) > M_j - m_j - \epsilon, \quad j \in J^-(\pi). \qquad (\zeta) \]

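Thus,

\begin{align*}
\sum_\pi (M_j - m_j)\,|\Delta g_j| &= \sum_{j \in J^+(\pi)} (M_j - m_j)\,\Delta g_j + \sum_{j \in J^-(\pi)} (M_j - m_j)\,(-\Delta g_j) \\
&\le \sum_{j \in J^+(\pi)} \big(f(s_j) - f(s_j')\big)\,\Delta g_j + \epsilon \sum_{j \in J^+(\pi)} \Delta g_j \\
&\quad + \sum_{j \in J^-(\pi)} \big(f(s_j') - f(s_j)\big)\,(-\Delta g_j) + \epsilon \sum_{j \in J^-(\pi)} (-\Delta g_j) \\
&= \sum_{j \in \pi} \big(f(s_j) - f(s_j')\big)\,\Delta g_j + \epsilon \sum_{j \in \pi} |\Delta g_j|.
\end{align*}

Also, by the definition of the variation function of g, we have $\sum_{j \in \pi} |\Delta g_j| \le u(b) = V_g(b)$. Since the points $\{s_1, \dots, s_n\}$ and $\{s_1', \dots, s_n'\}$ are evaluation sets σ1 and σ2 of π, we can apply Equation α to conclude

\[ \sum_{j \in \pi} \big(f(s_j) - f(s_j')\big)\,\Delta g_j \le | S(f, g, \pi, \sigma_1) - S(f, g, \pi, \sigma_2) | < 2\epsilon. \]

Hence,

\[ \sum_\pi (M_j - m_j)\,|\Delta g_j| < 2\epsilon + \epsilon\,u(b) = (2 + u(b))\,\epsilon. \qquad (\theta) \]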


Then, using Equation γ and Equation θ, we find

\begin{align*}
\sum_\pi (M_j - m_j)\,\Delta u_j &= \sum_\pi (M_j - m_j)\,(\Delta u_j - |\Delta g_j|) + \sum_\pi (M_j - m_j)\,|\Delta g_j| \\
&< 2M\epsilon + (2 + u(b))\,\epsilon = (2M + 2 + u(b))\,\epsilon.
\end{align*}

Letting A = 2M + 2 + u(b), and recalling that $u = V_g$, we have

\[ U(f, V_g, \pi) - L(f, V_g, \pi) < A\epsilon \]

for any refinement π of π2. Hence, f satisfies the Riemann - Stieljes Criterion with respect to $V_g$ on [a, b]. We conclude $f \in RS[V_g, a, b]$.

Thus, f ∈ RS[g, a, b] and f ∈ RS[Vg, a, b] and by Theorem 6.1.1, we have f ∈ RS[Vg − g, a, b] also.

Theorem 6.6.2 (Products And Reciprocals Of Functions Riemann Stieljes Integrable With Respect To g Of Bounded Variation Are Also Integrable). Let g ∈ BV[a, b] and f, f1, f2 ∈ RS[g, a, b]. Then

(i) $f^2 \in RS[g, a, b]$

(ii) $f_1 f_2 \in RS[g, a, b]$

(iii) If there is a positive constant m, so that |f(x)| > m for all x in [a, b], then 1/f ∈ RS[g, a, b].

Proof 6.6.2. (i) Since f ∈ RS[g, a, b], $f \in RS[V_g, a, b]$ and $f \in RS[V_g - g, a, b]$ by Theorem 6.6.1. Hence, by Theorem 6.5.1, $f^2 \in RS[V_g, a, b]$ and $f^2 \in RS[V_g - g, a, b]$. Then, by the linearity of the Riemann Stieljes integral with respect to monotone integrators, Theorem 6.1.1, we have $f^2 \in RS[V_g - (V_g - g), a, b] = RS[g, a, b]$.

(ii) $f_1, f_2 \in RS[g, a, b]$ implies $f_1, f_2 \in RS[V_g, a, b]$ and $f_1, f_2 \in RS[V_g - g, a, b]$. Thus, using reasoning just like that in Part (i), we have $f_1 f_2 \in RS[g, a, b]$.

(iii) By our assumptions, we know $1/f \in RS[V_g, a, b]$ and $1/f \in RS[V_g - g, a, b]$. Thus, by the linearity of the Riemann Stieljes integral with respect to monotone integrators, 1/f ∈ RS[g, a, b].


Theorem 6.6.3 (The Riemann Stieljes Integral Is Additive On Subintervals). Let g ∈ BV[a, b] and f ∈ RS[g, a, b]. Then, if a ≤ c ≤ b,

\[ \int_a^b f(x)\,dg(x) = \int_a^c f(x)\,dg(x) + \int_c^b f(x)\,dg(x). \]

Proof 6.6.3. From Theorem 6.5.6, we know

\begin{align*}
\int_a^b f(x)\,dV_g(x) &= \int_a^c f(x)\,dV_g(x) + \int_c^b f(x)\,dV_g(x), \\
\int_a^b f(x)\,d(V_g - g)(x) &= \int_a^c f(x)\,d(V_g - g)(x) + \int_c^b f(x)\,d(V_g - g)(x).
\end{align*}

Also, we know

\[ \int_a^b f(x)\,dg(x) = \int_a^b f(x)\,dV_g(x) - \int_a^b f(x)\,d(V_g - g)(x), \]

with the same identity holding on [a, c] and on [c, b], and so the result follows.

Chapter 7

Further Riemann - Stieljes Results

We know quite a bit about the Riemann Stieljes integral in theory. However, we do not yet know how to compute a Riemann Stieljes integral, and we only know that Riemann Stieljes integrals exist for a few types of integrators: those that are bounded with a finite number of jumps and the identity integrator g(x) = x. It is time to learn more.

7.1 The Riemann - Stieljes Fundamental Theorem Of Calculus

As you might expect, we can prove a Riemann - Stieljes variant of the Fundamental Theorem Of Calculus.

Theorem 7.1.1 (Riemann Stieljes Fundamental Theorem Of Calculus). Let g ∈ BV[a, b] and f ∈ RS[g, a, b]. Define $F : [a, b] \to \Re$ by

\[ F(x) = \int_a^x f(t)\,dg(t). \]

Then

(i) F ∈ BV[a, b],

(ii) If g is continuous at c in [a, b], then F is continuous at c.

(iii) If g is monotone, c is in [a, b], g′(c) exists and f is continuous at c, then F′(c) exists with

\[ F'(c) = f(c)\,g'(c). \]

Proof 7.1.1. First, assume g is monotone increasing and g(a) < g(b). Let π be a partition of [a, b]. Then, we immediately have the fundamental estimates

\[ m\,(g(b) - g(a)) \le L(f, g) \le U(f, g) \le M\,(g(b) - g(a)), \]

where m and M are the infimum and supremum of f on [a, b] respectively. Since f ∈ RS[g, a, b], we then have

\[ m\,(g(b) - g(a)) \le \int_a^b f\,dg \le M\,(g(b) - g(a)) \]

or

\[ m \le \frac{\int_a^b f\,dg}{g(b) - g(a)} \le M. \]

Let $K(a, b) = \int_a^b f\,dg / (g(b) - g(a))$. Then, m ≤ K(a, b) ≤ M and $\int_a^b f\,dg = K(a, b)\,(g(b) - g(a))$.

Now assume x < y in [a, b]. Since f ∈ RS[g, a, b], by Theorem 6.5.5, f ∈ RS[g, x, y]. By the argument just presented, we can show there is a number K(x, y) so that

\begin{align*}
K(x, y) &= \int_x^y f\,dg \,/\, (g(y) - g(x)), \\
m \le \inf_{t \in [x, y]} f(t) &\le K(x, y) \le \sup_{t \in [x, y]} f(t) \le M, \qquad (\alpha) \\
\int_x^y f\,dg &= K(x, y)\,(g(y) - g(x)).
\end{align*}

(i) We show F ∈ BV[a, b]. Let π be a partition of [a, b]. Then, labeling the partition points in the usual way,

\begin{align*}
\sum_\pi |\Delta F_j| &= \sum_\pi | F(x_j) - F(x_{j-1}) | \\
&= \sum_\pi \left| \int_{x_{j-1}}^{x_j} f\,dg \right| \\
&= \sum_\pi | K(x_{j-1}, x_j) |\, | g(x_j) - g(x_{j-1}) | = \sum_\pi | K(x_{j-1}, x_j) |\, |\Delta g_j|
\end{align*}

using Equation α on each subinterval $[x_{j-1}, x_j]$. However, we know each $m \le K(x_{j-1}, x_j) \le M$ and so

\begin{align*}
\sum_\pi |\Delta F_j| &\le \| f \|_\infty \sum_\pi |\Delta g_j| \\
&= \| f \|_\infty\,(g(b) - g(a)),
\end{align*}

as g is monotone increasing. Since this inequality holds for all partitions of [a, b], we see

\[ V(F; a, b) \le \| f \|_\infty\,(g(b) - g(a)) \]

implying F ∈ BV[a, b].

(ii) Let g be continuous at c. Then given a positive ε, there is a δ > 0, so that

\[ | g(c) - g(y) | < \epsilon / (1 + \| f \|_\infty), \quad | y - c | < \delta, \; y \in [a, b]. \]

For any such y, apply Equation α to the interval [c, y] or [y, c] depending on whether y > c or vice versa. For concreteness, let’s look at the case y > c. Then, there is a K(c, y) so that m ≤ K(c, y) ≤ M and $\int_c^y f(t)\,dg(t) = K(c, y)\,(g(y) - g(c))$. Thus, since y is within δ of c, we have

\[ \left| \int_c^y f(t)\,dg(t) \right| = | K(c, y) |\, | g(y) - g(c) | \le \| f \|_\infty\, \epsilon / (1 + \| f \|_\infty) < \epsilon. \]

We conclude that if y ∈ [c, c + δ), then $\left| \int_c^y f(t)\,dg(t) \right| < \epsilon$. A similar argument holds for y ∈ (c − δ, c]. Combining, we see y ∈ (c − δ, c + δ) and in [a, b] implies

\[ | F(y) - F(c) | = \left| \int_c^y f(t)\,dg(t) \right| < \epsilon. \]

So F is continuous at c.

(iii) If c ∈ [a, b], g′(c) exists and f is continuous at c, we must show that F′(c) = f(c)g′(c). Let a positive ε be given. Then,

\[ \exists\, \delta_1 \ni \left| \frac{g(y) - g(c)}{y - c} - g'(c) \right| < \epsilon, \quad 0 < | y - c | < \delta_1, \; y \in [a, b], \qquad (\beta) \]

and

\[ \exists\, \delta_2 \ni | f(y) - f(c) | < \epsilon, \quad | y - c | < \delta_2, \; y \in [a, b]. \qquad (\gamma) \]

Choose any δ < min(δ1, δ2). Let y be in $\big( (c - \delta, c) \cup (c, c + \delta) \big) \cap [a, b]$. We are interested in the interval I with endpoints c and y, which is either of the form [c, y] or [y, c]. Apply Equation α to this interval. We find there is a K(I) that satisfies

\[ \inf_{t \in I} f(t) \le K(I) \le \sup_{t \in I} f(t) \]

and

\[ \int_c^y f(t)\,dg(t) = K([c, y])\,(g(y) - g(c)), \quad y > c \]

or

\[ \int_y^c f(t)\,dg(t) = K([y, c])\,(g(c) - g(y)), \quad y < c \]

or

\[ -\int_c^y f(t)\,dg(t) = K([y, c])\,(g(c) - g(y)), \quad y < c \]

which gives

\[ \int_c^y f(t)\,dg(t) = K([y, c])\,(g(y) - g(c)), \quad y < c. \]

So we conclude we can write

\[ \int_c^y f(t)\,dg(t) = K(I)\,(g(y) - g(c)), \]

where K(I) denotes K([c, y]) or K([y, c]) depending on where y is relative to c. Next, since δ < min(δ1, δ2), both Equation β and Equation γ hold. Thus,

\[ f(c) - \epsilon < f(t) < f(c) + \epsilon, \quad t \in I. \]

This tells us that $\sup_{t \in I} f(t) \le f(c) + \epsilon$ and $\inf_{t \in I} f(t) \ge f(c) - \epsilon$. Thus,

\[ f(c) - \epsilon \le K(I) \le f(c) + \epsilon \]

or $| K(I) - f(c) | \le \epsilon$. Finally, consider

\begin{align*}
\left| \frac{F(y) - F(c)}{y - c} - f(c)\,g'(c) \right| &= \left| K(I)\,\frac{g(y) - g(c)}{y - c} - f(c)\,g'(c) \right| \\
&= \left| K(I)\,\frac{g(y) - g(c)}{y - c} - K(I)\,g'(c) + K(I)\,g'(c) - f(c)\,g'(c) \right| \\
&\le | K(I) | \left| \frac{g(y) - g(c)}{y - c} - g'(c) \right| + | K(I) - f(c) |\, | g'(c) | \\
&\le \| f \|_\infty\, \epsilon + | g'(c) |\, \epsilon.
\end{align*}

Since ε is arbitrary, this shows F is differentiable at c with value f(c)g′(c).

This proves the proposition for the case that g is monotone. To finish the proof, we note if g ∈ BV[a, b], then $g = V_g - (V_g - g)$ is the standard decomposition of g into the difference of two monotone increasing functions. Let $F_1(x) = \int_a^x f(t)\,dV_g(t)$ and $F_2(x) = \int_a^x f(t)\,d(V_g - g)(t)$. From Part (i), we see $F = F_1 - F_2$ is of bounded variation. Next, if g is continuous at c, so are $V_g$ and $V_g - g$ by Theorem 3.5.3. So by Part (ii), $F_1$ and $F_2$ are continuous at c. This implies F is continuous at c.
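Part (iii) of the theorem can be checked numerically. The sketch below is an illustration, not part of the original proof: it approximates F(c + h) − F(c) by a Riemann - Stieljes sum on [c, c + h] and compares the difference quotient with f(c)g′(c). The choices f(x) = cos(x), g(x) = x³, c = 1, the step h and the helper name `rs_sum` are all assumptions made for the example.

```python
import math

def rs_sum(f, g, a, b, n):
    """Riemann-Stieljes sum of f with respect to g on a uniform partition of
    [a, b] with n subintervals, evaluated at the right endpoints."""
    total = 0.0
    x_prev = a
    for j in range(1, n + 1):
        x_j = a + (b - a) * j / n
        total += f(x_j) * (g(x_j) - g(x_prev))
        x_prev = x_j
    return total

f = math.cos              # continuous integrand (assumed example)
g = lambda x: x ** 3      # monotone increasing, differentiable integrator
c, step = 1.0, 1e-3

# F(c + step) - F(c) is the integral of f dg over [c, c + step].
increment = rs_sum(f, g, c, c + step, 2000)
difference_quotient = increment / step
predicted = f(c) * 3.0 * c ** 2   # f(c) g'(c) with g'(x) = 3 x^2
print(difference_quotient, predicted)
```

The two printed numbers agree only up to an error of the order of the step size, which is what the ε argument in the proof predicts.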


7.2 Existence Results

We begin by looking at continuous integrands.

Theorem 7.2.1 (Integrand Continuous and Integrator Of Bounded Variation Implies Riemann - Stieljes Integral Exists). If f ∈ C[a, b] and g ∈ BV[a, b], then f ∈ RS[g, a, b].

Proof 7.2.1. Let’s begin by assuming g is monotone increasing. We may assume without loss of generality that g(a) < g(b). Let K = g(b) − g(a) > 0. Since f is continuous on [a, b], f is uniformly continuous on [a, b]. Hence, given a positive ε, there is a positive δ so that

\[ | f(s) - f(t) | < \epsilon / K, \quad | t - s | < \delta, \; t, s \in [a, b]. \]

Now, repeat the proof of Theorem 4.4.1 which shows that if f is continuous on [a, b], then f ∈ RI[a, b], but replace all the ∆xj by ∆gj. This shows that f satisfies the Riemann - Stieljes Criterion for integrability. Thus, by the equivalence theorem, f ∈ RS[g, a, b].

Next, let g ∈ BV[a, b]. Then $g = V_g - (V_g - g)$ as usual. Since $V_g$ and $V_g - g$ are monotone increasing, we can apply our first argument to conclude $f \in RS[V_g, a, b]$ and $f \in RS[V_g - g, a, b]$. Then, by the linearity of the Riemann - Stieljes integral with respect to the integrator, Theorem 6.1.1, we have f ∈ RS[g, a, b] with

\[ \int_a^b f\,dg = \int_a^b f\,dV_g - \int_a^b f\,d(V_g - g). \]

Next, we let the integrand be of bounded variation.

Theorem 7.2.2 (Integrand Bounded Variation and Integrator Continuous Implies Riemann - Stieljes Integral Exists). If f ∈ BV[a, b] and g ∈ C[a, b], then f ∈ RS[g, a, b].

Proof 7.2.2. If f ∈ BV[a, b] and g ∈ C[a, b], then by the previous theorem, Theorem 7.2.1, g ∈ RS[f, a, b]. Now apply integration by parts, Theorem 6.1.2, to conclude f ∈ RS[g, a, b].

What if the integrator is differentiable?

Theorem 7.2.3 (Integrand Continuous and Integrator Continuously Differentiable Implies Riemann - Stieljes Integrable). Let f ∈ C[a, b] and $g \in C^1[a, b]$. Then f ∈ RS[g, a, b], fg′ ∈ RI[a, b] and

\[ \int_a^b f(x)\,dg(x) = \int_a^b f(x)\,g'(x)\,dx \]

where the integral on the right side is a traditional Riemann integral.

Proof 7.2.3. Pick an arbitrary positive ε. Since g′ is continuous on [a, b], g′ is uniformly continuous on [a, b]. Thus, there is a positive δ so that

\[ | g'(s) - g'(t) | < \epsilon, \quad | s - t | < \delta, \; s, t \in [a, b]. \qquad (\alpha) \]

Since g′ is continuous on [a, b], there is a number M so that | g′(x) | ≤ M for all x in [a, b]. We conclude that g ∈ BV[a, b] by Theorem 3.3.3. Now apply Theorem 7.2.1 to conclude f ∈ RS[g, a, b]. Thus, there is a partition π0 of [a, b], so that

\[ \left| S(f, g, \pi, \sigma) - \int_a^b f\,dg \right| < \epsilon, \quad \pi_0 \preceq \pi, \; \sigma \subseteq \pi. \qquad (\beta) \]

Further, since fg′ is continuous on [a, b], fg′ ∈ RI[a, b] and so $\int_a^b f g'$ exists also.

Now let π1 be a refinement of π0 with $\| \pi_1 \| < \delta$. Then we can apply Equation β to conclude

\[ \left| S(f, g, \pi, \sigma) - \int_a^b f\,dg \right| < \epsilon, \quad \pi_1 \preceq \pi, \; \sigma \subseteq \pi. \qquad (\gamma) \]

Next, apply the Mean Value Theorem to g on the subintervals $[x_{j-1}, x_j]$ from a partition π for which Equation γ holds. Then, $\Delta g_j = g'(t_j)\,(x_j - x_{j-1})$ for some $t_j$ in $(x_{j-1}, x_j)$. Hence,

\[ S(f, g, \pi, \sigma) = \sum_\pi f(s_j)\,\Delta g_j = \sum_\pi f(s_j)\,g'(t_j)\,\Delta x_j. \]

Also, we see

\[ S(fg', \pi, \sigma) = \sum_\pi f(s_j)\,g'(s_j)\,\Delta x_j. \]

Thus, we can compute

\begin{align*}
| S(f, g, \pi, \sigma) - S(fg', \pi, \sigma) | &= \left| \sum_\pi f(s_j)\,\big(g'(t_j) - g'(s_j)\big)\,\Delta x_j \right| \\
&\le \| f \|_\infty \sum_\pi | g'(t_j) - g'(s_j) |\,\Delta x_j.
\end{align*}

By Equation α, since $\| \pi \| < \delta$, $| t_j - s_j | < \delta$ and so $| g'(t_j) - g'(s_j) | < \epsilon$. We conclude

\[ | S(f, g, \pi, \sigma) - S(fg', \pi, \sigma) | \le \epsilon\, \| f \|_\infty \sum_\pi \Delta x_j = \epsilon\, \| f \|_\infty\,(b - a). \qquad (\xi) \]

Thus,

\begin{align*}
\left| S(fg', \pi, \sigma) - \int_a^b f\,dg \right| &\le | S(f, g, \pi, \sigma) - S(fg', \pi, \sigma) | + \left| S(f, g, \pi, \sigma) - \int_a^b f\,dg \right| \\
&< \epsilon\, \| f \|_\infty\,(b - a) + \epsilon
\end{align*}

by Equation γ and Equation ξ. This proves the desired result.
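The identity just proved is easy to test numerically. The following Python sketch is an illustration under stated assumptions, not part of the original proof: it compares a Riemann - Stieljes sum for ∫ f dg with an ordinary Riemann sum for ∫ f g′ dx, using the assumed example f(x) = cos(x) and g(x) = x² on [0, 1]. The helper names `rs_sum` and `riemann_sum` are hypothetical.

```python
import math

def rs_sum(f, g, a, b, n):
    """Right-endpoint Riemann-Stieljes sum of f with respect to g on a
    uniform partition of [a, b] with n subintervals."""
    total = 0.0
    x_prev = a
    for j in range(1, n + 1):
        x_j = a + (b - a) * j / n
        total += f(x_j) * (g(x_j) - g(x_prev))
        x_prev = x_j
    return total

def riemann_sum(h, a, b, n):
    """Right-endpoint Riemann sum of h on a uniform partition of [a, b]."""
    width = (b - a) / n
    return sum(h(a + width * j) for j in range(1, n + 1)) * width

f = math.cos                  # continuous integrand (assumed example)
g = lambda x: x ** 2          # continuously differentiable integrator
g_prime = lambda x: 2.0 * x

n = 20000
stieltjes = rs_sum(f, g, 0.0, 1.0, n)
riemann = riemann_sum(lambda x: f(x) * g_prime(x), 0.0, 1.0, n)
# Closed form of the integral of 2x cos(x) on [0, 1], by parts.
exact = 2.0 * (math.sin(1.0) + math.cos(1.0) - 1.0)
print(stieltjes, riemann, exact)
```

Both sums approach the same limit as the partition is refined, which mirrors the estimate labelled (ξ) in the proof.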


It should be easy to see that the assumptions of Theorem 7.2.3 can be relaxed. Consider

Theorem 7.2.4 (Integrand Riemann Integrable and Integrator Continuously Differentiable Implies Riemann - Stieljes Integrable). Let f ∈ RI[a, b] and $g \in C^1[a, b]$. Then f ∈ RS[g, a, b], fg′ ∈ RI[a, b] and

\[ \int_a^b f(x)\,dg(x) = \int_a^b f(x)\,g'(x)\,dx \]

where the integral on the right side is a traditional Riemann integral.

Proof 7.2.4. We never use the continuity of f in the proof given for Theorem 7.2.3. All we use is the fact that f is Riemann integrable. Hence, we can use the proof of Theorem 7.2.3 without change to find

\[ \left| S(fg', \pi, \sigma) - \int_a^b f\,dg \right| < \epsilon\, \| f \|_\infty\,(b - a) + \epsilon. \]

This tells us that fg′ is Riemann integrable on [a, b] with value $\int_a^b f\,dg$.

7.3 Worked Out Examples Of Riemann Stieljes Computations

How do we compute a Riemann Stieljes integral? Let’s look at some examples.

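Example 7.3.1. Let f and g be defined on [0, 2] by

\[ f(x) = \begin{cases} x, & x \in \mathbb{Q} \cap [0, 2] \\ 2 - x, & x \in \mathrm{Ir} \cap [0, 2], \end{cases} \qquad g(x) = \begin{cases} 1, & 0 \le x < 1 \\ 3, & 1 \le x \le 2. \end{cases} \]

Does $\int f\,dg$ exist?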

Solution 7.3.1. We can answer this two ways so far. Method 1: We note f is continuous at 1 (you should be able to do a traditional ε − δ proof of this fact!) and since g has a jump at 1, we can look at Lemma 6.2.1 - Lemma 6.2.3 to see that f is indeed Riemann - Stieljes integrable with respect to g. The value is given by

\[ \int_0^2 f\,dg = f(1)\,(g(1^+) - g(1^-)) = 1\,(3 - 1) = 2. \]

Method 2: We can compute the integral using a partition approach. Let π be a partition of [0, 2]. We may assume without loss of generality that 1 ∈ π (recall all of our earlier arguments that allow us to make this statement!). Hence, there is an index $k_0$ such that $x_{k_0} = 1$. Since g is constant on [0, 1) and on [1, 2], only the two subintervals that touch 1 can contribute, and we have

\[ L(f, g, \pi) = \Big( \inf_{x \in [x_{k_0 - 1}, 1]} f(x) \Big) \big( g(1) - g(x_{k_0 - 1}) \big) + \Big( \inf_{x \in [1, x_{k_0 + 1}]} f(x) \Big) \big( g(x_{k_0 + 1}) - g(1) \big). \]

Now use how g is defined to see

\[ L(f, g, \pi) = \Big( \inf_{x \in [x_{k_0 - 1}, 1]} f(x) \Big) (3 - 1) + \Big( \inf_{x \in [1, x_{k_0 + 1}]} f(x) \Big) (3 - 3). \]

Hence,

\[ L(f, g, \pi) = 2 \inf_{x \in [x_{k_0 - 1}, 1]} f(x). \]

If you graphed x and 2 − x simultaneously on [0, 2], you would see that they cross at 1 and x is below 2 − x before 1. This graph works well for f even though we can only use the graph of x when x is rational and the graph of 2 − x when x is irrational. We can see in our mind how to do the visualization. From this mental picture, you should be able to see that the infimum of f on $[x_{k_0 - 1}, 1]$ will be the value $x_{k_0 - 1}$. We have thus found that $L(f, g, \pi) = 2 x_{k_0 - 1}$. A similar argument will show that $U(f, g, \pi) = 2 (2 - x_{k_0 - 1})$. Since $x_{k_0 - 1}$ can be taken as close to 1 as we wish, this immediately implies that L(f, g) = U(f, g) = 2.

Example 7.3.2. Let f be any bounded function which is discontinuous from the left at 1 on [0, 2]. Again, let g be defined on [0, 2] by

\[ g(x) = \begin{cases} 1, & 0 \le x < 1 \\ 3, & 1 \le x \le 2. \end{cases} \]

Does $\int f\,dg$ exist?

Solution 7.3.2. First, since we know f is not continuous from the left at 1 and g is continuous from the right at 1, the conditions of Lemma 6.2.2 do not hold. So it is possible this integral does not exist. We will in fact show this using arguments that are similar to the previous example. Again, π is a partition which has $x_{k_0} = 1$. We find

\[ L(f, g, \pi) = \Big( \inf_{x \in [x_{k_0 - 1}, 1]} f(x) \Big) (3 - 1), \qquad U(f, g, \pi) = \Big( \sup_{x \in [x_{k_0 - 1}, 1]} f(x) \Big) (3 - 1). \]

Since we can choose $x_{k_0 - 1}$ as close to 1 as we wish, we see

\begin{align*}
\inf_{x \in [x_{k_0 - 1}, 1]} f(x) &\to \min(f(1^-), f(1)), \\
\sup_{x \in [x_{k_0 - 1}, 1]} f(x) &\to \max(f(1^-), f(1)).
\end{align*}

But f is discontinuous from the left at 1 and so $f(1^-) \ne f(1)$. For concreteness, let’s assume $f(1^-) < f(1)$ (the argument the other way is very similar). We see $L(f, g) = 2 f(1^-)$ and $U(f, g) = 2 f(1)$. Since these values are not the same, f is not Riemann Stieljes integrable with respect to g by the Riemann - Stieljes equivalence theorem, Theorem 6.4.1.


Example 7.3.3. Let f be any bounded function which is continuous from the left at 1 on [0, 2]. Again, let g be defined on [0, 2] by

\[ g(x) = \begin{cases} 1, & 0 \le x < 1 \\ 3, & 1 \le x \le 2. \end{cases} \]

Does $\int f\,dg$ exist?

Solution 7.3.3. First, since we know f is continuous from the left at 1 and g is continuous from the right at 1, the conditions of Lemma 6.2.2 do hold. So this integral does exist. Using Lemma 6.2.2, we see

\[ \int_0^2 f\,dg = f(1)\,(g(1^+) - g(1^-)) = 2 f(1). \]

We can also show this using partition arguments as we have done before. Again, π is a partition which has $x_{k_0} = 1$. Again, we have

\[ L(f, g, \pi) = \Big( \inf_{x \in [x_{k_0 - 1}, 1]} f(x) \Big) (3 - 1), \qquad U(f, g, \pi) = \Big( \sup_{x \in [x_{k_0 - 1}, 1]} f(x) \Big) (3 - 1). \]

Since we can choose $x_{k_0 - 1}$ as close to 1 as we wish, we see

\begin{align*}
\inf_{x \in [x_{k_0 - 1}, 1]} f(x) &\to \min(f(1^-), f(1)), \\
\sup_{x \in [x_{k_0 - 1}, 1]} f(x) &\to \max(f(1^-), f(1)).
\end{align*}

But f is continuous from the left at 1, so $f(1^-) = f(1)$. We see L(f, g) = 2f(1) and U(f, g) = 2f(1). Since these values are the same, f is Riemann Stieljes integrable with respect to g by the Riemann - Stieljes equivalence theorem, Theorem 6.4.1.
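The contrast between the last two examples can be seen directly in Riemann - Stieljes sums. In the sketch below (an illustration only), the two integrands `f_left_continuous` and `f_left_jump` are hypothetical choices, not taken from the text; the step integrator is the g used above. For the left continuous integrand, sums with left and right evaluation points both approach 2f(1); for the integrand with a left discontinuity at 1 they stay apart.

```python
def tagged_rs_sum(f, g, points, tags):
    """Sum of f(tags[j]) * (g(points[j+1]) - g(points[j])) over the partition."""
    return sum(f(tags[j]) * (g(points[j + 1]) - g(points[j]))
               for j in range(len(tags)))

def g(x):
    """Step integrator from the examples: 1 on [0, 1), 3 on [1, 2]."""
    return 1.0 if x < 1.0 else 3.0

def f_left_continuous(x):
    """Hypothetical integrand, continuous (hence left continuous) at 1."""
    return x * x + 1.0

def f_left_jump(x):
    """Hypothetical integrand that is discontinuous from the left at 1."""
    return 0.0 if x < 1.0 else 1.0

n = 2000  # even, so that 1 is a partition point
points = [2.0 * j / n for j in range(n + 1)]
left_tags, right_tags = points[:-1], points[1:]

cont_left = tagged_rs_sum(f_left_continuous, g, points, left_tags)
cont_right = tagged_rs_sum(f_left_continuous, g, points, right_tags)
jump_left = tagged_rs_sum(f_left_jump, g, points, left_tags)
jump_right = tagged_rs_sum(f_left_jump, g, points, right_tags)
print(cont_left, cont_right, jump_left, jump_right)
```

Only the subinterval ending at 1 contributes, because g is constant elsewhere, so each sum is just 2 times the integrand at the chosen evaluation point in that subinterval.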

Example 7.3.4. Define a step function g on [0, 12] by

\[ g(x) = \begin{cases} 0, & 0 \le x < 2 \\ \sum_{j=2}^{\lfloor x \rfloor} (j - 1)/36, & 2 \le x < 8 \\ 21/36 + \sum_{j=8}^{\lfloor x \rfloor} (13 - j)/36, & 8 \le x \le 12 \end{cases} \]

where $\lfloor x \rfloor$ is the greatest integer which is less than or equal to x. The function g is everywhere continuous from the right and g(x) represents the probability of rolling a number j ≤ x. It is called the cumulative probability distribution function of a fair pair of dice. The Riemann - Stieljes integral $\mu = \int_0^{12} x\,dg(x)$ is called the mean of this distribution. The variance of this distribution is denoted by $\sigma^2$ (an unfortunate choice, isn’t it, as σ is the letter we use to denote evaluation sets of partitions!) and defined to be

\[ \sigma^2 = \int_0^{12} (x - \mu)^2\,dg(x). \]

Compute µ and $\sigma^2$.

Solution 7.3.4. Since f(x) = x is continuous on [0, 12], Lemma 6.2.4 applies and we have

\[ \int_0^{12} x\,dg(x) = \sum_{j=2}^{12} j\,\big( g(j^+) - g(j^-) \big). \]

The evaluations are a bit messy.

36(g(2+)− g(2−)) = 36(g(2)− g(2−)) = 1− 0 = 1

36(g(3+)− g(3−)) = 36(g(3)− g(3−)) = 3− 1 = 2

36(g(4+)− g(4−)) = 36(g(4)− g(4−)) = 6− 3 = 3

36(g(5+)− g(5−)) = 36(g(5)− g(5−)) = 10− 6 = 4

36(g(6+)− g(6−)) = 36(g(6)− g(6−)) = 15− 10 = 5

36(g(7+)− g(7−)) = 36(g(7)− g(7−)) = 21− 15 = 6

36(g(8+)− g(8−)) = 36(g(8)− g(8−)) = 26− 21 = 5

36(g(9+)− g(9−)) = 36(g(9)− g(9−)) = 30− 26 = 4

36(g(10+)− g(10−)) = 36(g(10)− g(10−)) = 33− 30 = 3

36(g(11+)− g(11−)) = 36(g(11)− g(11−)) = 35− 33 = 2

36(g(12+)− g(12−)) = 36(g(12)− g(12−)) = 36− 35 = 1

Thus,

\begin{align*}
\int_0^{12} x\,dg(x) &= \big( 2(1) + 3(2) + 4(3) + 5(4) + 6(5) + 7(6) + 8(5) + 9(4) + 10(3) + 11(2) + 12(1) \big)/36 \\
&= \big( 2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12 \big)/36 \\
&= 252/36 = 7.
\end{align*}

So, the mean or expected value of a single roll of a fair pair of dice is 7. To find the variance, we calculate

\begin{align*}
\sigma^2 &= \int_0^{12} (x - 7)^2\,dg(x) \\
&= \sum_{j=2}^{12} (j - 7)^2\,\big( g(j^+) - g(j^-) \big) \\
&= \big( 25(1) + 16(2) + 9(3) + 4(4) + 1(5) + 0(6) + 1(5) + 4(4) + 9(3) + 16(2) + 25(1) \big)/36 \\
&= \big( 25 + 32 + 27 + 16 + 5 + 5 + 16 + 27 + 32 + 25 \big)/36 \\
&= 210/36 = 35/6.
\end{align*}
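The arithmetic above can be double checked with exact rational arithmetic. This short Python sketch is an illustration, not part of the original solution: it rebuilds g from its definition, reads off each jump as g(j) − g(j − 1/2) (valid because g is right continuous and constant between integers), and forms the two jump sums. The helper names `g` and `jump` are assumptions.

```python
from fractions import Fraction
import math

def g(x):
    """Cumulative distribution function of the sum of two fair dice,
    following the piecewise definition in Example 7.3.4."""
    top = min(math.floor(x), 12)
    total = Fraction(0)
    for j in range(2, top + 1):
        weight = j - 1 if j <= 7 else 13 - j
        total += Fraction(weight, 36)
    return total

def jump(j):
    """g(j+) - g(j-); g is right continuous and constant between integers,
    so g(j+) = g(j) and g(j-) = g(j - 1/2)."""
    return g(Fraction(j)) - g(Fraction(2 * j - 1, 2))

# Mean and variance as sums over the jump points j = 2, ..., 12.
mean = sum(j * jump(j) for j in range(2, 13))
variance = sum((j - mean) ** 2 * jump(j) for j in range(2, 13))
print(mean, variance)
```

Using `Fraction` avoids any floating point rounding, so the comparison with 252/36 and 210/36 is exact.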

Example 7.3.5. Let $f(x) = e^x$ and let g be defined on [0, 2] by

\[ g(x) = \begin{cases} x^2, & 0 \le x \le 1 \\ x^2 + 1, & 1 < x \le 2. \end{cases} \]

Show $\int f\,dg$ exists and evaluate it.

Solution 7.3.5. Since f is continuous and g is monotone, $\int_0^2 f\,dg$ exists. We can thus decompose g into its continuous and saltus part. We find

\[ g_c(x) = x^2, \qquad s_g(x) = \begin{cases} 0, & 0 \le x \le 1 \\ 1, & 1 < x \le 2. \end{cases} \]

The saltus integral is evaluated using Lemma 6.2.1. The integrand is continuous and the jump is at 1, so we have

\begin{align*}
\int_0^2 f\,ds_g &= \int_0^2 e^x\,ds_g(x) \\
&= e^1\,(s_g(1^+) - s_g(1^-)) = e\,(1 - 0) = e,
\end{align*}

and for the continuous part, we can use the fact the integrator is continuously differentiable on [0, 2] to apply Theorem 7.2.3 to obtain

\[ \int_0^2 f\,dg_c = \int_0^2 e^x\,d(x^2) = \int_0^2 e^x\,2x\,dx = 2(e^2 + 1). \]

Thus,

\begin{align*}
\int_0^2 f\,dg &= \int_0^2 f\,dg_c + \int_0^2 f\,ds_g \\
&= 2(e^2 + 1) + e.
\end{align*}

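We can also do this by integration by parts, Theorem 6.1.2. Since f ∈ RS[g, 0, 2], it follows that g ∈ RS[f, 0, 2] and

\begin{align*}
\int_0^2 f(x)\,dg(x) &= e^x g(x) \Big|_0^2 - \int_0^2 g(x)\,df(x) \\
&= e^2 g(2) - e^0 g(0) - \int_0^2 g(x)\,d(e^x) \\
&= 5e^2 - \int_0^2 g(x)\,e^x\,dx,
\end{align*}

since g(2) = 5 and g(0) = 0.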
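Both routes to the value can be compared numerically. The sketch below is only an illustration and the helper name `rs_sum` is an assumption: it forms a Riemann - Stieljes sum for ∫ f dg directly, evaluates the closed form 2(e² + 1) + e, and also evaluates the integration by parts expression 5e² − ∫ g(x)eˣ dx with a midpoint Riemann sum.

```python
import math

def rs_sum(f, g, a, b, n):
    """Right-endpoint Riemann-Stieljes sum of f with respect to g on a
    uniform partition of [a, b] with n subintervals."""
    total = 0.0
    x_prev = a
    for j in range(1, n + 1):
        x_j = a + (b - a) * j / n
        total += f(x_j) * (g(x_j) - g(x_prev))
        x_prev = x_j
    return total

def g(x):
    """Integrator of Example 7.3.5: x^2 on [0, 1], x^2 + 1 on (1, 2]."""
    return x * x if x <= 1.0 else x * x + 1.0

n = 20000  # even, so that the jump point 1 is a partition point
direct = rs_sum(math.exp, g, 0.0, 2.0, n)
closed_form = 2.0 * (math.e ** 2 + 1.0) + math.e

# Integration by parts route: 5 e^2 minus the Riemann integral of g(x) e^x,
# approximated here by a midpoint sum.
width = 2.0 / n
parts_integral = sum(g((j + 0.5) * width) * math.exp((j + 0.5) * width)
                     for j in range(n)) * width
by_parts = 5.0 * math.e ** 2 - parts_integral
print(direct, by_parts, closed_form)
```

The direct sum converges slowly (first order in the mesh size) because of the right-endpoint evaluation, so it is compared with a looser tolerance than the midpoint computation.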



Example 7.3.6. Let $f(x) = e^x$ and let g be defined on [0, 2] by

\[ g(x) = \begin{cases} x^2, & 0 \le x < 1 \\ \sin(x), & 1 \le x \le 2. \end{cases} \]


Solution 7.3.6. We know that g is of bounded variation on [1, 2] because it is continuously differentiable with bounded derivative there. But what about on [0, 1]? We know that the function $h(x) = x^2$ on [0, 1] is of bounded variation on [0, 1] because it is also continuously differentiable with a bounded derivative. If π is any partition of [0, 1] then, using standard notation $\{x_0, \dots, x_p\}$ for the partition points of π and noting g = h on [0, 1), we must have

\begin{align*}
\sum_\pi |\Delta g_j| &= \sum_{j=1}^{p-1} |\Delta g_j| + | g(1) - g(x_{p-1}) | \\
&\le V(h, 0, 1) + 2 \| g \|_\infty.
\end{align*}

Since the choice of partition on [0, 1] is arbitrary, we see g ∈ BV[0, 1]. Thus, combining, we have that g ∈ BV[0, 2]. It then follows that f ∈ RS[g, 0, 2]. Now note that on [0, 1], we can write g(x) = h(x) + u(x) where

\[ u(x) = \begin{cases} 0, & 0 \le x < 1 \\ \sin(1) - 1, & x = 1. \end{cases} \]

Then, to evaluate $\int_0^2 f\,dg$ we write

\begin{align*}
\int_0^2 f\,dg &= \int_0^1 f\,dg + \int_1^2 f\,dg \\
&= \int_0^1 f\,d(h + u) + \int_1^2 f\,d(\sin(x)) \\
&= \int_0^1 f\,dh + \int_0^1 f\,du + \int_1^2 f \cos(x)\,dx \\
&= \int_0^1 e^x\,2x\,dx + f(1)\,(u(1) - u(1^-)) + \int_1^2 e^x \cos(x)\,dx \\
&= \int_0^1 e^x\,2x\,dx + e\,(\sin(1) - 1) + \int_1^2 e^x \cos(x)\,dx
\end{align*}

and these integrals are standard Riemann integrals that can be evaluated by parts.

7.4 Homework

Exercise 7.4.1. Define g on [0, 2] by

\[ g(x) = \begin{cases} -2, & x = 0 \\ x^3, & 0 < x < 1 \\ 9/8, & x = 1 \\ x^4/4 + 1, & 1 < x < 2 \\ 7, & x = 2. \end{cases} \]

This function is from a previous exercise.

1. Show that if $f(x) = x^4$ on [0, 2], then f ∈ RS[g, 0, 2].

2. Compute $\int_0^2 f\,dg$.

3. Explain why g ∈ RS[f, 0, 2].

4. Compute $\int_0^2 g\,df$.

Exercise 7.4.2. Define g on [0, 2] by

\[ g(x) = \begin{cases} -1, & x = 0 \\ x^2, & 0 < x < 1 \\ 7/4, & x = 1 \\ \sqrt{x + 3}, & 1 < x < 2 \\ 3, & x = 2. \end{cases} \]

This function is also from a previous exercise.

1. Show that if $f(x) = x^2 + 5$ on [0, 2], then f ∈ RS[g, 0, 2].

2. Compute $\int_0^2 f\,dg$.

3. Explain why g ∈ RS[f, 0, 2].

4. Compute $\int_0^2 g\,df$.

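Exercise 7.4.3. Let f and g be defined on [0, 4] by

\[ f(x) = \begin{cases} x, & x \in \mathbb{Q} \cap [0, 4] \\ 2x, & x \in \mathrm{Ir} \cap [0, 4], \end{cases} \qquad g(x) = \begin{cases} 1, & 0 \le x < 1 \\ 2, & 1 \le x < 2 \\ 3, & 2 \le x < 3 \\ 4, & 3 \le x \le 4. \end{cases} \]

Does $\int f\,dg$ exist and if so what is its value?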


Exercise 7.4.4. Let $f(x) = x^3$ and let g be defined on [0, 3] by

\[ g(x) = \begin{cases} x^2, & 0 \le x \le 2 \\ x^2 + 4, & 2 < x \le 3. \end{cases} \]


Exercise 7.4.5. Let $f(x) = x^2 + 3x + 10$ and let g be defined on [−1, 5] by

\[ g(x) = \begin{cases} x^3, & -1 \le x \le 2 \\ -10x^2, & 2 < x \le 5. \end{cases} \]


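Exercise 7.4.6. The following are definitions of integrands f1, f2 and f3 and integrators g1, g2 and g3 on [0, 2]. For each pair of indices i, j determine if $\int_0^2 f_i\,dg_j$ exists. If the integral exists, compute the value and if the integral does not exist, provide a proof of its failure to exist.

\[ f_1(x) = \begin{cases} 1, & 0 \le x < 1 \\ x - 1, & 1 \le x \le 2, \end{cases} \qquad f_2(x) = \begin{cases} 1, & x = 0 \\ x, & 0 < x \le 2, \end{cases} \qquad f_3(x) = \begin{cases} 2, & x = 0 \\ 1, & 0 < x < 1 \\ x - 1, & 1 \le x \le 2, \end{cases} \]

\[ g_1(x) = \begin{cases} x, & 0 \le x < 1 \\ x + 1, & 1 \le x \le 2, \end{cases} \qquad g_2(x) = \begin{cases} x, & 0 \le x \le 1 \\ x + 1, & 1 < x < 2 \\ 4, & x = 2, \end{cases} \qquad g_3(x) = \begin{cases} -1, & x = 0 \\ x, & 0 < x \le 1 \\ x + 1, & 1 < x < 2 \\ 4, & x = 2. \end{cases} \]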

Exercise 7.4.7. Prove

Theorem 7.4.1 (Limit Interchange Theorem For Riemann - Stieljes Integrals). Assume g ∈ BV[a, b] and $\{f_n\} \subseteq RS[g, a, b]$ converges uniformly to $f_0$ on [a, b]. Then

(i) $f_0 \in RS[g, a, b]$,

(ii) If $F_n(x) = \int_a^x f_n(t)\,dg(t)$ and $F_0(x) = \int_a^x f_0(t)\,dg(t)$, then $F_n$ converges uniformly to $F_0$ on [a, b].

(iii)
\[ \lim_n \int_a^b f_n(t)\,dg(t) = \int_a^b f_0(t)\,dg(t). \]

Exercise 7.4.8. Let g be strictly monotone on [a, b]. For f1, f2 in C[a, b], define $\omega : C[a, b] \times C[a, b] \to \Re$ by $\omega(f_1, f_2) = \int_a^b f_1(t)\,f_2(t)\,dg(t)$.


(i) Prove that ω is an inner product on C[a, b].

(ii) Prove if ω(f, h) = 0 for all h ∈ RS[g, a, b], then f = 0.


Part IV

Abstract Measure Theory One


Chapter 8

Measurable Functions and Spaces

If you have been looking closely at how we prove the properties of Riemann and Riemann Stieljes integration, you will have noted that these proofs are intimately tied to the way we use partitions to divide the

function domain into small pieces. We are now going to explore a new way to associate a given bounded

function with a real number which can be interpreted as the integral.

Let X be a nonempty set. In mathematics, we study sets such as X when various properties and

structures have been added. For example, we might want X to have a metric d to allow us to measure an

abstract version of distance between points in X . We could study sets X which have a linear or vector

space structure and if this resulting vector space possessed a norm ‖ · ‖, we could determine an abstract

version of the magnitude of objects in X . Here, we want to look at collections of subsets of the set X and

impose some conditions on the structure of these collections.

Definition 8.0.1 (Sigma Algebras). Let X be a nonempty set. A family S of subsets of X is called a σ - algebra if

(i) ∅, X ∈ S.

(ii) If A ∈ S, so is $A^C$. We say S is closed under complementation or complements.

(iii) If $\{A_n\}_{n=1}^\infty \subseteq S$, then $\cup_{n=1}^\infty A_n \in S$. We say S is closed under countable unions.

The pair (X,S) will be called a measurable space and if A ∈ S, we will call A an S measurable set.

If the underlying σ - algebra is understood, we usually just say, A is a measurable subset of X .

Common tools we use in working with countable collections of sets are De Morgan’s Laws.

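Lemma 8.0.2 (De Morgan’s Laws). Let X be a nonempty set and $\{A_\alpha \mid \alpha \in \Lambda\}$ be any collection of subsets of X. Here, the index set Λ may be finite, countably infinite or of arbitrary cardinality. Then

(i) $\big( \cup_\alpha A_\alpha \big)^C = \cap_\alpha A_\alpha^C$

(ii) $\big( \cap_\alpha A_\alpha \big)^C = \cup_\alpha A_\alpha^C$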

Proof 8.0.1. This is a standard proof and is left to you as an exercise.

8.1 Examples

Let’s work through a series of examples of σ algebras.

Example 8.1.1. Let X be any not empty set and let S = A|A ⊆ X. This is the collection of all subsets

and is sometimes called the power set ofX . It is often denoted by the symbolP(X). This collection clearly

is a σ algebra. Hence, (P(X), X) is a measurable space and all subsets of X are P(X) measurable.

Example 8.1.2. Let X be any nonempty set and S = {∅, X}. Then this collection is also a σ algebra, albeit not a very interesting one! With this σ algebra, (X, S) is a measurable space with only two measurable sets.

Example 8.1.3. Let X be the set of counting numbers and let S = {∅, O, E, X} where O is the set of odd counting numbers and E is the set of even counting numbers. Since O and E are complements of each other and O ∪ E = X, it is easy to see (X, S) is a measurable space.

Example 8.1.4. Let X be any uncountable set and let S = {A ⊆ X | A is countable or $A^C$ is countable}. It is easy to see ∅ and X itself are in S. If A ∈ S, then there are two cases: either A is countable or $A^C$ is countable (both can not happen, since X is uncountable). In both cases, it is straightforward to reason that $A^C$ is also in S. It remains to show that S is closed under countable unions. To do this, assume we have a sequence of sets $(A_n)$ from S. Consider $A = \cup_n A_n$. There are several cases to consider.

1. If all the $A_n$ are countable, then so is the countable union, implying A ∈ S.

2. If all the $A_n$ are not countable, then each $A_n^C$ is countable. Thus, $\cap_n A_n^C = (\cup_n A_n)^C$ is countable. Again, this tells us A ∈ S.


8.2. BOREL SIGMA ALGEBRA CHAPTER 8. MEASURABILITY

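3. If some of the $A_n$ are countable and the rest are uncountable, then we have, since X is uncountable (so that a countable $A_n$ has an uncountable complement, while an uncountable $A_n$ in S has a countable complement),

\[
\left(\cup_n A_n\right)^C = \cap_n A_n^C
= \left(\cap_{A_n \text{ countable}} A_n^C\right) \cap \left(\cap_{A_n \text{ uncountable}} A_n^C\right)
= \left(\cap_{A_n^C \text{ uncountable}} A_n^C\right) \cap \left(\cap_{A_n^C \text{ countable}} A_n^C\right).
\]

Now, for any index m, we must have $\cap_n A_n^C \subseteq A_m^C$. Thus, since some $A_m^C$ are countable, we must have $\cap_n A_n^C$ is countable. By De Morgan’s Laws, it follows that $(\cup_n A_n)^C$ is countable. This implies A ∈ S.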

We conclude (X,S) is a measurable space.

Example 8.1.5. Let X be any nonempty set and let S1 and S2 be two sigma - algebras of X . Let

\[ S_3 = \{A \subseteq X \mid A \in S_1 \text{ and } A \in S_2\} \equiv S_1 \cap S_2. \]

It is straightforward to see that (X,S3) is a measurable space.

Example 8.1.6. Let X be any nonempty set. Let A be any nonempty collection of subsets of X . Note that

P(X), the collection of all subsets of X , is a sigma - algebra of X and hence, (X,P(X)) is a measurable

space that contains A. By Example 8.1.5, we know if S1 and S2 are two other sigma - algebras that contain

A, then S1 ∩ S2 is a new sigma - algebra that also contains A. This suggests we search for the smallest

sigma - algebra that contains A.

Definition 8.1.1 (The Sigma - Algebra Generated By Collection A). The sigma - algebra generated by a collection of subsets A in a nonempty set X is denoted by σ(A) and is defined by

\[ \sigma(A) = \cap \{S \mid S \text{ is a sigma - algebra of } X \text{ and } A \subseteq S\}. \]

Since any sigma - algebra S that contains A by definition satisfies σ(A) ⊆ S, it is easy to see why we

interpret this generated sigma - algebra as the smallest sigma - algebra that contains the collection

A.
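On a finite set, the generated sigma - algebra can be computed directly by closing the collection under complements and unions until nothing new appears. The sketch below does this; it assumes X is finite (so the loop terminates) and the name `generate_sigma_algebra` is hypothetical. Every set it adds must belong to any sigma - algebra containing A, which is why the result is the smallest one.

```python
def generate_sigma_algebra(X, A):
    """Smallest sigma-algebra on the finite set X containing every set in A."""
    X = frozenset(X)
    S = {frozenset(), X} | {frozenset(a) for a in A}
    changed = True
    while changed:
        changed = False
        current = list(S)
        # close under complements
        for E in current:
            if X - E not in S:
                S.add(X - E)
                changed = True
        # close under pairwise unions
        for E in current:
            for F in current:
                if E | F not in S:
                    S.add(E | F)
                    changed = True
    return S

G = generate_sigma_algebra({1, 2, 3, 4}, [{1}])
print(sorted(sorted(E) for E in G))  # [[], [1], [1, 2, 3, 4], [2, 3, 4]]
```

Starting from the two sets {1} and {2} instead produces the eight unions of the blocks {1}, {2} and {3, 4}.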

8.2 The Borel Sigma - Algebra of ℜ

We now discuss a very important sigma - algebra of subsets of the real line called the Borel sigma - algebra

which is denoted by B. Define four collections of subsets of ℜ as follows:


1. A is the collection of finite open intervals of the form (a, b),

2. B is the collection of finite half open intervals of the form (a, b],

3. C is the collection of finite half open intervals of the form [a, b) and

4. D is the collection of finite closed intervals of the form [a, b].

It is possible to show that

σ(A) = σ(B) = σ(C) = σ(D).

This common sigma - algebra is what we will call the Borel sigma - algebra of ℜ. It should be evident

to you that a set can be very complicated and still be in B. Some of these equalities will be left to you as

homework exercises, but we will prove that σ(A) = σ(D). Let S be any sigma - algebra that contains

A. We know that

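\[
[a, b] = (-\infty, b] \cap [a, \infty) = (b, \infty)^C \cap (-\infty, a)^C = \left( (-\infty, a) \cup (b, \infty) \right)^C.
\]

In the representation of [a, b] above, note we can write

\[
(-\infty, a) = \bigcup_{n=-\infty}^{\lfloor a \rfloor} (n, a), \qquad (b, \infty) = \bigcup_{n=\lceil b \rceil}^{\infty} (b, n).
\]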

Since S is a sigma - algebra containing A, the unions on the right hand sides in the equations above must

be in S. This immediately tells us that [a, b] is also in S . Hence, since [a, b] is arbitrary, we conclude D

is contained in S also. Further, this is true for any sigma - algebra that contains A and so we have that

D ⊆ σ(A). Thus, by definition, we can say σ(D) ⊆ σ(A).

To show the reverse containment is quite similar. Let S be any sigma - algebra that contains D. We

know that

\[
(a, b) = (-\infty, b) \cap (a, \infty) = [b, \infty)^C \cap (-\infty, a]^C = \left( (-\infty, a] \cup [b, \infty) \right)^C.
\]


8.3. EXTENDED BOREL SIGMA ALGEBRA CHAPTER 8. MEASURABILITY

In the representation of (a, b) above, note we can write

\[
(-\infty, a] = \bigcup_{n=\lfloor a \rfloor}^{\infty} [-n, a]
\]

and

\[
[b, \infty) = \bigcup_{n=\lceil b \rceil}^{\infty} [b, n],
\]

where any term with left endpoint larger than its right endpoint is simply the empty set.

Since S is a sigma - algebra containing D, the unions on the right hand sides in the equations above must be in S. This immediately tells us that (a, b) is also in S. Hence, since (a, b) is arbitrary, we conclude A is contained in S also. Again, since this is true for any sigma - algebra that contains D, we have that A ⊆ σ(D). Thus, by definition, we can say σ(A) ⊆ σ(D). Combining, we have the equality we seek.

8.2.1 Homework

Exercise 8.2.1. Prove σ(A) = σ(B).

Exercise 8.2.2. Prove σ(B) = σ(C).

8.3 The Extended Borel Sigma Algebra

It is often very convenient to deal with a number system that explicitly adjoins the symbols ∞ and −∞ to

the standard real line ℜ. This is actually called the two - point compactification of ℜ, but that is another

story!



Definition 8.3.1 (The Extended Real Number System). The extended real number system is denoted by $\overline{\Re}$ and is defined as the real numbers with two additional elements:

\[ \overline{\Re} = \Re \cup \{+\infty\} \cup \{-\infty\}. \]

We want arithmetic involving the new symbols ±∞ to reflect our everyday experience with limits of

sequences of numbers which either grow without bound positively or negatively. Hence, we use the

conventions for all real numbers x:

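\[
\begin{aligned}
(\pm\infty) + (\pm\infty) &= x + (\pm\infty) = (\pm\infty) + x = \pm\infty, \\
(\pm\infty) \cdot (\pm\infty) &= \infty, \\
(\pm\infty) \cdot (\mp\infty) &= (\mp\infty) \cdot (\pm\infty) = -\infty, \\
x \cdot (\pm\infty) &= (\pm\infty) \cdot x = \pm\infty \quad \text{if } x > 0, \\
x \cdot (\pm\infty) &= (\pm\infty) \cdot x = 0 \quad \text{if } x = 0, \\
x \cdot (\pm\infty) &= (\pm\infty) \cdot x = \mp\infty \quad \text{if } x < 0.
\end{aligned}
\]

We can not define the arithmetic operations (∞) + (−∞), (−∞) + (∞) or any of the four ratios of the form (±∞)/(±∞).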
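These conventions differ in one place from IEEE floating point arithmetic, where 0 times infinity is NaN rather than 0. The sketch below models the conventions above with Python floats; the helper names `ext_mul` and `ext_add` are hypothetical and introduced only for this illustration.

```python
import math

def ext_mul(x, y):
    """Product in the extended reals, using the convention 0 * (+-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y

def ext_add(x, y):
    """Sum in the extended reals; (+inf) + (-inf) is left undefined."""
    if math.isinf(x) and math.isinf(y) and x != y:
        raise ValueError("(+inf) + (-inf) is not defined")
    return x + y

print(ext_mul(0, math.inf))            # 0.0, by the measure theory convention
print(math.isnan(0.0 * math.inf))      # True: plain floating point gives NaN
print(ext_mul(-2, math.inf))           # -inf
print(ext_add(3, -math.inf))           # -inf
```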

We can now define the Borel sigma - algebra in $\overline{\Re}$. Let E be any Borel set in ℜ. Let

\[ E_1 = E \cup \{-\infty\}, \quad E_2 = E \cup \{+\infty\}, \quad \text{and} \quad E_3 = E \cup \{-\infty\} \cup \{+\infty\}. \]

Then, we define

\[ \overline{B} = \{E, E_1, E_2, E_3 \mid E \in B\}. \]

We leave to you the exercise of showing that $\overline{B}$ is a sigma - algebra in $\overline{\Re}$.

Exercise 8.3.1. Prove that $\overline{B}$ is a sigma - algebra in $\overline{\Re}$.

It is clear that open intervals in ℜ are in B, but is it true that B contains arbitrary open sets? To see that it

does, we must prove a characterization for the open sets of ℜ.

Theorem 8.3.1 (Open Set Characterization Lemma). If U is an open set in ℜ, then there is a countable collection of disjoint open intervals $C = \{(a_n, b_n)\}$ so that $U = \cup_n (a_n, b_n)$.



Proof 8.3.1. Since U is open, if p ∈ U, there is an r > 0 so that B(p; r) ⊆ U. Hence, (p − r, p + r) ⊆ U, implying both (p, p + r) ⊆ U and (p − r, p) ⊆ U. Let

\[ S_p = \{y \mid (p, y) \subseteq U\} \quad \text{and} \quad T_p = \{x \mid (x, p) \subseteq U\}. \]

It is easy to see that both $S_p$ and $T_p$ are nonempty since U is open. Let $b_p = \sup S_p$ and $a_p = \inf T_p$. Clearly, $b_p$ could be +∞ and $a_p$ could be −∞.

Consider u ∈ (ap, bp). From the Infimum and Supremum tolerance lemmas, we know there are points

x∗ and y∗ so that

u < y∗ ≤ bp ≤ ∞ and (p, y∗) ⊆ U ,

−∞ ≤ ap ≤ x∗ < u and (x∗, p) ⊆ U .

Hence, u ∈ (x∗, y∗) ⊆ U which implies u ∈ U . Thus, since u in (ap, bp) is arbitrary, we have (ap, bp) ⊆ U .

If ap or bp were not finite, they can not be in ℜ and so can not be in U. However, what if either one was finite? Is it possible for the point to be in U? We will show that in this case, the points ap and bp still can not lie in U. For concreteness, let us assume that ap is finite and in U. Then, ap would be an interior point of U. Hence, there would be a radius ρ > 0 so that (ap − ρ, ap] ⊆ U. Combined with (ap, p) ⊆ U, this gives (ap − ρ, p) ⊆ U, implying ap − ρ ∈ Tp. Thus, inf Tp ≤ ap − ρ < ap = inf Tp, which is not possible. Hence, ap ∉ U. A similar argument then shows that if bp is finite, bp is not in U.

Thus, we know that ap and bp are never in U and that p is always in the open interval (ap, bp) ⊆ U .

Let $F = \{(a_p, b_p) \mid p \in U\}$. We see immediately that

\[ U = \bigcup_{(a_p, b_p) \in F} (a_p, b_p). \]

Let (a, b) and (c, d) be any two intervals from F which overlap. From the definition of F, we then know that a, b, c and d are not in U. If a ≥ d, the two intervals would be disjoint; hence, we must have a < d. By the same sort of argument, it is also true that c < b. Next, since a ∉ U and (c, d) ⊆ U, the point a can not lie in (c, d); since a < d, this forces a ≤ c. Further, since c ∉ U and (a, b) ⊆ U, the point c can not lie in (a, b); since c < b, it follows that c ≤ a. Combining, we have a = c. A similar argument shows that b = d. Hence, (a, b) ∩ (c, d) ≠ ∅ implies that (a, b) = (c, d). Thus, two intervals Ip and Iq in F are either the same or disjoint. We conclude

\[ U = \bigcup_{\text{disjoint } I_p \in F} I_p. \]

Let F0 be this collection of disjoint intervals from F. Each Ip in F0 contains a rational number rp. Since distinct intervals in F0 are disjoint, it follows that if Ip and Iq are distinct intervals in F0, then rp ≠ rq. The set of these rational numbers is countable and so we can label them using an enumeration (rn). Label the interval Ip which contains rn as


8.4. MEASURABLE FUNCTIONS CHAPTER 8. MEASURABILITY

$I_n$. Then, we have

\[ U = \bigcup_{n=1}^{\infty} I_n, \]

which is the desired result.
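For an open set that is already presented as a finite union of open intervals, the disjoint components of Theorem 8.3.1 can be found by sorting and merging. The sketch below covers only that special case and is not the general construction in the proof; the function name `components` is hypothetical.

```python
def components(intervals):
    """Merge overlapping open intervals (a, b) into disjoint open intervals."""
    merged = []
    # drop empty intervals, then sweep from left to right
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        if merged and a < merged[-1][1]:
            # (a, b) overlaps the last component, so extend that component
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            # no overlap; intervals that merely touch at an endpoint stay
            # separate because the shared endpoint is not in the union
            merged.append((a, b))
    return merged

print(components([(0, 2), (1, 3), (3, 4), (5, 6), (5.5, 7)]))
# [(0, 3), (3, 4), (5, 7)]
```

Note that (0, 3) and (3, 4) are kept apart: the point 3 is not in the union, in agreement with the fact that the endpoints of the component intervals never lie in U.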

8.4 Measurable Functions

Let f : ℜ → ℜ be a continuous function. Let O be an open subset of ℜ. By Theorem 8.3.1, we know that we can write

\[ O = \bigcup_n (a_n, b_n) \]

where the $(a_n, b_n)$ are mutually disjoint open intervals of ℜ, some of which may be unbounded. Since each such interval is a countable union of finite open intervals, it follows immediately that O is in the Borel sigma - algebra B. Now consider the inverse image of O under f, $f^{-1}(O)$. If $p \in f^{-1}(O)$, then f(p) ∈ O. Since O is open, f(p) must be an interior point. Hence, there is a radius r > 0 so that (f(p) − r, f(p) + r) ⊆ O. Since f is continuous at p, there then is a δ > 0 so that f(x) ∈ (f(p) − r, f(p) + r) if x ∈ (p − δ, p + δ). This tells us that $(p - \delta, p + \delta) \subseteq f^{-1}(O)$. Since p was arbitrarily chosen, we conclude that $f^{-1}(O)$ is an open set.

We see that if f is continuous on ℜ, then $f^{-1}(O)$ is in the Borel sigma - algebra for any open set O in ℜ. We can then say that $f^{-1}(\alpha, \infty)$ is in B for all α ∈ ℜ. This suggests that an interesting way to

generalize the notion of continuity might be to look for functions f on an arbitrary nonempty set X with

sigma - algebra S satisfying $f^{-1}(O) \in S$ for all open sets O. Further, by our last remark, it should be

enough to ask that $f^{-1}((\alpha, \infty)) \in S$ for all α ∈ ℜ. This is exactly what we will do. It should be no

surprise to you that functions f satisfying this new definition will not have to be continuous!

Definition 8.4.1 (The Measurability of a Function). Let X be a nonempty set and S be a sigma - algebra of subsets of X. We say that f : X → ℜ is an S - measurable function on X or simply S measurable if

\[ \forall \alpha \in \Re, \quad \{x \in X \mid f(x) > \alpha\} \in S. \]

We can easily prove that there are equivalent ways of proving a function is measurable.



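Lemma 8.4.1 (Equivalent Conditions For The Measurability of a Function). Let X be a nonempty set and S be a sigma - algebra of subsets of X. Let f : X → ℜ. The following statements are equivalent:

(i): $\forall \alpha \in \Re$, $A_\alpha = \{x \in X \mid f(x) > \alpha\} \in S$,

(ii): $\forall \alpha \in \Re$, $B_\alpha = \{x \in X \mid f(x) \leq \alpha\} \in S$,

(iii): $\forall \alpha \in \Re$, $C_\alpha = \{x \in X \mid f(x) \geq \alpha\} \in S$,

(iv): $\forall \alpha \in \Re$, $D_\alpha = \{x \in X \mid f(x) < \alpha\} \in S$.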

Proof 8.4.1. (i) ⇒ (ii):

Proof. If $A_\alpha \in S$, then its complement is in S also. Since $B_\alpha = A_\alpha^C$, (ii) follows.

(ii) ⇒ (i):

Proof. If $B_\alpha \in S$, then its complement is in S also. Since $A_\alpha = B_\alpha^C$, (i) follows.

(iii) ⇔ (iv):

Proof. Since $C_\alpha = D_\alpha^C$ and $D_\alpha = C_\alpha^C$, arguments similar to those of the previous cases can be applied.

Hence, if we show (i) ⇔ (iii), we will be done. (i) ⇒ (iii):

Proof. By (i), $A_{\alpha - 1/n} \in S$ for all n. We know

\[ C_\alpha = \bigcap_n A_{\alpha - 1/n} = \bigcap_n \{x \mid f(x) > \alpha - 1/n\}. \]

We also know $A_{\alpha - 1/n}^C$ is measurable and so $\cup_n A_{\alpha - 1/n}^C$ is also measurable. Thus, the complement of $\cup_n A_{\alpha - 1/n}^C$ is also measurable. Then, by De Morgan’s Laws, $C_\alpha = \cap_n A_{\alpha - 1/n}$ is measurable.

(iii) ⇒ (i):

Proof. Note, $C_{\alpha + 1/n} \in S$ for all n and so

\[ A_\alpha = \bigcup_n C_{\alpha + 1/n} = \bigcup_n \{x \mid f(x) \geq \alpha + 1/n\} \]

is also measurable.

We conclude all four statements are equivalent.
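On a finite set, condition (i) can be tested exhaustively, since the set (f(x) > α) only changes when α crosses a value of f. The sketch below assumes X is finite and uses the odd and even sigma - algebra of Example 8.1.3 restricted to {1, ..., 6}; the name `is_measurable` is hypothetical.

```python
def is_measurable(f, X, S):
    """Check that {x in X : f(x) > alpha} lies in S for every real alpha (X finite)."""
    S = {frozenset(A) for A in S}
    values = {f(x) for x in X}
    # alpha below every value gives X; all other distinct level sets
    # already occur when alpha equals one of the values of f
    thresholds = values | {min(values) - 1}
    return all(frozenset(x for x in X if f(x) > alpha) in S
               for alpha in thresholds)

X = range(1, 7)
S = [set(), {1, 3, 5}, {2, 4, 6}, set(X)]

print(is_measurable(lambda x: x % 2, X, S))  # True: constant on odds and on evens
print(is_measurable(lambda x: x, X, S))      # False: {2,...,6} is not in S
```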



8.4.1 Examples

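Example 8.4.1. Any constant function f on a nonempty set X with given sigma - algebra S is measurable, as if f(x) = c for some c ∈ ℜ, then

\[ \{x \mid f(x) > \alpha\} = \begin{cases} \emptyset \in S, & \alpha \geq c, \\ X \in S, & \alpha < c. \end{cases} \]

Example 8.4.2. Let X be a nonempty set with given sigma - algebra S. Let E ∈ S be given. Define

\[ I_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E. \end{cases} \]

Then $I_E$ is measurable. Note

\[ \{x \mid I_E(x) > \alpha\} = \begin{cases} \emptyset \in S, & \alpha \geq 1, \\ E \in S, & 0 \leq \alpha < 1, \\ X \in S, & \alpha < 0. \end{cases} \]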

Example 8.4.3. Let X = ℜ and S = B. Then, if f : ℜ → ℜ is continuous, f is measurable by the arguments we made at the beginning of this section. More generally, let f : [a, b] → ℜ be continuous on [a, b]. Then, extend f to ℜ as $\hat{f}$ defined by

\[ \hat{f}(x) = \begin{cases} f(a), & x < a, \\ f(x), & a \leq x \leq b, \\ f(b), & x > b. \end{cases} \]

Then $\hat{f}$ is continuous on ℜ and measurable with $\hat{f}^{-1}(\alpha, \infty) \in B$ for all α. It is not hard to show that

\[ B \cap [a, b] = \{E \subseteq [a, b] \mid E \in B\} \]

is a sigma - algebra of the set [a, b]. Further, the standard arguments for f continuous on [a, b] show us that $f^{-1}(\alpha, \infty) \in B \cap [a, b]$ for all α. Hence, a continuous f on the interval [a, b] will be measurable with respect to the sigma - algebra B ∩ [a, b].

We can argue in a similar fashion for functions continuous on intervals of the form (a, b], [a, b) and (a, b) whether a and b are finite or not.

Example 8.4.4. If X = ℜ and S = B, then any monotone function is Borel measurable. To see this,

note we can restrict our attention to monotone increasing functions as the argument is quite similar for

monotone decreasing. It is enough to consider the cases where f takes on the value α without a jump at

the point x0 or f has a jump across the value α at x0. In the first case, since f is monotone increasing and


8.5. PROPERTIES CHAPTER 8. MEASURABILITY

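$f(x_0) = \alpha$, $f^{-1}(\alpha, \infty) = (x_0, \infty) \in B$. On the other hand, if f has a jump at $x_0$ across the value α, then $f(x_0^-) \neq f(x_0^+)$ and $\alpha \in [f(x_0^-), f(x_0^+)]$. There are three possibilities:

(i): $f(x_0^-) = f(x_0) < f(x_0^+)$: If $\alpha = f(x_0)$, then since f is monotone, $f^{-1}(\alpha, \infty) = (x_0, \infty)$. If $f(x_0) < \alpha < f(x_0^+)$, we again have $f^{-1}(\alpha, \infty) = (x_0, \infty)$. Finally, if $\alpha = f(x_0^+)$, then $x_0$ itself is not in the inverse image since $f(x_0) < \alpha$, and monotonicity shows $f^{-1}(\alpha, \infty)$ is either empty or an interval of the form $(x_1, \infty)$ or $[x_1, \infty)$ for some $x_1 \geq x_0$. In all cases, these inverse images are in B.

(ii): $f(x_0^-) < f(x_0) < f(x_0^+)$: A similar analysis shows that all the possible inverse images are Borel sets.

(iii): $f(x_0^-) < f(x_0) = f(x_0^+)$: We handle the arguments in a similar way.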

We conclude that in all cases, f−1(α,∞) ∈ B and hence f is measurable.

Note, the analysis of the previous example could be employed here also to show that a monotone

function defined on an interval such as [a, b], (a, b) and so forth is Borel measurable with respect to the

restricted sigma - algebra B ∩ [a, b] etc.

Exercise 8.4.1. Let f be piecewise continuous on [a, b]. Prove that f is measurable with respect to the

restricted Borel sigma - algebra B ∩ [a, b]. Recall, a function is piecewise continuous on [a, b] if there are

a finite number of points xi in [a, b] where f is not continuous.

Comment 8.4.1. For convenience, we will start using a more abbreviated notation for sets like

\[ \{x \in X \mid f(x) > \alpha\}; \]

we will shorten this to {f(x) > α} or (f(x) > α) in our future discussions.

8.5 Properties of Measurable Functions

We now want to see how we can build new measurable functions from old ones we know.

Lemma 8.5.1 (Properties of Measurable Functions). Let X be a nonempty set and S a sigma - algebra on X. Then if f and g are S measurable, so are

(i): cf for all c ∈ ℜ.

(ii): $f^2$.

(iii): f + g.

(iv): fg.

(v): | f |.

Proof 8.5.1.(i):


Proof . If c = 0, cf = 0 and the result is clear. If c > 0, then (cf(x) > α) = (f(x) > α/c) which is

measurable as f is measurable. If c < 0, a similar argument holds.

(ii):

Proof. If α < 0, then $(f^2(x) > \alpha) = X$ which is in S. Otherwise, if α ≥ 0, then

\[ (f^2(x) > \alpha) = (f(x) > \sqrt{\alpha}) \cup (f(x) < -\sqrt{\alpha}), \]

and both of these sets are measurable since f is measurable. The conclusion follows.

(iii):

Proof. If r ∈ Q, let $S_r = (f(x) > r) \cap (g(x) > \alpha - r)$ which is measurable since f and g are measurable. We claim that

\[ (f(x) + g(x) > \alpha) = \bigcup_{r \in Q} S_r. \]

To see this, let x satisfy f(x) + g(x) > α. Thus, f(x) > α − g(x). Since the rationals are dense in ℜ, we see there is a rational number r so that f(x) > r > α − g(x). This clearly implies that f(x) > r and g(x) > α − r and so $x \in S_r$. Since our choice of x was arbitrary, we have shown that

\[ (f(x) + g(x) > \alpha) \subseteq \bigcup_{r \in Q} S_r. \]

The converse is easier as if x ∈ Sr, it follows immediately that f(x) + g(x) > α.

Since Sr is measurable for each r and the rationals are countable, we see (f(x) + g(x) > α) is

measurable.

(iv):

Proof. To prove this result, note that $fg = (1/4)\left( (f + g)^2 - (f - g)^2 \right)$ and all the individual pieces are measurable by (i), (ii) and (iii).

(v):

Proof. If α < 0, (| f |(x) > α) = X which is measurable. On the other hand, if α ≥ 0,

\[ (|f|(x) > \alpha) = (f(x) > \alpha) \cup (f(x) < -\alpha), \]

which implies the measurability of | f |.

We can also prove another characterization of the measurability of f .


8.6. EXTENDED VALUED CHAPTER 8. MEASURABILITY

Lemma 8.5.2 (A Function is Measurable If and Only If Its Positive and Negative Parts Are Measurable). Let X be a nonempty set and S be a sigma - algebra on X. Then f : X → ℜ is measurable if and only if $f^+$ and $f^-$ are measurable, where

\[ f^+(x) = \max\{f(x), 0\} \quad \text{and} \quad f^-(x) = -\min\{f(x), 0\}. \]

Proof 8.5.7. We note $f = f^+ - f^-$ and $|f| = f^+ + f^-$. Thus,

\[ f^+ = (1/2)\left(|f| + f\right), \qquad f^- = (1/2)\left(|f| - f\right). \]

Hence, if f is measurable, by Lemma 8.5.1 (i), (iii) and (v), f+ and f− are also measurable. Conversely,

if f+ and f− are measurable, f = f+ − f− is measurable as well.
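The identities used in this proof are easy to check pointwise. The sketch below does so for one sample function; the function f(x) = x^3 − x and the sample points are arbitrary choices for illustration, and the helper names are hypothetical.

```python
def pos_part(f):
    """Return the positive part f+ = max{f, 0} as a new function."""
    return lambda x: max(f(x), 0.0)

def neg_part(f):
    """Return the negative part f- = -min{f, 0} as a new function."""
    return lambda x: -min(f(x), 0.0)

f = lambda x: x**3 - x
f_plus, f_minus = pos_part(f), neg_part(f)

points = [-1.5, -0.5, 0.0, 0.5, 2.0]
checks = [
    (f(x) == f_plus(x) - f_minus(x),            # f = f+ - f-
     abs(f(x)) == f_plus(x) + f_minus(x),       # |f| = f+ + f-
     f_plus(x) == 0.5 * (abs(f(x)) + f(x)),     # f+ = (1/2)(|f| + f)
     f_minus(x) == 0.5 * (abs(f(x)) - f(x)))    # f- = (1/2)(|f| - f)
    for x in points
]
print(all(all(c) for c in checks))
```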

8.6 Extended Valued Measurable Functions

We now extend these ideas to functions which are extended real valued.

Definition 8.6.1 (The Measurability Of An Extended Real Valued Function). Let X be a nonempty set and S be a sigma - algebra on X. Let $f : X \to \overline{\Re}$. We say f is S measurable if (f(x) > α) is in S for all α in ℜ.

Comment 8.6.1. If the extended valued function f is measurable, then $(f(x) = +\infty) = \cap_n (f(x) > n)$ is measurable. Also, since

\[ (f(x) = -\infty) = \left( \cup_n (f(x) > -n) \right)^C, \]

it is measurable also.

We can then prove an equivalence theorem just like before.


Lemma 8.6.1 (Equivalent Conditions For The Measurability of an Extended Real Valued Function). Let X be a nonempty set and S be a sigma - algebra of subsets of X. Let $f : X \to \overline{\Re}$. The following statements are equivalent:

(i): $\forall \alpha \in \Re$, $A_\alpha = \{x \in X \mid f(x) > \alpha\} \in S$,

(ii): $\forall \alpha \in \Re$, $B_\alpha = \{x \in X \mid f(x) \leq \alpha\} \in S$,

(iii): $\forall \alpha \in \Re$, $C_\alpha = \{x \in X \mid f(x) \geq \alpha\} \in S$,

(iv): $\forall \alpha \in \Re$, $D_\alpha = \{x \in X \mid f(x) < \alpha\} \in S$.

Proof 8.6.1. The proof follows that of Lemma 8.4.1.

The collection of all extended valued measurable functions is important to future work. We make the

following definition:

Definition 8.6.2 (The Set of Extended Real Valued Measurable Functions). Let X be a nonempty set and S be a sigma - algebra of subsets of X. We denote by M(X, S) the set of all extended real valued measurable functions on X. Thus,

\[ M(X, S) = \{f : X \to \overline{\Re} \mid f \text{ is } S \text{ measurable}\}. \]

It is also easy to prove the following equivalent definition of measurability for extended valued functions.

Lemma 8.6.2 (Extended Valued Measurability In Terms Of The Finite Part Of The Function). Let X be a nonempty set and S be a sigma - algebra of subsets of X. Then f ∈ M(X, S) if and only if (i): (f(x) = +∞) ∈ S, (ii): (f(x) = −∞) ∈ S and (iii): $f_1$ is measurable where

\[ f_1(x) = \begin{cases} f(x), & x \notin (f(x) = +\infty) \cup (f(x) = -\infty), \\ 0, & x \in (f(x) = +\infty) \cup (f(x) = -\infty). \end{cases} \]

Proof 8.6.2. By Comment 8.6.1, if f is measurable, (i) and (ii) are true. Now, if α ≥ 0 is given, we see

(f1(x) > α) = (f(x) > α) ∩ (f(x) = +∞)C ,

which is a measurable set. On the other hand, if α < 0, then

(f1(x) > α) = (f(x) > α) ∪ (f(x) = −∞),


8.7. EXTENDED PROPERTIES CHAPTER 8. MEASURABILITY

which is measurable as well. We conclude f1 is measurable. Conversely, if (i), (ii) and (iii) hold, then if

α ≥ 0, we have

(f(x) > α) = (f1(x) > α) ∪ (f(x) = +∞),

and if α < 0,

(f(x) > α) = (f1(x) > α) ∩ (f(x) = −∞)C ,

implying both sets are measurable. Thus, f is measurable.

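Example 8.6.1. Let X be a nonempty set with given sigma - algebra S. Let E ∈ S be given. Define the extended valued characteristic function

\[ J_E(x) = \begin{cases} \infty & \text{if } x \in E, \\ 0 & \text{if } x \notin E. \end{cases} \]

Then $J_E$ is measurable. Note

\[ \{x \mid J_E(x) > \alpha\} = \begin{cases} E \in S, & \alpha \geq 0, \\ X \in S, & \alpha < 0. \end{cases} \]

Note also that if we instead define

\[ J_E(x) = \begin{cases} \infty & \text{if } x \in E, \\ -\infty & \text{if } x \notin E, \end{cases} \]

then $J_E$ is measurable. We have

\[ \{x \mid J_E(x) > \alpha\} = \begin{cases} E \in S, & \alpha \geq 0, \\ E \in S, & \alpha < 0. \end{cases} \]

Finally, for this second function, $(J_E(x) = +\infty) = E$ and $(J_E(x) = -\infty) = E^C$ are both measurable and the $f_1$ type function used in Lemma 8.6.2 here is $(J_E)_1(x) = 0$ always.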

8.7 Properties Of Extended Valued Measurable Functions

It is straightforward to prove these properties:



Lemma 8.7.1 (Properties of Extended Valued Measurable Functions). Let X be a nonempty set and S a sigma - algebra on X. Then if f and g are in M(X, S), so are

(i): cf for all c ∈ ℜ.

(ii): $f^2$.

(iii): f + g, as long as we restrict the domain of f + g to be $E_{fg}^C$ where

\[ E_{fg}^C = \left( \left( (f(x) = +\infty) \cap (g(x) = -\infty) \right) \cup \left( (f(x) = -\infty) \cap (g(x) = +\infty) \right) \right)^C. \]

We usually define (f + g)(x) = 0 on $E_{fg}$, the set where the sum would have the undefined form (+∞) + (−∞). Note $E_{fg}$ is measurable since f and g are measurable functions.

(iv): | f |, $f^+$ and $f^-$.

Proof 8.7.1. These proofs are similar to those shown in the proof of Lemma 8.5.1. However, let’s look at the details of the proof of (iii). We see that our definition of addition of the extended real valued sum means that

\[ (f + g)(x) = \left(f + g\right) I_{E_{fg}^C}(x). \]

Define h by

\[ h(x) = \left(f + g\right) I_{E_{fg}^C}(x). \]

Let α be a real number. Then

\[ (h(x) > \alpha) = \begin{cases} (f(x) + g(x) > \alpha) \cap E_{fg}^C, & \alpha \geq 0, \\ \left( (f(x) + g(x) > \alpha) \cap E_{fg}^C \right) \cup E_{fg}, & \alpha < 0. \end{cases} \]

Similar to what we did in Lemma 8.5.1, for r ∈ Q, let

\[ S_r = (f(x) > r) \cap (g(x) > \alpha - r) \cap E_{fg}^C \]

which is measurable since f and g are measurable. We claim that

\[ (f(x) + g(x) > \alpha) \cap E_{fg}^C = \bigcup_{r \in Q} S_r. \qquad (8.1) \]

To see this, let x be in the left hand side of Equation 8.1. There are several cases. First, neither f(x) nor g(x) can be −∞ since α is a real number. Now, if f(x) = ∞, then g(x) > −∞ and it is easy to see there is a rational number r satisfying f(x) = ∞ > r > α − g(x) and so x is in the right hand side. If g(x) = ∞, then f(x) > −∞ and again, we see there is a rational number r so that f(x) > r > α − g(x) = −∞. Thus, x is in the right hand side again. The case where both f(x) and g(x) are finite is then handled just like we did in the proof of Lemma 8.5.1. We conclude

\[ (f(x) + g(x) > \alpha) \cap E_{fg}^C \subseteq \bigcup_{r \in Q} S_r. \]

The converse is easier as if $x \in S_r$, it follows immediately that f(x) + g(x) is defined and f(x) + g(x) > α.

Since $S_r$ is measurable for each r and the rationals are countable, we see $(f(x) + g(x) > \alpha) \cap E_{fg}^C$ is measurable.

To prove that products of extended valued measurable functions are also measurable, we have to use a

pointwise limit approach.

Lemma 8.7.2 (Pointwise Infimums, Supremums, Limit Inferiors and Limit Superiors are Measurable). Let X be a nonempty set and S a sigma - algebra on X. Let $(f_n) \subseteq M(X, S)$. Then

(i): If $f(x) = \inf_n f_n(x)$, then f ∈ M(X, S).

(ii): If $F(x) = \sup_n f_n(x)$, then F ∈ M(X, S).

(iii): If $f^*(x) = \liminf_n f_n(x)$, then $f^* \in M(X, S)$.

(iv): If $F^*(x) = \limsup_n f_n(x)$, then $F^* \in M(X, S)$.

Proof 8.7.2. It is straightforward to see that $(f(x) \geq \alpha) = \cap_n (f_n(x) \geq \alpha)$ and $(F(x) > \alpha) = \cup_n (f_n(x) > \alpha)$ and hence, are measurable for all α. By Lemma 8.6.1, it follows that f and F are in M(X, S) and so (i) and (ii) hold.

Next, recall from classical analysis that at each point x,

\[ \liminf_n f_n(x) = \sup_n \inf_{k \geq n} f_k(x), \qquad \limsup_n f_n(x) = \inf_n \sup_{k \geq n} f_k(x). \]

Now let $z_n(x) = \inf_{k \geq n} f_k(x)$ and $w_n(x) = \sup_{k \geq n} f_k(x)$. Applying (i) to $z_n$, we have $z_n \in M(X, S)$ and applying (ii) to $w_n$, we have $w_n \in M(X, S)$. Then apply (ii) to $\sup_n z_n$ and (i) to $\inf_n w_n$ to get the desired result.

This leads to an important result.

Theorem 8.7.3 (Pointwise Limits of Measurable Functions Are Measurable). Let X be a nonempty set and S a sigma - algebra on X. Let $(f_n) \subseteq M(X, S)$ and let $f : X \to \overline{\Re}$ be a function such that $f_n \to f$ pointwise on X. Then f ∈ M(X, S).



Proof 8.7.3. We know that lim infn fn(x) = lim supn fn(x) = limn fn(x). Thus, by Lemma 8.7.2, we

know that f is measurable.

Comment 8.7.1. This is a huge result. We know from classical analysis that the pointwise limit of continuous functions need not be continuous (e.g. let $f_n(t) = t^n$ on [0, 1]). Thus, the closure of a class of functions

which satisfy a certain property (like continuity) under a limit operation is not always guaranteed. We see

that although measurable functions are certainly not as smooth as we would like, they are well behaved

enough to be closed under pointwise limits!

We now show that M(X,S) is closed under multiplication.

Lemma 8.7.4 (Products of Measurable Functions Are Measurable). Let X be a nonempty set and S a sigma - algebra on X. Let f, g ∈ M(X, S). Then fg ∈ M(X, S).

Proof 8.7.4. Let $f_n$, the truncation of f, be defined by

\[ f_n(x) = \begin{cases} f(x), & |f(x)| \leq n, \\ n, & f(x) > n, \\ -n, & f(x) < -n. \end{cases} \]

We define the truncation of g, $g_n$, in a similar way. We can easily show $f_n$ and $g_m$ are measurable for any n and m. We only show the argument for $f_n$ as the argument for $g_m$ is identical. Let α be a given real number. Then

\[ (f_n(x) > \alpha) = \begin{cases} \emptyset, & \alpha \geq n, \\ (f(x) > n) \cup (\alpha < f(x) \leq n), & 0 \leq \alpha < n, \\ (f(x) > n) \cup (\alpha < f(x) \leq n), & -n \leq \alpha < 0, \\ X, & \alpha < -n. \end{cases} \]

It is easy to see all of these sets are in S since f is measurable. Thus, each real valued $f_n$ is measurable.

It then follows by Lemma 8.5.1 that $f_n g_m$ is also measurable. Note we are using the definition of measurability for real valued functions here. Next, an easy argument shows that at each x,

\[ f(x) = \lim_n f_n(x) \quad \text{and} \quad g(x) = \lim_m g_m(x). \]

It then follows that

\[ f(x)\, g_m(x) = \lim_n \left( f_n(x)\, g_m(x) \right). \]

Using Theorem 8.7.3, we see $f g_m$ is measurable. Then, noting

\[ f(x)\, g(x) = \lim_m \left( f(x)\, g_m(x) \right), \]

8.8. CONTINUOUS COMPOSITIONS CHAPTER 8. MEASURABILITY

another application of Theorem 8.7.3 establishes the result.
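The truncation used in this proof is just a clamp to the interval [−n, n]. The sketch below evaluates it at a single point, using Python floats with `math.inf` to stand in for the extended real values; the name `truncate` is hypothetical.

```python
import math

def truncate(y, n):
    """Value of the truncation f_n at a point where f takes the extended real value y."""
    return max(-n, min(n, y))

for y in [0.3, -7.25, math.inf, -math.inf]:
    # each row shows f_1, f_5 and f_10 at a point where f equals y
    print(y, [truncate(y, n) for n in (1, 5, 10)])
```

At a point where f is finite, the truncations eventually equal f(x); at a point where f is +∞ or −∞ they equal n or −n, which diverge to the correct infinite value.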

8.8 Continuous Functions of Measurable Functions

We wish to explore what properties the composition of a continuous function and a measurable function

might have.

8.8.1 The Composition With Finite Measurable Functions

We begin with the case of finite measurable functions.

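Lemma 8.8.1 (Continuous Functions Of Finite Measurable Functions Are Measurable). Let X be nonempty and (X, S, µ) be a measure space. Let f ∈ M(X, S) be finite. Let φ : ℜ → ℜ be continuous. Then φ ∘ f is measurable.

Proof 8.8.1. Let α be in ℜ. We claim

\[ \left(\phi \circ f\right)^{-1}(\alpha, \infty) = f^{-1}\left(\phi^{-1}(\alpha, \infty)\right). \]

First, let x be in the right hand side. Then,

\[ f(x) \in \phi^{-1}(\alpha, \infty) \Rightarrow \phi\left(f(x)\right) \in (\alpha, \infty) \Rightarrow x \in \left(\phi \circ f\right)^{-1}(\alpha, \infty). \]

Conversely, if x is in the left hand side, then

\[ \left(\phi \circ f\right)(x) \in (\alpha, \infty) \Rightarrow f(x) \in \phi^{-1}(\alpha, \infty) \Rightarrow x \in f^{-1}\left(\phi^{-1}(\alpha, \infty)\right). \]

Since φ is continuous, $G = \phi^{-1}(\alpha, \infty)$ is an open set. By Theorem 8.3.1, G is a countable union of open intervals (a, b), and $f^{-1}((a, b)) = (f(x) > a) \cap (f(x) < b)$ is in S since f is measurable. Hence, $f^{-1}(G)$ is in S. We conclude that φ ∘ f is measurable, since our choice of α is arbitrary.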

To handle the composition of a continuous function and an extended valued measurable function, we need

an approximation result.



8.8.2 The Approximation Of Non-negative Measurable Functions

Theorem 8.8.2 (The Approximation Of Non negative Measurable Functions By Monotone Sequences). Let X be a nonempty set and S a sigma - algebra on X. Let f ∈ M(X, S) be non negative. Then there is a sequence $(\phi_n) \subseteq M(X, S)$ so that

(i): $0 \leq \phi_n(x) \leq \phi_{n+1}(x)$ for all x and for all n ≥ 1.

(ii): $\phi_n(x) \leq f(x)$ for all x and n and $f(x) = \lim_n \phi_n(x)$.

(iii): Each $\phi_n$ has a finite range of values.

Proof 8.8.2. Pick a positive integer n. Let

\[ E_{k,n} = \begin{cases} \{x \in X \mid k/2^n \leq f(x) < (k+1)/2^n\}, & 0 \leq k \leq n 2^n - 1, \\ \{x \in X \mid n \leq f(x)\}, & k = n 2^n. \end{cases} \]

You should draw some of these sets for a number of choices of non negative functions f to get a feel for what they mean. Once you have done this, you will see that this definition slices the [0, n) range of f into $n 2^n$ slices each of height $2^{-n}$. The last set, $E_{n 2^n, n}$, is the set of all points where f(x) is at least n. This gives us a total of $n 2^n + 1$ sets. It is clear that $X = \cup_k E_{k,n}$ and that each of these sets is disjoint from the others. Now define the functions $\phi_n$ by

\[ \phi_n(x) = \frac{k}{2^n}, \quad x \in E_{k,n}. \]

It is evident that $\phi_n$ only takes on a finite number of values and so (iii) is established. Also, since f is measurable, we know each $E_{k,n}$ is measurable. Then, given any real number α, the set $(\phi_n(x) > \alpha)$ is either empty or consists of a union of the finite number of sets $E_{k,n}$ with the property that $(k/2^n) > \alpha$. Thus, $(\phi_n(x) > \alpha)$ is measurable for all α. We conclude each $\phi_n$ is measurable. If f(x) = +∞, then by definition, $\phi_n(x) = n$ for all n and we have $f(x) = \lim_n \phi_n(x)$. Note, the $\phi_n$ values are strictly monotonically increasing which shows (i) and (ii) both hold in this case.

On the other hand, if f(x) is finite, let $n_0$ be the first integer with $n_0 - 1 \leq f(x) < n_0$. Then, we must have $\phi_1(x) = 1$, $\phi_2(x) = 2$ and so forth until we have $\phi_{n_0 - 1}(x) = n_0 - 1$. These first values are monotone increasing. We also know from the definition of $\phi_{n_0}$ that there is a $k_0$ so that

\[ \frac{k_0}{2^{n_0}} \leq f(x) < \frac{k_0 + 1}{2^{n_0}}. \]

Thus, $0 \leq f(x) - \phi_{n_0}(x) < 2^{-n_0}$. Now consider the function $\phi_{n_0 + 1}$. We know

\[ f(x) \in \left[ \frac{k_0}{2^{n_0}}, \frac{k_0 + 1}{2^{n_0}} \right) = \left[ \frac{2 k_0}{2^{n_0 + 1}}, \frac{2 k_0 + 1}{2^{n_0 + 1}} \right) \cup \left[ \frac{2 k_0 + 1}{2^{n_0 + 1}}, \frac{2 k_0 + 2}{2^{n_0 + 1}} \right). \]

If f(x) lands in the first interval above, we have

\[ \phi_{n_0 + 1}(x) = \frac{2 k_0}{2^{n_0 + 1}} = \frac{k_0}{2^{n_0}} = \phi_{n_0}(x) \]

and if f(x) is in the second interval, we have

\[ \phi_{n_0 + 1}(x) = \frac{2 k_0 + 1}{2^{n_0 + 1}} > \frac{k_0}{2^{n_0}} = \phi_{n_0}(x). \]

In both cases, we have $\phi_{n_0}(x) \leq \phi_{n_0 + 1}(x)$. We also have immediately that $0 \leq f(x) - \phi_{n_0 + 1}(x) < 2^{-n_0 - 1}$.

The argument for $n_0 + 2$ and so on is quite similar and is omitted. This establishes (i) for this case. In general, we have $0 \leq f(x) - \phi_k(x) < 2^{-k}$ for all $k \geq n_0$. This implies that $f(x) = \lim_n \phi_n(x)$ which establishes (ii).
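The construction in this proof depends on x only through the value y = f(x), so it can be explored numerically one value at a time. The sketch below is a minimal illustration under that reading; the name `phi` and the sample value π are choices made here, not part of the text.

```python
import math

def phi(n, y):
    """Value of phi_n at a point where f takes the value y >= 0 (y may be inf)."""
    if y >= n:
        return float(n)          # the top set E_{n 2^n, n}
    k = math.floor(y * 2**n)     # the index k with k/2^n <= y < (k+1)/2^n
    return k / 2**n

y = math.pi
for n in range(1, 9):
    # the values increase with n and, once n > y, lie within 2^-n of y
    print(n, phi(n, y), y - phi(n, y))
```

The first three values are 1, 2 and 3, as the proof predicts for a value between 3 and 4, and from n = 4 onward the error is below $2^{-n}$.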

Finally, we can handle the case of the composition of a continuous function and an extended valued

measurable function.

8.8.3 Continuous Functions of Extended Valued Measurable Functions

Lemma 8.8.3 (Continuous Functions Of Measurable Functions Are Measurable). Let X be nonempty and (X, S, µ) be a measure space. Let f ∈ M(X, S). Let φ : ℜ → ℜ be continuous and assume that $\lim_n \phi(n)$ and $\lim_n \phi(-n)$ are well defined extended valued numbers. Then φ ∘ f is measurable.

Proof 8.8.3. Assume first that f is non negative. Then by Theorem 8.8.2, there is a sequence of finite non negative increasing functions $(f_n)$ which are measurable and satisfy $f_n \uparrow f$. Let E be the set of points where f is infinite. Then, $f_n(x) = n$ when x is in E and

\[ \lim_n f_n(x) = \begin{cases} f(x), & x \in E^C, \\ \infty, & x \in E. \end{cases} \]

Thus, since φ is continuous, f is finite on $E^C$ and $\lim_n \phi(n)$ is a well-defined extended valued real number, we have

\[ \lim_n \phi\left(f_n(x)\right) = \begin{cases} \phi(f(x)), & x \in E^C, \\ \lim_n \phi(n), & x \in E. \end{cases} \]


8.9. HOMEWORK CHAPTER 8. MEASURABILITY

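Let $\lim_n \phi(n) = \beta$ in [−∞, ∞]. Thus, if β is finite, we have

\[ \lim_n \phi\left(f_n(x)\right) = \left(\phi \circ (f I_{E^C})\right)(x)\, I_{E^C}(x) + \beta\, I_E(x) \]

which is measurable since the first part is measurable by Lemma 8.8.1, applied to the finite measurable function $f I_{E^C}$, together with Lemma 8.5.1, and the second part is measurable since E is a measurable set by Lemma 8.6.2. If β = ∞, we have

\[ \lim_n \phi\left(f_n(x)\right) = \begin{cases} \phi(f(x)), & x \in E^C, \\ \infty, & x \in E. \end{cases} \]

Now apply Lemma 8.6.2. Since E is measurable and $f_1$ defined by

\[ f_1(x) = \begin{cases} \phi(f(x)), & x \in E^C, \\ 0, & x \in E, \end{cases} \]

is measurable, we see $\lim_n \phi\left(f_n(x)\right)$ is measurable. A similar argument holds if β = −∞. We conclude that if f is non negative, φ ∘ f interpreted as above is a measurable function.

Thus, if f is arbitrary, the argument above shows that $\phi \circ f^+$ and $\phi \circ f^-$ are measurable. This implies that $\phi \circ f = \phi \circ (f^+ - f^-)$ is measurable when interpreted properly.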

8.9 Homework

Exercise 8.9.1. If a, b and c are real numbers, define the value in the middle, mid(a, b, c) by

\[ \text{mid}(a, b, c) = \inf \{ \sup\{a, b\}, \sup\{a, c\}, \sup\{b, c\} \}. \]

Let X be a nonempty set and S a sigma - algebra on X . Let f1, f2, f3 ∈ M(X,S). Prove the function h

defined pointwise by h(x) = mid(f1(x), f2(x), f3(x)) is measurable.

Exercise 8.9.2. Let X be a nonempty set and S a sigma - algebra on X . Let f ∈ M(X,S) and A > 0.

Define $f_A$ by

\[ f_A(x) = \begin{cases} f(x), & |f(x)| \leq A, \\ A, & f(x) > A, \\ -A, & f(x) < -A. \end{cases} \]

Prove fA is measurable.

Exercise 8.9.3. Let X be a nonempty set and S a sigma - algebra on X . Let f ∈ M(X,S) and assume

there is a positive K so that 0 ≤ f(x) ≤ K for all x. Prove the sequence φn of functions given in Theorem

8.8.2 converges uniformly to f on X .

Exercise 8.9.4. Let X and Y be nonempty sets and let f : X → Y be given. Prove that if T is a sigma -

algebra of subsets of Y, then {f⁻¹(E) | E ∈ T} is a sigma - algebra of subsets of X.


Exercise 8.9.5. This problem needs to go into the next chapter! Let (X,S) be a measurable space. Let

(µn) be a sequence of measures on S with µn(X) ≤ 1 for all n. Define λ on S by

\[
\lambda(E) = \sum_{n=1}^{\infty} \frac{1}{2^n}\, \mu_n(E)
\]

for all measurable E. Prove λ is a measure on S.


Chapter 9

Measure And Integration

Once we have a nonempty set X with a given sigma - algebra S, we can develop an abstract version of

integration. To motivate this, consider the Borel sigma - algebra on ℜ, B. We know how to develop and use an integration theory that is based on finite intervals of the form [a, b] for bounded functions. Hence, we have learned to understand and perform integrations of the form ∫_a^b f(t) dt for the standard Riemann integral. We could also write this as

\[
\int_a^b f(t)\,dt = \int_{[a,b]} f(t)\,dt
\]

and we have learned that

\[
\int_{[a,b]} f(t)\,dt = \int_{(a,b)} f(t)\,dt = \int_{(a,b]} f(t)\,dt = \int_{[a,b)} f(t)\,dt.
\]

Note that we can thus compute ∫_E f(t) dt for sets E ∈ B which are finite intervals of the form [a, b], (a, b], [a, b) and (a, b). We can extend this easily to finite unions of disjoint intervals of this form by taking advantage of Theorem 4.5.3 to see

\[
\int_{\cup_n E_n} f(t)\,dt = \sum_n \int_{E_n} f(t)\,dt.
\]

However, the development of the Riemann integral is closely tied to the interval [a, b] and so it is difficult

to extend these integrals to arbitrary elements F of B. Still, we can see that the Riemann integral is defined

on some subset of the sigma - algebra B.

From our discussions of the Riemann - Stieltjes integral, we know that the Riemann integral can be interpreted as a Riemann - Stieltjes integral with the integrator given by the identity function id(x) = x. Let’s switch to a new notation. Define the function µ(x) = x. Then for our allowable E, we can write ∫_E f(t) dt = ∫_E f(t) dµ(t) which we can further simplify to ∫_E f dµ as usual. Note that µ is a function

CHAPTER 9. ABSTRACT INTEGRATION

which assigns a real value which we interpret as length to all of the allowable sets E we have been

discussing. In fact, note µ is a mapping which satisfies

(i): If E is the empty set, then the length of E is 0; i.e. µ(∅) = 0.

(ii): If E is the finite interval [a, b], (a, b], [a, b) or (a, b), µ(E) = b− a.

(iii): If (En) is a finite collection of disjoint intervals, then the length of the union is clearly the sum of the individual lengths; i.e. µ(∪n En) = ∑n µ(En).

However, µ is not defined on the entire sigma - algebra. Also, it seems that we would probably like to extend (iii) above to countable disjoint unions as it is easy to see how that would arise in practice. If we could find a way to extend the usual length calculation of an interval to the full sigma - algebra, we could then try to extend the notion of integration as well.

It turns out we can do all of these things but we cannot do it by reusing our development process from Riemann integration. Instead, we must focus on developing a theory that can handle integrators which are mappings µ defined on a full sigma - algebra. It is time to precisely define what we mean by such a mapping.

Definition 9.0.1 (Measures).
Let X be a nonempty set and S a sigma - algebra of subsets of X. We say the extended real valued mapping µ : S → ℜ̄ is a measure on S if

(i): µ(∅) = 0,

(ii): µ(E) ≥ 0, for all E ∈ S,

(iii): µ is countably additive on S; i.e. if (En) ⊆ S is a countable collection of disjoint sets, then µ(∪n En) = ∑n µ(En).

We also say (X,S, µ) is a measure space. If µ(X) is finite, we say µ is a finite measure. Also, even if µ(X) = ∞, the measure µ is “almost finite” if we can find a collection of measurable sets (Fn) so that X = ∪n Fn with µ(Fn) finite for all n. In this case, we say the measure µ is σ - finite.

We can drop the requirement that the mapping µ be non negative. The resulting mapping is called a

charge instead of a measure. This will be important later.

Definition 9.0.2 (Charges).
Let X be a nonempty set and S a sigma - algebra of subsets of X. We say ν : S → ℜ is a charge on S if

(i): ν(∅) = 0,

(ii): ν is countably additive on S; i.e. if (En) ⊆ S is a countable collection of disjoint sets, then ν(∪n En) = ∑n ν(En).

Note that we want the value of the charge to be finite on all members of S as otherwise we could potentially have trouble with subsets having value ∞ and −∞ inside a given set. That would then lead to undefined ∞ − ∞ operations.

Let’s look at some examples:

Example 9.0.1. Let X be any nonempty set and let the sigma - algebra be S = P(X), the power set of X. Define µ1 on S by µ1(E) = 0 for all E. Then µ1 is a measure, albeit not very interesting! Another uninteresting measure is defined by µ2(E) = ∞ if E is not empty and 0 if E = ∅.

Example 9.0.2. Let X be any nonempty set and again let S = P(X). Pick any element p in X. Define µ by µ(E) = 0 if p ∉ E and 1 if p ∈ E. Then µ is a measure.

Example 9.0.3. Let X be the counting numbers, N, and S = P(N). Define µ by letting µ(E) be the cardinality of E if E is a finite set and ∞ otherwise. Then µ is a measure called the counting measure. Note that N = ∪n {1, . . . , n} and µ({1, . . . , n}) = n for all n, which implies µ is a σ - finite measure.
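
The measures of Examples 9.0.2 and 9.0.3 are easy to experiment with on finite sets. The following Python sketch is only an illustration under simplifying assumptions: sets are finite Python sets, so the value ∞ never arises, and the names counting_measure and point_mass are our own. It checks additivity over a small disjoint family, which is the finite shadow of countable additivity.

```python
def counting_measure(E):
    """Cardinality of E; this sketch only handles finite Python sets."""
    return len(E)

def point_mass(p):
    """Return the measure E -> 1 if p is in E and 0 otherwise (Example 9.0.2)."""
    return lambda E: 1 if p in E else 0

# Disjoint pieces of a finite subset of the counting numbers.
pieces = [{1, 2}, {5}, set(), {7, 8, 9}]
union = set().union(*pieces)

delta_5 = point_mass(5)
for mu in (counting_measure, delta_5):
    # Additivity over the disjoint pieces.
    assert mu(union) == sum(mu(E) for E in pieces)
    # The empty set has measure zero.
    assert mu(set()) == 0

print(counting_measure(union), delta_5(union))
```

Of course, a finite check like this proves nothing about countable additivity; it only makes the definitions concrete.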

Example 9.0.4. This example is just a look ahead to future material we will be covering. Let B be the extended Borel sigma - algebra. We will show later there is a measure λ : B → ℜ̄ that extends the usual idea of the length of an interval. That is, if E is a finite interval of the form (a, b), [a, b), (a, b] or [a, b], then the length of E is b − a and λ(E) = b − a. Further, if the interval has infinite length (for example, E is (−∞, a)), then λ(E) = ∞ also. The measure λ will be called Borel measure and since ℜ = ∪n [−n, n], we see Borel measure is a σ - finite measure. The sets in B are called Borel measurable sets.

Example 9.0.5. We will be able to show that there is a larger sigma - algebra M of subsets of ℜ and a measure µ defined on M which also returns the usual length of intervals. Hence, B ⊆ M strictly (i.e. there are sets in M not in B) with µ = λ on B. This measure will be called Lebesgue measure and the sets in M will be called Lebesgue measurable sets. The proof that there are Lebesgue measurable sets that are not Borel sets will require a non constructive argument using the Axiom of Choice. Further, we will be able to show that the Lebesgue sigma - algebra is not the entire power set as there are non Lebesgue measurable sets. The proof that such sets exist requires the use of the interesting functions built using Cantor sets discussed in Chapter 13.

Example 9.0.6. In the setting of Borel measure on ℜ, we will be able to show that if g is a continuous and monotone increasing function on ℜ, then there is a measure λg defined on B which satisfies

\[
\lambda_g(E) = \int_E dg
\]

for any finite interval E. Here, ∫_E dg is the usual Riemann - Stieltjes integral.

9.1 Some Basic Properties Of Measures

Lemma 9.1.1 (Monotonicity).
Let (X,S, µ) be a measure space. If E, F ∈ S with E ⊆ F, then µ(E) ≤ µ(F). Moreover, if µ(E) is finite, then µ(F \ E) = µ(F) − µ(E).

Proof 9.1.1. We know F = E ∪ (F \E) is a disjoint decomposition of F . By the countable additivity of

µ, it follows immediately that µ(F ) = µ(E) + µ(F \E). Since µ is non-negative, we see µ(F ) ≥ µ(E).

Finally, if µ(E) is finite, then subtraction is allowed in µ(F ) = µ(E) + µ(F \ E) which leads to

µ(F \ E) = µ(F )− µ(E).

Lemma 9.1.2 (The Measure Of Monotonic Sequence Of Sets).
Let (X,S, µ) be a measure space.

(i): If (En) is an increasing sequence of sets in S (i.e. En ⊆ En+1 for all n), then µ(∪n En) = limn µ(En).

(ii): If (Fn) is a decreasing sequence of sets in S (i.e. Fn+1 ⊆ Fn for all n) and µ(F1) is finite, then µ(∩n Fn) = limn µ(Fn).

Proof 9.1.2. To prove (i), if there is an index n0 where µ(En0) is infinite, then by the monotonicity of µ, we must have ∞ = µ(En0) ≤ µ(∪n En). Hence, µ(∪n En) = ∞. However, since En0 ⊆ En for all n ≥ n0, again by monotonicity, n ≥ n0 implies µ(En) = ∞. Thus, limn µ(En) = µ(∪n En) = ∞. On the other hand, if µ(En) is finite for all n, define the disjoint sequence of sets (An) as follows:

\[
\begin{aligned}
A_1 &= E_1 \\
A_2 &= E_2 \setminus E_1 \\
A_3 &= E_3 \setminus E_2 \\
&\;\;\vdots \\
A_n &= E_n \setminus E_{n-1}
\end{aligned}
\]


We see ∪n An = ∪n En and since µ is countably additive, we must have µ(∪n An) = ∑n µ(An). Since by assumption µ(En) is finite in this case, we know µ(An) = µ(En) − µ(En−1) for n ≥ 2. It follows that

\[
\sum_{k=1}^{n} \mu(A_k) = \mu(E_1) + \sum_{k=2}^{n} \bigl( \mu(E_k) - \mu(E_{k-1}) \bigr) = \mu(E_1) + \mu(E_n) - \mu(E_1) = \mu(E_n).
\]

We conclude

\[
\mu(\cup_n E_n) = \mu(\cup_n A_n) = \lim_n \sum_{k=1}^{n} \mu(A_k) = \lim_n \mu(E_n).
\]

This proves the validity of (i). Next, for (ii), construct the sequence of sets (En) by

\[
\begin{aligned}
E_1 &= \emptyset \\
E_2 &= F_1 \setminus F_2 \\
E_3 &= F_1 \setminus F_3 \\
&\;\;\vdots \\
E_n &= F_1 \setminus F_n.
\end{aligned}
\]

Then (En) is an increasing sequence of sets and so by (i), µ(∪n En) = limn µ(En). Since µ(F1) is finite, we then know that µ(En) = µ(F1) − µ(Fn). Hence, µ(∪n En) = µ(F1) − limn µ(Fn). Next, note by De Morgan’s Laws,

\[
\mu(\cup_n E_n) = \mu\Bigl(\cup_n (F_1 \cap F_n^C)\Bigr) = \mu\Bigl(F_1 \cap \cup_n F_n^C\Bigr) = \mu\Bigl(F_1 \cap (\cap_n F_n)^C\Bigr) = \mu\Bigl(F_1 \setminus (\cap_n F_n)\Bigr).
\]

Thus, since µ(F1) is finite and ∩n Fn ⊆ F1, we have µ(∪n En) = µ(F1) − µ(∩n Fn). Combining these results, we have

\[
\mu(F_1) - \lim_n \mu(F_n) = \mu(F_1) - \mu(\cap_n F_n).
\]


The result then follows by canceling µ(F1) from both sides which is allowed as this is a finite number.
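
The telescoping step in the proof of part (i) can be checked numerically. The sketch below assumes a hypothetical finite measure on subsets of the counting numbers which places mass 2^(−k) at the point k; this measure and the cutoff N = 12 are our choices for illustration and do not come from the text. It builds En = {1, . . . , n}, forms the disjoint pieces An, and confirms that the partial sums of µ(Ak) equal µ(En) exactly.

```python
from fractions import Fraction

def mu(E):
    """Hypothetical finite measure: mass 2^-k at each counting number k."""
    return sum(Fraction(1, 2**k) for k in E)

N = 12
E = [set(range(1, n + 1)) for n in range(1, N + 1)]   # E_1, E_2, ..., E_N (increasing)
A = [E[0]] + [E[n] - E[n - 1] for n in range(1, N)]    # A_1 = E_1, A_n = E_n minus E_{n-1}

# Partial sums of mu(A_k) telescope to mu(E_n).
partial = Fraction(0)
telescopes = True
for n in range(N):
    partial += mu(A[n])
    telescopes = telescopes and (partial == mu(E[n]))

print(telescopes, mu(E[-1]))
```

Here µ(En) = 1 − 2^(−n), which increases to 1, the measure of the full union.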

We will now develop a series of ideas involving sequences of sets.

Definition 9.1.1 (Limit Inferior And Superior Of Sequences Of Sets).
Let X be a nonempty set and (An) be a sequence of subsets of X. The limit inferior of (An) is defined to be the set

\[
\liminf(A_n) = \underline{\lim}(A_n) = \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} A_n
\]

while the limit superior of (An) is defined by

\[
\limsup(A_n) = \overline{\lim}(A_n) = \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} A_n.
\]

It is convenient to have a better characterization of these sets.

Lemma 9.1.3 (Characterizing Limit Inferior And Superiors Of Sequences Of Sets).
Let (An) be a sequence of subsets of the nonempty set X. Then we have

lim inf(An) = {x ∈ X | x ∈ Ak for all but finitely many indices k} = B

and

lim sup(An) = {x ∈ X | x ∈ Ak for infinitely many indices k} = C.

Proof 9.1.3. We will prove the statement about lim inf(An) first. Let x ∈ B. If there are no indices k so that x ∉ Ak, then x ∈ ∩_{n=1}^∞ An telling us that x ∈ lim inf(An). On the other hand, if there are a finite number of indices k that satisfy x ∉ Ak, we can label these indices as k1, . . . , kp for some positive integer p. Let k∗ be the maximum index in this finite list. Then x ∈ Ak for all k > k∗, so x ∈ ∩_{n=k∗+1}^∞ An. This implies immediately that x ∈ lim inf(An). Conversely, if x ∈ lim inf(An), there is an index k0 so that x ∈ ∩_{n=k0}^∞ An. This implies that x can fail to be in at most a finite number of Ak where k < k0. Hence, x ∈ B.

Next, we prove that lim sup(An) = C. If x ∈ C, then if there were an index m0 so that x ∉ ∪_{n=m0}^∞ An, then x would belong to only a finite number of sets Ak which contradicts the definition of the set C. Hence, there is no such index m0 and so x ∈ ∪_{n=m}^∞ An for all m. This implies x ∈ lim sup(An). On the other hand, if x ∈ lim sup(An), then x ∈ ∪_{n=m}^∞ An for all m. So, if x was only in a finite number of sets An, there would be a largest index m∗ satisfying x ∈ Am∗ but x ∉ Am if m > m∗. Then x ∉ ∪_{n=m∗+1}^∞ An, which says x ∉ lim sup(An). This is a contradiction. Thus, our assumption that x was only in a finite number of sets An is false. This implies x ∈ C.
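
A small computation can make the two characterizations concrete. The following Python sketch uses a hypothetical period 2 sequence of sets, An = {0, 1} for even n and An = {0, 2} for odd n, which is our own choice. Because the sequence is periodic, truncating the infinite unions and intersections at a finite horizon gives the exact answer; for a general sequence such a truncation would only be an approximation.

```python
def A(n):
    """Hypothetical period-2 sequence: {0, 1} for even n, {0, 2} for odd n."""
    return {0, 1} if n % 2 == 0 else {0, 2}

def tail_sets(m, horizon):
    """The sets A_m, A_{m+1}, ..., A_horizon."""
    return [A(n) for n in range(m, horizon + 1)]

def lim_inf(M, horizon):
    """Union over m <= M of the intersections of A_n for m <= n <= horizon."""
    result = set()
    for m in range(1, M + 1):
        result |= set.intersection(*tail_sets(m, horizon))
    return result

def lim_sup(M, horizon):
    """Intersection over m <= M of the unions of A_n for m <= n <= horizon."""
    result = None
    for m in range(1, M + 1):
        tail_union = set().union(*tail_sets(m, horizon))
        result = tail_union if result is None else result & tail_union
    return result

print(lim_inf(10, 20), lim_sup(10, 20))
```

The point 0 lies in every An, so it is in the limit inferior, while 1 and 2 each lie in infinitely many An but miss infinitely many, so they are only in the limit superior.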


Lemma 9.1.4 (Limit Inferiors And Superiors Of Monotone Sequences Of Sets).
Let X be a nonempty set. Then

(i): If (An) is an increasing sequence of subsets of X, then

\[
\liminf(A_n) = \limsup(A_n) = \bigcup_{n=1}^{\infty} A_n.
\]

(ii): If (An) is a decreasing sequence of subsets of X, then

\[
\liminf(A_n) = \limsup(A_n) = \bigcap_{n=1}^{\infty} A_n.
\]

(iii): If (An) is an arbitrary sequence of subsets of X, then

∅ ⊆ lim inf(An) ⊆ lim sup(An).

Proof 9.1.4.
(i): If x ∈ lim sup(An), then x ∈ ∪_{n=1}^∞ An. Conversely, if x ∈ ∪_{n=1}^∞ An, there is an index n0 so that x ∈ An0. But since the sequence (An) is increasing, this means x ∈ An for all n > n0 also. Hence, x ∈ ∪_{n=m}^∞ An for all indices m ≥ n0. However, it is also clear that x is in any union that starts at an index smaller than n0. Thus, x must be in ∩_{m=1}^∞ ∪_{n=m}^∞ An. But this is the set lim sup(An). We conclude lim sup(An) = ∪_{n=1}^∞ An. Now look at the definition of lim inf(An). Since (An) is monotone increasing, ∩_{n=m}^∞ An = Am. Hence, it is immediate that lim inf(An) = ∪_{m=1}^∞ Am = ∪_{n=1}^∞ An.

(ii): The argument for this case is similar to the argument for case (i) and is left to you.

(iii): It suffices to show that lim inf(An) ⊆ lim sup(An). If x ∈ lim inf(An), by Lemma 9.1.3, x belongs to all but finitely many An. Hence, x belongs to infinitely many An. Then, applying Lemma 9.1.3 again, we have the result.

There will be times when it will be convenient to write an arbitrary countable union of sets as a countable union of disjoint sets. In the next result, we show how this is done.

Lemma 9.1.5 (Disjoint Decompositions Of Unions).
Let X be a nonempty set and let (An) be a sequence of subsets of X. Then there exists a sequence of mutually disjoint sets (Fn) satisfying ∪n An = ∪n Fn.

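Proof 9.1.5. Define sets En and Fn as follows:

\[
\begin{aligned}
E_0 &= \emptyset, & F_1 &= A_1 \setminus E_0 = A_1 \\
E_1 &= A_1, & F_2 &= A_2 \setminus E_1 = A_2 \setminus A_1 \\
E_2 &= A_1 \cup A_2, & F_3 &= A_3 \setminus E_2 = A_3 \setminus (A_1 \cup A_2) \\
E_3 &= \bigcup_{k=1}^{3} A_k, & F_4 &= A_4 \setminus E_3 = A_4 \setminus \Bigl(\bigcup_{k=1}^{3} A_k\Bigr) \\
&\;\;\vdots & &\;\;\vdots \\
E_n &= \bigcup_{k=1}^{n} A_k, & F_{n+1} &= A_{n+1} \setminus E_n = A_{n+1} \setminus \Bigl(\bigcup_{k=1}^{n} A_k\Bigr)
\end{aligned}
\]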

Note that (En) forms a monotonically increasing sequence of sets with ∪n An = ∪n En. We claim the sets Fn are mutually disjoint and ∪_{j=1}^n Fj = ∪_{j=1}^n Aj for all n. We do this by induction.

Proof. Basis: It is clear that F1 and F2 are disjoint and F1 ∪ F2 = A1 ∪ A2. Induction: We assume that the sets Fk are mutually disjoint for 1 ≤ k ≤ n and ∪_{j=1}^k Fj = ∪_{j=1}^k Aj for 1 ≤ k ≤ n as well. Then

\[
F_{n+1} = A_{n+1} \setminus E_n = A_{n+1} \cap \Bigl(\bigcup_{j=1}^{n} A_j\Bigr)^C = \bigcap_{j=1}^{n} \bigl(A_{n+1} \cap A_j^C\bigr).
\]

Now, by construction, Fj ⊆ Aj for all j. However, from the above expansion of Fn+1, we see Fn+1 ⊆ Aj^C for all 1 ≤ j ≤ n. This tells us Fn+1 ⊆ Fj^C for these indices also. We conclude Fn+1 is disjoint from all the previous Fj. This shows (Fj) is a collection of mutually disjoint sets for 1 ≤ j ≤ n + 1. This proves the first part of the assertion. To prove the last part, note

\[
\bigcup_{j=1}^{n+1} F_j = \Bigl(\bigcup_{j=1}^{n} F_j\Bigr) \cup F_{n+1} = \Bigl(\bigcup_{j=1}^{n} A_j\Bigr) \cup \Bigl(A_{n+1} \setminus \bigcup_{j=1}^{n} A_j\Bigr) = \bigcup_{j=1}^{n+1} A_j.
\]

This completes the induction step. We conclude that this proposition holds for all n.

Since the claim holds for every n, it is then immediate that ∪n Fn = ∪n An.
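
The construction in this proof translates directly into a short loop. The Python sketch below is an illustration for a finite list of finite sets only; the function name disjointify and the sample sets are our own. The variable seen plays the role of the running union En.

```python
def disjointify(sets):
    """Return F_1, F_2, ... where F_1 = A_1 and F_{n+1} is A_{n+1} with A_1, ..., A_n removed."""
    seen = set()      # plays the role of E_n, the running union A_1 ∪ ... ∪ A_n
    result = []
    for A in sets:
        result.append(A - seen)
        seen |= A
    return result

A = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
F = disjointify(A)
print(F)
```

Note that some of the Fn can be empty, as happens for the last set in this sample; this does no harm since the empty set is disjoint from every set.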

To finish this section on measures, we want to discuss the idea that a property holds except on a set of measure zero. Recall, this subject came up when we discussed the content of a subset of ℜ earlier in Section 5.3. However, we can extend this concept to an arbitrary measure space (X,S, µ) as follows.

Definition 9.1.2 (Propositions Holding Almost Everywhere).
Let (X,S, µ) be a measure space. We say a proposition P holds almost everywhere on X if {x ∈ X | P does not hold} has µ measure zero. We usually say the proposition holds µ a.e. rather than writing out the phrase µ almost everywhere. Also, if the measure µ is understood from context, we usually just say the proposition holds a.e. to make it even easier to write down.

Comment 9.1.1. Given the measure space (X,S, µ), if f and g are extended real valued functions on X which are measurable, we would say f = g µ a.e. if µ({x ∈ X | f(x) ≠ g(x)}) = 0.

Comment 9.1.2. Given the measure space (X,S, µ), if (fn) is a sequence of measurable extended real valued functions on X, and f : X → ℜ̄ is another measurable function on X, we would say fn converges pointwise a.e. to f if the set {x ∈ X | fn(x) ↛ f(x)} has measure 0. We would usually write fn → f pointwise µ a.e.

9.2 Integration

In this section, we will introduce an abstract notion of integration on the measure space (X,S, µ). Recall

that M(X,S) denotes the class of extended real valued measurable functions f on X . First we introduce

a standard notation for some useful classes of functions. When we want to restrict our attention to the non

negative members of M(X,S), we will use the notation that f ∈M+(X,S).

To construct an abstract integration process on the measure space (X,S, µ), we begin by defining the

integral of a class of functions which can be used to approximate any function f in M+(X,S).

Definition 9.2.1 (Simple Functions).
Let (X,S, µ) be a measure space and let f : X → ℜ be a function. We say f is a simple function if the range of f is a finite set and f is S measurable. This implies the following standard unique representation of f. Since the range is finite, there is a positive integer N and distinct numbers aj, 1 ≤ j ≤ N so that

(i): the sets Ej = f⁻¹(aj) are measurable and mutually disjoint for 1 ≤ j ≤ N,

(ii): X = ∪_{j=1}^N Ej,

(iii): f has the characterization

\[
f(x) = \sum_{j=1}^{N} a_j I_{E_j}(x).
\]

We then define the integral of a simple function as follows.


Definition 9.2.2 (The Integral Of A Simple Function).
Let (X,S, µ) be a measure space and let φ : X → ℜ be a simple function. Let

\[
\phi(x) = \sum_{j=1}^{N} a_j I_{E_j}(x),
\]

be the standard representation of φ where the numbers aj are distinct and the sets Ej are mutually disjoint, cover X, and are measurable for 1 ≤ j ≤ N for some positive integer N. Then the integral of φ with respect to the measure µ is the extended real valued number

\[
\int \phi \, d\mu = \sum_{j=1}^{N} a_j \, \mu(E_j).
\]

Comment 9.2.1. We note that ∫ φ dµ can be +∞. Recall our convention that 0 · ∞ = 0. Hence, if one of the values aj is 0, the contribution to the integral is 0 µ(Ej) which is 0 even if µ(Ej) = ∞. Further, note the 0 function on X can be defined as I∅ which is a simple function. Hence, ∫ 0 dµ = 0.
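
The convention 0 · ∞ = 0 has to be imposed by hand in most computing environments. As a minimal sketch, assuming a non negative simple function is handed to us as a list of pairs (aj, µ(Ej)) from its standard representation, the following Python function computes the integral of Definition 9.2.2. The name simple_integral and the pair encoding are our own; note that floating point arithmetic would return nan for 0.0 * inf, which is why the zero terms are skipped explicitly.

```python
import math

def simple_integral(terms):
    """Sum of a_j * mu(E_j) over (a_j, mu(E_j)) pairs, using the convention 0 * inf = 0."""
    total = 0.0
    for a, measure in terms:
        if a == 0 or measure == 0:
            continue          # a zero factor contributes 0, even against an infinite one
        total += a * measure
    return total

# phi = 3 on a set of measure 2 and 0 on a set of infinite measure.
print(simple_integral([(3.0, 2.0), (0.0, math.inf)]))
# psi = 1 on a set of infinite measure.
print(simple_integral([(1.0, math.inf)]))
```

The first integral is finite even though φ lives on a space of infinite measure, while the second is +∞.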

Using this, we can define the integral of any function in M+(X,S).

Definition 9.2.3 (The Integral Of A Non-negative Measurable Function).
Let (X,S, µ) be a measure space and let f ∈ M+(X,S). For convenience of notation, let F+ denote the collection of all non negative simple functions on X. Then, the integral of f with respect to the measure µ is the extended real valued number

\[
\int f \, d\mu = \sup \Bigl\{ \int \phi \, d\mu \;\Big|\; \phi \in F^+, \ \phi \le f \Bigr\}.
\]

If E ∈ S, we define the integral of f over E with respect to µ to be

\[
\int_E f \, d\mu = \int f I_E \, d\mu.
\]

It is time to prove some results about this new abstract version of integration.


Lemma 9.2.1 (Properties Of Simple Function Integrations).
Let (X,S, µ) be a measure space and let φ, ψ ∈ M+(X,S) be simple functions. Then,

(i): If c ≥ 0 is a real number, then cφ is also a simple function and ∫ cφ dµ = c ∫ φ dµ.

(ii): φ + ψ is also a simple function and ∫ (φ + ψ) dµ = ∫ φ dµ + ∫ ψ dµ.

(iii): The mapping λ : S → ℜ̄ defined by λ(E) = ∫_E φ dµ for all E in S is a measure.

Proof 9.2.1. Let φ have the standard representation

\[
\phi(x) = \sum_{j=1}^{N} a_j I_{E_j}(x),
\]

where the numbers aj are distinct, the sets Ej are mutually disjoint, cover X, and are measurable for 1 ≤ j ≤ N for some positive integer N. Similarly, let ψ have the standard representation

\[
\psi(x) = \sum_{k=1}^{M} b_k I_{F_k}(x),
\]

where the numbers bk are distinct, the sets Fk are mutually disjoint, cover X, and are measurable for 1 ≤ k ≤ M for some positive integer M. Now to the proofs of the assertions:

(i):

First, if c = 0, cφ = 0 and ∫ 0 dµ = 0 = 0 · ∫ φ dµ. Next, if c > 0, then it is easy to see cφ is a simple function with representation

\[
c\phi(x) = \sum_{j=1}^{N} c\, a_j I_{E_j}(x),
\]

and hence, by the definition of the integral of a simple function

\[
\int c\phi \, d\mu = \sum_{j=1}^{N} c\, a_j \, \mu(E_j) = c \Bigl( \sum_{j=1}^{N} a_j \, \mu(E_j) \Bigr) = c \int \phi \, d\mu.
\]

(ii):

This one is more interesting to prove. First, to prove φ + ψ is a simple function, all we have to do is find

its standard representation. From the standard representations of φ and ψ, it is clear the sets Fk ∩Ej are


mutually disjoint and since X = ∪j Ej = ∪k Fk, we have the identities

\[
F_k = \bigcup_{j=1}^{N} (F_k \cap E_j), \quad \text{and} \quad E_j = \bigcup_{k=1}^{M} (F_k \cap E_j).
\]

Now define h : X → ℜ by

\[
h(x) = \sum_{j=1}^{N} \sum_{k=1}^{M} (a_j + b_k) \, I_{F_k \cap E_j}(x).
\]

Next, since X = ∪j ∪k (Fk ∩ Ej), given x ∈ X, there are indices k0 and j0 so that x ∈ Fk0 ∩ Ej0. Thus,

\[
\phi(x) + \psi(x) = a_{j_0} I_{E_{j_0}}(x) + b_{k_0} I_{F_{k_0}}(x) = a_{j_0} + b_{k_0} = h(x).
\]

From the above argument, we see h(x) = φ(x) + ψ(x) for all x in X. It follows that the range of h is finite and hence it is a measurable simple function, but we still do not know its standard representation. To find the standard representation, let ci, 1 ≤ i ≤ P, be the distinct numbers in the collection {aj + bk | 1 ≤ j ≤ N, 1 ≤ k ≤ M}. Then let Ui be the set of index pairs (j, k) that satisfy ci = aj + bk. Finally, let

\[
G_i = \bigcup_{(j,k) \in U_i} (E_j \cap F_k).
\]

Since the sets Fk ∩ Ej are mutually disjoint, we have

\[
\mu(G_i) = \sum_{(j,k) \in U_i} \mu(E_j \cap F_k).
\]

It follows that

\[
h(x) = \sum_{i=1}^{P} c_i \, I_{G_i}(x)
\]

is the standard representation of h = φ + ψ. Thus

\[
\begin{aligned}
\int h \, d\mu = \int (\phi + \psi) \, d\mu &= \sum_{i=1}^{P} c_i \, \mu(G_i) \\
&= \sum_{i=1}^{P} c_i \Bigl( \sum_{(j,k) \in U_i} \mu(E_j \cap F_k) \Bigr) \\
&= \sum_{i=1}^{P} \sum_{(j,k) \in U_i} c_i \, \mu(E_j \cap F_k).
\end{aligned}
\]


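But we know that the index sets Ui partition the collection of all index pairs (j, k), so

\[
\sum_{j=1}^{N} \sum_{k=1}^{M} = \sum_{i=1}^{P} \sum_{(j,k) \in U_i}.
\]

Hence, since ci = aj + bk for (j, k) ∈ Ui, we can write

\[
\begin{aligned}
\int (\phi + \psi) \, d\mu &= \sum_{j=1}^{N} \sum_{k=1}^{M} (a_j + b_k) \, \mu(E_j \cap F_k) \\
&= \sum_{j=1}^{N} \sum_{k=1}^{M} a_j \, \mu(E_j \cap F_k) + \sum_{j=1}^{N} \sum_{k=1}^{M} b_k \, \mu(E_j \cap F_k).
\end{aligned}
\]

This can be reorganized as

\[
\begin{aligned}
& \sum_{j=1}^{N} a_j \sum_{k=1}^{M} \mu(E_j \cap F_k) + \sum_{k=1}^{M} b_k \sum_{j=1}^{N} \mu(E_j \cap F_k) \\
&= \sum_{j=1}^{N} a_j \, \mu\Bigl( \bigcup_{k=1}^{M} (E_j \cap F_k) \Bigr) + \sum_{k=1}^{M} b_k \, \mu\Bigl( \bigcup_{j=1}^{N} (E_j \cap F_k) \Bigr) \\
&= \sum_{j=1}^{N} a_j \, \mu(E_j) + \sum_{k=1}^{M} b_k \, \mu(F_k) \\
&= \int \phi \, d\mu + \int \psi \, d\mu.
\end{aligned}
\]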

(iii):

Given

\[
\phi(x) = \sum_{j=1}^{N} a_j I_{E_j}(x),
\]

it is easy to see that

\[
\phi I_E(x) = \sum_{j=1}^{N} a_j I_{E \cap E_j}(x).
\]

Further, it is straightforward to show that the mappings µj : S → ℜ̄ defined by µj(A) = µ(A ∩ Ej) for all A in S are measures on the sigma - algebras S ∩ Ej for each 1 ≤ j ≤ N. It is also easy to see that the finite linear combination of these measures given by ξ = ∑_{j=1}^N aj µj is a measure on S itself. Thus, applying part (ii) of this lemma, we see

\[
\begin{aligned}
\lambda(E) = \int \phi I_E \, d\mu &= \int \phi \, I_{\cup_{j=1}^{N} (E \cap E_j)} \, d\mu \\
&= \int \Bigl( \sum_{j=1}^{N} \phi I_{E \cap E_j} \Bigr) d\mu = \sum_{j=1}^{N} \int a_j I_{E \cap E_j} \, d\mu \\
&= \sum_{j=1}^{N} a_j \, \mu(E \cap E_j) = \sum_{j=1}^{N} a_j \, \mu_j(E) = \xi(E).
\end{aligned}
\]

We conclude λ = ξ and λ is a measure on S.
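
Part (ii) of the lemma can be checked by brute force on a finite measure space. The sketch below assumes a hypothetical space X = {0, . . . , 5} with point weights w, so that µ(E) is the sum of the weights of the points in E; the weights and the two simple functions are arbitrary illustrative choices. The function integral builds the level sets Ej = f⁻¹(aj) and applies Definition 9.2.2 directly.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical finite measure space: X = {0,...,5} with point weights w.
w = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2),
     3: Fraction(1), 4: Fraction(1, 4), 5: Fraction(3)}

def mu(E):
    """Measure of E: the sum of the weights of its points."""
    return sum(w[x] for x in E)

def integral(f):
    """Integrate via the standard representation: group points by value, sum a_j * mu(E_j)."""
    level_sets = defaultdict(set)
    for x in w:
        level_sets[f(x)].add(x)          # E_j is the preimage of the value a_j
    return sum(a * mu(E) for a, E in level_sets.items())

phi = lambda x: 2 if x < 3 else 5        # values a_j on the sets E_j
psi = lambda x: 1 if x % 2 == 0 else 4   # values b_k on the sets F_k
h = lambda x: phi(x) + psi(x)

print(integral(h) == integral(phi) + integral(psi))
```

Exact rational arithmetic is used so that the comparison is an honest equality and not a floating point coincidence.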

Lemma 9.2.2 (Monotonicity Of The Abstract Integral For Non Negative Functions).
Let (X,S, µ) be a measure space and let f and g be in M+(X,S) with f ≤ g. Then,

\[
\int f \, d\mu \le \int g \, d\mu.
\]

Further, if E ⊆ F with E and F measurable sets, then ∫_E f dµ ≤ ∫_F f dµ.

Proof 9.2.2. Let φ be a non negative simple function which is dominated by f; i.e., φ ≤ f. Then φ is also dominated by g and so by the definition of the integral of f, we have

\[
\int f \, d\mu = \sup \Bigl\{ \int \phi \, d\mu \;\Big|\; 0 \le \phi \le f \Bigr\} \le \sup \Bigl\{ \int \psi \, d\mu \;\Big|\; 0 \le \psi \le g \Bigr\} = \int g \, d\mu.
\]

Next, if E ⊆ F with E and F measurable sets, then f I_E ≤ f I_F and from the first result, we have

\[
\int f I_E \, d\mu \le \int f I_F \, d\mu,
\]

which implies the result we seek.

9.3 Complete Measures And Equality a.e.

We know if a sequence of extended real - valued measurable functions (fn) converges pointwise to a function f, then the limit function is also measurable. But what if the convergence was pointwise a.e.? Is it still true that the limit function is also measurable? In general, the answer is no. We have to add an additional property to the measure. We will motivate this with an example that we are not really fully prepared for, but it should make sense anyway. The argument below will also be placed in Chapter 13 for completeness of our exposition.

Let B ∩ [0, 1] denote the Borel sigma - algebra of subsets of [0, 1]. We will be able to show in later chapters that there is a measure called Lebesgue measure, µL, defined on a sigma - algebra of subsets L, the Lebesgue sigma - algebra, which extends the usual meaning of length in the following sense. If [a, b] is a finite interval then the length of [a, b] is the finite number b − a. Denote this length by ℓ([a, b]). Then we can show that

\[
\mu_L([a, b]) = \ell([a, b]) = b - a.
\]


We can show also that every subset in B is also in L. The restriction of µL to B is called Borel measure

and we will denote it by µB .

We can argue that the Borel sigma - algebra is strictly contained in the Lebesgue sigma - algebra by using the special functions we construct in Chapter 13. In that chapter, we show if C is a Cantor set constructed from the generating sequence (an) where lim 2^n an = 0, the content of C is 0. Then if we let Ψ be the mapping discussed in Section 13.3 for this C, we define the mapping g : [0, 1] → [0, 1] by g(x) = (Ψ(x) + x)/2. The mapping g is quite nice: it is 1 − 1, onto, strictly increasing and continuous. We also showed in the exercises in Section 13.3 that g(C) is another Cantor set with lim 2^n a′n = 1/2, where (a′n) is the generating sequence for g(C).

Now it turns out that the notion of content and Lebesgue measure coincide. Thus, we can say since C

is a Borel set,

µB(C) = µL(C) = 0.

Also, we can show that since lim 2^n a′n = 1/2,

µB(g(C)) = µL(g(C)) = 1/2.

A nonconstructive argument we will present later using the Axiom of Choice allows us to show that any Lebesgue measurable set with positive Lebesgue measure must contain a subset which is not in the Lebesgue sigma - algebra. So since µL(g(C)) = 1/2, there is a set F ⊆ g(C) which is not in L. Thus, g⁻¹(F) ⊆ C which has Lebesgue measure 0. Lebesgue measure is a measure which has the property that every subset of a set of measure 0 must be in the Lebesgue sigma - algebra. Then, using the monotonicity of µL, we have µL(g⁻¹(F)) is also 0. From the above remarks, we can infer something remarkable.

Let the mapping h be defined to be g⁻¹. Then h is also continuous and hence it is measurable with respect to the Borel sigma - algebra. Note since B ⊆ L, this tells us immediately that h is also measurable with respect to the Lebesgue sigma - algebra. Thus, h⁻¹(U) is in the Borel sigma - algebra for all Borel sets U. But we know h⁻¹ = g, so this tells us g(U) is in the Borel sigma - algebra if U is a Borel set. Hence, if we chose U = g⁻¹(F), then g(U) = F would have to be a Borel set if U is a Borel set. However, we know that F is not in L and so it is also not a Borel set. We can only conclude that g⁻¹(F) cannot be a Borel set. However, g⁻¹(F) is in the Lebesgue sigma - algebra. Thus, there are Lebesgue measurable sets which are not Borel! Thus, the Borel sigma - algebra is strictly contained in the Lebesgue sigma - algebra!

We can use this example to construct another remarkable thing.

Comment 9.3.1. Using all the notations from above, note the indicator function of C^C, the complement of C, is defined by

\[
I_{C^C}(x) = \begin{cases} 1, & x \in C^C \\ 0, & x \in C. \end{cases}
\]

We see f = I_{C^C} is Borel measurable. Next, define a new mapping like this:

\[
\phi(x) = \begin{cases} 1, & x \in C^C \\ 2, & x \in C \setminus g^{-1}(F) \\ 3, & x \in g^{-1}(F). \end{cases}
\]

Note that φ = f a.e. with respect to Borel measure. However, φ is not Borel measurable because φ⁻¹(3) is the set g⁻¹(F) which is not a Borel set.

We conclude that in this case, even though the two functions were equal a.e. with respect to Borel

measure, only one was measurable! The reason this happens is that even though C has Borel measure 0,

there are subsets of C which are not Borel sets!

Hence, in some situations, we will have to stipulate that the measure we are working with has the

property that every subset of a set of measure zero is measurable. We make this formal with a definition.

Definition 9.3.1 (Complete Measure).
Let X be a nonempty set and (X,S, µ) be a measure space. If F ∈ S whenever F ⊆ E for some E ∈ S with µ(E) = 0, we say µ is a complete measure. Further, it follows immediately that since µ(F) ≤ µ(E) = 0, that µ(F) = 0 also.

Comment 9.3.2. This example above can be used in another way. Consider the composition of a measurable indicator function with the inverse of the function g defined above. For convenience, let W = g⁻¹(F) which is Lebesgue measurable. Then IW is a measurable function. Consider

\[
\bigl( I_W \circ g^{-1} \bigr)(x) = \begin{cases} 1, & g^{-1}(x) \in W \\ 0, & g^{-1}(x) \in W^C \end{cases} = \begin{cases} 1, & x \in g(W) \\ 0, & x \in g(W^C) \end{cases} = \begin{cases} 1, & x \in F \\ 0, & x \in F^C \end{cases} = I_F(x).
\]

But IF is not a measurable function as F is not a measurable set! Hence, the composition of the measurable function IW and the continuous function g⁻¹ is not measurable. This is why we can only prove measurability with the order of the composition reversed as we did in Lemma 8.8.1.

Theorem 9.3.1 (Equality a.e. Implies Measurability If The Measure Is Complete).
Let X be a nonempty set and (X,S, µ) be a complete measure space. Let f and g both be extended real valued functions on X with f = g a.e. Then, if f is measurable, so is g.

Proof 9.3.1. We leave it as an exercise for you to show that an equivalent condition for measurability is to prove {x | f(x) ∈ G} = f⁻¹(G) is measurable for all open sets G. Then to prove this result, let G be open in ℜ and let E = {x | f(x) ≠ g(x)}. Then, by assumption, E is measurable and µ(E) = 0. Then, we claim

\[
g^{-1}(G) = \bigl( g^{-1}(G) \cap E \bigr) \cup \bigl( f^{-1}(G) \setminus E \bigr).
\]

If x is in g⁻¹(G), then x is in E or x is in E^C. If x ∈ E, then x is in g⁻¹(G) ∩ E. If x is in the complement of E, g and f must match at x, so f(x) = g(x) ∈ G and x is in f⁻¹(G) \ E. Thus, we see x is in the right hand side. Conversely, if x is in

g⁻¹(G) ∩ E, x is clearly in g⁻¹(G). Finally, if x is in f⁻¹(G) \ E, then since x is not in E, f(x) = g(x). Thus, x ∈ g⁻¹(G) also. This shows the right hand side is contained in the left hand side. Combining these arguments, we conclude the two sets must be equal.

Since g⁻¹(G) ∩ E is a subset of E, the completeness of µ implies that g⁻¹(G) ∩ E is measurable and has measure 0. The measurability of f tells us that f⁻¹(G) \ E is also measurable. Hence, g⁻¹(G) is measurable implying g is measurable.

If the measure µ is not complete, we can still prove the following.

Theorem 9.3.2 (Equality a.e. Can Imply Measurability Even If The Measure Is Not Complete).
Let X be a nonempty set and (X,S, µ) be a measure space. Let f and g both be extended real valued functions on X with f = g on the measurable set E^C with µ(E) = 0. Then, if f is measurable and g is constant on E, g is measurable.

Proof 9.3.2. We will repeat the notation of the previous theorem’s proof. As before, if G is open, we can write

\[
g^{-1}(G) = \bigl( g^{-1}(G) \cap E \bigr) \cup \bigl( f^{-1}(G) \setminus E \bigr).
\]

Then, since g is constant on E with value say c, we have

\[
g^{-1}(G) = \begin{cases} E, & c \in G \\ \emptyset, & c \notin G \end{cases} \ \cup \ \bigl( f^{-1}(G) \setminus E \bigr) = \begin{cases} E \cup \bigl( f^{-1}(G) \setminus E \bigr), & c \in G \\ f^{-1}(G) \setminus E, & c \notin G. \end{cases}
\]

In both cases, the resulting set is measurable. Hence, we conclude g is measurable.

Comment 9.3.3. In Comment 9.3.1, we set

\[
\phi(x) = \begin{cases} 1, & x \in C^C \\ 2, & x \in C \setminus g^{-1}(F) \\ 3, & x \in g^{-1}(F) \end{cases}
\]

and since φ was not constant on E = C, φ was not measurable. However, if we had defined

\[
\phi(x) = \begin{cases} 1, & x \in C^C \\ c, & x \in C, \end{cases}
\]

then φ would have been measurable!

9.4 Convergence Theorems

We are now ready to look at various types of interchange theorems for abstract integrals. We will be

able to generalize the results of Chapter 5 substantially. There are three basic results: (i) The Monotone


Convergence Theorem, (ii) Fatou’s Lemma and (iii) The Lebesgue Dominated Convergence Theorem. We

will examine each in turn.

Theorem 9.4.1 (The Monotone Convergence Theorem).
Let (X,S, µ) be a measure space and let (fn) be an increasing sequence of functions in M+(X,S). Let f : X → ℜ̄ be an extended real valued function such that fn → f pointwise on X. Then f is also in M+(X,S) and

\[
\lim_n \int f_n \, d\mu = \int f \, d\mu.
\]

Proof 9.4.1. Since fn converges to f pointwise, we know that f is measurable by Theorem 8.7.3. Further, since fn ≥ 0 for all n on X, it is clear that f ≥ 0 also. Thus, f ∈ M+(X,S). Since fn ≤ fn+1 ≤ f, the monotonicity of the integral tells us

\[
\int f_n \, d\mu \le \int f_{n+1} \, d\mu \le \int f \, d\mu.
\]

Hence, (∫ fn dµ) is an increasing sequence of extended real numbers bounded above by ∫ f dµ, and so it has a limit. Of course, this limit could be ∞. Thus, we have the inequality

\[
\lim_n \int f_n \, d\mu \le \int f \, d\mu.
\]

We now show the reverse inequality, ∫ f dµ ≤ limn ∫ fn dµ. Let α be in (0, 1) and choose any non negative simple function φ which is dominated by f. Let

\[
A_n = \{ x \mid f_n(x) \ge \alpha \, \phi(x) \}.
\]

We claim that X = ∪nAn. If this was not true, then there would be an x which is not in any An. This

implies x is in ∩nACn . Thus, using the definition of An, fn(x) < α φ(x) for all n. Since fn is increasing

and converges pointwise to f , this tells us

f(x) ≤ α φ(x) ≤ α f(x).

We can rewrite this as (1 − α) f(x) ≤ 0 and since 1 − α is positive by assumption, we can conclude

f(x) ≤ 0. But f is non negative, so combining, we see f(x) = 0. Since f dominates φ, we must have

φ(x) = 0 too. However, if this is true, fn(x) must be 0 also. Hence, fn(x) = 0 ≥ αφ(x) = 0 for all n.

This says x ∈ An for all n. This is a contradiction; thus, X = ∪nAn.

Next, since fn and αφ are measurable, so is fn − αφ. This implies {x | fn(x) − αφ(x) ≥ 0} is a measurable set; that is, An is measurable for all n. Further, it is easy to see An ⊆ An+1 for all n; hence, (An) is an increasing sequence of measurable sets. Then, we know by the monotonicity of the integral, that

\[ \int_{A_n} \alpha \phi \, d\mu \le \int_{A_n} f_n \, d\mu \le \int f_n \, d\mu \le \int f \, d\mu. \]

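Next, we know that λ(E) = $\int_E \phi \, d\mu$ defines a measure. Thus,

\[ \lim_n \lambda(A_n) = \lambda(\cup_n A_n) = \lambda(X). \]

Replacing λ by its meaning in terms of φ, we have

\[ \int \phi \, d\mu = \lim_n \int_{A_n} \phi \, d\mu. \]

Multiplying through by the positive number α, and using $\int_{A_n} f_n \, d\mu \le \int f_n \, d\mu$, we have

\[ \alpha \int \phi \, d\mu = \lim_n \int_{A_n} \alpha \, \phi \, d\mu \le \lim_n \int_{A_n} f_n \, d\mu \le \lim_n \int f_n \, d\mu. \]

Thus, for all α ∈ (0, 1), we have

\[ \alpha \int \phi \, d\mu \le \lim_n \int f_n \, d\mu, \]

where the right hand side does not depend on α. Letting α → 1, we obtain

\[ \int \phi \, d\mu \le \lim_n \int f_n \, d\mu. \]

Since the above inequality is valid for all non negative simple functions dominated by f, we have immediately

\[ \int f \, d\mu \le \lim_n \int f_n \, d\mu, \]

which provides the other inequality we need to prove the result.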
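To see the theorem at work on one concrete case, here is a small sketch in Python (our choice of language for illustration only). It assumes f(x) = x on [0, 1] with Lebesgue measure and uses the dyadic simple functions φn = ⌊2^n f⌋ / 2^n, a standard construction which may differ in detail from the one behind Theorem 8.8.2. The function name is hypothetical. On each interval [k/2^n, (k+1)/2^n) the function φn has the constant value k/2^n, so its integral is a finite sum that we can compute exactly.

```python
from fractions import Fraction


def integral_of_dyadic_approximation(n: int) -> Fraction:
    """Integral over [0, 1] of phi_n = floor(2^n f) / 2^n for f(x) = x.

    Assumption: Lebesgue measure on [0, 1], so each dyadic interval
    [k / 2^n, (k + 1) / 2^n) has measure 1 / 2^n.
    """
    step = Fraction(1, 2 ** n)
    total = Fraction(0)
    for k in range(2 ** n):
        # phi_n equals k / 2^n on a set of measure 1 / 2^n
        total += (k * step) * step
    return total


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, integral_of_dyadic_approximation(n))
```

The computed values equal 1/2 − 1/2^(n+1), so they increase with n and converge up to ∫ f dµ = 1/2, as the Monotone Convergence Theorem predicts.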

This has an immediate extension to series of non-negative functions.

Theorem 9.4.2 (The Extended Monotone Convergence Theorem). Let (X, S, µ) be a measure space and let (gn) be a sequence of functions in M+(X, S). Then, the sequence of partial sums,

\[ S_n = \sum_{k=1}^{n} g_k, \]

converges pointwise on X to the extended real valued non negative function $S = \sum_{k=1}^{\infty} g_k$. Further, S is also in M+(X, S) and

\[ \lim_n \int S_n \, d\mu = \int S \, d\mu. \]

Comment 9.4.1. Once we establish Theorem 9.4.3 (below), we can rewrite this in series notation as

\[ \sum_{k=1}^{\infty} \int g_k \, d\mu = \int \sum_{k=1}^{\infty} g_k \, d\mu. \]

Proof 9.4.2. To prove this result, just apply the Monotone Convergence Theorem to the sequence of partial

sums (Sn).

The Monotone Convergence Theorem allows us to prove that this notion of integration is additive and

linear for positive constants.

Theorem 9.4.3 (Abstract Integration Is Additive). Let (X, S, µ) be a measure space and let f and g be functions in M+(X, S). Further, let α be a non negative real number. Then

(i): α f is in M+(X, S) and

\[ \int \alpha f \, d\mu = \alpha \int f \, d\mu. \]

(ii): Also, f + g is in M+(X, S) and

\[ \int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu. \]

Proof 9.4.3. (i): The case α = 0 is clear, so we may assume without loss of generality that α > 0. We know from Theorem 8.8.2 that there is a sequence of non negative simple functions (φn) which are increasing and converge up to f on X. Hence, since α > 0, we also know that α φn ↑ α f. Thus, by the Monotone Convergence Theorem, αf is in M+(X, S) and

\[ \lim_n \int \alpha \, \phi_n \, d\mu = \int \alpha f \, d\mu. \]

From Lemma 9.2.1, we know that ∫ αφn dµ = α ∫ φn dµ. Thus,

\[ \int \alpha f \, d\mu = \alpha \lim_n \int \phi_n \, d\mu = \alpha \int f \, d\mu. \]

(ii): If we apply Theorem 8.8.2 to f and g, we find two sequences of increasing simple functions (φn) and (ψn) so that φn ↑ f and ψn ↑ g. Thus, (φn + ψn) ↑ (f + g). Hence, by the Monotone Convergence Theorem, f + g is in M+(X, S) and

\[
\begin{aligned}
\int (f + g) \, d\mu &= \lim_n \int (\phi_n + \psi_n) \, d\mu = \lim_n \int \phi_n \, d\mu + \lim_n \int \psi_n \, d\mu \\
&= \int f \, d\mu + \int g \, d\mu.
\end{aligned}
\]

Comment 9.4.2. The conclusion of Theorem 9.4.2 can now be restated. By Theorem 9.4.3, each Sn is measurable and

\[ \int S_n \, d\mu = \sum_{k=1}^{n} \int g_k \, d\mu. \]

Thus, the left hand side can be written as

\[ \lim_n \int S_n \, d\mu = \lim_n \sum_{k=1}^{n} \int g_k \, d\mu = \sum_{k=1}^{\infty} \int g_k \, d\mu. \]

The limit function S can clearly be written as an infinite series giving

\[ \int S \, d\mu = \int \sum_{k=1}^{\infty} g_k \, d\mu. \]

Combining these statements, we get the result.

Theorem 9.4.4 (Fatou’s Lemma). Let (X, S, µ) be a measure space and let (fn) be a sequence of functions in M+(X, S). Then

\[ \int \liminf f_n \, d\mu \le \liminf \int f_n \, d\mu. \]

Proof 9.4.4. Recall

\[ \liminf f_n(x) = \sup_m \inf_{n \ge m} f_n(x) = \lim_m \inf_{n \ge m} f_n(x), \]

\[ \limsup f_n(x) = \inf_m \sup_{n \ge m} f_n(x) = \lim_m \sup_{n \ge m} f_n(x). \]

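Further, if we define $g_m = \inf_{n \ge m} f_n$, we know

\[ g_m \uparrow \liminf f_n. \]

It follows immediately that gm is measurable for all m and by the monotonicity of the integral

\[ \int g_m \, d\mu \le \int f_n \, d\mu \quad \forall \, n \ge m. \]

This implies that ∫ gm dµ is a lower bound for the set of numbers {∫ fn dµ : n ≥ m} and so by definition of the infimum,

\[ \int g_m \, d\mu \le \inf_{n \ge m} \left( \int f_n \, d\mu \right). \]

Let αm denote the number $\inf_{n \ge m} \left( \int f_n \, d\mu \right)$. Then, αm ↑ lim inf ∫ fn dµ. We see

\[ \lim_m \int g_m \, d\mu \le \lim_m \inf_{n \ge m} \left( \int f_n \, d\mu \right) = \lim_m \alpha_m = \liminf \int f_n \, d\mu. \]

But since gm ↑ lim inf fn, the Monotone Convergence Theorem gives limm ∫ gm dµ = ∫ lim inf fn dµ, and this implies

\[ \int \liminf f_n \, d\mu \le \liminf \int f_n \, d\mu. \]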
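The inequality in Fatou's Lemma can be strict. The following Python sketch uses a standard example which is our own choice and not taken from the text: on [0, 1] with Lebesgue measure let fn = n on the open interval (0, 1/n) and 0 elsewhere. The helper names are hypothetical, and the infimum over n ≥ m is replaced by a minimum over a finite range, which is exact here once the range passes 1/x because the functions are non negative and fn(x) = 0 for n ≥ 1/x.

```python
from fractions import Fraction


def f(n, x):
    """f_n(x) = n on the open interval (0, 1/n) and 0 elsewhere on [0, 1]."""
    return Fraction(n) if 0 < x < Fraction(1, n) else Fraction(0)


def integral_f(n):
    """Lebesgue integral of f_n over [0, 1]: height n times length 1/n."""
    return Fraction(n) * Fraction(1, n)


def tail_infimum(x, m, last):
    """Minimum of f_n(x) for m <= n <= last, a finite stand-in for inf over n >= m."""
    return min(f(n, x) for n in range(m, last + 1))


if __name__ == "__main__":
    x = Fraction(1, 10)
    print("tail infimum at x = 1/10:", tail_infimum(x, 1, 50))
    print("integrals:", [integral_f(n) for n in range(1, 6)])
```

Here lim inf fn(x) = 0 for every x, so ∫ lim inf fn dµ = 0, while ∫ fn dµ = 1 for every n, so lim inf ∫ fn dµ = 1.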

These results allow us to construct additional measures.

Theorem 9.4.5 (Constructing Measures From Non Negative Measurable Functions). Let (X, S, µ) be a measure space and let f be a function in M+(X, S). Then λ : S → ℜ defined by

\[ \lambda(E) = \int_E f \, d\mu, \quad E \in S \]

is a measure.

Proof 9.4.5. It is clear λ(∅) is 0 and that λ(E) is always non negative. To show that λ is countably

additive, let (En) be a sequence of disjoint measurable sets in S and let E = ∪nEn, be their union. Then

E is measurable. Define

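\[ f_n = \sum_{k=1}^{n} f I_{E_k} = f I_{\cup_{k=1}^{n} E_k}. \]

We note that $f_n \uparrow f I_E$ and so by the Monotone Convergence Theorem,

\[ \lambda(E) = \int_E f \, d\mu = \int f I_E \, d\mu = \lim_n \int f_n \, d\mu. \]

But,

\[ \int f_n \, d\mu = \sum_{k=1}^{n} \int f I_{E_k} \, d\mu = \sum_{k=1}^{n} \int_{E_k} f \, d\mu = \sum_{k=1}^{n} \lambda(E_k). \]

Combining, we have

\[ \lambda(E) = \lim_n \sum_{k=1}^{n} \lambda(E_k) = \sum_{k=1}^{\infty} \lambda(E_k), \]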

which proves that λ is countably additive.

Once we can construct another measure λ from a given measure µ , it is useful to think about their

relationship. One useful relationship is that of absolute continuity.

Definition 9.4.1 (Absolute Continuity Of A Measure). Let (X, S, µ) be a measure space and let λ be another measure defined on S. We say λ is absolutely continuous with respect to the measure µ if given E in S with µ(E) = 0, then λ(E) = 0 also. This is written as λ ≪ µ.

We can also now prove an important result set within the framework of functions which are equal a.e.

Lemma 9.4.6 (Function f Zero a.e. If and Only If Its Integral Is Zero). Let (X, S, µ) be a measure space and let f be a function in M+(X, S). Then f = 0 a.e. if and only if ∫ f dµ = 0.

Proof 9.4.6. (⇐): If ∫ f dµ = 0, then let En = (f(x) > 1/n). Note En ⊆ En+1 so that (En) is an increasing sequence. Since (En) is an increasing sequence, we also know limn µ(En) = µ(∪n En). Further,

\[ \cup_n E_n = \{ x \mid f(x) > 0 \}. \]

From the definition of En, we have

\[ f(x) \ge \frac{1}{n} I_{E_n}(x), \]

which implies

\[ 0 = \int f \, d\mu \ge \int \frac{1}{n} I_{E_n} \, d\mu = \frac{1}{n} \mu(E_n). \]

We see µ(En) = 0 for all n which implies

\[ \mu(f(x) > 0) = \lim_n \mu(E_n) = 0. \]

Hence, f is zero a.e.

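(⇒): If f is zero a.e., let E be the set where f(x) > 0. Let fn = n $I_E$. Note that

\[
\liminf f_n(x) = \sup_m \inf_{n \ge m} \begin{cases} n, & f(x) > 0 \\ 0, & f(x) = 0 \end{cases}
= \sup_m \begin{cases} m, & f(x) > 0 \\ 0, & f(x) = 0 \end{cases}
= \begin{cases} \infty, & f(x) > 0 \\ 0, & f(x) = 0. \end{cases}
\]

Clearly, f(x) ≤ lim inf fn(x) which implies ∫ f dµ ≤ ∫ lim inf fn dµ. Finally, by Fatou’s Lemma and the fact that µ(E) = 0, we find

\[ \int f \, d\mu \le \int \liminf f_n \, d\mu \le \liminf \int f_n \, d\mu = \liminf n \, \mu(E) = 0. \]

We conclude ∫ f dµ = 0.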

Comment 9.4.3. Given f in M+(X, S), Theorem 9.4.5 allows us to construct the new measure λ by λ(E) = $\int_E f \, d\mu$. If E has µ measure 0, then $f I_E$ is zero a.e. and we can use Lemma 9.4.6 to conclude that λ(E) = $\int f I_E \, d\mu$ = 0. Hence, a measure constructed in this way is absolutely continuous with respect to µ.
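A finite example makes this construction easy to check by hand. The Python sketch below assumes a hypothetical measure space with five points carrying point masses, so that every integral is a finite sum. The names MU, F, mu and lam are ours and are used for illustration only.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..4 with point masses mu({x}).
MU = {0: Fraction(1, 2), 1: Fraction(0), 2: Fraction(1, 4), 3: Fraction(2), 4: Fraction(0)}
# A non negative function f on the same points.
F = {0: Fraction(3), 1: Fraction(7), 2: Fraction(0), 3: Fraction(1, 2), 4: Fraction(5)}


def mu(E):
    """mu(E) as the sum of the point masses in E."""
    return sum((MU[x] for x in E), Fraction(0))


def lam(E):
    """lambda(E) = integral of f over E with respect to mu, a finite sum here."""
    return sum((F[x] * MU[x] for x in E), Fraction(0))


if __name__ == "__main__":
    print(lam({0, 3}), lam({0}) + lam({3}))  # additivity on disjoint sets
    print(mu({1, 4}), lam({1, 4}))           # a mu null set is a lambda null set
```

In this setting additivity of λ on disjoint sets is just regrouping a finite sum, and λ(E) = 0 whenever µ(E) = 0 because every term f(x) µ({x}) with x in E vanishes.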

We can now extend the Monotone Convergence Theorem slightly. It is often difficult to know that we

have pointwise convergence up to a limit function on all of X . The next theorem allows us to relax the

assumption to almost everywhere convergence as long as the underlying measure is complete.

Theorem 9.4.7 (The Extended Monotone Convergence Theorem Two). Let (X, S, µ) be a measure space with complete measure µ and let (fn) be an increasing sequence of functions in M+(X, S). Let f : X → ℜ be an extended real valued function such that fn → f pointwise a.e. on X. Then f is also in M+(X, S) and

\[ \lim_n \int f_n \, d\mu = \int f \, d\mu. \]

Proof 9.4.7. Let E be the set of points where fn does not converge to f . Then by assumption E has

measure 0 and fn ↑ f on $E^C$. Thus,

\[ f_n I_{E^C} \uparrow f I_{E^C} \]

and applying the Monotone Convergence Theorem, we have

\[ \lim_n \int f_n I_{E^C} \, d\mu = \int f I_{E^C} \, d\mu \]

and we can say $f I_{E^C}$ is in M+(X, S). Now f is equal to $f I_{E^C}$ a.e. and so although in general, f need not be measurable, since µ is a complete measure, we can invoke Theorem 9.3.1 to conclude that f is actually measurable. Hence, $f I_E$ is measurable too. Since µ(E) = 0, we thus know that

\[ \int f I_E \, d\mu = \int f_n I_E \, d\mu = 0. \]

Therefore, we have

\[
\begin{aligned}
\int f \, d\mu &= \int_E f \, d\mu + \int_{E^C} f \, d\mu = \int_{E^C} f \, d\mu \\
&= \lim_n \int_{E^C} f_n \, d\mu = \lim_n \left( \int_{E^C} f_n \, d\mu + \int_E f_n \, d\mu \right) = \lim_n \int f_n \, d\mu.
\end{aligned}
\]

Now to develop the Dominated Convergence Theorem, we need a few more concepts.

9.5 Extending Integration To Extended Real Valued Functions

The results of the previous sections can now be used to extend the notion of integration to general extended

real valued functions f in M(X,S).

Definition 9.5.1 (Summable Functions). Let (X, S, µ) be a measure space and f be in M(X, S). We say f is summable or integrable on X if ∫ f+ dµ and ∫ f− dµ are both finite. In this case, we define the integral of f on X with respect to the measure µ to be

\[ \int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu. \]

Also, if E is a measurable set, we define

\[ \int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu. \]

We let L1(X,S, µ) be the collection of summable functions on X with respect to the measure µ.

Comment 9.5.1. If f can be decomposed into two non negative measurable functions f1 and f2 as f = f1 − f2 a.e. with ∫ f1 dµ and ∫ f2 dµ both finite, then note since f = f+ − f− also, we have

\[ f_1 + f^- = f_2 + f^+ \quad \text{a.e.} \]

Thus, since all functions involved are summable,

\[ \int f_1 \, d\mu + \int f^- \, d\mu = \int f_2 \, d\mu + \int f^+ \, d\mu. \]

This implies that

\[ \int f_1 \, d\mu - \int f_2 \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu = \int f \, d\mu. \]

Hence, the value of the integral of f is independent of the decomposition.

There are a number of results that follow right away from this definition.

Theorem 9.5.1 (Summable Implies Finite a.e.). Let (X, S, µ) be a measure space and f be in L1(X, S, µ). Then the set of points where f is not finite has measure 0.

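Proof 9.5.1. Let En = (f(x) > n). Then it is easy to see that (En) is a decreasing sequence of sets. It is also clear that

\[ (f(x) = \infty) = \bigcap_n E_n. \]

Next, note

\[
\begin{aligned}
\int f^+ \, d\mu &= \int_{E_n} f^+ \, d\mu + \int_{E_n^C} f^+ \, d\mu \\
&\ge \int_{E_n} f^+ \, d\mu \ge n \, \mu(E_n).
\end{aligned}
\]

Thus, µ(En) ≤ (∫ f+ dµ)/n. Since the integral is a finite number, this tells us that each µ(En) is finite and limn µ(En) = 0. Because (En) is decreasing and µ(E1) is finite, we have

\[ \mu \left( \bigcap_n E_n \right) = \lim_n \mu(E_n) = 0. \]

This immediately implies that µ(f(x) = ∞) = 0.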

A similar argument shows that the set (f(x) = −∞) which is the same as the set (f−(x) = ∞) has

measure 0.

Theorem 9.5.2 (Summable Function Equal a.e. To Another Measurable Function Implies The Other Function Is Also Summable). Let (X, S, µ) be a measure space and f be in L1(X, S, µ). Then if g ∈ M(X, S) with f = g a.e., g is also summable.

Proof 9.5.2. Let E be the set of points in X where f and g are not equal. Then E has measure zero. We then have $f I_{E^C} = g I_{E^C}$ and so $g I_{E^C}$ must be summable. Further, $f^+ I_{E^C} = g^+ I_{E^C}$ and $f^- I_{E^C} = g^- I_{E^C}$. We then note that

\[ \int g^+ I_{E^C} \, d\mu = \int g^+ I_{E^C} \, d\mu + \int g^+ I_E \, d\mu \]

because $\int g^+ I_E \, d\mu = 0$ since E has measure zero. But then we see

\[ \int g^+ \, d\mu = \int g^+ I_{E^C} \, d\mu + \int g^+ I_E \, d\mu = \int f^+ I_{E^C} \, d\mu + \int f^+ I_E \, d\mu = \int f^+ \, d\mu. \]

Thus, we can see that ∫ g+ dµ is finite. A similar argument shows ∫ g− dµ is finite and so g is summable.

Theorem 9.5.3 (Summable Function Equal a.e. To Another Function With Measure Complete Implies The Other Function Is Also Summable). Let (X, S, µ) be a measure space with µ complete and f be in L1(X, S, µ). Then if g is a function equal a.e. to f, g is also summable.

Proof 9.5.3. First, the completeness of µ implies that g is measurable. The argument to show g is

summable is then the same as in the previous theorem’s proof.

We can extend the Monotone Convergence Theorem a bit more and actually construct a summable limit function

in certain instances. This is known as Levi’s Theorem.

Theorem 9.5.4 (Levi’s Theorem). Let (X, S, µ) be a measure space and let (fn) be a sequence of functions in L1(X, S, µ) which satisfy fn ≤ fn+1 a.e. Further, assume

\[ \lim_n \int f_n \, d\mu < \infty. \]

Then, there is a summable function f on X so that fn ↑ f a.e. and ∫ fn dµ ↑ ∫ f dµ.

Proof 9.5.4. Define the new sequence of functions (gn) by gn = fn − f1. Then, since (fn) is increasing

a.e., (gn) is increasing and non negative a.e. By assumption, limn ∫ gn dµ is then finite. Call its value I for convenience of exposition. Now define the function g pointwise on X by

\[ g(x) = \lim_n g_n(x). \]

This limit always exists as an extended real number in [0, ∞] and since each gn is measurable, so is g. Let E = (g(x) = ∞). Note that

\[ E = \bigcap_i \left( \bigcup_n \left( g_n(x) > i \right) \right), \]

and so we know that E is measurable.

For each non-negative measurable function gi, there is an increasing sequence of simple functions $(\phi^i_n)$ such that $\phi^i_n \uparrow g_i$. For each n, define (recall the binary operator ∨ means a pointwise maximum)

\[ \Psi_n = \phi^1_n \vee \phi^2_n \vee \cdots \vee \phi^n_n. \]

Then it is clear that Ψn is measurable. Given any x in X, we have that

\[
\begin{aligned}
\Psi_{n+1}(x) &= \phi^1_{n+1} \vee \phi^2_{n+1} \vee \cdots \vee \phi^{n+1}_{n+1} \\
&\ge \phi^1_{n+1} \vee \phi^2_{n+1} \vee \cdots \vee \phi^{n}_{n+1} \\
&\ge \phi^1_n \vee \phi^2_n \vee \cdots \vee \phi^n_n \\
&= \Psi_n(x).
\end{aligned}
\]

Hence, (Ψn) is an increasing sequence. Moreover, it is straightforward to see that

\[ \Psi_n(x) \le g_1(x) \vee g_2(x) \vee \cdots \vee g_n(x) = g_n(x) \le g(x). \]

Hence, we know that limn Ψn(x) ≤ g(x). If this limit was strictly less than g(x), let r denote half of the gap size; i.e., r = (1/2)(g(x) − limn Ψn(x)). Then, since $\Psi_n(x) \ge \phi^i_n(x)$ where i is an index between 1 and n, we would have

\[ \phi^i_n(x) < g(x) - r, \quad 1 \le i \le n. \]

This implies that $\phi^n_n(x) \le g(x) - r$ for all n. In particular, fixing the index i, we see that $\phi^i_n(x) \le g(x) - r$ for all n ≥ i. But since $\phi^i_n \uparrow g_i$, this says gi(x) ≤ g(x) − r. Since we can do this for all indices i, we have limi gi(x) ≤ g(x) − r or g(x) ≤ g(x) − r which is not possible. We conclude limn Ψn = g pointwise on X.

Next, we claim limn ∫ Ψn dµ = limn ∫ gn dµ. To see this, first notice that $\int \Psi_n \, d\mu \ge \int \phi^i_n \, d\mu$ for all 1 ≤ i ≤ n. In fact, for any index j, there is an index n* so that n* > j. Hence, $\int \Psi_{n^*} \, d\mu \ge \int \phi^j_{n^*} \, d\mu$. This still holds for any n > n* as well. Thus, for any index j, we can say

\[ \lim_n \int \Psi_n \, d\mu \ge \lim_n \int \phi^j_n \, d\mu = \int g_j \, d\mu. \]

This implies that

\[ \lim_n \int \Psi_n \, d\mu \ge \sup_j \int g_j \, d\mu = \lim_j \int g_j \, d\mu = I. \]

Also, since Ψn ≤ gn,

\[ \lim_n \int \Psi_n \, d\mu \le \lim_n \int g_n \, d\mu = I. \]

This completes the proof that limn ∫ Ψn dµ = limn ∫ gn dµ.

We now show the measure of E is zero. To do that, we start with the functions $\Psi_n \wedge k I_E$ for any positive integer k, where the wedge operation ∧ is simply taking the minimum. If g(x) is finite, then $I_E(x) = 0$ and since Ψn is non negative, $(\Psi_n \wedge k I_E)(x) = 0$. On the other hand, if g(x) = ∞, then x ∈ E and so $k I_E(x) = k$. Since Ψn ↑ g, eventually, Ψn(x) will exceed k and we will have $(\Psi_n \wedge k I_E)(x) = k$. These two cases allow us to conclude

\[ \Psi_n \wedge k I_E \uparrow k I_E \]

for all x. Thus, by the Monotone Convergence Theorem,

\[ \int k I_E \, d\mu = \lim_n \int \Psi_n \wedge k I_E \, d\mu \le \lim_n \int \Psi_n \, d\mu = \lim_n \int g_n \, d\mu = I. \]

We conclude k µ(E) ≤ I for all k which implies that µ(E) = 0.

Finally, to construct the summable function f we need, define $h = g I_{E^C}$. Clearly, gn ↑ h on $E^C$, that is, a.e. Also, since Ψn ↑ g on $E^C$, the Monotone Convergence Theorem tells us that

\[ \int_{E^C} \Psi_n \, d\mu \uparrow \int_{E^C} g \, d\mu. \]

But,

\[ \int_{E^C} g \, d\mu = \int h \, d\mu, \]

and since µ(E) = 0, the integrals on the left converge to limn ∫ Ψn dµ = I. Hence, ∫ h dµ = I is finite, h is summable and so f1 + h is also summable. Define f = f1 + h on X and we have f is summable, fn ↑ f1 + h a.e., and

\[
\begin{aligned}
\lim_n \int f_n \, d\mu &= \int f_1 \, d\mu + \int h \, d\mu \\
&= \int f \, d\mu.
\end{aligned}
\]

Each summable function can also be used to construct a charge.

Theorem 9.5.5 (Integrals Of Summable Functions Create Charges). Let (X, S, µ) be a measure space and let f be a function in L1(X, S, µ). Then the mapping λ : S → ℜ defined by

\[ \lambda(E) = \int_E f \, d\mu \]

for all E in S defines a charge on S. The integral $\int_E f \, d\mu$ is also called the indefinite integral of f with respect to the measure µ.

Proof 9.5.5. Since f is summable, note that the mappings λ+ and λ− defined by

\[ \lambda^+(E) = \int_E f^+ \, d\mu, \quad \lambda^-(E) = \int_E f^- \, d\mu \]

both define measures. It then follows immediately that λ is countably additive and hence is a charge.

Comment 9.5.2. Since $\int_E f \, d\mu$ defines a charge and is countably additive, we see that if (En) is a collection of mutually disjoint measurable subsets, then

\[ \int_{\cup_n E_n} f \, d\mu = \sum_n \int_{E_n} f \, d\mu. \]

9.6 Properties Of Summable Functions

We need to know if L1(X,S, µ) is a linear space under the right interpretation of scalar multiplication and

addition. To do this, we need some fundamental inequalities and conditions that force summability.

Theorem 9.6.1 (Fundamental Abstract Integration Inequalities). Let (X, S, µ) be a measure space.

(i): f ∈ L1(X, S, µ) if and only if |f| ∈ L1(X, S, µ).

(ii): f ∈ L1(X, S, µ) implies |∫ f dµ| ≤ ∫ |f| dµ.

(iii): f measurable and g ∈ L1(X, S, µ) with |f| ≤ |g| implies f is also summable and ∫ |f| dµ ≤ ∫ |g| dµ.

Proof 9.6.1. (i): If f is summable, f+ and f− are in M+(X, S) with finite integrals. Since |f| = f+ + f−, we see |f|+ = |f| and |f|− = 0. Thus, ∫ |f|+ dµ = ∫ (f+ + f−) dµ which is finite. Also, since ∫ |f|− dµ = 0, we see that |f| is summable.

Conversely, if |f| is summable, then ∫ |f|+ dµ = ∫ (f+ + f−) dµ is finite. This, in turn, tells us each piece is finite and hence f is summable too.

(ii): If f is summable, then

\[
\begin{aligned}
\left| \int f \, d\mu \right| &= \left| \int f^+ \, d\mu - \int f^- \, d\mu \right| \\
&\le \int f^+ \, d\mu + \int f^- \, d\mu \\
&= \int (f^+ + f^-) \, d\mu = \int |f| \, d\mu.
\end{aligned}
\]

(iii): Since g is summable, so is |g| by (i). Also, because |f| ≤ |g|, each function is in M+(X, S) and so ∫ |f|+ dµ ≤ ∫ |g|+ dµ which is finite. Hence, |f| is summable. Then, also by (i), f is summable.

We can now tackle the question of the linear structure of L1(X,S, µ).

Theorem 9.6.2 (The Summable Functions Form A Linear Space). Let (X, S, µ) be a measure space. We define operations on L1(X, S, µ) as follows:

• scalar multiplication: for all α in ℜ, αf is the function defined pointwise by (αf)(x) = αf(x).

• addition of functions: for any two functions f and g the sum of f and g is the new function defined pointwise on $E_{fg}^C$ by (f + g)(x) = f(x) + g(x), where, recall,

\[ E_{fg} = \left( (f = \infty) \cap (g = -\infty) \right) \bigcup \left( (f = -\infty) \cap (g = \infty) \right). \]

This is equivalent to defining f + g to be the function h where

\[ h = (f + g) I_{E_{fg}^C}. \]

This is a measurable function as we discussed in the proof of Lemma 8.7.1.

Then, we have

(i): αf is summable for all real α if f is summable and ∫ αf dµ = α ∫ f dµ.

(ii): f + g is summable for all f and g which are summable and ∫ (f + g) dµ = ∫ f dµ + ∫ g dµ.

Hence, L1(X, S, µ) is a vector space over ℜ.

Proof 9.6.2. (i): If α is 0, this is easy. Next, assume α > 0. Then, (αf)+ = αf+ and (αf)− = αf− and these two functions are clearly summable since f+ and f− are. Thus, αf is summable. Then, we have

\[
\begin{aligned}
\int \alpha f \, d\mu &= \int (\alpha f)^+ \, d\mu - \int (\alpha f)^- \, d\mu \\
&= \alpha \left( \int f^+ \, d\mu - \int f^- \, d\mu \right) \\
&= \alpha \int f \, d\mu.
\end{aligned}
\]

Finally, if α < 0, we have (αf)+ = −αf− and (αf)− = −αf+. Now simply repeat the previous arguments making a few obvious changes.

(ii): Since f and g are summable, we know that µ(Efg) = 0. Further, we know |f| and |g| are summable. Since

\[ |f + g| \, I_{E_{fg}^C} \le \left( |f| + |g| \right) I_{E_{fg}^C} \le |f| + |g|, \]

we see $|f + g| \, I_{E_{fg}^C}$ is summable by Theorem 9.6.1, part (iii). Hence, $(f + g) I_{E_{fg}^C}$ is summable also.

Now decompose f + g on $E_{fg}^C$ as

\[ f + g = (f^+ + g^+) - (f^- + g^-). \]

Then, note

\[
\begin{aligned}
\int_{E_{fg}^C} (f + g) \, d\mu &= \int_{E_{fg}^C} (f^+ + g^+) \, d\mu - \int_{E_{fg}^C} (f^- + g^-) \, d\mu \\
&= \int_{E_{fg}^C} (f^+ - f^-) \, d\mu + \int_{E_{fg}^C} (g^+ - g^-) \, d\mu,
\end{aligned}
\]

where we are permitted to manipulate the terms in the integrals above because all are finite in value. However, we can rewrite this as

\[ \int_{E_{fg}^C} (f + g) \, d\mu = \int_{E_{fg}^C} f \, d\mu + \int_{E_{fg}^C} g \, d\mu = \int f \, d\mu + \int g \, d\mu, \]

where the last equality holds because µ(Efg) = 0. Since we define the sum of f and g to be the function $(f + g) I_{E_{fg}^C}$, we see f + g is in L1(X, S, µ).

9.7 The Dominated Convergence Theorem

We can now complete this chapter by proving the important limit interchange called the Lebesgue Dominated Convergence Theorem.

Theorem 9.7.1 (Lebesgue’s Dominated Convergence Theorem). Let (X, S, µ) be a measure space, (fn) be a sequence of functions in L1(X, S, µ) and f : X → ℜ so that fn → f a.e. Further, assume there is a summable g so that |fn| ≤ g for all n. Then, suitably defined, f is also measurable and summable with limn ∫ fn dµ = ∫ f dµ.

Proof 9.7.1. Let E be the set of points in X where the sequence does not converge. Then, by assumption, µ(E) = 0 and

\[ f_n I_{E^C} \to f I_{E^C}, \quad \text{and} \quad | f_n I_{E^C} | \le g I_{E^C}. \]

Hence, $f I_{E^C}$ is measurable and satisfies $| f I_{E^C} | \le g I_{E^C}$. Therefore, since g is summable, we have that $f I_{E^C}$ is summable too.

We can write out our fundamental inequality as follows

\[ - g I_{E^C} \le f_n I_{E^C} \le g I_{E^C}. \qquad (\alpha) \]

This implies that $h_n = f_n I_{E^C} + g I_{E^C}$ is non negative and hence, we can apply Fatou’s lemma to find

\[ \int \liminf h_n \, d\mu \le \liminf \int h_n \, d\mu. \]

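However, we know

\[
\begin{aligned}
\liminf h_n &= \liminf \left( f_n I_{E^C} + g I_{E^C} \right) = g I_{E^C} + \liminf f_n I_{E^C} \\
&= g I_{E^C} + f I_{E^C},
\end{aligned}
\]

because fn converges pointwise to f on $E^C$. It then follows that

\[
\begin{aligned}
\int \left( g I_{E^C} + f I_{E^C} \right) d\mu &\le \liminf \int \left( f_n I_{E^C} + g I_{E^C} \right) d\mu \\
&= \int g I_{E^C} \, d\mu + \liminf \int f_n I_{E^C} \, d\mu.
\end{aligned}
\]

Since g is summable, we also know

\[ \int \left( g I_{E^C} + f I_{E^C} \right) d\mu = \int g I_{E^C} \, d\mu + \int f I_{E^C} \, d\mu. \]

Using this identity, we have

\[ \int g I_{E^C} \, d\mu + \int f I_{E^C} \, d\mu \le \int g I_{E^C} \, d\mu + \liminf \int f_n I_{E^C} \, d\mu. \]

The finiteness of the integral of the g term then allows cancellation so that we obtain the inequality

\[ \int f I_{E^C} \, d\mu \le \liminf \int f_n I_{E^C} \, d\mu. \]

Since the integrals of f and fn are all zero on E, we have shown

\[ \int f \, d\mu \le \liminf \int f_n \, d\mu. \]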

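We now show the reverse inequality holds. Using Equation (α), we see $z_n = g I_{E^C} - f_n I_{E^C}$ is also non negative for all n. Applying Fatou’s Lemma, we find

\[ \int \liminf z_n \, d\mu \le \liminf \int z_n \, d\mu. \]

Then, we note

\[
\begin{aligned}
\liminf z_n &= \liminf \left( - f_n I_{E^C} + g I_{E^C} \right) \\
&= g I_{E^C} + \liminf \left( - f_n I_{E^C} \right) = g I_{E^C} - f I_{E^C},
\end{aligned}
\]

because fn converges pointwise to f on $E^C$. It then follows that

\[
\begin{aligned}
\int \left( g I_{E^C} - f I_{E^C} \right) d\mu &\le \liminf \int \left( - f_n I_{E^C} + g I_{E^C} \right) d\mu \\
&= \int g I_{E^C} \, d\mu + \liminf \int \left( - f_n I_{E^C} \right) d\mu.
\end{aligned}
\]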

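Now,

\[
\begin{aligned}
\liminf \int \left( - f_n I_{E^C} \right) d\mu &= \sup_m \inf_{n \ge m} \int \left( - f_n I_{E^C} \right) d\mu \\
&= \sup_m \left( - \sup_{n \ge m} \int f_n I_{E^C} \, d\mu \right) = - \inf_m \sup_{n \ge m} \int f_n I_{E^C} \, d\mu \\
&= - \limsup \int f_n I_{E^C} \, d\mu.
\end{aligned}
\]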

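Thus, we can conclude

\[ \int \left( g I_{E^C} - f I_{E^C} \right) d\mu \le \int g I_{E^C} \, d\mu - \limsup \int f_n I_{E^C} \, d\mu. \]

Again, since g is summable, we can write

\[ \int g I_{E^C} \, d\mu - \int f I_{E^C} \, d\mu \le \int g I_{E^C} \, d\mu - \limsup \int f_n I_{E^C} \, d\mu. \]

After canceling the finite value $\int g I_{E^C} \, d\mu$, we have

\[ \int f I_{E^C} \, d\mu \ge \limsup \int f_n I_{E^C} \, d\mu. \]

This then implies, using arguments similar to the ones used in the first case, that

\[ \int f \, d\mu \ge \limsup \int f_n \, d\mu. \]

However, limit inferiors are always less than or equal to limit superiors and so we have

\[ \limsup \int f_n \, d\mu \le \int f \, d\mu \le \liminf \int f_n \, d\mu \le \limsup \int f_n \, d\mu. \]

It follows immediately that limn ∫ fn dµ = ∫ f dµ.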


Finally, we can now see how to define f in a suitable fashion. The function $f I_{E^C}$ is measurable and is 0 on E. Hence, the limit function f can be taken to have the form

\[
f(x) = \begin{cases} \lim_n f_n(x), & \text{when the limit exists, i.e. when } x \in E^C \\ 0, & \text{when the limit does not exist, i.e. when } x \in E. \end{cases}
\]
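A simple instance of the theorem, chosen by us for illustration, is fn(x) = x^n on [0, 1] with Lebesgue measure. Each fn is dominated by the summable function g = 1, fn → 0 at every x in [0, 1), and the single point x = 1 has measure 0. The Python sketch below uses the known value 1/(n + 1) of the integral of x^n over [0, 1]. The function names are hypothetical.

```python
from fractions import Fraction


def f(n, x):
    """f_n(x) = x^n, which satisfies 0 <= f_n(x) <= 1 on [0, 1]."""
    return x ** n


def integral_x_to_the_n(n):
    """Exact value of the integral of x^n over [0, 1], namely 1 / (n + 1)."""
    return Fraction(1, n + 1)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, integral_x_to_the_n(n), float(f(n, Fraction(1, 2))))
```

The integrals 1/(n + 1) tend to 0, which is the integral of the a.e. limit function f = 0, in agreement with the theorem.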

9.8 Homework

Exercise 9.8.1. Assume f ∈ L1(X, S, µ) with f(x) > 0 on X. Further, assume there is a positive number α so that α < µ(X) < ∞. Prove that

\[ \inf \left\{ \int_E f \, d\mu \; \middle| \; \mu(E) \ge \alpha \right\} > 0. \]

Exercise 9.8.2. Assume f ∈ L1(X,S, µ). Let α > 0. Prove that

\[ \mu \left( \{ x \mid | f(x) | \ge \alpha \} \right) \]

is finite.

Exercise 9.8.3. Assume (fn) ⊆ L1(X, S, µ). Let f : X → ℜ be a function. Assume fn → f [ptws ae]. Prove

\[ \int | f_n - f | \, d\mu \to 0 \;\Rightarrow\; \int | f_n | \, d\mu \to \int | f | \, d\mu. \]

Exercise 9.8.4. Let (X, S) be a measurable space. Let C be the collection of all charges on S. Prove that C is a Banach Space under the operations

\[ (c \, \mu)(E) = c \, \mu(E), \quad \forall \, c \in \Re, \; \forall \, \mu, \]

\[ (\mu + \nu)(E) = \mu(E) + \nu(E), \quad \forall \, \mu, \nu, \]

with norm ‖µ‖ = |µ|(X).

9.9 Alternative Abstract Integration Schemes

It is possible to define abstract integration in other ways. Let (X, S, µ) be a measure space with µ(X) finite. If f : X → ℜ is bounded and S measurable, let m denote the lower bound of f on X and M, its upper bound. If [a, b] is a finite interval in ℜ (a < b, of course), let π denote a partition of [a, b] as defined in Definition 3.1.1. For convenience of exposition, we will let π be the set of points a = y0 < y1 < · · · < yn = b. In our discussion of Riemann integration and functions of bounded variation, we used the variable x because we often think of the symbol x as a domain variable; here, we use the variable y because it is often used as a range variable in many settings. The important thing to remember is we are partitioning the range of f now, rather than its domain. We define the norm of π as usual following Definition 3.1.4. Further, we define refinements and common refinements of a partition as in Definition 3.1.1.

Now, let M0 > M be chosen. Let π be any partition of [m, M0] and label its points as m = y0 < y1 < · · · < yn = M0. Define the following measurable subsets of X:

\[ E_j = \{ x : y_{j-1} \le f(x) < y_j \}, \quad 1 \le j \le n. \]

It is clear the sets Ej are disjoint and $X = \cup_{j=1}^{n} E_j$. Thus,

\[ \mu(X) = \sum_{j=1}^{n} \mu(E_j). \]

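Define Lower and Upper sums as follows: the Lower sum is

\[ L(f, \pi) = \sum_{\pi} y_{j-1} \, \mu(E_j) \]

and the upper sum is

\[ U(f, \pi) = \sum_{\pi} y_j \, \mu(E_j). \]

It is clear

\[ m \, \mu(X) \le L(f, \pi) \le U(f, \pi) \le M_0 \, \mu(X). \qquad (9.2) \]

It is clear then that both of the following quantities are finite:

\[ -\infty < \sup_{\pi} L(f, \pi) < \infty, \qquad -\infty < \inf_{\pi} U(f, \pi) < \infty. \]

We can thus define abstract Lower and Upper Darboux integrability as usual (see Theorem 4.2.3) for our choice of M0.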
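The lower and upper sums are easy to compute when X is a finite set with point masses. The Python sketch below is an illustration under that assumption. The function lower_upper_sums and the sample data are hypothetical names of ours. It partitions the range of f, collects the mass of each set Ej, and forms both sums.

```python
from fractions import Fraction


def lower_upper_sums(f, mu, ys):
    """Return (L, U) for the range partition ys = [y_0 < ... < y_n].

    f and mu are dicts keyed by the points of a finite set X; every value
    of f is assumed to lie in [y_0, y_n).
    """
    lower = Fraction(0)
    upper = Fraction(0)
    for j in range(1, len(ys)):
        # mu(E_j) where E_j = {x : y_{j-1} <= f(x) < y_j}
        mass = sum((mu[x] for x in f if ys[j - 1] <= f[x] < ys[j]), Fraction(0))
        lower += ys[j - 1] * mass
        upper += ys[j] * mass
    return lower, upper


# Sample data: m = 0, M = 1, and we choose M0 = 2 > M.
F_VALUES = {"a": Fraction(0), "b": Fraction(1, 2), "c": Fraction(1)}
MASSES = {"a": Fraction(1), "b": Fraction(1), "c": Fraction(2)}
COARSE = [Fraction(0), Fraction(1), Fraction(2)]
FINE = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(2)]

if __name__ == "__main__":
    print(lower_upper_sums(F_VALUES, MASSES, COARSE))
    print(lower_upper_sums(F_VALUES, MASSES, FINE))
```

For this data the sum of f(x) µ({x}) over X is 5/2. The coarse partition brackets it between 2 and 6, and the refinement tightens the bracket to 5/2 and 9/2, so the gap U − L shrinks with the norm of the partition.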

Theorem 9.9.1 (The Upper And Lower Abstract Darboux Integrals Are Finite). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable. Then the Lower Integral L(f, M0) and Upper Integral U(f, M0) defined by

\[ L(f, M_0) = \sup_{\pi} L(f, \pi), \qquad U(f, M_0) = \inf_{\pi} U(f, \pi) \]

are both finite.

Proof 9.9.1. This is a consequence of Equation 9.2.

We can then define our new abstract integral by

Definition 9.9.1 (Abstract Darboux Integrability). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable.

We say f is Abstract Darboux Integrable if L(f,M0) = U(f,M0). The common value is then called

the Abstract Darboux Integral of f and is denoted by the symbol DAI(f,M0).

We then prove the following results.

Theorem 9.9.2 (π′ refines π Implies L(f, π) ≤ L(f, π′) and U(f, π) ≥ U(f, π′)). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable. If

π′ refines π on [m,M0], then L(f,π) ≤ L(f,π′) and U(f,π) ≥ U(f,π′).

Proof 9.9.2. Mimic the proof of Theorem 4.2.1, mutatis mutandis.

Theorem 9.9.3 (L(f, π1) ≤ U(f, π2)). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable.

Let π1 and π2 be any two partitions of [m,M0]. Then L(f,π1) ≤ U(f,π2).

Proof 9.9.3. See the proof of Theorem 4.2.2

This implies

Theorem 9.9.4 (L(f, M0) ≤ U(f, M0)). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable.

Then L(f,M0) ≤ U(f,M0).

From the definitions of the lower and upper sums, it immediately follows that for any partition π

\[ 0 \le U(f, \pi) - L(f, \pi) = \sum_{\pi} (y_j - y_{j-1}) \, \mu(E_j) \le \|\pi\| \, \mu(X). \]

We can use this inequality to prove

Theorem 9.9.5 (U(f, M0) ≤ L(f, M0)). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable.

Then U(f,M0) ≤ L(f,M0).

We conclude that the Abstract Darboux Integral of f exists and DAI(f,M0) has the common value

L(f,M0) = U(f,M0).

We can do a similar analysis for partitions of the form [m0,M0] for any m0 < m. We can prove that

the resulting abstract Darboux integrals always exist: hence, we have finite numbers DAI(f,m0,M0) for

many choices of m0 and M0. Clearly, our development of this abstract integral is without much application

if these numbers depend on our choice of m0 and M0. This is not the case. We can prove

Theorem 9.9.6 (DAI(f, m0, M0) is independent of the choice of m0 and M0). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable. Then DAI(f, m0, M0) is independent of the choice of m0 and M0.

Proof 9.9.6. It is enough to do the argument for the upper bound side, M0 as the arguments for the m0

case are similar. Here is a sketch of the argument. Let M < M0 < M′0 be given. Show that each partition of [m, M0] corresponds to a partition of [m, M′0] for which the lower and upper sums for the partition of [m, M0] are the same as the ones for the partition of [m, M′0]. Then, with that done, show this implies the

desired result.

With Theorem 9.9.6 established, we can choose any value of m0 and M0 useful in our calculations. This

can be of great help in some arguments. Since the value of the abstract Darboux integral is now well-defined, we will begin using the standard notation, $\int_X f \, d\mu$ or simply ∫ f dµ for this value. Also, sometimes, we will continue to refer to this value as DAI(f) where we no longer need add the argument m0 or M0. Note this is the same symbol we use for the other abstract integral we have discussed. We also define the symbol $\int_E f \, d\mu$ = DAI(f, E) as usual for any measurable set E.

You can then prove the following theorems.

Theorem 9.9.7 (Abstract Darboux Integral Lower and Upper Bounds). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable. Then

\[ m \, \mu(X) \le DAI(f) \le M \, \mu(X). \]

Theorem 9.9.8 (DAI(f) = 0 if µ(X) = 0). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable.

Then DAI(f) = 0 if µ(X) = 0.

Theorem 9.9.9 (DAI(f(x) ≡ c) is cµ(X)). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded, S measurable and

have constant value c. Then DAI(f) = cµ(X).

Theorem 9.9.10 (Abstract Darboux Integral Measures). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded, S measurable and non negative. Then λ : S → ℜ defined by λ(E) = DAI(f, E) defines a measure on S.

Proof 9.9.10. We will provide a sketch of the proof. First, note our argument for a non negative f can be

used to show a finite valued summable f can be used to create a charge. We will not show that argument

here, but it should be straightforward for you to do.

• First, prove λ(A ∪ B) = λ(A) + λ(B) when A and B are disjoint and measurable. Let π be a partition consisting of points (yj), with 0 = y0, and corresponding sets (Fj). Then for some set D and the function $f I_D$, each set Fj has the form

\[ F_j = \left( y_{j-1} \le f(x) I_D(x) < y_j \right). \]

Since $f I_D$ vanishes on $D^C$ and 0 = y0 < y1, this is the same as

\[ F_1 = \left( (0 = y_0 \le f(x) < y_1) \cap D \right) \cup D^C \]

and for j larger than 1,

\[ F_j = (y_{j-1} \le f(x) < y_j) \cap D. \]

If we let Ej = (yj−1 ≤ f(x) < yj), we see Fj = Ej ∩ D if j > 1 and $F_1 = (E_1 \cap D) \cup D^C$. Hence, since the first term vanishes (because y0 = 0), we find the lower sum is

\[ L(f I_D, \pi) = \sum_{j > 1} y_{j-1} \, \mu(E_j \cap D). \]

Hence, if we apply this idea to the set D = A ∪ B for A and B disjoint, we find

\[ L(f I_{A \cup B}, \pi) = L(f I_A, \pi) + L(f I_B, \pi) \]

which leads to the inequality λ(A ∪ B) ≤ λ(A) + λ(B). A similar argument using upper sums gives us the other inequality which proves this first result. We will leave the details for you in an exercise below.

• It then follows immediately that the result above holds for finite unions of disjoint measurable sets

via a standard induction argument. Again, the details are for you.

• Now let (An) be a countable collection of disjoint measurable sets. Let A be the full countable union, VN be the union of the first N sets and RN be A \ VN. We can then apply what we know about finite unions to find

\[ \int_A f \, d\mu = \int_{V_N} f \, d\mu + \int_{R_N} f \, d\mu \]

which further expands to

\[ \int_A f \, d\mu = \sum_{i=1}^{N} \int_{A_i} f \, d\mu + \int_{R_N} f \, d\mu. \]

It is then not hard to show $\int_{R_N} f \, d\mu \to 0$, which proves countable additivity.

Theorem 9.9.11 (Abstract Darboux Integral Is Monotone). Let (X, S, µ) be a measure space with µ(X) finite and f, g : X → ℜ be bounded and S measurable.

Then if f ≤ g, DAI(f) ≤ DAI(g).

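Proof 9.9.11. It is easy to see that g determines a charge. Hence, for any partition π for f with points (yj) and associated sets (Ej), we have

\[ \int g \, d\mu = \sum_j \int_{E_j} g \, d\mu. \]

On each Ej, g dominates f and f ≥ yj−1 there, so $\int_{E_j} g \, d\mu \ge y_{j-1} \, \mu(E_j)$. Summing over j, ∫ g dµ dominates the lower sum L(f, π) for this partition. Since π is arbitrary, the result then follows.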

Theorem 9.9.12 (Abstract Darboux Integral Is Additive). Let (X, S, µ) be a measure space with µ(X) finite and f, g : X → ℜ be bounded and S measurable.

Then DAI(f + g) = DAI(f) +DAI(g).

Proof 9.9.12. We prove additivity in two steps.

• We prove the result for the case that g is a constant c on X. Let π be a partition with points (yj) and associated sets (Ej). Then, it is easy to see {yj−1 ≤ f(x) < yj} is the same set as {yj−1 + c ≤ f(x) + c < yj + c}. Hence, the points (yj + c) and the sets (Ej) give a partition π′ for f + g. Thus, the lower sum is

\[ L(f + g, \pi') = \sum_j (y_{j-1} + c) \, \mu(E_j) \]

which can be broken into two sums giving

\[ L(f + g, \pi') = \sum_j y_{j-1} \, \mu(E_j) + \sum_j c \, \mu(E_j). \]

The first sum on the right is L(f, π) and the second is cµ(X) which is also ∫ g dµ. Hence, we have shown additivity for this case.

• We now handle the case of an arbitrary g. Since f + g defines a charge, for any partition π for f with points (yj) and associated sets (Ej), we have

\[ \int (f + g) \, d\mu = \sum_j \int_{E_j} (f + g) \, d\mu. \]

However, on Ej, we know f(x) + g(x) ≥ yj−1 + g(x). Hence, $\int_{E_j} (f + g) \, d\mu \ge \int_{E_j} (y_{j-1} + g) \, d\mu$. The additivity result holds when one of the functions is a constant and so $\int_{E_j} (y_{j-1} + g) \, d\mu = y_{j-1} \, \mu(E_j) + \int_{E_j} g \, d\mu$. Summing over j, we conclude

\[ \int (f + g) \, d\mu \ge L(f, \pi) + \int g \, d\mu. \]

Since the partition π is arbitrary, this shows $\int (f + g) \, d\mu \ge \int f \, d\mu + \int g \, d\mu$. A similar argument using upper sums completes this proof.
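The constant case of this argument can be checked numerically on a small finite measure space. The Python sketch below is only an illustration: the point weights, the function values and the helper name `lower_sum` are assumptions made for the example, and the lower sum is computed in the form used above, with E_j = {x | y_{j-1} ≤ f(x) < y_j}.

```python
def lower_sum(f, w, ys):
    """Lower sum sum_j y_{j-1} * mu(E_j), with E_j = {x : y_{j-1} <= f(x) < y_j}.

    f  : list of function values on the points 0, ..., n-1
    w  : list of non negative point masses (the measure of each point)
    ys : increasing partition points covering the range of f
    """
    total = 0.0
    for lo, hi in zip(ys, ys[1:]):
        mass = sum(w[x] for x in range(len(f)) if lo <= f[x] < hi)
        total += lo * mass
    return total

# Hypothetical data: four points with masses w and a bounded function f.
f = [0.2, 1.5, 2.7, 0.9]
w = [1.0, 2.0, 0.5, 1.5]
ys = [0.0, 1.0, 2.0, 3.0]
c = 4.0

shifted_f = [v + c for v in f]       # the function f + c
shifted_ys = [y + c for y in ys]     # the partition points y_j + c

left = lower_sum(shifted_f, w, shifted_ys)   # L(f + c, pi')
right = lower_sum(f, w, ys) + c * sum(w)     # L(f, pi) + c * mu(X)
print(left, right)
```

The two printed numbers agree because shifting both the function and the partition points by c leaves every set E_j unchanged.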

Theorem 9.9.13 (Abstract Darboux Integral Is Scalable). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable.

Then for all real numbers c, DAI(c f) = c DAI(f).

Proof 9.9.13. The case c = 0 is easy. First prove the case c > 0 using a partition argument. Then, if c < 0, we can write c f as |c| (−f) and write immediately $\int c f \, d\mu = |c| \int (-f) \, d\mu$. However, additivity tells us $\int (-f) \, d\mu = -\int f \, d\mu$ which completes the argument.

Theorem 9.9.14 (Abstract Darboux Integral Absolute Inequality). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded and S measurable.

Then |DAI(f)| ≤ DAI(|f|).

Theorem 9.9.15 (Abstract Darboux Integral Zero Implies f = 0 a.e.). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded, S measurable and

non negative. Then DAI(f) = 0 implies f = 0 a.e.

Finally, it is easy to see

Theorem 9.9.16 (Abstract Darboux Integral Measures Are Absolutely Continuous). Let (X, S, µ) be a measure space with µ(X) finite and f : X → ℜ be bounded, S measurable and

non negative. Then λ : S → ℜ defined by λ(E) = DAI(f, E) defines an absolutely continuous

measure on S.

9.9.1 Homework

Exercise 9.9.1. Prove Theorem 9.9.2.





Chapter 10

The Lp Spaces

In mathematics and other fields, we often group objects of interest into sets and study the properties of

these sets. In this book, we have been studying a set X with a sigma-algebra of subsets contained within it, the collection of functions which are measurable with respect to the sigma-algebra and recently, the set of functions which are summable. In addition, we have noted that the sets of measurable and summable functions are closed under scalar multiplication and addition as long as we interpret addition in the right way when the functions are extended real-valued.

We can do more along these lines. We will now study the sets of summable functions as vector spaces

with a suitable norm. We begin with a review.

Definition 10.0.2 (The Norm On A Vector Space). Let X be a non empty vector space over ℜ. We say ρ : X → ℜ is a norm on X if

(N1): ρ(x) is non negative for all x in X,

(N2): ρ(x) = 0 ⇔ x = 0,

(N3): ρ(αx) = |α| ρ(x), for all α in ℜ and for all x in X,

(N4): ρ(x + y) ≤ ρ(x) + ρ(y), for all x and y in X.

If ρ satisfies only N1, N3 and N4, we say ρ is a semi-norm or pseudo-norm. We will usually denote a norm of x by the symbol ‖ x ‖. The pair (X, ‖ ‖) is called a Normed Linear Space or NLS.

If a set X has no linear structure, we can still have a notion of the distance between objects in the set,

if the set is endowed with a metric. This is defined below.


Definition 10.0.3 (The Metric On A Set). Let X be a non empty set. We say d : X × X → ℜ is a metric if

(M1): d(x, y) is non negative for all x and y in X ,

(M2): d(x, y) = 0 ⇔ x = y,

(M3): d(x, y) = d(y, x), for all x and y in X,

(M4): d(x, y) ≤ d(x, z) + d(y, z), for all x, y and z in X.

If d satisfies only M1, M3 and M4, we say d is a semi-metric or pseudo-metric. The pair (X, d) is

called a metric space. Note that in a metric space, there is no notion of scaling or adding objects

because there is no linear structure.

Comment 10.0.1. It is a standard result from a linear analysis course that the norm in a NLS (X, ‖ ‖) induces a metric on X by defining

d(x, y) = ‖ x− y ‖, ∀ x, y ∈ X.
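As a quick illustration of this comment, the following Python sketch spot-checks properties M1, M3 and M4 for the metric induced by one particular norm. The choice of the sum-of-absolute-values norm on ℜ³, the random sample points and the names `norm1` and `dist` are assumptions made only for this example; a finite check like this illustrates the result but does not prove it.

```python
import random

def norm1(x):
    """The norm ||x|| = sum of |x_i| on real n-tuples."""
    return sum(abs(t) for t in x)

def dist(x, y):
    """The induced metric d(x, y) = ||x - y||."""
    return norm1([a - b for a, b in zip(x, y)])

random.seed(1)
points = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(20)]

# M1 (non negativity), M3 (symmetry) and M4 (triangle inequality) on all triples;
# the small tolerance only absorbs floating point rounding.
ok = all(
    dist(x, y) >= 0
    and dist(x, y) == dist(y, x)
    and dist(x, y) <= dist(x, z) + dist(y, z) + 1e-12
    for x in points for y in points for z in points
)
print(ok)
```

Property M2 corresponds to N2: d(x, y) = 0 exactly when ‖ x − y ‖ = 0, that is, when x = y.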

Given a sequence (xn) in a NLS (X, ‖ ‖), we can define what we mean by the convergence of this

sequence to another object x in X .

Definition 10.0.4 (Norm Convergence). Let (X, ‖ ‖) be a non empty NLS. Let (xn) be a sequence in X. We say the sequence (xn) converges

to x in X if

∀ ε > 0, ∃ N ∋ n > N ⇒ ‖ xn − x ‖ < ε.

Now let (X, S, µ) be a nonempty measure space. We are now ready to discuss the space L1(X, S, µ).

By Theorem 9.6.2, we know that this space is a vector space with suitably defined addition. We can now

define a semi-norm for this space.

Theorem 10.0.17 (The L1 Semi-norm). Let (X, S, µ) be a nonempty measure space. Define ‖ · ‖1 on L1(X, S, µ) by

\[ \| f \|_1 = \int |f| \, d\mu, \quad \forall \, f \in L_1(X, S, \mu). \]

Then, ‖ · ‖1 is a semi-norm. Moreover, property N2 is almost satisfied: instead of N2, we have

‖ f ‖1 = 0 ⇔ f = 0 a.e.

Proof 10.0.14. (N1): ‖ f ‖1 is clearly non negative.

(N3): This proof is an easy calculation.

\[ \| \alpha f \|_1 = \int |\alpha f| \, d\mu = \int |\alpha| \, |f| \, d\mu = |\alpha| \int |f| \, d\mu = |\alpha| \, \| f \|_1. \]

(N4): To prove this, we start with the triangle inequality for real numbers. We know that if f and g are summable, then the sum f + g is defined to be $h = (f + g) I_{E_{fg}^C}$. Let A be the set of points where this sum is ∞ and B be the set where the sum is −∞. Then $\mu(E_{fg} \cup A \cup B) = 0$ and on $(E_{fg} \cup A \cup B)^C$, h is finite. For convenience of exposition, we will simply write h as f + g from now on. So f + g is finite off a set of measure 0. At the points where f + g is finite, we can apply the standard triangle inequality to f(x) + g(x). We have

\[ |f(x) + g(x)| \le |f(x)| + |g(x)|, \quad \text{a.e.} \]

This implies

\[ \int |f + g| \, d\mu \le \int |f| \, d\mu + \int |g| \, d\mu. \]

At the risk of repeating ourselves too much, let’s go through the integral on the left hand side again. We actually have

\[ \int |f + g| \, I_{E_{fg}^C \cap A^C \cap B^C} \, d\mu = \int |h| \, I_{A^C \cap B^C} \, d\mu = \int |h| \, d\mu \]

since µ(A ∪ B) = 0. Now the above inequality estimates clearly tell us

\[ \| f + g \|_1 \le \| f \|_1 + \| g \|_1. \]

Finally, we look at what is happening in condition N2. Since |f | is in M+(X,S, µ), by Lemma 9.4.6, we

know

|f | = 0 a.e. ⇔∫|f | dµ = 0.

Hence, ‖ f ‖1= 0 if and only if f = 0 a.e.

Although ‖ · ‖1 is only a semi-norm, there is a way to think of this class of functions as a normed linear space. Let’s define two functions f and g in L1(X, S, µ) to be equivalent or to be precise, µ-equivalent if f = g except on a set of µ measure 0. We use the notation f ∼ g to indicate this equivalence. It is easy to see that ∼ defines an equivalence relation on L1(X, S, µ). We will let [f] denote the equivalence class defined by f:

[f] = { g ∈ L1(X, S, µ) | g ∼ f }.



Any g in [f ] is called a representative of the equivalence class [f ]. A straightforward argument shows

that two equivalence classes [f1] and [f2] are either equal as sets or disjoint. The collection of all distinct

equivalence classes of L1(X,S, µ) under a.e. equivalence will be denoted by L1(X,S, µ).

Theorem 10.0.18 (L1 Is A Normed Linear Space). L1(X, S, µ) is a vector space over ℜ with scalar multiplication and object addition defined as

α [f ] = [α f ], ∀ [f ]

[f ] + [g] = [f + g], ∀[f ] and [g].

Further, ‖ [f ] ‖1 defined by

‖ [f ] ‖1 =∫|g| dµ,

for any representative g of [f ], is a norm on L1(X,S, µ).

Proof 10.0.15. The definition of scalar multiplication is clear. However, as usual, we can spend some time with addition. We know f + g is defined on $E_{fg}^C$ and that $E_{fg}$ has measure 0. Hence, if u ∈ [f] and v ∈ [g], then u = f and v = g except on sets A and B of measure 0. Also, as usual, the sum u + v is defined on $E_{uv}^C$. Hence,

\[ u + v = f + g, \quad x \in E_{uv}^C \cap E_{fg}^C \cap A^C \cap B^C, \]

which is the complement of a set of measure 0. Hence, u + v ∈ [f + g]. Thus, [f] + [g] ⊆ [f + g]. Conversely, let h ∈ [f + g]. Now

\[ (f + g) \, I_{E_{fg}^C} = f \, I_{E_{fg}^C} + g \, I_{E_{fg}^C}. \]

Hence, if we let

\[ u = f \, I_{E_{fg}^C} \quad \text{and} \quad v = g \, I_{E_{fg}^C}, \]

we see h ∼ (u + v), with u ∈ [f] and v ∈ [g]. We conclude [f + g] ⊆ [f] + [g]. Hence, the addition of equivalence classes makes sense.

We now turn our attention to the possible norm ‖ [f ] ‖1. First, we must show that our definition of

norm is independent of the choice of representative chosen from [f ]. If g ∼ f , then g = f except on a set

A of measure 0. Thus, we know the integrals of |f| and |g| match by Lemma 9.4.6. Here are the details:

\begin{align*}
\int |g| \, d\mu &= \int_A |g| \, d\mu + \int_{A^C} |g| \, d\mu \\
&= 0 + \int_{A^C} |f| \, d\mu \\
&= \int_A |f| \, d\mu + \int_{A^C} |f| \, d\mu \\
&= \int |f| \, d\mu.
\end{align*}

We conclude the value of ‖ [f ] ‖1 is independent of the choice of representative from [f ]. Now we prove

this is a norm.

(N1): ‖ [f ] ‖1=∫|g|dµ ≥ 0.

(N2): If ‖ [f ] ‖1= 0, then for any representative g of [f ], we have∫|g|dµ = 0. By Lemma 9.4.6, this

implies that g = 0 a.e. and hence, g ∈ [0] (we abuse notation here by simply writing the zero function

h(x) = 0, ∀x as 0 ). But since g ∈ [f ] also, this means [f ] ∩ [0] is nonempty. This immediately implies

that [f ] = [0]. Conversely, if [f ] = [0], the result is clear.

(N3): Let α be a real number. Then, if g is any representative of [f ], we have α g is a representative of

[α f ]. We find

\[ \| [\alpha f] \|_1 = \int |\alpha g| \, d\mu = |\alpha| \int |g| \, d\mu = |\alpha| \, \| [f] \|_1. \]

(N4): The triangle inequality follows from the triangle inequality that holds for the representatives.

10.1 The General Lp spaces

We can construct additional spaces of summable functions. Let p be a real number satisfying 1 ≤ p < ∞. Then the function $\phi(u) = u^p$ is a continuous function on [0, ∞) that satisfies $\lim_n \phi(n) = \infty$. Thus, by Lemma 8.8.3, if f is an extended real-valued measurable function on X, then the composition $\phi \circ |f|$ or $|f|^p$ is also measurable. Hence, we know the integral $\int |f|^p \, d\mu$ exists as an extended real-valued number. The class of measurable functions that satisfy $\int |f|^p \, d\mu < \infty$ is another interesting class of functions.

We begin with some definitions.

Definition 10.1.1 (The Space Of p Summable Functions). Let (X, S, µ) be a nonempty measure space. Let p be a real number satisfying 1 ≤ p < ∞. Then, for each measurable f, $|f|^p$ is a measurable function. We let

\[ L_p(X, S, \mu) = \Big\{ f \in M(X, S, \mu) \ \Big| \ \int |f|^p \, d\mu < \infty \Big\}. \]

For later use, we will also define what are called conjugate index pairs.

Definition 10.1.2 (Conjugate Index Pairs). Let p be a real number satisfying 1 ≤ p ≤ ∞. If 1 < p is finite, the index conjugate to p is the real number q satisfying

\[ \frac{1}{p} + \frac{1}{q} = 1, \]

while if p = 1, the index conjugate to p is q = ∞.

We will be able to show that Lp(X, S, µ) is a vector space under the usual scalar multiplication and addition operations once we prove some auxiliary results. These are Hölder’s and Minkowski’s Inequalities. First, there is a standard lemma we will call the Real Number Conjugate Indices Inequality.

Lemma 10.1.1 (Real Number Conjugate Indices Inequality). Let 1 < p < ∞ and q be the corresponding conjugate index. Then if A and B are positive numbers,

\[ A B \le \frac{A^p}{p} + \frac{B^q}{q}. \]

Proof 10.1.1. This proof is standard in any Linear Analysis book and so we will not repeat it here.
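Although the proof is omitted, the inequality is easy to spot-check numerically. The Python sketch below is only such a check, not a proof; the helper names `conjugate_index` and `young_gap`, the sampling ranges and the rounding tolerance are assumptions made for this illustration.

```python
import random

def conjugate_index(p):
    """Return q with 1/p + 1/q = 1 for 1 < p < infinity."""
    return p / (p - 1.0)

def young_gap(a, b, p):
    """Return a**p / p + b**q / q - a * b, which the lemma says is non negative."""
    q = conjugate_index(p)
    return a ** p / p + b ** q / q - a * b

random.seed(0)
worst = min(
    young_gap(random.uniform(0.01, 10.0),
              random.uniform(0.01, 10.0),
              random.uniform(1.1, 5.0))
    for _ in range(10_000)
)
# The smallest gap found should be non negative up to rounding error.
print(worst >= -1e-9)
```

For example, with A = 2, B = 3 and p = q = 2 the gap is 2 + 4.5 − 6 = 0.5, while A = B = 1 gives equality for every p.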

Theorem 10.1.2 (Hölder’s Inequality). Let 1 < p < ∞ and q be the index conjugate to p. Let f ∈ Lp(X, S, µ) and g ∈ Lq(X, S, µ). Then f g ∈ L1(X, S, µ) and

\[ \int |f g| \, d\mu \le \left( \int |f|^p \, d\mu \right)^{1/p} \left( \int |g|^q \, d\mu \right)^{1/q}. \]

Proof 10.1.2. The result is clearly true if f = g = 0 a.e. Also, if $\int |f|^p \, d\mu = 0$, then $|f|^p = 0$ a.e. which tells us f = 0 a.e. and the result follows again. We handle the case where $\int |g|^q \, d\mu = 0$ in a similar fashion. Thus, we will assume both $I^p = \int |f|^p \, d\mu > 0$ and $J^q = \int |g|^q \, d\mu > 0$.

Let $E_f$ and $E_g$ be the sets where f and g are not finite. By our assumption, we know the measure of these sets is 0. Hence, for all x in $E_f^C \cap E_g^C$, the values f(x) and g(x) are finite. We apply Lemma 10.1.1 to conclude

\[ \frac{|f(x)|}{I} \, \frac{|g(x)|}{J} \le \frac{1}{p} \frac{|f(x)|^p}{I^p} + \frac{1}{q} \frac{|g(x)|^q}{J^q} \]

holds on $E_f^C \cap E_g^C$. Off of this set, we have that the left hand side is ∞ and so is the right hand side. Hence, even on $E_f \cup E_g$, the inequality is satisfied. Thus, since the function on the right hand side is summable, we must have the left hand side is a summable function too by Theorem 9.6.1. Hence, f g ∈ L1(X, S, µ). We then have

\begin{align*}
\int \frac{|f(x)|}{I} \, \frac{|g(x)|}{J} \, d\mu &\le \int \frac{1}{p} \frac{|f(x)|^p}{I^p} \, d\mu + \int \frac{1}{q} \frac{|g(x)|^q}{J^q} \, d\mu \\
&= \frac{1}{p I^p} \int |f(x)|^p \, d\mu + \frac{1}{q J^q} \int |g(x)|^q \, d\mu \\
&= \frac{1}{p} + \frac{1}{q} = 1.
\end{align*}

Thus,

\[ \int |f g| \, d\mu \le I J = \left( \int |f|^p \, d\mu \right)^{1/p} \left( \int |g|^q \, d\mu \right)^{1/q}. \]

The special case of p = q = 2 is of great interest. The resulting Hölder’s Inequality is often called the Cauchy-Schwarz Inequality. We see

Theorem 10.1.3 (Cauchy-Bunyakovskii-Schwarz Inequality). Let f, g ∈ L2(X, S, µ). Then f g ∈ L1(X, S, µ) and

\[ \int |f g| \, d\mu \le \left( \int |f|^2 \, d\mu \right)^{1/2} \left( \int |g|^2 \, d\mu \right)^{1/2}. \]

Theorem 10.1.4 (Minkowski’s Inequality). Let 1 ≤ p < ∞ and let f, g ∈ Lp(X, S, µ). Then f + g is in Lp(X, S, µ) and

\[ \left( \int |f + g|^p \, d\mu \right)^{1/p} \le \left( \int |f|^p \, d\mu \right)^{1/p} + \left( \int |g|^p \, d\mu \right)^{1/p}. \]

Proof 10.1.4. If p = 1, this is property N4 of the semi-norm ‖ · ‖1. Thus, we can assume 1 < p < ∞. Since f and g are measurable, we define the sum of f and g as $h = (f + g) I_A$ where $A = E_{fg}^C$ with $\mu(E_{fg}) = 0$. Then as discussed h is measurable. We see on A,

\[ |f(x) + g(x)| \le |f(x)| + |g(x)| \le 2 \max\{ |f(x)|, |g(x)| \} \]

even when function values are ∞. Hence,

\[ |f(x) + g(x)|^p \le 2^p \big( \max\{ |f(x)|, |g(x)| \} \big)^p \le 2^p \big( |f(x)|^p + |g(x)|^p \big). \]

Then, since the right hand side is summable, so is the left hand side. We conclude f + g is in Lp(X, S, µ). Further, we have the inequality

\[ |f(x) + g(x)|^p = |f + g| \, |f + g|^{p-1} \le |f| \, |f + g|^{p-1} + |g| \, |f + g|^{p-1}. \tag{$*$} \]

Now since p and q are conjugate indices, we know

\[ (1/p) + (1/q) = 1 \ \Rightarrow \ p + q = p q \ \Rightarrow \ p = q (p - 1). \]

Thus,

\[ \big( |f + g|^{p-1} \big)^q = |f + g|^p, \]

and so this function is summable implying $|f + g|^{p-1} \in L_q(X, S, \mu)$. Now apply Hölder’s Inequality to the two parts of the right hand side of Equation ∗. We find

\[ \int |f| \, |f + g|^{p-1} \, d\mu \le \left( \int |f|^p \, d\mu \right)^{1/p} \left( \int \big( |f + g|^{p-1} \big)^q \, d\mu \right)^{1/q} \]

and

\[ \int |g| \, |f + g|^{p-1} \, d\mu \le \left( \int |g|^p \, d\mu \right)^{1/p} \left( \int \big( |f + g|^{p-1} \big)^q \, d\mu \right)^{1/q}. \]

But we have learned we can rewrite the second factors of the above inequalities to get

\[ \int |f| \, |f + g|^{p-1} \, d\mu \le \left( \int |f|^p \, d\mu \right)^{1/p} \left( \int |f + g|^p \, d\mu \right)^{1/q} \]

and

\[ \int |g| \, |f + g|^{p-1} \, d\mu \le \left( \int |g|^p \, d\mu \right)^{1/p} \left( \int |f + g|^p \, d\mu \right)^{1/q}. \]

Thus, combining, we have

\[ \int |f + g|^p \, d\mu \le \left( \left( \int |f|^p \, d\mu \right)^{1/p} + \left( \int |g|^p \, d\mu \right)^{1/p} \right) \left( \int |f + g|^p \, d\mu \right)^{1/q}. \]

If $\int |f + g|^p \, d\mu = 0$, the desired inequality is trivial. Otherwise, we can divide by the last factor and rewrite this as

\[ \left( \int |f + g|^p \, d\mu \right)^{1 - 1/q} \le \left( \int |f|^p \, d\mu \right)^{1/p} + \left( \int |g|^p \, d\mu \right)^{1/p}. \]

Since 1 − 1/q = 1/p, we have established the desired result.



Holder’s and Minkowski’s Inequalities allow us to prove that the Lp spaces are normed linear spaces.

Theorem 10.1.5 (Lp Is A Vector Space). Let (X, S, µ) be a measure space and let 1 ≤ p < ∞. Then, if scalar multiplication and object

addition are defined pointwise as usual, Lp(X,S, µ) is a vector space.

Proof 10.1.5. The only thing we must check is that if f and g are in Lp(X,S, µ), then so is f + g. This

follows from Minkowski’s inequality.

Since Lp(X,S, µ) is a vector space, the next step is to find a norm for the space.

Theorem 10.1.6 (The Lp Semi-Norm). Let (X, S, µ) be a measure space and let 1 ≤ p < ∞. Define ‖ · ‖p on Lp(X, S, µ) by

\[ \| f \|_p = \left( \int |f|^p \, d\mu \right)^{1/p}. \]

Then, ‖ · ‖p is a semi-norm.

Proof 10.1.6. Properties N1 and N3 of a norm are straightforward to prove. To see that the triangle

inequality holds, simply note that Minkowski’s Inequality can be rewritten as

‖ f + g ‖p ≤ ‖ f ‖p + ‖ g ‖p,

for arbitrary f and g in Lp(X,S, µ).

If we use the same notion of equivalence a.e. as we did earlier, we can define the collection of all distinct equivalence classes of Lp(X, S, µ) under a.e. equivalence. This will be denoted by Lp(X, S, µ). We can prove that this space is a normed linear space using the norm ‖ [·] ‖p.

Theorem 10.1.7 (Lp Is A Normed Linear Space). Let 1 ≤ p < ∞. Then Lp(X, S, µ) is a vector space over ℜ with scalar multiplication and object addition defined as

α [f] = [α f], ∀ [f]

[f] + [g] = [f + g], ∀ [f] and [g].

Further, ‖ [f] ‖p defined by

\[ \| [f] \|_p = \left( \int |g|^p \, d\mu \right)^{1/p}, \]

for any representative g of [f], is a norm on Lp(X, S, µ).



Proof 10.1.7. The proof of this is quite similar to that of Theorem 10.0.18 and so we will not repeat it.

We will now show that Lp(X,S, µ) is a complete NLS. First, recall what a Cauchy Sequence means.

Definition 10.1.3 (Cauchy Sequence In Norm).Let (X, ‖ · ‖) be a NLS. We say the sequence (fn) of X is a Cauchy Sequence, if given ε > 0, there is

a positive integer N so that

‖ fn − fm ‖ < ε, ∀ n, m > N.

This leads to the definition of a complete NLS or Banach space.

Definition 10.1.4 (Complete NLS). Let (X, ‖ · ‖) be a NLS. We say X is a complete NLS if every Cauchy sequence in X converges to

some object in X .

It is a standard proof to show that any sequence in a NLS that converges must be a Cauchy sequence.

Let’s prove that in the context of the Lp(X,S, µ) space to get some practice.

Theorem 10.1.8 (Sequences That Converge in Lp Are Cauchy).Let ([fn]) be a sequence in Lp(X,S, µ) which converges to [f ] in Lp(X,S, µ) in the ‖ · ‖p norm.

Then, ([fn]) is a Cauchy sequence.

Proof 10.1.8. Let ε > 0 be given. Then, there is a positive integer N so that if n > N , then

‖ [fn − f ] ‖p < ε/2.

Thus, if n and m are larger than N, we have by property N4 that

‖ [fn − fm] ‖p = ‖ [(fn − f) + (f − fm)] ‖p ≤ ‖ [fn − f ] ‖p + ‖ [fm − f ] ‖p < ε.

This shows the sequence is a Cauchy sequence.

We will now show that Lp(X, S, µ) is a Banach space.

Theorem 10.1.9 (Lp Is A Banach Space).Let 1 ≤ p <∞. Then Lp(X,S, µ) is complete with respect to the norm ‖ · ‖p.

Proof 10.1.9. Let ([fn]) be a Cauchy sequence. These are the steps of the proof.

(Step 1): We find a subsequence ([gk]) so that for all k,

\[ \int |g_{k+1} - g_k|^p \, d\mu < \left( 1/2^k \right)^p. \tag{$\alpha$} \]

(Step 2): Define the function g by

\[ g(x) = |g_1(x)| + \sum_{k=1}^{\infty} |g_{k+1}(x) - g_k(x)|. \tag{$\beta$} \]

We show that g satisfies

\[ \| g \|_p \le \| g_1 \|_p + 1. \tag{$\gamma$} \]

This implies that g, defined by Equation β, converges and is finite a.e.

(Step 3): Then, we show

\[ f(x) = g_1(x) + \sum_{k=1}^{\infty} \big( g_{k+1}(x) - g_k(x) \big) \]

is defined a.e. and is in Lp(X, S, µ). This is our candidate for the limit of the Cauchy sequence.

(Step 4): We show gk converges to f in ‖ · ‖p.

(Step 5): We show [fn] converges to [f] in ‖ · ‖p. This last step will complete the proof of completeness.

Now to the proof of these steps.

(Proof Step 1): For ε = (1/2), since ([fn]) is a Cauchy sequence, there is a positive integer N1 so that n, m > N1 implies

\[ \int |f_n - f_m|^p \, d\mu < (1/2)^p. \]

Note we use representatives fn ∈ [fn] for simplicity of exposition since the norms are independent of choice of representatives. Define $g_1 = f_{N_1 + 1}$.

Next, for ε = (1/2)², there is a positive integer N2, which we can always choose so that N2 > N1, so that n, m > N2 implies

\[ \int |f_n - f_m|^p \, d\mu < \left( 1/2^2 \right)^p. \]

Let $g_2 = f_{N_2 + 1}$. Then, since both indices exceed N1, we have

\[ \int |g_2 - g_1|^p \, d\mu < (1/2)^p. \]

The next step is similar. For ε = (1/2)³, there is a positive integer N3, which we can always choose so that N3 > N2, so that n, m > N3 implies

\[ \int |f_n - f_m|^p \, d\mu < \left( 1/2^3 \right)^p. \]

Let $g_3 = f_{N_3 + 1}$. Then, we have

\[ \int |g_3 - g_2|^p \, d\mu < \left( 1/2^2 \right)^p. \]

An induction argument thus shows that there is a subsequence ([gk]) that satisfies

\[ \int |g_{k+1} - g_k|^p \, d\mu < \left( 1/2^k \right)^p \]

for all k ≥ 1. This establishes Equation α.
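The index selection in Step 1 can be imitated on a concrete Cauchy sequence. The Python sketch below is an illustration only and rests on assumptions that are not part of the proof: it uses p = 1, counting measure, and the hypothetical sequence f_n = (1, 1/4, ..., 1/n², 0, 0, ...), for which ‖ f_n − f_m ‖1 is a tail sum bounded by 1/min(n, m). The names `dist` and `rapid_indices` are invented for the example.

```python
def dist(n, m):
    """||f_n - f_m||_1 for f_n = (1, 1/4, ..., 1/n**2, 0, 0, ...)."""
    lo, hi = min(n, m), max(n, m)
    return sum(1.0 / i ** 2 for i in range(lo + 1, hi + 1))

def rapid_indices(K):
    """Return the indices N_k + 1 of g_k = f_{N_k + 1} for k = 1, ..., K.

    N_k is chosen increasing with N_k >= 2**k, because the tail sum of
    1/i**2 beyond N is less than 1/N <= 2**-k.
    """
    indices = []
    N = 0
    for k in range(1, K + 1):
        N = max(N + 1, 2 ** k)
        indices.append(N + 1)
    return indices

idx = rapid_indices(8)
# Check ||g_{k+1} - g_k||_1 < 2**-k for k = 1, ..., 7.
checks = [dist(idx[k], idx[k - 1]) < 2.0 ** (-k) for k in range(1, len(idx))]
print(idx, all(checks))
```

The point of choosing the subsequence this way is that the differences are summable in norm, which is what Step 2 exploits.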

(Proof Step 2): Define the non negative sequence (hn) by

\[ h_n(x) = |g_1(x)| + \sum_{k=1}^{n} |g_{k+1}(x) - g_k(x)|. \]

In this definition, there is the usual messiness of where all the differences are defined. Let’s clear that up. Each pair (gk, gk+1) has a potential set Ek of measure zero where the subtraction is not defined. Thus, we need to throw away the set E = ∪k Ek which also has measure 0. Thus, it is clear that all of the hn are defined on $E^C$. Now they may take on the value ∞, but that is acceptable. We see $h_n^p \uparrow g^p$ on $E^C$. Apply Fatou’s Lemma to $(h_n^p I_{E^C})$. We find

\[ \int \left( \liminf h_n^p \, I_{E^C} \right) d\mu \le \liminf \int h_n^p \, I_{E^C} \, d\mu. \]

But, $\liminf h_n^p = g^p$ and so

\[ \int g^p \, I_{E^C} \, d\mu \le \liminf \int h_n^p \, I_{E^C} \, d\mu. \]

The pth root function is continuous and so

\[ \liminf \left( \int h_n^p \, I_{E^C} \, d\mu \right)^{1/p} = \left( \liminf \int h_n^p \, I_{E^C} \, d\mu \right)^{1/p}. \]

Then, since the pth root function is increasing, we have

\[ \left( \int g^p \, I_{E^C} \, d\mu \right)^{1/p} \le \liminf \left( \int h_n^p \, I_{E^C} \, d\mu \right)^{1/p}. \]

Next, applying Minkowski’s Inequality to a finite sum, we obtain

\begin{align*}
\left( \int h_n^p \, I_{E^C} \, d\mu \right)^{1/p} &= \left( \int \Big( |g_1| + \sum_{k=1}^{n} |g_{k+1} - g_k| \Big)^p I_{E^C} \, d\mu \right)^{1/p} \\
&\le \| g_1 I_{E^C} \|_p + \sum_{k=1}^{n} \| (g_{k+1} - g_k) I_{E^C} \|_p.
\end{align*}

Since the finite sums are monotonically increasing, we have immediately that the series

\[ \sum_{k=1}^{\infty} \| (g_{k+1} - g_k) I_{E^C} \|_p \]

is a well defined extended real-valued number. Thus, we have

\[ \left( \int h_n^p \, I_{E^C} \, d\mu \right)^{1/p} \le \| g_1 I_{E^C} \|_p + \sum_{k=1}^{\infty} \| (g_{k+1} - g_k) I_{E^C} \|_p. \]

By Equation α, we also know that

\[ \sum_{k=1}^{\infty} \| (g_{k+1} - g_k) I_{E^C} \|_p \le \sum_{k=1}^{\infty} 1/2^k = 1. \]

Hence, we can actually say

\[ \left( \int g^p \, I_{E^C} \, d\mu \right)^{1/p} \le \| g_1 I_{E^C} \|_p + 1. \]

We conclude $g \, I_{E^C}$ is in Lp(X, S, µ). Next, note if $F = \{ x \mid g(x) I_{E^C}(x) = \infty \}$, we know F has measure 0. Hence, $g \, I_{E^C \cap F^C}$ is finite. This completes Step 2.

(Proof Step 3): Now define the function f by

\[ f(x) = \begin{cases} g_1(x) + \sum_{k=1}^{\infty} \big( g_{k+1}(x) - g_k(x) \big), & x \in E^C \cap F^C, \\ 0, & x \in E \cup F. \end{cases} \]

Note, for $x \in E^C \cap F^C$ and k ≥ 2,

\begin{align*}
|g_k| &= |g_1 + (g_2 - g_1) + (g_3 - g_2) + \ldots + (g_k - g_{k-1})| \\
&\le |g_1| + \sum_{i=1}^{k-1} |g_{i+1} - g_i| = h_{k-1}.
\end{align*}

However, we have already seen that on this set $h_k \uparrow g$. Hence, we can say

\[ |g_k| \le g. \]

This tells us that the partial sum expansion of gk converges absolutely on $E^C \cap F^C$ and thus, gk converges to the sum of the series, which is f on this set. So we have shown that gk converges to f a.e. Since $|g_k|^p \le g^p$ and $g^p$ is summable, we can now apply the Lebesgue Dominated Convergence Theorem to say

\[ \lim_n \int |g_n|^p \, d\mu = \int |f|^p \, d\mu. \]

Since |gk| ≤ g for all k, it follows $|f|^p \le |g|^p$. Since g is p summable, we have established that f is in Lp(X, S, µ).

(Proof Step 4): Now we show gk converges to f in ‖ · ‖p. To see this, let zk = f − gk on $E^C \cap F^C$. From the definition of f, we can write this as $\sum_{j=k}^{\infty} (g_{j+1} - g_j)$. The rest of the argument is very similar to the one used in Step 2. Consider the partial sums

\[ z_k^n = \sum_{j=k}^{n} |g_{j+1} - g_j|, \]

which dominate the absolute values of the partial sums of this convergent series. Minkowski’s Inequality then gives for all n,

\[ \| z_k^n \|_p \le \sum_{j=k}^{n} \| g_{j+1} - g_j \|_p. \]

Using Equation α, it follows that the right hand side is bounded above by $\sum_{j=k}^{n} 1/2^j$ which is less than $1/2^{k-1}$. Now apply Fatou’s Lemma to find

\[ \int \liminf_n |z_k^n|^p \, d\mu \le \liminf_n \int |z_k^n|^p \, d\mu \]

or, since $|z_k| \le \lim_n z_k^n$,

\[ \int |z_k|^p \, d\mu \le \liminf_n \int |z_k^n|^p \, d\mu. \]

The continuity and increasing nature of the pth root then give us

\[ \left( \int |z_k|^p \, d\mu \right)^{1/p} \le \liminf_n \left( \int |z_k^n|^p \, d\mu \right)^{1/p} \le 1/2^{k-1}. \]

Since $1/2^{k-1} \to 0$ as k → ∞, we have ‖ f − gk ‖p → 0.

(Proof Step 5): Finally, given ε > 0, since [fn] is a Cauchy sequence, there is an N so that

‖ fn − fm ‖p< ε/2, ∀ n,m > N.

Since ([gk]) is a subsequence of ([fn]), there is a K1 so that if k > K1, we have

‖ fm − gk ‖p< ε/2, ∀m > N, k > K1.


Also, since gk → f in p - norm, there is a K2 so that

‖ gk − f ‖p< ε/2, ∀ k > K2.

We conclude for any given k > max(K1,K2), we have

‖ fm − f ‖p ≤ ‖ fm − gk ‖p + ‖ gk − f ‖p < ε

if m > N . Thus, [fn]→ [f ] in p - norm.

The proof of the theorem above has buried in it a powerful result. We state this below.

Theorem 10.1.10 (Sequences That Converge In p-Norm Possess Subsequences Converging Pointwise a.e.). Let 1 ≤ p < ∞. Let ([fn]) be a sequence in Lp(X, S, µ) which converges in norm to [f] in Lp(X, S, µ). Then, there is a subsequence $([f_n^1])$ of ([fn]) which converges pointwise a.e. to f.

Proof 10.1.10. The sequence we seek is the sequence (gn) as defined in the proof of Theorem 10.1.9; see

the discussion for the proof of Step (3).

10.2 The World Of Counting Measure

Let’s see what the previous material means when we use counting measure, µC, on the set of natural numbers N. In this case, the sigma-algebra is the power set of N. Note if f is an extended real-valued function on N, then f is identified with a sequence of extended real-valued numbers, (an), so that f(n) = an. It is therefore possible for f(n) = ∞ or f(n) = −∞ for some n. Let

\[ \phi_N(n) = \begin{cases} |f(n)|, & 1 \le n \le N, \\ 0, & n > N. \end{cases} \]

Then, $\phi_N \uparrow |f|$ and so by the Monotone Convergence Theorem,

\[ \int |f| \, d\mu_C = \lim_N \int \phi_N \, d\mu_C. \]

Now the simple functions φN are not in their standard representation. Let c1, . . . , cM be the distinct elements of {|a1|, . . . , |aN|}. Then we can write

\[ \phi_N = \sum_{i=1}^{M} c_i \, I_{E_i}, \]

where Ei is the pre-image of each distinct element ci. The sets Ei are clearly disjoint by construction. It is a straightforward matter to see that

\[ \int \phi_N \, d\mu_C = \sum_{i=1}^{M} c_i \, \mu_C(E_i) = \sum_{i=1}^{N} |a_i|. \]

Thus, we have

\[ \int |f| \, d\mu_C = \lim_N \sum_{i=1}^{N} |f(i)|. \]

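Since all the terms |f(i)| are non negative, we see the sequence of partial sums converges to some extended real-valued number (possibly ∞). For counting measure, the only set of measure 0 is ∅, so measurable functions can not differ on a set of measure 0 in this case. Each equivalence class therefore contains exactly one function, and the space of equivalence classes can be identified with the space of functions. We see for 1 ≤ p < ∞,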

Lp(N,P(N), µC) = Lp(N,P(N), µC).

Further,

\[ L_p(N, P(N), \mu_C) = \Big\{ \text{sequences } (a_n) \ \Big| \ \sum_{i=1}^{\infty} |a_i|^p \text{ converges} \Big\}. \]

We typically use the notation

\[ \ell_p = L_p(N, P(N), \mu_C) = \Big\{ \text{sequences } (a_n) \ \Big| \ \sum_{i=1}^{\infty} |a_i|^p \text{ converges} \Big\} \]

and we call this a sequence space. Note in all cases, summability implies the sequence involved must be finite everywhere.
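To get a feel for membership in these sequence spaces, the Python sketch below computes partial sums of |a_n|^p for the hypothetical sequence a_n = 1/n. The function name `partial_p_sum` and the cutoffs are assumptions for this illustration, and finite partial sums can only suggest convergence or divergence. Here the p = 2 sums stay below π²/6, while the p = 1 sums are the harmonic numbers, which grow without bound.

```python
import math

def partial_p_sum(p, N):
    """Return sum_{n=1}^{N} |a_n|**p for the sequence a_n = 1/n."""
    return sum((1.0 / n) ** p for n in range(1, N + 1))

for N in (10, 1_000, 100_000):
    # First column grows like log N, second column stays bounded.
    print(N, partial_p_sum(1, N), partial_p_sum(2, N))

print(math.pi ** 2 / 6)  # the limit of the p = 2 partial sums
```

So this sequence belongs to the p = 2 sequence space but not to the p = 1 sequence space.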

In this context, Holder’s Inequality becomes:

Theorem 10.2.1 (Hölder’s Inequality: Sequence Spaces). Let 1 < p < ∞ and q be the index conjugate to p. Let $(a_n) \in \ell_p$ and $(b_n) \in \ell_q$. Then $(a_n b_n) \in \ell_1$ and

\[ \sum_n |a_n b_n| \le \left( \sum_n |a_n|^p \right)^{1/p} \left( \sum_n |b_n|^q \right)^{1/q}. \]

In the same way, Minkowski’s Inequality becomes:

Theorem 10.2.2 (Minkowski’s Inequality: Sequence Spaces). Let 1 ≤ p < ∞ and let $(a_n), (b_n) \in \ell_p$. Then $(a_n + b_n)$ is in $\ell_p$ and

\[ \left( \sum_n |a_n + b_n|^p \right)^{1/p} \le \left( \sum_n |a_n|^p \right)^{1/p} + \left( \sum_n |b_n|^p \right)^{1/p}. \]
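For finitely supported sequences, both sequence space inequalities can be evaluated directly. The Python sketch below is an illustration under stated assumptions: the sample sequences and the helper names `p_norm`, `holder_sides` and `minkowski_sides` are hypothetical, and each helper simply returns the left and right hand sides so they can be compared.

```python
def p_norm(a, p):
    """Return (sum |a_n|**p)**(1/p) for a finite sequence a."""
    return sum(abs(x) ** p for x in a) ** (1.0 / p)

def holder_sides(a, b, p):
    """Return (sum |a_n b_n|, ||a||_p * ||b||_q) with q conjugate to p, 1 < p."""
    q = p / (p - 1.0)
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    return lhs, p_norm(a, p) * p_norm(b, q)

def minkowski_sides(a, b, p):
    """Return (||a + b||_p, ||a||_p + ||b||_p) for 1 <= p."""
    lhs = p_norm([x + y for x, y in zip(a, b)], p)
    return lhs, p_norm(a, p) + p_norm(b, p)

a = [1.0, -2.0, 3.0]
b = [4.0, 5.0, -6.0]
for p in (1.5, 2.0, 3.0):
    print(p, holder_sides(a, b, p), minkowski_sides(a, b, p))
```

In each printed pair the first entry should not exceed the second; the case p = 2 of `holder_sides` is the Cauchy-Schwarz situation.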

Finally, the special case of p = q = 2 should be mentioned. The sequence space version of the resulting Cauchy-Schwarz Inequality has this form:

Theorem 10.2.3 (Cauchy-Bunyakovskii-Schwarz Inequality: Sequence Spaces). Let $(a_n), (b_n) \in \ell_2$. Then $(a_n b_n) \in \ell_1$ and

\[ \sum_n |a_n b_n| \le \left( \sum_n |a_n|^2 \right)^{1/2} \left( \sum_n |b_n|^2 \right)^{1/2}. \]

10.3 Equivalence Classes of Essentially Bounded Functions

There is one more space to define. This will be the analogue of the space of bounded functions we use in

the definition of the Riemann Integral.

Definition 10.3.1 (Essentially Bounded Functions). Let (X, S, µ) be a measure space and let f be measurable. If E is a set of measure 0, let

\[ \xi(E) = \sup_{x \in E^C} |f(x)| \]

and

\[ \rho_\infty(f) = \inf \{ \xi(E) \mid E \in S, \ \mu(E) = 0 \}. \]

If ρ∞(f) is finite, we say f is an essentially bounded function.

There are then two more spaces to consider:

Definition 10.3.2 (The Spaces of Essentially Bounded Functions). Let (X, S, µ) be a measure space. Then we define

L∞(X, S, µ) = { f : X → ℜ | f ∈ M(X, S), ρ∞(f) < ∞ }

and defining equivalence classes using a.e. equivalence,

L∞(X, S, µ) = { [f] | ρ∞(f) < ∞ }.

There is an equivalent way of characterizing an essentially bounded function, which we state as a theorem.

Theorem 10.3.1 (Alternate Characterization Of Essentially Bounded Functions). Let (X, S, µ) be a measure space and f be a measurable function. Define q∞(f) by

\[ q_\infty(f) = \inf \{ a \mid \mu( \{ x \mid |f(x)| > a \} ) = 0 \}. \]

Then, ρ∞(f) = q∞(f).

Proof 10.3.1. Let $E_a = \{ x \mid |f(x)| > a \}$. If a is a number so that $\mu(E_a) > 0$, then for any measurable set A with measure 0, we have

\[ A^C = (A^C \cap E_a) \cup (A^C \cap E_a^C), \]

and $A^C \cap E_a$ is non empty since $\mu(A^C \cap E_a) = \mu(E_a) > 0$. Thus,

\[ \sup_{A^C} |f| \ge \sup_{A^C \cap E_a} |f| \ge a \]

because if $x \in A^C \cap E_a$, then |f(x)| > a. Every a < q∞(f) satisfies $\mu(E_a) > 0$, so since we can do this for all such a, it follows that

\[ \sup_{A^C} |f| \ge q_\infty(f). \]

Further, since the measurable set A with measure zero is arbitrary, we must have

\[ \rho_\infty(f) \ge q_\infty(f). \]

Next, we prove the reverse inequality. If $\mu(E_a) = 0$, then by the definition of ρ∞(f), we have

\[ \rho_\infty(f) \le \sup_{E_a^C} |f| = \sup_{|f(x)| \le a} |f(x)| \le a. \]

But this is true for all such a. Thus, ρ∞(f) is a lower bound for the set $\{ a \mid \mu(E_a) = 0 \}$ and we can say

\[ \rho_\infty(f) \le q_\infty(f). \]
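On a finite set with point masses, both quantities can be computed by brute force, which makes the equality easy to see in a small case. The Python sketch below is illustrative only; the data, the dictionary representation of f and of the point masses, and the function names `rho_inf` and `q_inf` are assumptions, and it presumes at least one point has positive mass.

```python
from itertools import combinations

def rho_inf(f, w):
    """inf over null sets E of sup_{x not in E} |f(x)| (the first definition)."""
    null_points = [x for x in f if w[x] == 0]
    best = float("inf")
    # On a finite set, the null sets are exactly the subsets of the zero-mass points.
    for r in range(len(null_points) + 1):
        for E in combinations(null_points, r):
            best = min(best, max(abs(f[x]) for x in f if x not in E))
    return best

def q_inf(f, w):
    """inf{a : mu({x : |f(x)| > a}) = 0}; on a finite set the infimum is attained at some |f(x)|."""
    for a in sorted({abs(v) for v in f.values()}):
        if sum(w[x] for x in f if abs(f[x]) > a) == 0:
            return a

# Hypothetical example: the large value at "c" sits on a point of measure zero.
f = {"a": 1.0, "b": -3.0, "c": 100.0}
w = {"a": 0.5, "b": 0.5, "c": 0.0}
print(rho_inf(f, w), q_inf(f, w))
```

Both functions ignore the value 100 carried by the null point, so the essential bound is 3 even though the ordinary supremum of |f| is 100.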

We need to know that if two functions are equivalent with respect to the measure µ, then their ρ∞ values agree.

Lemma 10.3.2 (Essentially Bounded Functions That Are Equivalent Have The Same Essential Bound). Let (X, S, µ) be a measure space and f and g be equivalent measurable functions such that ρ∞(f) is finite. Then ρ∞(g) = ρ∞(f).

Proof 10.3.2. Let E be the set of points where f and g are not equal. Then µ(E) = 0. Now,

\[ 0 \le \mu\big( (|g(x)| > a) \cap E \big) \le \mu(E) = 0. \]

Thus,

\[ \mu\big( (|g(x)| > a) \big) = \mu\big( (|g(x)| > a) \cap E \big) + \mu\big( (|g(x)| > a) \cap E^C \big) = \mu\big( (|g(x)| > a) \cap E^C \big). \]

But on $E^C$, f and g match, so we have

\[ \mu\big( (|g(x)| > a) \big) = \mu\big( (|f(x)| > a) \cap E^C \big) = \mu\big( (|f(x)| > a) \big), \]

by the same sort of argument we used on µ((|g(x)| > a)). Hence, if µ((|f(x)| > a)) = 0, then µ((|g(x)| > a)) = 0 as well. This immediately implies q∞(g) = q∞(f). The result then follows because q∞ = ρ∞.

Finally, we can show that essentially bounded functions are bounded above by their essential bound

a.e.

Lemma 10.3.3 (Essentially Bounded Functions Bounded Above By Their Essential Bound a.e.). Let (X, S, µ) be a measure space and f be a measurable function such that ρ∞(f) is finite. Then

|f(x)| ≤ ρ∞(f) a.e.



Proof 10.3.3. Let E = (|f(x)| > ρ∞(f)). It is easy to see that

\[ E = \bigcup_{k=1}^{\infty} \big( |f(x)| > \rho_\infty(f) + 1/k \big). \]

If you look at how q∞ is defined, if µ(|f(x)| > ρ∞(f) + 1/k) > 0, that would force q∞(f) = ρ∞(f) ≥ ρ∞(f) + 1/k which is not possible. Hence, µ(|f(x)| > ρ∞(f) + 1/k) = 0 for all k. This means E has measure 0 also. It is then clear from the definition of the set E that |f(x)| ≤ ρ∞(f) on $E^C$.

We can now prove that L∞(X,S, µ) is a vector space with norm ‖ [f ] ‖∞= ρ∞(f).

Theorem 10.3.4 (The L∞ Semi-Norm). Let (X, S, µ) be a measure space. Define ‖ · ‖∞ on L∞(X, S, µ) by

‖ f ‖∞ = ρ∞(g),

where g is any representative of [f ]. Then, ‖ · ‖∞ is a semi-norm.

Proof 10.3.4. We show ρ∞(·) satisfies all the properties of a norm except N2 and hence it is a semi - norm.

(N1): It is clear the N1 is satisfied because ρ∞(·) is always non negative.

(N2): Let 0X be the function defined to be 0 for all x and let $E_a = \{ x \mid |0_X(x)| > a \}$. It is clear $E_a = \emptyset$ for all a > 0. Thus, since ρ∞ = q∞,

\[ q_\infty(0_X) = \inf \{ a \mid \mu(E_a) = 0 \} = 0. \]

However, if q∞(f) = 0, let Fn = (|f(x)| > 1/n). Then, by definition of q∞(f), it follows that µ(Fn) = 0 and |f(x)| ≤ 1/n on the complement $F_n^C$. Let F = ∪ Fn. Then, µ(F) = 0 and

\[ F^C = \bigcap_n F_n^C = \bigcap_n \big( |f(x)| \le 1/n \big) = \big( f(x) = 0 \big). \]

Thus, f is 0 on $F^C$ and non zero on F which has measure 0. All that we can say then is that f = 0 a.e. and hence, ‖ · ‖∞ does not satisfy N2.

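(N3): If α is 0, the result is clear. If α is a positive number, then

\begin{align*}
q_\infty(\alpha f) &= \inf \{ a \mid \mu( \{ x \mid |\alpha f(x)| > a \} ) = 0 \} \\
&= \inf \{ a \mid \mu( \{ x \mid |f(x)| > a/\alpha \} ) = 0 \}.
\end{align*}

Let β = a/α and we have

\begin{align*}
q_\infty(\alpha f) &= \inf \{ \alpha \beta \mid \mu( \{ x \mid |f(x)| > \beta \} ) = 0 \} \\
&= \alpha \inf \{ \beta \mid \mu( \{ x \mid |f(x)| > \beta \} ) = 0 \} \\
&= \alpha \, q_\infty(f).
\end{align*}

If α is negative, simply write αf as |α| (−f) and apply the result for a positive α.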

(N4): Now let f and g be in L∞(X, S, µ) with the sum f + g defined in the usual way on $E_{fg}^C$ with $\mu(E_{fg}) = 0$. Note on $E_{fg}$ itself, f(x) + g(x) = 0, so the sum is bounded above by ρ∞(f) + ρ∞(g) there.

Now by Lemma 10.3.3, there are sets F and G of measure 0 so that

|f(x)| ≤ ρ∞(f), ∀x ∈ FC ,

|g(x)| ≤ ρ∞(g), ∀x ∈ GC .

Thus,

|f(x) + g(x)| ≤ ρ∞(f) + ρ∞(g), ∀x ∈ FC ∩ GC ,

Thus, the measure of the set of points where |f(x) + g(x)| > ρ∞(f) + ρ∞(g) is zero as µ(F ∪G) = 0.

By definition of q∞, it then follows that

q∞(f + g) ≤ ρ∞(f) + ρ∞(g).

which implies the result.

Theorem 10.3.5 (L∞ Is A Normed Linear Space). L∞(X, S, µ) is a vector space over ℜ with scalar multiplication and object addition defined as

α [f ] = [α f ] ∀ [f ]

[f ] + [g] = [f + g],∀[f ] and [g].

Further, ‖ [f ] ‖∞ defined by ‖ [f ] ‖∞= ρ∞(g), for any representative g of [f ] is a norm on

L∞(X,S, µ).

Proof 10.3.5. The argument that the scalar multiplication and addition of equivalence classes are well defined is the same as the one we used in the proof of Theorem 10.1.5 and so we will not repeat it here. From Lemma 10.3.2 we know that any two functions which are equivalent a.e. will have the same value for ρ∞ and so ‖ [f ] ‖∞ is independent of the choice of representative from [f ]. The proofs that properties N1, N3 and N4 hold

follow immediately from the fact that they hold for representatives of equivalence classes. It remains to

show that if ‖ [f ] ‖∞= 0, then [f ] = [0X ] where 0X is the zero function on X . However, we have already

established in the proof of Theorem 10.3.4 that such an f is 0 a.e. This tells us f ∈ [0X ]; thus, [f ] = [0X ].

Theorem 10.3.6 (L∞ Is A Banach Space). $L_\infty(X, S, \mu)$ is complete with respect to the norm $\| \cdot \|_\infty$.


Proof 10.3.6. Let $([f_n])$ be a Cauchy sequence of objects in $L_\infty(X, S, \mu)$. Now everything is independent of the choice of representative of an equivalence class, so for convenience, we will use as our representatives the functions $f_n$ themselves. Then, by Lemma 10.3.3, there are sets $E_n$ of measure 0 so that

\[ |f_n(x)| \le \rho_\infty(f_n), \ \forall x \in E_n^C. \]

Also, there are sets $F_{nm}$ of measure 0 so that

\[ |f_n(x) - f_m(x)| \le \rho_\infty(f_n - f_m), \ \forall x \in F_{nm}^C. \]

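Hence, both of the inequalities above hold on

\[ \bigcap_{m=1}^\infty \bigcap_{n=1}^\infty \left( E_n^C \cap F_{nm}^C \right). \]

We then use De Morgan's Laws to rewrite this set as follows:

\[ \bigcap_{m=1}^\infty \bigcap_{n=1}^\infty \left( E_n \cup F_{nm} \right)^C = \bigcap_{m=1}^\infty \Big( \bigcup_{n=1}^\infty \left( E_n \cup F_{nm} \right) \Big)^C = \Big( \bigcup_{m=1}^\infty \bigcup_{n=1}^\infty \left( E_n \cup F_{nm} \right) \Big)^C. \]

Let $U = \bigcup_{m=1}^\infty \bigcup_{n=1}^\infty ( E_n \cup F_{nm} )$, so that the set above is $U^C$. Clearly, the measure of $U$ is 0 and

\[ |f_n(x) - f_m(x)| \le \rho_\infty(f_n - f_m), \ \forall x \in U^C. \qquad (\alpha) \]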

Now since $([f_n])$ is a Cauchy sequence with respect to $\| \cdot \|_\infty$, given $\epsilon > 0$, there is a positive integer $N$ so that

\[ |f_n(x) - f_m(x)| \le \rho_\infty(f_n - f_m) < \epsilon/4, \ \forall x \in U^C, \ \forall n, m > N. \qquad (\beta) \]

Equation $\beta$ implies that at each $x$ in $U^C$, the sequence $(f_n(x))$ is a Cauchy sequence of real numbers. By the completeness of $\Re$, it then follows that $\lim_n f_n(x)$ exists on $U^C$. Define the function $f : X \to \Re$ by

\[ f(x) = \begin{cases} \lim_n f_n(x), & x \in U^C, \\ 0, & x \in U. \end{cases} \]

From Equation $\beta$, we have that

\[ \lim_n |f_n(x) - f_m(x)| \le \epsilon/4, \ \forall x \in U^C, \ \forall m > N. \]


As usual, since the absolute value function is continuous, we can let the limit operation pass into the absolute value function to obtain

\[ |f(x) - f_m(x)| \le \epsilon/4, \ \forall x \in U^C, \ \forall m > N. \qquad (\gamma) \]

From the backwards triangle inequality, we then find

\[ |f(x)| \le \epsilon/4 + |f_m(x)| \le \epsilon/4 + \rho_\infty(f_m), \ \forall x \in U^C, \ \forall m > N. \]

Now fix $M > N + 1$. Then

\[ |f(x)| \le \epsilon/4 + \rho_\infty(f_M), \ \forall x \in U^C. \]

Since the set $(|f(x)| > \epsilon/4 + \rho_\infty(f_M))$ is contained in $U$, its measure is 0. From the definition of $q_\infty(f)$, it then follows that

\[ q_\infty(f) \le \epsilon/4 + \rho_\infty(f_M). \]

We have thus shown $f$ is essentially bounded and since $f$ equals $\lim_n f_n$ off the set $U$ of measure 0, we have $f$ is in $L_\infty(X, S, \mu)$.

It remains to show that $[f_n]$ converges to $[f]$ in norm. Note that Equation $\gamma$ implies that $(f_n)$ converges uniformly to $f$ on $U^C$. Further, for each $m > N$, the set $(|f_m(x) - f(x)| > \epsilon/4)$ is contained in $U$ and so has measure 0. Thus, we can conclude

\[ q_\infty(f - f_m) \le \epsilon/4 < \epsilon, \ \forall m > N. \]

This shows the desired convergence in norm.

Thus, we have shown $L_\infty(X, S, \mu)$ is complete.

From the proofs above, we see Minkowski’s Inequality holds for the case p = ∞ because ‖ · ‖∞ is

a norm. Finally, we can complete the last case of Holder’s Inequality: the case of the conjugate indices

p = 1 and q =∞. We obtain

Theorem 10.3.7 (Holder's Inequality: p = 1). Let $p = 1$ and $q = \infty$ be the index conjugate to 1. Let $[f] \in L_1(X, S, \mu)$ and $[g] \in L_\infty(X, S, \mu)$. Then $[f g] \in L_1(X, S, \mu)$ and

\[ \int |f g| \, d\mu \le \| [f] \|_1 \, \| [g] \|_\infty. \]

Proof 10.3.7. It is enough to prove this result for the representatives of the equivalence classes $f \in [f]$ and $g \in [g]$. We know the product $fg$ is measurable. It remains to show that $fg$ is summable. Since $g$ is essentially bounded, by Lemma 10.3.3, there is a set $E$ of measure 0 so that

\[ |g(x)| \le \rho_\infty(g), \ \forall x \in E^C. \]


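Thus, $|f(x) g(x)| \le |f(x)| \, \rho_\infty(g)$ a.e. and since the right hand side is summable, by Theorem 9.6.1, we see $fg$ is also summable and

\[ \int |f g| \, d\mu \le \int |f| \, \rho_\infty(g) \, d\mu = \rho_\infty(g) \int |f| \, d\mu = \| [f] \|_1 \, \| [g] \|_\infty. \]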
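As a sanity check on the inequality just proved, here is a minimal Python sketch on a finite weighted measure space. The space, the weights `MU` and the sample functions `f` and `g` are hypothetical choices made only for illustration; they do not come from the text. One point has weight 0, so the very large value of `g` there is ignored by the essential supremum, which is the point of using ρ∞ rather than a plain supremum.

```python
from fractions import Fraction

# Hypothetical finite measure space {0, 1, 2, 3} with mu({i}) = MU[i].
# The point 3 has weight 0, so {3} is a set of measure zero.
MU = [Fraction(1, 2), Fraction(1), Fraction(2), Fraction(0)]

f = [Fraction(3), Fraction(-1), Fraction(1, 2), Fraction(7)]
g = [Fraction(-2), Fraction(4), Fraction(1), Fraction(100)]  # huge only on the null set


def integral(h):
    """Integral of h with respect to the weighted counting measure MU."""
    return sum(w * v for w, v in zip(MU, h))


def ess_sup(h):
    """Essential supremum of |h|: values on measure-zero points are ignored."""
    return max(abs(v) for w, v in zip(MU, h) if w > 0)


norm1_f = integral([abs(v) for v in f])          # plays the role of ||[f]||_1
norm_inf_g = ess_sup(g)                          # plays the role of ||[g]||_inf
integral_fg = integral([abs(a * b) for a, b in zip(f, g)])

print(norm1_f, norm_inf_g, integral_fg)
print(integral_fg <= norm1_f * norm_inf_g)
```

Exact rational arithmetic is used so that the comparison is not blurred by floating point rounding.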

10.4 The Hilbert Space L2

The space $L_2(X, S, \mu)$ is a normed linear space with norm $\| [\cdot] \|_2$. This space is also an inner product space which is complete. Such a space is called a Hilbert space.

Definition 10.4.1 (Inner Product Space). Let $X$ be a non empty vector space over $\Re$. Let $\omega : X \times X \to \Re$ be a mapping which satisfies

(IP1:) $\omega(x + y, z) = \omega(x, z) + \omega(y, z)$, $\forall x, y, z \in X$,

(IP2:) $\omega(\alpha x, y) = \alpha \, \omega(x, y)$, $\forall \alpha \in \Re$, $\forall x, y \in X$,

(IP3:) $\omega(x, y) = \omega(y, x)$, $\forall x, y \in X$,

(IP4:) $\omega(x, x) \ge 0$, $\forall x \in X$, and $\omega(x, x) = 0 \Leftrightarrow x = 0$.

Such a mapping is called a real inner product on the real vector space $X$. It is easy to define a similar mapping on complex vector spaces, but we will not do that here. We typically use the symbol $< \cdot, \cdot >$ to denote the value $\omega(\cdot, \cdot)$.

There is much more we could say on this subject, but instead we will focus on how we can define an

inner product on L2(X,S, µ).

Theorem 10.4.1 (The Inner Product on The Space of Square Summable Equivalence Classes). For brevity, let $L_2$ denote $L_2(X, S, \mu)$. The mapping $< \cdot, \cdot >$ on $L_2 \times L_2$ defined by

\[ < [f], [g] > \ = \int u \, v \, d\mu, \ \forall u \in [f], \ v \in [g] \]

is an inner product on $L_2$. Moreover, it induces the norm $\| [\cdot] \|_2$ by

\[ \| [f] \|_2 = \sqrt{ \int |f|^2 \, d\mu } = \sqrt{ < [f], [f] > }. \]


Proof 10.4.1. The proof of these assertions is immediate as we have already shown ‖ · ‖2 is a norm and

the verification of properties IP1 to IP4 is straightforward.

Finally, from our general Lp results, we know L2 is complete. However, for the record, we state this

as a theorem.

Theorem 10.4.2 (The Space of Square Summable Equivalence Classes Is A Hilbert Space). For brevity, let $L_2$ denote $L_2(X, S, \mu)$. Then $L_2$ is complete with respect to the norm induced by the inner product $< [\cdot], [\cdot] >$. The inner product space $(L_2, < \cdot, \cdot >)$ is often denoted by the symbol $H$.
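The inner product properties IP1 to IP4 can be checked numerically in the simplest possible setting. The following Python sketch assumes a hypothetical three point measure space with strictly positive weights `MU`, so each function is its own equivalence class; the sample vectors and the scalar `alpha` are illustrative choices only.

```python
from fractions import Fraction

# Hypothetical measure space {0, 1, 2} with strictly positive weights mu({i}).
MU = [Fraction(1, 2), Fraction(1), Fraction(2)]


def inner(u, v):
    """<u, v> = integral of u * v with respect to the weighted counting measure."""
    return sum(w * a * b for w, a, b in zip(MU, u, v))


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def scale(c, u):
    return [c * a for a in u]


f = [Fraction(1), Fraction(-2), Fraction(3)]
g = [Fraction(4), Fraction(0), Fraction(-1)]
h = [Fraction(2), Fraction(5), Fraction(1)]
zero = [Fraction(0)] * 3
alpha = Fraction(-3, 2)

ip1 = inner(add(f, g), h) == inner(f, h) + inner(g, h)   # additivity in the first slot
ip2 = inner(scale(alpha, f), g) == alpha * inner(f, g)   # homogeneity
ip3 = inner(f, g) == inner(g, f)                         # symmetry
ip4 = inner(f, f) > 0 and inner(zero, zero) == 0         # positivity
cauchy_schwarz = inner(f, g) ** 2 <= inner(f, f) * inner(g, g)

print(ip1, ip2, ip3, ip4, cauchy_schwarz)
```

The last check is the p = q = 2 case of Holder's Inequality written in terms of the inner product.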

10.5 Homework

Exercise 10.5.1. Let $(X, S, \mu)$ be a measure space. Let $f$ be in $L_p(X, S, \mu)$ for $1 \le p < \infty$. Let $E = \{ x \mid |f(x)| \ne 0 \}$. Prove $E$ is σ - finite.

Hint 10.5.1. Divide the indicated set into $(1 \le |f(x)|)$ and $\cup_n (1/n \le |f(x)| < 1)$.

Exercise 10.5.2. Let $(X, S, \mu)$ be a finite measure space. If $f$ is measurable, let

\[ E_n = \{ x \mid n - 1 \le |f(x)| < n \}. \]

Prove $f$ is in $L_1(X, S, \mu)$ if and only if $\sum_{n=1}^\infty n \, \mu(E_n) < \infty$.

More generally, prove $f$ is in $L_p(X, S, \mu)$, $1 \le p < \infty$, if and only if $\sum_{n=1}^\infty n^p \, \mu(E_n) < \infty$.

Hint 10.5.2. Note E1 has finite measure because X does.

Part V

Constructing Measures

Chapter 11

Constructing Measures

Although you now know quite a bit about measures, measurable functions, associated integration and the

like, you still do not have many concrete and truly interesting measures to work with. In this chapter, you

will learn how to construct interesting measures using some simple procedures. A very good reference

for this material is (Bruckner et al. (1) 1997). Another good source is (Taylor (7) 1985). We begin with a

definition.

11.1 Measures From Outer Measure

Definition 11.1.1 (Outer Measure). Let $X$ be a non empty set and let $\mu^*$ be an extended real valued mapping defined on all subsets of $X$ that satisfies

(i): $\mu^*(\emptyset) = 0$.

(ii): If $A$ and $B$ are subsets of $X$ with $A \subseteq B$, then $\mu^*(A) \le \mu^*(B)$.

(iii): If $(A_n)$ is a sequence of subsets of $X$, then $\mu^*( \cup_n A_n ) \le \sum_n \mu^*(A_n)$.

Such a mapping is an outer measure on $X$ and condition (iii) is called the countable subadditivity (CSA) condition. Note the sets $A_n$ in condition (iii) need not be disjoint.

Comment 11.1.1. Since $\emptyset \subseteq A$ for all subsets $A$ of $X$, condition (ii) tells us $\mu^*(\emptyset) \le \mu^*(A)$. Hence, by condition (i), we have $\mu^*(A) \ge 0$ always. Thus, the outer measure is non negative.

The outer measure is defined on all the subsets of $X$. In Chapter 9, we defined the notion of a measure on a σ - algebra of subsets of $X$. Look back at Definition 9.0.1 again. Recall, the extended real valued mapping $\mu$ defined on $S$ is a measure on $S$ if


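(i): $\mu(\emptyset) = 0$,

(ii): $\mu(E) \ge 0$ for all $E$ in $S$,

(iii): if $(E_n)$ is a countable collection of disjoint sets in $S$, then $\mu(\cup_n E_n) = \sum_n \mu(E_n)$.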

The third condition says the mapping $\mu$ is countably additive and hence, we label this condition as condition (CA). The collection of all subsets of $X$ is the largest σ - algebra of subsets of $X$, so to convert the outer measure $\mu^*$ into a measure, we have to convert the countable subadditivity condition to countable additivity. This is not that easy to do! Now if $T$ and $E$ are any subsets of $X$, then we know

\[ T = ( T \cap E ) \cup ( T \cap E^C ). \]

The outer measure $\mu^*$ is subadditive on finite disjoint unions and so we always have

\[ \mu^*(T) \le \mu^*( T \cap E ) + \mu^*( T \cap E^C ). \]

To have equality, we need to have

\[ \mu^*(T) \ge \mu^*( T \cap E ) + \mu^*( T \cap E^C ) \]

also. So, as a first step towards the countable additivity condition we need, why don't we look at all subsets $E$ of $X$ that satisfy the condition

\[ \mu^*(T) \ge \mu^*( T \cap E ) + \mu^*( T \cap E^C ), \ \forall T \subseteq X. \]

We don't know how many such sets $E$ there are at this point. But we certainly want finite additivity to hold. Therefore, it seems like a good place to start. This condition is called the Caratheodory Condition.

Definition 11.1.2 (Caratheodory Condition). Let $\mu^*$ be an outer measure on the non empty set $X$. A subset $E$ of $X$ satisfies the Caratheodory Condition if for all subsets $T$,

\[ \mu^*(T) \ge \mu^*( T \cap E ) + \mu^*( T \cap E^C ). \]

Such a set $E$ is called $\mu^*$ measurable. The collection of all $\mu^*$ measurable subsets of $X$ will be denoted by $\mathcal{M}$.

We will first prove that the collection of µ∗ measurable sets is an algebra of sets.


Definition 11.1.3 (Algebra Of Sets). Let $X$ be a non empty set and let $\mathcal{A}$ be a nonempty family of subsets of $X$. We say $\mathcal{A}$ is an algebra of sets if

(i): $\emptyset$ is in $\mathcal{A}$.

(ii): If $A$ and $B$ are in $\mathcal{A}$, so is $A \cup B$.

(iii): If $A$ is in $\mathcal{A}$, so is $A^C = X \setminus A$.

Theorem 11.1.1 (The µ∗ Measurable Sets Form An Algebra). Let $X$ be a non empty set, $\mu^*$ an outer measure on $X$ and $\mathcal{M}$ be the collection of $\mu^*$ measurable subsets of $X$. Then $\mathcal{M}$ is an algebra.

Proof 11.1.1. For the empty set,

\[ \mu^*( T \cap \emptyset ) + \mu^*( T \cap \emptyset^C ) = \mu^*( \emptyset ) + \mu^*( T \cap X ) = 0 + \mu^*( T ). \]

Hence $\emptyset$ satisfies the Caratheodory condition and so $\emptyset \in \mathcal{M}$.

Next, if $A \in \mathcal{M}$, we note the Caratheodory condition is symmetric with respect to complementation and so $A^C \in \mathcal{M}$ also.

To show $\mathcal{M}$ is closed under finite unions, it is enough to handle the union of just two sets; the general case then follows by induction. Let $E_1$ and $E_2$ be in $\mathcal{M}$. Let $T$ be a subset of $X$. Then, since $E_1$ and $E_2$ both satisfy Caratheodory's condition and $\mu^*$ is subadditive, we know

\[ \mu^*(T) = \mu^*(T \cap E_1) + \mu^*(T \cap E_1^C) \qquad (a) \]

and

\[ \mu^*(T) = \mu^*(T \cap E_2) + \mu^*(T \cap E_2^C). \qquad (b) \]

In Equation b, let "T" be "$T \cap E_1^C$". This gives

\[ \mu^*(T \cap E_1^C) = \mu^*(T \cap E_1^C \cap E_2) + \mu^*(T \cap E_1^C \cap E_2^C). \qquad (c) \]


We also know that

\[ T \cap E_1 = T \cap (E_1 \cup E_2) \cap E_1, \qquad T \cap E_1^C \cap E_2 = T \cap (E_1 \cup E_2) \cap E_1^C. \qquad (d) \]

Now replace the term "$\mu^*(T \cap E_1^C)$" in Equation a by the one in Equation c. This gives

\[ \mu^*(T) = \mu^*(T \cap E_1) + \mu^*(T \cap E_1^C \cap E_2) + \mu^*(T \cap E_1^C \cap E_2^C). \]

Next, replace the sets in the first two terms on the right side in the equation above by what is shown in Equation d. We obtain

\[ \mu^*(T) = \mu^*(T \cap (E_1 \cup E_2) \cap E_1) + \mu^*(T \cap (E_1 \cup E_2) \cap E_1^C) + \mu^*(T \cap E_1^C \cap E_2^C). \]

But $E_1$ is in $\mathcal{M}$ and so

\[ \mu^*(T \cap (E_1 \cup E_2)) = \mu^*(T \cap (E_1 \cup E_2) \cap E_1) + \mu^*(T \cap (E_1 \cup E_2) \cap E_1^C). \]

Using this identity, we then have

\[ \mu^*(T) = \mu^*(T \cap (E_1 \cup E_2)) + \mu^*(T \cap E_1^C \cap E_2^C) = \mu^*(T \cap (E_1 \cup E_2)) + \mu^*(T \cap (E_1 \cup E_2)^C), \]

using De Morgan's laws. Since the set $T$ is arbitrary, we have shown $E_1 \cup E_2$ is also in $\mathcal{M}$.

Since $E_1$ and $E_2$ are in $\mathcal{M}$, we now know $E_1^C \cup E_2^C$ is in $\mathcal{M}$ too. But this set is the complement of $E_1 \cap E_2$; hence, $E_1 \cap E_2$ is in $\mathcal{M}$ as well. Thus, $\mathcal{M}$ is closed under intersection.

It then follows that $E_1 \setminus E_2 = E_1 \cap E_2^C$ is in $\mathcal{M}$. So $\mathcal{M}$ is also closed under set differences. Hence, $\mathcal{M}$ is an algebra.

Theorem 11.1.2 (µ∗ Measurable Sets Properties). Let $X$ be a non empty set, $\mu^*$ an outer measure on $X$ and $\mathcal{M}$ be the collection of $\mu^*$ measurable subsets of $X$. Then, if $(E_n)$ is a countable disjoint sequence from $\mathcal{M}$, $\cup_n E_n$ is in $\mathcal{M}$ and

\[ \mu^*\Big( T \cap \bigcup_{i=1}^\infty E_i \Big) = \sum_{i=1}^\infty \mu^*( T \cap E_i ) \]

for all subsets $T$ of $X$.

Proof 11.1.2. Let "T" be "$T \cap (E_1 \cup E_2)$" in the Caratheodory condition of $E_2$. Then, we have

\[ \mu^*(T \cap (E_1 \cup E_2)) = \mu^*(T \cap (E_1 \cup E_2) \cap E_2) + \mu^*(T \cap (E_1 \cup E_2) \cap E_2^C). \]


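This simplifies to

\[ \mu^*(T \cap (E_1 \cup E_2)) = \mu^*(T \cap E_2) + \mu^*(T \cap E_1 \cap E_2^C). \]

But $E_1$ and $E_2$ are disjoint. Hence, $E_1$ is contained in $E_2^C$. Hence, we can further simplify to

\[ \mu^*(T \cap (E_1 \cup E_2)) = \mu^*(T \cap E_2) + \mu^*(T \cap E_1). \]

Let's do another step. Since $E_3$ is in $\mathcal{M}$, we have

\[ \mu^*(T \cap (E_1 \cup E_2 \cup E_3)) = \mu^*(T \cap (E_1 \cup E_2 \cup E_3) \cap E_3) + \mu^*(T \cap (E_1 \cup E_2 \cup E_3) \cap E_3^C). \]

This can be rewritten as

\[ \begin{aligned} \mu^*(T \cap (E_1 \cup E_2 \cup E_3)) &= \mu^*(T \cap E_3) + \mu^*\big( (T \cap E_1 \cap E_3^C) \cup (T \cap E_2 \cap E_3^C) \big) \\ &= \mu^*(T \cap E_3) + \mu^*\big( (T \cap E_1) \cup (T \cap E_2) \big) \\ &= \mu^*(T \cap E_3) + \mu^*(T \cap (E_1 \cup E_2)), \end{aligned} \]

because $E_1 \subseteq E_3^C$ and $E_2 \subseteq E_3^C$ since all the $E_n$ are disjoint. Then, we can apply the first step to conclude

\[ \mu^*(T \cap (E_1 \cup E_2 \cup E_3)) = \mu^*(T \cap E_3) + \mu^*(T \cap E_2) + \mu^*(T \cap E_1). \]

We have therefore shown

\[ \mu^*\big(T \cap (\cup_{i=1}^3 E_i)\big) = \sum_{i=1}^3 \mu^*(T \cap E_i). \]

It is now clear we can continue this argument by induction to show

\[ \mu^*\big(T \cap (\cup_{i=1}^n E_i)\big) = \sum_{i=1}^n \mu^*(T \cap E_i) \qquad (a) \]

for any positive integer $n$. Further, since $\mathcal{M}$ is an algebra, induction also shows $\cup_{i=1}^n E_i$ is in $\mathcal{M}$ for any such $n$. It then follows that for any subset $T$ of $X$,

\[ \mu^*(T) = \mu^*\big(T \cap (\cup_{i=1}^n E_i)\big) + \mu^*\big(T \cap (\cup_{i=1}^n E_i)^C\big). \]

Using Equation a, we then have

\[ \mu^*(T) = \sum_{i=1}^n \mu^*(T \cap E_i) + \mu^*\big(T \cap (\cup_{i=1}^n E_i)^C\big). \qquad (b) \]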


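Next, note for all $n$

\[ T \cap (\cup_{i=1}^n E_i)^C \supseteq T \cap (\cup_{i=1}^\infty E_i)^C, \]

and hence

\[ \mu^*\big( T \cap (\cup_{i=1}^\infty E_i)^C \big) \le \mu^*\big( T \cap (\cup_{i=1}^n E_i)^C \big). \]

Using this in Equation b, we find

\[ \mu^*(T) \ge \sum_{i=1}^n \mu^*(T \cap E_i) + \mu^*\big(T \cap (\cup_{i=1}^\infty E_i)^C\big). \qquad (c) \]

Since this holds for all $n$, letting $n \to \infty$, we obtain

\[ \mu^*(T) \ge \sum_{i=1}^\infty \mu^*(T \cap E_i) + \mu^*\big(T \cap (\cup_{i=1}^\infty E_i)^C\big). \qquad (d) \]

Finally, since

\[ \bigcup_{i=1}^\infty (T \cap E_i) = T \cap (\cup_{i=1}^\infty E_i), \]

by the countable subadditivity of $\mu^*$, it follows that

\[ \mu^*\big( T \cap (\cup_{i=1}^\infty E_i) \big) = \mu^*\Big( \bigcup_{i=1}^\infty (T \cap E_i) \Big) \le \sum_{i=1}^\infty \mu^*(T \cap E_i). \]

Using this in Equation d, we have

\[ \mu^*(T) \ge \mu^*\big( T \cap (\cup_{i=1}^\infty E_i) \big) + \mu^*\big( T \cap (\cup_{i=1}^\infty E_i)^C \big). \qquad (e) \]

Since this holds for all subsets $T$, this tells us $\cup_n E_n$ is in $\mathcal{M}$. This proves that $\mathcal{M}$ is closed under countable disjoint unions.

However, with all this work already done, we can also derive a very nice result which will help us later. Subadditivity of $\mu^*$ gives us

\[ \mu^*(T) \le \mu^*\big( T \cap (\cup_{i=1}^\infty E_i) \big) + \mu^*\big( T \cap (\cup_{i=1}^\infty E_i)^C \big). \]

Hence, using countable subadditivity again,

\[ \mu^*(T) \le \sum_{i=1}^\infty \mu^*(T \cap E_i) + \mu^*\big( T \cap (\cup_{i=1}^\infty E_i)^C \big). \qquad (f) \]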


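Combining Equation d and Equation f, we find

\[ \mu^*(T) = \sum_{i=1}^\infty \mu^*(T \cap E_i) + \mu^*\big( T \cap (\cup_{i=1}^\infty E_i)^C \big). \]

Thus, letting "T" be "$T \cap (\cup_n E_n)$", we find

\[ \mu^*\big( T \cap \cup_{i=1}^\infty E_i \big) = \sum_{i=1}^\infty \mu^*(T \cap E_i). \qquad (g) \]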

Theorem 11.1.3 (The Measure Induced By An Outer Measure). Let $X$ be a non empty set, $\mu^*$ an outer measure on $X$ and $\mathcal{M}$ be the collection of $\mu^*$ measurable subsets of $X$. Then, $\mathcal{M}$ is a σ - algebra and $\mu^*$ restricted to $\mathcal{M}$ is a measure we will denote by $\mu$.

Proof 11.1.3. Recall that $\mathcal{M}$ is a σ - algebra if

(i) $\emptyset, X \in \mathcal{M}$.

(ii) If $A \in \mathcal{M}$, so is $A^C$.

(iii) If $\{A_n\}_{n=1}^\infty \subseteq \mathcal{M}$, then $\cup_{n=1}^\infty A_n \in \mathcal{M}$.

Since we know $\mathcal{M}$ is an algebra of sets, we have already shown all the properties of a σ - algebra except closure under arbitrary countable unions. The previous theorem, however, does give us closure under countable disjoint unions. So, let $(A_n)$ be a countable collection of sets in $\mathcal{M}$. Letting

\[ E_1 = A_1, \quad E_2 = A_2 \setminus A_1, \quad \ldots, \quad E_n = A_n \setminus \big( \cup_{i=1}^{n-1} A_i \big), \quad \ldots, \]

we see each $E_n$ is in $\mathcal{M}$ by Theorem 11.1.1. Further, they are pairwise disjoint and so by Theorem 11.1.2, we can conclude $\cup_n E_n$ is in $\mathcal{M}$. But it is easy to see that $\cup_n E_n = \cup_n A_n$. Thus, $\mathcal{M}$ is a σ - algebra.

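To show $\mu^*$ restricted to $\mathcal{M}$, $\mu$, is a measure, we must show

(i): $\mu(\emptyset) = 0$,

(ii): $\mu(E) \ge 0$ for all $E$ in $\mathcal{M}$,

(iii): if $(E_n)$ is a countable collection of disjoint sets in $\mathcal{M}$, then $\mu(\cup_n E_n) = \sum_n \mu(E_n)$.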

Since $\mu^*(\emptyset) = 0$, condition (i) follows immediately. Also, we know $\mu^*(E) \ge 0$ for all subsets $E$, and so condition (ii) is valid. It remains to show countable additivity. Let $(B_n)$ be a countable disjoint family in $\mathcal{M}$. We can apply Equation g to conclude, using $T = X$, that

\[ \mu^*\big( \cup_{i=1}^\infty B_i \big) = \sum_{i=1}^\infty \mu^*(B_i). \]

Finally, since $\mu^* = \mu$ on these sets, we have shown $\mu$ is countably additive and so is a measure.
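The construction above can be carried out by brute force on a small finite set. The Python sketch below is only an illustration under stated assumptions: the set `X`, the partition `BLOCKS` and the outer measure `mu_star` (the number of blocks a set meets) are hypothetical choices, not taken from the text. It tests the Caratheodory condition against every test set and then checks the closure and additivity properties proved in Theorems 11.1.1 to 11.1.3.

```python
from itertools import combinations

X = frozenset(range(4))
BLOCKS = [frozenset({0, 1}), frozenset({2, 3})]  # hypothetical partition of X


def mu_star(A):
    """Hypothetical outer measure: the number of blocks that A meets."""
    return sum(1 for B in BLOCKS if A & B)


def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


ALL = subsets(X)


def is_measurable(E):
    """Caratheodory condition for E, tested against every test set T."""
    return all(mu_star(T) >= mu_star(T & E) + mu_star(T - E) for T in ALL)


M = [E for E in ALL if is_measurable(E)]

closed_under_complement = all((X - E) in M for E in M)
closed_under_union = all((E | F) in M for E in M for F in M)
additive_on_M = all(mu_star(E | F) == mu_star(E) + mu_star(F)
                    for E in M for F in M if not (E & F))

print(sorted(sorted(E) for E in M))
print(closed_under_complement, closed_under_union, additive_on_M)
```

In this example a singleton such as {0} fails the Caratheodory condition with test set {0, 1}, so the measurable sets form a proper sub-collection of all subsets even though the outer measure is defined everywhere.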

It is also true that the measure constructed from an outer measure in this fashion is a complete measure.

Theorem 11.1.4 (The Measure Induced By An Outer Measure Is Complete). If $E$ is a subset of $X$ satisfying $\mu^*(E) = 0$, then $E \in \mathcal{M}$. Also, if $F \subseteq E$, then $F \in \mathcal{M}$ as well, with $\mu^*(F) = 0$. Note, this tells us that if $\mu(E) = 0$, then subsets $F$ of $E$ are also in $\mathcal{M}$ with $\mu(F) = 0$; i.e., $\mu$ is a complete measure.

Proof 11.1.4. We know $\mu^*(T \cap E) \le \mu^*(E)$ for all $T$; hence, $\mu^*(T \cap E) = 0$ here. Thus, for any $T$,

\[ \mu^*(T \cap E) + \mu^*(T \cap E^C) = \mu^*(T \cap E^C) \le \mu^*(T). \]

This tells us $E$ satisfies the Caratheodory condition and so is in $\mathcal{M}$. Thus, we have $\mu(E) = 0$. Now, let $F \subseteq E$. Then, $\mu^*(F) = 0$ also; hence, by the argument above, we can conclude $F \in \mathcal{M}$ with $\mu(F) = 0$.

11.2 Measures From Metric Outer Measure

Definition 11.2.1 (Metric Outer Measure). Let $(X, d)$ be a non empty metric space and for two subsets $A$ and $B$ of $X$, define the distance between $A$ and $B$ by

\[ D(A, B) = \inf \{ d(a, b) \mid a \in A, \ b \in B \}. \]

If $\mu^*$ is an outer measure on $X$ which satisfies

\[ \mu^*(A \cup B) = \mu^*(A) + \mu^*(B) \]

whenever $D(A, B) > 0$, we say $\mu^*$ is a metric outer measure.

The σ - algebra generated by the open subsets of $X$ is called the Borel σ - algebra $\mathcal{B}$. We can use the construction process in Section 11.1 to construct a σ - algebra of subsets, $\mathcal{M}$, which satisfy the Caratheodory condition for this metric outer measure $\mu^*$. This gives us a measure on $\mathcal{M}$. We would like to be able to say that open sets in the metric space are $\mu^*$ measurable. Since $\mathcal{M}$ is a σ - algebra, this is the same as proving $\mathcal{B} \subseteq \mathcal{M}$. This is what we do in the next theorem. It is becoming a bit cumbersome to keep saying $\mu^*$ measurable for the sets in $\mathcal{M}$. We will make the following convention for later use: a set in $\mathcal{M}$ will be called OMI measurable, where OMI stands for outer measure induced.

Theorem 11.2.1 (Open Sets in a Metric Space Are OMI Measurable). Let $(X, d)$ be a non empty metric space and $\mu^*$ a metric outer measure on $X$. Then open sets are OMI measurable.

Proof 11.2.1. Let $E$ be open in $X$. To show $E$ is $\mu^*$ measurable we must show

\[ \mu^*(T) \ge \mu^*(T \cap E) + \mu^*(T \cap E^C) \]

for all subsets $T$ in $X$. Since this is true for all subsets with $\mu^*(T) = \infty$, it suffices to prove the inequality is valid for all subsets with $\mu^*(T)$ finite. Also, we already know $\emptyset$ and $X$ are $\mu^*$ measurable, so we can further restrict our attention to nonempty strict subsets $E$ of $X$. We will prove this in a series of steps:

Step (i): Let $E_n$ be defined for each positive integer $n$ by

\[ E_n = \{ x \mid D(x, E^C) > 1/n \}. \]

It is clear $E_n \subseteq E$ and that $E_n \subseteq E_{n+1}$.

Note, if $y \in E_n$ and $x \in E^C$, we have $d(y, x) > 1/n$ and so

\[ \inf_{y \in E_n, \ x \in E^C} d(y, x) \ge 1/n \]

and so $D(E_n, E^C) \ge 1/n$. This immediately tells us

\[ D(T \cap E_n, T \cap E^C) \ge 1/n \]

also for all $T$.

Since $\mu^*$ is a metric outer measure, we then have

\[ \mu^*\big( (T \cap E_n) \cup (T \cap E^C) \big) = \mu^*(T \cap E_n) + \mu^*(T \cap E^C). \]

However, we also know $E_n$ is a subset of $E$ and so

\[ (T \cap E_n) \cup (T \cap E^C) \subseteq (T \cap E) \cup (T \cap E^C) = T. \]


We conclude then

\[ \mu^*\big( (T \cap E_n) \cup (T \cap E^C) \big) \le \mu^*(T). \]

Hence, for all $T$, we have

\[ \mu^*(T \cap E_n) + \mu^*(T \cap E^C) \le \mu^*(T). \qquad (*) \]

Step (ii): If $\lim_n \mu^*(T \cap E_n) = \mu^*(T \cap E)$, then letting $n$ go to infinity in Equation $*$, we would find

\[ \mu^*(T \cap E) + \mu^*(T \cap E^C) \le \mu^*(T). \]

This means $E$ satisfies the Caratheodory condition and so is $\mu^*$ measurable.

To show this limit acts in this way, we will construct a new sequence of sets $(W_n)$ that are disjoint from one another with $E = \cup_n W_n$ so that the new sets $W_n$ have useful properties. Since $E$ is open, every point $p$ in $E$ is an interior point. Thus, there is a positive $r$ so that $B(p; r) \subseteq E$. So, if $z \in E^C$, we must have $d(p, z) \ge r$. It follows that $D(p, E^C) \ge r > r/2$. We therefore know that $p \in E_n$ for some $n$ (any $n$ with $1/n \le r/2$ will do). Since our choice of $p$ is arbitrary, we have shown

\[ E \subseteq \cup_n E_n. \]

It was already clear that $\cup_n E_n \subseteq E$; we conclude $E = \cup_n E_n$. We then define the needed disjoint collection $(W_n)$ as follows:

\[ W_1 = E_1, \quad W_2 = E_2 \setminus E_1, \quad W_3 = E_3 \setminus E_2, \quad \ldots, \quad W_n = E_n \setminus E_{n-1}, \quad \ldots \]

It helps to draw a picture here for yourself in terms of the annuli $E_n \setminus E_{n-1}$. We can see that for any $n$, we can write

\[ T \cap E = (T \cap E_n) \cup \bigcup_{k=n+1}^\infty (T \cap W_k) \]

as the terms $T \cap W_k$ give the contributions of each annulus or strip outside of the core $E_n$. Hence,

\[ \mu^*(T \cap E) \le \mu^*(T \cap E_n) + \sum_{k=n+1}^\infty \mu^*(T \cap W_k) \qquad (**) \]


because $\mu^*$ is subadditive. At this point, the series sum $\sum_{k=n+1}^\infty \mu^*(T \cap W_k)$ could be $\infty$; we haven't determined if it is finite yet.

For any $k > 1$, if $x \in W_k$, then $x \in E_k \setminus E_{k-1}$ and so

\[ \frac{1}{k} < D(x, E^C) \le \frac{1}{k - 1}. \]

Next, if $x \in W_k$ and $y \in W_{k+p}$ for any $p \ge 2$, we can use the triangle inequality with an arbitrary $z \in E^C$ to conclude

\[ d(x, z) \le d(x, y) + d(y, z). \]

But, this says

\[ d(x, y) \ge d(x, z) - d(y, z) \ge D(x, E^C) - d(y, z) > \frac{1}{k} - d(y, z). \]

We have shown the fundamental inequality

\[ d(x, y) > \frac{1}{k} - d(y, z), \ \forall x \in W_k, \ \forall y \in W_{k+p}, \ \forall z \in E^C \qquad (\alpha) \]

holds for $p \ge 2$. The definition of the set $E_{k+p}$ then implies for these $p$,

\[ \frac{1}{k + p} < D(y, E^C) \le \frac{1}{k + p - 1}. \qquad (\beta) \]

Now consider how $D(y, E^C)$ is defined. Since this is an infimum, by the Infimum Tolerance Lemma, given a positive $\epsilon$, there is a $z_\epsilon \in E^C$ so that

\[ D(y, E^C) \le d(y, z_\epsilon) < D(y, E^C) + \epsilon. \]

Hence, using Equation $\beta$, we have

\[ -d(y, z_\epsilon) > -D(y, E^C) - \epsilon \ge -\frac{1}{k + p - 1} - \epsilon. \]

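Also, using Equation $\alpha$, we find

\[ d(x, y) > \frac{1}{k} - d(y, z_\epsilon) > \frac{1}{k} - \frac{1}{k + p - 1} - \epsilon = \frac{p - 1}{k (k + p - 1)} - \epsilon. \]

Since $\epsilon > 0$ is arbitrary, we conclude

\[ d(x, y) \ge \frac{p - 1}{k (k + p - 1)} > 0 \]

for all $x \in W_k$ and $y \in W_{k+p}$ with $p \ge 2$. Hence,

\[ D(W_k, W_{k+p}) \ge \frac{p - 1}{k (k + p - 1)} > 0. \]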

It follows that

\[ D(W_1, W_3) > 0 \]

and, in general, we find this is true for the successive odd integers

\[ D(W_{2k+1}, W_{2k+3}) > 0. \]

Since $\mu^*$ is a metric outer measure, this allows us to say

\[ \sum_{k=0}^n \mu^*(T \cap W_{2k+1}) = \mu^*\big( \cup_{k=0}^n (T \cap W_{2k+1}) \big) \le \mu^*\big( \cup_{k=0}^\infty (T \cap W_{2k+1}) \big) \le \mu^*(T). \]

A similar argument shows that successive even integers satisfy

\[ D(W_{2k}, W_{2k+2}) > 0. \]

Again, as $\mu^*$ is a metric outer measure, this allows us to say

\[ \sum_{k=1}^n \mu^*(T \cap W_{2k}) = \mu^*\big( \cup_{k=1}^n (T \cap W_{2k}) \big). \]

Therefore, we have

\[ \sum_{k=1}^n \mu^*(T \cap W_{2k}) \le \mu^*\big( \cup_{k=1}^\infty (T \cap W_{2k}) \big) \le \mu^*(T). \]


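We conclude

\[ \sum_{k=1}^n \mu^*(T \cap W_k) = \sum_{k \text{ even}, \ k \le n} \mu^*(T \cap W_k) + \sum_{k \text{ odd}, \ k \le n} \mu^*(T \cap W_k) \le 2 \, \mu^*(T) \]

for all $n$. This implies the sum $\sum_k \mu^*(T \cap W_k)$ converges to a finite number.

Since the series converges, we now know given $\epsilon > 0$, there is an $N$ so that

\[ \sum_{k=n}^\infty \mu^*(T \cap W_k) < \epsilon, \]

for all $n > N$. Now go back to Equation $**$. We have for any $n > N$,

\[ \mu^*(T \cap E) \le \mu^*(T \cap E_n) + \epsilon. \]

Since we also have $\mu^*(T \cap E_n) \le \mu^*(T \cap E)$ for all $n$, this tells us $\mu^*(T \cap E_n) \to \mu^*(T \cap E)$. By our earlier remark, this completes the proof.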
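The separation estimate for the annuli $W_k$ used in the proof can be checked numerically in a simple case. The Python sketch below assumes the hypothetical open set E = (0, 1) in the real line with the usual metric, so that D(x, E^C) = min(x, 1 - x); it samples grid points, sorts them into the sets $W_k$ and compares the smallest sampled distance between $W_k$ and $W_{k+p}$ with the bound (p - 1)/(k(k + p - 1)). The grid size `N` and the helper names are illustrative choices only.

```python
from collections import defaultdict
from fractions import Fraction
import math

N = 60  # hypothetical grid resolution: sample points j/N in E = (0, 1)


def D(x):
    """Distance from x in (0, 1) to the complement of (0, 1)."""
    return min(x, 1 - x)


# x lies in W_k exactly when 1/k < D(x) <= 1/(k - 1), i.e. k = floor(1/D(x)) + 1.
W = defaultdict(list)
for j in range(1, N):
    x = Fraction(j, N)
    W[math.floor(1 / D(x)) + 1].append(x)


def bound(k, p):
    """Lower bound for D(W_k, W_{k+p}) from the proof, valid for p >= 2."""
    return Fraction(p - 1, k * (k + p - 1))


# Pairs (k, m) with m >= k + 2 whose sampled distance falls below the bound.
violations = [
    (k, m)
    for k in W
    for m in W
    if m >= k + 2
    and min(abs(x - y) for x in W[k] for y in W[m]) < bound(k, m - k)
]

print(sorted(W)[:5], violations)
```

Only finitely many sample points are tested, so this is a consistency check on the inequality and not a proof of it.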

We can even prove more.

Theorem 11.2.2 (Open Sets In A Metric Space Are µ∗ Measurable If and Only If µ∗ Is A Metric Outer Measure). Let $X$ be a non empty metric space. Then open sets are $\mu^*$ measurable if and only if $\mu^*$ is a metric outer measure.

Proof 11.2.2. If we assume $\mu^*$ is a metric outer measure, then open sets are $\mu^*$ measurable by Theorem 11.2.1.

On the other hand, if we know that all the open sets are $\mu^*$ measurable, this implies all Borel sets are $\mu^*$ measurable as well. Let $A$ and $B$ be any two sets with $D(A, B) = r > 0$. For each $x \in A$, let

\[ G(x) = \{ u \mid d(x, u) < r/2 \} \]

and

\[ G = \bigcup_{x \in A} G(x). \]

Then $G$ is open, $A \subseteq G$ and $G \cap B = \emptyset$. Since $G$ is measurable, it satisfies the Caratheodory condition using test set $T = A \cup B$; thus,

\[ \mu^*( A \cup B ) = \mu^*\big( (A \cup B) \cap G \big) + \mu^*\big( (A \cup B) \cap G^C \big). \]


But $(A \cup B) \cap G$ simplifies to $A$ because $A \subseteq G$ and $B$ is disjoint from $G$. Further, since $A$ is disjoint from $G^C$ and $B \subseteq G^C$, we have $(A \cup B) \cap G^C = B$. We conclude

\[ \mu^*(A \cup B) = \mu^*(A) + \mu^*(B). \]

This shows $\mu^*$ is a metric outer measure.
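The equivalence in Theorem 11.2.2 can be seen in a toy setting. In a finite metric space every subset is open, so the theorem predicts that every subset is measurable exactly when the outer measure is metric. The Python sketch below assumes a hypothetical three point subset of the real line and compares two outer measures chosen for illustration: the counting outer measure, which is metric, and the outer measure that assigns 1 to every nonempty set, which is not.

```python
from itertools import combinations

POINTS = (0.0, 1.0, 2.5)  # hypothetical finite metric space with d(x, y) = |x - y|


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


ALL = subsets(POINTS)


def dist(A, B):
    """D(A, B) for nonempty finite sets A and B."""
    return min(abs(a - b) for a in A for b in B)


def counting(A):
    """Counting outer measure."""
    return len(A)


def indicator(A):
    """Outer measure equal to 1 on every nonempty set."""
    return 1 if A else 0


def is_metric(outer):
    """Additivity on every pair of nonempty sets a positive distance apart."""
    return all(outer(A | B) == outer(A) + outer(B)
               for A in ALL for B in ALL
               if A and B and dist(A, B) > 0)


def measurable_sets(outer):
    """Sets satisfying the Caratheodory condition for every test set T."""
    return [E for E in ALL
            if all(outer(T) >= outer(T & E) + outer(T - E) for T in ALL)]


print(is_metric(counting), len(measurable_sets(counting)))
print(is_metric(indicator), len(measurable_sets(indicator)))
```

For the second outer measure only the empty set and the whole space pass the Caratheodory test, so the open sets are not all measurable, in line with the theorem.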

11.3 Constructing Outer Measure

We still have to find ways to construct outer measures. We want the resulting OMI measure we induce to have certain properties useful to us. Let's discuss how to do this now.

Definition 11.3.1 (Premeasures and Covering Families). Let $X$ be a nonempty set. Let $\mathcal{T}$ be a family of subsets of $X$ that contains the empty set. This family is called a covering family for $X$. Let $\tau$ be a mapping on $\mathcal{T}$ so that $\tau(\emptyset) = 0$. The mapping $\tau$ is called a premeasure.

It is hard to believe, but even with virtually no restrictions on τ and T , we can build an outer measure.

Theorem 11.3.1 (Constructing Outer Measures Via Premeasures). Let $X$ be a nonempty set. Let $\mathcal{T}$ be a covering family of subsets of $X$ and $\tau : \mathcal{T} \to [0, \infty]$ be a premeasure. For any subset $A$ of $X$, define

\[ \mu^*(A) = \inf \Big\{ \sum_n \tau(T_n) \ \Big| \ T_n \in \mathcal{T}, \ A \subseteq \cup_n T_n \Big\} \]

where the sequence of sets $(T_n)$ from $\mathcal{T}$ is finite or countably infinite. Such a sequence is called a cover. In the case where there are no sets from $\mathcal{T}$ that cover $A$, we define the infimum over the resulting empty set to be $\infty$. Then $\mu^*$ is an outer measure on $X$.

Proof 11.3.1. To verify the mapping $\mu^*$ is an outer measure on $X$, we must show

(i): $\mu^*(\emptyset) = 0$.

(ii): If $A$ and $B$ are subsets of $X$ with $A \subseteq B$, then $\mu^*(A) \le \mu^*(B)$.

(iii): If $(A_n)$ is a sequence of subsets of $X$, then $\mu^*( \cup_n A_n ) \le \sum_n \mu^*(A_n)$.

It is straightforward to see conditions (i) and (ii) are true. It suffices to prove condition (iii) is valid. Let $(A_n)$ be a countable collection, finite or infinite, of subsets of $X$. If there is an index $n$ with $\mu^*(A_n)$ infinite, then since $\mu^*(\cup_n A_n) \le \infty$ anyway, it is clear

\[ \mu^*\big( \cup_{i=1}^\infty A_i \big) \le \sum_{i=1}^\infty \mu^*(A_i) = \infty. \]


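On the other hand, if $\mu^*(A_n)$ is finite for all $n$, given any $\epsilon > 0$, we can use the Infimum Tolerance Lemma to find, for each $n$, a sequence of sets $(T_{nk})$ in $\mathcal{T}$ with $A_n \subseteq \cup_k T_{nk}$ so that

\[ \sum_{k=1}^\infty \tau(T_{nk}) < \mu^*(A_n) + \frac{\epsilon}{2^n}. \]

We also know that

\[ \bigcup_{n=1}^\infty A_n \subseteq \bigcup_{n=1}^\infty \bigcup_{k=1}^\infty T_{nk}. \]

Hence, the collection $(T_{nk})$ is a cover of $\cup_n A_n$ and so by the definition of $\mu^*$, we have

\[ \mu^*\Big( \bigcup_{n=1}^\infty A_n \Big) \le \sum_{n=1}^\infty \sum_{k=1}^\infty \tau( T_{nk} ) \le \sum_{n=1}^\infty \Big( \mu^*(A_n) + \frac{\epsilon}{2^n} \Big) = \sum_{n=1}^\infty \mu^*(A_n) + \epsilon. \]

Since $\epsilon$ is arbitrary, we see $\mu^*$ is countably subadditive and so is an outer measure.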
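To see how little the construction asks of the premeasure, here is a small brute force Python sketch. Everything in it is a hypothetical illustration: the set `X`, the covering family and the values in `TAU` are made up, and `TAU` is deliberately not additive. Because the family is finite and the premeasure is non negative, the infimum over countable covers reduces to a minimum over finite subfamilies, which is what `mu_star` computes.

```python
from itertools import combinations
import math

X = frozenset({0, 1, 2, 3})

# Hypothetical covering family with premeasure values; note tau({0, 1}) = 3 is
# larger than tau({0}) + tau({1}) = 2, and no set in the family contains 3.
TAU = {
    frozenset(): 0,
    frozenset({0}): 1,
    frozenset({1}): 1,
    frozenset({0, 1}): 3,
    frozenset({1, 2}): 2,
}


def mu_star(A):
    """Smallest total premeasure of a subfamily covering A (inf over empty set = inf)."""
    best = math.inf
    family = list(TAU)
    for r in range(len(family) + 1):
        for cover in combinations(family, r):
            if A <= frozenset().union(*cover):
                best = min(best, sum(TAU[S] for S in cover))
    return best


def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


ALL = subsets(X)
monotone = all(mu_star(A) <= mu_star(B) for A in ALL for B in ALL if A <= B)
subadditive = all(mu_star(A | B) <= mu_star(A) + mu_star(B)
                  for A in ALL for B in ALL)

print(mu_star(frozenset()), monotone, subadditive)
print(mu_star(frozenset({0, 1})), TAU[frozenset({0, 1})], mu_star(frozenset({3})))
```

The second line of output shows the outer measure of {0, 1} falling below its premeasure, which is one way to see that recovering τ on the covering family needs extra assumptions on τ.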

There is so little known about $\tau$ and $\mathcal{T}$ that it is not clear at all that

(i): $\mathcal{T} \subseteq \mathcal{M}$, where $\mathcal{M}$ is the σ - algebra of sets that satisfy the Caratheodory condition for the outer measure $\mu^*$ generated by $\tau$. If this is true, we will call $\mathcal{M}$ an OMI-F σ - algebra, where the "F" denotes the fact that the covering family is in $\mathcal{M}$.

(ii): If $A \in \mathcal{T}$, then $\tau(A) = \mu(A)$ where $\mu$ is the measure obtained by restricting $\mu^*$ to $\mathcal{M}$. If this is true, we will call the constructed σ - algebra an OMI-FE σ - algebra, where the "E" indicates the fact that $\mu$ restricted to $\mathcal{T}$ recovers $\tau$.

If $\tau$ represents some primitive notion of size of special sets, like length of intervals on the real line, we normally want both condition (i) and (ii) above to be valid. We can obtain these results if we add a few more properties to $\tau$ and $\mathcal{T}$. First, $\mathcal{T}$ needs to be an algebra (which we have already defined) and $\tau$ needs to be additive on the algebra.

Definition 11.3.2 (Additive Set Function). Let $\mathcal{A}$ be an algebra of subsets of the set $X$. Let $\nu$ be an extended real valued function defined on $\mathcal{A}$ which satisfies

(i): $\nu(\emptyset) = 0$.

(ii): If $A$ and $B$ in $\mathcal{A}$ are disjoint, then $\nu(A \cup B) = \nu(A) + \nu(B)$.

Then $\nu$ is called an additive set function on $\mathcal{A}$.


We also need a property of outer measures called regularity.

Definition 11.3.3 (Regular Outer Measures). Let $X$ be a nonempty set, $\mu^*$ be an outer measure on $X$ and $\mathcal{M}$ be the set of all $\mu^*$ measurable sets of $X$. The outer measure $\mu^*$ is called regular if for every subset $E$ of $X$ there is a $\mu^*$ measurable $F \in \mathcal{M}$ so that $E \subseteq F$ with $\mu^*(E) = \mu(F)$, where $\mu$ is the measure induced by $\mu^*$ on $\mathcal{M}$. The set $F$ is often called a measurable cover for $E$.

We begin with a technical lemma.

Lemma 11.3.2 (Condition For Outer Measure To Be Regular). Let $X$ be a nonempty set, $\mathcal{T}$ a covering family and $\tau$ a premeasure. Then if the σ - algebra, $\mathcal{M}$, generated by $\tau$ using $\mathcal{T}$ contains $\mathcal{T}$, $\mu^*$ is regular.

Proof 11.3.2. Let $A$ be a subset of $X$. We need to show there is a measurable set $B$ containing $A$ so that $\mu^*(A) = \mu(B)$. If $\mu^*(A) = \infty$, then we can choose $X$ as the needed set. Otherwise, we have $\mu^*(A)$ is finite. Applying the Infimum Tolerance Lemma, for each $m$, there is a family of sets $(E^m_n)$ from $\mathcal{T}$ so that $A \subseteq \cup_n E^m_n$ and

\[ \sum_n \tau(E^m_n) < \mu^*(A) + \frac{1}{m}. \]

Let

\[ E^m = \bigcup_n E^m_n, \qquad H = \bigcap_m E^m; \]

these sets are measurable by assumption. Also, $A \subseteq H$ and $H \subseteq E^m$. Hence, $\mu^*(A) \le \mu(H)$. We now show the reverse inequality. For each $m$, we have

\[ \mu^*(E^m) \le \sum_n \mu^*(E^m_n) \le \sum_n \tau(E^m_n) \le \mu^*(A) + \frac{1}{m}. \]

Further, since $H \subseteq E^m$ for each $m$, we find

\[ \mu(H) \le \mu^*(E^m) \le \mu^*(A) + \frac{1}{m}. \]

This is true for all $m$; hence, it follows that $\mu(H) \le \mu^*(A)$. Combining inequalities, we have $\mu(H) = \mu^*(A)$ and so $H$ is a measurable cover. Thus, $\mu^*$ is regular.


Theorem 11.3.3 (Conditions For OMI-F Measures). Let $X$ be a nonempty set, $\mathcal{T}$ a covering family which is an algebra and $\tau$ an additive set function on $\mathcal{T}$. Then the σ - algebra, $\mathcal{M}$, generated by $\tau$ using $\mathcal{T}$ contains $\mathcal{T}$ and $\mu^*$ is regular.

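Proof 11.3.3. By Lemma 11.3.2, it is enough to show each member of $\mathcal{T}$ is measurable. So, let $A$ be in $\mathcal{T}$. As usual, it suffices to show that

\[ \mu^*(T) \ge \mu^*(T \cap A) + \mu^*(T \cap A^C) \]

for all sets $T$ of finite outer measure. This will show $A$ satisfies the Caratheodory condition and hence, is measurable. Let $\epsilon > 0$ be given. By the Infimum Tolerance Lemma, there is a family $(A_n)$ from $\mathcal{T}$ so that $T \subseteq \cup_n A_n$ and

\[ \sum_n \tau(A_n) < \mu^*(T) + \epsilon. \]

Since $\tau$ is additive on $\mathcal{T}$, we know

\[ \tau(A_n) = \tau(A \cap A_n) + \tau(A^C \cap A_n). \]

Also, we have

\[ A \cap T \subseteq \bigcup_n (A \cap A_n), \quad \text{and} \quad A^C \cap T \subseteq \bigcup_n (A^C \cap A_n). \]

Hence,

\[ \mu^*(A \cap T) \le \sum_n \mu^*(A \cap A_n), \qquad \mu^*(A^C \cap T) \le \sum_n \mu^*(A^C \cap A_n). \qquad (\alpha) \]

Each set in $\mathcal{T}$ covers itself, so $\mu^* \le \tau$ on $\mathcal{T}$. Thus,

\[ \begin{aligned} \mu^*(T) + \epsilon &> \sum_n \tau(A_n) = \sum_n \tau(A_n \cap A) + \sum_n \tau(A_n \cap A^C) \\ &\ge \sum_n \mu^*(A_n \cap A) + \sum_n \mu^*(A_n \cap A^C) \\ &\ge \mu^*(A \cap T) + \mu^*(A^C \cap T), \end{aligned} \]

by Equation $\alpha$. Since $\epsilon > 0$ is arbitrary, $A$ satisfies the Caratheodory condition and is measurable.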

In order for condition (ii) to hold, we need to add one more additional property to τ : it needs to be a

pseudo-measure.


Definition 11.3.4 (Pseudo-Measure). Let the mapping $\tau : \mathcal{A} \to [0, \infty]$ be additive on the algebra $\mathcal{A}$. Assume whenever $(A_i)$ is a countable collection of disjoint sets in $\mathcal{A}$ whose union is also in $\mathcal{A}$ (note this is not always true because $\mathcal{A}$ is not a σ - algebra), then it is true that

\[ \tau(\cup_i A_i) = \sum_i \tau(A_i). \]

Such a mapping $\tau$ is called a pseudo-measure on $\mathcal{A}$.

Theorem 11.3.4 (Conditions For OMI-FE Measures). Let $X$ be a nonempty set, $\mathcal{T}$ a covering family which is an algebra and $\tau$ an additive set function on $\mathcal{T}$ which is a pseudo-measure. Then the σ - algebra, $\mathcal{M}$, generated by $\tau$ using $\mathcal{T}$ contains $\mathcal{T}$, $\mu^*$ is regular and $\mu(T) = \tau(T)$ for all $T$ in $\mathcal{T}$.

Proof 11.3.4. See (Bruckner et al. (1) 1997).

Comment 11.3.1. The results above tell us that we can construct measures so that $\mathcal{T}$ is contained in $\mathcal{M}$ and the measure recovers $\tau$ as long as the premeasure is a pseudo-measure and the covering family is an algebra. This means the covering family must be closed under complementation. Hence, if we use a covering family such as the collection of all open intervals (which we do when we construct Lebesgue measure later) these theorems do not apply.

11.4 Worked Out Problems

Let's work out some specific examples of this process to help the ideas sink in. Note the covering families here do not simply contain open intervals!

Example 11.4.1. Let $\mathcal{U}$ be the family of subsets of $\Re$ of the form $(a, b]$, $(-\infty, b]$, $(a, \infty)$ and $(-\infty, \infty)$ and the empty set. It is easy to show that $\mathcal{F}$, the collection of all finite unions of sets from $\mathcal{U}$, is an algebra of subsets of $\Re$.

Let $\tau$ be the usual length of an interval and extend $\tau$ to $\mathcal{F}$ additively. This extended $\tau$ is a premeasure on $\mathcal{F}$. $\tau$ can then be used to define an outer measure $\mu^*_\tau$ as usual. There is then an associated σ - algebra of $\mu^*_\tau$ measurable sets of $\Re$, $\mathcal{M}_\tau$, and $\mu^*_\tau$ restricted to $\mathcal{M}_\tau$ is a measure, $\mu_\tau$.

We will now prove $\mathcal{F}$ is contained in $\mathcal{M}_\tau$. Let's consider the set $I$ from $\mathcal{U}$. Let $T$ be any subset of $\Re$ and let $\epsilon > 0$ be given. It is enough to consider sets $T$ with $\mu^*_\tau(T)$ finite. Then there is a cover $(A_n)$ of sets from the algebra $\mathcal{F}$ so that

\[ \sum_n \tau(A_n) \le \mu^*_\tau(T) + \epsilon. \]


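Now $I \cap T \subseteq \cup_n (A_n \cap I)$ and $I^C \cap T \subseteq \cup_n (A_n \cap I^C)$. So because $\mathcal{F}$ is an algebra, this means $(A_n \cap I)$ covers $I \cap T$ and $(A_n \cap I^C)$ covers $I^C \cap T$. Hence,

\[ \mu^*_\tau(T \cap I) \le \sum_n \tau(A_n \cap I), \qquad \mu^*_\tau(T \cap I^C) \le \sum_n \tau(A_n \cap I^C). \]

Combining, we see

\[ \mu^*_\tau(T \cap I) + \mu^*_\tau(T \cap I^C) \le \sum_n \big( \tau(A_n \cap I) + \tau(A_n \cap I^C) \big). \]

But $\tau$ is additive on $\mathcal{F}$, and hence

\[ \sum_n \big( \tau(A_n \cap I) + \tau(A_n \cap I^C) \big) = \sum_n \tau(A_n). \]

Thus,

\[ \mu^*_\tau(T \cap I) + \mu^*_\tau(T \cap I^C) \le \mu^*_\tau(T) + \epsilon. \]

Since $\epsilon > 0$ is arbitrary, we have shown $I$ satisfies the Caratheodory condition. This shows that $I$ is OMI measurable. Since $\mathcal{M}_\tau$ is an algebra and each set in $\mathcal{F}$ is a finite union of sets from $\mathcal{U}$, we have $\mathcal{F} \subseteq \mathcal{M}_\tau$.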

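Example 11.4.2. Let $\mathcal{U}$ be the family of subsets of $\Re$ of the form $(a, b]$, $(-\infty, b]$, $(a, \infty)$ and $(-\infty, \infty)$ and the empty set. It is easy to show that $\mathcal{F}$, the collection of all finite unions of sets from $\mathcal{U}$, is an algebra of subsets of $\Re$. Let $g$ be the monotone increasing function on $\Re$ defined by $g(x) = x^3$. Note $g$ is right continuous, which means

\[ \lim_{h \to 0^+} g(x + h) = g(x), \ \forall x. \]

Further, the limits

\[ \lim_{x \to -\infty} g(x) \quad \text{and} \quad \lim_{x \to \infty} g(x) \]

exist in the extended sense; they are $-\infty$ and $\infty$ respectively. Define the mapping $\tau_g$ on $\mathcal{U}$ by

\[ \begin{aligned} \tau_g\big( (a, b] \big) &= g(b) - g(a), \\ \tau_g\big( (-\infty, b] \big) &= g(b) - \lim_{x \to -\infty} g(x), \\ \tau_g\big( (a, \infty) \big) &= \lim_{x \to \infty} g(x) - g(a), \\ \tau_g\big( (-\infty, \infty) \big) &= \lim_{x \to \infty} g(x) - \lim_{x \to -\infty} g(x). \end{aligned} \]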

Extend τg to F additively as usual. This extended τg is a premeasure on F . τg can then be used to define


an outer measure $\mu^*_g$ as usual. There is then an associated σ-algebra of $\mu^*_g$ measurable sets of $\Re$, $M_g$, and $\mu^*_g$ restricted to $M_g$ is a measure, $\mu_g$.

We will now prove F is contained in $M_g$. Let’s consider a set I from U. Let T be any subset of $\Re$ and let ε > 0 be given. Again, it is enough to consider sets T with $\mu^*_g(T)$ finite. Then there is a cover $(A_n)$ of T by sets from the algebra F so that

$$ \sum_n \tau_g(A_n) \le \mu^*_g(T) + \epsilon. $$

Now $I \cap T \subseteq \cup_n (A_n \cap I)$ and $I^C \cap T \subseteq \cup_n (A_n \cap I^C)$. So

$$ \mu^*_g(T \cap I) \le \sum_n \tau_g(A_n \cap I), $$

$$ \mu^*_g(T \cap I^C) \le \sum_n \tau_g(A_n \cap I^C). $$

Combining, we see

$$ \mu^*_g(T \cap I) + \mu^*_g(T \cap I^C) \le \sum_n \left( \tau_g(A_n \cap I) + \tau_g(A_n \cap I^C) \right). $$

But $\tau_g$ is additive on F, and hence

$$ \sum_n \left( \tau_g(A_n \cap I) + \tau_g(A_n \cap I^C) \right) = \sum_n \tau_g(A_n). $$

Thus,

$$ \mu^*_g(T \cap I) + \mu^*_g(T \cap I^C) \le \mu^*_g(T) + \epsilon. $$

Since ε > 0 is arbitrary, we have shown I satisfies the Caratheodory condition. This shows that I is OMI measurable and so $F \subseteq M_g$.
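To make the premeasure of Example 11.4.2 concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the helper names `g`, `tau_g` and `split` are hypothetical, an interval (a, b] is stored as the pair `(a, b)`, and infinite endpoints are handled by letting g return its limiting value. The sketch checks that splitting (a, b] at an interior point does not change the value of τg, which is the additivity used in the proof above.

```python
import math

def g(x):
    """Generator g(x) = x^3; at +/- infinity return the limiting value."""
    if math.isinf(x):
        return x
    return x ** 3

def tau_g(intervals):
    """Sum g(b) - g(a) over a finite list of disjoint intervals (a, b]."""
    return sum(g(b) - g(a) for a, b in intervals)

def split(interval, c):
    """Split (a, b] at an interior point c into (a, c] and (c, b]."""
    a, b = interval
    assert a < c < b
    return [(a, c), (c, b)]

whole = [(-1.0, 2.0)]            # tau_g = 2^3 - (-1)^3 = 9
pieces = split(whole[0], 0.5)    # (-1, 0.5] and (0.5, 2]
unbounded = [(-math.inf, 0.0)]   # tau_g is infinite here
```

For g(x) = x^3 every unbounded set in U receives the value ∞, so the interesting checks are on bounded intervals.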

11.5 Homework

Exercise 11.5.1. Let X = (0, 1]. Let A consist of the empty set and all finite unions of half-open intervals of the form (a, b] from X. Prove A is an algebra of sets of (0, 1].

Exercise 11.5.2. Let A be the algebra of subsets of (0, 1] given in Exercise 11.5.1. Let f be an arbitrary function on [0, 1]. Define $\nu_f$ on A by

$$ \nu_f\left((a, b]\right) = f(b) - f(a). $$


Extend $\nu_f$ to be additive on finite disjoint intervals as follows: if $(A_i) = ((a_i, b_i])$ is a finite collection of disjoint intervals of (0, 1], we define

$$ \nu_f\left( \cup_{i=1}^n (a_i, b_i] \right) = \sum_{i=1}^n \left( f(b_i) - f(a_i) \right). $$

1. Prove that νf is additive on A.

Hint 11.5.1. It is enough to show that the value of νf (A) is independent of the way in which we

write A as a finite disjoint union.

2. Prove νf is non negative if and only if f is non decreasing.

Exercise 11.5.3. If λ is an additive set function on an algebra of subsets A, prove that λ cannot take on both the value ∞ and −∞.

Hint 11.5.2. If there is a set A in the algebra with λ(A) = ∞ and there is a set B in the algebra with λ(B) = −∞, then we can find disjoint sets A′ and B′ in the algebra so that λ(A′) = ∞ and λ(B′) = −∞. But this is not permitted as the value of λ(A′ ∪ B′) must be a well-defined extended real value, not the undefined value ∞ − ∞.

Exercise 11.5.4. Let T be a covering family for a nonempty set X. Let τ be a non negative, possibly infinite valued premeasure. For any $A \subseteq X$, define

$$ \mu^*(A) = \inf \left\{ \sum_n \tau(T_n) \ \Big| \ T_n \in T, \ A \subseteq \cup_n T_n \right\} $$

where the sequence of sets $(T_n)$ from T is finite or countably infinite. In the case where there are no sets from T that cover A, we define the infimum over the resulting empty set to be ∞.

Prove µ∗ is an outer measure on X .

Exercise 11.5.5. Let X = {1, 2, 3} and let T consist of ∅, X and all doubleton subsets {x, y} of X. Let τ satisfy

(i): τ(∅) = 0.

(ii): τ({x, y}) = 1 for all x ≠ y in X.

(iii): τ(X) = 2.

(a): Prove the method of Exercise 11.5.4 gives rise to an outer measure µ∗ defined by µ∗(∅) = 0,

µ∗(X) = 2 and µ∗(A) = 1 for any other subset A of X .

(b): Now do the construction process again letting τ(X) = 3. What changes?



Exercise 11.5.6. Let X be the natural numbers N and let the covering family T consist of ∅, N and all singleton sets. Define τ(∅) = 0 and τ({x}) = 1 for all x in N.

(a): Let τ(N) = 2. Prove the method of Exercise 11.5.4 gives rise to an outer measure µ∗. Determine the family of measurable sets (i.e., the sets that satisfy the Caratheodory condition).

(b): Let τ(N) = ∞ and answer the same questions as in Part (a).

(c): Let τ(N) = 2 and set $\tau(\{x\}) = 2^{-(x-1)}$. Now answer the same questions as in Part (a).

(d): Let τ(N) = ∞ and again set $\tau(\{x\}) = 2^{-(x-1)}$. Now answer the same questions as in Part (a). You should see N is measurable but τ(N) ≠ µ(N), where µ denotes the measure constructed in the process of Part (a).

(e): Let τ(N) = 1 and again set $\tau(\{x\}) = 2^{-(x-1)}$. Now answer the same questions as in Part (a). What changes?


Chapter 12

Lebesgue Measure

We will now construct Lebesgue measure on $\Re^k$. We will begin by defining the mapping µ∗ on the subsets of $\Re^k$ which will turn out to be an outer measure. The σ-algebra of subsets that satisfy the Caratheodory condition will be called the σ-algebra of Lebesgue measurable subsets. We will denote this σ-algebra by M as usual. We will usually be able to tell from context what σ-algebra of subsets we are working with in a given study area or problem. The primary references here are again (Bruckner et al. (1) 1997) and (Taylor (7) 1985). We like the development of Lebesgue measure in (Taylor (7) 1985) better than that of (Bruckner et al. (1) 1997) and so our coverage reflects that. In all cases, we have added more detail to the proofs of propositions to help you build your analysis skills by looking hard at many interesting and varied proof techniques.

12.1 Lebesgue Outer Measure

We will be working in $\Re^k$ for any positive integer k. We have to work our way through a fair bit of definitional material; so be patient while we set the stage. We let $x = (x_1, x_2, \ldots, x_k)$ denote a point in the Euclidean space $\Re^k$. An open interval in $\Re^k$ will be denoted by I and it is determined by the cross-product of k intervals of the form $(a_i, b_i)$ where each $a_i$ and $b_i$ is a finite real number. Hence, the interval I has the form

$$ I = \Pi_{i=1}^k (a_i, b_i). $$

The interval $(a_i, b_i)$ is called the ith edge of I and the number $\ell_i = b_i - a_i$ is the length of the ith edge. The content of the open interval I is the product of the edge lengths and is denoted by |I|; i.e.

$$ |I| = \Pi_{i=1}^k (b_i - a_i). $$


We need additional terminology. The center of I is the point

$$ p = \left( \frac{a_1 + b_1}{2}, \frac{a_2 + b_2}{2}, \ldots, \frac{a_k + b_k}{2} \right); $$

if the interval J has the same center as the interval I, we say the intervals are concentric.

If I and J are intervals, for convenience of notation, let $\ell_J$ and $\ell_I$ denote the vector of edge lengths of J and I, respectively. In general, there is no relationship between $\ell_J$ and $\ell_I$. However, there is a special case of interest. We note that if J is concentric with I and each edge length in $\ell_J$ is a fixed multiple of the corresponding edge length in $\ell_I$, we can say $\ell_J = \lambda \, \ell_I$ for some constant λ. In this case, we write J = λ I. It then follows that $|J| = \lambda^k \, |I|$.
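The content and the concentric scaling rule can be checked numerically. The sketch below is illustrative only: it assumes an interval is stored as a list of `(a_i, b_i)` edges, and the names `content` and `scale` are hypothetical helpers, not notation from the text.

```python
from math import prod, isclose

def content(box):
    """Content |I| of an interval given as a list of (a_i, b_i) edges."""
    return prod(b - a for a, b in box)

def scale(box, lam):
    """Concentric interval lam * I: same center, every edge length times lam."""
    scaled = []
    for a, b in box:
        center, half = (a + b) / 2, lam * (b - a) / 2
        scaled.append((center - half, center + half))
    return scaled

I = [(0.0, 2.0), (1.0, 4.0), (-1.0, 1.0)]   # k = 3, |I| = 2 * 3 * 2 = 12
J = scale(I, 0.5)                            # |J| should be 0.5**3 * 12
```

With k = 3 and λ = 1/2 the content drops by the factor λ^k = 1/8, as the formula |J| = λ^k |I| predicts.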

We are now ready to define outer measure on $\Re^k$. Following Definition 11.3.1, we define a suitable covering family T and premeasure τ. Then, the mapping µ∗ defined in Theorem 11.3.1 will be an outer measure. For ease of exposition, let’s define this here.

Definition 12.1.1 (Lebesgue Outer Measure).
Let T be the collection of all open intervals in $\Re^k$ and define the premeasure τ by τ(I) = |I| for all I in T. For any $A \subseteq \Re^k$, define

$$ \mu^*(A) = \inf \left\{ \sum_n |I_n| \ \Big| \ I_n \in T, \ A \subseteq \cup_n I_n \right\}. $$

We will call a collection $(I_n)$ whose union contains A a Lebesgue Cover of A.

Then, µ∗ is an outer measure on $\Re^k$ and as such induces a measure through the usual Caratheodory condition route. It remains to find its properties. The covering family here is not an algebra, so we cannot use Theorem 11.3.3 and Theorem 11.3.4 to conclude

(i): T ⊆ M; i.e. M is an OMI-F σ-algebra.

(ii): If A ∈ T, then |A| = µ(A); i.e. M is an OMI-FE σ-algebra.

However, we will be able to alter our original proofs to get these results with just a little work.

Comment 12.1.1. (i): If I is an open interval in $\Re^k$, then (I) covers I itself and so by definition µ∗(I) ≤ |I|.

(ii): If {x} is a singleton set, choose any open interval I that has x as its center. Then, I is a cover of {x} and so µ∗({x}) ≤ |I|. We see the concentric intervals $(1/2^n) \, I$ also are covers of {x} and so $\mu^*(\{x\}) \le |I| / 2^{nk}$ for all n. It follows µ∗({x}) = 0.

(iii): From (ii) and countable subadditivity, it is clear that µ∗(E) = 0 if E is a finite set.

(iv): If E is countable, label its points by $(a_n)$. Let ε > 0 be given. Then we can choose open intervals $I_n$ having $a_n$ as a center so that $|I_n| < \epsilon / 2^n$. Then the intervals $(I_n)$ cover E and by definition,

$$ \mu^*(E) \le \sum_n |I_n| \le \sum_n \epsilon / 2^n = \epsilon. $$

Since ε is arbitrary, we see µ∗(E) = 0 if E is countable.
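The covering argument in (iv) can be illustrated in $\Re$ with a finite piece of an enumeration of the rationals in [0, 1]. This is a sketch under stated assumptions: the function names are hypothetical, exact arithmetic is done with `fractions.Fraction`, and the nth interval is given length ε/2^(n+1), which is one admissible choice below the bound ε/2^n.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing q."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Center an open interval of length eps / 2**(n+1) on the n-th point."""
    intervals = []
    for n, a in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 2)
        intervals.append((a - half, a + half))
    return intervals

points = rationals_in_unit_interval(50)
eps = Fraction(1, 100)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)   # stays below eps however many points are used
```

However long the enumeration runs, the total length stays below ε, which is why the outer measure of the whole countable set is 0.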

We want to see if µ∗(I) = |I|. This is not clear since our covering family is not an algebra. We now need a technical lemma.

Lemma 12.1.1 (Sums Over Finite Lebesgue Covers Of I Dominate |I|).
Let I be any interval of $\Re^k$ and let $(I_1, \ldots, I_N)$ be any finite Lebesgue cover of I. Then

$$ \sum_{n=1}^N |I_n| \ge |I|. $$

The proof is based on an algorithm that cycles through the covering sets Ii one by one and picks out

certain relevant subintervals. We can motivate this by looking at an interval I in $\Re^2$ whose closure is

covered by 3 overlapping intervals I1, I2 and I3. This is shown in Figure 12.1. We do not attempt to

indicate the closure of I in this figure nor the fact that the intervals I1 and so forth are open. We simply

draw boxes and you can easily remove or add edges in your mind to open an interval or close it.

[Figure 12.1: Motivational Lebesgue Cover. An example in $\Re^2$: the cover I1, I2 and I3 of I.]

These four intervals all have endpoints on both the x and y axes. If we draw all the possible constant x

and constant y lines corresponding to these endpoints, we subdivide the original four intervals into many

smaller intervals as shown in Figure 12.2.

In particular, if we looked at interval I1, it is divided into 16 subintervals (J1, i), for 1 ≤ i ≤ 16 as shown

in Figure 12.3.

[Figure 12.2: Subdivided Lebesgue Cover. I1, I2, I3 and I determine subdivisions into smaller intervals.]

These rectangles are all disjoint and

$$ I_1 = \bigcup_{i=1}^{16} J_{1,i}. $$

Although we won’t show it in a figure, I2 and I3 are also sliced up into smaller intervals; using the same left to right and then downward labeling scheme that we used for I1, we have

• I2 is divided by 4 horizontal and 4 vertical lines into 16 disjoint subintervals, $J_{2,1}$ to $J_{2,16}$. Further,

$$ I_2 = \bigcup_{i=1}^{16} J_{2,i}. $$

• I3 is divided by 4 horizontal and 6 vertical lines into 24 disjoint subintervals, $J_{3,1}$ to $J_{3,24}$. We thus know

$$ I_3 = \bigcup_{i=1}^{24} J_{3,i}. $$

Finally, I is also subdivided into subintervals: it is divided by 4 horizontal and 2 vertical lines into 8 disjoint subintervals, $J_1$ to $J_8$ and

$$ I = \bigcup_{i=1}^{8} J_i. $$

[Figure 12.3: Subdivided I1. I1 is subdivided into 16 new rectangles, J1,1 to J1,16.]

We also know

$$ |I| = \sum_{i=1}^{8} |J_i|, \quad |I_1| = \sum_{i=1}^{16} |J_{1,i}|, \quad |I_2| = \sum_{i=1}^{16} |J_{2,i}|, \quad |I_3| = \sum_{i=1}^{24} |J_{3,i}|. $$

Now look at Figure 12.2 and you see immediately that the intervals Jkj and Jpq are either the same or are

disjoint. For example, the subintervals match when interval I2 and I3 overlap. We can conclude each Ji is

disjoint from a Jkj or it equals Jkj for some choice of k and j. Here is the algorithm we want to use:

Step 1: We know $I \subseteq I_1 \cup I_2 \cup I_3$ and $J_1 = J_{n_1, q_1}$ where $n_1$ is the smallest index from 1, 2 or 3 which works. For this fixed $n_1$, consider the collection

$$ S_{n_1} = \{ J_{n_1,1}, \ldots, J_{n_1,p(n_1)} \} $$

where we are using the symbol $p(n_1)$ to denote the number of subintervals for $I_{n_1}$. Thus, p(1) = p(2) = 16 and p(3) = 24 in our example. In our example, we find $n_1 = 1$ and

$$ J_1 = J_{1,12}, \qquad S_1 = \{ J_{1,1}, \ldots, J_{1,16} \}. $$

Look at Figure 12.4 to see what we have done so far.

[Figure 12.4: The Part Of I Covered by I1. I is subdivided into 8 new rectangles, J1 to J8. The shaded part is covered by I1.]

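By referring to Figure 12.2, you can see $J_1 = J_{1,12}$ and $J_3 = J_{1,16}$. Now, let

$$ T_{n_1} \equiv T_1 = \{ i \mid \exists \, k \ni J_i = J_{n_1,k} \}. $$

Here $T_1 = \{1, 3\}$. Also, let

$$ U_{n_1} \equiv U_1 = \{ k \mid \exists \, i \ni J_{n_1,k} = J_i \}. $$

We see $U_1 = \{12, 16\}$.

Step 2: Now look at the indices

$$ V_{n_1} \equiv V_1 = \{1, 2, 3, \ldots, 8\} \setminus T_1 = \{2, 4, 5, 6, 7, 8\}. $$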



The smallest index in this set is 2. Next, find the smallest index n2 so that

J2 = Jn2,k

for some index k. From Figure 12.2, we see both I2 and I3 intersect I \ I1. The smallest index n2 is thus

n2 = 2. The index k that works is 7 and so J2 = J2,7. In figure 12.5, we have now shaded the part of I

not in I1 that lies in I2.

[Figure 12.5: The Part Of I Covered by I1 and I2. I is subdivided into the 8 new rectangles, J1 to J8. The two shaded parts are covered by I1 (lighter shading) and I2 (darker shading).]

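We can see that $J_2 = J_{2,7}$, $J_4 = J_{2,11}$, $J_5 = J_{2,14}$ and $J_6 = J_{2,15}$. Let

$$ T_{n_2} \equiv T_2 = \{ i \in V_1 \mid \exists \, k \ni J_i = J_{n_2,k} \}. $$

Here $T_2 = \{2, 4, 5, 6\}$. Also, let

$$ U_{n_2} \equiv U_2 = \{ k \mid \exists \, i \ni J_{n_2,k} = J_i \}. $$

We see $U_2 = \{7, 11, 14, 15\}$.

Step 3: Now look at the indices

$$ V_{n_2} \equiv V_2 = \{1, 2, 3, \ldots, 8\} \setminus (T_1 \cup T_2) = \{7, 8\}. $$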

The smallest index in this set is 7. Next, find the smallest index $n_3$ so that

$$ J_7 = J_{n_3,k} $$

for some index k. From Figure 12.2, we see only I3 intersects $I \setminus (I_1 \cup I_2)$. The smallest index $n_3$ must be 3 and so $n_3 = 3$. The index k that works now is 15 and we have $J_7 = J_{3,15}$. In Figure 12.6, we have now shaded the part of I not in $I_1 \cup I_2$ that lies in I3.

[Figure 12.6: The Part Of I Covered by I1, I2 and I3. I is subdivided into the 8 new rectangles, J1 to J8. The three shaded parts are covered by I1 (lighter shading), I2 (darker shading) and I3 (darkest shading).]

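In fact, we have $J_7 = J_{3,15}$ and $J_8 = J_{3,16}$. Thus, we set

$$ T_{n_3} \equiv T_3 = \{ i \in V_2 \mid \exists \, k \ni J_i = J_{n_3,k} \} = \{7, 8\}. $$

Also, we let

$$ U_{n_3} \equiv U_3 = \{ k \mid \exists \, i \ni J_{n_3,k} = J_i \}. $$

We see $U_3 = \{15, 16\}$.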

We have now expressed each Ji as some Jn1,k through Jn3,k. We are now ready to finish our argument.

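Step 4: We have

$$ \{1, \ldots, 8\} = T_{n_1} \cup T_{n_2} \cup T_{n_3} = T_1 \cup T_2 \cup T_3. $$

Thus,

$$ \sum_{k \in U_{n_3} = U_3} |J_{n_3,k}| \le \sum_{k=1}^{p(n_3)} |J_{n_3,k}| = \sum_{k=1}^{24} |J_{3,k}| \le |I_3|, $$

$$ \sum_{k \in U_{n_2} = U_2} |J_{n_2,k}| \le \sum_{k=1}^{p(n_2)} |J_{n_2,k}| = \sum_{k=1}^{16} |J_{2,k}| \le |I_2|, $$

$$ \sum_{k \in U_{n_1} = U_1} |J_{n_1,k}| \le \sum_{k=1}^{p(n_1)} |J_{n_1,k}| = \sum_{k=1}^{16} |J_{1,k}| \le |I_1|. $$

Thus,

$$ |I| = \sum_{i=1}^{8} |J_i| = \sum_{p=1}^{3} \ \sum_{k \in U_{n_p}} |J_{n_p,k}| \le \sum_{p=1}^{3} |I_{n_p}|. $$

This proves that

$$ |I| \le \sum_{i=1}^{3} |I_i|. $$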

This is our desired proposition for a particular example set in $\Re^2$ using three intervals. We are now ready

to adapt this algorithm to prove the general result.

Proof 12.1.1. We are given intervals $I_1$ to $I_N$ in $\Re^k$ whose union covers I. Each interval $I_i$ is the product

$$ (\alpha_{i1}, \beta_{i1}) \times \cdots \times (\alpha_{ik}, \beta_{ik}), $$

and I is the product

$$ (\alpha_1, \beta_1) \times \cdots \times (\alpha_k, \beta_k). $$

On the $x_j$ axis, the N intervals and the interval I determine a collection of points

$(\alpha_{1j}, \beta_{1j})$, $x_j$ edge from interval $I_1$;

$(\alpha_{2j}, \beta_{2j})$, $x_j$ edge from interval $I_2$;

...

$(\alpha_{Nj}, \beta_{Nj})$, $x_j$ edge from interval $I_N$;

$(\alpha_j, \beta_j)$, $x_j$ edge from interval I.

We do not care if these points are ordered. These $x_j$ axis points, for 1 ≤ j ≤ k, “slice” the intervals $I_1$ through $I_N$ and I into smaller intervals just as we did in the example for $\Re^2$ shown in Figure 12.2. We have

$$ I \longrightarrow J_1, \ldots, J_p $$

$$ I_1 \longrightarrow J_{1,1}, \ldots, J_{1,p(1)} $$

$$ \vdots $$

$$ I_N \longrightarrow J_{N,1}, \ldots, J_{N,p(N)}. $$

Step 1: Look at $J_1$. There is a smallest index $n_1$ so that $J_1 = J_{n_1,\ell}$ for some ℓ. Let

$$ T_{n_1} = \{ i \in \{1, \ldots, p\} \mid \exists \, \ell \ni J_i = J_{n_1,\ell} \}, $$

$$ U_{n_1} = \{ \ell \mid \exists \, i \ni J_i = J_{n_1,\ell} \}. $$

This uses up the indices in $T_{n_1}$ from 1, . . . , p. You can see this process in Figure 12.4.

Step 2: Let

$$ V_1 = \{1, \ldots, p\} \setminus T_{n_1} $$

and let q be the smallest index from the set $V_1$. For this q, find the smallest index $n_2 \ne n_1$ so that $J_q = J_{n_2,\ell}$ for some ℓ. This is the process we are showing in Figure 12.5. We define

$$ T_{n_2} = \{ i \in V_1 \mid \exists \, \ell \ni J_i = J_{n_2,\ell} \}, $$

$$ U_{n_2} = \{ \ell \mid \exists \, i \in V_1 \ni J_i = J_{n_2,\ell} \}. $$

This uses up more of the smaller subintervals $J_1$ to $J_p$.

Additional Steps: Let

$$ V_2 = \{1, \ldots, p\} \setminus (T_{n_1} \cup T_{n_2}). $$

We see $V_2$ is a smaller subset of the original {1, . . . , p} than $V_1$. We continue this construction process until we have used up all the indices in {1, . . . , p}. This takes say Q steps and we know Q ≤ p.

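Final Step: After the process terminates, we have (writing m for the step number, so that it is not confused with the count p of subintervals of I)

$$ |I| = \sum_{i=1}^{p} |J_i| = \sum_{m=1}^{Q} \ \sum_{\ell \in U_{n_m}} |J_{n_m,\ell}| \le \sum_{m=1}^{Q} |I_{n_m}| \le \sum_{i=1}^{N} |I_i|. $$

This completes the proof.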
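The slicing algorithm in this proof can be sketched in code. The following Python fragment is a rough illustration, not the text's own procedure verbatim: the names `cells`, `contains` and `assign_cells` are hypothetical, intervals are lists of `(a, b)` edges, and the sample boxes are made up for the demonstration. It slices I along every endpoint, assigns each small cell to the first covering interval that contains it, and lets us compare the resulting sums.

```python
from itertools import product
from math import prod

def content(box):
    """Content of an interval given as a list of (a_i, b_i) edges."""
    return prod(b - a for a, b in box)

def cells(box, cuts):
    """Slice `box` along each axis at the cut points strictly inside its edge."""
    per_axis = []
    for (a, b), axis_cuts in zip(box, cuts):
        pts = sorted({a, b, *(c for c in axis_cuts if a < c < b)})
        per_axis.append(list(zip(pts[:-1], pts[1:])))
    return [list(cell) for cell in product(*per_axis)]

def contains(box, cell):
    """True when every edge of `cell` lies inside the matching edge of `box`."""
    return all(a <= c and d <= b for (a, b), (c, d) in zip(box, cell))

def assign_cells(I, cover):
    """Map each cell J_i of I to the smallest index n with J_i inside I_n.
    Assumes `cover` really covers I; otherwise next() raises StopIteration."""
    k = len(I)
    cuts = [sorted({e for box in [I, *cover] for e in box[j]}) for j in range(k)]
    used = {}
    for cell in cells(I, cuts):
        n = next(n for n, box in enumerate(cover) if contains(box, cell))
        used.setdefault(n, []).append(cell)
    return used

# A made-up example in the plane: three overlapping boxes covering I.
I = [(0.0, 4.0), (0.0, 2.0)]
cover = [[(-1.0, 2.5), (-1.0, 1.5)],
         [(2.0, 5.0), (-0.5, 3.0)],
         [(-0.5, 3.0), (1.0, 2.5)]]
used = assign_cells(I, cover)
cell_total = sum(content(c) for group in used.values() for c in group)
```

Each cell is counted once, under the first interval that contains it, so the cell contents assigned to $I_n$ add up to at most $|I_n|$; summing over n gives the inequality of Lemma 12.1.1.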

We can now finally prove that µ∗(I) = |I|. Note that we have to work this hard because our original covering family was not an algebra! The final arguments are presented in the next two lemmas. In what follows, $\bar{I}$ denotes the closure of the open interval I.

Lemma 12.1.2 ($\mu^*(\bar{I}) = |I|$).
Let I be an open interval in $\Re^k$ with closure $\bar{I}$. Then $\mu^*(\bar{I}) = |I|$.

Proof 12.1.2. Let $(I_n)$ be any Lebesgue cover of $\bar{I}$. Since $\bar{I}$ is compact, this cover has a finite subcover, $I_{n_1}, \ldots, I_{n_N}$. Applying Lemma 12.1.1, we see

$$ |I| \le \sum_{i=1}^{N} |I_{n_i}| \le \sum_i |I_i|. $$

Since $(I_n)$ is an arbitrary cover of $\bar{I}$, we then have |I| is a lower bound for the set

$$ \left\{ \sum_n |I_n| \ \Big| \ (I_n) \text{ is a cover of } \bar{I} \right\}. $$

It follows that

$$ |I| \le \mu^*(\bar{I}). $$

To prove the reverse inequality holds, let U be an open interval concentric with I so that $\bar{I} \subseteq U$. Then U is a cover of $\bar{I}$ and so $\mu^*(\bar{I}) \le |U|$. Hence, for any concentric interval λI with 1 < λ < 2, we have $\mu^*(\bar{I}) \le \lambda^k \, |I|$. Since this holds for all such λ, we can let λ → 1 to obtain $\mu^*(\bar{I}) \le |I|$.

Lemma 12.1.3 (µ∗(I) = |I|).
If I is an open interval of $\Re^k$, then µ∗(I) = |I|.

Proof 12.1.3. We know I is a cover of itself, so it is immediate that µ∗(I) ≤ |I|. To prove the reverse inequality, let λI be concentric with I for any 0 < λ < 1. Then the closure $\overline{\lambda I}$ is contained in I and since µ∗ is an outer measure, it is monotonic and so

$$ \mu^*(\overline{\lambda I}) \le \mu^*(I). $$

But $\mu^*(\overline{\lambda I}) = |\lambda I| = \lambda^k \, |I|$ by Lemma 12.1.2. We thus have $\lambda^k \, |I| \le \mu^*(I)$ for all λ ∈ (0, 1). Letting λ → 1, we obtain the desired inequality.


12.2 Lebesgue Outer Measure Is A Metric Outer Measure

We have now shown that if I ∈ T, then |I| = µ∗(I). However, we still do not know that the intervals I from T are µ∗ measurable. We will do this by showing that Lebesgue outer measure is a metric outer measure. Then, it will follow from Theorem 11.2.1 that the open sets in $\Re^k$ are µ∗ measurable, i.e. are in M. Of course, this implies T ⊆ M as well. Then, since an interval I is measurable, we have |I| = µ(I).

Let’s prove µ∗ is a metric outer measure. We begin with a technical definition.

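Definition 12.2.1 (The $M_\delta$ Form of µ∗).
For any set E in $\Re^k$ and any δ > 0, consider the covers $(I_n)$ of E with each $I_n$ an open interval in $\Re^k$ and each edge of an $I_n$ having a length less than δ. Then

$$ M_\delta(E) = \inf \left\{ \sum_n |I_n| \ \Big| \ (I_n) \text{ is such a cover of } E \right\}. $$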

Next, we need a technical lemma concerning finite Lebesgue covers.

Lemma 12.2.1 (Approximate Finite Lebesgue Covers Of $\bar{I}$).
Let I be an open interval and let $\bar{I}$ denote its closure. Let ε and δ be given positive numbers. Then there exists a finite Lebesgue cover of $\bar{I}$, $I_1, \ldots, I_N$, so that each edge of $I_i$ has length less than δ and

$$ |I_1| + \cdots + |I_N| < |I| + \epsilon. $$

Proof 12.2.1. Let

$$ I = \Pi_{i=1}^k (a_i, b_i) $$

and divide each component interval $(a_i, b_i)$ into $n_i$ uniform pieces so that $(b_i - a_i)/n_i < \delta/2$. This determines $n_i$ open intervals of the form $(a_{ij}, b_{ij})$ for $1 \le j \le n_i$ with $b_{ij} - a_{ij} < \delta/2$.

Let $N = n_1 \, n_2 \cdots n_k$ and let $j = (j_1, \ldots, j_k)$ denote a k-tuple of indices chosen so that $1 \le j_i \le n_i$. There are N of these k-tuples. Each such j determines an interval $I_j$ where

$$ I_j = \Pi_{i=1}^k (a_{i j_i}, b_{i j_i}), \ \text{ with } (b_{i j_i} - a_{i j_i}) < \delta/2. $$

Hence, $|I_j| < (\delta/2)^k$. It is also clear that $\sum_j |I_j| = |I|$.

Now choose concentric open intervals $\lambda I_j$ for any λ with 1 < λ < 2. Since λ > 1, the collection $(\lambda I_j)$ over all k-tuples j is a Lebesgue cover of $\bar{I}$. We have

$$ |\lambda I_j| = \lambda^k \, |I_j| $$

and so

$$ \sum_j |\lambda I_j| = \lambda^k \sum_j |I_j| = \lambda^k \, |I|. $$

Since $\lambda^k \to 1$ as λ → 1, for our given ε > 0, there is an η with 0 < η < 1 so that if 1 < λ < 1 + η, we have

$$ \lambda^k - 1 < \frac{\epsilon}{|I| + 1}. $$

In particular, if we pick λ = 1 + η/2, then

$$ \sum_j |\lambda I_j| < \left( 1 + \frac{\epsilon}{|I| + 1} \right) |I| < |I| + \epsilon. $$

Thus, the finite collection $((1 + \eta/2) \, I_j)$ is the one we seek, as each edge has length less than $(1 + \eta/2) \, \delta/2$, which is less than δ because 1 + η/2 < 2.
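The construction in this proof is concrete enough to run. The sketch below is a minimal illustration under stated assumptions: `small_cover` is a hypothetical name, intervals are lists of `(a, b)` edges, and the inflation factor is chosen as the midpoint between 1 and the largest admissible value (capped at 2), which is one way to realize the choice of λ in the proof.

```python
from itertools import product
from math import prod

def content(box):
    """Content of an interval given as a list of (a_i, b_i) edges."""
    return prod(b - a for a, b in box)

def small_cover(I, eps, delta):
    """Finite cover of the closure of I by open intervals with edges < delta
    and total content < |I| + eps (grid the interval, then inflate each cell)."""
    k = len(I)
    # n_i uniform pieces per edge, each of length < delta / 2
    counts = [int((b - a) // (delta / 2)) + 1 for a, b in I]
    # any lam with 1 < lam < lam_max satisfies lam**k - 1 < eps / (|I| + 1)
    lam_max = (1 + eps / (content(I) + 1)) ** (1 / k)
    lam = 1 + (min(lam_max, 2.0) - 1) / 2
    axes = []
    for (a, b), n in zip(I, counts):
        h = (b - a) / n
        half = lam * h / 2
        centers = [a + (j + 0.5) * h for j in range(n)]
        axes.append([(c - half, c + half) for c in centers])
    return [list(cell) for cell in product(*axes)]

I = [(0.0, 1.0), (0.0, 2.0)]
eps, delta = 0.1, 0.3
boxes = small_cover(I, eps, delta)
total = sum(content(box) for box in boxes)   # equals lam**k * |I|
```

Because every grid cell is inflated about its own center by a factor slightly larger than 1, neighboring boxes overlap and the boundary of I is covered too, while the total content grows only by the factor λ^k.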

Lemma 12.2.2 ($M_\delta = \mu^*$).
For any subset E of $\Re^k$, we have $M_\delta(E) = \mu^*(E)$.

Proof 12.2.2. Let’s pick a given δ > 0. The way $M_\delta$ is defined then tells us immediately that $\mu^*(E) \le M_\delta(E)$ for any δ > 0 and subset E. It remains to prove the reverse inequality. If µ∗(E) was infinite, we would have $\mu^*(E) \ge M_\delta(E)$; hence, it is enough to handle the case where µ∗(E) is finite. By the Infimum Tolerance Lemma for a given ε > 0, there is a Lebesgue cover $(I_n)$ of E so that

$$ \sum_n |I_n| < \mu^*(E) + \frac{\epsilon}{2}. $$

By Lemma 12.2.1, there is a finite Lebesgue cover of each $\bar{I}_n$ which we will denote by $(J_{nj})$, $1 \le j \le p(n)$, so that each interval $J_{nj}$ has edge length less than δ and satisfies

$$ \sum_{j=1}^{p(n)} |J_{nj}| < |I_n| + \frac{\epsilon}{2^{n+1}}. $$

The combined family of intervals $(J_{nj})$ for all n and $1 \le j \le p(n)$ is clearly a Lebesgue cover of E also. Thus, we have

$$ \sum_{n=1}^{\infty} \sum_{j=1}^{p(n)} |J_{nj}| < \sum_{n=1}^{\infty} |I_n| + \sum_{n=1}^{\infty} \frac{\epsilon}{2^{n+1}} < \mu^*(E) + \epsilon. $$

Now each edge length of the interval $J_{nj}$ is less than δ and so

$$ M_\delta(E) \le \sum_{n=1}^{\infty} \sum_{j=1}^{p(n)} |J_{nj}| $$

by definition. We see we have established

$$ M_\delta(E) \le \mu^*(E) + \epsilon $$

for an arbitrary ε; hence, $M_\delta(E) \le \mu^*(E)$.

We now have enough “ammunition” to prove Lebesgue outer measure is a metric outer measure; i.e.

LOM is a MOM!

Theorem 12.2.3 (Lebesgue Outer Measure Is a Metric Outer Measure).
The Lebesgue Outer Measure, µ∗, is a metric outer measure; i.e., if A and B are two sets in $\Re^k$ with D(A, B) > 0, then µ∗(A ∪ B) = µ∗(A) + µ∗(B).

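Proof 12.2.3. We always know that µ∗(A ∪ B) ≤ µ∗(A) + µ∗(B) for any A and B. Hence, for two sets A and B with D(A, B) = δ > 0, it is enough to show µ∗(A) + µ∗(B) ≤ µ∗(A ∪ B). We may assume µ∗(A ∪ B) is finite, as otherwise there is nothing to prove. Let ε > 0 be chosen. Since $M_{\delta/k} = \mu^*$ by Lemma 12.2.2, there is a cover $(I_n)$ of A ∪ B so that the edge length of each $I_n$ is less than δ/k and

$$ M_{\delta/k}(A \cup B) = \mu^*(A \cup B) \le \sum_n |I_n| < \mu^*(A \cup B) + \epsilon $$

by an application of the Infimum Tolerance Lemma.

If x and y in A ∪ B are both in a given $I_n$, then

$$ d(x, y) = \sqrt{ \sum_{i=1}^{k} (x_i - y_i)^2 } < \sqrt{ \sum_{i=1}^{k} \left( \frac{\delta}{k} \right)^2 } = \sqrt{ \frac{k \, \delta^2}{k^2} } = \frac{\delta}{\sqrt{k}} \le \delta. $$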

However, D(A, B) = δ by assumption. Thus, a given $I_n$ cannot contain points of both A and B. We can therefore separate the intervals of the family $(I_n)$ that meet A ∪ B into two disjoint collections indexed by U and V, respectively. If n ∈ U, then $I_n \cap A$ is non empty and if n ∈ V, $I_n \cap B$ is non empty. We see $\{I_n\}_{n \in U}$ is a cover for A and $\{I_n\}_{n \in V}$ is a cover for B. Thus, $\mu^*(A) \le \sum_{n \in U} |I_n|$ and $\mu^*(B) \le \sum_{n \in V} |I_n|$. It then follows that

$$ \mu^*(A \cup B) + \epsilon \ge \sum_n |I_n| \ge \sum_{n \in U} |I_n| + \sum_{n \in V} |I_n| \ge \mu^*(A) + \mu^*(B). $$

Since ε is arbitrary, we have shown µ∗(A) + µ∗(B) ≤ µ∗(A ∪ B). This completes the proof that Lebesgue outer measure is a metric outer measure.
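The key geometric fact in the proof is that an interval whose edges are all shorter than δ/k has diameter below δ, so it cannot reach from A to B. A small numerical sketch, with the hypothetical helper `diameter` and sample values chosen only for illustration:

```python
from math import sqrt

def diameter(box):
    """Euclidean length of the diagonal of an interval given by its edges."""
    return sqrt(sum((b - a) ** 2 for a, b in box))

k, delta = 3, 1.0
edge = 0.99 * delta / k        # every edge is shorter than delta / k
box = [(0.0, edge)] * k
diag = diameter(box)           # at most delta / sqrt(k), hence below delta
```

Any two points of the box are at most `diag` apart, which is the bound used to separate the cover into the pieces indexed by U and V.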

This theorem is the final piece we need to fully establish the two conditions

(i): T ⊆ M; i.e. M is an OMI-F σ-algebra.

(ii): If I ∈ T, then |I| = µ(I); i.e. M is an OMI-FE σ-algebra.

Comment 12.2.1. We see immediately that since Lebesgue outer measure is a metric outer measure, the σ-algebra of µ∗ measurable subsets contains all the open sets of $\Re^k$. In particular, any open interval I is measurable. As mentioned previously, we thus know the Borel σ-algebra of subsets is contained in M. By Theorem 11.1.4, we know Lebesgue measure µ is complete.

We can also prove Lebesgue measure µ is regular.

Theorem 12.2.4 (Lebesgue Measure Is Regular).
For any set E in $\Re^k$,

$$ \mu^*(E) = \inf \{ \mu(U) \mid E \subseteq U, \ U \text{ is open} \}, $$

$$ \mu^*(E) = \inf \{ \mu(F) \mid E \subseteq F, \ F \text{ is Lebesgue measurable} \}. $$

Thus, Lebesgue measure is regular.

Proof 12.2.4. If U is open, U is Lebesgue measurable and so µ∗(U) = µ(U). It follows immediately that µ∗(E) ≤ µ(U) for such U containing E. Hence,

$$ \mu^*(E) \le \inf \{ \mu(U) \mid E \subseteq U, \ U \text{ is open} \}. $$

On the other hand, if ε > 0 is given, the Infimum Tolerance Lemma tells us there is a Lebesgue cover of E, $(I_n)$, so that

$$ \mu^*(E) \le \sum_n |I_n| < \mu^*(E) + \epsilon. $$

However, this open cover generates an open set $G = \cup_n I_n$ containing E with $\mu(G) \le \sum_n |I_n|$ because $\mu(I_n) = |I_n|$. We conclude that

$$ \mu(G) \le \sum_n |I_n| < \mu^*(E) + \epsilon. $$


Hence, we must have

$$ \inf \{ \mu(U) \mid E \subseteq U, \ U \text{ is open} \} \le \mu^*(E) + \epsilon. $$

Since ε is arbitrary, the result follows.

Since each open U is measurable, we then know

$$ \mu^*(E) = \inf \{ \mu(U) \mid E \subseteq U, \ U \text{ is open} \} \ge \inf \{ \mu(F) \mid E \subseteq F, \ F \in M \} $$

by the first argument. To obtain the reverse inequality, note that since µ∗(F) = µ(F) for all measurable F, monotonicity of µ∗ says µ∗(E) ≤ µ∗(F) = µ(F) for all measurable F containing E. We conclude

$$ \mu^*(E) \le \inf \{ \mu(F) \mid E \subseteq F, \ F \in M \}. $$

Now recall the definition of a regular measure from Definition 11.3.3. We may assume µ∗(E) is finite, since otherwise the measurable set $\Re^k$ contains E and has the same infinite measure. Using the Infimum Tolerance Lemma again, there are measurable sets $(F_n)$ so that $E \subseteq F_n$ for all n and

$$ \mu^*(E) \le \mu(F_n) < \mu^*(E) + \frac{1}{n}. $$

Then, $\cap_n F_n$ is also measurable and contains E, so by our equivalent form of µ∗, we have $\mu^*(E) \le \mu(\cap_n F_n)$. However, $\cap_n F_n \subseteq F_n$ always and hence,

$$ \mu^*(E) \le \mu(\cap_n F_n) \le \mu(F_n) < \mu^*(E) + \frac{1}{n}. $$

We conclude for all n, $\mu^*(E) \le \mu(\cap_n F_n) < \mu^*(E) + \frac{1}{n}$. Letting n go to infinity, we find $\mu^*(E) = \mu(\cap_n F_n)$ which shows µ is regular.

12.3 Approximation Results

We now present some approximation results for Lebesgue measurable sets and some applications to the

Lp spaces. We begin with the following result.

12.3.1 Approximating Measurable Sets

Theorem 12.3.1 (Measurability and Approximation Conditions).
A set E in $\Re^k$ is Lebesgue measurable if and only if for all ε > 0, there is a pair of sets (F, G) so that F is closed, G is open with F ⊆ E ⊆ G and µ(G \ F) < ε. Moreover, if E is bounded, we can choose the open set G so that its closure $\bar{G}$ is compact.



Proof 12.3.1. First, we prove that if the condition holds for all ε > 0, then E must be measurable. For each positive integer n, there are therefore closed sets $F_n$ and open sets $G_n$ so that

$$ F_n \subseteq E \subseteq G_n $$

with $\mu(G_n \setminus F_n) < \frac{1}{n}$. Let $F = \cup_n F_n$ and $G = \cap_n G_n$. Then, G and F are measurable and

$$ G \setminus F \subseteq G_n \setminus F_n $$

with $\mu(G \setminus F) \le \mu(G_n \setminus F_n) < \frac{1}{n}$. Hence, as n goes to ∞, we see µ(G \ F) = 0. Since Lebesgue measure is complete and E \ F ⊆ G \ F, we also know E \ F is measurable with measure 0. Finally, since E = F ∪ (E \ F) and each piece is measurable, E is measurable.

To prove the other direction is a bit more complicated. We now start with the assumption that E is measurable. First, let’s assume E is a bounded set. Since E is bounded, there is a bounded open interval I so that µ(E) ≤ |I| with E ⊆ I. Further, for any ε that is positive, by Theorem 12.2.4, there is an open set H so that E ⊆ H and $\mu(H) < \mu(E) + \frac{\epsilon}{2}$. But then the set G = H ∩ I is also open and since it contains E, we must have

$$ \mu(E) \le \mu(G) \le \mu(H) < \mu(E) + \frac{\epsilon}{2}. $$

It is also clear that the closure $\bar{G}$ is compact, since G is contained in the bounded interval I.

Now choose a compact set A so that E ⊆ A (we can do this because E is bounded). Next, since the set A \ E is measurable, by Theorem 12.2.4 we can find an open set B so that A \ E ⊆ B and

$$ \mu(B) < \mu(A \setminus E) + \frac{\epsilon}{2}. $$

Define the closed set F by F = A \ B. Then, an easy calculation shows F ⊆ E. We thus have shown

$$ \mu(F) = \mu(A) - \mu(A \cap B) \ge \mu(A) - \mu(B) > \mu(A) - \mu(A \setminus E) - \frac{\epsilon}{2} = \mu(E) - \frac{\epsilon}{2}. $$

We conclude µ(G \ F) = µ(G) − µ(F) < ε. This shows the result holds if E is bounded.

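To show the result is valid if E is not bounded, let $S_n = \{ x : ||x|| \le n \}$ be the closed ball of radius n in $\Re^k$. Define the sets $E_n$ as follows:

$$ E_1 = E \cap S_1, $$

$$ E_2 = E \cap (S_2 \setminus S_1), $$

$$ E_3 = E \cap (S_3 \setminus S_2), $$

$$ \vdots $$

$$ E_n = E \cap (S_n \setminus S_{n-1}), $$

$$ \vdots $$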

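Then $E = \cup_n E_n$ and each $E_n$ is bounded and measurable. Hence, for each n, the result for a bounded measurable set applies. Thus, there are closed sets $F_n$ and open sets $G_n$ with $F_n \subseteq E_n \subseteq G_n$ and

$$ \mu(G_n \setminus F_n) < \frac{\epsilon}{2^n}. $$

Let $F = \cup_n F_n$ and $G = \cup_n G_n$. We see

$$ G \setminus F \subseteq \cup_n (G_n \setminus F_n) $$

and

$$ \mu(G \setminus F) \le \sum_n \mu(G_n \setminus F_n) < \sum_n \frac{\epsilon}{2^n} = \epsilon. $$

Since F ⊆ E ⊆ G, all that is left to prove is that F is closed.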

To do this, let $(x_i)$ be a sequence in F which converges to x. Since the sequence converges, $(x_i)$ is bounded and so all the $x_i$ live in some $S_N$. But $F_n \subseteq E_n = E \cap (S_n \setminus S_{n-1})$ for all n ≥ 2 and hence

$$ F_{N+1} \subseteq S_{N+1} \setminus S_N, \qquad F_{N+2} \subseteq S_{N+2} \setminus S_{N+1} \subseteq S_{N+2} \setminus S_N $$

and so forth. Thus, $F_n \subseteq S_n \setminus S_N \subseteq S_N^C$ for all n > N. Hence, the points $x_i$ must all live in $\cup_{n=1}^{N} F_n$, which is a closed set because it is a finite union of closed sets. But then the limit point x must also be in this set which tells us x is in F. Thus, F is closed and we have proven the desired result.

We easily prove Theorem 12.3.1 in a more general setting. First, let’s introduce a common definition.

Definition 12.3.1 (Borel Measure).Let X be a metric space and S a sigma-algebra of subsets of X . We say the measure µ on S is a Borel

Measure if the Borel sets in X are µ measurable.

Theorem 12.3.2 (Finite Measure Borel Sets Can Be Approximated By Closed Sets).Let X be a metric space, µ a Borel measure on X , ε > 0 and B a Borel Set with µ(B) <∞. Then B

contains a closed set F with µ(B \ F ) < ε.

Proof 12.3.2. First, let’s assume µ(X) < ∞. Let F be the family defined as follows:

$$ F = \{ E \subseteq X \mid \forall \, \gamma > 0, \ \exists \text{ closed set } K \subseteq E \text{ with } \mu(E \setminus K) < \gamma \}. $$

It is easy to see that closed sets with finite measure are in F. Let ε > 0 be chosen. Now suppose sets $E_1$ to $E_N$ are in F for some positive integer N. Then, there are closed sets $K_1$ to $K_N$ so that $K_i \subseteq E_i$ and $\mu(E_i \setminus K_i) < \frac{\epsilon}{2^i}$. Then

$$ \mu\left( \cap_{i=1}^N E_i \setminus \cap_{i=1}^N K_i \right) \le \mu\left( \cup_{i=1}^N (E_i \setminus K_i) \right) < \sum_{i=1}^N \frac{\epsilon}{2^i} = \epsilon \left( 1 - \frac{1}{2^N} \right) < \epsilon. $$

Hence, the closed set $\cap_{i=1}^N K_i$ is contained in the set $\cap_{i=1}^N E_i$ and satisfies the γ condition. Thus, F is closed under finite intersections. From this argument, it is also clear that F is closed under countable intersections using minor changes in the reasoning as a countable intersection of closed sets is closed.

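Now, let’s look at a finite union. We again suppose sets $E_1$ to $E_N$ are in F for some positive integer N. Then, there are closed sets $K_1$ to $K_N$ so that $K_i \subseteq E_i$ and $\mu(E_i \setminus K_i) < \frac{\epsilon}{2^i}$. Then,

$$ \mu\left( \cup_{i=1}^N E_i \setminus \cup_{i=1}^N K_i \right) \le \mu\left( \cup_{i=1}^N (E_i \setminus K_i) \right) < \sum_{i=1}^N \frac{\epsilon}{2^i} = \epsilon \left( 1 - \frac{1}{2^N} \right) < \epsilon. $$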

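Since the finite union of closed sets is closed, we see the closed set $\cup_{i=1}^N K_i$ is contained in the set $\cup_{i=1}^N E_i$ and satisfies the γ condition also. Hence, F is closed under finite unions. To handle countable unions, define the sequence of sets

$$ F_N = \cup_{i=1}^{\infty} E_i \setminus \cup_{i=1}^{N} K_i. $$

Then, $\ldots \subseteq F_N \subseteq F_{N-1} \subseteq \ldots \subseteq F_1$ and $\mu(F_1) < \infty$ because µ(X) < ∞. Hence, we can invoke Lemma 9.1.2 to conclude

$$ \lim_N \mu\left( \cup_{i=1}^{\infty} E_i \setminus \cup_{i=1}^{N} K_i \right) = \mu\left( \cup_{i=1}^{\infty} E_i \setminus \cup_{i=1}^{\infty} K_i \right) \le \mu\left( \cup_{i=1}^{\infty} (E_i \setminus K_i) \right) < \sum_{i=1}^{\infty} \frac{\epsilon}{2^i} = \epsilon. $$

Thus, for large enough choice of N, we must have

$$ \mu\left( \cup_{i=1}^{\infty} E_i \setminus \cup_{i=1}^{N} K_i \right) < \epsilon. $$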

The set $\cup_{i=1}^N K_i$ is a closed subset of $\cup_{i=1}^{\infty} E_i$ and satisfies the γ condition. Hence, F is closed under countable unions. Since F contains the closed subsets, it now contains the open sets as well. We conclude F contains the Borel sets.

If µ(X) = ∞, given a set B with µ(B) < ∞, note $S_B = \{ E \cap B : E \in S \}$ is a sigma-algebra of subsets of B and $\mu_B$ defined as the restriction of µ to this sigma-algebra is finite. Apply the argument for the finite measure case to $\mu_B$ to get the desired result.

Theorem 12.3.3 (Finite Measure Borel Sets Can Be Approximated By Open Sets).
Let X be a metric space, µ a Borel measure on X, ε > 0 and B a Borel set. If µ(X) < ∞ or more generally, if B is contained in a countable union of open sets $U_i$, each of finite measure, then B is contained in an open set G with µ(G \ B) < ε.

Proof 12.3.3. We use Theorem 12.3.2 here. If µ(X) is finite, $\mu(B^C)$ is finite. So there is a closed set K in $B^C$ so that $\mu(B^C \setminus K) < \epsilon$. Let the open set $G = K^C$. Then µ(G \ B) < ε and we are done.

In the more general situation, choose a closed set $K_i$ contained in each $U_i \setminus B$ so that

$$ \mu\left( (U_i \setminus B) \setminus K_i \right) < \frac{\epsilon}{2^i}. $$

Next, note that $B \cap U_i \subseteq U_i \setminus K_i$ which is an open set. Define $G = \cup_{i=1}^{\infty} (U_i \setminus K_i)$. Then G is open, contains B and µ(G \ B) < ε.

12.3.2 Approximating Measurable Functions

We can use Theorem 12.3.1 in many ways. One way is to construct continuous functions on suitable

domains which approximate summable functions. Let’s start with the characteristic function of the mea-

surable set E, IE .

Theorem 12.3.4 (Continuous Approximation Of A Characteristic Function).
Let E in $\Re^k$ be Lebesgue measurable. Then given ε > 0, there is a continuous function $\phi_E$ so that $||I_E - \phi_E||_1 < \epsilon$. Moreover, if E is bounded, the support of $\phi_E$ is compact.

Proof 12.3.4. Using Theorem 12.3.1, given ε > 0, we see there is a closed set F and an open set G with F ⊆ E ⊆ G with $\mu(F^C \cap G) < \frac{\epsilon}{2}$. For any subset C in $\Re^k$, it is easy to see the distance function

$$ d(x, C) = \inf \{ d(x, y) : y \in C \} $$

is continuous on $\Re^k$, where d denotes the standard Euclidean metric. It follows that the function f defined on $\Re^k$ by

290


f(x) =d(x,GC)

d(x,GC) + d(x, F )

is continuous and satisfies φE is 1 on F and 0 on GC and 0 ≤ f(x) ≤ 1 always. Since φE is nonzero on

GC , it follows that φE has compact support if E is bounded.

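Finally, note

\[ \int |\phi_E - I_E| \, d\mu = \int_F |\phi_E - I_E| \, d\mu + \int_{F^C \cap E} |\phi_E - I_E| \, d\mu + \int_{E^C \cap G} |\phi_E - I_E| \, d\mu + \int_{G^C} |\phi_E - I_E| \, d\mu. \]

Since F ⊆ E, both I_E and φ_E are 1 on F, so the first integral vanishes. Also, since E ⊆ G, G^C ⊆ E^C which tells us both I_E and φ_E are 0 on G^C. Hence, the last integral vanishes. Also, note F^C ∩ E ⊆ F^C ∩ G and E^C ∩ G ⊆ F^C ∩ G. Thus, since |φ_E − I_E| ≤ 1,

\[ \int_{F^C \cap E} |\phi_E - I_E| \, d\mu + \int_{E^C \cap G} |\phi_E - I_E| \, d\mu \le \mu(F^C \cap E) + \mu(E^C \cap G) \le 2 \mu(F^C \cap G) < \varepsilon. \]

This allows us to conclude ∫ |φ_E − I_E| dµ < ε which is the desired result.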
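The construction in this proof can be tried numerically in one dimension. The following is only a sketch under assumptions that are not in the text: we take E = F = [0, 1] (E is already closed) and G = (−η, 1 + η), and the helper names are hypothetical. For this choice the function f ramps linearly from 0 to 1 on each side of E, so the L1 error should be close to η.

```python
def dist_to_interval(x, lo, hi):
    """Distance from x to the closed interval [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0


def dist_to_complement(x, lo, hi):
    """Distance from x to the complement of the open interval (lo, hi)."""
    if x <= lo or x >= hi:
        return 0.0
    return min(x - lo, hi - x)


def phi(x, F, G):
    """f(x) = d(x, G^C) / (d(x, G^C) + d(x, F)) for F = [F0, F1] inside G = (G0, G1)."""
    d_gc = dist_to_complement(x, G[0], G[1])
    d_f = dist_to_interval(x, F[0], F[1])
    return d_gc / (d_gc + d_f)  # never 0/0 because F and G^C are disjoint


def l1_error(F, G, E, steps=100_000):
    """Midpoint-rule estimate of the integral of |phi - I_E| over G."""
    width = (G[1] - G[0]) / steps
    total = 0.0
    for i in range(steps):
        x = G[0] + (i + 0.5) * width
        indicator = 1.0 if E[0] <= x <= E[1] else 0.0
        total += abs(phi(x, F, G) - indicator) * width
    return total


eta = 0.05                 # assumed tolerance controlling the size of G \ F
E = (0.0, 1.0)
F = (0.0, 1.0)             # closed set inside E (here F = E)
G = (-eta, 1.0 + eta)      # open set containing E
err = l1_error(F, G, E)
print(err)                 # should be close to eta
```

Shrinking η shrinks the error, which mirrors the role of µ(F^C ∩ G) < ε/2 in the proof.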

Next, we see we can approximate simple functions arbitrarily closely in the L1 “norm”.

Theorem 12.3.5 (Continuous Approximation Of A Simple Function).
Let µ denote Lebesgue measure in ℜ^k. Let φ : ℜ^k → ℜ be a simple function. Then given any ε > 0, there is a continuous function g so that ∫ |φ − g| dµ < ε. Moreover, if the support of φ is bounded, the support of g is compact.

Proof 12.3.5. The simple function φ has the standard representation φ = Σ_{i=1}^n a_i I_{E_i} where the numbers a_i are distinct and nonzero and the sets E_i are measurable and disjoint. Let A = max_{1≤i≤n} |a_i|. By Theorem 12.3.4, there are continuous functions g_i so that ∫ |I_{E_i} − g_i| dµ < ε/(A n). Let g = Σ_{i=1}^n a_i g_i. Then

\[ \int |g - \phi| \, d\mu \le \sum_{i=1}^n |a_i| \int |I_{E_i} - g_i| \, d\mu < A \sum_{i=1}^n \frac{\varepsilon}{A n} = \varepsilon. \]

It is clear that g has compact support if ∪_{i=1}^n E_i is bounded.

Now, we can approximate a summable function f with a continuous function as well.

Theorem 12.3.6 (Approximation Of A Summable Function With A Continuous Function).
Let f be summable on ℜ^k with respect to Lebesgue measure. Then given ε > 0, there is a continuous function of compact support g so that ∫ |f − g| dµ < ε.

Proof 12.3.6. First, define the functions f_n by

\[ f_n = f \, I_{[-n,n]^k}. \]

Then, for all n, |f − f_n| is dominated by the summable function |f| and we also know f_n → f. By the Dominated Convergence Theorem, we then have ∫ |f − f_n| dµ → 0. Hence, given ε > 0, there is an N so that if n > N, then

\[ \int |f - f_n| \, d\mu = \int_{([-n,n]^k)^C} |f| \, d\mu < \frac{\varepsilon}{4}. \]

Choose a particular p > N. Then since f_p is summable, we know there are sequences of simple functions, (φ_j) and (ψ_j), so that φ_j ↑ f_p^+ and ψ_j ↑ f_p^- with f_p = f_p^+ − f_p^-. We know also that ∫ φ_j dµ ↑ ∫ f_p^+ dµ and ∫ ψ_j dµ ↑ ∫ f_p^- dµ. These simple functions are constructed using the technique given in Theorem 8.8.2. Since f_p has compact support, it follows that for each j these simple functions have compact support also. We see there is an M and a simple function ζ = φ_M − ψ_M so that ∫ |f_p − ζ| dµ < ε/4. Using Theorem 12.3.5, there is a continuous function of compact support g so that ∫ |ζ − g| dµ < ε/4. Combining, we see

\[ \int |f - g| \, d\mu \le \int |f - f_p| \, d\mu + \int |f_p - \zeta| \, d\mu + \int |\zeta - g| \, d\mu < \frac{3\varepsilon}{4} < \varepsilon. \]
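The first step of this proof, choosing the truncation index, can be made concrete for one specific summable function. The sketch below assumes f(x) = exp(−|x|) on the real line (our choice, not the text's), for which the tail integral of |f − f_n| is exactly 2 exp(−n); the function names are hypothetical.

```python
import math


def tail_mass(n):
    """Exact integral of |f - f_n| over the real line for f(x) = exp(-|x|)."""
    return 2.0 * math.exp(-n)


def truncation_index(eps):
    """Smallest positive integer n whose tail mass is below eps / 4."""
    n = 1
    while tail_mass(n) >= eps / 4:
        n += 1
    return n


eps = 0.01
p = truncation_index(eps)
print(p, tail_mass(p))
```

Any index at or beyond the returned value leaves a tail of mass below ε/4, which is all the first step of the proof needs.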

This result allows us to show that L1 is separable.

Theorem 12.3.7 (L1 Is Separable).
Let [a, b] be a finite interval. The space L1([a, b], M, µ), where µ is Lebesgue measure on [a, b] and M is the sigma-algebra of Lebesgue measurable subsets, is separable.

Proof 12.3.7. Let f be a summable function and let ε > 0. By Theorem 12.3.6 there is a continuous function g so that ∫_a^b |f − g| dµ < ε/4. Further, there is a polynomial p on [a, b] by the Weierstrass Approximation Theorem 5.1.2 so that sup_{x∈[a,b]} |g(x) − p(x)| < ε/(4(b − a)). Further, there is a polynomial q with rational coefficients so that sup_{x∈[a,b]} |p(x) − q(x)| < ε/(4(b − a)). Combining, we have

\[ \int |f - q| \, d\mu \le \int |f - g| \, d\mu + \int |g - p| \, d\mu + \int |p - q| \, d\mu \le \frac{\varepsilon}{4} + (b-a) \, \|g - p\|_\infty + (b-a) \, \|p - q\|_\infty \le \frac{\varepsilon}{4} + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} < \varepsilon. \]

Since the set of all polynomials on [a, b] with rational coefficients is countable, we see we have shown L1 is separable here.

Comment 12.3.1. It is straightforward to extend these results to any 1 ≤ p <∞. We can also extend the

separability result to a bounded subset Ω of ℜ^k but we will not do that here.

Comment 12.3.2. We see Theorem 12.3.6 tells us the continuous functions with compact support are dense

in L1 on ℜ^k.

12.4 The Existence Of Non Lebesgue Measurable Sets

We now show that there are subsets of ℜ^n which are not Lebesgue measurable. We begin by showing Lebesgue measure is translation invariant: this means if E is measurable, so is t + E for any t in ℜ^n and µ(E) = µ(E + t).

Theorem 12.4.1 (Lebesgue Measure Is Translation Invariant).
Let E in ℜ^n be Lebesgue measurable and let t in ℜ^n be arbitrary. Then, µ*(E) = µ*(t + E), t + E is also measurable and hence, µ(E) = µ(t + E).

Proof 12.4.1. We will provide a sketch. The details are left as an exercise.

Step 1: First, to show µ*(E) = µ*(t + E) for all t, we use a standard Lebesgue Cover argument.

Step 2: We show if E is measurable, so is t + E. This is done by showing the set t + E satisfies the Caratheodory condition.

• Prove T ∩ (t + B) = ((T − t) ∩ B) + t for all t and sets B and T.

• Prove (t + B)^C = t + B^C.

• Finally, prove t + E satisfies the Caratheodory condition which essentially finishes the proof.

The translation invariance of Lebesgue measure is important in the construction of a non Lebesgue

measurable set. We establish this via two preliminary results which we put into Lemmas and then the final

theorem.

Lemma 12.4.2 (Non Measurable Set Lemma 1).
Let θ be in (0, 1). Let E be a measurable set in ℜ^n with µ(E) > 0. Then, there is an open interval I so that µ(E ∩ I) > θµ(I).

Proof 12.4.2. First do the case µ(E) is finite. Start with a Lebesgue cover (I_n) of E chosen using the supremum tolerance lemma for a positive tolerance δ. If you think about this carefully, by choosing the tolerance right this can be phrased as

\[ \mu(E) > \frac{1}{1 + \delta} \sum_n |I_n|. \]

Now pick δ so that 1/(1 + δ) = θ. Then we know µ(E) > θ Σ_n |I_n|. If we had µ(E ∩ I_n) ≤ θ |I_n| for every n, then µ(E) ≤ Σ_n µ(E ∩ I_n) ≤ θ Σ_n |I_n| < µ(E), which is impossible. Hence µ(E ∩ I_n) > θ µ(I_n) for at least one n and the result then follows.

If the measure of E is not finite and for all bounded intervals J we had µ(E ∩ J) = 0, we would find µ(E) = 0 as well which is not possible. So there must be one bounded interval J with µ(E ∩ J) positive. Now use the first part on E ∩ J to get the result.

Lemma 12.4.3 (Non Measurable Set Lemma 2).
Given E in ℜ^n, we define the set of all differences x − y for x and y in E to be the difference set of E which we denote by E_d. If E is measurable in ℜ^n and if µ(E) > 0, then E_d contains a neighborhood of 0.

Proof 12.4.3. This one is a bit trickier. First choose λ so that

\[ 1 - \frac{1}{2^{n+1}} < \lambda < 1. \]

Then by Lemma 12.4.2, we can find an open interval I so that λµ(I) < µ(E ∩ I). Let d_i denote the length of the ith edge of I, let δ be the smallest edge length of the interval I and let J be the open interval whose ith edge is (−δ/2, δ/2).

• Show if x is from J, then I + x contains the center of I.

• For any x from J, look at I and I + x on their ith edge. For convenience, let the point x from J satisfy 0 ≤ x_i < d_i/2. Then, show the common part of I and I + x here has length larger than d_i/2. Thus, µ(I ∩ (I + x)) > (1/2^n) µ(I).

• Then show for x in J, µ(I ∪ (I + x)) < 2λµ(I).

• For any x from J, let A = E ∩ I and B = (E ∩ I) + x. Both µ(A) and µ(B) are larger than λµ(I). If A and B were disjoint, this would imply µ(A ∪ B) > 2λµ(I). However, A ∪ B ⊆ I ∪ (I + x), so we have shown µ(A ∪ B) < 2λµ(I) which is a contradiction. Hence A and B can not be disjoint.

The above tells us E ∩ I and (E ∩ I) + x have a common point for each x in J. It then follows we can write x as u − v for some u and v from E implying x is in E_d. This completes the proof.


We can now show there is a set in ℜ^n which is not Lebesgue measurable. We already know the sigma-algebra of Lebesgue measurable subsets, L, of ℜ^n strictly contains the Borel sigma-algebra, B. Theorem 12.4.4 tells us that L is strictly contained in the power set of ℜ^n. From this, we can then prove every measurable set of positive measure contains a non measurable set.

Theorem 12.4.4 (Non Lebesgue Measurable Set).
There is a non Lebesgue measurable set in ℜ^n. Further, any set of positive measure contains a non measurable set.

Proof 12.4.4. Let z be a fixed irrational number and let the set M = Q. It is easy to show M^n is countable, closed under addition and subtraction and dense in ℜ^n.

Then, to prove the result:

• Prove the relation ∼ on ℜ^n × ℜ^n defined by x ∼ y if x − y is in M^n is an equivalence relation.

• Let the inner measure µ_*(E) be defined by

\[ \mu_*(E) = \sup \{ \mu(F) : F \subseteq E, \ F \text{ measurable} \}. \]

Prove if E is measurable, then µ(E) = µ_*(E) = µ^*(E).

• Consider the collection of equivalence classes determined by ∼ and use the Axiom Of Choice (look this up!) to pick a unique point from each equivalence class to form the set S. Recall, we know that µ_*(S) = sup{µ(E) : E ⊆ S, E measurable}. If µ(E) > 0, use Lemma 12.4.3 to note E_d contains a neighborhood of 0. Since M^n is dense, E_d ∩ M^n contains a point x which is not 0. Prove this point x = u − v with u, v from E implying u ∼ v with x ≠ 0 which is not possible by the way S was constructed. This implies µ(E) = 0 for every measurable E ⊆ S and so µ_*(S) = 0. Thus, by the previous result, if S were measurable, µ(S) = 0.

• Let (x_n) be an enumeration of M^n. Then, show ℜ^n = ∪_n (S + x_n) and prove that S measurable implies ∞ = 0 giving a contradiction. We must therefore conclude that S can not be a measurable set.

• Show if E is measurable with positive measure, then E contains a non measurable set.

• Show if E is measurable with positive measure, then E contains a non measurable set.

Now go back and reread Section 13.4 where we used these results!

12.4.1 Exercises

Exercise 12.4.1. Prove Lemma 12.4.2 in detail.

Exercise 12.4.2. Prove Lemma 12.4.3 in detail.

Exercise 12.4.3. Prove Theorem 12.4.4 in detail.


12.5 Metric Spaces Of Finite Measure Sets

We can do many things with measurable sets. This material is a connection of sorts to the standard

discussions of linear analysis.

Theorem 12.5.1 (The Metric Space of Finite Measurable Sets).
Let M be the collection of all Lebesgue measurable sets in ℜ^n with finite measure. Recall, the symmetric difference of sets A and B is

\[ A \Delta B = (A \cap B^C) \cup (A^C \cap B). \]

Then

1. The relation ∼ on M ×M defined by

A ∼ B if µ(A∆B) = 0

is an equivalence relation.

2. Let M denote the set of all equivalence classes with respect to this relation and let [A] be a

typical equivalence class with representative A. Prove M is a metric space with metric D

defined by

D([A], [B]) := µ(A∆B).

Proof 12.5.1. A sketch of the proof is given below:

• First prove if A and B are in M ,

|µ(A)− µ(B)| ≤ µ(A∆B).

• Then, prove if A ∼ E and B ∼ F , then µ(A∆B) = µ(E∆F ).

Hint: Show

|µ(A∆B)− µ(E∆F )| ≤ µ(A∆E) + µ(B∆F ).

• Prove D is a metric which requires you prove D is well-defined by showing its value is independent

of the choice of equivalence class representatives.
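The inequalities in this sketch can be checked on a discrete stand-in before proving them. The following assumes, purely for illustration, that each set is a finite collection of grid cells of equal measure, so that Python's set operator `^` plays the role of the symmetric difference; the names `CELL`, `mu` and `D` are hypothetical.

```python
from itertools import combinations

CELL = 0.25  # assumed measure of one grid cell


def mu(cells):
    """Measure of a finite union of grid cells."""
    return CELL * len(cells)


def D(A, B):
    """D(A, B) = mu(A symmetric-difference B)."""
    return mu(A ^ B)


A = frozenset(range(0, 8))
B = frozenset(range(4, 12))
C = frozenset(range(6, 20))
sets = [A, B, C]

# Triangle inequality for D over every ordered triple.
triangle_ok = all(D(X, Z) <= D(X, Y) + D(Y, Z)
                  for X in sets for Y in sets for Z in sets)

# First bullet of the sketch: |mu(A) - mu(B)| <= mu(A symmetric-difference B).
mass_ok = all(abs(mu(X) - mu(Y)) <= D(X, Y)
              for X, Y in combinations(sets, 2))

print(D(A, B), triangle_ok, mass_ok)
```

This does not replace the proof; it only shows the shape of the estimates on sets where everything can be counted.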

Theorem 12.5.2 (The Metric Space of All Finite Measure Sets Is Complete).
The metric space (M, D) is complete.


Proof 12.5.2. We provide a detailed sketch. Let (E_n) be a Cauchy sequence.

1. Prove there is a subsequence (E_{n_m}) of (E_n) so that

\[ \mu(E_{n_m} \Delta E_n) < \frac{1}{2^m} \ \text{ for all } n > n_m, \qquad \mu(E_{n_k} \Delta E_{n_m}) < \frac{1}{2^m} \ \text{ for all } n_k > n_m. \]

2. For expositional convenience, let G_m = E_{n_m} in what follows. Let H = lim sup G_m and G = lim inf G_m.

• Prove µ(H \ G_m) → 0 and µ(G_m \ G) → 0.

• Prove µ(H∆G_m) → 0. This argument uses the disjoint decomposition idea from Lemma 9.1.5.

Step 1: Since

\[ \mu(H^C \cap G_j) = \lim_m \mu\Big( \bigcap_{n \ge m} (G_n^C \cap G_j) \Big), \]

overestimate the right hand side by µ(G_n^C ∩ G_j) which is also bounded above by µ(G_n∆G_j) which is as small as we want. This shows µ(H^C ∩ G_j) → 0.

Step 2: First note

\[ H \cap G_j^C = \bigcap_m \Big( \bigcup_{n \ge m} G_n \cap G_j^C \Big) \]

which shows H ∩ G_j^C is a decreasing limit of sets. Now look at ∪_{n≥m} G_n ∩ G_j^C which is contained in G_j^C always. However, since the measure of G_j is finite, the measure of its complement is not, so this upper bound is not very useful. For convenience, let B_m = ∪_{n≥m} G_n. To understand how to use the properties of our sequence G_n, we rewrite the union B_m as the disjoint union of the sets

\begin{align*}
U_m &= G_m \\
U_{m+1} &= G_{m+1} \setminus G_m \\
U_{m+2} &= G_{m+2} \setminus (G_m \cup G_{m+1}) \\
&\ \ \vdots \\
U_{m+n} &= G_{m+n} \setminus (G_m \cup G_{m+1} \cup \cdots \cup G_{m+n-1}) \\
&\ \ \vdots
\end{align*}

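From the above, we see for all m > j

\[ \mu(B_m \cap G_j^C) = \sum_{n=0}^\infty \mu(U_{m+n} \cap G_j^C). \]

However, we can overestimate a bit using

\begin{align*}
\mu(U_m \cap G_j^C) &= \mu(G_m \cap G_j^C) \le \mu(G_m \Delta G_j) < \frac{1}{2^j}, \\
\mu(U_{m+1} \cap G_j^C) &= \mu(G_{m+1} \cap G_m^C \cap G_j^C) \le \mu(G_{m+1} \cap G_m^C) < \frac{1}{2^m}, \\
\mu(U_{m+2} \cap G_j^C) &= \mu(G_{m+2} \cap G_m^C \cap G_{m+1}^C \cap G_j^C) \le \mu(G_{m+2} \cap G_{m+1}^C) < \frac{1}{2^{m+1}}, \\
&\ \ \vdots \\
\mu(U_{m+n} \cap G_j^C) &= \mu(G_{m+n} \cap G_m^C \cap \cdots \cap G_{m+n-1}^C \cap G_j^C) \le \mu(G_{m+n} \cap G_{m+n-1}^C) < \frac{1}{2^{m+n-1}}.
\end{align*}

Thus,

\[ \mu(B_m \cap G_j^C) < \frac{1}{2^j} + \sum_{n=1}^\infty \frac{1}{2^{m+n-1}} = \frac{1}{2^j} + \frac{1}{2^{m-1}}. \]

We conclude

\[ \mu(H \setminus G_j) = \lim_m \mu(B_m \cap G_j^C) \le \lim_m \Big( \frac{1}{2^j} + \frac{1}{2^{m-1}} \Big) = \frac{1}{2^j}, \]

which goes to 0 as j → ∞. This shows the desired result.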

3. Prove µ(H∆Em)→ 0 also which proves completeness.

Comment 12.5.1. We can also prove µ(G∆H) = 0 so that G ∼ H .

12.5.1 Exercises




Chapter 13

Cantor Set Experiments

We now begin a series of personal investigations into the construction of an important subset of [0, 1] called the Cantor Set. We follow a great series of homework exercises outlined, without solutions, in a really hard but extraordinarily useful classical analysis text by Stromberg, (Stromberg (6) 1981).

13.1 The Generalized Cantor Set

Let (a_n) for n ≥ 0 be a fixed sequence of real numbers which satisfy

\[ a_0 = 1, \qquad 0 < 2 a_n < a_{n-1} \ \text{ for } n \ge 1. \tag{13.1} \]

Define the sequence (d_n) by

\[ d_n = a_{n-1} - 2 a_n. \]

Note each d_n > 0. We can use the sequence (a_n) to define a collection of intervals J_{n,k} and I_{n,k} as follows.

(0) J_{0,1} = [0, 1] which has length a_0.

(1) J_{1,1} = [0, a_1] and J_{1,2} = [1 − a_1, 1]. You can see each of these intervals has length a_1. We let W_{1,1} = J_{1,1} ∪ J_{1,2} and I_{1,1} = J_{0,1} − W_{1,1} where the minus symbol used here represents set difference. This step creates an open interval of [0, 1] which has length d_1 > 0. Let P_1 = J_{1,1} ∪ J_{1,2}. This is a closed set.

(2) Set J_{2,1} = [0, a_2], J_{2,2} = [a_1 − a_2, a_1], J_{2,3} = [1 − a_1, 1 + a_2 − a_1], and J_{2,4} = [1 − a_2, 1]. These 4 closed subintervals have length a_2. It is not so mysterious how we set up the J_{2,k} intervals. Step (1) created a closed interval [0, a_1], an open interval (a_1, 1 − a_1) and another closed interval [1 − a_1, 1]. The first closed subinterval is what we have called J_{1,1}. Divide it into three parts; the first part will be a closed interval that starts at the beginning of J_{1,1} and has length a_2 and the third part will be a closed interval of length a_2 that ends at the last point of J_{1,1}. When these two closed intervals are subtracted from J_{1,1}, an open interval will remain. The length of J_{1,1} is a_1. So the open interval must have length a_1 − 2a_2 = d_2. A little thought tells us that the first interval must be [0, a_2] (which we have named J_{2,1}) and the third interval must be [a_1 − a_2, a_1] (which we have named J_{2,2}). To get the intervals J_{2,3} and J_{2,4}, we divide J_{1,2} into the same type of three subintervals as we did for J_{1,1}. The first and third must have length a_2 which will give an open interval in the inside of length d_2. This will give J_{2,3} = [1 − a_1, 1 − a_1 + a_2] and J_{2,4} = [1 − a_2, 1].

Then let W_{2,1} = J_{2,1} ∪ J_{2,2} and W_{2,2} = J_{2,3} ∪ J_{2,4}. Then create new intervals by letting I_{2,1} = J_{1,1} − W_{2,1} and I_{2,2} = J_{1,2} − W_{2,2}. We have now created 2 open subintervals of length d_2. Let P_2 = J_{2,1} ∪ J_{2,2} ∪ J_{2,3} ∪ J_{2,4}. We can write this more succinctly as P_2 = ∪{J_{2,k} | 1 ≤ k ≤ 2^2}. Again, notice that P_2 is a closed set that consists of 4 closed subintervals of length a_2.

Let’s look even more closely at the details. A careful examination of the process above with pen and paper in hand gives the following table that characterizes the left hand endpoint of each of the intervals J_{2,k}.

Interval    Left endpoint
J_{2,1}     0
J_{2,2}     a_2 + d_2
J_{2,3}     2a_2 + d_2 + d_1
J_{2,4}     3a_2 + 2d_2 + d_1

Since we know the left hand endpoint and the length is always a_2, this fully characterizes the subintervals J_{2,k}. Also, as a check, the last endpoint 3a_2 + 2d_2 + d_1 plus one more a_2 should add up to 1. We find

\[ 4a_2 + 2d_2 + d_1 = 4a_2 + 2(a_1 - 2a_2) + (a_0 - 2a_1) = a_0 = 1. \]

(3) Step (2) has created 4 closed subintervals J_{2,k} of length a_2 and 2 new open intervals I_{2,i} of length d_2. There is also the first open interval I_{1,1} of length d_1 which was removed from [0, 1]. Now we repeat the process described in Step (2) on each closed subinterval J_{2,k}. We do not need to use the auxiliary sets W_{3,i} now as we can go straight into the subdivision algorithm. We divide each of these intervals into 3 pieces. The first and third will be of length a_3. This leaves an open interval of length d_3 between them. We label the new closed subintervals so created by J_{3,k} where k now ranges from 1 to 8. The new intervals have left hand endpoints

Interval    Left endpoint
J_{3,1}     0
J_{3,2}     a_3 + d_3
J_{3,3}     2a_3 + d_3 + d_2
J_{3,4}     3a_3 + 2d_3 + d_2
J_{3,5}     4a_3 + 2d_3 + d_2 + d_1
J_{3,6}     5a_3 + 3d_3 + d_2 + d_1
J_{3,7}     6a_3 + 3d_3 + 2d_2 + d_1
J_{3,8}     7a_3 + 4d_3 + 2d_2 + d_1

Each of these subintervals has length a_3 and a simple calculation shows (7a_3 + 4d_3 + 2d_2 + d_1) + a_3 = 1 as desired. There are now 4 more open intervals I_{3,i} giving a total of 7 open subintervals arranged as follows:

Interval    Parent     Length
I_{1,1}     J_{0,1}    d_1
I_{2,1}     J_{1,1}    d_2
I_{2,2}     J_{1,2}    d_2
I_{3,1}     J_{2,1}    d_3
I_{3,2}     J_{2,2}    d_3
I_{3,3}     J_{2,3}    d_3
I_{3,4}     J_{2,4}    d_3

We define P_3 = ∪{J_{3,k} | 1 ≤ k ≤ 2^3} and note that P_1 ∩ P_2 ∩ P_3 = P_3.

We can, of course, continue this process recursively. Thus, after Step n, we have constructed 2^n closed subintervals J_{n,k} each of length a_n. The union of these subintervals is labeled P_n and is therefore defined by P_n = ∪{J_{n,k} | 1 ≤ k ≤ 2^n}. The left hand endpoints of J_{n,k} can be written in a compact and illuminating form, but we will delay working that out until later. Now, we can easily see the form of the left hand endpoints for the first few intervals:

Interval    Left endpoint
J_{n,1}     0
J_{n,2}     a_n + d_n
J_{n,3}     2a_n + d_n + d_{n−1}
J_{n,4}     3a_n + 2d_n + d_{n−1}

Definition 13.1.1 (The Generalized Cantor Set).
Let (a_n), n ≥ 0, satisfy Equation 13.1. We call such a sequence a Cantor Set Generating Sequence and we define the Cantor Set generated by (a_n) to be the set P = ∩_{n=1}^∞ P_n, where the sets P_n are defined recursively via the discussion in this section. We will denote the generalized Cantor Set generated by the Cantor Sequence (a_n) by C_a.
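The recursive subdivision described in this section is easy to experiment with. Here is a minimal sketch, assuming the middle thirds choice a_n = 3^{-n} and exact rational arithmetic; the function name `cantor_intervals` is ours and not part of the text. It builds the closed intervals J_{n,k} from left to right and compares two of the left endpoints with the table entries for Step (3).

```python
from fractions import Fraction


def cantor_intervals(a, n):
    """Return the 2^n closed intervals J_{n,1}, ..., J_{n,2^n} as (left, right) pairs.

    a is a list with a[0] = 1 and 0 < 2*a[k] < a[k-1]; each interval of level
    k-1 keeps a left piece and a right piece of length a[k].
    """
    intervals = [(Fraction(0), Fraction(1))]
    for k in range(1, n + 1):
        next_level = []
        for left, right in intervals:
            next_level.append((left, left + a[k]))     # first closed piece
            next_level.append((right - a[k], right))   # third closed piece
        intervals = next_level
    return intervals


a = [Fraction(1, 3**k) for k in range(5)]                 # assumed generating sequence
d = [None] + [a[k - 1] - 2 * a[k] for k in range(1, 5)]   # d[k] = a[k-1] - 2 a[k]

J3 = cantor_intervals(a, 3)
left_endpoints = [left for left, _ in J3]
print(left_endpoints)
```

Changing the list `a` to any other sequence satisfying Equation 13.1 gives the corresponding intervals for that generalized Cantor set.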


Comment 13.1.1. The Cantor Set generated by the sequence (1/3^n), n ≥ 0 is very famous and is called the Middle Thirds set because we are always removing the middle third of each interval in the construction process. We will denote the Middle Third Cantor set by C.

Exercise 13.1.1. Write out the explicit endpoints of all these intervals up to and including Step 4. Illustrate

this process with clearly drawn tables and graphs.

Exercise 13.1.2. Write out explicitly P1, P2, P3 and P4. Illustrate this process with clearly drawn tables

and graphs.

Exercise 13.1.3. Do the above two steps for the choice a_n = 3^{-n} for n ≥ 0. Illustrate this process with clearly drawn tables and graphs.

Exercise 13.1.4. Do the above two steps for the choice a_n = 5^{-n} for n ≥ 0. Illustrate this process with clearly drawn tables and graphs.

Exercise 13.1.5. As mentioned, the construction process above can clearly be handled via induction. Prove the following:

(a) P_{n−1} − P_n = ∪{I_{n,k} | 1 ≤ k ≤ 2^{n−1}}.

(b) Let P = ∩_{n=0}^∞ P_n. Then P_0 − P = ∪_{n=1}^∞ (P_{n−1} − P_n).

Exercise 13.1.6. Prove any Cantor set is in the Borel σ-algebra B.

13.2 Representing The Generalized Cantor Set

We are now in a position to prove additional properties about the Cantor Set C_a for a Cantor generating sequence (a_n). Associate with (a_n) the sequence (r_n) whose entries are defined by r_n = a_{n−1} − a_n. Let S denote the set of all sequences of real numbers whose values are either 0 or 1; i.e. S = {x = (x_n) | x_n = 0 or x_n = 1}. Now define the mapping f : S → C_a by

\[ f(x) = \sum_{n=1}^\infty x_n r_n. \tag{13.2} \]

Theorem 13.2.1 (Representing The Cantor Set).

1. f is well - defined.

2. f(x) is an element of Ca.

3. f is 1− 1 from S to Ca.

4. f is onto Ca.


Proof 13.2.1. You will prove this Theorem by establishing a series of results.

Exercise 13.2.1. For any Cantor generating sequence (a_n), we have lim_n a_n = 0.

Exercise 13.2.2. Show

\[ \sum_{j=n+1}^\infty r_j = \lim_m \sum_{j=n+1}^m r_j = \lim_m (a_n - a_m) = a_n. \]

Exercise 13.2.3. Show r_n > Σ_{j=n+1}^∞ r_j.

Exercise 13.2.4. For n ≥ 1 and any finite sequence (x_1, x_2, ..., x_n) of 0’s and 1’s, define the closed interval

\[ J(x_1, \ldots, x_n) = \Big[ \sum_{j=1}^n x_j r_j, \ a_n + \sum_{j=1}^n x_j r_j \Big]. \]

1. Show J(0) = [0, a_1] = J_{1,1}.

2. Show J(1) = [r_1, a_1 + r_1] = [1 − a_1, 1] = J_{1,2}.

3. Now use induction on n to show that the intervals J(x_1, ..., x_n) are exactly the 2^n intervals J_{n,k} for 1 ≤ k ≤ 2^n that we described in the previous section.

Hint 13.2.1. Assume the result is true for n − 1. Then we can assume that there is a unique choice (x_1, ..., x_{n−1}) so that J_{n−1,k} = J(x_1, ..., x_{n−1}).

Recall how the J’s are constructed. At Step n − 1, the interval J_{n−1,k} is used to create 2 more intervals on level n by removing a piece. The 2 intervals left both have length a_n and we would denote them by J_{n,2k−1} and J_{n,2k}. Now use the definition of the closed intervals J(x_1, ..., x_n) to show that (remember our x_1, ..., x_{n−1} are fixed)

\[ J(x_1, \ldots, x_{n-1}, 0) = J_{n,2k-1}, \qquad J(x_1, \ldots, x_{n-1}, 1) = J_{n,2k}. \]

This will complete the induction.
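Before attempting the induction, it can help to see the claim hold on a small case. The sketch below assumes the middle thirds sequence a_n = 3^{-n} and n = 3; the helper names `J` and `subdivide` are hypothetical. It computes the intervals from the binary codes and from the subdivision process separately and compares the two lists.

```python
from fractions import Fraction
from itertools import product


def J(bits, a):
    """[sum x_j r_j, a_n + sum x_j r_j] with r_j = a[j-1] - a[j] and n = len(bits)."""
    n = len(bits)
    left = sum((x * (a[j - 1] - a[j]) for j, x in enumerate(bits, start=1)),
               Fraction(0))
    return (left, left + a[n])


def subdivide(a, n):
    """Level n intervals J_{n,k}, left to right, built by the removal process."""
    intervals = [(Fraction(0), Fraction(1))]
    for k in range(1, n + 1):
        intervals = [piece
                     for left, right in intervals
                     for piece in ((left, left + a[k]), (right - a[k], right))]
    return intervals


a = [Fraction(1, 3**k) for k in range(6)]   # assumed generating sequence
n = 3

# product(...) lists the codes in lexicographic order, which is left-to-right order.
coded = [J(bits, a) for bits in product((0, 1), repeat=n)]

# r_j > a_j, which is the content of Exercises 13.2.2 and 13.2.3 combined.
tails_ok = all(a[j - 1] - a[j] > a[j] for j in range(1, len(a)))

print(coded == subdivide(a, n), tails_ok)
```

The agreement for one n is of course not a proof, but it shows which code (x_1, ..., x_n) goes with which index k.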

Exercise 13.2.5. Let x be in S. Show that f(x) is in J(x_1, ..., x_n) for each n.

Sketch Of Argument: We know that each J(x_1, ..., x_n) = J_{n,k} for some k. Let this k be written k(x, n) to help us remember that it depends on the x and the n. Also remember that 1 ≤ k(x, n) ≤ 2^n. So f(x) is in J_{n,k(x,n)} which is contained in P_n. Hence, f(x) is in P_n for all n which shows f(x) is in C_a. This shows f maps S into C_a.

Exercise 13.2.6. Now let x and y be distinct in S. Choose an index j so that x_j is different from y_j. Show this implies that f(x) and f(y) then belong to different closed intervals on the jth level. This implies f(x) is not the same as f(y) and so f is 1 − 1 on S.

Exercise 13.2.7. Show f is surjective. To do this, let z be in C_a. Since z is in P_1, either z is in J(0) or z is in J(1). Choose x_1 so that z is in J(x_1). Then assuming x_1, ..., x_{n−1} have been chosen, we have z is in J(x_1, ..., x_{n−1}). Now z is in

\[ P_n \cap J(x_1, \ldots, x_{n-1}) = J(x_1, \ldots, x_{n-1}, 0) \cup J(x_1, \ldots, x_{n-1}, 1). \]

This tells us how to choose x_n.

Hence, by induction, we can find a sequence (x_n) in S so that z is in the intersection over n of J(x_1, ..., x_n). But by our earlier arguments, f(x) is in the same intersection!

Finally, each of these closed intervals has length a_n which we know goes to 0 in the limit on n. So z and f(x) are both in a decreasing sequence of sets whose lengths go to 0. Hence z and f(x) must be the same. (This uses what is called the Cantor Intersection Theorem).

We can also prove a result about the internal structure of the generalized Cantor set: it can not contain

any open intervals.

Exercise 13.2.8. Prove Ca contains no open intervals.

In addition, we have the following result:

Exercise 13.2.9. The limit of 2^n a_n always exists and is in [0, 1].

13.3 The Cantor Function

We now prove additional interesting results that arise from the use of generalized Cantor sets via a series of exercises that you complete. As usual, let (a_n) be a Cantor Set generating sequence. Using the function f defined in the previous section, let’s define the mapping φ by

\[ \phi((x_n)) = \sum_{j=1}^\infty x_j \, \frac{1}{2^j}. \]

Hence, φ : S → [0, 1]. Since f is a one to one map of S onto C_a, the inverse f^{-1} : C_a → S exists. Let the mapping Ψ = φ ∘ f^{-1}. Note Ψ : C_a → [0, 1].

Exercise 13.3.1. φ maps S one to one and onto [0, 1] with a suitable restriction on the base 2 representation of a number in [0, 1].

Exercise 13.3.2. x < y in C_a implies Ψ(x) ≤ Ψ(y).

Exercise 13.3.3. For x < y in C_a, Ψ(x) = Ψ(y) if and only if (x, y) is one of the intervals removed in the Cantor set construction process, i.e.

\[ (x, y) = \Big( \sum_{j=1}^{n-1} x_j r_j + a_n, \ \sum_{j=1}^{n-1} x_j r_j + r_n \Big). \]

Exercise 13.3.4. In the case where Ψ(x) = Ψ(y) extend the mapping Ψ to [0, 1] − C_a by

\[ \Psi(t) = \Psi(x) = \Psi(y), \quad x < t < y. \]

Finally, define Ψ(0) = 0 and Ψ(1) = 1. Prove Ψ : [0, 1] → [0, 1] is a non decreasing continuous map of [0, 1] onto [0, 1] and is constant on each component interval of [0, 1] − C_a where component interval means the I_{n,k} sets we constructed in the Cantor set construction process.

Comment 13.3.1. If C_a is the Cantor set constructed from the sequence (1/3^n), we call Ψ the Lebesgue Singular Function.
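The extended map Ψ can be evaluated approximately by walking down the levels of the construction. The following is a sketch only: the function name `psi` and the finite `depth` cutoff are our assumptions, and the returned value is a truncation that lies within 2^{-depth} of the true value. At each level the point is either in the left piece (digit 0), in the right piece (digit 1), or in the removed gap, where Ψ is constant.

```python
from fractions import Fraction


def psi(t, a, depth):
    """Approximate the extended Cantor function at t in [0, 1].

    a[0..depth] is a Cantor generating sequence. The truncated value lies
    within 2**-depth of the true value; on removed gaps it is exact.
    """
    left = Fraction(0)      # left endpoint of the current closed interval
    value = Fraction(0)     # binary expansion accumulated so far
    for k in range(1, depth + 1):
        r_k = a[k - 1] - a[k]
        if t <= left + a[k]:
            continue                        # t is in the left piece: digit 0
        if t >= left + r_k:
            left += r_k                     # t is in the right piece: digit 1
            value += Fraction(1, 2**k)
        else:
            return value + Fraction(1, 2**k)  # t is in the removed gap
    return value


depth = 20
a = [Fraction(1, 3**k) for k in range(depth + 1)]   # middle thirds sequence

print(psi(Fraction(1, 2), a, depth))          # a point of the first removed gap
print(float(psi(Fraction(1, 4), a, depth)))   # a point of the Cantor set
```

With the middle thirds sequence this is the Lebesgue Singular Function of Comment 13.3.1; any other generating sequence can be passed in through `a`.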

Now, let C be a Cantor set constructed from the generating sequence (a_n) where lim 2^n a_n = 0. Let Ψ be the mapping discussed above for this C. Define the mapping g : [0, 1] → [0, 1] by g(x) = (Ψ(x) + x)/2.

Exercise 13.3.5. Prove g is strictly increasing and continuous from [0, 1] onto [0, 1].

Exercise 13.3.6. Prove that

\[ g\Big( \sum_{j=1}^\infty x_j r_j \Big) = \sum_{j=1}^\infty x_j r_j' \]

where r_j' = (1/2^j + r_j)/2.

Exercise 13.3.7. Prove C′ = g(C) is also a generalized Cantor set.

Comment 13.3.2. Note that the sequence a_j' = (1/2)(1/2^j + a_j) is also a Cantor generating sequence that gives the desired r_j' for the previous exercise.
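The claims in this comment can be spot-checked with exact arithmetic. The sketch below assumes the middle thirds sequence a_n = 3^{-n} (for which 2^n a_n tends to 0) and checks the first 30 terms only; the helper `is_generating` is a hypothetical name.

```python
from fractions import Fraction


def is_generating(a):
    """Check Equation 13.1 on a finite list: a[0] = 1 and 0 < 2 a[n] < a[n-1]."""
    return a[0] == 1 and all(0 < 2 * a[n] < a[n - 1] for n in range(1, len(a)))


N = 30
a = [Fraction(1, 3**n) for n in range(N + 1)]                      # lim 2^n a_n = 0
a_prime = [(Fraction(1, 2**n) + a[n]) / 2 for n in range(N + 1)]   # a'_n

# r'_j = a'_{j-1} - a'_j should equal (1/2^j + r_j)/2 with r_j = a_{j-1} - a_j.
r_prime_ok = all(
    a_prime[j - 1] - a_prime[j] == (Fraction(1, 2**j) + (a[j - 1] - a[j])) / 2
    for j in range(1, N + 1)
)

print(is_generating(a_prime), r_prime_ok, float(2**N * a_prime[N]))
```

The last printed number is 2^N a'_N, which should sit just above 1/2, in line with the limit used later for g(C).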

Exercise 13.3.8. Compute the outer measure of the Cantor set generated by (a_n) when lim 2^n a_n = 0 and also the outer measure of the Cantor set C′ = g(C).

Exercise 13.3.9. Compute the Lebesgue measure of the Cantor set generated by (a_n) when lim 2^n a_n = 0 and also the Lebesgue measure of the Cantor set C′ = g(C).

Exercise 13.3.10. If a subset of ℜ has content 0, it also has Lebesgue measure 0.

Next, we show this function g is of great importance in developing a better understanding of measures.

13.4 Interesting Consequences

We now argue the Borel sigma-algebra is strictly contained in the Lebesgue sigma-algebra by using the special functions we have constructed here. If C is a Cantor set constructed from the generating sequence (a_n) where lim 2^n a_n = 0, the Lebesgue measure of C is 0. Further, if Ψ is the mapping we defined earlier associated with C we can define the mapping g : [0, 1] → [0, 1] by g(x) = (Ψ(x) + x)/2. The mapping g is quite nice: it is 1 − 1, onto, strictly increasing and continuous. We also showed in the exercises in Section 13.3 that g(C) is another Cantor set with lim 2^n a_n' = 1/2, where (a_n') is the generating sequence for g(C). We also know

\[ \mu_B(C) = \mu_L(C) = 0, \qquad \mu_B(g(C)) = \mu_L(g(C)) = 1/2. \]

A nonconstructive argument using the Axiom of Choice then allows us to show any Lebesgue measurable set with positive Lebesgue measure must contain a subset which is not in the Lebesgue sigma-algebra. So since µ_L(g(C)) = 1/2, there is a set F ⊆ g(C) which is not in L. Thus, g^{-1}(F) ⊆ C which has Lebesgue measure 0. Lebesgue measure is a measure which has the property that every subset of a set of measure 0 must be in the Lebesgue sigma-algebra. Then, using the monotonicity of µ_L, we have µ_L(g^{-1}(F)) is also 0. From the above remarks, we can infer something remarkable.

Let the mapping h be defined to be g^{-1}. Then h is also continuous and hence it is measurable with respect to the Borel sigma-algebra. Note since B ⊆ L, this tells us immediately that h is also measurable with respect to the Lebesgue sigma-algebra. Thus, h^{-1}(U) is in the Borel sigma-algebra for all Borel sets U. But we know h^{-1} = g, so this tells us g(U) is in the Borel sigma-algebra if U is a Borel set. Hence, if we chose U = g^{-1}(F), then g(U) = F would have to be a Borel set if U is a Borel set. However, we know that F is not in L and so it is also not a Borel set. We can only conclude that g^{-1}(F) can not be a Borel set. However, g^{-1}(F) is in the Lebesgue sigma-algebra. Thus, there are Lebesgue measurable sets which are not Borel! Thus, the Borel sigma-algebra is strictly contained in the Lebesgue sigma-algebra!

We can use this example to construct another remarkable thing. Using all the notations from above, note the indicator function of C^C, the complement of C, is defined by

\[ I_{C^C}(x) = \begin{cases} 1, & x \in C^C \\ 0, & x \in C. \end{cases} \]

We see f = I_{C^C} is Borel measurable. Next, define a new mapping like this:

\[ \phi(x) = \begin{cases} 1, & x \in C^C \\ 2, & x \in C \setminus g^{-1}(F) \\ 3, & x \in g^{-1}(F). \end{cases} \]

Note that φ = f a.e. with respect to Borel measure. However, φ is not Borel measurable because φ^{-1}(3) is the set g^{-1}(F) which is not a Borel set.

We conclude that in this case, even though the two functions were equal a.e. with respect to Borel measure, only one was measurable! The reason this happens is that even though C has Borel measure 0, there are subsets of C which are not Borel sets! We discussed this example earlier in Chapter 9 as well.


Chapter 14

Lebesgue Stieltjes Measure

We now show how to construct a very important class of measures using right continuous monotonic

functions on ℜ. We know that if λ on ℜ is a measure, then for any points a < b

\[ \lim_{x \to a^+} \lambda\big( (x, b] \big) = \lambda\big( (a, b] \big). \]

In fact, if we define the function h by h(x) = λ([a, x]), then we see that h must be continuous from the

right at each a; i.e. limx→a+ h(x) = h(a+) = h(a). On the other hand, using a monotonic sequence of

sets that decrease, we see

\[ \lim_{x \to b^-} \lambda\big( (a, x] \big) = h(b^-). \]

Hence, it is not required that h must be continuous from the left at each b. Now if g is any monotone

increasing function on [a, b], g is of bounded variation and so it always has right and left hand limits at

any point. From the discussion above, we see we can define outer measures in a more general fashion. For

Lebesgue Outer Measure, we use the premeasure τ((a, b]) = b − a which corresponds to the continuous,

strictly monotone increasing choice g(x) = x. We can also define a premeasure τ_g by τ_g((a, b]) = g(b) − g(a). This will generate a measure in the usual way. Call this measure λ_g. Then

\[ \lambda_g\big( [a, b] \big) = \lambda_g\big( \{a\} \big) + \lambda_g\big( (a, b] \big). \]

But since

\[ \lambda_g\big( \{a\} \big) = \lim_{\varepsilon \to 0^+} \lambda_g\big( (a - \varepsilon, a + \varepsilon) \big) = g(a^+) - g(a^-), \]

we see for all x > a,

\[ \lambda_g\big( [a, x] \big) = g(a^+) - g(a^-) + g(x) - g(a). \]

Letting x → a^+, we find

\[ \lambda_g\big( \{a\} \big) = \lim_{x \to a^+} \Big( g(a^+) - g(a^-) + g(x) - g(a) \Big). \]

Simplifying, we have

\[ \lim_{x \to a^+} g(x) = g(a), \]

which tells us the monotonic non decreasing function g we use to construct a new measure must be right continuous.
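A right continuous g with a jump shows where the point mass comes from. In this sketch we assume g(x) = x for x < 1 and g(x) = x + 2 for x ≥ 1 (an illustrative choice of ours), and use τ_g((a, b]) = g(b) − g(a). The intervals (1 − ε, 1] shrink to the point 1 and their premeasure tends to the jump g(1) − g(1^-) = 2, while the intervals (1, 1 + ε] have premeasure tending to 0, which is right continuity.

```python
def g(x):
    """Non-decreasing, right continuous, with a jump of size 2 at x = 1 (assumed example)."""
    return x + 2.0 if x >= 1.0 else x


def tau_g(a, b):
    """Premeasure of the half-open interval (a, b]."""
    return g(b) - g(a)


# (1 - 10^-k, 1] shrinks to the single point 1: the values approach the jump.
shrinking_left = [tau_g(1.0 - 10.0 ** (-k), 1.0) for k in range(1, 8)]

# (1, 1 + 10^-k] shrinks to the empty set: the values approach 0.
shrinking_right = [tau_g(1.0, 1.0 + 10.0 ** (-k)) for k in range(1, 8)]

print(shrinking_left[-1], shrinking_right[-1])
```

If g were instead left continuous at 1, the second list would not tend to 0, and the premeasure could not come from a measure on these half-open intervals.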

14.1 Lebesgue-Stieltjes Outer Measure and Measure

From the discussion above, to construct new measures, we choose any g which is a non-decreasing function on ℜ and continuous from the right. Moreover, since g is monotone, the limits lim_{x→−∞} g(x) and lim_{x→∞} g(x) are well-defined. These last two limits could be −∞ and ∞ respectively. Then, define the mapping τ_g on U by

\begin{align*}
\tau_g(\emptyset) &= 0, \\
\tau_g\big( (a, b] \big) &= g(b) - g(a), \\
\tau_g\big( (-\infty, b] \big) &= g(b) - \lim_{x \to -\infty} g(x), \\
\tau_g\big( (a, \infty) \big) &= \lim_{x \to \infty} g(x) - g(a), \\
\tau_g\big( (-\infty, \infty) \big) &= \lim_{x \to \infty} g(x) - \lim_{x \to -\infty} g(x).
\end{align*}

This defines τ_g on the collection of sets U consisting of the empty set, intervals of the form (a, b] for finite numbers a and b and unbounded intervals of the form (−∞, b] and (a, ∞). Let A be the algebra generated by finite unions of sets from U. Note A contains ℜ.

Let’s extend the mapping τ_g to be additive on A. If E_1, E_2, . . . , E_n is a finite collection of disjoint sets in A, we extend the definition of τ_g to finite disjoint unions as follows:

\[ \tau_g\Big( \bigcup_{i=1}^n E_i \Big) = \sum_{i=1}^n \tau_g(E_i). \tag{14.1} \]

Lemma 14.1.1 (Extending τ_g To Additive Is Well-Defined).
The extension of τ_g from U to the algebra A is well-defined; hence, τ_g is additive on A.

Proof 14.1.1. For (a, b] ∈ A, write

\[ (a, b] = \bigcup_{i=1}^n (a_i, b_i], \]

for any positive integer n with a_1 = a, b_n = b and the in between points satisfy a_{i+1} = b_i for all i. Of course, there are many such decompositions of (a, b] we could choose. Also, these are the only decompositions we can have. If we use the unbounded sets, we can not recapture (a, b] using a finite number of unions! Then, using Equation 14.1, we have

\[ \tau_g\big( (a, b] \big) = \sum_{i=1}^n \tau_g\big( (a_i, b_i] \big) = \sum_{i=1}^n \big( g(b_i) - g(a_i) \big). \]

But since a_{i+1} = b_i, this sum collapses to

\[ \tau_g\big( (a, b] \big) = g(b) - g(a). \]

This was the original definition of τ_g on the element (a, b] in U. We conclude the value of τ_g on elements of the form (a, b] is independent of the choice of decomposition of it into a finite union of sets from U.

For an unbounded interval of the form (a, ∞), any finite disjoint decomposition can have only one interval of the form (b, ∞) giving (a, ∞) = (a, b] ∪ (b, ∞), with the piece (a, b] written as any finite disjoint union (a, b] = ∪_{i=1}^n (a_i, b_i] as before. The same arguments as used above then show τ_g is well-defined on this type of element of U also. We handle the sets (−∞, b] in a similar fashion.

Next, if we look at any arbitrary $A$ in $\mathcal{A}$, then $A$ can be written as a finite union of members $A_1, \ldots, A_m$ of $\mathcal{U}$. Each of these elements $A_i$ can then be written using a finite disjoint decomposition into intervals $(a_{ij}, b_{ij}]$, $1 \le j \le p(i)$, as we have done above. Thus,

\[
A = \bigcup_{i=1}^{m} \bigcup_{j=1}^{p(i)} (a_{ij}, b_{ij}]
\]

where we abuse notation, for convenience, by noting it is possible that $a_{11} = -\infty$ and $b_{m\,p(m)} = \infty$. We simply interpret a set of the form $(a, \infty]$ as $(a, \infty)$. We then combine these intervals and relabel as necessary to write $A$ as a finite disjoint union

\[
A = \bigcup_{i=1}^{N} (a_i, b_i]
\]

with $b_i \le a_{i+1}$ and again it is possible that $a_1 = -\infty$ and $b_N = \infty$. We therefore know that

\[
\tau_g(A) = \sum_{i=1}^{N} \tau_g((a_i, b_i]).
\]

Now assume $A$ has been decomposed into another finite disjoint union, $A = \cup_{j=1}^{M} B_j$, each $B_j \in \mathcal{A}$. Let

\[
C_j = \{\, i \mid (a_i, b_i] \subseteq B_j \,\}.
\]

Note a given interval $(a_i, b_i]$ cannot be in two different sets $B_j$ and $B_k$ because they are assumed disjoint. Hence, we have

\[
B_j = \bigcup_{i \in C_j} (a_i, b_i]
\]

and

\[
\tau_g(B_j) = \sum_{i \in C_j} \tau_g((a_i, b_i]).
\]

Thus,

\begin{align*}
\sum_{j=1}^{M} \tau_g(B_j) &= \sum_{j=1}^{M} \sum_{i \in C_j} \tau_g((a_i, b_i]) \\
&= \sum_{i=1}^{N} \tau_g((a_i, b_i]).
\end{align*}

This shows that our extension for $\tau_g$ is independent of the choice of finite decomposition and so the extension of $\tau_g$ is a well-defined additive map on $\mathcal{A}$.

We can now apply Theorem 11.3.3 to conclude that since the covering family A is an algebra and τg is

additive on A , the σ - algebra,Mg, generated by τg contains A and the induced measure, µg, is regular.

Next, we want to know that µg(A) = τg(A) for all A in A. To do this, we will prove the extension τg is

actually a pseudo-measure. Thus, we will be able to invoke Theorem 11.3.4 to get the desired result.



Lemma 14.1.2 (Lebesgue-Stieljes Premeasure Is a Pseudo-Measure). The mapping $\tau_g$ is a pseudo-measure on $\mathcal{A}$.

Proof 14.1.2. We need to show that if $(T_n)$ is a sequence of disjoint sets from $\mathcal{A}$ whose union $\cup_n T_n$ is also in $\mathcal{A}$, then

\[
\tau_g\big( \cup_n T_n \big) = \sum_n \tau_g(T_n).
\]

First, notice that if there was an index n0 so that τg(Tn0) = ∞, then letting B = ∪n Tn \ Tn0 , we can

write ∪nTn as the finite disjoint union B ∪ Tn0 and hence

τg( ∪n Tn) = τg(B) + τg(Tn0) = ∞.

Since the right hand side sums to ∞ in this case also, we see there is equality for the two expressions.

Therefore, we can restrict our attention to the case where all the individual sets Tn have finite τg(Tn) values. This means no elements of the form (−∞, b] or (a,∞) can be part of any decomposition of the

sets Tn. Hence, we can assume each Tn can be written as a finite union of intervals of the form (a, b]. It

follows then that it suffices to prove the result for a single interval of the form (a, b].

Since τg is additive on finite unions, if C ⊆ D, we have

τg(D) = τg(C) + τg(D \ C) ≥ τg(C).

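Now assume we can write the interval $(a, b]$ as follows:

\[
(a, b] = \bigcup_{k=1}^{\infty} (a_k, b_k]
\]

with the sets $(a_k, b_k]$ disjoint. For any $n$, we have

\[
(a, b] = \bigcup_{k=1}^{n} (a_k, b_k] \;\cup\; \bigcup_{k=n+1}^{\infty} (a_k, b_k].
\]

Therefore

\[
\tau_g\big( (a, b] \big) = \tau_g\Big( \bigcup_{k=1}^{n} (a_k, b_k] \Big) + \tau_g\Big( \bigcup_{k=n+1}^{\infty} (a_k, b_k] \Big).
\]

The finite additivity on disjoint intervals then gives us

\[
\tau_g\Big( \bigcup_{k=1}^{n} (a_k, b_k] \Big) = \sum_{k=1}^{n} \tau_g\big( (a_k, b_k] \big) = g(b_1) - g(a_1) + g(b_2) - g(a_2) + \ldots + g(b_n) - g(a_n).
\]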


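Relabeling the first $n$ intervals so that they are ordered from left to right, we have $b_k \le a_{k+1}$. We know $g$ is non decreasing, thus $g(b_1) - g(a_2) \le 0$, $g(b_2) - g(a_3) \le 0$, and so forth until we reach $g(b_{n-1}) - g(a_n) \le 0$. Dropping these terms, we find

\[
\tau_g\Big( \bigcup_{k=1}^{n} (a_k, b_k] \Big) \le g(b_n) - g(a_1) \le g(b) - g(a).
\]

Thus, these partial sums are bounded above and so the series of non negative terms $\sum_k \tau_g((a_k, b_k])$ converges. This tells us that

\[
\tau_g\Big( \bigcup_{k=1}^{\infty} (a_k, b_k] \Big) \le \tau_g\big( (a, b] \big).
\]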

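To obtain the reverse inequality, let $\epsilon > 0$ be given. Then, since the series above converges, there must be a positive integer $N$ so that if $n \ge N$,

\[
\sum_{k=n+1}^{\infty} \tau_g\big( (a_k, b_k] \big) < \epsilon.
\]

We conclude that, for any $K > n$,

\begin{align*}
\tau_g\big( (a, b] \big) &= \sum_{k=1}^{n} \tau_g\big( (a_k, b_k] \big) + \tau_g\Big( \bigcup_{k=n+1}^{\infty} (a_k, b_k] \Big) \\
&\ge \sum_{k=1}^{n} \tau_g\big( (a_k, b_k] \big) + \tau_g\Big( \bigcup_{k=n+1}^{K} (a_k, b_k] \Big) \\
&= \sum_{k=1}^{n} \tau_g\big( (a_k, b_k] \big) + \sum_{k=n+1}^{K} \tau_g\big( (a_k, b_k] \big).
\end{align*}

We know that

\[
\lim_{K} \sum_{k=n+1}^{K} \tau_g\big( (a_k, b_k] \big) = 0.
\]

Thus, letting $K \to \infty$, we find for all $n > N$, that

\[
\tau_g\big( (a, b] \big) \ge \sum_{k=1}^{n} \tau_g\big( (a_k, b_k] \big).
\]

However, the sequence of partial sums above converges. We have then the inequality

\[
\tau_g\big( (a, b] \big) \ge \sum_{k=1}^{\infty} \tau_g\big( (a_k, b_k] \big).
\]

Combining the two inequalities, we have that our extension $\tau_g$ is a pseudo-measure.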


Comment 14.1.1. It is worthwhile to summarize what we have accomplished at this point. We know now that the premeasure τg, defined by the non decreasing and right continuous map g on the algebra of sets A generated by the collection U consisting of the empty set, finite intervals like (a, b] and unbounded intervals of the form (−∞, b] and (a,∞), when defined to be additive on A generates an interesting outer measure µ∗g. We have also proven that the extension τg becomes a pseudo-measure on A. Thus,

(i): The sets A inA are in the σ - algebra of sets that satisfy the Caratheodory condition using µ∗g which

we denote by Mg. We denote the resulting measure by µg. This is because τg is additive on the

algebra A by Theorem 11.3.3.

(ii): We know µg is regular by Theorem 11.3.3 and complete by Theorem 11.1.4.

(iii): We know that µg(A) = τg(A) for all A in A since τg is a pseudo-measure by Theorem 11.3.4.

(iv): Since any open set can be written as a countable disjoint union of open intervals, this means any

open set is inMg becauseMg contains open intervals as they are in A and the σ - algebraMg is

closed under countable disjoint unions. This also tells us that the Borel σ - algebra is contained in

Mg.

(v): Since open sets are µ∗g measurable, by Theorem 11.2.2, it follows that µ∗g is a metric outer measure.

Comment 14.1.2. The measures µg induced by the outer measures µ∗g are called Lebesgue - Stieljes

measures . Since open sets are measurable here, these measures are also called Borel measures .

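Comment 14.1.3. So for a given non decreasing right continuous $g$, we can construct a Lebesgue-Stieljes measure satisfying

\[
\mu_g\big( (a, b] \big) = g(b) - g(a).
\]

So what about the open interval $(a, b)$? We know that

\[
(a, b) = \bigcup_n \Big( a, b - \frac{1}{n} \Big].
\]

Then

\begin{align*}
\mu_g\big( (a, b) \big) &= \lim_n g\Big( b - \frac{1}{n} \Big) - g(a) \\
&= g(b^-) - g(a).
\end{align*}

What about the singleton $\{b\}$? We know

\[
\{b\} = \bigcap_n \Big( b - \frac{1}{n}, b \Big]
\]

and so

\[
\mu_g\big( \{b\} \big) = \lim_n \Big( g(b) - g\Big( b - \frac{1}{n} \Big) \Big) = g(b) - g(b^-).
\]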

Note this tells us that the Lebesgue - Stieljes measure of a singleton need not be 0. However, at any point

b where g is continuous, this measure will be zero. Since our g can have at most a countable number of

discontinuities, we see there are only a countable number of singleton sets whose measure is non - zero.
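The three formulas above are easy to try out on a function with a single jump. The sketch below is a minimal illustration, assuming a hypothetical g(x) = x with a unit jump added at x = 1; the left limit g(b−) is written out by hand for this particular g rather than computed as a limit, and all function names are ours.

```python
from fractions import Fraction

def g(x):
    """Hypothetical g: the identity plus a unit jump at x = 1.
    It is nondecreasing and right continuous."""
    return x + (1 if x >= 1 else 0)

def g_left(x):
    """Left limit g(x-) of the g above, written out by hand."""
    return x + (1 if x > 1 else 0)

def mu_half_open(a, b):
    """mu_g((a, b]) = g(b) - g(a)."""
    return g(b) - g(a)

def mu_open(a, b):
    """mu_g((a, b)) = g(b-) - g(a)."""
    return g_left(b) - g(a)

def mu_point(b):
    """mu_g({b}) = g(b) - g(b-)."""
    return g(b) - g_left(b)

half = Fraction(1, 2)
print(mu_half_open(half, 1))  # picks up the jump at 1
print(mu_open(half, 1))       # excludes the endpoint 1, so no jump
print(mu_point(1))            # the size of the jump
print(mu_point(half))         # g is continuous at 1/2
```

Note that the open interval and the singleton at its right endpoint together account for the half-open interval, as additivity requires.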

We can also prove approximation lemmas for Lebesgue-Stieljes measures. A typical one is given below.

Theorem 14.1.3 (Approximating Sets With A Lebesgue-Stieljes Outer Measure). Let $X$ be a metric space, $\mu_g^*$ be a Lebesgue-Stieljes outer measure, $\mathcal{M}_g$ the induced sigma-algebra of measurable sets and $\mu_g$ the induced measure. Then given any $E$ contained in $X$, there is a $G_\delta$ set $G$ (i.e. a countable intersection of open sets) and an $F_\sigma$ set $F$ (i.e. a countable union of closed sets) so that $F \subseteq E \subseteq G$ and $\mu_g(F) = \mu_g^*(E) = \mu_g(G)$.

Proof 14.1.3. If $\mu_g^*(E) = \infty$, we can simply choose $G = X$ and be done. Hence, we can assume $\mu_g^*(E) < \infty$. We know $\mu_g$ is regular and complete and so there is a set $T$ with $E \subseteq T$ and $\mu_g^*(E) = \mu_g(T)$, where $E^m = \cup_n E_n^m$ and $T = \cap_m E^m$. Here, for each $m$, the family of sets $(E_n^m)$ is in the algebra $\mathcal{A}$ used to construct the outer measure and satisfies $E \subseteq \cup_n E_n^m$ and

\[
\sum_n \tau_g(E_n^m) < \mu_g^*(E) + \frac{1}{m}.
\]

It is straightforward to see we can rewrite each $E_n^m$ as a union of sets of the form $(a_n^m, b_n^m]$ and thus, after relabeling, we have $E^m = \cup_n (a_n^m, b_n^m]$. For any $r > 0$, we see $E^m \subseteq \cup_n (a_n^m, b_n^m + r)$, a countable union of open sets of finite $\mu_g$ measure. Hence, applying Theorem 12.3.3, for any $j$ there is an open set $G_j$ so that $\mu_g(G_j \setminus T) < \frac{1}{j}$. Hence, if $G = \cap_j G_j$, a $G_\delta$ set, we have $T \subseteq G$ and

\begin{align*}
\mu_g\big( \cap_j G_j \setminus T \big) &\le \mu_g\Big( \bigcap_{j=1}^{N} G_j \setminus T \Big) = \mu_g\Big( \bigcap_{j=1}^{N} (G_j \setminus T) \Big) \\
&\le \mu_g(G_N \setminus T) < \frac{1}{N}.
\end{align*}

As $N \to \infty$, we see $\mu_g(G \setminus T) = 0$ and $\mu_g^*(E) = \mu_g(T) = \mu_g(G)$.


Now if $\mu_g^*(E)$ is finite, we have $\mu_g^*(E) = \mu_g(T)$ where $T = \cap_m \cup_n E_n^m$ as before. Thus, $T$ is a Borel set for which $\mu_g(T) < \infty$. We can therefore apply Theorem theorem:measapproxtwo, since $\mu_g$ is a Borel measure and $\Re$ is a metric space, to obtain a closed set $F_n \subseteq T$ so that $\mu_g(T \setminus F_n) < \frac{1}{n}$ for each positive integer $n$. Let $F = \cup_n F_n$, an $F_\sigma$ set. Then, for all $n$, we have

\begin{align*}
\mu_g(T \setminus F) &= \mu_g\big( \cap_n (T \setminus F_n) \big) \\
&\le \mu_g(T \setminus F_n) < \frac{1}{n}.
\end{align*}

This immediately implies that $\mu_g(T \setminus F) = 0$. Thus, in this case, $\mu_g^*(E) = \mu_g(T) = \mu_g(F)$.

In the case that $\mu_g^*(E)$ is infinite, we let $E_j = E \cap (j, j+1]$ for all integers $j$. Then $E = \cup_j E_j$ and the sets $E_j$ are disjoint. Hence, there are Borel sets $T_j$ with bounded $\mu_g^*$ values satisfying $\mu_g^*(E_j) = \mu_g(T_j)$ where $T_j = \cap_m \cup_n E_n^{j,m}$ and the component sets $E_n^{j,m}$ are defined as before. The argument in the first part now applies. There are $F_\sigma$ sets $F_j$ so that $\mu_g^*(E_j) = \mu_g(T_j) = \mu_g(F_j)$. Let $F = \cup_j F_j$. Then

\begin{align*}
\mu_g(E \setminus F) &= \mu_g\big( \cup_j E_j \setminus F \big) = \mu_g\big( \cup_j (E_j \setminus F) \big) \\
&\le \mu_g\big( \cup_j (E_j \setminus F_j) \big) = \sum_j \mu_g(E_j \setminus F_j) = 0.
\end{align*}

This proves the result.

14.2 Properties Of Lebesgue-Stieljes Measures

It is clear Lebesgue measure on $\Re$, as developed in Chapter 12, should coincide with the Lebesgue-Stieljes measure generated by the function $g(x) = x$. In Theorem 14.2.1 below, we see that since Lebesgue measure satisfies Conditions (1), (2) and (3), the function $g$ constructed in the proof is exactly this function, $g(x) = x$. Hence, the Lebesgue-Stieljes measure for $g(x) = x$ is just Lebesgue measure.

It is also clear that if $g$ is of bounded variation and right continuous, $g$ can be written as a difference of two non decreasing functions $u$ and $v$ which are right continuous. The function $u$ determines a Lebesgue-Stieljes measure $\mu_u$; the function $v$, the Lebesgue-Stieljes measure $\mu_v$; and finally $g$ defines the charge $\mu_g = \mu_u - \mu_v$. Then we see

\[
\int f \, d\mu_g = \int f \, d\mu_u - \int f \, d\mu_v.
\]

We now summarize our discussions and give a converse result.


Theorem 14.2.1 (Characterizations of Lebesgue-Stieljes Measures). Let $g$ be non decreasing and right continuous on $\Re$. Let $\mu_g^*$ be the outer measure associated with $g$ as discussed above. Let $\mathcal{M}_g$ be the associated collection of $\mu_g^*$ measurable sets and $\mu_g$ be the generated measure. Then,

1. $\mu_g^*$ is a metric outer measure and so all Borel sets are $\mu_g^*$ measurable.

2. If $A$ is a bounded Borel set, then $\mu_g(A)$ is finite.

3. Each set $A \subseteq \Re$ has a measurable cover $U$ which is of type $G_\delta$; i.e. $U$ is a countable intersection of open sets. This means $A \subseteq U$ and $\mu_g^*(A) = \mu_g(U)$.

4. For every half open interval $(a, b]$, $\mu_g((a, b]) = g(b) - g(a)$.

Conversely, suppose $\mu^*$ is an outer measure on $\Re$ with resulting measure space $(\Re, \mathcal{M}, \mu)$. If conditions (1), (2) and (3) above are satisfied by $\mu^*$ and $\mu$, then there is a non decreasing right continuous function $g$ on $\Re$ so that $\mu_g^*(A) = \mu^*(A)$ for all $A \subseteq \Re$. In particular, $\mu_g(A) = \mu(A)$ for all $A \in \mathcal{M}$.

Proof 14.2.1. The first half of the theorem has been proved in our extensive previous comments and using Theorem 14.1.3. It remains to prove the converse. Define $g$ on $\Re$ by

\[
g(x) =
\begin{cases}
\mu((0, x]), & \text{if } x > 0, \\
0, & \text{if } x = 0, \\
-\mu((x, 0]), & \text{if } x < 0.
\end{cases}
\]

It is clear that $g$ is non decreasing on $\Re$. Is $g$ right continuous? Let $(h_n)$ be any sequence of positive numbers decreasing to $0$. Assume $x > 0$. Then, we know

\[
(0, x] = \bigcap_{n=1}^{\infty} (0, x + h_n].
\]

By Condition (2) on the measure $\mu$, we see $\mu((0, x + h_1])$ is finite. Thus, by Lemma 9.1.2, we have

\[
\mu((0, x]) = \lim_{n \to \infty} \mu((0, x + h_n])
\]

or, using our definition of $g$, $g(x) = \lim_n g(x + h_n)$. We conclude $g$ is right continuous for positive $x$. The arguments for $x < 0$ and $x = 0$ are similar.
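To make the construction of g from µ concrete, here is a small Python sketch. It assumes a hypothetical measure, namely length plus a unit point mass at 1/2, given only through its values on half-open intervals; the names mu, g and pairs are ours. The check is that g(b) − g(a) reproduces µ((a, b]) whether the interval lies to the right of 0, to the left of 0, or straddles it.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def mu(a, b):
    """Hypothetical measure of the half-open interval (a, b]:
    its length plus a unit point mass sitting at 1/2."""
    if b <= a:
        return 0
    return (b - a) + (1 if a < HALF <= b else 0)

def g(x):
    """The function built in the proof:
    mu((0, x]) for x > 0, 0 at x = 0, and -mu((x, 0]) for x < 0."""
    if x > 0:
        return mu(0, x)
    if x < 0:
        return -mu(x, 0)
    return 0

pairs = [(-1, 1), (Fraction(1, 4), Fraction(3, 4)), (HALF, 1), (-2, -1)]
for a, b in pairs:
    # g(b) - g(a) should agree with mu((a, b]) on every half-open interval.
    print(a, b, g(b) - g(a), mu(a, b))
```

The point mass at 1/2 shows up as a jump of g at 1/2, and g is right continuous there because the interval (0, 1/2] already contains the point.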

Next, we show $\mu_g^* = \mu^*$. By definition of $g$, $\mu_g((a, b]) = \mu((a, b])$ for all half-open intervals $(a, b]$. Now both $\mu$ and $\mu_g$ are countably additive. Since every open interval is a countable disjoint union of half-open intervals, it follows that $\mu_g(I) = \mu(I)$ for any open interval $I$. Since any open set $G$ is a countable disjoint union of open intervals, we then see $\mu_g(G) = \mu(G)$ for all open sets $G$. Further, if $H$ is a countable intersection of open sets, we can write $H = \cap_n G_n$ where each $G_n$ is open. Let

\begin{align*}
H_1 &= G_1 \\
H_2 &= G_1 \cap G_2 \\
&\;\;\vdots \\
H_n &= \cap_{j=1}^{n} G_j
\end{align*}

Then $(H_n)$ is a decreasing sequence of open sets because finite intersections of open sets are still open. If we assume $H$ is bounded, then using Lemma 9.1.2 again, we have $\mu(H) = \lim_n \mu(H_n)$. But we already have shown that $\mu_g(H_n) = \mu(H_n)$ since $H_n$ is an open set. We conclude therefore that $\mu_g(H) = \lim_n \mu_g(H_n) = \lim_n \mu(H_n) = \mu(H)$. Thus, $\mu$ and $\mu_g$ match on bounded sets of type $G_\delta$.

Now let $A$ be any bounded subset of $\Re$. Since $\mu$ satisfies Condition (3), there is a $G_\delta$ set $H_1$ so that $\mu^*(A) = \mu(H_1)$ with $A \subseteq H_1$. Since $g$ is non decreasing and right continuous, we know there is also a $G_\delta$ set $H_2$ so that $A \subseteq H_2$ and $\mu_g^*(A) = \mu_g(H_2)$. Let $H = H_1 \cap H_2$. Then $A \subseteq H$. Since Borel sets are in $\mathcal{M}_g$ and $\mathcal{M}$, we also know $\mu^*(H) = \mu(H)$ and $\mu_g^*(H) = \mu_g(H)$. It then follows that

\begin{align*}
\mu_g^*(H) &\ge \mu_g^*(A) = \mu_g(H_2) \ge \mu_g(H) = \mu_g^*(H), \\
\mu^*(H) &\ge \mu^*(A) = \mu(H_1) \ge \mu(H) = \mu^*(H),
\end{align*}

so we see $\mu_g^*(A) = \mu_g(H)$ and $\mu^*(A) = \mu(H)$. But $H$ is of type $G_\delta$ and so the values $\mu(H)$ and $\mu_g(H)$ must match. We conclude

\[
\mu_g^*(A) = \mu(H) = \mu^*(A)
\]

for all bounded subsets $A$. To handle unbounded sets $A$, note $A_n = A \cap [-n, n]$ is bounded and $A = \cup_n A_n$. It is clear $A_n$ increases monotonically to $A$. For each $A_n$, there is a $G_\delta$ set $H_n$ so that

\[
\mu_g^*(A_n) = \mu_g(H_n) = \mu(H_n) = \mu^*(A_n).
\]

Claim 1: If $\nu^*$ is a regular outer measure, then for any sequence of sets $(A_n)$,

\[
\nu^*(\liminf A_n) \le \liminf \nu^*(A_n).
\]

Proof. Since $\nu^*$ is regular, there is a measurable set $F$ so that $\nu^*(\liminf A_n) = \nu(F)$ with $\liminf A_n \subseteq F$. Further, there are measurable sets $F_n$ with $A_n \subseteq F_n$ and $\nu^*(A_n) = \nu(F_n)$. Hence,

\[
\nu(F) = \nu^*(\liminf A_n) \le \nu^*(\liminf F_n).
\]

Recall

\[
\liminf F_n = \bigcup_m \bigcap_{n \ge m} F_n.
\]

Hence, we see that for all $N$, we have $\nu\big( \cup_{m=1}^{N} \cap_{n \ge m} F_n \big) \le \sup_m \inf_{n \ge m} \nu(F_n)$. This implies

\[
\lim_N \nu\Big( \bigcup_{m=1}^{N} \bigcap_{n \ge m} F_n \Big) \le \liminf \nu(F_n).
\]

To finish we note that since the set convergence is monotonic upward to $\liminf F_n$, we have $\nu(\liminf F_n) \le \liminf \nu(F_n)$. Combining inequalities, we find

\begin{align*}
\nu(F) = \nu^*(\liminf A_n) &\le \nu^*(\liminf F_n) \\
&\le \liminf \nu(F_n) = \liminf \nu^*(A_n)
\end{align*}

which completes the proof.

Claim 2: If (An) is a monotonically increasing sequence with limit A and ν∗ is a regular outer

measure then ν∗(An)→ ν∗(A).

Proof . Since the sequence is monotonic increasing, it is clear the limit exists and is bounded above by

ν∗(A). On the other hand, we know from Claim 1 that ν∗(lim inf An) = ν∗(A) ≤ lim inf ν∗(An). But

since the limit exists, the lim inf matches the limit and we are done.

Using Claim 2 applied to $\mu_g^*$ and $\mu^*$ on the monotonically increasing sets $A_n$, we have $\mu_g^*(A_n) \to \mu_g^*(A)$ and $\mu^*(A_n) \to \mu^*(A)$. Since $\mu_g^*(A_n) = \mu^*(A_n)$, we have $\mu_g^*(A) = \mu^*(A)$. This shows that $\mu_g^* = \mu^*$ on $\Re$.

14.3 Homework

Exercise 14.3.1. Let $h$ be our Cantor function

\[
h(x) = (x + \Psi(x))/2.
\]

We know $\tau_h$ defines a Lebesgue-Stieljes measure and since the resulting sigma-algebra contains the Borel sets, it is a Borel-Stieljes measure. Determine if $\tau_h$ is absolutely continuous with respect to the Borel measure on $\Re$ (Borel measure is just Lebesgue measure restricted to $\mathcal{B}$).


Part VI

Abstract Measure Theory Two


Chapter 15

Modes Of Convergence

There are many ways a sequence of functions in a measure space can converge. In this chapter, we will

explore some of them and the relationships between them.

There are several types of convergence here:

(i): Convergence pointwise,

(ii): Convergence uniformly,

(iii): Convergence almost uniformly,

(iv): Convergence in measure,

(v): Convergence in Lp norm for 1 ≤ p <∞,

(vi): Convergence in L∞ norm.

We will explore each in turn. We have already discussed the p norm convergence in Chapter 10 so there

is no need to go over those ideas again. However, some of the other types of convergence in the list above

are probably not familiar to you. Pointwise and pointwise a.e. convergence have certainly been mentioned

before, but let’s make a formal definition so it is easy to compare it to other types of convergence later.


Definition 15.0.1 (Convergence Pointwise and Pointwise a.e.). Let $(X, \mathcal{S})$ be a measurable space. Let $(f_n)$ be a sequence of extended real valued measurable functions: i.e. $(f_n) \subseteq M(X, \mathcal{S})$. Let $f : X \to \Re$ be a function. Then, we say $f_n$ converges pointwise to $f$ on $X$ if $\lim_n f_n(x) = f(x)$ for all $x$ in $X$. Note that this type of convergence does not involve a measure although it does use the standard metric, $|\cdot|$, on $\Re$. We can write this as

\[
f_n \to f \ [ptws].
\]

If there is a measure $\mu$ on $\mathcal{S}$, we can also say the sequence converges almost everywhere if $\mu(\{x \mid f_n(x) \not\to f(x)\}) = 0$. We would write this as

\[
f_n \to f \ [ptws \ a.e.].
\]

Next, you have probably already seen uniform convergence in the context of advanced calculus. We

can define it nicely in a measure space also.

Definition 15.0.2 (Convergence Uniformly). Let $(X, \mathcal{S})$ be a measurable space. Let $(f_n)$ be a sequence of real valued measurable functions: i.e. $(f_n) \subseteq M(X, \mathcal{S})$. Let $f : X \to \Re$ be a function. Then, we say $f_n$ converges uniformly to $f$ on $X$ if for any $\epsilon > 0$, there is a positive integer $N$ (depending on the choice of $\epsilon$) so that if $n > N$, then $|f_n(x) - f(x)| < \epsilon$ for all $x$ in $X$. We can write this as

\[
f_n \to f \ [unif].
\]

However, if we are in a measure space, we can relax the idea of uniform convergence of the whole

space by taking advantage of the underlying measure.

Definition 15.0.3 (Almost Uniform Convergence). Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $(f_n) \subseteq M(X, \mathcal{S}, \mu)$ be a sequence of functions which are finite a.e. Let $f : X \to \Re$ be a function. We say $f_n$ converges almost uniformly to $f$ on $X$ if for any $\epsilon > 0$, there is a measurable set $E$ such that $\mu(E^C) < \epsilon$ and $(f_n)$ converges uniformly to $f$ on $E$. We write this as

\[
f_n \to f \ [a.u.].
\]

Finally, we can talk about a brand new idea: convergence using only measure itself.


Definition 15.0.4 (Convergence In Measure). Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $(f_n) \subseteq M(X, \mathcal{S}, \mu)$ be a sequence of functions which are finite a.e. Let $f : X \to \Re$ be a function. Let $E$ be a measurable set. We say $f_n$ converges in measure to $f$ on $E$ if for any pair $(\epsilon, \delta)$ of positive numbers, there exists a positive integer $N$ (depending on $\epsilon$ and $\delta$) so that if $n > N$, then

\[
\mu\big( \{ x \in E \mid |f_n(x) - f(x)| \ge \delta \} \big) < \epsilon.
\]

We write this as

\[
f_n \to f \ [meas \ on \ E].
\]

If $E$ is all of $X$, we would just write

\[
f_n \to f \ [meas].
\]
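To see how convergence in measure differs from pointwise convergence, it can help to look at a concrete sequence. The sketch below uses the standard "typewriter" sequence on [0, 1) with Lebesgue measure. This example and the names block, f and measure_where_large are our own illustrative assumptions and are not constructed in the text. For n = 2^k + j with 0 ≤ j < 2^k, the function f_n is the indicator of [j/2^k, (j+1)/2^k). The set where |f_n − 0| ≥ δ has measure 2^(−k) for any 0 < δ ≤ 1, which tends to 0, yet each x is hit exactly once in every block of indices, so f_n(x) does not converge.

```python
from fractions import Fraction

def block(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j), for n >= 1."""
    k = n.bit_length() - 1
    return k, n - 2**k

def f(n, x):
    """Indicator of the interval [j / 2**k, (j + 1) / 2**k) inside [0, 1)."""
    k, j = block(n)
    return 1 if Fraction(j, 2**k) <= x < Fraction(j + 1, 2**k) else 0

def measure_where_large(n, delta):
    """Lebesgue measure of {x in [0, 1) : |f_n(x) - 0| >= delta}, for 0 < delta <= 1.
    The set is exactly the interval on which f_n equals 1."""
    k, _ = block(n)
    return Fraction(1, 2**k)

x = Fraction(1, 3)
# Indices n < 64 at which f_n(x) = 1: one index in each of the six blocks.
hits = [n for n in range(1, 64) if f(n, x) == 1]
print(hits)
print([measure_where_large(2**k, Fraction(1, 2)) for k in range(6)])
```

So f_n → 0 in measure while f_n(x) takes both values 0 and 1 infinitely often at every x. The subsequence f_{2^k}, the indicator of [0, 2^(−k)), does converge to 0 at every x > 0, which is consistent with the subsequence results of the next section.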

15.1 Extracting Subsequences

In some cases, when a sequence of functions converges in one way, it is possible to prove that there is at

least one subsequence that converges in a different manner. We will now make this idea precise.

Definition 15.1.1 (Cauchy Sequences In Measure). Let $(X, \mathcal{S}, \mu)$ be a measure space and $(f_n)$ be a sequence of extended real valued measurable functions. We say $(f_n)$ is Cauchy in Measure if for all $\alpha > 0$ and $\epsilon > 0$, there is a positive integer $N$ so that

\[
\mu\big( |f_n(x) - f_m(x)| \ge \alpha \big) < \epsilon, \quad \forall \ n, m > N.
\]

We can prove a kind of completeness result next.

Theorem 15.1.1 (Cauchy In Measure Implies A Convergent Subsequence). Let $(X, \mathcal{S}, \mu)$ be a measure space and $(f_n)$ be a sequence of extended real valued measurable functions which is Cauchy in Measure. Then there is a subsequence $(f_n^1)$ and an extended real valued measurable function $f$ such that $f_n^1 \to f$ [a.e.], $f_n^1 \to f$ [a.u.] and $f_n^1 \to f$ [meas].

Proof 15.1.1. For each pair of indices $n$ and $m$, there is a measurable set $E_{nm}$ on which the difference $f_n - f_m$ is not defined. Hence, the set

\[
E = \bigcup_n \bigcup_m E_{nm}
\]

is measurable and on $E^C$, all differences are well defined. We do not know the sets $E_{nm}$ have measure 0 here, as the members of the sequence do not have to be summable or essentially bounded.


Now, let’s get started with the proof.

(Step 1): let $\alpha_1 = 1/2$ and $\epsilon_1 = 1/2$ also. Then, $(f_n)$ Cauchy in Measure implies

\[
\exists \, N_1 \ni n, m > N_1 \Rightarrow \mu\big( |f_n(x) - f_m(x)| \ge 1/2 \big) < 1/2.
\]

Let

\[
g_1 = f_{N_1 + 1}.
\]

(Step 2): let $\alpha_2 = 1/2^2$ and $\epsilon_2 = 1/2^2$ also. Then, $(f_n)$ Cauchy in Measure again implies there is an $N_2 > N_1$ so that

\[
n, m > N_2 \Rightarrow \mu\big( |f_n(x) - f_m(x)| \ge 1/4 \big) < 1/4.
\]

Let

\[
g_2 = f_{N_2 + 1}.
\]

It is then clear by our construction that

\[
\mu\big( |g_2(x) - g_1(x)| \ge 1/2 \big) < 1/2.
\]

(Step 3): let $\alpha_3 = 1/2^3$ and $\epsilon_3 = 1/2^3$ also. Then, $(f_n)$ Cauchy in Measure again implies there is an $N_3 > N_2$ so that

\[
n, m > N_3 \Rightarrow \mu\big( |f_n(x) - f_m(x)| \ge 1/8 \big) < 1/8.
\]

Let

\[
g_3 = f_{N_3 + 1}.
\]

It follows by construction that

\[
\mu\big( |g_3(x) - g_2(x)| \ge 1/4 \big) < 1/4.
\]

Continuing this process by induction, we find a subsequence $(g_n)$ of the original sequence $(f_n)$ so that for all $k \ge 1$,

\[
\mu\big( |g_{k+1}(x) - g_k(x)| \ge 1/2^k \big) < 1/2^k.
\]

Define the sets

\[
E_j = \big( |g_{j+1}(x) - g_j(x)| \ge 1/2^j \big)
\]

and

\[
F_k = \bigcup_{j=k}^{\infty} E_j.
\]


Note if $x \in F_k^C$,

\[
|g_{j+1}(x) - g_j(x)| < 1/2^j
\]

for any index $j \ge k$. Each set $F_k$ is then measurable and they form a decreasing sequence, since $F_{k+1} \subseteq F_k$. Let's get a bound on $\mu(F_k)$. First, if $A$ and $B$ are measurable sets, then

\[
\mu(A \cup B) = \mu(A \cap B^C) + \mu(A \cap B) + \mu(A^C \cap B).
\]

But adding in $\mu(A \cap B)$ simply makes the sum larger. We see

\[
\mu(A \cup B) \le \mu(A \cap B^C) + \mu(A \cap B) + \mu(A \cap B) + \mu(A^C \cap B) = \mu(A) + \mu(B).
\]

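This result then extends easily to finite unions. Thus, if $(A_n)$ is a sequence of measurable sets, then by the sub additive result above,

\[
\mu\Big( \bigcup_{i=1}^{n} A_i \Big) \le \sum_{i=1}^{n} \mu(A_i).
\]

Further, the sets $\cup_{i=1}^{n} A_i$ form an increasing sequence and we clearly have

\[
\mu\Big( \bigcup_{i=1}^{\infty} A_i \Big) = \lim_n \mu\Big( \bigcup_{i=1}^{n} A_i \Big) \le \sum_{i=1}^{\infty} \mu(A_i).
\]

We can apply this idea to the union $F_k = \cup_{j \ge k} E_j$ to obtain

\begin{align*}
\mu(F_k) &\le \sum_{j=k}^{\infty} \mu(E_j) \\
&< \sum_{j=k}^{\infty} 1/2^j = 1/2^{k-1}.
\end{align*}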

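Now, for any $i > j$, we have

\[
|g_i(x) - g_j(x)| \le \sum_{\ell=j}^{i-1} |g_{\ell+1}(x) - g_\ell(x)|.
\]

Choosing the indices $i$ and $j$ so that $i > j \ge k$, we then find if $x \notin F_k$, that for each $\ell \ge k$,

\[
|g_{\ell+1}(x) - g_\ell(x)| < 1/2^\ell.
\]

Hence, for these indices,

\begin{align*}
|g_i(x) - g_j(x)| &\le \sum_{\ell=j}^{i-1} |g_{\ell+1}(x) - g_\ell(x)| \\
&< \sum_{\ell=j}^{i-1} 1/2^\ell \le \sum_{\ell=j}^{\infty} 1/2^\ell = 1/2^{j-1}.
\end{align*}

We conclude that if $x \in F_k^C$ and $i > j \ge k$ we have

\[
|g_i(x) - g_j(x)| \le 1/2^{j-1}. \qquad (*)
\]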

Now let $F = \cap_k F_k$. Then $F$ is measurable and, since $\mu(F_1)$ is finite and the sets $F_k$ decrease, $\mu(F) = \lim_k \mu(F_k) = 0$. Let $x$ be in $F^C$. By De Morgan's Laws, $x \in \cup_k F_k^C$ which implies $x$ is in some $F_k^C$. Call this index $k^*$. Then given $\epsilon > 0$, choose $J \ge k^*$ so that $1/2^{J-1} < \epsilon$. Then, by Equation $*$, if $i > j \ge J \ge k^*$,

\[
|g_i(x) - g_j(x)| \le 1/2^{j-1} \le 1/2^{J-1} < \epsilon.
\]

Thus, the sequence $(g_k(x))$ is a Cauchy sequence of real numbers for each $x$ in $F^C$. Hence, $\lim_k g_k(x)$ exists for such $x$. Defining $f$ by

\[
f(x) =
\begin{cases}
\lim_k g_k(x), & x \in F^C \\
0, & x \in F,
\end{cases}
\]

we see $f$ is measurable and it is the pointwise limit a.e. of the subsequence $(g_k)$. This completes the proof of the first claim. To see that $(g_k)$ converges almost uniformly to $f$, look again at Equation $*$:

\[
|g_i(x) - g_j(x)| \le 1/2^{j-1}, \quad \forall \ i > j \ge k, \ \forall \ x \in F_k^C.
\]

Now let $i \to \infty$ and use the continuity of the absolute value function to obtain

\[
|f(x) - g_j(x)| \le 1/2^{j-1}, \quad \forall \ j \ge k, \ \forall \ x \in F_k^C. \qquad (**)
\]

Equation $**$ says that $(g_k)$ converges to $f$ uniformly on $F_k^C$. Further, recall $\mu(F_k) < 1/2^{k-1}$. Note given any $\delta > 0$, there is an integer $k^*$ so that $1/2^{k^*-1} < \delta$ and $(g_k)$ converges uniformly on $F_{k^*}^C$. We therefore conclude that $(g_k)$ converges almost uniformly to $f$ as well.

To show the last claim, given an arbitrary $\alpha > 0$ and $\epsilon > 0$, choose a positive integer $k^*$ so that

\[
\mu(F_{k^*}) < 1/2^{k^*-1} < \min(\alpha, \epsilon).
\]


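Then, since $1/2^{k^*-1} < \alpha$, we have

\[
\big( |f(x) - g_j(x)| \ge \alpha \big) \subseteq \big( |f(x) - g_j(x)| > 1/2^{k^*-1} \big).
\]

Further, by Equation $**$, for any $j \ge k^*$ we have $|f(x) - g_j(x)| \le 1/2^{j-1} \le 1/2^{k^*-1}$ on $F_{k^*}^C$, and so

\[
\big( |f(x) - g_j(x)| > 1/2^{k^*-1} \big) \subseteq F_{k^*}.
\]

Combining, we have for all $j \ge k^*$,

\[
\mu\big( |f(x) - g_j(x)| \ge \alpha \big) \le \mu(F_{k^*}) < 1/2^{k^*-1} < \epsilon.
\]

This shows that $(g_k)$ converges to $f$ in measure.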

The result above allows us to prove that Cauchy in Measure implies there is a function which the

Cauchy sequence actually converges to.

Theorem 15.1.2 (Cauchy In Measure Implies Completeness).Let (X,S, µ) be a measure space and (fn) be a sequence of extended real valued measurable functions

which is Cauchy in Measure. Then there is an extended real valued measurable function f such that

fn → f [meas] and the limit function f is determined uniquely a.e.

Proof 15.1.2. By Theorem 15.1.1, there is a subsequence $(f_n^1)$ and an extended real valued measurable function $f$ so that $f_n^1 \to f$ [meas]. Let $\alpha > 0$ be given. If $|f(x) - f_n(x)| \ge \alpha$, then given any $f_n^1$ in the subsequence, we have

\[
\alpha \le |f(x) - f_n(x)| \le |f(x) - f_n^1(x)| + |f_n(x) - f_n^1(x)|.
\]

Note, just as in the previous proof, there is a measurable set $E$ where all additions and subtractions of functions are well-defined. Now, let $\beta = |f(x) - f_n^1(x)|$ and $\gamma = |f_n(x) - f_n^1(x)|$. The inequality above thus says

\[
\alpha \le \beta + \gamma.
\]

Since $\beta$ and $\gamma$ are non negative, we can think about this inequality in a different way. If there was equality

\[
\beta^* + \gamma^* = \alpha
\]

with both $\beta^*$ and $\gamma^*$ not zero, then we could let $t = \beta^* / \alpha$ and we could say $\beta^* = t \alpha$ and $\gamma^* = (1 - t) \alpha$ as $\gamma^* = \alpha - \beta^*$. Now imagine the sum $\beta + \gamma$ being larger than $\alpha$. Then, $\beta$ and $\gamma$ would have to be bigger than or equal to the values $\beta^* = t \alpha$ and $\gamma^* = (1 - t) \alpha$ for some $t$ in $(0, 1)$. Similar arguments work for the cases of $\beta = 0$ and $\gamma = 0$ which will correspond to the cases of $t = 0$ and $t = 1$. Hence, we can say that if $|f(x) - f_n(x)| \ge \alpha$, then there is some $t \in [0, 1]$ so that

\begin{align*}
|f(x) - f_n^1(x)| &\ge t \alpha, \\
|f_n(x) - f_n^1(x)| &\ge (1 - t) \alpha.
\end{align*}

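The following reasoning is a bit involved, so bear with us. First, if $x$ is a value where $|f(x) - f_n(x)| \ge \alpha$, then for some $t$ in $[0, 1]$ we must have both $|f(x) - f_n^1(x)| \ge t \alpha$ (call this Condition I) and $|f_n(x) - f_n^1(x)| \ge (1 - t) \alpha$ (call this Condition II).

Case (i): if $0 \le t \le 1/2$, then $(1 - t) \alpha \ge \alpha / 2$, so any $x$ which satisfies Condition II also satisfies $|f_n(x) - f_n^1(x)| \ge \alpha / 2$. Hence, for $0 \le t \le 1/2$, we conclude

\begin{align*}
\{x \mid |f(x) - f_n^1(x)| \ge t \alpha\} \cap \{x \mid |f_n(x) - f_n^1(x)| \ge (1 - t) \alpha\}
&\subseteq \{x \mid |f_n(x) - f_n^1(x)| \ge (1 - t) \alpha\} \\
&\subseteq \{x \mid |f_n(x) - f_n^1(x)| \ge \alpha / 2\}.
\end{align*}

Case (ii): a similar argument shows that if $1/2 \le t \le 1$, then $t \alpha \ge \alpha / 2$, so any $x$ satisfying Condition I also satisfies $|f(x) - f_n^1(x)| \ge \alpha / 2$. Hence, for these $t$,

\begin{align*}
\{x \mid |f(x) - f_n^1(x)| \ge t \alpha\} \cap \{x \mid |f_n(x) - f_n^1(x)| \ge (1 - t) \alpha\}
&\subseteq \{x \mid |f(x) - f_n^1(x)| \ge t \alpha\} \\
&\subseteq \{x \mid |f(x) - f_n^1(x)| \ge \alpha / 2\}.
\end{align*}

Combining these results, we find

\begin{align*}
\bigcup_{0 \le t \le 1} \Big( \{x \mid |f(x) - f_n^1(x)| \ge t \alpha\} &\cap \{x \mid |f_n(x) - f_n^1(x)| \ge (1 - t) \alpha\} \Big) \\
&\subseteq \{x \mid |f_n(x) - f_n^1(x)| \ge \alpha / 2\} \cup \{x \mid |f(x) - f_n^1(x)| \ge \alpha / 2\}.
\end{align*}

Finally, from the triangle inequality,

\[
|f(x) - f_n(x)| \le |f(x) - f_n^1(x)| + |f_n^1(x) - f_n(x)|,
\]

and so, we have

\begin{align*}
\{x \mid |f(x) - f_n(x)| \ge \alpha\} &\subseteq \bigcup_{0 \le t \le 1} \Big( \{x \mid |f(x) - f_n^1(x)| \ge t \alpha\} \cap \{x \mid |f_n(x) - f_n^1(x)| \ge (1 - t) \alpha\} \Big) \\
&\subseteq \{x \mid |f_n(x) - f_n^1(x)| \ge \alpha / 2\} \cup \{x \mid |f(x) - f_n^1(x)| \ge \alpha / 2\}.
\end{align*}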


Next, pick an arbitrary $\epsilon > 0$. Since $f_n^1 \to f$ [meas], there is a positive integer $N_1$ so that

\[
\mu\big( |f(x) - f_n^1(x)| \ge \alpha / 2 \big) < \epsilon / 2, \quad \forall \ n_1 > N_1,
\]

where $n_1$ denotes the index of the function $f_n^1$. Further, since $(f_n)$ is Cauchy in measure, there is a positive integer $N_2$ so that

\[
\mu\big( |f_n(x) - f_n^1(x)| \ge \alpha / 2 \big) < \epsilon / 2, \quad \forall \ n, n_1 > N_2.
\]

So if $n_1$ is larger than $N = \max(N_1, N_2)$, we have

\[
\mu\big( |f(x) - f_n(x)| \ge \alpha \big) < \epsilon, \quad \forall \ n > N.
\]

This shows $f_n \to f$ [meas] as desired.

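To show the uniqueness a.e. of $f$, assume there is another function $g$ so that $f_n \to g$ [meas]. Then, by arguments similar to ones we have already used, we find

\[
\{x \mid |f(x) - g(x)| \ge \alpha\} \subseteq \{x \mid |f_n(x) - f(x)| \ge \alpha / 2\} \cup \{x \mid |f_n(x) - g(x)| \ge \alpha / 2\}.
\]

Then, mutatis mutandis, given $\epsilon > 0$ we obtain for $n$ sufficiently large

\[
\mu\big( \{x \mid |f(x) - g(x)| \ge \alpha\} \big) \le \mu\big( \{x \mid |f_n(x) - f(x)| \ge \alpha / 2\} \big) + \mu\big( \{x \mid |f_n(x) - g(x)| \ge \alpha / 2\} \big) < \epsilon.
\]

Since $\epsilon > 0$ is arbitrary, we see for any $\alpha > 0$,

\[
\mu\big( \{x \mid |f(x) - g(x)| \ge \alpha\} \big) = 0.
\]

However, we know

\[
\{x \mid |f(x) - g(x)| > 0\} = \bigcup_n \{x \mid |f(x) - g(x)| \ge 1/n\},
\]

which immediately tells us that

\[
\mu\big( \{x \mid |f(x) - g(x)| > 0\} \big) = 0.
\]

This says $f = g$ a.e. and we are done.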


Theorem 15.1.3 (p-Norm Convergence Implies Convergence in Measure). Assume $1 \le p < \infty$. Let $(f_n)$ be a sequence in $L^p(X, \mathcal{S}, \mu)$ and let $f \in L^p(X, \mathcal{S}, \mu)$ so that $f_n \to f$ [p-norm]. Then $f_n \to f$ [meas]; in particular, $(f_n)$ is Cauchy in Measure.

Proof 15.1.3. Let $\alpha > 0$ be given and let

\[
E_n(\alpha) = \{x \mid |f_n(x) - f(x)| \ge \alpha\}.
\]

Then, given $\epsilon > 0$, there is a positive integer $N$ so that

\[
\int |f_n - f|^p \, d\mu < \alpha^p \epsilon, \quad \forall \ n > N.
\]

Thus,

\[
\int_{E_n(\alpha)} |f_n - f|^p \, d\mu \le \int |f_n - f|^p \, d\mu < \alpha^p \epsilon, \quad \forall \ n > N.
\]

But on $E_n(\alpha)$, the integrand in the first term is bigger than or equal to $\alpha^p$. We obtain

\[
\alpha^p \, \mu(E_n(\alpha)) < \alpha^p \epsilon, \quad \forall \ n > N.
\]

Canceling the $\alpha^p$ term, we have $\mu(E_n(\alpha)) < \epsilon$, for all $n > N$. This implies $f_n \to f$ [meas].
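The key step in this proof is the inequality α^p µ(E_n(α)) ≤ ∫ |f_n − f|^p dµ. It can be checked directly on a small example. The sketch below assumes a hypothetical four point measure space described by a list of weights, and a list diff holding the values of f_n − f at those points; the helper names are ours and the numbers are made up for illustration only.

```python
def p_norm_to_the_p(values, weights, p):
    """Integral of |v|**p against the discrete measure given by weights."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights))

def measure_at_least(values, weights, alpha):
    """Measure of the set of points where |v| >= alpha."""
    return sum(w for v, w in zip(values, weights) if abs(v) >= alpha)

# Hypothetical data: weights of the four points and the values of f_n - f there.
weights = [0.5, 0.25, 0.125, 0.125]
diff = [0.1, -0.4, 2.0, 0.0]
p, alpha = 2, 0.3

lhs = alpha ** p * measure_at_least(diff, weights, alpha)
rhs = p_norm_to_the_p(diff, weights, p)
print(lhs, rhs, lhs <= rhs)
```

Dividing by α^p turns this into the bound µ(E_n(α)) ≤ ‖f_n − f‖_p^p / α^p, so a small p-norm forces the set where the difference is large to have small measure.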

Comment 15.1.1. Let's assess what we have learned so far. We have shown

(i):

\[
f_n \to f \ [p\text{-norm}] \ \Rightarrow \ f_n \to f \ [meas]
\]

by Theorem 15.1.3.

(ii): It is a straightforward exercise to show

\[
f_n \to f \ [meas] \ \Rightarrow \ (f_n) \text{ Cauchy In Measure}.
\]

Then,

\[
(f_n) \text{ Cauchy In Measure} \ \Rightarrow \ \exists \ (f_n^1) \subseteq (f_n) \ \ni \ f_n^1 \to f \ [a.e.]
\]

by Theorem 15.1.1. Note, we proved the existence of such a subsequence already in the proof of the completeness of $L^p$ as discussed in Theorem 10.1.10.

(iii): Finally, we can also apply Theorem 15.1.1 to infer

\[
f_n \to f \ [meas] \ \Rightarrow \ \exists \ (f_n^1) \subseteq (f_n) \ \ni \ f_n^1 \to f \ [a.u.]
\]

Theorem 15.1.4 (Almost Uniform Convergence Implies Convergence In Measure). Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $(f_n)$ be a sequence of real valued measurable functions: i.e. $(f_n) \subseteq M(X, \mathcal{S})$. Let $f : X \to \Re$ be measurable. Then

\[
f_n \to f \ [a.u.] \ \Rightarrow \ f_n \to f \ [meas].
\]

Proof 15.1.4. If $f_n$ converges to $f$ a.u., given arbitrary $\epsilon > 0$, there is a measurable set $E_\epsilon$ so that $\mu(E_\epsilon) < \epsilon$ and $f_n$ converges uniformly on $E_\epsilon^C$. Now let $\alpha > 0$ be chosen. Then, there is a positive integer $N_\alpha$ so that

\[
|f_n(x) - f(x)| < \alpha, \quad \forall \ n > N_\alpha, \ \forall \ x \in E_\epsilon^C.
\]

Hence, if $n > N_\alpha$ and $x$ satisfies $|f_n(x) - f(x)| \ge \alpha$, we must have that $x \in E_\epsilon$. We conclude

\[
\big( |f_n(x) - f(x)| \ge \alpha \big) \subseteq E_\epsilon, \quad \forall \ n > N_\alpha.
\]

This implies immediately that

\[
\mu\big( |f_n(x) - f(x)| \ge \alpha \big) \le \mu(E_\epsilon) < \epsilon, \quad \forall \ n > N_\alpha.
\]

This proves $f_n \to f$ [meas].

Comment 15.1.2. We have now shown

\[
f_n \to f \ [p\text{-norm}] \ \Rightarrow \ f_n \to f \ [meas]
\]

by Theorem 15.1.3. This then implies by Theorem 15.1.1

\[
\exists \ (f_n^1) \subseteq (f_n) \ \ni \ f_n^1 \to f \ [a.u.]
\]

15.2 Egoroff’s Theorem

A famous theorem tells us how pointwise a.e. convergence can be phrased “almost” like uniform conver-

gence. This is Egoroff’s Theorem.

Theorem 15.2.1 (Egoroff's Theorem). Let $(X, \mathcal{S}, \mu)$ be a measure space with $\mu(X) < \infty$. Let $f$ be an extended real valued function which is measurable. Also, let $(f_n)$ be a sequence of functions in $M(X, \mathcal{S})$ such that $f_n \to f$ [a.e.]. Then, $f_n \to f$ [a.u.] and $f_n \to f$ [meas].


Proof 15.2.1. From previous arguments, the way we handle convergence a.e. is now quite familiar. Also, we know how we deal with the measurable set on which additions of the functions $f_n$ are not well defined. Hence, we may assume without loss of generality that the convergence here is on all of $X$ and that addition is defined on all of $X$. With that said, for positive integers $n$ and $m$, let

\[
E_{nm} = \bigcup_{k=n}^{\infty} \big( |f_k(x) - f(x)| \ge 1/m \big).
\]

Note that each $E_{nm}$ is measurable and $E_{n+1, m} \subseteq E_{nm}$ so that this is a decreasing sequence of sets in the index $n$. Given $x$ in $X$, we have $f_n(x) \to f(x)$. Hence, for $\epsilon = 1/m$, there is a positive integer $N(x, \epsilon)$ so that

\[
|f_n(x) - f(x)| < \epsilon = 1/m, \quad \forall \ n > N(x, \epsilon).
\]

Thus,

\[
x \notin \big( |f_n(y) - f(y)| \ge 1/m \big), \quad \forall \ n > N(x, \epsilon). \qquad (*)
\]

Now consider $F_m = \bigcap_{n=1}^{\infty} E_{nm}$. If $x \in F_m$, then $x$ is in $E_{nm}$ for all $n$. In particular, letting $n^* = N(x, \epsilon) + 1$, we have $x \in E_{n^* m}$. Looking at how we defined $E_{n^* m}$, we see this implies that there is a positive integer $k' \ge n^*$ so that $|f_{k'}(x) - f(x)| \ge 1/m$. However, this contradicts Equation $*$. This contradiction means our original assumption that $F_m$ was non empty is wrong. Hence, $F_m = \emptyset$. Now, since $\mu(X) < \infty$, $\mu(E_{1m})$ is finite also. Hence, by Lemma 9.1.2,

\[
0 = \mu(F_m) = \lim_n \mu(E_{nm}).
\]

This implies that given $\delta > 0$, there is a positive integer $N_m$ so that

\[
\mu(E_{nm}) < \delta / 2^m, \quad \forall \ n > N_m,
\]

since $\lim_n \mu(E_{nm}) = 0$. For each integer $m$, choose a positive integer $n_m > N_m$. We can arrange for these integers to be increasing; i.e., $n_m < n_{m+1}$. Then,

\[
\mu(E_{n_m m}) < \delta / 2^m
\]

and letting

\[
E_\delta = \bigcup_{m=1}^{\infty} E_{n_m m},
\]


we have

µ(Eδ) ≤∞∑m=1

δ/2m < δ.

Finally, note

ECδ =

( ∞⋃m=1

Enm m

)C=

∞⋂m=1

ECnm m.

Next, note

ECnm m =

( ∞⋃k=nm

(|fk(x) − f(x)| ≥ 1/m

))C

=∞⋂

k=nm

(|fk(x) − f(x)| ≥ 1/m

)C=

∞⋂k=nm

(|fk(x) − f(x)| < 1/m

).

Thus, since x ∈ ECδ means x is inECnm m for allm, the above says |fk(x)− f(x)| < 1/m for all k > nm.

Therefore, given ε > 0, pick a positive integer M so that 1/M < ε. Then, for all x in ECδ , we have

|fk(x) − f(x)| < 1/M < ε, ∀ k ≥ nM .

This says fn converges uniformly to f on ECδ with µ(Eδ) < δ. Hence, we have shown fn → f [a.u.]

Finally, if fn → f [a.u.], by Theorem 15.1.4, we have fn → f [meas] also.
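To make the statement concrete, here is a small numerical sketch. It is only an illustration under stated assumptions: we take $X = [0,1]$ with Lebesgue measure and the hypothetical example $f_n(x) = x^n$, which converges to $0$ everywhere except at $x = 1$. Discarding the set $E_\delta = (1-\delta, 1]$ of measure $\delta$, the convergence on what remains is uniform. The helper name `sup_error` and the grid approximation of the supremum are our own choices.

```python
def sup_error(n, delta, grid_points=10_001):
    """Approximate sup |x**n - 0| over [0, 1 - delta] on a uniform grid."""
    right = 1.0 - delta
    xs = [right * i / (grid_points - 1) for i in range(grid_points)]
    return max(x ** n for x in xs)


delta = 0.01  # measure of the discarded set (1 - delta, 1]
indices = (10, 100, 1000, 5000)
errors = [sup_error(n, delta) for n in indices]
for n, err in zip(indices, errors):
    print(n, err)  # the sup error shrinks as n grows
```

Since $x^n$ is increasing in $x$, the supremum on $[0, 1-\delta]$ is $(1-\delta)^n$, which goes to zero for each fixed $\delta > 0$ even though the convergence is not uniform on all of $[0,1)$.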

Next, let’s see what we can do with domination by a p-summable function.

Theorem 15.2.2 (Pointwise a.e. Convergence Plus Domination Implies p-Norm Convergence).
Let $1 \leq p < \infty$ and $(X, S, \mu)$ be a measure space. Let $f$ be an extended real valued function which is measurable. Also, let $(f_n)$ be a sequence of functions in $L_p(X, S)$ such that $f_n \to f$ [a.e.]. Assume there is a dominating function $g$ which is p-summable; i.e., $|f_n(x)| \leq g(x)$ a.e. for all $n$. Then $f$ is p-summable and $f_n \to f$ [p-norm].

Proof 15.2.2. Since $|f_n(x)| \leq g(x)$ a.e. for all $n$ and $f_n \to f$ [a.e.], we have immediately that $|f| \leq g$ a.e. Thus, $|f|^p \leq g^p$ a.e. and we know $f$ is in $L_p(X, S)$. Since all the functions here are p-summable, the set where addition is not defined has measure zero. So, we can assume without loss of generality that this set has been incorporated into the set on which convergence fails. Hence, we can say

|fn(x)− f(x)| ≤ |fn(x)| + |f(x)| ≤ 2 g(x), a.e.

So,

\[ |f_n(x) - f(x)|^p \leq 2^p \, |g(x)|^p, \quad \text{a.e.} \]

By assumption, $g$ is p-summable, so we have $2^p g^p$ is in $L_1(X, S)$. Since $|f_n - f|^p \to 0$ a.e., applying Lebesgue's Dominated Convergence Theorem, we find

\[ \lim_n \int |f_n(x) - f(x)|^p \, d\mu = \int \lim_n |f_n(x) - f(x)|^p \, d\mu = 0. \]

Thus, fn → f [p− norm].
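As a sanity check of this theorem, the following sketch again uses the hypothetical example $f_n(x) = x^n$ on $[0,1]$ with Lebesgue measure, dominated by the constant function $g = 1$. Here the p-norm can be computed in closed form, $\| f_n \|_p = (np + 1)^{-1/p}$, and we compare it with a midpoint-rule approximation. The function names are ours and the quadrature is only a rough numerical check, not part of the proof.

```python
def exact_p_norm(n, p):
    """Closed form p-norm of x**n on [0, 1]: (1 / (n*p + 1)) ** (1/p)."""
    return (1.0 / (n * p + 1.0)) ** (1.0 / p)


def midpoint_p_norm(n, p, cells=20_000):
    """Midpoint-rule approximation of the same p-norm."""
    h = 1.0 / cells
    total = sum(((i + 0.5) * h) ** (n * p) for i in range(cells)) * h
    return total ** (1.0 / p)


for n in (1, 10, 100, 1000):
    print(n, exact_p_norm(n, 2))  # decreases toward 0 as n grows
```

The printed norms decrease to zero, which is what the theorem predicts since $f_n \to 0$ a.e. and $|f_n| \leq 1$ with $1 \in L_p([0,1])$.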

15.3 Vitali’s Theorem

This important theorem is one that gives us more technical tools to characterize p-norm convergence for a

sequence of functions. We need a certain amount of technical infrastructure to pull this off; so bear with

us as we establish a series of lemmas.

Lemma 15.3.1 (p-Summable Functions Have p-Norm Arbitrarily Small Off a Set).
Let $1 \leq p < \infty$ and $(X, S, \mu)$ be a measure space. Let $f$ be in $L_p(X, S)$. Then given $\epsilon > 0$, there is a measurable set $E_\epsilon$ so that $\mu(E_\epsilon) < \infty$ and if $F \subseteq E_\epsilon^C$ is measurable, then $\| f \, I_F \|_p < \epsilon$.

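Proof 15.3.1. Let $E_n = (|f(x)| \geq 1/n)$. Note $E_n \in S$, the sequence $(E_n)$ is increasing and $\cup_n E_n = (f(x) \neq 0)$. Let $f_n = f \, I_{E_n}$. It is straightforward to verify that $f_n \to f$ pointwise with $|f_n| \leq |f_{n+1}|$ for all $n$. Further, $|f_n|^p \leq |f|^p$; hence, by the Dominated Convergence Theorem,

\[ \lim_n \int |f_n|^p \, d\mu = \int \lim_n |f_n|^p \, d\mu = \int |f|^p \, d\mu < \infty. \]

The definition of $f_n$ and $E_n$ then implies

\[ \mu(E_n)/n^p \leq \int_{E_n} |f|^p \, d\mu \leq \int |f|^p \, d\mu < \infty. \]

This tells us $\mu(E_n) < \infty$ for all $n$.

Now choose $\epsilon > 0$ arbitrarily. Then there is a positive integer $N$ so that

\[ \int |f|^p \, d\mu - \int |f_n|^p \, d\mu < \epsilon^p, \quad \forall \; n > N. \]

Thus, since $f_n = f \, I_{E_n}$, we can say

\[ \int_{E_n} |f|^p \, d\mu + \int_{E_n^C} |f|^p \, d\mu - \int_{E_n} |f|^p \, d\mu < \epsilon^p, \quad \forall \; n > N, \]

or

\[ \int_{E_n^C} |f|^p \, d\mu < \epsilon^p, \quad \forall \; n > N. \]

So choose $E_\epsilon = E_{N+1}$ and we have

\[ \int_{E_\epsilon^C} |f|^p \, d\mu < \epsilon^p, \]

which implies the desired result, since for any measurable $F \subseteq E_\epsilon^C$ we have $\| f \, I_F \|_p^p \leq \int_{E_\epsilon^C} |f|^p \, d\mu$.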
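The construction in this proof can be traced on a simple example. The sketch below assumes, purely for illustration, $X = [1, \infty)$ with Lebesgue measure, $p = 2$ and $f(x) = 1/x$. Then $E_n = (|f| \geq 1/n) = [1, n]$ has measure $n - 1$, and the tail integral of $|f|^2$ over $E_n^C = (n, \infty)$ is $1/n$. The function names are hypothetical.

```python
import math


def tail_norm(n):
    """2-norm of f(x) = 1/x on the complement (n, inf) of E_n = [1, n]."""
    return math.sqrt(1.0 / n)  # the integral of x**-2 over (n, inf) is 1/n


def measure_of_E(n):
    """Lebesgue measure of E_n = [1, n]."""
    return n - 1.0


def first_good_index(eps):
    """Smallest n whose tail norm is strictly below eps."""
    n = 1
    while tail_norm(n) >= eps:
        n += 1
    return n


n_eps = first_good_index(0.25)
print(n_eps, measure_of_E(n_eps), tail_norm(n_eps))
```

For $\epsilon = 0.25$ the loop stops at $n = 17$: the set $E_\epsilon = [1, 17]$ has finite measure and the norm of $f$ off this set is below $\epsilon$.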

Lemma 15.3.2 (p-Summable Inequality).
Let $1 \leq p < \infty$ and $(X, S, \mu)$ be a measure space. Let $(f_n)$ be a sequence of functions in $L_p(X, S)$.

Define βn on S by

βn(E) = ‖ fn IE ‖p, ∀ E.

Then,

|βn(E) − βm(E)| ≤ ‖ fn − fm ‖p, ∀ E, ∀ n, m.

Proof 15.3.2. By the backwards triangle inequality, for any measurable $E$,

\[ \| f_n - f_m \|_p \geq \| (f_n - f_m) \, I_E \|_p \geq \left| \, \| f_n \, I_E \|_p - \| f_m \, I_E \|_p \, \right| = |\beta_n(E) - \beta_m(E)|. \]

Lemma 15.3.3 (p-Summable Cauchy Sequence Condition I).
Let $1 \leq p < \infty$ and $(X, S, \mu)$ be a measure space. Let $(f_n)$ be a Cauchy Sequence in $L_p(X, S)$. Define $\beta_n$ on $S$ as done in Lemma 15.3.2. Then, given $\epsilon > 0$, there is a positive integer $N$ and a measurable set $E_\epsilon$ of finite measure, so that if $F$ is a measurable subset of $E_\epsilon^C$, then $\beta_n(F) < \epsilon$ for all $n > N$.

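Proof 15.3.3. Since $(f_n)$ is a Cauchy sequence in p-norm and $L_p(X, S, \mu)$ is complete, there is a function $f$ in $L_p(X, S)$ so that $f_n \to f$ [p-norm]. By Lemma 15.3.1, given $\epsilon > 0$, there is a measurable set $E_\epsilon$ with finite measure so that

\[ \int_{E_\epsilon^C} |f|^p \, d\mu < (\epsilon/2)^p. \]

Now given a measurable $F$ contained in $E_\epsilon^C$, recalling the meaning of $\beta_n(F)$ as described in Lemma 15.3.2, we can write

\begin{align*}
\beta_n(F) = \| f_n \, I_F \|_p &\leq \| f_n \, I_{E_\epsilon^C} \|_p \\
&\leq \| (f_n - f) \, I_{E_\epsilon^C} \|_p + \| f \, I_{E_\epsilon^C} \|_p \\
&< \| (f_n - f) \, I_{E_\epsilon^C} \|_p + \epsilon/2.
\end{align*}

Since $f_n \to f$ [p-norm], there is a positive integer $N$ so that if $n > N$,

\[ \| (f_n - f) \, I_{E_\epsilon^C} \|_p \leq \| f_n - f \|_p < \epsilon/2. \]

This shows $\beta_n(F) < \epsilon$ when $n > N$ as desired.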

Lemma 15.3.4 (Continuity Of The Integral).
Let $(X, S, \mu)$ be a measure space and $f$ be a summable function. Then for all $\epsilon > 0$ there is a $\delta > 0$, so that

\[ \left| \int_E f \, d\mu \right| < \epsilon, \quad \forall \; E \in S \text{ with } \mu(E) < \delta. \]

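Proof 15.3.4. Define the measure $\gamma$ on $S$ by $\gamma(E) = \int_E |f| \, d\mu$. Note, by Comment 9.4.3, we know that $\gamma$ is absolutely continuous with respect to $\mu$. Now assume the proposition is false. Then, there is an $\epsilon > 0$ so that for all choices of $\delta > 0$, we have a measurable set $E_\delta$ for which $\mu(E_\delta) < \delta$ and $|\int_{E_\delta} f \, d\mu| \geq \epsilon$. In particular, for the sequence $\delta_n = 1/2^n$, we have a sequence of sets $E_n$ with $\mu(E_n) < 1/2^n$ and

\[ \left| \int_{E_n} f \, d\mu \right| \geq \epsilon. \]

Let

\[ G_n = \bigcup_{k=n}^{\infty} E_k, \qquad G = \bigcap_{n=1}^{\infty} G_n. \]

Then,

\[ \mu(G) \leq \mu(G_n) \leq \sum_{k=n}^{\infty} \mu(E_k) < \sum_{k=n}^{\infty} 1/2^k = 1/2^{n-1}. \]

This implies $\mu(G) = 0$ and thus, since $\gamma$ is absolutely continuous with respect to $\mu$, $\gamma(G) = 0$ also. We also know the sequence $(G_n)$ is decreasing and $\gamma(G_1)$ is finite because $f$ is summable, so $\gamma(G_n) \to \gamma(G) = 0$. Finally, since

\[ \gamma(G_n) \geq \gamma(E_n) \geq \left| \int_{E_n} f \, d\mu \right| \geq \epsilon, \]

we have $\gamma(G) = \lim_n \gamma(G_n) \geq \epsilon$ as well. This is impossible. Hence, our assumption that the proposition is false is wrong.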

Lemma 15.3.5 (p-Summable Cauchy Sequence Condition II).
Let $1 \leq p < \infty$ and $(X, S, \mu)$ be a measure space. Let $(f_n)$ be a Cauchy Sequence in $L_p(X, S)$.

Define βn on S as done in Lemma 15.3.2. Then, given ε > 0, there is a δ > 0 and a positive integer

N so that if n > N , then

βn(E) < ε, ∀ E ∈ S, with µ(E) < δ.

Proof 15.3.5. Since $L_p(X, S, \mu)$ is complete, there is a p-summable function $f$ so that $f_n \to f$ [p-norm]. Then, by Lemma 15.3.4 applied to the summable function $|f|^p$, given an $\epsilon > 0$, there is a $\delta > 0$, so that

\[ \int_E |f|^p \, d\mu < (\epsilon/2)^p, \quad \text{if } \mu(E) < \delta. \]

Hence, using the convenience mapping $\beta_n(E)$ previously defined in Lemma 15.3.2, we see

\[ \beta_n(E) = \| f_n \, I_E \|_p \leq \| (f - f_n) \, I_E \|_p + \| f \, I_E \|_p < \| (f - f_n) \, I_E \|_p + \epsilon/2 \]

when $\mu(E) < \delta$. Finally, since $f_n \to f$ [p-norm], there is a positive integer $N$ so that if $n > N$, then $\| (f - f_n) \, I_E \|_p \leq \| f - f_n \|_p < \epsilon/2$. Combining, we have $\beta_n(E) < \epsilon$ if $n > N$ for $\mu(E) < \delta$.

Theorem 15.3.6 (Vitali Convergence Theorem).
Let $1 \leq p < \infty$ and $(X, S, \mu)$ be a measure space. Let $(f_n)$ be a sequence of functions in $L_p(X, S)$. Then, $f_n \to f$ [p-norm] if and only if the following three conditions hold.

(i): $f_n \to f$ [meas].

(ii): For all $\epsilon > 0$, there exists $N$ and a set $E_\epsilon \in S$ with $\mu(E_\epsilon) < \infty$ so that if $F$ is a measurable set in $E_\epsilon^C$, then $\int_F |f_n|^p \, d\mu < \epsilon^p$ for all $n > N$.

(iii): For all $\epsilon > 0$, there is a $\delta > 0$ and an $N$ so that if $E$ is measurable with $\mu(E) < \delta$, then $\int_E |f_n|^p \, d\mu < \epsilon^p$ for all $n > N$.

Proof 15.3.6.
$\Rightarrow$: If $f_n \to f$ [p-norm], then by Theorem 15.1.3, $f_n \to f$ [meas], which shows (i) holds. Then, since $f_n \to f$ [p-norm], $(f_n)$ is a Cauchy sequence. Thus, by Lemma 15.3.3, condition (ii) holds. Finally, since $(f_n)$ is a Cauchy sequence, by Lemma 15.3.5, condition (iii) holds.

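$\Leftarrow$: Now assume conditions (i), (ii) and (iii) hold. Let $\epsilon > 0$ be given. From condition (ii), we see there is a measurable set $E_\epsilon$ of finite measure and a positive integer $N_1$ so that

\[ \int_{E_\epsilon^C} |f_n|^p \, d\mu < (\epsilon/4)^p \]

if $n > N_1$. Thus, for indices $n$ and $m$ larger than $N_1$, we have

\begin{align*}
\| f_n - f_m \|_p &= \| (f_n - f_m) \, I_{E_\epsilon} + (f_n - f_m) \, I_{E_\epsilon^C} \|_p \\
&\leq \| (f_n - f_m) \, I_{E_\epsilon} \|_p + \| (f_n - f_m) \, I_{E_\epsilon^C} \|_p \\
&\leq \| (f_n - f_m) \, I_{E_\epsilon} \|_p + \| f_n \, I_{E_\epsilon^C} \|_p + \| f_m \, I_{E_\epsilon^C} \|_p \\
&< \| (f_n - f_m) \, I_{E_\epsilon} \|_p + \epsilon/2.
\end{align*}

We conclude

\[ \| f_n - f_m \|_p < \| (f_n - f_m) \, I_{E_\epsilon} \|_p + \epsilon/2, \quad \forall \; n, m > N_1. \qquad (*) \]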

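Now let $\beta = \mu(E_\epsilon)$ and set

\[ \alpha = \frac{\epsilon}{4 \, \beta^{1/p}} \]

and define the sets $H_{nm}$ by

\[ H_{nm} = \{ x \; | \; |f_n(x) - f_m(x)| \geq \alpha \}. \]

Apply condition (iii) for our given $\epsilon$ now. Thus, there is a $\delta(\epsilon)$ and a positive integer $N_2$ so that

\[ \int_E |f_n|^p \, d\mu < (\epsilon/8)^p, \quad n > N_2, \text{ when } \mu(E) < \delta(\epsilon). \qquad (**) \]

Since $f_n \to f$ [meas], $(f_n)$ is Cauchy in measure. Hence, there is a positive integer $N_3$ so that

\[ \mu(H_{nm}) < \delta(\epsilon), \quad \forall \; n, m > N_3. \qquad (***) \]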

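Finally, using the Minkowski Inequality, we have

\begin{align*}
\| (f_n - f_m) \, I_{E_\epsilon} \|_p &= \| (f_n - f_m) \, I_{E_\epsilon \setminus H_{nm}} + (f_n - f_m) \, I_{E_\epsilon \cap H_{nm}} \|_p \\
&\leq \| (f_n - f_m) \, I_{E_\epsilon \setminus H_{nm}} \|_p + \| f_n \, I_{H_{nm}} \|_p + \| f_m \, I_{H_{nm}} \|_p.
\end{align*}

Now let $N = \max\{N_1, N_2, N_3\}$. Then, if $n$ and $m$ exceed $N$, we have Equation $*$, Equation $**$ and Equation $***$ all hold. Since $|f_n - f_m| < \alpha$ on $E_\epsilon \setminus H_{nm}$, this implies

\begin{align*}
\| (f_n - f_m) \, I_{E_\epsilon} \|_p &\leq \left( \alpha^p \, \mu(E_\epsilon \setminus H_{nm}) \right)^{1/p} + \epsilon/8 + \epsilon/8 \\
&\leq \alpha \left( \mu(E_\epsilon) \right)^{1/p} + \epsilon/4 \\
&= \alpha \, \beta^{1/p} + \epsilon/4 = \frac{\epsilon}{4 \, \beta^{1/p}} \, \beta^{1/p} + \epsilon/4 \\
&= \epsilon/2.
\end{align*}

From Equation $*$, we have for these indices $n$ and $m$,

\[ \| f_n - f_m \|_p < \| (f_n - f_m) \, I_{E_\epsilon} \|_p + \epsilon/2 \leq \epsilon. \]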

Thus, (fn) is a Cauchy sequence in p-norm. Since Lp(X,S, µ) is complete, there is a function g so that

fn → g [p−norm]. So by Theorem 15.1.3, fn → g [meas]. It is then straightforward to show that f = g

a.e. This tells us f and g belong to the same equivalence class of Lp(X,S, µ).
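It can help to see why condition (iii) cannot be dropped. The sketch below uses a standard counterexample which is our own choice and not taken from the text: on $[0,1]$ with Lebesgue measure, let $f_n = n^{1/p} \, I_{[0, 1/n]}$. Then $f_n \to 0$ in measure, and condition (ii) holds trivially with $E_\epsilon = [0,1]$, but the p-th power integral over the small set $[0, 1/n]$ is always $1$, so (iii) fails and $\| f_n \|_p = 1$ for every $n$. The function names are hypothetical.

```python
def measure_where_large(n, p, alpha):
    """Lebesgue measure of (|f_n| >= alpha) for f_n = n**(1/p) on [0, 1/n]."""
    height = n ** (1.0 / p)
    return 1.0 / n if height >= alpha else 0.0


def p_integral_on_support(n, p):
    """Integral of |f_n|**p over its support [0, 1/n]: height**p * length."""
    height = n ** (1.0 / p)
    return height ** p * (1.0 / n)


for n in (10, 100, 1000):
    # the measure column shrinks to 0 while the integral column stays at 1
    print(n, measure_where_large(n, 2, 0.5), p_integral_on_support(n, 2))
```

So convergence in measure alone does not give p-norm convergence; the mass of $|f_n|^p$ concentrates on sets of vanishing measure, which is exactly what condition (iii) rules out.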

15.4 Summary

We can summarize the results of this chapter as follows. If we know the measure is finite, we can say quite

a bit.

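Theorem 15.4.1 (Convergence Relationships On Finite Measure Space One).
Let $(X, S, \mu)$ be a measure space with $\mu(X) < \infty$. Let $f$ and $(f_n)$ be in $M(X, S)$. Then,

\[ \left. \begin{array}{l} f_n \to f \ [p\text{-norm}] \\ f_n \to f \ [\text{unif}] \\ f_n \to f \ [\text{a.u.}] \\ f_n \to f \ [\text{a.e.}] \end{array} \right\} \Rightarrow f_n \to f \ [\text{meas}]. \]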

Certain types of convergence give us pointwise convergence.

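Theorem 15.4.2 (Convergence Relationships On Finite Measure Space Two).
Let $(X, S, \mu)$ be a measure space with $\mu(X) < \infty$. Let $f$ and $(f_n)$ be in $M(X, S)$. Then,

\[ \left. \begin{array}{l} f_n \to f \ [\text{unif}] \\ f_n \to f \ [\text{a.u.}] \end{array} \right\} \Rightarrow f_n \to f \ [\text{a.e.}]. \]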

Uniform convergence on a finite measure space implies p-norm convergence.

Theorem 15.4.3 (Convergence Relationships On Finite Measure Space Three).
Let $(X, S, \mu)$ be a measure space with $\mu(X) < \infty$. Let $f$ and $(f_n)$ be in $M(X, S)$. Then,

\[ f_n \to f \ [\text{unif}] \Rightarrow \left\{ \begin{array}{l} f_n \to f \ [\text{a.u.}] \\ f_n \to f \ [p\text{-norm}] \end{array} \right. \]

The final result says that pointwise convergence is “almost” uniform convergence.

Theorem 15.4.4 (Convergence Relationships On Finite Measure Space Four).
Let $(X, S, \mu)$ be a measure space with $\mu(X) < \infty$. Let $f$ and $(f_n)$ be in $M(X, S)$. Then,

\[ f_n \to f \ [\text{a.e.}] \Rightarrow f_n \to f \ [\text{a.u.}]. \]

If the measure of X is infinite, we have many one way implications.

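Theorem 15.4.5 (Convergence Relationships On General Measurable Space).
Let $(X, S, \mu)$ be a measure space. Let $f$ and $(f_n)$ be in $M(X, S)$. Then,

(i):

\[ \left. \begin{array}{l} f_n \to f \ [p\text{-norm}] \\ f_n \to f \ [\text{unif}] \\ f_n \to f \ [\text{a.u.}] \end{array} \right\} \Rightarrow f_n \to f \ [\text{meas}]. \]

(ii):

\[ \left. \begin{array}{l} f_n \to f \ [\text{unif}] \\ f_n \to f \ [\text{a.u.}] \end{array} \right\} \Rightarrow f_n \to f \ [\text{a.e.}]. \]

(iii):

\[ f_n \to f \ [\text{unif}] \Rightarrow f_n \to f \ [\text{a.u.}]. \]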

Next, if we can dominate the sequence by an Lp function, we can say even more.

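Theorem 15.4.6 (Convergence Relationships With p-Domination One).
Let $(X, S, \mu)$ be a measure space. Let $f$ and $(f_n)$ be in $M(X, S)$. Assume there is a $g \in L_p$ so that $|f_n| \leq g$. Then, we know the following implications:

\[ \left. \begin{array}{l} f_n \to f \ [p\text{-norm}] \\ f_n \to f \ [\text{unif}] \\ f_n \to f \ [\text{a.u.}] \\ f_n \to f \ [\text{a.e.}] \end{array} \right\} \Rightarrow f_n \to f \ [\text{meas}]. \]

Theorem 15.4.7 (Convergence Relationships With p-Domination Two).
Let $(X, S, \mu)$ be a measure space. Let $f$ and $(f_n)$ be in $M(X, S)$. Assume there is a $g \in L_p$ so that $|f_n| \leq g$. Then,

\[ \left. \begin{array}{l} f_n \to f \ [\text{unif}] \\ f_n \to f \ [\text{a.u.}] \end{array} \right\} \Rightarrow f_n \to f \ [\text{a.e.}]. \]

Theorem 15.4.8 (Convergence Relationships With p-Domination Three).
Let $(X, S, \mu)$ be a measure space. Let $f$ and $(f_n)$ be in $M(X, S)$. Assume there is a $g \in L_p$ so that $|f_n| \leq g$. Then,

\[ f_n \to f \ [\text{unif}] \Rightarrow \left\{ \begin{array}{l} f_n \to f \ [\text{a.u.}] \\ f_n \to f \ [p\text{-norm}] \end{array} \right. \]

Theorem 15.4.9 (Convergence Relationships With p-Domination Four).
Let $(X, S, \mu)$ be a measure space. Let $f$ and $(f_n)$ be in $M(X, S)$. Assume there is a $g \in L_p$ so that $|f_n| \leq g$. Then,

\[ f_n \to f \ [\text{a.e.}] \Rightarrow f_n \to f \ [\text{a.u.}]. \]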

Theorem 15.4.10 (Convergence Relationships With p-Domination Five).
Let $(X, S, \mu)$ be a measure space. Let $f$ and $(f_n)$ be in $M(X, S)$. Assume there is a $g \in L_p$ so that $|f_n| \leq g$. Then,

\[ \left. \begin{array}{l} f_n \to f \ [\text{a.e.}] \\ f_n \to f \ [\text{a.u.}] \end{array} \right\} \Rightarrow f_n \to f \ [p\text{-norm}]. \]

Theorem 15.4.11 (Convergence Relationships With p-Domination Six).
Let $(X, S, \mu)$ be a measure space. Let $f$ and $(f_n)$ be in $M(X, S)$. Assume there is a $g \in L_p$ so that $|f_n| \leq g$. Then,

\[ f_n \to f \ [\text{meas}] \Rightarrow f_n \to f \ [p\text{-norm}]. \]

There are circumstances where we can be sure we can extract a subsequence that converges in some

fashion.

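Theorem 15.4.12 (Convergent Subsequences Exist).
Let $(X, S, \mu)$ be a measure space. It doesn’t matter whether or not $\mu(X)$ is finite. Let $f$ and $(f_n)$ be in $M(X, S)$. Then, we know the following implications:

(i):

\[ \left. \begin{array}{l} f_n \to f \ [p\text{-norm}] \\ f_n \to f \ [\text{meas}] \end{array} \right\} \Rightarrow \exists \text{ subsequence } (f_n^1) \text{ with } f_n^1 \to f \ [\text{a.u.}]. \]

(ii):

\[ \left. \begin{array}{l} f_n \to f \ [p\text{-norm}] \\ f_n \to f \ [\text{meas}] \end{array} \right\} \Rightarrow \exists \text{ subsequence } (f_n^1) \text{ with } f_n^1 \to f \ [\text{a.e.}]. \]

Further, the same implications hold if we know there is a $g \in L_p$ so that $|f_n| \leq g$.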
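The need to pass to a subsequence can be seen in the classical "typewriter" sequence on $[0,1]$ with Lebesgue measure. This example is our own illustration, not taken from the text: writing $k = 2^m + j$ with $0 \leq j < 2^m$, let $f_k$ be the indicator of $[j/2^m, (j+1)/2^m]$. The support has measure $2^{-m}$, so $f_k \to 0$ in measure, yet every $x$ is hit once in each dyadic block, so $f_k(x)$ does not converge. The subsequence $k = 2^m$ does converge to $0$ at every $x > 0$. The function names are hypothetical.

```python
def typewriter(k, x):
    """Value at x of the k-th typewriter function, k >= 1."""
    m = k.bit_length() - 1  # k = 2**m + j with 0 <= j < 2**m
    j = k - (1 << m)
    width = 1.0 / (1 << m)
    return 1.0 if j * width <= x <= (j + 1) * width else 0.0


def support_measure(k):
    """Lebesgue measure of the set where the k-th function equals 1."""
    m = k.bit_length() - 1
    return 1.0 / (1 << m)


x = 0.3
full_sequence = [typewriter(k, x) for k in range(1, 64)]
subsequence = [typewriter(2 ** m, x) for m in range(0, 12)]
print(full_sequence)  # keeps returning to 1, so no pointwise limit at x
print(subsequence)    # eventually identically 0
```

The full sequence oscillates between $0$ and $1$ at $x = 0.3$, while the extracted subsequence is eventually $0$ there, in line with part (ii) of the theorem.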

15.5 Homework

Exercise 15.5.1. Characterize convergence in measure when the measure is counting measure.


Exercise 15.5.2. Let (X,S, µ) be a measure space. Let (fn), (gn) ⊆ M(X,S, µ) be sequences of functions which are finite a.e. Let f, g : X → ℜ be functions. Prove if fn → f [meas on E] and gn → g [meas on E], then (fn + gn) → (f + g) [meas on E].

Exercise 15.5.3. Let (X,S, µ) be a measure space with µ(X) < ∞. Let (fn), (gn) ⊆ M(X,S, µ) be sequences of functions which are finite a.e. Let f, g : X → ℜ be functions. Prove if fn → f [meas on E] and gn → g [meas on E], then (fn gn) → (f g) [meas on E]. Hint: first consider the case that fn → 0 [meas on E] and gn → 0 [meas on E].

Exercise 15.5.4. Let (X,S, µ) be a measure space. Let (fn) ⊆ M(X,S, µ) be a sequence of functions which are finite a.e. Let f : X → ℜ be a function. Prove if fn → f [a.u.], then fn → f [ptws a.e.] and fn → f [meas].

Exercise 15.5.5. Let (X,S, µ) be a finite measure space. For any pair of measurable functions f and g, define

\[ d(f, g) = \int \frac{|f - g|}{1 + |f - g|} \, d\mu. \]

(i): Prove M(X,S, µ) is a semi-metric space.

(ii): Prove if (fn) is a sequence of measurable functions and f is another measurable function, then

fn → f [meas] if and only if d(fn, f)→ 0.

Hint: You don’t need any high power theorems here. First, let φ(t) = t/(1 + t) so that d(f, g) = ∫ φ(|f − g|) dµ. Then try this:

(⇒): We assume fn → f [meas]. Then, given any pair of positive numbers (δ, ε), we have there is an N

so that if n > N , we have

µ(|fn(x)− f(x)| ≥ δ) < ε/2.

Let Eδ denote the set above. Now for such n > N , note

d(fn, f) =∫Eδ

φ(|fn − f |) dµ +∫ECδ

φ(|fn − f |) dµ.

Since φ is increasing, we see that on ECδ , φ(|fn(x) − f(x)|) < φ(δ). Thus, you should be able to show

that if n > N , we have

d(fn, f) ≤ µ(Eδ) + φ(δ) µ(X) < ε/2 + φ(δ) µ(X).

Then a suitable choice of δ does the job.

(⇐): If we know d(fn, f) goes to zero, break the integral up the same way into a piece on Eδ and ECδ .

This tells us right away that given ε > 0, there is an N so that n > N implies

φ(δ) µ(Eδ) < ε.

This gives us the result with a little manipulation.
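Before attempting the proof, it can be useful to experiment with $d$ on a toy space. The sketch below assumes a hypothetical four point space with point masses given by `weights` (one of them zero), so the integral is a weighted sum. It only spot-checks symmetry, the triangle inequality and the fact that $d$ cannot separate functions that differ on a set of measure zero, which is why we get a semi-metric; it is not a proof.

```python
def phi(t):
    """The bounded increasing function t / (1 + t) from the hint."""
    return t / (1.0 + t)


def d(f, g, weights):
    """d(f, g) on a finite space: sum of phi(|f - g|) against the point masses."""
    return sum(w * phi(abs(a - b)) for a, b, w in zip(f, g, weights))


weights = [0.5, 0.25, 0.0, 0.25]  # the third point has measure zero
f = [0.0, 1.0, 2.0, -1.0]
g = [1.0, 1.0, 5.0, 0.5]
h = [3.0, -2.0, 0.0, 0.5]
f_changed_on_null_set = [0.0, 1.0, 99.0, -1.0]

print(d(f, g, weights), d(g, h, weights), d(f, h, weights))
print(d(f, f_changed_on_null_set, weights))  # zero although the functions differ
```

The last line shows two different functions at distance zero, so $d$ only becomes a metric after identifying functions that agree a.e.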

Exercise 15.5.6. Let (ℜ,M, µ) denote the measure space consisting of the Lebesgue measurable sets M and Lebesgue measure µ. Let the sequence (fn) of measurable functions be defined by

fn = n I[1/n,2/n].

Prove fn → 0 on all of ℜ, fn → 0 [meas] but fn ↛ 0 [p − norm] for 1 ≤ p < ∞.

Chapter 16

Decomposition Of Measures

We now examine the structure of a charge λ on a σ-algebra S. For convenience, let’s recall that a charge is a mapping on S to ℜ which assigns the value 0 to ∅ and which is countably additive. We need some beginning definitional material before we go further.

16.1 The Jordan Decomposition Of A Charge

Definition 16.1.1 (Positive and Negative Sets For a Charge).
Let λ be a charge on (X,S). We say P ∈ S is a positive set with respect to λ if

λ(E ∩ P ) ≥ 0, ∀ E ∈ S.

Further, we say N ∈ S is a negative set with respect to λ if

λ(E ∩ N) ≤ 0, ∀ E ∈ S.

Finally, M ∈ S is a null set with respect to λ if

λ(E ∩ M) = 0, ∀ E ∈ S.

Definition 16.1.2 (The Positive and Negative Parts Of a Charge).
Let λ be a charge on (X,S). Define the mapping $\lambda^+$ on S by

\[ \lambda^+(E) = \sup \{ \lambda(A) \; | \; A \in S, \; A \subseteq E \}. \]

Also, define the mapping $\lambda^-$ on S by

\[ \lambda^-(E) = - \inf \{ \lambda(A) \; | \; A \in S, \; A \subseteq E \}. \]

Theorem 16.1.1 (The Jordan Decomposition Of A Finite Charge).
Let λ be a finite valued charge on (X,S). Then, $\lambda^+$ and $\lambda^-$ are finite measures on S and $\lambda = \lambda^+ - \lambda^-$.

The pair (λ+, λ−) is called the Jordan Decomposition of λ.
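On a finite set every subset is measurable, so the definitions of $\lambda^+$ and $\lambda^-$ can be evaluated by brute force. The sketch below assumes a hypothetical four point space whose charge is given by the point weights in `weight`; the names `charge`, `positive_part` and `negative_part` are ours. It checks the identity $\lambda = \lambda^+ - \lambda^-$ on every subset, as an illustration of the theorem rather than a proof.

```python
from itertools import combinations

weight = {"a": 2.0, "b": -3.0, "c": 0.5, "d": -1.0}  # hypothetical point charges


def subsets(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


def charge(E):
    """lambda(E): the sum of the point charges in E."""
    return sum(weight[x] for x in E)


def positive_part(E):
    """lambda^+(E) = sup of lambda(A) over subsets A of E."""
    return max(charge(A) for A in subsets(E))


def negative_part(E):
    """lambda^-(E) = -inf of lambda(A) over subsets A of E."""
    return -min(charge(A) for A in subsets(E))


X = frozenset(weight)
print(charge(X), positive_part(X), negative_part(X))
```

For this choice of weights the whole space has $\lambda(X) = -1.5$, $\lambda^+(X) = 2.5$ and $\lambda^-(X) = 4$, consistent with $\lambda = \lambda^+ - \lambda^-$.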

Proof 16.1.1. Let’s look at λ+ first. Given any measurable E, since ∅ is contained in E, by the definition

of λ+, we must have λ+(E) ≥ λ(∅) = 0. Hence, λ+ is non negative.

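Next, if $A$ and $B$ are measurable and disjoint, then by the definition of $\lambda^+$, for measurable $C_1 \subseteq A$ and $C_2 \subseteq B$, we must have

\[ \lambda^+(A \cup B) \geq \lambda(C_1 \cup C_2) = \lambda(C_1) + \lambda(C_2). \]

This says $\lambda^+(A \cup B) - \lambda(C_2)$ is an upper bound for the set of numbers $\lambda(C_1)$. Hence, by definition of $\lambda^+(A)$, we have

\[ \lambda^+(A \cup B) \geq \lambda^+(A) + \lambda(C_2). \]

A similar argument then shows that $\lambda(C_2)$ is bounded above by $\lambda^+(A \cup B) - \lambda^+(A)$. Thus, we have

\[ \lambda^+(A \cup B) \geq \lambda^+(A) + \lambda^+(B). \]

On the other hand, if $C \subseteq A \cup B$ is measurable, then we have

\[ \lambda(C) = \lambda\left( (C \cap A) \cup (C \cap B) \right) = \lambda(C \cap A) + \lambda(C \cap B) \leq \lambda^+(A) + \lambda^+(B). \]

This immediately implies that

\[ \lambda^+(A \cup B) \leq \lambda^+(A) + \lambda^+(B). \]

Thus, it is clear $\lambda^+$ is additive on finite disjoint unions.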

We now address the question of the finiteness of $\lambda^+$. To see $\lambda^+$ is finite, assume that it is not. So there is some set $E$ with $\lambda^+(E) = \infty$. Hence, by definition, there is a measurable set $A_1 \subseteq E$ so that $\lambda(A_1) > 1$. Thus, by additivity of $\lambda^+$, we have

\[ \lambda^+(A_1) + \lambda^+(E \setminus A_1) = \lambda^+(E) = \infty. \]

Thus, at least one of $\lambda^+(A_1)$ and $\lambda^+(E \setminus A_1)$ is also $\infty$. Pick one such set and call it $B_1$. Thus, $\lambda^+(B_1) = \infty$. Let’s do one more step. Since $\lambda^+(B_1) = \infty$, there is a measurable set $A_2$ inside it so that $\lambda(A_2) > 2$. Then,

\[ \lambda^+(A_2) + \lambda^+(B_1 \setminus A_2) = \lambda^+(B_1) = \infty. \]

Thus, at least one of $\lambda^+(A_2)$ and $\lambda^+(B_1 \setminus A_2)$ is also $\infty$. Pick one such set and call it $B_2$. Thus, we have $\lambda^+(B_2) = \infty$. You should be able to see how we construct the two sequences $(A_n)$ and $(B_n)$. When we are done, we know $A_n \subseteq B_{n-1}$, $\lambda(A_n) > n$ and $\lambda^+(B_n) = \infty$ for all $n$.

Now, if for an infinite number of indices $n_k$, $B_{n_k} = B_{n_k - 1} \setminus A_{n_k}$, what happens? It is easiest to see with an example. Suppose $B_5 = B_4 \setminus A_5$ and $B_8 = B_7 \setminus A_8$. By the way we construct these sets, we see $A_6$ does not intersect $A_5$. Hence, $A_7 \cap A_5 = \emptyset$ also. Finally, we have $A_8 \cap A_5 = \emptyset$ too. Hence, extrapolating from this simple example, we can infer that the sequence $(A_{n_k})$ is disjoint. By the countable additivity of $\lambda$, we then have

\[ \lambda\left( \bigcup_k A_{n_k} \right) = \sum_k \lambda(A_{n_k}) \geq \sum_k n_k = \infty. \]

But $\lambda$ is finite on all members of $S$. This is therefore a contradiction.

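Another possibility is that there is an index $N$ so that if $n > N$, the choice is always that of $B_n = A_n$. In this case, we have

\[ E \supseteq A_{N+1} \supseteq A_{N+2} \supseteq \ldots \]

Since $\lambda$ is finite and additive,

\[ \lambda(A_{N+j-1} \setminus A_{N+j}) = \lambda(A_{N+j-1}) - \lambda(A_{N+j}) \]

for $j \geq 2$ since all the $\lambda$ values are finite. We now follow the construction given in the proof of the second part of Lemma 9.1.2 to finish our argument. Construct the sequence of sets $(E_n)$ by

\begin{align*}
E_1 &= \emptyset \\
E_2 &= A_{N+1} \setminus A_{N+2} \\
E_3 &= A_{N+1} \setminus A_{N+3} \\
&\;\; \vdots \\
E_n &= A_{N+1} \setminus A_{N+n}.
\end{align*}

Then $(E_n)$ is an increasing sequence of sets and so, by countable additivity applied to the disjoint differences $E_{n+1} \setminus E_n$, $\lambda(\cup_n E_n) = \lim_n \lambda(E_n)$. Since $\lambda(A_{N+1})$ is finite, we then know that $\lambda(E_n) = \lambda(A_{N+1}) - \lambda(A_{N+n})$. Hence, $\lambda(\cup_n E_n) = \lambda(A_{N+1}) - \lim_n \lambda(A_{N+n})$. Next, note by De Morgan’s Laws,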

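\begin{align*}
\lambda\left( \cup_n E_n \right) &= \lambda\left( \bigcup_n \left( A_{N+1} \cap A_{N+n}^C \right) \right) = \lambda\left( A_{N+1} \cap \bigcup_n A_{N+n}^C \right) \\
&= \lambda\left( A_{N+1} \cap \left( \cap_n A_{N+n} \right)^C \right) \\
&= \lambda\left( A_{N+1} \setminus \left( \cap_n A_{N+n} \right) \right).
\end{align*}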

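Thus, since $\lambda(A_{N+1})$ is finite and $\cap_n A_{N+n} \subseteq A_{N+1}$, it follows that

\[ \lambda(\cup_n E_n) = \lambda(A_{N+1}) - \lambda(\cap_n A_{N+n}). \]

Combining these results, we have

\[ \lambda(A_{N+1}) - \lim_n \lambda(A_{N+n}) = \lambda(A_{N+1}) - \lambda(\cap_n A_{N+n}). \]

Canceling $\lambda(A_{N+1})$ from both sides, we find

\[ \lambda(\cap_n A_{N+n}) = \lim_n \lambda(A_{N+n}) \geq \lim_n \, (N + n) = \infty. \]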

We again find a set ∩nAN+n with λ value∞ inside E. However, λ is always finite. Thus, in this case also,

we arrive at a contradiction.

We conclude at this point that if λ+(E) = ∞, we force λ to become infinite for some subsets. Since that

is not possible, we have shown λ+ is finite. Since λ− = (−λ)+, we have established that λ− is finite also.

Next, given the relationship between λ+ and λ−, it is enough to prove λ+ is a measure to complete this

proof.

It is enough to prove that $\lambda^+$ is countably additive. Let $(E_n)$ be a countable sequence of disjoint measurable sets and let $E$ be their union. If $A \subseteq E$ is measurable, then $A = \cup_n (A \cap E_n)$ and so

\[ \lambda(A) = \sum_n \lambda(A \cap E_n) \leq \sum_n \lambda^+(E_n), \]

by the definition of $\lambda^+$. Since this holds for all such subsets $A$, we conclude $\sum_n \lambda^+(E_n)$ is an upper bound for the collection of all such $\lambda(A)$. Hence, by the definition of a supremum, we have $\lambda^+(E) \leq \sum_n \lambda^+(E_n)$.

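To show the reverse, note $\lambda^+(E)$ is finite by the arguments in the first part of this proof. Now, pick $\epsilon > 0$. Then, by the Supremum Tolerance Lemma, there is a sequence $(A_n)$ of measurable sets, each $A_n \subseteq E_n$, so that

\[ \lambda^+(E_n) - \epsilon/2^n < \lambda(A_n) \leq \lambda^+(E_n). \]

Let $A = \cup_n A_n$. Then $A \subseteq E$ and so we have $\lambda(A) \leq \lambda^+(E)$. Hence,

\[ \sum_n \lambda^+(E_n) \leq \sum_n \left( \lambda(A_n) + \epsilon/2^n \right) = \sum_n \lambda(A_n) + \epsilon \]

since the second term is a standard geometric series. Next, since $(A_n)$ is a disjoint sequence, the countable additivity of $\lambda$ gives

\[ \sum_n \lambda^+(E_n) \leq \lambda\left( \cup_n A_n \right) + \epsilon. \]

But $A = \cup_n A_n$ and so $\sum_n \lambda^+(E_n) \leq \lambda(A) + \epsilon \leq \lambda^+(E) + \epsilon$. Since this holds for all $\epsilon > 0$, we can conclude

\[ \sum_n \lambda^+(E_n) \leq \lambda^+(E). \]

Combining these inequalities, we see $\lambda^+$ is countably additive and hence is a measure.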

Comment 16.1.1. If we had allowed the charge in Definition 9.0.2 to be extended real valued; i.e., to take on the values $\infty$ and $-\infty$, what would happen? First, note by applying the arguments in the first part of the proof above, we can say if $\lambda^+(E) = \infty$, $\lambda(E) = \infty$ and similarly, if $\lambda^-(E) = \infty$, $\lambda(E) = -\infty$. Conversely, note by definition of $\lambda^+$, if $\lambda(E) = \infty$, then $\lambda^+(E) = \infty$ also and if $\lambda(E) = -\infty$, then $\lambda^-(E) = \infty$. So if $\lambda^+(E) = \infty$, what about $\lambda^-(E)$? If $\lambda^-(E) = \infty$, that would force $\lambda(E) = -\infty$, contradicting the value it already has. Hence $\lambda^-(E)$ is finite. Next, given any measurable set $F$, what about $\lambda^-(F)$? There are several cases. First, if $F \subseteq E$, then

\[ \lambda^-(E) = \lambda^-(F) + \lambda^-(E \setminus F). \]

Since $\lambda^- \geq 0$, if $\lambda^-(F) = \infty$, we get $\lambda^-(E)$, which is finite, is also infinite. Hence, this can not happen. Second, if $F$ and $E$ are disjoint, with $\lambda^-(F) = \infty$, we find

\[ \lambda^-(E \cup F) = \lambda^-(E) + \lambda^-(F). \]

The right hand side is $\infty$ and so since $\lambda^-(E \cup F)$ is infinite, this forces $\lambda(E \cup F) = -\infty$. But since $\lambda$ is additive on disjoint sets, this leads to the undefined expression

\begin{align*}
\lambda(E \cup F) &= \lambda(E) + \lambda(F) \\
-\infty &= (\infty) + (-\infty).
\end{align*}

This is not possible because by assumption, λ takes on a well defined value in < for all measurable

subsets. Thus, we conclude if there is a measurable set E so that λ+(E) is infinite, then λ− will be finite

everywhere. The converse is also true: if λ−(E) is infinite, then λ+ will be finite everywhere. Thus, we

can conclude if λ is extended real valued, only one of λ+ or λ− can take on∞ values.

Theorem 16.1.2 (The Jordan Decomposition Of A Charge).
Let λ be an extended valued charge on (X,S). Then, $\lambda^+$ and $\lambda^-$ are measures on S and $\lambda = \lambda^+ - \lambda^-$.

The pair (λ+, λ−) is called the Jordan Decomposition of λ.

Proof 16.1.2. By Comment 16.1.1, we can assume the charge $\lambda$ takes on only the value $\infty$ on some measurable sets, so that $\lambda^-$ is finite. The first part of the proof of Theorem 16.1.1 is the same. It is clear both $\lambda^+$ and $\lambda^-$ are nonnegative and additive on finite unions of disjoint sets. The argument that $\lambda^-$ is countably additive is the same as the one in the proof of Theorem 16.1.1 because we have assumed $\lambda^-$ is finite. There are two cases to show $\lambda^+$ is countably additive. These are

1. $\lambda^+(\cup_n E_n) < \infty$. In this case, we can use the argument of Theorem 16.1.1 to show that $\lambda^+(\cup_n E_n) \geq \sum_n \lambda^+(E_n)$, which shows countable additivity in this case.

2. $\lambda^+(\cup_n E_n) = \infty$. Here, the argument in the previous theorem shows there is a set $A \subseteq \cup_n E_n$ so that $\lambda(A) = \infty$. Hence, in this case, there is at least one $N$ so that $\lambda(A \cap E_N) = \infty$. However, this implies $\lambda^+(E_N) = \infty$ and so $\sum_n \lambda^+(E_n) = \infty$ also. Hence, the two sides match and again we have established countable additivity.

16.2 The Hahn Decomposition Associated With A Charge

Now we show that any charge λ has associated Positive and Negative sets.

Theorem 16.2.1 (The Hahn Decomposition Associated With A Charge).
Let λ be a charge on (X,S). Then, there is a positive set P and a negative set N so that X = P ∪ N and P ∩ N = ∅. The pair (P,N) is called a Hahn Decomposition associated with the charge λ.

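Proof 16.2.1. We may assume that $\lambda^+$ is finite; hence, by the Supremum Tolerance Lemma there are measurable sets $A_n$ so that

\[ \lambda(A_n) > \lambda^+(X) - \frac{1}{2^n} \qquad (\alpha) \]

for all $n$. Let $A = \limsup A_n$. Then, $A^C = \liminf A_n^C$. Recall

\begin{align*}
\limsup A_n &= \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} A_n \\
\liminf A_n^C &= \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} A_n^C.
\end{align*}

Let $B_m = \cap_{n=m}^{\infty} A_n^C$. Since $B_m \subseteq A_n^C$ for $n \geq m$, we see $\lambda^+(B_m) \leq \lambda^+(A_m^C)$. Hence, we have

\[ \liminf \lambda^+(B_m) \leq \liminf \lambda^+(A_n^C). \]

But the sequence of sets $B_m$ increases monotonically to $\cup_m B_m$ and so we know by Lemma 9.1.2 that $\lim \lambda^+(B_m) = \lambda^+(\cup_m B_m)$. Since $\liminf \lambda^+(B_m) = \lim \lambda^+(B_m)$ here, we obtain

\[ \lambda^+(A^C) = \lambda^+(\cup_m B_m) = \lambda^+(\liminf A_n^C) \leq \liminf_n \lambda^+(A_n^C). \]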

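Now, from Equation $\alpha$, it then follows since $\lambda^+$ is finite, that

\begin{align*}
\lambda^+(A_n^C) &= \lambda^+(X) - \lambda^+(A_n) \\
&\leq \lambda^+(X) - \lambda(A_n) \\
&\leq \frac{1}{2^n}.
\end{align*}

Thus,

\[ 0 \leq \lambda^+(A^C) \leq \liminf \lambda^+(A_n^C) = 0 \]

and we have shown that $\lambda^+(A^C) = 0$. It remains to show $\lambda^-(A) = 0$. Note

\begin{align*}
0 \leq \lambda^-(A_n) &= \lambda^+(A_n) - \lambda(A_n) \\
&\leq \lambda^+(X) - \lambda(A_n) \leq \frac{1}{2^n}.
\end{align*}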

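Hence, for all $m$,

\[ 0 \leq \lambda^-(A) \leq \lambda^-\left( \bigcup_{n=m}^{\infty} A_n \right) \leq \sum_{n=m}^{\infty} \lambda^-(A_n) \leq \sum_{n=m}^{\infty} \frac{1}{2^n} = \frac{1}{2^{m-1}}. \]

This clearly implies that $\lambda^-(A) = 0$ and we have the desired decomposition with $P = A$ and $N = A^C$.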

We can use the Hahn Decomposition to characterize $\lambda^+$ and $\lambda^-$ in a new way.

Lemma 16.2.2 (The Hahn Decomposition Characterization of a Charge).
Let (A,B) be a Hahn Decomposition for the charge λ on (X,S). Then, if E is measurable, $\lambda^+(E) = \lambda(E \cap A)$ and $\lambda^-(E) = -\lambda(E \cap B)$.

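Proof 16.2.2. Let $D$ be a measurable subset of $E \cap A$. Then $\lambda(D) \geq 0$ by the definition of the positive set $A$. Since $\lambda$ is countably additive, we then have

\begin{align*}
\lambda\left( E \cap A \right) &= \lambda\left( (E \cap A) \cap D \right) + \lambda\left( (E \cap A) \cap D^C \right) \\
&= \lambda\left( D \right) + \lambda\left( (E \cap A) \cap D^C \right).
\end{align*}

But the second set is contained in $E \cap A$ and so its $\lambda$ measure is non negative. Hence, we can bound the left hand side from below as

\[ \lambda\left( E \cap A \right) \geq \lambda\left( D \right) \geq 0. \]

Since this is true for all such subsets $D$, the definition of $\lambda^+$ implies $\lambda^+(E \cap A) \leq \lambda(E \cap A)$. Now,

\[ \lambda^+(E) = \lambda^+(E \cap A) + \lambda^+(E \cap B). \]

If $F$ is a measurable subset of $E \cap B$, then by the definition of the negative set $B$, $\lambda(F) \leq 0$ and so $\sup \lambda(F) \leq 0$. Since $\lambda^+$ is non negative, this tells us $\lambda^+(E \cap B) = 0$. Thus, we have established that $\lambda^+(E) = \lambda^+(E \cap A)$ and so $\lambda^+(E) \leq \lambda(E \cap A)$.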

The reverse inequality is easier. Since E ∩ A is a measurable subset of E, the definition of λ+ implies

λ(E ∩A) ≤ λ+(E). Combining these results, we have λ+(E) = λ(E ∩A) as desired.

A similar argument shows that λ−(E) = −λ(E ∩B).
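For a charge given by point weights on a finite set, a Hahn Decomposition can be written down directly: put the points with nonnegative weight in $P$ and the rest in $N$. The sketch below, which assumes a hypothetical four point space and uses our own helper names, compares the brute force values of $\lambda^+$ and $\lambda^-$ from Definition 16.1.2 with the formulas of this lemma.

```python
from itertools import combinations

weight = {"a": 2.0, "b": -3.0, "c": 0.5, "d": -1.0}  # hypothetical point charges
X = frozenset(weight)
P = frozenset(x for x in X if weight[x] >= 0)  # candidate positive set
N = X - P                                      # candidate negative set


def lam(E):
    """lambda(E): the sum of the point charges in E."""
    return sum(weight[x] for x in E)


def all_subsets(E):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(E)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


def lam_plus(E):
    return max(lam(A) for A in all_subsets(E))


def lam_minus(E):
    return -min(lam(A) for A in all_subsets(E))


checks = [
    (lam_plus(E) == lam(E & P), lam_minus(E) == -lam(E & N))
    for E in all_subsets(X)
]
print(all(a and b for a, b in checks))
```

Every subset passes both checks, so on this toy space the supremum defining $\lambda^+(E)$ is attained at $E \cap P$ and the infimum defining $\lambda^-(E)$ at $E \cap N$.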


16.3 The Variation Of A Charge

A charge λ has associated with it a concept that is very similar to that of the variation of a function. We

now define the variation of a charge.

Definition 16.3.1 (The Variation of a Charge).
Let (X,S) be a measure space and λ be a charge on S. For a measurable set E, a mesh in E is a finite collection of disjoint measurable sets inside E, $\{E_1, \ldots, E_n\}$ for some positive integer n. Define the mapping $V_\lambda$ by

\[ V_\lambda(E) = \sup \left\{ \sum_i |\lambda(E_i)| \; \Big| \; \{E_i\} \text{ is a mesh in } E \right\} \]

where we interpret the sum as being over the finite number of sets in the given mesh. We say $V_\lambda(E)$ is the total variation of λ on E and $V_\lambda$ is the total variation of λ.

Theorem 16.3.1 (The Variation of a Charge is a Measure).
Let (X,S) be a measure space and λ be a finite charge on S. Then $V_\lambda$ is a measure on S.

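Proof 16.3.1. Given a measurable set $E$, the Jordan Decomposition of $\lambda$ implies that for a mesh $\{E_1, \ldots, E_n\}$ in $E$, $|\lambda(E_i)| \leq \lambda^+(E_i) + \lambda^-(E_i)$. Hence, since $\lambda^+$ and $\lambda^-$ are measures and countably additive, we have

\begin{align*}
\sum_i |\lambda(E_i)| &\leq \sum_i \lambda^+(E_i) + \sum_i \lambda^-(E_i) \\
&\leq \lambda^+(E) + \lambda^-(E) < \infty
\end{align*}

since $\lambda^+$ and $\lambda^-$ are both finite. We conclude $V_\lambda$ is a finite mapping.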

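Since the only mesh in $\emptyset$ is $\emptyset$ itself, we see $V_\lambda(\emptyset) = 0$. It remains to show countable additivity. Let $(E_n)$ be a countable disjoint family in $S$ and let $E$ be their union. Let $\{A_1, \ldots, A_p\}$ be a mesh in $E$. Then each $A_i$ is inside $E$ and they are pairwise disjoint. Let $A_{in} = A_i \cap E_n$. Note $A_i$ is the disjoint union of the sets $A_{in}$ over $n$. Then it is easy to see $\{A_{1n}, \ldots, A_{pn}\}$ is a mesh in $E_n$. For convenience, call this mesh $M_n$. Then, by the countable additivity of $\lambda$,

\[ \sum_{i=1}^{p} |\lambda(A_i)| = \sum_{i=1}^{p} \Big| \sum_n \lambda(A_{in}) \Big| \leq \sum_{i=1}^{p} \sum_n |\lambda(A_{in})| = \sum_n \left( \sum_{i=1}^{p} |\lambda(A_{in})| \right). \]

The term in parentheses is the sum over the mesh $M_n$ of $E_n$. By definition, this is bounded above by $V_\lambda(E_n)$. Thus, we must have

\[ \sum_{i=1}^{p} |\lambda(A_i)| \leq \sum_n V_\lambda(E_n). \]

Since the mesh in $E$ was arbitrary, this gives $V_\lambda(E) \leq \sum_n V_\lambda(E_n)$.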

To get the other inequality, we apply the Supremum Tolerance Lemma to the definition of $V_\lambda(E_n)$ to find meshes

\[ M_n^\epsilon = \{ A_{1n}^\epsilon, \ldots, A_{p_n n}^\epsilon \}, \]

where $p_n$ is a positive integer, so that

\[ V_\lambda(E_n) < \sum_{i=1}^{p_n} |\lambda(A_{in}^\epsilon)| + \epsilon/2^n. \]

It follows that the union of a finite number of these meshes is a mesh of $E$. For each positive integer $N$, let

\[ M_N = \bigcup_{n=1}^{N} M_n^\epsilon \]

denote this mesh. Then,

\[ \sum_{n=1}^{N} V_\lambda(E_n) < \sum_{n=1}^{N} \left( \sum_{i=1}^{p_n} |\lambda(A_{in}^\epsilon)| + \epsilon/2^n \right). \]

The first double sum corresponds to summing over a mesh of $E$ and so by definition, we have

\[ \sum_{n=1}^{N} V_\lambda(E_n) < V_\lambda(E) + \sum_{n=1}^{N} \epsilon/2^n \leq V_\lambda(E) + \sum_{n=1}^{\infty} \epsilon/2^n = V_\lambda(E) + \epsilon. \]

Since $N$ is arbitrary, we see the sequence of partial sums on the left hand side converges to a finite limit. Thus,

\[ \sum_{n=1}^{\infty} V_\lambda(E_n) \leq V_\lambda(E) + \epsilon. \]

Since $\epsilon$ is arbitrary, the other desired inequality follows.

Theorem 16.3.2 ($V_\lambda = \lambda^+ + \lambda^-$).
Let (X,S) be a measure space and λ be a finite charge on S. Then $V_\lambda = \lambda^+ + \lambda^-$.

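Proof 16.3.2. Choose a measurable set $E$ and let $\epsilon > 0$ be chosen. Then, by the Supremum Tolerance Lemma, there is a mesh $M^\epsilon = \{A_1^\epsilon, \ldots, A_p^\epsilon\}$ so that

\[ V_\lambda(E) - \epsilon < \sum_i |\lambda(A_i^\epsilon)| \leq V_\lambda(E). \]

Let $\mathcal{F}$ be the set of indices $i$ in the mesh above where $\lambda(A_i^\epsilon) \geq 0$ and $\mathcal{G}$ be the other indices where $\lambda(A_i^\epsilon) < 0$. Let $F$ be the union of the sets $A_i^\epsilon$ over the indices in $\mathcal{F}$ and $G$ be the union over the indices in $\mathcal{G}$. Note we have

\begin{align*}
V_\lambda(E) - \epsilon &< \sum_i |\lambda(A_i^\epsilon)| \\
&= \sum_{i \in \mathcal{F}} |\lambda(A_i^\epsilon)| + \sum_{i \in \mathcal{G}} |\lambda(A_i^\epsilon)|.
\end{align*}

Now for $i$ in $\mathcal{F}$,

\begin{align*}
|\lambda(A_i^\epsilon)| &= \lambda^+(A_i^\epsilon) - \lambda^-(A_i^\epsilon) \\
&\leq \lambda^+(A_i^\epsilon),
\end{align*}

and for $i$ in $\mathcal{G}$,

\begin{align*}
|\lambda(A_i^\epsilon)| &= \lambda^-(A_i^\epsilon) - \lambda^+(A_i^\epsilon) \\
&\leq \lambda^-(A_i^\epsilon).
\end{align*}

Thus, we can say

\begin{align*}
V_\lambda(E) - \epsilon &< \sum_{i \in \mathcal{F}} \lambda^+(A_i^\epsilon) + \sum_{i \in \mathcal{G}} \lambda^-(A_i^\epsilon) \\
&= \lambda^+(F) + \lambda^-(G) \\
&\leq \lambda^+(E) + \lambda^-(E).
\end{align*}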

Thus, for all ε > 0, we have

Vλ(E) ≤ λ+(E) + λ−(E) + ε.

This implies

Vλ(E) ≤ λ+(E) + λ−(E).

To prove the reverse, note if A ⊆ E for E ∈ S, then A itself is a mesh (a pretty simple one, of course) and

so |λ(A)| ≤ Vλ(A). Further, λ(E) = λ(A) + λ(E \A). Thus, we have

2 λ(A) ≤ λ(A) + |λ(A)| ≤ λ(E) − λ(E \A) + |λ(A)|

≤ λ(E) + |λ(E \A)| + |λ(A)|

But the collection $\{A, E \setminus A\}$ is a mesh for $E$ and so

\[ 2 \, \lambda(A) \leq \lambda(E) + V_\lambda(E). \]

Next, using the definition of $\lambda^+$, we find

\[ 2 \, \lambda^+(E) \leq \lambda(E) + V_\lambda(E). \]

Finally, using the Jordan Decomposition of $\lambda$, we obtain

\[ 2 \, \lambda^+(E) \leq \lambda^+(E) - \lambda^-(E) + V_\lambda(E). \]

This immediately leads to $\lambda^+(E) + \lambda^-(E) \leq V_\lambda(E)$.
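The identity $V_\lambda = \lambda^+ + \lambda^-$ can also be checked by exhaustive search on a small example. The sketch below assumes a hypothetical four point space with point charges and enumerates every mesh in $E$ by labeling each point with either $0$ (left out of the mesh) or the number of the block it belongs to. The closed form used for $\lambda^+ + \lambda^-$ relies on Lemma 16.2.2 with $P$ the points of nonnegative weight. All names are ours.

```python
from itertools import combinations, product

weight = {"a": 2.0, "b": -3.0, "c": 0.5, "d": -1.0}  # hypothetical point charges


def total_variation(E):
    """V(E): the largest mesh sum of |lambda(E_i)| over all meshes in E."""
    pts = sorted(E)
    best = 0.0
    for labels in product(range(len(pts) + 1), repeat=len(pts)):
        blocks = {}
        for x, lab in zip(pts, labels):
            if lab > 0:  # label 0 means x is left out of the mesh
                blocks[lab] = blocks.get(lab, 0.0) + weight[x]
        best = max(best, sum(abs(v) for v in blocks.values()))
    return best


def plus_part_plus_minus_part(E):
    """lambda^+(E) + lambda^-(E), using lambda^+(E) = lambda(E & P) and lambda^-(E) = -lambda(E & N)."""
    return sum(max(weight[x], 0.0) for x in E) + sum(max(-weight[x], 0.0) for x in E)


X = sorted(weight)
all_sets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
print(total_variation(frozenset(X)), plus_part_plus_minus_part(frozenset(X)))
```

On this example both quantities equal $6.5$ on the whole space, and the two functions agree on every subset; the best mesh simply separates the points of positive weight from those of negative weight.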

16.4 Absolute Continuity Of Charges

Now we are ready to look at absolute continuity in the context of charges.

Definition 16.4.1 (Absolute Continuity Of Charges).
Let (X,S, µ) be a measure space and let λ be a charge on S. Then λ is said to be absolutely continuous with respect to µ if whenever E is a measurable set with µ(E) = 0, then λ(E) = 0 also. We write this as λ ≪ µ. The set of all charges that are absolutely continuous with respect to µ is denoted by AC[µ].

There is an intimate relationship between the absolute continuity of Vλ, λ, λ+ and λ−; essentially, one

implies all the others.

Theorem 16.4.1 (Equivalent Absolute Continuity Conditions For Charges).
Let (X,S, µ) be a measure space and λ be a charge on S. Then for the statements

(1): $\lambda^+$ and $\lambda^-$ are in AC[µ],

(2): $V_\lambda$ is in AC[µ], and

(3): λ is in AC[µ],

we have (1) ⇔ (2) ⇔ (3).

Proof 16.4.1.
(1) → (2): if $\mu(E) = 0$, then $\lambda^+(E)$ and $\lambda^-(E)$ are also zero by assumption. By Theorem 16.3.2, we see $V_\lambda(E) = \lambda^+(E) + \lambda^-(E) = 0$ too. Hence, $V_\lambda$ is in AC[µ].

(2) → (3): if $\mu(E) = 0$, then $V_\lambda(E) = 0$. But, by Theorem 16.3.2, we have both $\lambda^+(E)$ and $\lambda^-(E)$ are zero. Then, applying the Jordan Decomposition again, we have $\lambda(E) = 0$. This tells us $\lambda$ is absolutely continuous with respect to $\mu$.

(3) → (1): Let $(A, B)$ be a Hahn Decomposition of $X$ due to $\lambda$. If $\mu(E) = 0$, then $\mu(E \cap A) = \mu(E \cap B) = 0$ as well. Thus, $\lambda(E \cap A) = \lambda(E \cap B) = 0$ by assumption. By Lemma 16.2.2, we then have that $\lambda^+(E) = \lambda^-(E) = 0$ showing that (1) holds.

There is another characterization of absolute continuity that is useful.

Lemma 16.4.2 (ε − δ Version Of Absolute Continuity Of a Charge).
Let λ be a charge on S. Then

\[ \lambda \ll \mu \; \Leftrightarrow \; \forall \; \epsilon > 0, \; \exists \; \delta > 0 \; \ni \; |\lambda(E)| < \epsilon \text{ for measurable } E \text{ with } \mu(E) < \delta. \]

Proof 16.4.2. (⇒): If λ is absolutely continuous with respect to µ, then by Theorem 16.4.1, Vλ is also in AC[µ]. We will prove this result by contradiction. Assume the desired implication does not hold for Vλ. Then, there is a positive ε so that for all n, there is a measurable set $E_n$ with $\mu(E_n) < 1/2^n$ and $V_\lambda(E_n) \ge \epsilon$. Let $G_n = \bigcup_{k=n}^{\infty} E_k$ and $G = \bigcap_n G_n$. Then,

\[
\mu(G) \le \mu(G_n) \le \sum_{k=n}^{\infty} \mu(E_k) < \sum_{k=n}^{\infty} 1/2^k = 1/2^{n-1}.
\]

Since this holds for all n, this implies µ(G) = 0. Since Vλ is in AC[µ], we then have Vλ(G) = 0. But the sets $G_n$ decrease to G, each $G_n$ contains $E_n$ and Vλ is a finite measure, so

\[
V_\lambda(G) = \lim_n V_\lambda(G_n) \ge \epsilon.
\]

This contradiction implies that our assumption that the right hand side did not hold must be false. Hence, the condition holds for Vλ. Since Vλ = λ+ + λ−, both λ+ and λ− are bounded above by Vλ and so the condition holds for them also. Finally, |λ(E)| ≤ λ+(E) + λ−(E) = Vλ(E), so the condition holds for λ = λ+ − λ−.

(⇐): We assume the condition on the right hand side holds. Now let (A, B) be a Hahn Decomposition for X with respect to λ. If µ(E) = 0, then µ(E ∩ A) = 0 also, so µ(E ∩ A) < δ for every positive δ. The condition then implies |λ(E ∩ A)| < ε for every positive ε. Since the choice of ε is arbitrary, |λ(E ∩ A)| = 0. The absolute values are unnecessary as λ is non negative on A. We conclude λ+(E) = λ(E ∩ A) = 0. A similar argument then shows λ−(E) = −λ(E ∩ B) = 0. This tells us λ(E) = 0 by the Jordan Decomposition.

Lemma 16.4.3 (The Absolute Continuity Of The Integral). Let (X, S, µ) be a measure space and f be a summable function. Define the map λ by $\lambda(E) = \int_E f \, d\mu$ for all measurable E. Then, λ is a charge with

\[
\lambda^+(E) = \int_E f^+ \, d\mu, \qquad \lambda^-(E) = \int_E f^- \, d\mu.
\]

Moreover, if $P_f = \{x \mid f(x) \ge 0\}$ and $N_f = P_f^C$, then $(P_f, N_f)$ is a Hahn Decomposition for X with respect to λ. Finally, since λ ≪ µ, we know for all positive ε, there is a positive δ, so that if E is a measurable set with µ(E) < δ, then

\[
\left| \int_E f \, d\mu \right| < \epsilon.
\]


Proof 16.4.3. It is easy to see that $\nu_1(E) = \int_E f^+ \, d\mu$ and $\nu_2(E) = \int_E f^- \, d\mu$ define measures and that λ = ν1 − ν2. Hence, λ is a charge which is absolutely continuous with respect to µ. It is also easy to see that $(P_f, N_f)$ is a Hahn Decomposition for λ. Now if B is measurable and contained in the measurable set E, we have

\[
\lambda(B) = \int_{B \cap P_f} f^+ \, d\mu - \int_{B \cap N_f} f^- \, d\mu
\le \int_{B \cap P_f} f^+ \, d\mu
\le \int_{E \cap P_f} f^+ \, d\mu.
\]

Next, note that $\int_{E \cap P_f} f^+ \, d\mu = \int_E f^+ \, d\mu$ because the portion of E that lies in $N_f$ does not contribute to the value of the integral. Thus, for any B ⊆ E, we have

\[
\lambda(B) \le \int_E f^+ \, d\mu = \nu_1(E).
\]

The definition of λ+ then implies two things: first, the inequality above tells us λ+(E) ≤ ν1(E) and second, since $E \cap P_f$ is a subset of E, we know $\lambda(E \cap P_f) \le \lambda^+(E)$. However, $\lambda(E \cap P_f) = \nu_1(E)$ and hence, ν1(E) ≤ λ+(E) also. Combining, we have λ+(E) = ν1(E).

A similar argument shows that λ−(E) = ν2(E).

The last statement of the lemma follows immediately from Lemma 16.4.2.
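The identity λ+(E) = ν1(E) can be checked by brute force in a toy setting. The sketch below assumes X is a small finite set, µ is counting measure and f is given by a dictionary `w` of signed point values; all of these names are our own illustrative choices. It computes λ+(E) directly from the definition as the largest value of λ(B) over subsets B of E, and compares it with the sum of f+ over E.

```python
from itertools import combinations

def charge(w, E):
    """lambda(E) = integral of f over E against counting measure."""
    return sum(w[x] for x in E)

def subsets(E):
    """All subsets of the finite set E, including the empty set."""
    items = sorted(E)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

def positive_variation(w, E):
    """lambda^+(E) from the definition: sup of lambda(B) over B contained in E."""
    return max(charge(w, B) for B in subsets(E))

def negative_variation(w, E):
    """lambda^-(E) from the definition: minus the inf of lambda(B) over B in E."""
    return -min(charge(w, B) for B in subsets(E))

w = {1: 2, 2: -3, 3: 5, 4: 0, 5: -1}            # values of f on X = {1,...,5}
f_plus = {x: max(v, 0) for x, v in w.items()}   # positive part of f
f_minus = {x: max(-v, 0) for x, v in w.items()} # negative part of f (non negative)

E = {1, 2, 3, 5}
print(positive_variation(w, E), charge(f_plus, E))
print(negative_variation(w, E), charge(f_minus, E))
```

In this example the supremum is attained on E ∩ Pf and the infimum on E ∩ Nf, which is the content of the Hahn Decomposition statement in the lemma.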

16.5 The Radon - Nikodym Theorem

From our work above, culminating in Lemma 16.4.3, we know that integrals of summable functions define charges which are absolutely continuous with respect to the measure we are using for the integration. The converse of this is that if a charge is absolutely continuous with respect to µ, we can find a summable function so that the charge can be found by integration. That is, if λ ≪ µ, there exists a summable f so that $\lambda(E) = \int_E f \, d\mu$ for all measurable E. This result is called the Radon - Nikodym theorem and as you might expect, its proof requires some complicated technicalities to be addressed. Hence, we begin with a lemma.

Lemma 16.5.1 (Radon - Nikodym Technical Lemma). Let (X, S, µ) be a measure space with µ(X) finite. Let λ be a measure which is finite with λ(X) > 0 and λ ≪ µ. Then there is a positive ε and a measurable set A with µ(A) > 0 so that

\[
\epsilon \, \mu(E \cap A) \le \lambda(E \cap A), \quad \forall \, E \in \mathcal{S}.
\]



Proof 16.5.1. Pick a fixed ε > 0 and assume the set A exists. Let ν = λ − ε µ. Then, ν is a finite charge

also. Note, our assumption tells us that

ν(B) = λ(B) − ε µ(B) ≥ 0,

for all measurable subsets B of A. Hence, by the definition of ν−, we must have that −ν−(A) ≥ 0 or

ν−(A) ≤ 0. But ν− is always non negative. Combining, we have ν−(A) = 0. This gives us some clues

as to how we can find the desired A. Note if (A,B) is a Hahn Decomposition for ν, then we have this

desired inequality, ν−(A) = 0. So, we need to find a positive value of ε∗ so that when (A,B) is a Hahn

Decomposition of

ν∗(A) = λ(A) − ε∗ µ(A),

we find ν∗(A) > 0.

To do this, for ε = 1/n, let $(A_n, B_n)$ be a Hahn Decomposition for $\nu_n = \lambda - (1/n) \mu$. Let $G = \bigcup_n A_n$ and $H = \bigcap_n B_n$. We also know $A_n \cup B_n = X$ and $A_n \cap B_n = \emptyset$ for all n. Further,

\[
H^C = \left( \bigcap_n B_n \right)^C = \bigcup_n B_n^C = \bigcup_n A_n = G.
\]

We conclude X = G ∪ H; it is easy to see G ∩ H = ∅. Now, $H \subseteq B_n$ for all n, so $\nu_n(H) = -\nu_n^-(H) \le 0$ as $B_n$ is a negative set. Hence, we can say

λ(H) − (1/n) µ(H) ≤ 0

which implies λ(H) ≤ (1/n) µ(H) for all n. Since λ is a measure, we then have

0 ≤ λ(H) ≤ µ(H)/n

which implies by the arbitrariness of n that λ(H) = 0. Hence,

λ(X) = λ(G) + λ(H) = λ(G).

Thus, λ(G) > 0 as λ(X) > 0. Since λ ≪ µ, it then follows that µ(G) > 0 also. Since $G = \bigcup_n A_n$, it must be true that there is at least one n with $\mu(A_n) > 0$. Call this index N. Then, $\nu_N(E \cap A_N) \ge 0$ as $A_N$ is a positive set for $\nu_N$. This implies

\[
\lambda(E \cap A_N) - \frac{\mu(E \cap A_N)}{N} \ge 0,
\]

which is the result we seek using $A = A_N$ and ε = 1/N.
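The search for the index N in this proof is easy to imitate on a finite set. The following is only a sketch under our own assumptions: X is finite, both measures are given by dictionaries of exact point weights, and the positive set of νn = λ − (1/n)µ is taken to be the set of points where the weight of νn is non negative. The function name `technical_lemma` and the cutoff `max_n` are hypothetical.

```python
from fractions import Fraction

def technical_lemma(lam, mu, max_n=1000):
    """Return (eps, A) with mu(A) > 0 and eps*mu(E & A) <= lam(E & A) for all E.

    For each n we form the positive set A_n of nu_n = lam - (1/n) mu and stop
    at the first n for which mu(A_n) > 0, as in the proof of the lemma.
    """
    for n in range(1, max_n + 1):
        eps = Fraction(1, n)
        A_n = {x for x in lam if lam[x] - eps * mu[x] >= 0}
        if sum(mu[x] for x in A_n) > 0:
            return eps, A_n
    return None  # not reached when lam(X) > 0 and lam << mu, if max_n is large enough

mu = {"a": Fraction(1), "b": Fraction(1), "c": Fraction(0)}
lam = {"a": Fraction(1, 3), "b": Fraction(1, 10), "c": Fraction(0)}

eps, A = technical_lemma(lam, mu)
print(eps, sorted(A))
```

For n = 1 and n = 2 the positive set is just the µ-null point c, so the search continues; at n = 3 the point a enters and the loop stops with ε = 1/3.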



Theorem 16.5.2 (The Radon - Nikodym Theorem: Finite Charge Case). Let (X, S, µ) be a measure space with µ σ - finite. Let λ be a finite charge with λ ≪ µ. Then, there is a summable function f so that

\[
\lambda(E) = \int_E f \, d\mu
\]

for all measurable E. Moreover, if g is another summable function which satisfies this equality, then f = g µ a.e. The summable function f is called the Radon - Nikodym derivative of λ with respect to µ and is often denoted by the usual derivative symbol: $f = \frac{d\lambda}{d\mu}$. Hence, this equality is often written

\[
\lambda(E) = \int_E \frac{d\lambda}{d\mu} \, d\mu.
\]

Proof 16.5.2. We will do this in three steps.

Step 1: We assume µ(X) is finite and λ is a finite measure.

Step 2: We assume µ is σ - finite and λ is a finite measure.

Step 3: We assume µ is σ - finite and λ is a finite charge.

As is usual, the proof of Step 1 is the hardest.

Proof Step 1: Let

\[
\mathcal{F} = \left\{ f : X \to \Re \;\middle|\; f \ge 0, \ f \text{ summable and } \int_E f \, d\mu \le \lambda(E), \ \forall \, E \in \mathcal{S} \right\}.
\]

Note since $f = 0_X$ is in $\mathcal{F}$, $\mathcal{F}$ is nonempty. From the definition of $\mathcal{F}$, we see $\int_X f \, d\mu \le \lambda(X) < \infty$ for all f in $\mathcal{F}$. Hence,

\[
c = \sup_{f \in \mathcal{F}} \int_X f \, d\mu < \infty.
\]

We will find a particular $f \in \mathcal{F}$ so that $c = \int_X f \, d\mu$. Let $(f_n) \subseteq \mathcal{F}$ be a maximizing sequence: i.e. $\int_X f_n \, d\mu \to c$. We will assume without loss of generality that each $f_n$ is finite everywhere as the set of points where any of them is infinite is a set of measure zero. Now, there are details that should be addressed in that statement, but we have gone through those sort of manipulations many times before. As an exercise, you should go through them again on scratch paper for yourself. With that said, we will define a new sequence of finite functions $(g_n)$ by

\[
g_n = f_1 \vee f_2 \vee \ldots \vee f_n = \max \{ f_1, \ldots, f_n \}.
\]



This is a pointwise operation and it is clear that $(g_n)$ is an increasing sequence of non negative functions. Since $f_1$ and $f_2$ are summable, let A be the set of points where $f_1 > f_2$. Then,

\[
\int_X f_1 \vee f_2 \, d\mu = \int_A f_1 \, d\mu + \int_{A^C} f_2 \, d\mu \le \int_X f_1 \, d\mu + \int_X f_2 \, d\mu.
\]

This tells us $f_1 \vee f_2$ is summable also. A simple induction argument then tells us $g_n$ is summable for all n.

Is $g_n \in \mathcal{F}$? Let E be measurable. Define the measurable sets $E_1, \ldots, E_n$ by

\[
\begin{aligned}
E_1 &= \{ x \mid g_n(x) = f_1(x) \} \cap E, \\
E_2 &= \{ x \mid g_n(x) = f_2(x) \} \cap (E \setminus E_1), \\
&\ \ \vdots \\
E_n &= \{ x \mid g_n(x) = f_n(x) \} \cap \left( E \setminus \cup_{i=1}^{n-1} E_i \right).
\end{aligned}
\]

Then, it is clear $E = \cup_i E_i$, the sets $E_i$ are pairwise disjoint and $g_n(x) = f_i(x)$ on $E_i$. Thus, since each $f_i$ is in $\mathcal{F}$, we have

\[
\int_E g_n \, d\mu = \sum_{i=1}^n \int_{E_i} f_i \, d\mu \le \sum_{i=1}^n \lambda(E_i) = \lambda(\cup_{i=1}^n E_i) = \lambda(E).
\]

We conclude $g_n$ is in $\mathcal{F}$ for all n. Next, if $g = \sup g_n$, then $g_n \uparrow g$ and

\[
\int_E g_n \, d\mu \le \lambda(E) \le \lambda(X)
\]

for all n. Now apply the Monotone Convergence Theorem to see g is summable and

\[
\int_E g_n \, d\mu \to \int_E g \, d\mu \le \lambda(E).
\]

Let's define f by

\[
f(x) = \begin{cases} g(x), & g(x) < \infty, \\ 0, & g(x) = \infty. \end{cases}
\]

Since g is summable, the set of points where it takes on the value ∞ is a set of measure 0. Thus, f = g µ a.e. and f is measurable. It is easy to see f is in $\mathcal{F}$.



Moreover, since $f_n \le g_n$, we have

\[
c = \lim_n \int_X f_n \, d\mu \le \lim_n \int_X g_n \, d\mu \le c
\]

because $g_n \in \mathcal{F}$. Thus,

\[
c = \lim_n \int_X g_n \, d\mu = \int_X g \, d\mu.
\]

This immediately tells us that $\int_X f \, d\mu = c$ with $f \in \mathcal{F}$.

Next, define $m : \mathcal{S} \to \Re$ by

\[
m(E) = \lambda(E) - \int_E f \, d\mu,
\]

for all measurable E. It is straightforward to show m is the difference of two measures and hence is a finite charge. Also, since f is in $\mathcal{F}$, we see m is non negative and thus is a measure. In addition, since λ ≪ µ and the measure defined by $\int_E f \, d\mu$ is also absolutely continuous with respect to µ, we have that m ≪ µ too. Now if m(X) = 0, this would imply, since m(E) ≤ m(X), that

\[
0 \le \lambda(E) - \int_E f \, d\mu \le m(X) = 0.
\]

But this says $\lambda(E) = \int_E f \, d\mu$ for all measurable E which is the result we seek.

Hence, it suffices to show m(X) = 0. We will do this by contradiction. Assume m(X) > 0. Now apply Lemma 16.5.1 to conclude there is a positive ε and measurable set A so that µ(A) > 0 and

\[
\epsilon \, \mu(E \cap A) \le m(E \cap A), \qquad (*)
\]

for all measurable E. Define a new function h using Equation ∗ by $h = f + \epsilon I_A$. Then for a given measurable E, we have

\[
\int_E h \, d\mu = \int_E f \, d\mu + \epsilon \, \mu(E \cap A) \le \int_E f \, d\mu + m(E \cap A)
\]

by Equation ∗. Now replace m by its definition to find

\[
\int_E h \, d\mu \le \int_E f \, d\mu + \lambda(E \cap A) - \int_{E \cap A} f \, d\mu = \int_{E \cap A^C} f \, d\mu + \lambda(E \cap A).
\]

Finally, use the fact that f is in $\mathcal{F}$ to conclude

\[
\int_E h \, d\mu \le \lambda(E \cap A^C) + \lambda(E \cap A) = \lambda(E).
\]

This shows that h is in $\mathcal{F}$. However,

\[
\int_X h \, d\mu = \int_X f \, d\mu + \epsilon \, \mu(A) > c,
\]

which is our contradiction. This completes the proof of Step 1.

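Proof Step 2: Now µ is σ - finite. This means there is a countable sequence of disjoint measurable sets $(X_n)$ with $\mu(X_n)$ finite for each n and we can write $X = \cup_n X_n$. Let $\mathcal{S}_n$ be the σ - algebra of subsets of $X_n$ given by $\mathcal{S} \cap X_n$. By Step 1, applied to the restrictions of µ and λ to $\mathcal{S}_n$, there are summable non negative functions $f_n$ so that

\[
\lambda(F) = \int_F f_n \, d\mu,
\]

for each F in $\mathcal{S}_n$. Now define f by $f(x) = f_n(x)$ when $x \in X_n$. This is a well - defined function and it is easy to see f is measurable. We also know by Theorem 9.4.5 that $\mu_f$ defined by $\mu_f(E) = \int_E f \, d\mu$ is a measure. Now if E is measurable, then $E = \cup_n (E \cap X_n)$ and

\[
\mu_f(E) = \int_E f \, d\mu = \int_{\cup_n (E \cap X_n)} f \, d\mu = \lim_n \int_{\cup_{i=1}^n (E \cap X_i)} f \, d\mu.
\]

Then, for any n,

\[
\int_{\cup_{i=1}^n (E \cap X_i)} f \, d\mu = \sum_{i=1}^n \int_{E \cap X_i} f \, d\mu = \sum_{i=1}^n \int_{E \cap X_i} f_i \, d\mu
= \sum_{i=1}^n \lambda(E \cap X_i) = \lambda\left( \cup_{i=1}^n (E \cap X_i) \right) \le \lambda(E),
\]

which is a finite number. Hence, the series of non negative terms $\sum_n \int_{E \cap X_n} f \, d\mu$ converges to a finite number and

\[
\int_E f \, d\mu = \sum_n \int_{E \cap X_n} f_n \, d\mu = \lambda\left( \cup_n (E \cap X_n) \right) = \lambda(E).
\]

Taking E = X shows f is summable. This establishes the result for Step 2.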



Proof Step 3: Here, we have µ is σ - finite and λ is a finite charge. By the Jordan Decomposition of λ, we

can write

λ(E) = λ+(E) − λ−(E),

for all measurable E. Now apply Step 2 to find non negative summable functions f+ and f− so that

λ+(E) =∫Ef+ d µ,

λ−(E) =∫Ef− d µ.

Let f = f+ − f− and we are done with the proof of Step 3.

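Finally, the Radon - Nikodym derivative of λ with respect to µ is unique up to redefinition on a set of µ measure 0. Indeed, if f and g are summable with $\int_E f \, d\mu = \lambda(E) = \int_E g \, d\mu$ for all measurable E, then $\int_E (f - g) \, d\mu = 0$ for all measurable E, which forces f = g µ a.e.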
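On a finite set the Radon - Nikodym derivative can be written down explicitly, which makes a useful sanity check for the theorem. The sketch below assumes X is finite, every subset is measurable and both µ and the charge λ are given by dictionaries of exact point weights; the names `rn_derivative` and `integrate` are our own. The density is the ratio of point weights where µ is positive, and is set to 0 on µ-null points, where any value would do.

```python
from fractions import Fraction
from itertools import combinations

def rn_derivative(lam, mu):
    """f(x) = lam({x}) / mu({x}) where mu({x}) > 0, and 0 on mu-null points."""
    return {x: (lam[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}

def integrate(f, mu, E):
    """Integral of f over E with respect to the discrete measure mu."""
    return sum(f[x] * mu[x] for x in E)

mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(0), "d": Fraction(2)}
lam = {"a": Fraction(3), "b": Fraction(-1), "c": Fraction(0), "d": Fraction(1, 2)}  # lam << mu

f = rn_derivative(lam, mu)

# Compare lam(E) with the integral of f over E for every subset E of X.
points = sorted(mu)
all_match = all(
    integrate(f, mu, E) == sum(lam[x] for x in E)
    for r in range(len(points) + 1)
    for E in combinations(points, r)
)
print(f, all_match)
```

Changing the value of f at the µ-null point c does not affect any of the integrals, which is the discrete form of uniqueness up to a set of µ measure 0.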

We can also prove the Radon - Nikodym theorem for the case that λ is a sigma-finite charge. Of course, this includes the case where λ is a charge with values in the extended reals. Recall by Comment 16.1.1 that in this case, the charge λ can take on only one of the values ∞ or −∞ as it must be additive.

Theorem 16.5.3 (The Radon - Nikodym Theorem: Sigma-Finite Charge Case). Let (X, S, µ) be a measure space with µ σ - finite. Let λ be a sigma-finite charge with λ ≪ µ. Then, there is a function f so that

\[
\lambda(E) = \int_E f \, d\mu
\]

for all measurable E. Moreover, if g is another function which satisfies this equality, then f = g µ a.e. The function f is called the Radon - Nikodym derivative of λ with respect to µ and is often denoted by the usual derivative symbol: $f = \frac{d\lambda}{d\mu}$. Hence, this equality is often written

\[
\lambda(E) = \int_E \frac{d\lambda}{d\mu} \, d\mu.
\]

Proof 16.5.3. We can still do this in three steps.

Step 1: We assume µ and λ are finite measures.

Step 2: We assume µ and λ are sigma-finite measures.

Step 3: We assume µ is σ - finite and λ is a sigma-finite charge.

The proof of Step 1 is identical to the one presented in Theorem 16.5.2. The proof of Step 2 is also quite similar. The only difference is that we choose a disjoint countable collection of sets $X_n$ for which both µ and λ are finite.

Proof Step 2: Since µ and λ are σ - finite, there is a countable sequence of disjoint measurable sets $(X_n)$ with $\mu(X_n)$ and $\lambda(X_n)$ finite for each n and $X = \cup_n X_n$. Let $\mathcal{S}_n$ be the σ - algebra of subsets of $X_n$ given by $\mathcal{S} \cap X_n$. By Step 1, there are summable non negative functions $f_n$ so that

\[
\lambda(F) = \int_F f_n \, d\mu,
\]

for each F in $\mathcal{S}_n$. Now define f by $f(x) = f_n(x)$ when $x \in X_n$. This is a well - defined function and it is easy to see f is measurable. If E is measurable, then $E = \cup_n (E \cap X_n)$ and

\[
\int_E f \, d\mu = \int_{\cup_n (E \cap X_n)} f \, d\mu.
\]

Then, for any n,

\[
\int_{\cup_{i=1}^n (E \cap X_i)} f \, d\mu = \sum_{i=1}^n \int_{E \cap X_i} f \, d\mu = \sum_{i=1}^n \int_{E \cap X_i} f_i \, d\mu
= \sum_{i=1}^n \lambda(E \cap X_i) = \lambda\left( \cup_{i=1}^n (E \cap X_i) \right) \le \lambda(E).
\]

Now, since λ is sigma-finite, it is possible for this value to be ∞. However, whether it is a finite number or infinite in value, the partial sums defined on the left hand side are clearly bounded above by this number. Hence, the series of non negative terms $\sum_n \int_{E \cap X_n} f \, d\mu$ converges, possibly to ∞, and

\[
\int_E f \, d\mu = \sum_n \int_{E \cap X_n} f_n \, d\mu = \lambda\left( \cup_n (E \cap X_n) \right) = \lambda(E).
\]

We can no longer say the function f is summable, of course. This establishes the result for Step 2.

Proof Step 3: Here, we have µ is σ - finite and λ is a sigma-finite charge. From Theorem 16.1.2, we may

assume λ has a Jordan Decomposition λ = λ+ − λ− with λ+ a finite measure. Thus, we can write

λ(E) = λ+(E) − λ−(E),

for all measurable E. Now apply Step 2 to find non negative functions f+ and f− so that

λ+(E) =∫Ef+ d µ,

λ−(E) =∫Ef− d µ.

Since λ+ is assumed finite, it is clear f+ is actually summable and so the difference f+ − f− is well defined in all cases. We let f = f+ − f− and we are done with the proof of Step 3.

16.6 The Lebesgue Decomposition of a Measure

Definition 16.6.1 (Singular Measures). Let (X, S, µ) be a measure space and let λ be a charge on S. Assume there is a decomposition of X into disjoint measurable subsets U and V (X = U ∪ V and U ∩ V = ∅) so that µ(U) = 0 and λ(E ∩ V) = 0 for all measurable sets E. In this case, we say λ is singular with respect to µ, or perpendicular to µ, and write λ ⊥ µ.

Comment 16.6.1. If λ ⊥ µ, let (U, V) be a decomposition of X associated with the singular measure λ. We then know that µ(U) = 0 and λ(E ∩ V) = 0 for all measurable E. Note, if E is measurable, then

\[
E = (E \cap U) \cup (E \cap V).
\]

Thus,

\[
\lambda(E) = \lambda(E \cap U) + \lambda(E \cap V) = \lambda(E \cap U).
\]

Further,

\[
\mu(E) = \mu(E \cap U) + \mu(E \cap V) = \mu(E \cap V).
\]

Comment 16.6.2. If λ ⊥ µ with λ ≠ 0, then there is a measurable set E so that λ(E ∩ U) ≠ 0. But for this same set µ(E ∩ U) = 0 as E ∩ U is a subset of U. Thus, λ is not absolutely continuous with respect to µ.

Comment 16.6.3. If λ ⊥ µ and λ ≪ µ, then for any measurable set E, we have λ(E) = λ(E ∩ U). But, since µ(E ∩ U) = 0, we must have λ(E ∩ U) = 0 because λ ≪ µ. Thus, λ = 0.

Comment 16.6.4. It is easy to prove that λ ⊥ µ implies Vλ ⊥ µ, λ+ ⊥ µ and λ− ⊥ µ. Also, if λ+ ⊥ µ

and λ− ⊥ µ, this implies λ ⊥ µ.

Theorem 16.6.1 (Lebesgue Decomposition Theorem). Let (X, S, µ) be a σ - finite measure space. Let λ be a finite charge on S. Then, there are two unique finite charges, λac ≪ µ and λp ⊥ µ, such that λ = λac + λp.

Proof 16.6.1. We will prove this result in four steps.

Step 1: λ and µ are finite measures.



Step 2: µ is a σ - finite measure and λ is a finite measure.

Step 3: µ is a σ - finite measure and λ is a finite charge.

Step 4: The decomposition is unique.

Proof Step 1: As is usual, this is the most difficult step. We can see, in this case, that λ + µ is a measure. Note that (λ + µ)(E) = 0 implies that λ(E) is 0 too. Hence, λ ≪ (λ + µ). By the Radon - Nikodym Theorem, there is then a non negative (λ + µ) summable f so that for any measurable E,

\[
\lambda(E) = \int_E f \, d(\lambda + \mu).
\]

Hence, f is µ and λ summable as well and

\[
\lambda(E) = \int_E f \, d\lambda + \int_E f \, d\mu.
\]

Let

\[
A_1 = \{ x \mid f(x) = 1 \}, \quad A_2 = \{ x \mid f(x) > 1 \}, \quad \text{and} \quad B = \{ x \mid f(x) < 1 \}.
\]

Also, for each n, let

\[
E_n = \{ x \mid f(x) \ge 1 + 1/n \}.
\]

Then, we see immediately $A_2 = \cup_n E_n$ and $X = A_1 \cup A_2 \cup B$. Now, we also have

\[
\lambda(E_n) = \int_{E_n} f \, d(\lambda + \mu) \ge (1 + 1/n) \left( \lambda(E_n) + \mu(E_n) \right).
\]

This implies $\lambda(E_n) \ge (1 + 1/n) \lambda(E_n)$ which tells us $\lambda(E_n) \le 0$. But since λ is a measure, this forces $\lambda(E_n) = 0$. From the same inequality, we also have $\lambda(E_n) \ge \lambda(E_n) + \mu(E_n)$, which forces $\mu(E_n) = 0$ too.

Next, note the sequence of sets $(E_n)$ increases to $A_2$ and so

\[
\lim_n \mu(E_n) = \mu(A_2), \qquad \lim_n \lambda(E_n) = \lambda(A_2).
\]

Since $\mu(E_n) = \lambda(E_n) = 0$ for all n, we conclude $\mu(A_2) = \lambda(A_2) = 0$.

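Also,

\[
\lambda(A_1) = \int_{A_1} f \, d(\lambda + \mu) = \int_{A_1} 1 \, d(\lambda + \mu) = \mu(A_1) + \lambda(A_1),
\]

which implies $\mu(A_1) = 0$. Let $A = A_1 \cup A_2$. Then, the above remarks imply µ(A) = 0. We now suspect that A and B will give us the decomposition of X which will allow us to construct the measures λac ≪ µ and λp ⊥ µ. Define λac and λp by

\[
\lambda_{ac}(E) = \lambda(E \cap B), \qquad \lambda_p(E) = \lambda(E \cap A).
\]

Then,

\[
\lambda(E) = \lambda(E \cap A) + \lambda(E \cap B) = \lambda_{ac}(E) + \lambda_p(E),
\]

showing us that we have found a decomposition of λ into two measures.

Is λac ≪ µ? Let µ(E) = 0. Then µ(E ∩ B) = 0 as well. Now, we know

\[
\lambda(E \cap B) = \int_{E \cap B} f \, d(\lambda + \mu) = \int_{E \cap B} f \, d\lambda + \int_{E \cap B} f \, d\mu.
\]

However, the second integral must be zero since µ(E ∩ B) = 0. Thus, we have

\[
\lambda(E \cap B) = \int_{E \cap B} f \, d\lambda.
\]

We also have $\lambda(E \cap B) = \int_{E \cap B} 1 \, d\lambda$ and so

\[
\int_{E \cap B} 1 \, d\lambda = \int_{E \cap B} f \, d\lambda.
\]

Thus,

\[
\int_{E \cap B} (1 - f) \, d\lambda = 0.
\]

But on E ∩ B, 1 − f > 0; hence, we must have λ(E ∩ B) = 0. This means λac(E) = 0 implying λac ≪ µ.

Is λp ⊥ µ? Note, for any measurable E, we have

\[
\lambda_p(E \cap B) = \lambda\left( (E \cap B) \cap A \right) = \lambda(\emptyset) = 0.
\]

Since µ(A) = 0 also, λp ⊥ µ. In fact, we have shown

\[
\lambda(E) = \int_{E \cap B} f \, d(\lambda + \mu) + \lambda_p(E).
\]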

Proof Step 2:

Note that once we find a decomposition X = A ∪ B with A and B measurable and disjoint satisfying

µ(A) = 0 and λ(E ∩ B) = 0 if µ(E) = 0, then we can use the technique in the proof of Step 1. We let

λac(E) = λ(E ∩B) and λp(E) = λ(E ∩A). This furnishes the decomposition we seek. Hence, we must

find a suitable A and B.

The measure µ is now σ - finite. Hence, there is a sequence of disjoint measurable sets $X_n$ with $\mu(X_n) < \infty$ and $X = \cup_n X_n$. Let $\mathcal{S}_n$ denote the σ - algebra of subsets $\mathcal{S} \cap X_n$. By Step 1, there is a decomposition $X_n = A_n \cup B_n$ of disjoint and measurable sets so that $\mu(A_n) = 0$ and $\lambda(E \cap B_n) = 0$ if µ(E) = 0. Since the sets $X_n$ are mutually disjoint, we know the sequences $(A_n)$ and $(B_n)$ are disjoint also. Let $A = \cup_n A_n$ and $B = A^C$ and note $B = \cup_n B_n$. Then, since µ is a measure, we have

\[
\mu(\cup_{i=1}^n A_i) = \sum_{i=1}^n \mu(A_i) = 0
\]

for all n. Hence,

\[
\mu(A) = \lim_n \mu(\cup_{i=1}^n A_i) = 0.
\]

Next, if µ(E) = 0, then $\lambda(E \cap B_n) = 0$ for all n by the properties of the decomposition $(A_n, B_n)$ of $X_n$. Since

\[
E \cap B = \cup_n (E \cap B_n)
\]

is a disjoint union and λ is a measure, countable additivity gives

\[
\lambda(E \cap B) = \sum_n \lambda(E \cap B_n) = 0.
\]

Thus, we conclude λ(E ∩ B) = 0 whenever µ(E) = 0. We then have the A and B we need to construct the decomposition.


Proof Step 3: The mapping λ is now a finite charge. Let λ = λ+ − λ− be the Jordan Decomposition of

the charge λ. Applying Step 2, we see there are pairs of measurable sets (A1, B1) and (A2, B2) so that

X = A1 ∪ B1, A1 ∩ B1 = ∅, µ(A1) = 0, µ(E) = 0 ⇒ λ+(E ∩ B1) = 0,

and

X = A2 ∪ B2, A2 ∩ B2 = ∅, µ(A2) = 0, µ(E) = 0 ⇒ λ−(E ∩ B2) = 0.

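Let $A = A_1 \cup A_2$ and $B = B_1 \cap B_2$. Note $B^C = A$. It is clear then that µ(A) = 0. Finally, if µ(E) = 0, then $\lambda^+(E \cap B_1) = 0$ and $\lambda^-(E \cap B_2) = 0$. This tells us

\[
\begin{aligned}
\lambda(E \cap B) &= \lambda^+(E \cap B) - \lambda^-(E \cap B) \\
&= \lambda^+(E \cap B_1 \cap B_2) - \lambda^-(E \cap B_1 \cap B_2) \\
&= \lambda^+\left( (E \cap B_1) \cap B_2 \right) - \lambda^-\left( (E \cap B_2) \cap B_1 \right).
\end{aligned}
\]

Both of the terms on the right hand side are then zero because we are computing measures of subsets of a set of measure 0. We conclude λ(E ∩ B) = 0. The decomposition is then

\[
\lambda_{ac}(E) = \lambda(E \cap B) = (\lambda^+ - \lambda^-)(E \cap B), \qquad
\lambda_p(E) = \lambda(E \cap A) = (\lambda^+ - \lambda^-)(E \cap A).
\]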

Proof Step 4: To see this decomposition is unique, assume λ = λ1 + λ2 and λ = λac + λp are two Lebesgue decompositions of λ. Then, λac − λ1 = λ2 − λp. But since λ1 and λac are both absolutely continuous with respect to µ, it follows that λac − λ1 ≪ µ also. Further, since both λ2 and λp are singular with respect to µ, we see λ2 − λp ⊥ µ. However, λac − λ1 = λ2 − λp by assumption and so λac − λ1 ≪ µ and λac − λ1 ⊥ µ. By Comment 16.6.3, this tells us λac = λ1. This then implies λ2 = λp.
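The construction λac(E) = λ(E ∩ B), λp(E) = λ(E ∩ A) can be seen in miniature on a finite set. In the sketch below, which rests on our own assumptions (finite X, all subsets measurable, exact point weights stored in dictionaries, hypothetical function name `lebesgue_decomposition`), the µ-null set A is simply the set of points of µ-weight 0 and B is its complement.

```python
from fractions import Fraction

def lebesgue_decomposition(lam, mu):
    """Split the point weights of lam into an absolutely continuous and a singular part.

    A = {x : mu({x}) = 0} is mu-null and carries lam_p; B = A^C carries lam_ac.
    """
    zero = Fraction(0)
    lam_ac = {x: (lam[x] if mu[x] != 0 else zero) for x in lam}
    lam_p = {x: (lam[x] if mu[x] == 0 else zero) for x in lam}
    return lam_ac, lam_p

mu = {"a": Fraction(1), "b": Fraction(1, 2), "c": Fraction(0), "d": Fraction(0)}
lam = {"a": Fraction(2), "b": Fraction(-1), "c": Fraction(5), "d": Fraction(-3)}

lam_ac, lam_p = lebesgue_decomposition(lam, mu)
print(lam_ac)
print(lam_p)
```

Here λac vanishes on every µ-null point, λp lives entirely on the µ-null set {c, d}, and the two parts add back to λ pointwise.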

16.7 Homework

Exercise 16.7.1. Let (X, S) be a measurable space and let λ be a charge on S. Prove if P1 and P2 are positive sets for λ, then P1 ∪ P2 is also a positive set for λ.

Exercise 16.7.2. Let $g_1(x) = 2x$, $g_2(x) = I_{[0,\infty)}(x)$, $g_3(x) = x \, I_{[0,\infty)}(x)$ and $g_4(x) = \arctan(x)$. All of these functions generate Borel - Stieltjes measures on ℜ.

(i): Determine which are absolutely continuous with respect to Borel measure. Then, if absolutely continuous with respect to Borel measure, find their Radon - Nikodym derivative.

(ii): Which of these measures are singular with respect to Borel measure?



Exercise 16.7.3. Let λ and µ be σ - finite measures on S, a σ - algebra of subsets of a set X. Assume λ is absolutely continuous with respect to µ. If $g \in M^+(X, \mathcal{S})$, prove that

\[
\int g \, d\lambda = \int g f \, d\mu
\]

where f = dλ/dµ is the Radon - Nikodym derivative of λ with respect to µ.

Exercise 16.7.4. Let λ, ν and µ be σ - finite measures on S, a σ - algebra of subsets of a set X. Use the previous exercise to show that if ν ≪ λ and λ ≪ µ, then

\[
\frac{d\nu}{d\mu} = \frac{d\nu}{d\lambda} \, \frac{d\lambda}{d\mu}, \quad \mu \text{ a.e.}
\]

Further, if λ1 and λ2 are absolutely continuous with respect to µ, then

\[
\frac{d(\lambda_1 + \lambda_2)}{d\mu} = \frac{d\lambda_1}{d\mu} + \frac{d\lambda_2}{d\mu}, \quad \mu \text{ a.e.}
\]

Exercise 16.7.5. Prove the results of Comment 16.6.4.


Chapter 17

Connections To Riemann Integration

Theorem 17.0.1 (Every Riemann Integrable Function Is Lebesgue Integrable and The Two Integrals Coincide). Let f : [a, b] → ℜ be a Riemann integrable function. If we let $\int f \, d\mu = \int_{[a,b]} f \, d\mu$ represent the Lebesgue integral of f with µ denoting Lebesgue measure on the real line and $\int_a^b f(x) \, dx$ be the Riemann integral of f on [a, b], then

\[
\int_{[a,b]} f \, d\mu = \int_a^b f(x) \, dx.
\]

Proof 17.0.1. For each positive integer n, let $\pi_n$ denote the uniform partition of [a, b] which divides the interval into $2^n$ pieces of length $(b-a)/2^n$. Hence, $\pi_n = \{ x_i = a + i \, \frac{b-a}{2^n} : 0 \le i \le 2^n \}$. Let $m_i$ and $M_i$ be defined as usual in Chapter 4. Define the simple functions $\phi_n$ and $\psi_n$ as follows:

\[
\phi_n(x) = \sum_{i=0}^{2^n - 1} m_i \, I_{[x_i, x_{i+1})}(x), \qquad
\psi_n(x) = \sum_{i=0}^{2^n - 1} M_i \, I_{[x_i, x_{i+1})}(x).
\]

Then $\phi_n \le \phi_{n+1} \le f$ and $\psi_n \ge \psi_{n+1} \ge f$ for all n. Further, these monotonic limits define measurable functions g and h so that $\phi_n \uparrow g$ and $\psi_n \downarrow h$ pointwise on [a, b]. Finally, it follows that g ≤ f ≤ h on [a, b].

Next, note that

\[
\int \phi_n \, d\mu = \sum_{i=0}^{2^n - 1} m_i \, \mu([x_i, x_{i+1})) = \sum_{i=0}^{2^n - 1} m_i (x_{i+1} - x_i) = L(f, \pi_n),
\]

\[
\int \psi_n \, d\mu = \sum_{i=0}^{2^n - 1} M_i \, \mu([x_i, x_{i+1})) = \sum_{i=0}^{2^n - 1} M_i (x_{i+1} - x_i) = U(f, \pi_n).
\]

Since f is Riemann integrable by assumption, we know $L(f, \pi_n) \to \int_a^b f(x) \, dx$ and $U(f, \pi_n) \to \int_a^b f(x) \, dx$. Hence, $\lim_n \int \phi_n \, d\mu \le \int_a^b f(x) \, dx$. It follows from Levi's Theorem 9.5.4 that g is summable and $\int \phi_n \, d\mu \uparrow \int g \, d\mu$. We can apply Levi's Theorem again to $-\psi_n$ to conclude h is summable and $\int \psi_n \, d\mu \downarrow \int h \, d\mu$.

However, we also know that

\[
\int (h - g) \, d\mu = \lim_n \int (\psi_n - \phi_n) \, d\mu = \lim_n \left( U(f, \pi_n) - L(f, \pi_n) \right) = 0.
\]

We conclude h = g a.e. Since g ≤ f ≤ h, this tells us f = g = h a.e. Since Lebesgue measure is complete and g and h are measurable, we now know f is measurable and summable with

\[
\int f \, d\mu = \lim_n \int \phi_n \, d\mu = \lim_n L(f, \pi_n) = \int_a^b f(x) \, dx.
\]
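The squeeze between the integrals of $\phi_n$ and $\psi_n$ can be watched numerically. The sketch below is restricted, by our own choice, to a nondecreasing f, so that the infimum and supremum on each piece are the values at the left and right endpoints; the function name `dyadic_sums` is hypothetical. Exact rational arithmetic avoids any rounding questions.

```python
from fractions import Fraction

def dyadic_sums(f, a, b, n):
    """Lower and upper sums L(f, pi_n), U(f, pi_n) for a nondecreasing f.

    pi_n is the uniform partition of [a, b] into 2**n pieces. For nondecreasing
    f, m_i = f(x_i) and M_i = f(x_{i+1}) on the piece [x_i, x_{i+1}].
    """
    N = 2 ** n
    h = (Fraction(b) - Fraction(a)) / N
    xs = [Fraction(a) + i * h for i in range(N + 1)]
    lower = sum(f(xs[i]) * h for i in range(N))
    upper = sum(f(xs[i + 1]) * h for i in range(N))
    return lower, upper

square = lambda x: x * x          # nondecreasing on [0, 1]; its integral there is 1/3

for n in range(1, 6):
    lower, upper = dyadic_sums(square, 0, 1, n)
    print(n, lower, upper, upper - lower)
```

For this f the gap U − L equals (f(1) − f(0)) (b − a) / 2^n, so it halves at each refinement while the lower sums increase and the upper sums decrease toward 1/3.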

We can prove a connection between Riemann-Stieltjes integrals and Lebesgue-Stieltjes integrals also.


Theorem 17.0.2 (Riemann-Stieltjes Integrable Functions Are Lebesgue-Stieltjes Integrable and The Two Integrals Coincide: One). Let f : [a, b] → ℜ and g : [a, b] → ℜ be bounded functions. Assume g is monotone increasing and continuous from the right on [a, b]. Assume further that f is Riemann-Stieltjes integrable with respect to the integrator g on [a, b]. Let $\mu_g$ denote the Lebesgue-Stieltjes measure induced by g on [a, b]. If we let $\int_{[a,b]} f \, d\mu_g$ represent the Lebesgue-Stieltjes integral of f and $\int_a^b f(x) \, dg$ be the Riemann-Stieltjes integral of f with respect to g on [a, b], then

\[
\int_{[a,b]} f \, d\mu_g = \int_a^b f(x) \, dg.
\]

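Proof 17.0.2. For each positive integer n, let $\pi_n$ denote the uniform partition of [a, b] which divides the interval into $2^n$ pieces of length $(b-a)/2^n$. Hence, $\pi_n = \{ x_i = a + i \, \frac{b-a}{2^n} : 0 \le i \le 2^n \}$. Let $m_i$ and $M_i$ be defined as usual in Chapter 4. Define the simple functions $\phi_n$ and $\psi_n$ as follows:

\[
\phi_n(x) = \sum_{i=0}^{2^n - 1} m_i \, I_{[x_i, x_{i+1})}(x), \qquad
\psi_n(x) = \sum_{i=0}^{2^n - 1} M_i \, I_{[x_i, x_{i+1})}(x).
\]

Then $\phi_n \le \phi_{n+1} \le f$ and $\psi_n \ge \psi_{n+1} \ge f$ for all n. Further, these monotonic limits define measurable functions u and v so that $\phi_n \uparrow u$ and $\psi_n \downarrow v$ pointwise on [a, b]. Finally, it follows that u ≤ f ≤ v on [a, b].

Next, note that

\[
\int \phi_n \, d\mu_g = \sum_{i=0}^{2^n - 1} m_i \, \mu_g([x_i, x_{i+1})) = \sum_{i=0}^{2^n - 1} m_i \left( g(x_{i+1}) - g(x_i) \right) = L(f, g, \pi_n),
\]

\[
\int \psi_n \, d\mu_g = \sum_{i=0}^{2^n - 1} M_i \, \mu_g([x_i, x_{i+1})) = \sum_{i=0}^{2^n - 1} M_i \left( g(x_{i+1}) - g(x_i) \right) = U(f, g, \pi_n).
\]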


Since f is Riemann-Stieltjes integrable with respect to g by assumption, we know $L(f, g, \pi_n) \to \int_a^b f(x) \, dg$ and $U(f, g, \pi_n) \to \int_a^b f(x) \, dg$. Hence, $\lim_n \int \phi_n \, d\mu_g \le \int_a^b f(x) \, dg$. It follows from Levi's Theorem 9.5.4 that u is summable and $\int \phi_n \, d\mu_g \uparrow \int u \, d\mu_g$. We can apply Levi's Theorem again to $-\psi_n$ to conclude v is summable and $\int \psi_n \, d\mu_g \downarrow \int v \, d\mu_g$.

However, we also know that

\[
\int (v - u) \, d\mu_g = \lim_n \int (\psi_n - \phi_n) \, d\mu_g = \lim_n \left( U(f, g, \pi_n) - L(f, g, \pi_n) \right) = 0.
\]

We conclude u = v $\mu_g$ a.e. Since u ≤ f ≤ v, this tells us u = f = v $\mu_g$ a.e. Since the Lebesgue-Stieltjes measure $\mu_g$ is complete and u and v are measurable, we now know f is measurable and summable with

\[
\int f \, d\mu_g = \lim_n \int \phi_n \, d\mu_g = \lim_n L(f, g, \pi_n) = \int_a^b f(x) \, dg.
\]

Comment 17.0.1. It is easy to see this theorem extends to Riemann-Stieltjes integrators g that are of bounded variation. If g is of bounded variation and continuous from the right, then we can write g = u − v where both u and v are increasing and right continuous. The function u determines a Lebesgue-Stieltjes measure $\mu_u$; the function v, the Lebesgue-Stieltjes measure $\mu_v$ and finally g defines the charge $\mu_g = \mu_u - \mu_v$. Then we see $\int f \, d\mu_g = \int_a^b f \, dg$.


Chapter 18

Differentiation

We will discuss the kinds of properties a function f needs to have so that we have a Fundamental Theorem of Calculus type result: $f(b) - f(a) = \int_a^b f' \, d\mu$ in our setting of measures on the real line.

18.1 Absolutely Continuous Functions

Definition 18.1.1 (Absolute Continuity Of Functions). Let [a, b] be a finite interval in ℜ. A function f : [a, b] → ℜ is absolutely continuous if for each ε > 0, there is a δ > 0 such that if $\{[a_k, b_k]\}$ is any finite or countable collection of non overlapping closed intervals in [a, b] with $\sum_k (b_k - a_k) < \delta$ then $\sum_k |f(b_k) - f(a_k)| < \epsilon$.
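A standard example that is not taken from this text, the Cantor function, shows how the definition can fail for a continuous nondecreasing function. The sketch below evaluates the Cantor function exactly at endpoints of the intervals kept at stage n of the usual middle-thirds construction; the names `cantor` and `stage_intervals` and the iteration cap are our own. The $2^n$ intervals at stage n have total length $(2/3)^n$, yet the sum of the increments of the function over them is always 1, so no δ can work for ε = 1.

```python
from fractions import Fraction

def cantor(x, max_digits=200):
    """Cantor function at a rational x in [0, 1] with a terminating ternary expansion."""
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        if x == 0:
            break
        x *= 3
        digit = x.numerator // x.denominator   # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            return value + scale               # first digit 1 fixes the value
        if digit == 2:
            value += scale
        scale /= 2
    return value

def stage_intervals(n):
    """The 2**n closed intervals remaining after n middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt += [(a, a + third), (b - third, b)]
        intervals = nxt
    return intervals

for n in range(1, 6):
    ivs = stage_intervals(n)
    print(n, sum(b - a for a, b in ivs), sum(cantor(b) - cantor(a) for a, b in ivs))
```

The first printed column of sums shrinks geometrically while the second stays equal to 1.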

Theorem 18.1.1 (Properties Of Absolutely Continuous Functions). Let f be absolutely continuous on the finite interval [a, b]. Then

1. f is continuous on [a, b],

2. f is of bounded variation on [a, b],

3. If E is a subset of [a, b] with Lebesgue measure 0, then f(E) is Lebesgue measurable also with measure 0; i.e. µ(E) = 0 ⇒ µ(f(E)) = 0.

Proof 18.1.1. Condition (1) is clear. To prove (2), choose a δ > 0 so that if $\{[a_k, b_k]\}$ is any collection of non overlapping closed intervals in [a, b] with $\sum_k (b_k - a_k) < \delta$ then $\sum_k |f(b_k) - f(a_k)| < 1$. If [c, d] is any interval in [a, b] with d − c < δ, then any non overlapping sequence of intervals $([\alpha_n, \beta_n])$ inside [c, d] has summed length less than δ, and so we must have $\sum_n |f(\beta_n) - f(\alpha_n)| < 1$. It then follows that V(f, c, d) ≤ 1.

Now choose the integer N so that $N > \frac{b-a}{\delta}$. Partition [a, b] into N intervals $I_1$ to $I_N$ of equal length (b − a)/N < δ. The variation of f on each $I_k$ is at most 1; hence, V(f, a, b) ≤ N < ∞. This proves condition (2).

To prove (3), let ε > 0 be given. Choose δ > 0 so that if $\{[c_k, d_k]\}$ is any collection of non overlapping closed intervals in [a, b] with $\sum_k (d_k - c_k) < \delta$ then $\sum_k |f(d_k) - f(c_k)| < \epsilon$. By the infimum tolerance lemma, there is an open set G containing E so that µ(G) < µ(E) + δ = δ. Since G is open, we also know we can write G as a finite or countable union of disjoint open intervals $(a_k, b_k)$; i.e. $G = \cup_k (a_k, b_k)$. Then

\[
f(E) \subseteq f(G) \subseteq f\left( \cup_k [a_k, b_k] \right) \subseteq \bigcup_k [f(u_k), f(v_k)]
\]

where $u_k$ and $v_k$ are the points in $[a_k, b_k]$ where f achieves its minimum and maximum, respectively. Hence, $\mu^*(f(E)) \le \sum_k \mu^*([f(u_k), f(v_k)])$ or $\mu^*(f(E)) \le \sum_k (f(v_k) - f(u_k))$. However, the points $u_k$ and $v_k$ determine a closed subinterval of $[a_k, b_k]$ with

\[
\sum_k |v_k - u_k| \le \sum_k |b_k - a_k| < \delta.
\]

Since the intervals $(a_k, b_k)$ are disjoint, the intervals formed by $u_k$ and $v_k$ are non overlapping. Thus, we conclude $\sum_k (f(v_k) - f(u_k)) < \epsilon$. This tells us $\mu^*(f(E)) < \epsilon$. But since ε was arbitrary, we see $\mu^*(f(E)) = 0$; thus, f(E) is measurable and has measure 0.

18.2 Lebesgue-Stieltjes Measures and Absolutely Continuous Functions

If f is continuous and non decreasing, there is a nice connection between f and the associated Lebesgue-Stieltjes measure $\mu_f$. First, we explore a relationship between the Lebesgue-Stieltjes outer measure and Lebesgue outer measure.

Theorem 18.2.1 ($\mu_f^*(E) = \mu^*(f(E))$). Let f be continuous and non decreasing on ℜ and let $\mu_f^*$ be the associated Lebesgue-Stieltjes outer measure. For all sets E in ℜ, $\mu_f^*(E) = \mu^*(f(E))$.

Proof 18.2.1. Let E be a subset of ℜ. Let G be an open set that contains f(E). We know G can be written as a countable disjoint union of open intervals $J_n$. Since f is continuous and non decreasing, we must have $I_n = f^{-1}(J_n)$ is an interval as well. Letting $I_n = (a_n, b_n)$ gives $J_n = (f(a_n), f(b_n))$. Noting $E \subseteq \cup_n I_n$ and $\mu(G) = \sum_n \mu(J_n)$, we find

\[
\mu_f^*(E) \le \mu_f^*(\cup_n I_n) = \mu_f(\cup_n I_n) \le \sum_n \mu_f(I_n) = \sum_n \mu(J_n) = \mu(G).
\]

This is true for all such open sets G. We therefore conclude $\mu_f^*(E) \le \mu^*(f(E))$ by Theorem 12.2.4.

The above immediately tells us that if $\mu_f^*(E) = \infty$, we also have $\mu^*(f(E)) = \infty$ and the result holds. We may thus safely assume that $\mu_f^*(E)$ is finite for the rest of the argument. Let ε > 0 be given. Let $((a_n, b_n])$ be a collection of half open intervals whose union covers E with $\sum_n (f(b_n) - f(a_n)) < \mu_f^*(E) + \epsilon$. Let $J_n = f((a_n, b_n])$. Again, since f is continuous and non decreasing, each interval $J_n$ can be written as $J_n = (f(a_n), f(b_n)]$. Since $f(E) \subseteq \cup_n J_n$, we must have

\[
\mu^*(f(E)) \le \mu^*(\cup_n J_n) \le \sum_n (f(b_n) - f(a_n)) < \mu_f^*(E) + \epsilon.
\]

Since ε is arbitrary, we see $\mu^*(f(E)) \le \mu_f^*(E)$. This second inequality completes the argument.

There is a very nice connection between a continuous non decreasing f and its associated Lebesgue-Stieltjes measure $\mu_f$.

Theorem 18.2.2 (f Is Absolutely Continuous IFF $\mu_f \ll \mu$). A continuous non decreasing function f is absolutely continuous on [a, b] if and only if its associated Lebesgue-Stieltjes measure $\mu_f$ is absolutely continuous with respect to Lebesgue measure µ.

Proof 18.2.2. Let f be continuous and non decreasing on [a, b]. By Theorem 18.2.1, we have $\mu_f^*(E) = \mu^*(f(E))$ for all subsets E in [a, b].

If we assume f is absolutely continuous on [a, b], by Condition (3) of Theorem 18.1.1, when E has Lebesgue measure 0, we know µ(f(E)) = 0 as well. Thus, $\mu^*(f(E)) = 0 = \mu_f^*(E)$. But then E is $\mu_f$ measurable with $\mu_f$ measure 0. This tells us $\mu_f \ll \mu$.

Conversely, if $\mu_f \ll \mu$, we can apply Lemma 16.4.2. Given ε > 0, there is a δ > 0, so that µ(E) < δ implies $\mu_f(E) < \epsilon$. Hence, for any E which is a non overlapping countable union of intervals $[a_n, b_n]$ with $\sum_n (b_n - a_n) < \delta$, we have $\sum_n (f(b_n) - f(a_n)) = \mu_f(E) < \epsilon$. This says f is absolutely continuous on [a, b].


18.3 Derivatives of Functions of Bounded Variation

We are going to look carefully at the behavior of monotone functions and their rates of change.

Definition 18.3.1 (Derived Numbers). We say the extended real number α is a derived number for a function f at the point $x_0$ in dom(f) if there is a sequence of nonzero numbers $(h_n)$ with $x_0 + h_n$ in dom(f) and $h_n \to 0$ so that

\[
\lim_n \frac{f(x_0 + h_n) - f(x_0)}{h_n} = \alpha.
\]

We use the notation Df(x) to denote the collection of all derived numbers for a function f at the point x in its domain.

If a function f is strictly increasing on [a, b], we want to be able to compare quantitatively the outer measure of E with that of f(E) for any subset E. To do this, we need a technical tool called a Vitali Cover.

Definition 18.3.2 (Vitali Cover). Let $\mathcal{I}$ be the collection of non degenerate closed intervals in ℜ. Let E be a subset of ℜ and $\mathcal{V}$ be a sub collection of $\mathcal{I}$. If for all x ∈ E and ε > 0, there is a $V \in \mathcal{V}$ so that x ∈ V and µ(V) < ε, we say $\mathcal{V}$ is a Vitali Cover for E or a Vitali Covering for E.

We can prove this important result.

Theorem 18.3.1 (Vitali Covering Theorem).
Let 𝒱 be a Vitali Cover of a set E in ℜ. Then there is a countable collection (Vn) in 𝒱 that is pairwise disjoint and µ(E \ ∪n Vn) = 0.

Proof 18.3.1. First assume E is bounded. Let J be any open interval containing E and let 𝒱0 be those intervals in 𝒱 that are contained in J. Then 𝒱0 is also a Vitali Cover for E. Let V1 be chosen from 𝒱0. If µ(E \ V1) = 0, we have proven our result. If not, we continue this process using induction. Choose V2 this way. Let F1 = V1 and G1 = J \ F1. Then G1 is open. Define the new collection 𝒱1 by

𝒱1 = {V ∈ 𝒱0 : V ⊆ G1}.

Since by assumption E \ V1 is not empty and 𝒱0 is a Vitali Cover for E, there must be sets in the family 𝒱1. Let

S1 = sup{µ(V) : V ∈ 𝒱1}.

The members of a Vitali Cover are non degenerate which implies S1 > 0. Further, each member of 𝒱0 is in J which tells us S1 is finite. Choose V2 from 𝒱1 so that µ(V2) > S1/2. Then V2 ⊆ G1 implying V1 and V2 are disjoint. If µ(E \ (V1 ∪ V2)) = 0 we are done. Otherwise, we continue.

We can now see how to do the induction. Suppose we have chosen the sets V1, V2, . . . , Vn so that they are pairwise disjoint and µ(E \ ∪_{i=1}^{n} Vi) ≠ 0. Then, let Fn = ∪_{i=1}^{n} Vi and Gn = J \ Fn. Then Gn is open. Define the new collection 𝒱n by

𝒱n = {V ∈ 𝒱0 : V ⊆ Gn}.

Since by assumption E \ Fn is not empty and 𝒱0 is a Vitali Cover for E, there must be sets in the family 𝒱n. Let

Sn = sup{µ(V) : V ∈ 𝒱n}.

The members of a Vitali Cover are non degenerate which implies Sn > 0. Further, each member of 𝒱0 is in J which tells us Sn is finite. Choose Vn+1 from 𝒱n so that

µ(Vn+1) > Sn/2. (18.1)

Then again, Vn+1 ⊆ Gn implying the new collection V1, V2, . . . , Vn, Vn+1 is pairwise disjoint.

If this process terminates after a finite number of steps, we have proven the result. Otherwise, we construct a countable number of sets Vn which form a pairwise disjoint collection from 𝒱0. Let S = ∪n Vn. We will show µ(E \ S) = 0 proving the result.

Each Vn has a midpoint. Let Wn be an interval with the same midpoint as Vn which is 5 times the length. Then µ(Wn) = 5µ(Vn). Also, by construction the Vn are pairwise disjoint with S = ∪n Vn ⊆ J and so

∑n µ(Wn) = 5 ∑n µ(Vn) ≤ 5 µ(J). (18.2)

Now let x ∈ E \ S. Then, x ∈ ∩n (E ∩ Vn^C) implying x ∈ E ∩ Fn^C for all n. But since E ⊆ J, this tells us x ∈ J ∩ Fn^C always. From the definition of Gn, it follows that x ∈ Gn for all n. We conclude x ∈ ∩n Gn.

Now fix the positive integer i. Since x is in Gi which is open, there is a positive number r so that B(x; r) ⊆ Gi. From the definition of a Vitali Cover of E, it follows there is a non degenerate closed interval V in 𝒱0 with x ∈ V and µ(V) < r/2. Hence, V is contained in Gi and so V ∩ Fi = ∅; in particular, V ∩ Vi = ∅.



Since x ∈ S^C, we see that V can not be any of the intervals Vn we have constructed. Now all the intervals Vn are non degenerate, pairwise disjoint and all live in the bounded interval J. Thus, ∑i µ(Vi) ≤ µ(J) which implies µ(Vi) → 0. From Equation 18.1, it follows that Sn → 0 as well. Choose a value of N so that SN < µ(V). From the definition of SN, we then have that V can not be contained in GN and so V ∩ FN ≠ ∅.

Let M = inf{j : V ∩ Fj ≠ ∅}. It is clear M > i. Hence, for this index M, V ∩ FM−1 = ∅ implying V ⊆ GM−1. Further, V ∩ FM ≠ ∅ and this tells us V ∩ VM ≠ ∅. From the definition of SM−1, it then follows that µ(V) ≤ SM−1 < 2µ(VM). Since WM shares the same center as VM with 5 times the length, this means V ⊆ WM.

Finally, M > i, so V ⊆ ∪_{j=i}^{∞} Wj. Since x ∈ V, this shows x ∈ ∪_{j=i}^{∞} Wj. As x in E \ S was arbitrary, we conclude E \ S ⊆ ∪_{j=i}^{∞} Wj for every i. From Equation 18.2, we know the series ∑n µ(Wn) converges. Given any positive ε, there is a P so that ∑_{j=i}^{∞} µ(Wj) < ε if i > P. Since µ(E \ S) ≤ ∑_{j=i}^{∞} µ(Wj) for all i, we conclude µ(E \ S) = 0.

The proof for the case that E is unbounded is left to you.
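The selection step in this proof can be tried out on a finite family of closed intervals. The Python sketch below is only a finite analogue under stated assumptions: the function names `greedy_disjoint` and `enlarge` and the sample family are hypothetical, and at each step it picks the longest interval disjoint from those already chosen, which is one way to satisfy the requirement µ(Vn+1) > Sn/2. It then checks that the 5 times enlargements Wn cover every interval of the family, which is the mechanism behind the estimate E \ S ⊆ ∪ Wj.

```python
def greedy_disjoint(intervals):
    """Greedily pick pairwise disjoint closed intervals, longest first."""
    chosen = []
    by_length = sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True)
    for lo, hi in by_length:
        # Keep (lo, hi) only if it misses every interval chosen so far.
        if all(hi < c_lo or c_hi < lo for (c_lo, c_hi) in chosen):
            chosen.append((lo, hi))
    return chosen


def enlarge(interval, factor=5.0):
    """Interval with the same midpoint and `factor` times the length."""
    lo, hi = interval
    mid = (lo + hi) / 2.0
    half = factor * (hi - lo) / 2.0
    return (mid - half, mid + half)


# A small sample family of closed intervals (illustrative data only).
family = [(0.0, 1.0), (0.9, 1.5), (4.0, 4.2), (4.1, 4.25),
          (10.0, 12.0), (11.5, 12.5), (12.4, 12.6)]

chosen = greedy_disjoint(family)
enlarged = [enlarge(iv) for iv in chosen]

# Every interval in the family should sit inside some enlarged interval.
covered = all(
    any(w_lo <= lo and hi <= w_hi for (w_lo, w_hi) in enlarged)
    for (lo, hi) in family
)
print(chosen)
print(covered)
```

In the theorem itself the family is infinite and only a countable subfamily is extracted; the sketch shows the disjoint selection and the enlargement trick, not the measure zero conclusion.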

Lemma 18.3.2 (µ∗(f(E)) ≤ p µ∗(E)).
Let f be strictly increasing on the interval [a, b] and let E ⊆ [a, b]. If at each point x ∈ E, there exists at least one derived number α in Df(x) satisfying α < p, then µ∗(f(E)) ≤ p µ∗(E).

Proof 18.3.2. Let ε > 0 be chosen and let G be a bounded open set containing E so that

µ(G) < µ∗(E) + ε. (18.3)

For any x0 in E, there is a sequence (hn), all hn ≠ 0, with hn → 0 so that the interval [x0, x0 + hn] (if hn > 0) or [x0 + hn, x0] (if hn < 0) is in G and

\[ \frac{f(x_0 + h_n) - f(x_0)}{h_n} < p. \tag{18.4} \]

To keep our notation simple, we will simply use the notation [x0, x0 + hn] whether hn is positive or negative. Let In(x0) = [x0, x0 + hn] and Jn(x0) = [f(x0), f(x0 + hn)]. Since f is strictly increasing, f(In(x0)) ⊆ Jn(x0) and Jn(x0) is a non degenerate closed interval. We also know µ(In(x0)) = |hn| and µ(Jn(x0)) = |f(x0 + hn) − f(x0)|. It follows from Equation 18.4 that

µ(Jn(x0)) < p µ(In(x0)) = p |hn|. (18.5)

Since hn → 0, we then see limn µ(Jn(x0)) = 0. Let 𝒱 be the collection of intervals {Jn(x0) : x0 ∈ E, n ∈ Z+}. It is easy to see 𝒱 is a Vitali Cover for f(E). Thus, by Theorem 18.3.1, there is a countable disjoint family {J_{n_i}(x_i) : i ∈ Z+} so that

\[ \mu\Bigl( f(E) \setminus \bigcup_i J_{n_i}(x_i) \Bigr) = 0. \tag{18.6} \]

From Equation 18.6, we then find

\[ \mu^*(f(E)) \le \sum_i \mu(J_{n_i}(x_i)) < p \sum_i \mu(I_{n_i}(x_i)). \tag{18.7} \]

But f is strictly increasing and so the intervals I_{n_i}(x_i) must also be pairwise disjoint. Using Equation 18.3, we infer

\[ \sum_i \mu(I_{n_i}(x_i)) = \mu\Bigl( \bigcup_i I_{n_i}(x_i) \Bigr) \le \mu(G) < \mu^*(E) + \epsilon. \tag{18.8} \]

Combining Equations 18.7 and 18.8, we have µ∗(f(E)) < p (µ∗(E) + ε). Since ε is arbitrary, this proves the result.

Lemma 18.3.3 (µ∗(f(E)) ≥ q µ∗(E)).
Let f be strictly increasing on the interval [a, b] and let E ⊆ [a, b]. If at each point x ∈ E, there exists at least one derived number α in Df(x) satisfying α > q, then µ∗(f(E)) ≥ q µ∗(E).

Proof 18.3.3. The proof of this is left to you.

We are now ready to prove a very nice result: a function of bounded variation is differentiable µ a.e.

Theorem 18.3.4 (Functions of Bounded Variation Are Differentiable a.e.).
Let f be a function of bounded variation on the bounded interval [a, b]. Then f has a finite derivative µ a.e.

Proof 18.3.4. Since a function of bounded variation can be decomposed into the difference of two monotone functions, it is enough to prove this result for a non decreasing function. Further, if f is non decreasing, then g(x) = x + f(x) is strictly increasing. If we prove the result for g, we will know the result is true for f as well. Hence, we may assume, without loss of generality, that f is strictly increasing in our proof.

Let E∞ be the set of points in [a, b] where Df(x) contains the value ∞. Then f(E∞) ⊆ [f(a), f(b)] since f is strictly increasing. Now for any positive integer q, since ∞ is a derived number at each x in E∞, by Lemma 18.3.3, it follows that µ∗(f(E∞)) ≥ q µ∗(E∞). We also know µ∗(f(E∞)) ≤ µ∗([f(a), f(b)]) = f(b) − f(a). Thus,

q µ∗(E∞) ≤ f(b) − f(a), for all positive integers q.

Hence, µ∗(E∞) = 0.

Now choose real numbers u and v so that 0 ≤ u < v and define the set Euv by

Euv = {x : ∃ α, β ∈ Df(x) such that α < u < v < β}.

Applying Lemma 18.3.2 and Lemma 18.3.3, we have

v µ∗(Euv) ≤ µ∗(f(Euv)) ≤ u µ∗(Euv).

Since v > u, we then have (v − u) µ∗(Euv) ≤ 0. This implies µ∗(Euv) = 0.

Now, if f is not differentiable at a point x, f either has ∞ as a derived number or it has at least two different finite derived numbers there. In the second case, let the two derived numbers be αx and βx with αx < βx. Then there are rational numbers r1 and r2 so that αx < r1 < r2 < βx. This implies x ∈ E_{r1 r2}. Hence, if N is the set of points where f fails to be differentiable in [a, b], we have

N ⊆ E∞ ∪ ( ∪ {Epq : p, q rational, 0 ≤ p < q} ).

But this is a countable union and the outer measure of each of these component sets is 0. Therefore, N has outer measure 0 also which tells us N is measurable and has measure 0.

We are now getting closer to our Fundamental Theorem of Calculus extension. We can now prove a very

weak form of the Recapture Theorem in Riemann integration.

Theorem 18.3.5 (The Weak Monotone Recapture Theorem or Monotone Functions Have Summable Derivatives).
Let f be non decreasing on [a, b]. Then f′ is measurable and ∫_a^b f′ dµ ≤ f(b) − f(a). Note this tells us f′ is summable.

Proof 18.3.5. First, extend f to the interval [a, b + 1] by setting f(x) = f(b) on [b, b + 1]. For convenience, we will call this extended f by the same name. Then, let the functions fn be defined by

\[ f_n(x) = \frac{f(x + \frac{1}{n}) - f(x)}{\frac{1}{n}} = n \Bigl( f\bigl(x + \tfrac{1}{n}\bigr) - f(x) \Bigr). \]

Then, at each point where f′ exists, fn → f′. By Theorem 18.3.4, this happens µ a.e. Since each fn is measurable as f is monotone, the pointwise a.e. limit f′ is measurable. Now apply Fatou’s Lemma to see

\[ \int_a^b \liminf_n f_n \, d\mu = \int_a^b f' \, d\mu \le \liminf_n \int_a^b f_n \, d\mu. \]

Looking at the definition of the limit inferior, we see lim infn ≤ supn always. Hence, we have

\[ \int_a^b f' \, d\mu \le \sup_n \int_a^b f_n \, d\mu. \]

Since f is monotone, each fn is Riemann integrable and so the Lebesgue integrals here are Riemann integrals. We will use substitution to finish our argument. Note

\[
\begin{aligned}
\int_a^b f_n(x) \, dx &= n \int_a^b \Bigl( f\bigl(x + \tfrac{1}{n}\bigr) - f(x) \Bigr) \, dx \\
&= n \int_{a + \frac{1}{n}}^{b + \frac{1}{n}} f(x) \, dx - n \int_a^b f(x) \, dx \\
&= n \int_b^{b + \frac{1}{n}} f(x) \, dx - n \int_a^{a + \frac{1}{n}} f(x) \, dx \\
&= f(b) - n \int_a^{a + \frac{1}{n}} f(x) \, dx,
\end{aligned}
\]

because on [b, b + 1/n], f(x) = f(b). Finally, on [a, a + 1/n], f(x) ≥ f(a). Hence, the last integral term n ∫_a^{a+1/n} f(x) dx is bounded below by n f(a) (1/n) = f(a). Hence, supn ∫_a^b fn dµ ≤ f(b) − f(a) always. We conclude ∫_a^b f′ dµ ≤ f(b) − f(a) too.
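The inequality in Theorem 18.3.5 can be strict. As a hedged illustration, take the unit step f(x) = 0 for x < 0 and f(x) = 1 for x ≥ 0 on [−1, 1], which is non decreasing with f′ = 0 a.e., so ∫ f′ dµ = 0 while f(b) − f(a) = 1. The Python sketch below (the function names and the midpoint rule are our own choices, not from the text) approximates ∫_a^b fn dx for the difference quotients fn used in the proof and finds values near 1 for each n tried, so here the limit inferior in Fatou's Lemma is strictly larger than ∫ f′ dµ.

```python
def f(x):
    """Unit step on [-1, 1]; equals f(b) = 1 to the right of b = 1 as well."""
    return 0.0 if x < 0.0 else 1.0


def f_n(x, n):
    """Difference quotient f_n(x) = n * (f(x + 1/n) - f(x)) from the proof."""
    return n * (f(x + 1.0 / n) - f(x))


def midpoint_integral(g, a, b, pieces):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / pieces
    return h * sum(g(a + (k + 0.5) * h) for k in range(pieces))


a, b = -1.0, 1.0
integrals = [
    midpoint_integral(lambda x, n=n: f_n(x, n), a, b, 2000)
    for n in (2, 5, 10)
]
print(integrals)  # each value is close to f(b) - f(a) = 1
```

The grid is chosen so that no midpoint lands on a jump of fn, which keeps the midpoint rule accurate for these piecewise constant integrands.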

If f is absolutely continuous, we can say more. First, we show an absolutely continuous function can be

rewritten as the difference of two non decreasing absolutely continuous functions.

Lemma 18.3.6 (AC Functions Are Difference of AC Non Decreasing Functions).
Let f be absolutely continuous on [a, b]. Then there are absolutely continuous non decreasing functions u and v so that f = u − v. In fact, u = Vf and v = Vf − f are the usual choices.

Proof 18.3.6. For any a ≤ α < β ≤ b, from Theorem 3.4.1 we know the total variation function is additive on intervals. Recall the total variation function of f is defined by

\[ V_f(x) = \begin{cases} 0, & x = a \\ V(f; a, x), & a < x \le b \end{cases} \]

Thus, it follows that

Vf(β) − Vf(α) = V(f; α, β).



Let ε > 0 be chosen. Then there is a δ > 0 so that if ((an, bn)) is a collection of non overlapping intervals with ∑n (bn − an) < δ, then ∑n |f(bn) − f(an)| < ε/2. Fix such a collection and let πn be any partition of [an, bn]. Then its component intervals form a non overlapping collection of intervals whose summed length is smaller than δ. Call this collection In and let its intervals be labeled [x_{in}, y_{in}] for convenience. In fact, the component intervals in each In can be glued together via a union to form a non overlapping collection whose summed length is also less than δ. Call this larger collection I. Then, the absolute continuity condition for f says

\[ \sum_n \sum_i |f(y_{in}) - f(x_{in})| < \frac{\epsilon}{2}. \]

However, the choice of partitions πn is arbitrary. Hence, taking the supremum over these partitions, it follows that

\[ \sum_n V(f; a_n, b_n) \le \frac{\epsilon}{2} < \epsilon. \]

But, we know the total variation is additive. Hence, we can rewrite the inequality using V(f; an, bn) = Vf(bn) − Vf(an) to get

\[ \sum_n \bigl( V_f(b_n) - V_f(a_n) \bigr) < \epsilon. \]

Thus, Vf is absolutely continuous on [a, b]. Moreover, v = Vf − f is also absolutely continuous as sums and differences of absolutely continuous functions are also absolutely continuous. It is then clear that u = Vf and v = Vf − f is a suitable decomposition.
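The decomposition f = u − v with u = Vf and v = Vf − f can be explored numerically. The following Python sketch is a grid approximation under stated assumptions: the helper name `variation_decomposition`, the choice f = sin on [0, 2π], and the uniform grid are all ours, and the cumulative sum of |Δf| over grid cells only approximates the true total variation function from below. On the grid, both u and v come out non decreasing and u − v reproduces f.

```python
import math


def variation_decomposition(f, a, b, pieces):
    """Approximate u = V_f and v = V_f - f on a uniform grid of [a, b]."""
    xs = [a + (b - a) * k / pieces for k in range(pieces + 1)]
    values = [f(x) for x in xs]
    u = [0.0]  # V_f(a) = 0 by definition
    for left, right in zip(values, values[1:]):
        # Add the variation of f across one grid cell.
        u.append(u[-1] + abs(right - left))
    v = [uk - fk for uk, fk in zip(u, values)]
    return xs, values, u, v


xs, values, u, v = variation_decomposition(math.sin, 0.0, 2.0 * math.pi, 1000)
print(u[-1])  # close to 4, the total variation of sin on [0, 2*pi]
```

The grid contains the turning points π/2 and 3π/2 of sin, which is why the grid variation matches the true total variation so closely in this example.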

We can now prove a reasonable recapture theorem.

Theorem 18.3.7 (The Absolutely Continuous Recapture Theorem).
Let f be absolutely continuous on [a, b]. Then

\[ \int_a^b f' \, d\mu = f(b) - f(a). \]

Proof 18.3.7. We have proven most of the requisite pieces for this result. From Lemma 18.3.6, we know f can be written as the difference of two absolutely continuous non decreasing functions. From Theorem 18.3.4, we then know f is differentiable a.e. From Theorem 18.3.5, we know f′ is summable. It is now clear we can assume without loss of generality that our absolutely continuous function f is non decreasing. To finish, look at the proof of Theorem 18.3.5. Extend f to [b, b + 1] as before and consider the same sequence of functions (fn),

\[ f_n(x) = \frac{f(x + \frac{1}{n}) - f(x)}{\frac{1}{n}} = n \Bigl( f\bigl(x + \tfrac{1}{n}\bigr) - f(x) \Bigr). \]

Since f is absolutely continuous, it is also continuous and so this time, we know each fn is continuous. From the proof of Theorem 18.3.5, we know

\[ \int_a^b f_n(x) \, dx = f(b) - n \int_a^{a + \frac{1}{n}} f(x) \, dx. \]

The integral term n ∫_a^{a+1/n} f(x) dx above converges to f(a) using the standard Fundamental Theorem of Calculus for continuous integrands. Thus, we know

\[ \int_a^b f_n(x) \, dx \to f(b) - f(a). \]

Let ε > 0 be chosen. Our extension of f to [b, b + 1] is still absolutely continuous and so there is a δ1 > 0 so that given any collection of non overlapping intervals ([an, bn]) from [a, b + 1] whose summed length is less than δ1, we have ∑n |f(bn) − f(an)| < ε/3.

Further, since f′ is summable, by the absolute continuity of the integral, there is a δ2 > 0 so that ∫_F |f′| dµ < ε/3 whenever µ(F) < δ2. We can choose δ2 < δ1.

Let E be the set of points in [a, b] where f is differentiable. Then E^C is measure zero and we know fn → f′ pointwise a.e. From Egoroff’s Theorem, Theorem 15.2.1, we then know fn → f′ almost uniformly. Apply this theorem for tolerance δ2. Then, there is a measurable set G with measure less than δ2 so that fn → f′ uniformly on G^C. Thus, we also know ∫_G f′ dµ < ε/3. From the definition of uniform convergence on G^C, we know there is a positive integer N so that for all x in G^C,

\[ |f'(x) - f_n(x)| < \frac{\epsilon}{3(b - a)}, \quad \forall \, n > N. \]

(It is clear that G^C does not contain points of E^C!) It then follows that for all n > N,

\[ \int_{G^C} |f' - f_n| \, d\mu \le \mu(G^C) \, \frac{\epsilon}{3(b - a)} \le \frac{\epsilon}{3}. \]

Now let’s combine our pieces. We have for n > N,

\[
\begin{aligned}
\left| \int_a^b (f_n - f') \, d\mu \right| &= \left| \int_G (f_n - f') \, d\mu + \int_{G^C} (f_n - f') \, d\mu \right| \\
&\le \int_{G^C} |f_n - f'| \, d\mu + \int_G |f_n| \, d\mu + \int_G |f'| \, d\mu \\
&< \frac{2\epsilon}{3} + \int_G f_n \, d\mu,
\end{aligned}
\]

where we can drop the absolute values of fn because f is non decreasing by assumption and so each fn is non negative.

It remains to show ∫_G fn dµ < ε/3 also. Since µ(G) < δ2 < δ1, we can find an open subset V of (a, b) so that G ⊆ V and µ(V) < δ1. Express V as a countable union of disjoint open intervals ((ci, di)). Pick any x from [0, 1]. Then the collection ([ci + x, di + x]) is a non overlapping collection of intervals from [a, b + 1] whose summed length is less than δ1. Hence,

\[ \sum_i |f(d_i + x) - f(c_i + x)| < \frac{\epsilon}{3}. \]

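From the proof of Theorem 18.3.5, we find

\[
\begin{aligned}
\int_{c_i}^{d_i} f_n(x) \, dx &= n \int_{d_i}^{d_i + \frac{1}{n}} f(x) \, dx - n \int_{c_i}^{c_i + \frac{1}{n}} f(x) \, dx \\
&= n \int_0^{\frac{1}{n}} \bigl( f(d_i + x) - f(c_i + x) \bigr) \, dx.
\end{aligned}
\]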

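Summing over i, we have

\[ \sum_i \int_{c_i}^{d_i} f_n(x) \, dx = n \int_0^{\frac{1}{n}} \Bigl( \sum_i \bigl( f(d_i + x) - f(c_i + x) \bigr) \Bigr) \, dx. \]

But the inner summation adds up to less than ε/3. We conclude

\[ \sum_i \int_{c_i}^{d_i} f_n(x) \, dx < n \int_0^{\frac{1}{n}} \frac{\epsilon}{3} \, dx = \frac{\epsilon}{3}. \]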

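Thus,

\[ \int_G f_n \, d\mu \le \int_V f_n \, d\mu = \sum_i \int_{c_i}^{d_i} f_n(x) \, dx < \frac{\epsilon}{3}. \]

This is the last piece to complete the proof of this result. Combining our estimates, |∫_a^b (fn − f′) dµ| < ε for all n > N, and so ∫_a^b f′ dµ = limn ∫_a^b fn dµ = f(b) − f(a).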
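As a small sanity check of the recapture identity, consider f(x) = √x on [0, 1], which is absolutely continuous even though f′ is unbounded near 0. The proof's identity ∫_a^b fn dx = f(b) − n ∫_a^{a+1/n} f(x) dx can be evaluated in closed form here, since n ∫_0^{1/n} √x dx = (2/3) n^{−1/2}. The Python sketch below (the function name `integral_of_f_n` is our own, and the closed form is specific to this example) tabulates these values, which increase toward f(1) − f(0) = 1.

```python
import math


def integral_of_f_n(n):
    """Integral of f_n over [0, 1] for f(x) = sqrt(x), in closed form.

    Uses integral = f(b) - n * (integral of f over [a, a + 1/n]) with
    a = 0, b = 1 and the antiderivative (2/3) x**1.5 of sqrt(x).
    """
    tail = n * (2.0 / 3.0) * (1.0 / n) ** 1.5
    return math.sqrt(1.0) - tail


sample_ns = (1, 10, 100, 10_000, 1_000_000)
table = [integral_of_f_n(n) for n in sample_ns]
for n, value in zip(sample_ns, table):
    print(n, value)
```

The gap 1 − ∫ fn dx shrinks like n^{−1/2}, which is slow but consistent with the convergence the theorem asserts.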

We can now establish the linkage between the Lebesgue-Stieltjes charges induced by an absolutely continuous function f on [a, b] and the derivative f′.

Theorem 18.3.8 (Characterizing Lebesgue-Stieltjes Measures Constructed From Absolutely Continuous Functions).
Let f be absolutely continuous on [a, b]. Let νf be the finite charge induced by f in the Lebesgue-Stieltjes construction process. Then, for all measurable E, we have

\[ \nu_f(E) = \int_E f' \, d\mu. \]

Proof 18.3.8. Since f is absolutely continuous, νf << µ. From the Radon-Nikodym theorem, there is therefore a summable g so that

\[ \nu_f(E) = \int_E g \, d\mu \]

for all measurable E. In particular, for any x ∈ (a, b],

\[ \nu_f((a, x]) = f(x) - f(a) = \int_{(a,x]} g \, d\mu. \]

However, by Theorem 18.3.7, we also know

\[ f(x) - f(a) = \int_{(a,x]} f' \, d\mu. \]

Hence, ∫_a^x (g − f′) dµ = 0. In fact, ∫_I (g − f′) dµ = 0 for any interval I in [a, b]. Let h = g − f′ which is summable. Now let’s assume there is a measurable set E in [a, b] with ∫_E h dµ > 0 and let ε = ∫_E h dµ. From the continuity of the integral, for this ε, there is a δ > 0 so that |∫_G h dµ| < ε for any measurable set G with µ(G) < δ.

Choose an open set U in ℜ which contains E with µ(U) < µ(E) + δ. Write the open set U as a countable union of disjoint open intervals In = (an, bn). Let Vn = In ∩ [a, b]. Then, ∫_{Vn} h dµ = 0 as Vn is an interval. From this we conclude, if V = ∪n Vn, that ∫_V h dµ = 0 as well. By construction, E ⊆ V ⊆ U which implies V \ E ⊆ U. Thus

\[ \mu(E) + \mu(E^C \cap V) + \mu(V^C \cap U) = \mu(U) < \mu(E) + \delta. \]

It follows that µ(V \ E) < δ and hence, |∫_{V\E} h dµ| < ε. However,

\[ 0 = \int_V h \, d\mu = \int_E h \, d\mu + \int_{V \setminus E} h \, d\mu. \]

It follows immediately that

\[ \epsilon = \left| \int_E h \, d\mu \right| = \left| \int_{V \setminus E} h \, d\mu \right| < \epsilon. \]

This is a contradiction. The same argument applied to −h rules out ∫_E h dµ < 0. We conclude ∫_E h dµ = 0 for all measurable E. In particular, this is true for {x | h(x) > 0} implying h+ = 0 a.e. and for {x | h(x) ≤ 0} implying h− = 0 a.e. Thus, h = 0 a.e. We therefore have shown that f′ = g a.e. and we can conclude

\[ \nu_f(E) = \int_E f' \, d\mu. \]
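For a concrete instance of this formula, take f(x) = x² on [0, 2], which is absolutely continuous with f′(x) = 2x. On a half open interval E = (c, d], the charge is νf(E) = f(d) − f(c), and the theorem says this must equal ∫_c^d 2x dx. The Python sketch below compares the two numbers for one interval; the function names, the interval (0.5, 1.5] and the midpoint rule are illustrative assumptions, not part of the text.

```python
def f(x):
    """An absolutely continuous, non decreasing function on [0, 2]."""
    return x * x


def f_prime(x):
    """Derivative of f."""
    return 2.0 * x


def midpoint_integral(g, c, d, pieces=100):
    """Midpoint-rule approximation of the integral of g over [c, d]."""
    h = (d - c) / pieces
    return h * sum(g(c + (k + 0.5) * h) for k in range(pieces))


c, d = 0.5, 1.5
charge_of_interval = f(d) - f(c)                    # nu_f((c, d])
integral_of_derivative = midpoint_integral(f_prime, c, d)

print(charge_of_interval, integral_of_derivative)   # the two values agree closely
```

Intervals are only the simplest measurable sets; the content of the theorem is that the same identity persists for every measurable E.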


Part VII

Summing It All Up


Chapter 19

Summing It All Up

We have now come to the end of this series of lecture notes. We have not covered all of the things we

wanted to but we view that as a plus: there is more to look forward to! In particular, in a second, more

advanced class, we would like to discuss

• Representation theorems for bounded linear functionals on certain spaces. For example, the dual

space, (C[a, b])′ has a nice relationship to certain types of measures.

• Generalizations of differential equations to summable data and problems of the form x′ = f(t, x) a.e. We could then segue into control theory more easily.

• Hausdorff measures on the real line are quite interesting and we have covered most of the tools needed to discuss them adequately. Our general methods of constructing outer measures come in handy here.

So here’s to the future and another Advanced Measure Theory Course!


Part VIII

References

397

References

[1] A. Bruckner, J. Bruckner, and B. Thomson. Real Analysis. Prentice - Hall, 1997.

[2] S. Douglas. Introduction To Mathematical Analysis. Addison-Wesley Publishing Company, 1996.

[3] W. Fulks. Advanced Calculus: An Introduction to Analysis. John Wiley & Sons, third edition, 1978.

[4] H. Sagan. Advanced Calculus of real valued functions of a Real Variable and Vector - Valued Functions

of a Vector Variable. Houghton Mifflin Company, 1974.

[5] G. Simmons. Introduction to Topology and Modern Analysis. McGraw-Hill Book Company, 1963.

[6] K. Stromberg. Introduction To Classical Real Analysis. Wadsworth International Group and Prindle,

Weber and Schmidt, 1981.

[7] A. Taylor. General Theory of Functions and Integration. Dover Publications, Inc., 1985.

399

REFERENCES REFERENCES

400

Part IX

Detailed Indices


Index

Theorem

Radon - Nikodym Theorem: Sigma-Finite

Charge Case, 366

Definition

RS[g, a, b], 116

Absolute Continuity Of A Measure, 199

Absolute Continuity Of Charges, 358

Abstract Darboux Integrability, 213

Additive Set Function, 263

Algebra Of Sets, 251

Almost Uniform Convergence, 324

Caratheodory Condition, 250

Cauchy Sequence In Norm, 230

Cauchy Sequences In Measure, 325

Charges, 179

Common Refinement Of Two Partitions, 39

Complete Measure, 192

Complete NLS, 230

Conjugate Index Pairs, 226

Content Of Open Interval, 271

Continuity Of A Function At A Point: ε−δVersion, 14

Continuity Of A Function At A Point: Limit

Version, 14

Continuous Almost Everywhere, 107

Convergence In Measure, 325

Convergence Pointwise and Pointwise a.e.,

324

Convergence Uniformly, 324

Darboux Integrability, 71

Darboux Lower And Upper Integrals, 70

Darboux Upper and Lower Sums, 67

Derived Numbers, 382

Differentiability of A Function At A Point,

14

Equivalent Conditions For The Measura-

bility of a Function, 161

Equivalent Conditions For The Measura-

bility of an Extended Real Valued Func-

tion, 166

Essentially Bounded Functions, 237

Extended Real Number System, 158

Functions

Absolute Continuity, 379

Functions Of Bounded Variation, 51

Inner Product Space, 244

Integral Of A Non-negative Measurable Func-

tion, 186

Integral Of A Simple Function, 186

Lebesgue Outer Measure, 272

Limit Inferior And Superior Of Sequences

Of Sets, 182

Measurability of a Function, 160

Measurability Of Extended Real Valued Func-

tions, 165

Measures, 178

Borel Measure, 288

Metric On A Set, 222

Metric Outer Measure, 256


Monotone Function, 40

Associated Saltus Function, 45

Norm Convergence, 222

Norm On A Vector Space, 221

Outer Measure, 249

Partition, 38

Positive and Negative Sets For a Charge,

347

Premeasures and Covering Families, 262

Propositions Holding Almost Everywhere,

185

Pseudo-Measure, 266

Refinement Of A Partition, 38

Regular Outer Measures, 264

Rewriting Lebesgue Outer Measure Using

Edge Length Restricted Covers, 282

Riemann - Stieljes Criterion For Integra-

bility, 125

Riemann - Stieljes Darboux Integral, 125

Riemann - Stieljes Sum, 115

Riemann Integrability Of a Bounded f , 64

Riemann Integrability Of A Bounded Func-

tion, 18

Riemann Sum, 16, 63

Riemann’s Criterion for Integrability, 71

Set of Extended Real Valued Measurable

Functions, 166

Sets Of Content Zero, 107

Sigma - Algebra Generated By Collection

A, 155

Sigma Algebra, 153

Simple Functions, 185

Singular Measures, 368

Space Of p Summable Functions, 225

Spaces of Essentially Bounded Functions,

238

Step Function, 119

Summable Functions, 201

The Continuous Part Of A Monotone Func-

tion, 46

The Discontinuity Set Of A Monotone Func-

tion, 43

The Generalized Cantor Set, 301

The Positive and Negative Parts Of a Charge,

348

Upper and Lower Riemann - Stieljes Dar-

boux Sums, 124

Upper and Lower Riemann - Stieljes Inte-

grals, 125

Variation of a Charge, 355

Vitali Cover, 382

Differentiability Implies Continuity, 15

Functions

Antiderivative, 15

Bounded, 15

Continuity, 14

Differentiable, 14

Primitive, 15

Integration

Antiderivatives Of Simple Powers, 26

Antiderivatives of Simple Trigonometric Func-

tions, 26

Cauchy Fundamental Theorem of Calcu-

lus, 24

Definite Integrals Of Simple Powers, 26

Definite Integrals Of Simple Trigonomet-

ric Functions, 27

Evaluation Set, 16

Functions With Jump Discontinuities, 32

Functions With Removable Discontinuities,

31

Norm of a partition, 19

Partition, 15

Primitive, 15

Riemann Integrable, 22

Riemann Sum, 16, 18

Sequences of partitions, 17

Symbol For The Antiderivative of f is∫f ,

26


Symbol For The Definite Integral of f on

[a, b] is∫ ba f(t) dt, 26

The indefinite integral of f is also the an-

tiderivative, 26

Lemma

Mδ = µ∗, 283

µ∗(f(E)) ≥ q µ∗(E), 385

µ∗(f(E)) ≤ p µ∗(E), 384

f = g on (a, b) Implies Riemann Integrals

Match, 95

f Zero On (a, b) Implies Zero Riemann In-

tegral, 94

Outer Measure Of The Closure Of Interval

Equals Content Of Interval, 281

Absolute Continuity Of The Integral, 359

AC Functions Are Difference of AC Non

Decreasing Functions, 387

Approximate Finite Lebesgue Covers Of

I ., 282

Characterizing Limit Inferior And Superi-

ors Of Sequences Of Sets, 182

Condition For Outer Measure To Be Reg-

ular, 264

Continuity Of The Integral, 338

Continuous Functions Of Finite Measur-

able Functions Are Measurable, 171

Continuous Functions Of Measurable Func-

tions Are Measurable, 173

De Morgan’s Laws, 154

Disjoint Decompositions Of Unions, 183

Epsilon - Delta Version Of Absolute Con-

tinuity Of a Charge, 359

Essentially Bounded Functions Bounded Above

By Their Essential Bound a.e, 239

Essentially Bounded Functions That Are

Equivalent Have The Same Essential

Bound, 239

Extended Valued Measurability In Terms

Of The Finite Part Of The Function,

166

Extending τg To Additive Is Well - Defined,

309

Finite Jump Step Functions As Integrators,

123

Function f Zero a.e. If and Only If Its Inte-

gral Is Zero, 199

Function Measurable If and Only If Pos-

itive and Negative Parts Measurable,

165

Hahn Decomposition Characterization of a

Charge, 354

Infimum Tolerance Lemma, 40

Lebesgue - Stieljes Outer Premeasure Is a

Pseudo-Measure, 311

Limit Inferiors And Superiors Of Mono-

tone Sequences Of Sets, 183

Measure Of Monotonic Sequence Of Sets,

180

Monotonicity, 180

Monotonicity Of The Abstract Integral For

Non Negative Functions, 190

Non Measurable Set Lemma 1, 294

Non Measurable Set Lemma 2, 294

One Jump Step Functions As Integrators

One, 119


Three, 122


Two, 121

Outer Measure Of Interval Equals Content

Of Interval, 281

p-Summable Cauchy Sequence Condition

I, 337

p-Summable Cauchy Sequence Condition

II, 339

p-Summable Functions Have p-Norm Ar-

bitrarily Small Off a Set, 336

p-Summable Inequality, 337


Pointwise Infimums, Supremums, Limit In-

feriors and Limit Superiors are Mea-

surable, 169

Products of Measurable Functions Are Mea-

surable, 170

Properties of Extended Valued Measurable

Functions, 168

Properties of Measurable Functions, 163

Properties Of Simple Function Integrations,

187

Radon - Nikodym Technical Lemma, 360

Real Number Conjugate Indices Inequal-

ity, 226

Sums Over Finite Lebesgue Covers Of I

Dominate Content Of I , 273

Supremum Tolerance Lemma, 40

The Upper And Lower Darboux Integral Is

Additive On Intervals, 81

The Upper And Lower Riemann - Stieljes

Darboux Integral Is Additive On Inter-

vals, 128

Measure

Borel, 313

Lebesgue - Stieljes, 313

Measurable Cover, 264

Monotone Function

Continuous at x From Left If and Only If

u(x) = 0, 43

Continuous at x From right If and Only If

v(x) = 0, 43

Continuous at x If and Only If u(x) =v(x) = 0, 43

Left Hand Jump at x, u(x), 43

Right Hand Jump at x, v(x), 43

Total Jump at x, u(x) + v(x), 43

Monotone Functions

Saltus Function

Properties, 45

Partitions

Gauge or Norm, 39

Proposition

Refinements and Common Refinements, 39

Theorem, 215

DAI(f(x) ≡ c) is cµ(X), 215

DAI(f,m0,M0) is independent of the choice

of m0 and M0, 214

L(f,M0) ≤ U(f,M0), 213

L(f,π1) ≤ U(f,π2), 70, 213

L(f, g,π1) ≤ U(f, g,π2), 124

L1 Semi-norm, 222

Lp Is A Vector Space, 229

Lp Semi-Norm, 229, 240

RI[a, b] Is A Vector Space andRI(f ; a, b)Is A Linear Mapping, 64

U(f,M0) ≤ L(f,M0), 214

Vf and Vf − f Are Monotone For a Func-

tion f of Founded Variation, 58

π′ refines π Implies L(f,π) ≤ L(f,π′)and U(f,π) ≥ U(f,π′), 213

π π′ Implies L(f,π) ≤ L(f,π′) and

U(f,π) ≥ U(f,π′), 67

π π′ Implies L(f, g,π) ≤ L(f, g,π′)and U(f, g,π) ≥ U(f, g,π′), 124

L1 Is Separable, 292

µ∗f (E) = µ∗(f(E)), 380

f ∈ BV [a, b] ∩ C[a, b] If and Only If Vfand Vf − f Are Continuous and In-

creasing, 62

f ∈ RS[g, a, b] Implies f ∈ RS[Vg, a, b]and f ∈ RS[Vg − g, a, b], 129

f Bounded Variation and g Continuous Im-

plies Riemann - Stieljes Integral Ex-

ists, 139

f Bounded and Continuous At All But Finitely

Many Points Implies f is Riemann In-

tegrable, 97

f Bounded and Continuous At All But One

Point Implies f is Riemann Integrable,

96


f Continuous and g Bounded Variation Im-

plies Riemann - Stieljes Integral Ex-

ists, 139

f Continuous and g Riemann Integrable

Implies f g is Riemann Integrable,

105

f Is Absolutely Continuous IFF µf <<

µ., 381

f2, f1f2 and 1/f Riemann Stieljes Inte-

grable With Respect To g Of Bounded

Variation, 132

µ∗ Measurable Sets Form Algebra, 251

µ∗ Measurable Sets Properties, 252

A Function Of Bounded Variation Is The

Difference of Two Increasing Functions,

58

A Monotone Function Has A Countable Num-

ber of Discontinuities, 42

Abstract Darboux Integral Absolute Inequal-

ity, 218

Abstract Darboux Integral Is Additive, 217

Abstract Darboux Integral Is Monotone, 216

Abstract Darboux Integral Is Scalable, 218

Abstract Darboux Integral Measures, 215

Abstract Darboux Integral Measures Are

Absolutely Continuous, 218

Abstract Darboux Integral Zero Implies f =0 a.e., 218

Abstract Integral Darboux Lower and Up-

per Bounds, 215

Abstract Integration Is Additive, 196

Almost Uniform Convergence Implies Con-

vergence In Measure, 333

Alternate Characterization Of Essentially

Bounded Functions, 238

Approximation Of A Summable Function

With A Continuous Function, 292

Approximation Of Non negative Measur-

able Functions By Monotone Sequences,

172

Approximation Of The Riemann Integral,

89

Average Value For Riemann Integrals, 87

Bounded Differentiable Implies Bounded

Variation, 52

Bounded Variation Implies Riemann Inte-

grable, 80

Cauchy - Schwartz Inequality, 227

Cauchy Fundamental Theorem Of Calcu-

lus, 25

Cauchy In Measure Implies A Convergent

Subsequence, 325

Cauchy In Measure Implies Completeness,

329

Cauchy Schwartz Inequality: Sequence Spaces,

237

Cauchy’s Fundamental Theorem, 88

Characterizing Lebesgue-Stieljes Measures

Constructed From Absolutely Contin-

uous Functions, 391

Conditions For OMI-F Measures, 265

Conditions For OMI-FE Measures, 266

Constant Functions Are Riemann Integrable,

79

Constructing Measures From Non Nega-

tive Measurable Functions, 198

Constructing Outer Measures Via Premea-

sures, 262

Continuous Approximation Of A Charac-

teristic Function, 290

Continuous Approximation Of A Simple

Function, 291

Continuous Implies Riemann Integrable, 79

Convergence Relationships On Finite Mea-

sure Space, 341, 342

Convergence Relationships On General Mea-

surable Space, 342

Convergence Relationships With p-Domination,

343, 344

Convergent Subsequences Exist, 344


Egoroff’s Theorem, 333

Equality a.e. Can Imply Measurability Even

If The Measure Is Not Complete, 193

Equality a.e. Implies Measurability If The

Measure Is Complete, 192

Equivalent Absolute Continuity Conditions

For Charges, 358

Every Riemann Integrable Function Is Lebesgue

Integrable and The Two Integrals Co-

incide, 375

Existence Of The Riemann Integral, 20

Extended Monotone Convergence Theorem,

195

Extended Monotone Convergence Theorem

Two, 200

Fatou’s Lemma, 197

Function Of Bounded Variation Continu-

ous If and Only If Vf Is Continuous,

60

Functions

Properties Of Absolutely Continuous Func-

tions, 379

Functions Of Bounded Variation Always

Possess Right and Left Hand Limits,

59

Functions Of Bounded Variation Are Bounded,

51

Functions Of Bounded Variation Are Closed

Under Addition, 53

Functions of Bounded Variation Are Dif-

ferentiable a.e, 385

Functions Of Bounded Variation Have Count-

able Discontinuity Sets, 59

Fundamental Abstract Integration Inequal-

ities, 206

Fundamental Riemann Integral Estimates,

65

Fundamental Riemann Stieljes Integral Es-

timates, 127

Fundamental Theorem Of Calculus, 22

Fundamental Theorem Of Calculus Reversed,

24

Holder’s Inequality, 226

Holder’s Inequality: p = 1, 243

Holder’s Inequality: Sequence Spaces, 236

Hahn Decomposition Associated With A

Charge, 352

Inner Product On The Space of Square Summable

Equivalence Classes, 244

Integrals Of Summable Functions Create

Charges, 205

Integrand Continuous and Integrator Con-

tinuously Differentiable Implies Rie-

mann - Stieljes Integrable, 139

Integrand Riemann Integrable and Integra-

tor Continuously Differentiable Implies

Riemann - Stieljes Integrable, 141

Integration By Parts, 91

Inverses Of Functions Of Bounded Varia-

tion, 54

Lebesgue Decomposition Theorem, 368

Lebesgue Measure Is Regular, 285

Lebesgue Measure Is Translation Invariant,

293

Lebesgue Outer Measure Is Metric Outer

Measure, 284

Lebesgue’s Criterion For The Riemann In-

tegrability of Bounded Functions, 107

Lebesgue’s Dominated Convergence The-

orem, 208

Lebesgue-Stieljes Measures

Properties, 316

Leibnitz’s Rule, 93

Levi’s Theorem, 203

Limit Interchange Theorem For Riemann -

Stieljes Integral, 148

Linearity of the Riemann - Stieljes Inte-

gral, 116

Measurability and Approximation Condi-

tions, 286


Measure Induced By Outer Measure, 255
Measure Induced By Outer Measure Is Complete, 256
Minkowski's Inequality, 227
Minkowski's Inequality: Sequence Spaces, 237
Monotone Convergence Theorem, 194
Monotone Functions
    A Partition Sum Estimate, 41
Monotone Functions Are Of Bounded Variation, 52
Monotone Implies Riemann Integrable, 79
Non Lebesgue Measurable Set, 295
Open Set Characterization Lemma, 158
Open Sets in a Metric Space Are OMI Measurable, 257
Open Sets In Metric Space µ* Measurable If and Only If µ* Metric Outer Measure, 261
p-Norm Convergence Implies Convergence in Measure, 332
Pointwise a.e. Convergence Plus Domination Implies p-Norm Convergence, 335
Pointwise Limits of Measurable Functions Are Measurable, 169
Products Of Functions Of Bounded Variation Are Of Bounded Variation, 53
Properties of fc, 46
Properties Of The Riemann Integral, 75
Properties Of The Riemann Stieltjes Integral, 127
Radon - Nikodym Theorem: Finite Charge Case, 362
Representing The Cantor Set, 302
Riemann - Lebesgue Lemma, 108
Riemann - Stieltjes Integral, 116
Riemann Stieltjes Fundamental Theorem Of Calculus, 135
Riemann Stieltjes Integral Is Additive On Subintervals, 133
Riemann Stieltjes Integration By Parts, 117
Riemann-Stieltjes Integrable Functions Are Lebesgue-Stieltjes Integrable and The Two Integrals Coincide: One, 377
Sequences Of Equivalence Classes in Lp That Converge Are Cauchy, 230
Sequences That Converge In p - Norm Possess Subsequences Converging Pointwise a.e., 235
Space of Square Summable Equivalence Classes Is A Hilbert Space, 245
Substitution In Riemann Integration, 92
Summable Function Equal a.e. To Another Function With Measure Complete Implies The Other Function Is Also Summable, 203
Summable Function Equal a.e. To Another Measurable Function Implies The Other Function Is Also Summable, 202
Summable Function Form A Linear Space, 207
Summable Implies Finite a.e., 202
The Absolutely Continuous Recapture Theorem, 388
The Antiderivative of f, 87
The Fundamental Theorem Of Calculus, 84
The Jordan Decomposition Of A Finite Charge, 348
The Jordan Decomposition Of A Charge, 352
The Mean Value Theorem For Riemann Integrals, 86
The Metric Space of All Finite Measure Sets Is Complete, 296
The Metric Space of Finite Measurable Sets, 296
The Recapture Theorem, 89


The Riemann Integral Equivalence Theorem, 71
The Riemann Integral Exists On Subintervals, 82
The Riemann Integral Is Additive On Subintervals, 83, 129
The Riemann Integral Is Order Preserving, 66
The Riemann Integral Limit Interchange Theorem, 99
The Riemann Stieltjes Integral Equivalence Theorem, 125
The Riemann Stieltjes Integral Exists On Subintervals, 128
The Riemann Stieltjes Integral Is Order Preserving, 128
The Set Of Equivalence Classes of L1 Is A Normed Linear Space, 224
The Set Of Equivalence Classes of L∞ Is A Normed Linear Space, 241
The Set Of Equivalence Classes of Lp Is A Normed Linear Space, 229
The Space of Equivalence Class of L∞ Is A Banach Space, 241
The Space of Equivalence Class of Lp Is A Banach Space, 230
The Total Variation Is Additive On Intervals, 56
The Total Variation Of A Function Of Bounded Variation, 53
The Upper And Lower Abstract Darboux Integrals Are Finite, 213
The Upper And Lower Darboux Integral Are Finite, 70
The Upper And Lower Riemann - Stieltjes Darboux Integral Are Finite, 124
The Variation Function Of a Function f Of Bounded Variation, 58
The Weak Monotone Recapture Theorem or Monotone Functions Have Summable Derivatives, 386
Two Riemann Integrable Functions Match At All But Finitely Many Points Implies Integrals Match, 96
Variation of a Charge In Terms of The Plus and Minus Parts, 356
Variation of a Charge is a Measure, 355
Vitali Convergence Theorem, 339
Vitali Covering Theorem, 382
Weierstrass Approximation Theorem, 102
Worked Out Solutions
    Integration Substitution
        $\int (t^2 + 1)\, 2\, dt$, 27
        $\int (t^2 + 1)^3\, 4\, dt$, 28
        $\int \sin(t^2 + 1)\, 5t\, dt$, 30
        $\int \sqrt{t^2 + 1}\, 3t\, dt$, 30
        $\int_1^5 (t^2 + 2t + 1)^2 (t + 1)\, dt$, 30


Part X

Glossary Of Terms


Glossary

C

Cauchy Fundamental Theorem of Calculus Let G be any antiderivative of the Riemann integrable function f on the interval [a, b]. Then $G(b) - G(a) = \int_a^b f(t)\,dt$, p. 25.
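The identity in this entry can be checked numerically. The following is only a minimal sketch under stated assumptions: the integrand cos(t), its antiderivative sin(t), the interval [0, 2] and the helper name `midpoint_riemann_sum` are illustrative choices, not taken from the text.

```python
import math

def midpoint_riemann_sum(f, a, b, n):
    """Riemann sum of f on [a, b] using n equal subintervals and their midpoints."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

# Assumed example: f(t) = cos(t) with antiderivative G(t) = sin(t) on [0, 2].
a, b = 0.0, 2.0
exact = math.sin(b) - math.sin(a)                 # G(b) - G(a)
approx = midpoint_riemann_sum(math.cos, a, b, 1000)
gap = abs(exact - approx)                         # should be small
print(gap)
```

A numerical match of this kind illustrates the statement but proves nothing; the theorem itself needs the argument given in the text.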

continuity A function f is continuous at a point p if for all positive tolerances ε, there is a positive δ

so that | f(t) − f(p) | < ε if t is in the domain of f and | t − p | < δ. You should note

continuity is something that is only defined at a point and so functions in general can have

very few points of continuity. Another way of defining the continuity of f at the point p is to say that $\lim_{t \to p} f(t)$ exists and equals f(p), p. 14.

D

differentiability A function f is differentiable at a point p if there is a number L so that for all positive tolerances ε, there is a positive δ so that

$\left| \frac{f(t) - f(p)}{t - p} - L \right| < \epsilon$ if t is in the domain of f and 0 < | t − p | < δ.

You should note differentiability is something that is only defined at a point and so functions in general can have very few points of differentiability. Another way of defining the differentiability of f at the point p is to say that $\lim_{t \to p} \frac{f(t) - f(p)}{t - p}$ exists. At each point p where this limit exists, its value defines a new function called the derivative of f at p. This is usually denoted by f′(p) or $\frac{df}{dt}(p)$, p. 14.
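To make the difference quotient in this entry concrete, here is a small sketch. The function t², the point p = 3 and the helper names are assumptions chosen for illustration; for this function the quotient equals 6 + h, so it approaches L = 6 as h shrinks.

```python
def difference_quotient(f, p, t):
    """Return (f(t) - f(p)) / (t - p); only defined when t differs from p."""
    if t == p:
        raise ValueError("t must differ from p")
    return (f(t) - f(p)) / (t - p)

def square(t):
    return t * t

p = 3.0
steps = (1e-1, 1e-2, 1e-3, 1e-4)
quotients = [difference_quotient(square, p, p + h) for h in steps]
errors = [abs(q - 6.0) for q in quotients]   # distance from the candidate L = 6
for h, q in zip(steps, quotients):
    print(h, q)
```

Shrinking errors are only evidence that the limit exists; the ε, δ definition above is what has to be verified in a proof.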

F

Fundamental Theorem of Calculus Let f be Riemann Integrable on [a, b]. Then the function F defined on [a, b] by $F(x) = \int_a^x f(t)\,dt$ satisfies


2. F is differentiable at each point x in [a, b] where f is continuous and F′(x) = f(x), p. 22.

P

Primitive A primitive of a function f is any function F which is differentiable and satisfies F′(t) = f(t) at all points in the domain of f, p. 15.

R

Riemann Integral If a function f on the finite interval [a, b] is bounded, we can define a special limit which, if it exists, is called the Riemann Integral of the function on the interval [a, b]. Select a finite number of points from the interval [a, b], $t_0, t_1, \ldots, t_{n-1}, t_n$. We don't know how many points there are, so a different selection from the interval would possibly give us more or fewer points. But for convenience, we will just call the last point $t_n$ and the first point $t_0$. These points are not arbitrary: $t_0$ is always a, $t_n$ is always b and they are ordered like this:

$t_0 = a < t_1 < t_2 < \ldots < t_{n-1} < t_n = b$

The collection of points from the interval [a, b] is called a Partition of [a, b] and is denoted by some letter; here we will use the letter P. So if we say P is a partition of [a, b], we know it will have n + 1 points in it, they will be labeled from $t_0$ to $t_n$ and they will be ordered left to right with strict inequalities. But, we will not know what value the positive integer n actually is. The simplest Partition P is the two point partition {a, b}. Note these things also:

1. Each partition of n + 1 points determines n subintervals of [a, b].

2. The lengths of these subintervals always add up to the length of [a, b] itself, b − a.

3. These subintervals can be represented as

$[t_0, t_1], [t_1, t_2], \ldots, [t_{n-1}, t_n]$

or more abstractly as $[t_i, t_{i+1}]$ where the index i ranges from 0 to n − 1.

4. The length of each subinterval is $t_{i+1} - t_i$ for the indices i in the range 0 to n − 1.

Now from each subinterval $[t_i, t_{i+1}]$ determined by the Partition P, select any point you want and call it $s_i$. This will give us the points $s_0$ from $[t_0, t_1]$, $s_1$ from $[t_1, t_2]$ and so on up to the last point, $s_{n-1}$ from $[t_{n-1}, t_n]$. At each of these points, we can evaluate the function f to get the value $f(s_i)$. Call these points an Evaluation Set for the partition P. Let's denote such an evaluation set by the letter E. If the function f was nice enough to be positive always and continuous, then the product $f(s_i) \times (t_{i+1} - t_i)$ can be interpreted as the area of a rectangle; in general, though, these products are not areas. Then, if we add up all these products, we get a sum which is useful enough to be given a special name: the Riemann sum for the function f associated with the Partition P and our choice of evaluation set $E = \{s_0, \ldots, s_{n-1}\}$. This


sum is represented by the symbol S(f, P, E) where the things inside the parentheses are there to remind us that this sum depends on our choice of the function f, the partition P and the evaluation set E. The Riemann sum is normally written as

$S(f, P, E) = \sum_{i \in P} f(s_i)\,(t_{i+1} - t_i)$

and we just remember that the choice of P will determine the size of n. Each partition P has a maximum subinterval length; let's use the symbol || P || to denote this length. We read the symbol || P || as the norm of P. Each partition P and evaluation set E determines the number S(f, P, E) by a simple calculation. So if we took a collection of partitions $P_1$, $P_2$ and so on with associated evaluation sets $E_1$, $E_2$ etc., we would construct a sequence of real numbers $S(f, P_1, E_1), S(f, P_2, E_2), \ldots, S(f, P_n, E_n), \ldots$. Let's assume the norm of the partition $P_n$ gets smaller all the time; i.e. $\lim_{n \to \infty} \| P_n \| = 0$. We could then ask if this sequence of numbers converges to something. What if the sequence of Riemann sums we construct above converged to the same number I no matter what sequence of partitions whose norm goes to zero and associated evaluation sets we chose? Then, we would have that the value of this limit is independent of the choices above. This is what we mean by the Riemann Integral of f on the interval [a, b]. If there is a number I so that

$\lim_{n \to \infty} S(f, P_n, E_n) = I$

no matter what sequence of partitions $P_n$ with associated sequence of evaluation sets $E_n$ we choose as long as $\lim_{n \to \infty} \| P_n \| = 0$, we say that the Riemann Integral of f on [a, b] exists and equals the value I. The value I is dependent on the choice of f and interval [a, b]. So we often denote this value by I(f, [a, b]) or more simply as I(f, a, b). Historically, the idea of the Riemann integral was developed using area approximation as an application, so the summing nature of the Riemann Sum was denoted by the 16th century letter S which resembled an elongated or stretched letter S which looked like what we call the integral sign $\int$. Hence, the common notation for the Riemann Integral of f on [a, b], when this value exists, is $\int_a^b f$. We usually want to remember what the independent variable of f is also and we want to remind ourselves that this value is obtained as we let the norm of the partitions go to zero. The symbol dt for the independent variable t is used as a reminder that $t_{i+1} - t_i$ is going to zero as the norm of the partitions goes to zero. So it has been very convenient to add to the symbol $\int_a^b f$ this information and use the augmented symbol $\int_a^b f(t)\,dt$ instead. Hence, if the independent variable was x instead of t, we would use $\int_a^b f(x)\,dx$. Since for a function f, the name we give to the independent variable is a matter of personal choice, we see that the choice of variable name we use in the symbol $\int_a^b f(t)\,dt$ is very arbitrary. Hence, it is common to refer to the independent variable we use in the symbol $\int_a^b f(t)\,dt$ as the dummy variable of integration, p. 17.
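The construction in this entry can be sketched in a few lines of code. This is a hedged illustration, not part of the text: the function names, the choice f(t) = t² on [0, 1], the use of equally spaced partitions and the randomly chosen evaluation sets are all assumptions made for the example.

```python
import random

def riemann_sum(f, partition, evaluation_set):
    """Return S(f, P, E): the sum of f(s_i) * (t_{i+1} - t_i) over P."""
    if len(evaluation_set) != len(partition) - 1:
        raise ValueError("need exactly one evaluation point per subinterval")
    total = 0.0
    for i, s in enumerate(evaluation_set):
        left, right = partition[i], partition[i + 1]
        if not left <= s <= right:
            raise ValueError("evaluation point lies outside its subinterval")
        total += f(s) * (right - left)
    return total

def uniform_partition(a, b, n):
    """Partition of [a, b] into n equal subintervals, so ||P|| = (b - a) / n."""
    return [a + i * (b - a) / n for i in range(n)] + [b]

def random_evaluation_set(partition, rng):
    """Pick one arbitrary point s_i from each subinterval [t_i, t_{i+1}]."""
    return [min(right, rng.uniform(left, right))
            for left, right in zip(partition, partition[1:])]

rng = random.Random(0)  # fixed seed so the run is repeatable
sums = []
for n in (10, 100, 1000, 10000):
    P = uniform_partition(0.0, 1.0, n)
    E = random_evaluation_set(P, rng)
    sums.append(riemann_sum(lambda t: t * t, P, E))
    print(n, sums[-1])
```

As the norm of the partition shrinks, the printed sums settle near 1/3 even though the evaluation points are arbitrary, which is the independence of choices that the definition asks for. Equal spacing is used only to keep the norm easy to control; the definition allows any partitions whose norms go to zero.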


Part XI

Appendix: Undergraduate Analysis Examinations

Appendix A

Senior Advanced Calculus I Study Guides

Presented below are study guides and course description material from some of the times we have taught the first semester senior level undergraduate analysis course here. At Clemson University, it is called MTHSC 453. Another source of information is our web-based discussion board material. The MTHSC 453 discussion board can be accessed either through my web page by following the links or by the direct route http://www.ces.math.clemson.edu/discus. On that page you will see exercises, worked out problems and so forth in an easy to follow format.

A-1 Course Structure

This course is about learning how to think carefully about mathematics. This course (and our companion one, Abstract Algebra, here MTHSC 412) are the first courses where we really insist that you begin to master one of the big tools of our trade: abstraction. Now, of course this is not easy as most of you have not really needed to learn how to do this in your previous courses. However, the time has come, and in this class, we will work very hard to help you learn how to think well and deeply about mathematical issues. We will learn how to do proofs of many propositions and learn a lot of facts about how functions of a real variable x behave. A lot of what we learn can be generalized (and for good useful reasons as we will see) to more abstract settings and we'll set pointers to the future about that as much as we can. But most of all, we will consistently challenge you to think about the how and why of all that we do. It should be fun!

To help you with all of this new and interesting stuff, we will be using web-based discussion pages. You can check out details about this later in this syllabus.

In broad outline, we will discuss:

• Sequences

• Subsequences

• Bolzano-Weierstrass theorem

• Limit inferiors and Limit superiors

• Basics about sets of real numbers: call this point set topology

• Limits of functions

• Continuity of functions

• Consequences of Continuity

• Differentiation of functions


• Consequences of Differentiation

A-2 Study Guide

The final exam is a closed book and closed notes test. It is also cumulative for all of the material in this class. Here is a list of what is typically done in this class.

Give precise mathematical definitions of the following mathematical concepts using the appropriate mathematical formalism:

Basics:
    1. Mathematical Induction
    2. Bounded sets
    3. Infimum and supremum of a bounded set
    4. Theorem: If S is bounded above and M = sup S, then for any y < M, there is an x ∈ S such that y < x ≤ M. (Similar type theorem for infimums).
    5. Theorem: Every nonempty set bounded above has a supremum. Every nonempty set bounded below has an infimum.
    6. Theorem: Triangle Inequality and Reverse Triangle Inequality

Functions, Sequences and Limits:
    1. Functions f : S → ℜ, where S is a subset of ℜ.
    2. Let f : S → ℜ; $\lim_{x \to x_0} f(x)$
    3. Let f : S → ℜ;
        (a) S is unbounded above: $\lim_{x \to \infty} f(x)$, convergence or divergence
        (b) S is unbounded below: $\lim_{x \to -\infty} f(x)$, convergence or divergence
    4. Sequences
        (a) f : $N_0$ → ℜ; $N_0$ is the set of integers starting at $n_0$.
        (b) Let $f(n) = a_n$. Then the sequence can be written as $\{a_n\}_{n_0}^{\infty} \equiv \{a_n\}$, where it is usually understood where the sequence begins.

        (a) $\lim_{n \to \infty} a_n = A$, convergence or divergence.
        (b) Squeezing Lemma
        (c) Theorem: Uniqueness of Limits
    5. Operations with Sequences; $\lim_{n \to \infty} a_n \equiv \lim a_n$
        (a) Theorem: If sequence converges to A, so does any subsequence
        (b) Theorem: If sequence converges, sequence forms bounded set
        (c) Theorem: $\lim(a_n + b_n)$
        (d) Theorem: $\lim(a_n - b_n)$
        (e) Theorem: $\lim(a_n b_n)$
        (f) Theorem: $\lim\left(\frac{a_n}{b_n}\right)$
        (g) Theorem: $\lim |a_n| = |\lim(a_n)|$
        (h) Theorem: If $a_n \ge b_n$, then $\lim a_n \ge \lim b_n$
    6. Limits of Functions
        (a) Limits from the right and the left
        (b) Operations with Limits of Functions: (All of the items under operations with sequences translated to the function setting)
    7. Monotone Sequences
        (a) Increasing and Decreasing Sequences
        (b) Non-increasing and non-decreasing Sequences
        (c) Completeness Axiom: Non-decreasing sequences bounded above converge and $\lim a_n = \sup a_n$. Theorem: (Similar statement for non-increasing sequences)
    8. Monotone Functions
        (a) Increasing and decreasing functions
        (b) Non-decreasing and non-increasing functions

Continuity and Limits Revisited:
    1. Continuity of f : S → ℜ at a ∈ S.
    2. Discontinuity of f : S → ℜ at a ∈ S.


    3. Uniform continuity of f : S → ℜ on S.
    4. Operations with continuous functions
        (a) Same items as in operations with functions
        (b) Theorem: Intermediate Value Theorem
        (c) Theorem: Inverse Functions
    5. Cluster Points and Accumulation Points
        (a) Cluster Point of a sequence
        (b) Cluster point of a function
        (c) Theorem: Bolzano-Weierstrass Theorem. Every bounded infinite set has at least one accumulation point and every bounded sequence has at least one cluster point
        (d) Theorem: Cauchy Criterion Sequences
        (e) Theorem: Cauchy Criterion Functions
    6. Limit Inferior
    7. Limit Superior
    8. Theorem: Let [a, b] be finite. If f is continuous on [a, b], then f is bounded on [a, b].
    9. Theorem: Let [a, b] be finite. Then there is a point $x_0$ ∈ [a, b] with $f(x_0) \ge f(x)$ ∀x ∈ [a, b]. (Continuous function achieves maximum on finite interval).
    10. Theorem: Let [a, b] be finite. Then f continuous on [a, b] implies f is uniformly continuous on [a, b].

Derivatives:
    1. Derivative of a function
    2. Theorem: f has derivative at a implies f is continuous at a.
    3. Theorem: Chain Rule
    4. Secant Lines, Tangent Lines
    5. Theorem: If f attains an extremum at a, then if f′(a) exists, f′(a) = 0.
    6. Theorem: Rolle's Theorem
    7. Theorem: Mean Value Theorem
    8. Theorem: If f′(x) = g′(x), then f(x) = g(x) + c, where c is a constant.
    9. Theorem: f non-decreasing implies f′(a) ≥ 0 at every point a where f is differentiable.
    10. Theorem: f′(x) ≥ 0 on interval implies f is non-decreasing on the interval. If f′(x) > 0, then f is increasing on the interval.
    11. Theorem: Cauchy Mean Value Theorem
    12. Theorem: L'Hopital's Rule
    13. Theorem: Taylor's Theorem with Remainder
    14. Maximum and Minimum Values of f
        (a) Critical Points of f
        (b) Theorem: First Derivative Test
        (c) Theorem: Second Derivative Test

A-3 Sample Exams Version A

A-3.1 Exam 1A

Instructions: This is a closed book and closed notes test. You will need to give us all the details of your arguments as that is the only way we can decide if partial credit is warranted.

Part 1: Definitions (24 Points)

1. (3 Points) Give a precise mathematical definition of the meaning of a sequence of real numbers.

2. (3 Points) Give a precise mathematical definition of the phrase "the sequence $\{a_n\}$ converges to a".

3. (9 Points) Let S be a nonempty set of real numbers. Give a precise mathematical definition of the following phrases:

(a) U is an upper bound of S
(b) U is the least upper bound of S



(c) u is a maximal element of S

4. (3 Points) State the Principle of Mathematical Induction.

5. (3 Points) State the Triangle Inequality for Real Numbers.

6. (3 Points) Give a precise mathematical definition of the phrase "the limit of the function f(x) as x tends to infinity is A".

Part 2: Short Answer (36 Points)
You must determine whether or not these statements are true. If the statement is true, you MUST give us the reason why it is true; if the statement is false, you must give us a counterexample.

1. (4 Points) Is it true that all sequences which converge are bounded?

2. (4 Points) Is it true that if a sequence is bounded it must converge?

3. (4 Points) If $|a_n| \to |a|$, is it true that $a_n \to a$?

4. (4 Points) If $a_n \to a$, is it true that $|a_n| \to |a|$?

Show all your work on the short calculational exercises below. You may use a calculator if you wish.

1. (5 Points) Find the inf and sup of the set

$S = \{ x \in (0, \frac{\pi}{2}) \mid \tan(x) < \exp(-x) \}$

You can indicate these points graphically.

2. (5 Points) Give an example of a function whose domain is the real numbers which has a finite limit as x tends to infinity.

3. (5 Points) Give an example of a sequence of real numbers which does not converge.

4. (5 Points) Give an example of a nonempty set of real numbers for which the inf and sup are the same.

Part 3: Proofs (40 Points)
Provide careful proofs of the following propositions. You will be graded on the mathematical correctness of your arguments as well as your use of language, syntax and organization in the proof.

1. (20 Points) Proposition: For all n ≥ 1,

$\frac{1}{1 \cdot 2} + \ldots + \frac{1}{n \cdot (n+1)} = \frac{n}{n+1}$

2. (20 Points) Proposition: If the sequence $\{a_n\}$ is defined by

$a_n = \frac{-4n^2 - 2n + 3}{n^2 - 3n}$, n ≥ 2,

then the sequence converges.

A-3.2 Exam 2A


Part 1: Definitions (36 Points)
Give precise mathematical definitions of the following mathematical concepts using the ε − δ formalism:

1. (3 Points) "The sequence $\{a_n\}$ converges to a".

2. (4 Points) "The function f : D ⊆ R → R has a limit A at the point a ∈ D".

3. (4 Points) "The function f : D ⊆ R → R is continuous at the point a ∈ D".

4. (4 Points) "The function f : D ⊆ R → R is uniformly continuous in D".



Give precise mathematical definitions of the following mathematical concepts:

1. (3 Points) "The sequence $\{a_n\}$ is a monotone sequence".

2. (3 Points) "The function f : [0, 1] → R is monotone increasing".

State precisely the following mathematical theorems and/or axioms:

1. (5 Points) The Completeness Axiom

2. (5 Points) The Intermediate Value Theorem

3. (5 Points) The Bolzano Weierstrass Theorem

Part 2: Short Answer (30 Points)
You must answer the following questions. If the answer is YES, you MUST give us the reason why; if the answer is NO, you must also give a reason.

1. (3 Points) If the sequence $\{a_n\}$ is increasing and bounded above by M, does the sequence converge?

2. (3 Points) If the sequence $\{a_n\}$ is increasing and bounded above by M, does the sequence converge to M?

3. (3 Points) If the sequence $\{a_n\}$ is bounded above by M and below by L, does the sequence converge?

4. (3 Points) If the sequence $\{a_n\}$ is bounded above by M and below by L, does the sequence have at least one convergent subsequence?

5. (4 Points) Consider f : [0, 1] → R. Is it possible for $f^2$ to be continuous on [0, 1] and f to be discontinuous at each point in [0, 1]?

6. (4 Points) Consider f : (0, 1) → R. Is it possible for f to be continuous only at x = .5 and discontinuous everywhere else in (0, 1)?

7. (4 Points) Consider f : (0, 1) → R. Is it possible for f to be continuous on the irrational numbers in (0, 1) and discontinuous on the rational numbers in (0, 1)?

Show all your work on the short calculational exercise below. You may use a calculator if you wish.

1. (6 Points) Prove that there is a solution to the equation

$x + 10 \sin(x) = 0$

on the interval [1.0, 4.5].


1. (17 Points) Proposition: If the function f : D ⊆ R → R is continuous at a ∈ D, then there exists δ > 0 and M > 0 such that |f(x)| < M if |x − a| < δ.


$a_1 = 1$,

$a_{n+1} = \frac{2a_n + 3}{4}$, n > 1


A-3.3 Exam 3A


Part 1: Definitions (28 Points)
Give precise mathematical definitions of the following mathematical concepts using the appropriate mathematical formalism:



1. (4 Points) "The Cauchy Criterion for a sequence $\{a_n\}$".

2. (4 Points) "A is a cluster point of the sequence $\{a_n\}$".

3. (4 Points) "The limit superior of the sequence $\{a_n\}$".

4. (4 Points) "The function f : D ⊆ R → R, where D is the domain of the function f, has a derivative at $x_0$ ∈ D".

State precisely the following mathematical theorems:

1. (4 Points) The Mean Value Theorem

2. (4 Points) The Bolzano-Weierstrass Theorem

3. (4 Points) Rolle's Theorem

Part 2: Short Answer (38 Points)
You must answer the following questions. If the answer is YES or NO, you MUST give us the reason why (e.g. the complete statement of a relevant theorem, a counterexample etc.).

1. (4 Points) Is it possible for a function to be continuous at a point but not differentiable there?

2. (4 Points) Is it possible for a function to be differentiable at a point but not continuous there?

3. (4 Points) Is $f(x) = \sqrt{x}$ uniformly continuous on [0, 1]?

4. (4 Points) Is it possible for a function that is continuous on [2, 5] to be unbounded?

5. (4 points) Does every continuous function on [0, 1] achieve a maximum value?


1. (6 Points) If

$f(x) = \begin{cases} x^2, & x \le 1 \\ 2x - 1, & x > 1 \end{cases}$

find f′(x) at all points where the derivative exists.

2. (6 points) Find all cluster points of the sequence $\{a_n\}$, where

an = (−1)nsin(nπ

4) +

13n, n ≥ 1,

and verify that each is indeed a cluster point.

3. (6 points) Use the Mean Value Theorem to show that if x < y, then

$e^x (y - x) < e^y - e^x < e^y (y - x)$.


1. (17 Points) Proposition: Let f : (a, b) → R and assume that f′(x) < 0 on (a, b). Then, f is strictly decreasing on (a, b).

2. (17 Points) Proposition: If the sequence $\{a_n\}$ satisfies

|an+1 − an| <13n, n > 0,


A-3.4 Final A





1. (6 points) f : [a, b] → R is continuous at a ∈ (a, b).

2. (6 points) f : [a, b] → R is uniformly continuous in [a, b].

3. (6 points) The Cauchy Criterion for convergence of a sequence $\{a_n\}$.

4. (6 points) A is a cluster point of the sequence $\{a_n\}$.

5. (6 points) µ is the least upper bound of the nonempty and bounded set S.

6. (6 points) The $\liminf a_n$, when $\{a_n\}$ is a bounded sequence.

State precisely the following mathematical theorems or axioms:

1. (6 Points) The Bolzano Weierstrass Theorem

2. (6 Points) The Completeness Property of the real numbers

3. (6 Points) Rolle's Theorem

4. (6 points) The First Derivative Test

5. (6 points) The Intermediate Value Theorem

Part 2: Short Answer (66 Points)
Answer the following questions. If the answer is YES or NO, you MUST give us the reason why (e.g. the complete statement of a relevant theorem, a counterexample etc.).

1. (4 points) Is it possible for a continuous function on [1, ∞) to be unbounded?

2. (4 points) Is it necessarily true that the limit inferior of a sequence is a cluster point of the sequence?

3. (4 points) Is it possible for a function to be continuous and differentiable at only one point?

4. (4 points) If a sequence is bounded, does it have to converge?

5. (4 points) If the first derivative of a function is zero at a point, does the function have to have a maximum or minimum at that point?

6. (4 points) If the second derivative of a function exists at a point, is the first derivative of the function continuous at that point?

7. (10 points) Use the Intermediate Value Theorem to prove that $x^5 + 2x^3 + x - 3$ has at least one real root. Use Rolle's Theorem to show that the root is unique.

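8. (12 points) Let the sequence $\{a_n\}$ be recursively defined by the formula

$a_1 = 1$,

$a_{n+1} = \left( 1 - \frac{1}{(n+1)^2} \right) a_n$.

Show that $\{a_n\}$ converges.

9. (10 points) Find $\lim_{x \to 0^+} \frac{\tan(x) - x}{x^3}$

10. (10 points) Show that

$\left| \sin(x) - \left( x - \frac{x^3}{6} + \frac{x^5}{120} \right) \right| < \frac{1}{5040}$, for |x| ≤ 1.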


1. (17 points) If $a_n \to a$, $b_n \to b$ and $c_n \to c$, use an ε − δ argument to prove that $a_n + b_n + c_n \to a + b + c$.

2. (17 points) If f : R → R is defined by $f(x) = 3x^2 - 2x + 1$, use an ε − δ argument to prove that f is continuous at x = 2.


3. (17 points) Prove that if f : [a, b] → R has a bounded derivative f′(x) on (a, b), then f is uniformly continuous on (a, b).

4. (17 points) Prove for all positive integers n,

$\frac{d}{dx}(x^n) = n x^{n-1}$

A-4 Sample Exams Version B

A-4.1 Exam 1B



1. (3 Points) Give a precise mathematical definition of the meaning of a sequence of real numbers.

2. (3 Points) Give a precise mathematical definition of the phrase "the sequence $\{a_n\}$ converges to a".

3. (3 Points) Let S be a nonempty set of real numbers. Give a precise mathematical definition of the supremum of S.

4. (6 Points) State precisely the following theorems:
(a) The limit of the sum of sequences theorem.
(b) The limit of product of sequences theorem.

5. (3 Points) State the Triangle and Reverse Triangle Inequality for Real Numbers.


1. (6 Points) Let the sequence $a_n \to a$, $a \neq 0$, as n → ∞. Then there is an integer N such that if n > N, then $\frac{|a|}{2} \le |a_n| \le |a| + 1$.

2. (4 Points) Is it true that if two sequences diverge, their product must diverge?

3. (4 Points) Is it true that the infimum of a set is always achieved by some element of the set?

4. (4 Points) If |x| < ε for all ε > 0, can x be nonzero?

Show all your work on the short calculational and/or discussion exercises below. You may use a calculator if you wish.

1. (8 Points) The sequence $\left\{ \frac{n}{n+1} \right\}$ clearly has limit 1. Discuss what happens in an attempted convergence proof if you try to prove the limit is 2. Where does the proof fail?

2. (5 Points) What method of proof would be required to prove the proposition $\frac{d}{dx}(x^n) = n x^{n-1}$?

3. (4 Points) Find the limit, if it exists, for the sequence whose nth term is:

an =n(n+ 4)4n(n+ 1)

.

4. (3 Points) Find the limit, if it exists, for the sequence whose nth term is:

an =2n

nn.

5. (6 Points) Find the inf and sup of the set

S = i2j|i, j are pos. int.




1. (19 Points) Proposition: For all n ≥ 1,

$1 + 4 + 7 + \ldots + (3n - 2) = \frac{1}{2} n (3n - 1)$


$a_n = \frac{3n^2 + 12n - 13}{2n^2 - 5n}$, n ≥ 1,


A-4.2 Exam 2B



1. Give precise mathematical definitions of the following mathematical concepts using either the ε, δ or ε, N formalism:
(a) (4 Points) "The function f : D ⊆ R → R has a limit A at the point a ∈ D".
(b) (4 Points) "The function f : D ⊆ R → R is continuous at the point a ∈ D".
(c) (4 Points) "The function f : D ⊆ R → R is uniformly continuous in D".

2. Give precise mathematical definitions of the following mathematical concepts:
(a) (3 Points) A is a cluster point of the sequence $\{a_n\}$.
(b) (3 Points) A is a cluster point of the function f(x) at x = a.
(c) (3 Points) A is an accumulation point of the nonempty set B.

3. State precisely the following mathematical theorems and/or axioms: the
(a) (5 Points) Completeness Axiom
(b) (5 Points) Intermediate Value Theorem
(c) (5 Points) Bolzano Weierstrass Theorem

Part 2: Short Answer (31 Points)

1. You must answer the following questions. If the answer is YES, you MUST give us the reason why; if the answer is NO, you must also give a reason or a counterexample.
(a) (3 Points) If the sequence $\{a_n\}$ is increasing and bounded below by M, does the sequence necessarily converge?
(b) (3 Points) If the sequence $\{a_n\}$ is decreasing and bounded below by M, does the sequence converge to M?
(c) (3 Points) Is it possible for the function f + g to be continuous for all real numbers even though both f and g are continuous at only one point each?

2. Show all your work on the short calculational exercises below. You may use a calculator if you wish.
(a) (6 Points) Prove that there is a solution to the equation

$2 + \frac{x}{2 + \sin^2(x)} + \frac{x^3}{x^2 + 3} = 0$

on the interval [−5, 0].
(b) (8 Points) Find the set of cluster points for the function $f(x) = \frac{1}{4} \sin\left(\frac{5}{x}\right)$ at x = 0.
(c) (8 Points) Find the set of cluster points for the sequence $\{a_n\}$, where $a_n = \cos\left(\frac{n\pi}{3}\right)$, and exhibit a subsequence which converges to each point.




1. (15 Points) Use an ε, δ argument to prove that if the functions f, g : D ⊆ R → R are continuous at a ∈ D, then the function f + 2g : D ⊆ R → R is also continuous at a ∈ D.

2. (18 Points) Let the sequence $\{a_n\}$ be defined by

$a_1 = 1$,

$a_{n+1} = \frac{5(1 + a_n)}{5 + a_n}$, n > 1.

It can be shown that $\sqrt{5} < a_n < 5$ for all n ≥ 2. Prove that this sequence converges and find the value of the limit.

BONUS: (5 Points) Show that $\sqrt{5} < a_n < 5$ for all n ≥ 2.

A-4.3 Exam 3B



1. Give precise mathematical definitions of the following mathematical concepts using the appropriate mathematical formalism:
(a) (4 Points) "The Cauchy Criterion for a sequence $\{a_n\}$".
(b) (4 Points) "The function f : [a, b] ⊆ R → R, where [a, b] is a closed and finite interval, has a derivative at $x_0$ ∈ (a, b)".

2. State precisely the following mathematical theorems: the
(a) (4 Points) Completeness Axiom
(b) (4 Points) Bolzano-Weierstrass Theorem
(c) (4 Points) Rolle's Theorem
(d) (4 Points) Intermediate Value Theorem
(e) (4 Points) The Mean Value Theorem

Part 2: Short Answer (20 Points)
You must answer the following questions. If the answer is YES or NO, you MUST give us the reason why (e.g. the complete statement of a relevant theorem, a counterexample etc.).

1. (4 Points) Is it possible for the maximum and minimum of a function to be the same?

2. (4 Points) Let f and f′ denote a function and its derivative. Is it possible for f′(c) to exist even though f′ is not continuous at c?

3. (4 Points) Is $f(x) = \frac{1}{x+2}$ uniformly continuous on [0, 1]?

4. (4 Points) Can a function f be differentiable on an interval I and be unbounded on I?

5. (4 points) If f and g are continuous functions on [0, 1], does the function $f - g^2$ achieve a minimum value on [0, 1]?

Part 3: Calculational Exercises (22 Points)
Show all your work on the short calculational exercises below. You may use a calculator if you wish.

1. (8 Points) If

$f(x) = \begin{cases} x^2 + a, & x \le -1 \\ bx, & -1 < x < 1 \\ -cx^2 + d, & x \ge 1 \end{cases}$



(a) Find values of a, b, c, d that will make f′ exist for all x.
(b) Find the set of x where f′ is continuous.

2. (8 points) Use the Mean Value Theorem to approximate $(28)^{1/3}$.

3. (6 Points) Find the following limit:

$\lim_{x \to 0^+} \frac{-\ln(x)}{\csc(x)}$

Part 3: Proofs (30 Points)

Provide careful proofs of the following propositions. You will be graded on the mathematical correctness of your arguments as well as your use of language, syntax and organization in the proof.

1. (15 Points) Prove that the function $f(x) = 3x^3 + 4x + 9$ has one and only one root.

2. (15 Points) Proposition: If the sequence $\{a_n\}$ satisfies

$$|a_{n+1} - a_n| < \left(\frac{2}{5}\right)^n, \quad n > 0,$$


A-4.4 Final B



1. Give precise mathematical definitions of the following mathematical concepts using the appropriate mathematical formalism: (24 Points)

(a) (4 Points) The sequence $\{a_n\}$ converges to a.

(b) (4 points) The Cauchy Criterion for convergence of a sequence $\{a_n\}$.

(c) (4 points) A is a cluster point of the sequence $\{a_n\}$.

(d) (4 points) f : [a, b] → R is continuous at a ∈ (a, b).

(e) (4 points) f : [a, b] → R is uniformly continuous in [a, b].

(f) (4 Points) A is a cluster point of the function f at the point a.

2. (32 Points) State precisely the following mathematical theorems or axioms:

(a) (4 Points) Bolzano Weierstrass Theorem

(b) (4 Points) Completeness Property of the real numbers

(c) (4 Points) Rolle's Theorem

(d) (4 points) Intermediate Value Theorem

(e) (4 Points) Mean Value Theorem

(f) (8 Points) Taylor Polynomial and Remainder Theorem

(g) (4 points) First Derivative Test


Answer the following questions. If the answer is YES or NO, you MUST give us the reason why (e.g. the complete statement of a relevant theorem, a counterexample etc.).

1. (4 points) Is it possible for a continuous function on (0, 1) to be unbounded?

2. (4 points) Is it possible for a continuous function on [0, 1] to be unbounded?

3. (4 points) Is it possible for a differentiable function on (0, 1) to be unbounded at the point .5?

4. (4 points) Is it possible for a sequence $\{a_n\}$ to have two cluster points when its limit exists?


5. (4 points) Is it possible for a function to be continuous at only one point?

6. (4 points) Is it possible for a function to be continuous and differentiable at only one point?

7. (4 points) If a function is bounded, does it have to be continuous?

8. (4 Points) If a function is bounded, does it necessarily have a maximum and a minimum?

9. (4 Points) If the first derivative of a function f is zero at the point $x_0$, is it necessarily true that f attains a local maximum value at $x_0$?

Part 3: Calculational Exercises (39 Points)

1. (9 points) Find the cluster points of the sequence defined by $a_n = (-1)^n \sin\left(\frac{n\pi}{3}\right)$ for all n > 0 and exhibit explicit subsequences for each cluster point.

2. (10 points) Find the cluster points of the function defined by $f(x) = \sin\left(\frac{1}{x^2}\right)$ at x = 0 and exhibit an explicit subsequence for each cluster point.

3. (10 points) Estimate the largest interval [−a, a] for which

$$|\sin(x) - x| < 10^{-6},$$

4. (10 points) Find the smallest order of Taylor polynomial, $p_n(x)$, centered at x = 0 for which

$$|\cos(x) - p_n(x)| < 10^{-3},$$


1. (9 points) Show that if the sequence $\{a_n\}$ is a Cauchy sequence, then it is bounded.

2. (15 points) Let the sequence $\{a_n\}$ be defined by $a_n = \frac{n^2 + 2}{-3(n^2) + 5}$, for all n > 1. Use an ε, δ argument to prove the limit as n goes to infinity exists.

3. (15 points) If f : R → R is defined by $f(x) = 4(x)^2 - 2x + 10$, use an ε, δ argument to prove that f is continuous at x = 1.

4. (15 points) If f : R → R is defined by $f(x) = 3(x)^2 + 3x + 8$, use an ε, δ argument to prove that f is uniformly continuous on the interval [−2, 3].

5. (15 points) If the sequence $\{a_n\}$ is defined by

$$a_n = \begin{cases} 1, & n = 1 \\ \sqrt{2\, a_{n-1}}, & n > 1, \end{cases}$$

prove that the limit of this sequence exists and find its value.

A-5 Sample Exams Version C

A-5.1 Exam 1C





1. (3 Points) Let S be a nonempty set of real numbers. Give a precise mathematical definition of the supremum of S.

2. (3 Points) Let p be an accumulation point of the nonempty set S. Give a precise mathematical definition of the term accumulation point.

3. (3 Points) State the Triangle and Reverse Triangle Inequality for Real Numbers.

4. (4 Points) State the Bolzano-Weierstrass Theorem.

5. (4 Points) State the Heine-Borel Theorem.

6. (3 Points) Give a precise mathematical definition of the meaning of the phrase $\lim_{x \to a} f(x) = L$.


1. (4 Points) Is it true that the supremum of a set is always achieved by some element of the set?

2. (4 Points) Is it true that every open cover of the interval [2, 5] must have a finite subcover?

3. (4 Points) Let $f(x) = 4\lambda x(1 - x)$, $\lambda \in (0, 1)$ and consider the sequence of points generated by the procedure $x_0 = .1$, $x_1 = f(x_0)$, $x_2 = f(x_1)$ etc. What does the Bolzano-Weierstrass theorem say about this sequence?

4. (4 Points) Is it possible for a set to have no accumulation points?

Show all your work on the short calculational and/or discussion exercises below. You may use a calculator if you wish.

1. (8 Points) Let S = (−1, 10]. Find an open cover of this set that has no finite subcover.

2. (12 Points) Let $S = \left\{\frac{1}{4n} : n = 1, 2, \cdots\right\} \cup (2, 5)$.

(a) (6 Points) Find all accumulation points of S.

(b) (6 Points) Find the inf and sup of S.

3. (8 Points) Finish the following proof:

Proposition A-5.1. If |x| < ε, ∀ε > 0, then x = 0.

Proof:

(a) We will prove by contradiction.

(b) Assume x ≠ 0.

(c) Since |x| < ε for any positive ε, we are free to choose the particular value $\varepsilon = \frac{|x|}{2}$.

(d) SHOW THIS LEADS TO A CONTRADICTION.


1. (12 Points) Provide an ε − δ proof of the following proposition:

Proposition A-5.2.

$$\lim_{x \to 4} (x^2 + 6) = 22.$$

2. (12 Points)

Proposition A-5.3. For all n > 1,

$$\frac{n}{n+1} = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n \cdot (n+1)}.$$

3. (12 Points)

Proposition A-5.4. $\sqrt{5}$ is irrational.

A-5.2 Exam 2C

Instructions:
This is a closed book and closed notes test. You will need to give us all the details of your arguments as that is the only way we can decide if partial credit is warranted.


1. (3 Points) Let S be a set of real numbers, let p ∈ S and let $f : S \to \Re$ be a function. Give a full, complete and precise definition of the continuity of the function f at the point p.

2. (3 Points) Let S be a set of real numbers, let p ∈ S and let $f : S \to \Re$ be a function. Give a full, complete and precise definition of the uniform continuity of f on the set S.

3. (3 Points) Let S be a set of real numbers, let p ∈ S and let $f : S \to \Re$ be a function. Give a full, complete and precise definition of the derivative of f at the point p.

4. (3 Points) Let S be a set of real numbers, let p ∈ S and let $f : S \to \Re$ be a function. Give a full, complete and precise definition of the $\lim_{x \to \infty} f(x) = -\infty$.

Part 2: Theorems: (12 Points)
Give full, complete and precise statements of the following theorems:

1. (4 Points) The Intermediate Value Theorem

2. (4 Points) Rolle's Theorem

3. (4 Points) The Mean Value Theorem


1. (4 Points) Let f be a function defined on the finite interval [a, b] with 0 ∈ (a, b). Assume $f'(0) = 0$ and $f''(0) = 0$. Is it possible for f to have a minimum or maximum at 0?

2. (4 Points) Let f be a function defined on the finite set S. If f is bounded and continuous on S, is it necessarily true that f is uniformly continuous on S?

3. (4 Points) Is it possible for a function to be continuous at only one point?

4. (4 Points) Let f be a function defined on the finite set S and assume $f'$ and $f''$ exist on S. Let p ∈ S with $f'(p) > 0$. Is it true that there is a δ > 0 so that $f' > 0$ on the interval (p − δ, p + δ)?

Part 4: Calculations (34 Points)
Show all your work on the short calculational and/or discussion exercises below. You may use a calculator if you wish.

1. (6 Points)

$$\lim_{x \to \infty} x \ln\left(\frac{2x + 1}{2x - 10}\right)$$

2. (6 Points) Prove

$$\sqrt{1 + x} < 1 + \frac{x}{2}, \quad \forall x > 0$$

3. (6 Points) Is $f(x) = \frac{1}{x}$ uniformly continuous on (0, 1)?

4. (6 Points) Is the function f defined by

$$f(x) = \begin{cases} 2x - 1, & x \in [0, 1] \\ x^3 - 5x^2 + 5, & x \in (1, 2] \end{cases}$$

uniformly continuous on [0, 2]?

5. (10 Points) For any positive integer n, let $g_n$ be the function

$$g_n(x) = \sum_{i=0}^{n} \left(\frac{1}{2}\right)^i \cos\left((13)^i \pi x\right)$$



(a) (2 Points) Let n = 7. Illustrate what the graph of $g_7$ looks like on the intervals [−.1, .1], and $[-10^{-8}, 10^{-8}]$.

(b) (2 Points) Let n = 12. Illustrate what the graph of $g_{12}$ looks like on the intervals [−.1, .1], and $[-10^{-28}, 10^{-28}]$.

(c) (2 Points) As n → ∞, what happens to the graphs of $g_n$ as the interval you graph $g_n$ on shrinks?

(d) (4 Points) If we let $f(x) = \lim_{n \to \infty} g_n(x)$, what kind of function is f?

Part 4: Proofs (26 Points)
Provide careful proofs of the following propositions. You will be graded on the mathematical correctness of your arguments as well as your use of language, syntax and organization in the proof. Hint: These proofs are short!

1. (13 Points)

Proposition A-5.5. Let [a, b] be a bounded closed interval of real numbers and $f, g : [a, b] \to \Re$ be continuous functions on [a, b]. Assume f(a) > g(a) and f(b) < g(b). Then there exists a point c ∈ (a, b) with f(c) = g(c).

2. (13 Points)

Proposition A-5.6. Let [a, b] be a bounded closed interval of real numbers and $f : [a, b] \to \Re$ be a continuous function on [a, b]. Let A denote the range of the function f on [a, b]; i.e., $A = \{y : y = f(x),\ x \in [a, b]\}$. Then A is a closed and bounded interval, i.e., A = [c, d], for some finite numbers c and d.

A-5.3 Exam 3C


Part 1: Definitions (18 Points):
Define the following terms in full detail.

1. (3 Points) A Partition P of the finite interval [a, b].

2. (3 Points) The Lower Sum L(P, f) for the bounded function $f : [a, b] \to \Re$ and the partition P of [a, b].

3. (3 Points) The Upper Sum U(P, f) for the bounded function $f : [a, b] \to \Re$ and the partition P of [a, b].

4. (3 Points) The Lower integral $\underline{\int_a^b} f(t)\,dt$.

5. (3 Points) The Upper integral $\overline{\int_a^b} f(t)\,dt$.

6. (3 Points) The integral $\int_a^b f(t)\,dt$.

Part 2: Theorems (14 Points):
State carefully the following theorems.

1. (4 Points) First Fund. Theorem of Calculus

2. (5 Points) Second Fund. Theorem of Calculus

3. (5 Points) Taylor's Theorem with Lagrange Form of Remainder

Part 3: Short Answer: (18 Points):

1. (6 Points) If the function f is differentiable on the finite interval [a, b], is it necessarily true that the function F defined on [a, b] by $F(x) = \int_a^x f(t)\,dt$ is twice differentiable on [a, b]?

2. (6 Points) Give an example of a function which is integrable on [0, 1] but not continuous on [0, 1]. Explain your example carefully.

3. (6 Points) If f is bounded on [0, 1], is it necessarily true that f is integrable?

Part 4: Calculational: (20 Points):



1. (8 Points) Compute

$$\frac{d}{dx} \int_{x^2}^{x^3 - 1} \cos(\ln(x))\,dx$$

2. (12 Points) Find the Taylor polynomial $p_{n_0}(x)$ which will approximate the function f(x) = cos(2x) on the interval [−π, π] to accuracy $10^{-5}$.

Part 5: Proofs (30 Points):
Provide careful proofs of the following propositions. You will be graded on the mathematical correctness of your arguments as well as your use of language, syntax and organization in the proof.

1. (15 Points)

Proposition A-5.7. Let the function f be defined on the finite interval [a, b]. Assume f is continuous on [a, b] and f(x) ≤ 0 for all x in [a, b] and there exists a point $x_0$ such that $f(x_0) < 0$. Then $\int_a^b f(t)\,dt < 0$.

2. (15 Points)

Proposition A-5.8. Use the definition of the integral to prove that $\int_0^1 3t^2\,dt = 1.0$.

A-5.4 Final C



1. Give precise mathematical definitions for the following concepts using the appropriate mathematical formalism: (18 Points)

(a) (3 points) A is an accumulation point of the set S.

(b) (3 Points) m is the inf of the set S.

(c) (3 Points) The sequence $\{a_n\}$ converges to a.

(d) (3 Points) $\lim_{x \to p} f(x) = L$ for p ∈ (a, b) where the function $f : [a, b] \to \Re$.

(e) (3 points) $f : [a, b] \to \Re$ is continuous at p ∈ (a, b).

(f) (3 points) $f : [a, b] \to \Re$ is uniformly continuous in [a, b].

(g) Integration Theory: (12 Points) $f : [a, b] \to \Re$ is a bounded function on the finite interval [a, b]. Define the following concepts:

i. (3 Points) Q is a refinement of the partition P of [a, b]

ii. (3 Points) The Upper sum U(P, f) for partition P.

iii. (3 Points) The Lower Integral $\underline{\int_a^b} f(t)\,dt$.

iv. (3 Points) The integral of f on [a, b].

(h) Theorems (24 Points) State precisely the following mathematical theorems or axioms:

i. (3 Points) The Heine-Borel Theorem

ii. (3 Points) The Least Upper Bound Axiom

iii. (3 Points) Rolle's Theorem

iv. (3 Points) The Intermediate Value Theorem

v. (3 Points) The Mean Value Theorem

vi. (3 Points) The Weak Primitive Theorem

vii. (3 Points) The First Fundamental Theorem of Calculus

viii. (3 Points) The Second Fundamental Theorem of Calculus




Answer the following questions. If the answer is YES or NO, you MUST give us the reason why (e.g. the complete statement of a relevant theorem, a counterexample etc.).

1. (4 points) Is it possible for a function to be continuous nowhere?

2. (4 points) Is it possible for a function to be continuous at one point?

3. (4 points) Is it possible for a function to be differentiable at one point?

4. (4 points) If $f(x) = \sin(x) + \int_{-1}^{x} g(t)\,dt$, where g is integrable on [−1, 2], is it necessarily true that f is uniformly continuous on [−1, 2]?

5. (4 points) Is it necessarily true that a function that is twice differentiable on the interval (0, 1) has a bounded derivative?

6. (4 points) Is it possible for a sequence to have 16 subsequential limits?

7. (4 points) Is it necessarily true that an integrable function is continuous?

8. (4 points) If a function has a minimum value on the finite interval [a, b], does the function have to be differentiable at the point where the minimum occurs?

Part 3: Calculational Exercises (50 Points)

1. (10 Points) Find an open cover of the interval (1, 2) that has no finite subcover.

2. (10 Points) Use the definition of the derivative to show that the function f defined by

$$f(x) = \begin{cases} x^2 \sin\left(\frac{1}{x}\right), & x \ne 0 \\ 0, & x = 0 \end{cases}$$

satisfies $f'(0) = 0$.

3. (10 Points) Show that $\cos(x) = x^3 + x^2 + 4x$ has exactly one root in $[0, \frac{\pi}{2}]$.

4. (10 Points)

(a) (5 Points) Using base point 2, find the second order Taylor Polynomial, $p_{2,2}(x)$, for the function $f(x) = \sqrt{x}$.

(b) (5 Points) Using the Lagrange form of the remainder, estimate the error $r_{2,2}(x)$ over the interval [1, 3].

5. (10 Points) Let the function f be defined by

$$f(x) = x^2 + \int_{\sin(x)}^{e^x} \sin^{\frac{1}{3}}(t + t^2)\,dt$$

Compute $f'(x)$.


1. (16 points) Let the sequence $\{a_n\}$ be defined by $a_n = \frac{2n^2 + 2}{5n^2 + 5n + 2}$, for all n > 1. Use an ε, N argument to prove $\lim_{n \to \infty} a_n$ exists.

2. (16 points) If f : (a, b) → R is differentiable on (a, b) and there exists a positive B such that $|f'(x)| \le B$ for all x in (a, b), prove that f is uniformly continuous on (a, b).

3. (16 points) Let the function f be defined by

$$f(x) = \begin{cases} 1, & x > 1 \\ 0, & x \le 1 \end{cases}$$

Use an ε, δ argument to prove that f is not continuous at x = 1.



4. (16 points) Let f be continuous for all non-negative x. Suppose further that f(x) ≠ 0 for all positive x. Prove that if $f^2(x) = 2 \int_0^x f(t)\,dt$, for all x ≥ 0, then f(x) = x for all x ≥ 0.


Appendix B

Senior Advanced Calculus II Study Guides

Presented below are study guides and course description material from some of the times we have taught the second semester senior level undergraduate analysis course here. At Clemson University, it is called MTHSC 453. Another source of information is our web-based discussion board material. The MTHSC 454 discussion board can be accessed either through my web page by following the links or by the direct route http://www.ces.math.clemson.edu/discus. On that page you will see exercises, worked out problems and so forth in an easy to follow format.

B-1 MTHSC 454

B-2 Course Structure

The intent of this course is to continue to teach you the basic concepts of analysis. We assume that you have all had a careful and complete grounding in the necessary skills to read, write and understand the process of proving a mathematical proposition. We also assume that you have covered the equivalent of the first four chapters of the Fulks' text. In this course, we will introduce you to integration theory, infinite series, sequences and series of functions and uniform convergence, Taylor series, and additional other concepts as time permits. Here are some details:

M454/M454H/M654 Outline:

Integration: $f : [a, b] \to \Re$ is a bounded function on the finite interval [a, b].

1. Partition P of [a, b]

2. Refinements of Partition $P'$ of $P$.

3. Norm of Partition P, ‖P‖

4. Upper sum $\overline{S}_P(f) \equiv U(P, f)$

5. Lower sum $\underline{S}_P(f) \equiv L(P, f)$

6. Fundamental Inequalities
$L(P, f) \le U(P, f)$
$L(P, f) \le L(P', f)$
$U(P', f) \le U(P, f)$
$L(Q, f) \le U(P, f)$ for any two partitions P and Q of [a, b]

7. Upper Integral: $\overline{\int_a^b} f(x)\,dx = \inf\{U(P, f) : P \text{ is a partition of } [a, b]\}$

8. Lower integral: $\underline{\int_a^b} f(x)\,dx = \sup\{L(P, f) : P \text{ is a partition of } [a, b]\}$


9. $\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx$

10. $\inf_{x \in [a,b]}(f(x))\,(b - a) \le \underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx \le \sup_{x \in [a,b]}(f(x))\,(b - a)$

11. Algebra of upper and lower integrals

(a) Theorem: $\overline{\int_a^b} (f(x) + c)\,dx = \overline{\int_a^b} f(x)\,dx + c(b - a)$ (Similar for lower integrals)

(b) Theorem: $\overline{\int_a^b} f(x)\,dx = \overline{\int_a^c} f(x)\,dx + \overline{\int_c^b} f(x)\,dx$, for any c ∈ [a, b] (Similar for lower integrals)

12. Theorem: (Weak Fundamental Theorem)
$\frac{d}{dx}\left(\overline{\int_a^x} f(t)\,dt\right)\Big|_c = \frac{d}{dx}\left(\underline{\int_a^x} f(t)\,dt\right)\Big|_c = f(c)$, at any point c ∈ [a, b] where f is continuous.

13. Theorem: f is Riemann-integrable if $\overline{\int_a^b} f(x)\,dx = \underline{\int_a^b} f(x)\,dx$. This common value is denoted $\int_a^b f(x)\,dx$.

14. What functions are Riemann-integrable?

(a) Theorem: f monotone ⇒ f Riemann-integrable

(b) Theorem: f continuous ⇒ f Riemann-integrable

15. Theorem: Necessary and Sufficient Condition for integral to exist

(a) Theorem: f Riemann-integrable ⇔ ∀ε > 0 ∃ partition P such that U(P, f) − L(P, f) < ε.

16. Upper and Lower integral estimates

Theorem: f bounded on [a, b] ⇒ ∀ε > 0 ∃δ(ε) so that $U(P, f) < \overline{\int_a^b} f(x)\,dx + \varepsilon$ and $L(P, f) > \underline{\int_a^b} f(x)\,dx - \varepsilon$ for any partition P with ‖P‖ < δ(ε).

17. Riemann Sum S(P, f, ξ)

18. Theorem: $J = \lim_{\|P\| \to 0} S(P, f, \xi)$

19. Theorem: f Riemann-integrable ⇔ $J = \int_a^b f(x)\,dx$

20. Properties of Riemann-Integrable (Section 5.4)

(a) Theorem: f Riemann-integrable ⇒ $F(x) = \int_a^x f(t)\,dt$ is continuous on [a, b]

(b) Theorem: Subset Property

(c) Theorem: Intermediate Point Property

(d) Theorem: f(x) ≥ g(x) ⇒ $\int_a^b f(x)\,dx \ge \int_a^b g(x)\,dx$

(e) Theorem: Integral of linear combinations of functions is linear combination of integral of functions

(f) Theorem: f integrable ⇒ $f^+$ and $f^-$ integrable

(g) Theorem: f integrable ⇒ |f| integrable

(h) Theorem: Integral Estimate $\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx$

(i) Theorem: f, g integrable ⇒ fg integrable

21. Fundamental Theorem of Calculus

(a) Primitive (antiderivative, indefinite integral) of f

(b) Theorem: Fundamental Theorem of Calculus
f continuous and F primitive of f ⇒ $\int_a^b f(x)\,dx = F(b) - F(a)$

(c) When does integrating $f'$ recapture f?
Theorem: f differentiable and $f'$ integrable ⇒ $\int_a^b f'(x)\,dx = f(b) - f(a)$

(d) Theorem: Integration by Substitution

(e) Theorem: $\frac{d}{dx}\left(\int_{u(x)}^{v(x)} f(t)\,dt\right) = v'(x) f(v(x)) - u'(x) f(u(x))$

22. Theorem: Integration by Parts
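
The upper and lower sums in the Integration outline can be checked numerically. The following is a minimal sketch, assuming an increasing function so that the infimum and supremum on each subinterval sit at the left and right endpoints; the names `lower_upper_sums` and `square` are illustrative and not taken from the text. It uses exact rational arithmetic to show that U(P_n, f) − L(P_n, f) = 1/n for f(x) = x² on the uniform partition of [0, 1].

```python
from fractions import Fraction


def lower_upper_sums(f, a, b, n):
    """Return (L(P_n, f), U(P_n, f)) on the uniform partition P_n of [a, b].

    Assumes f is increasing on [a, b], so on each subinterval the infimum
    is taken at the left endpoint and the supremum at the right endpoint.
    """
    a, b = Fraction(a), Fraction(b)
    h = (b - a) / n
    points = [a + i * h for i in range(n + 1)]
    lower = sum(f(points[i]) * h for i in range(n))
    upper = sum(f(points[i + 1]) * h for i in range(n))
    return lower, upper


def square(x):
    return x * x


if __name__ == "__main__":
    for n in (10, 100, 1000):
        low, up = lower_upper_sums(square, 0, 1, n)
        # The gap up - low equals (f(1) - f(0)) / n = 1/n for this f.
        print(n, float(low), float(up), up - low)
```

Since the gap can be made smaller than any ε > 0 by taking n large, the necessary and sufficient condition in item 15 applies to this example.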

Infinite Series:

1. Definition of a series: $\{a_n\}_{n_0}^{\infty}$ is a sequence.

(a) Partial Sums: $S_p = \sum_{n_0}^{n_0 + p - 1} a_n$, the sum of the first p terms of the sequence

(b) Theorem: If $\lim_{p \to \infty} S_p$ exists, series converges. Let $\sum_{n_0}^{\infty} a_n \equiv \lim_{p \to \infty} S_p$, where $\sum_{n_0}^{\infty} a_n$ denotes the sum of the series.

(c) Theorem: If $\lim_{p \to \infty} S_p$ does not exist, series diverges.

(d) Absolute convergence

(e) Conditional convergence

2. Summing a series

(a) telescoping terms

(b) partial fractions

3. Tests for Convergence and/or divergence

(a) Theorem: nth term test

(b) Theorem: Cauchy Criterion specialized to series

(c) Theorem: Absolute convergence implies original series converges

4. Theorem: Tests for Convergence and/or divergence: series of non-negative terms

(a) Theorem: p series

(b) Theorem: geometric series

(c) Theorem: comparison test

(d) Theorem: limit comparison test

(e) Theorem: integral test

(f) Theorem: root test

(g) Theorem: ratio test

5. Tests for Series with Variable signs

(a) Theorem: Alternating Series Test (AST)

(b) Apply any of the tests for series of non-negative terms to the series of absolute values; this checks for absolute convergence. If this fails, the only option is AST.
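
The partial sum definition and the telescoping technique in the Infinite Series outline can be illustrated with a short computation. This is a sketch under assumptions: the series Σ 1/(n(n+1)) starting at n₀ = 1 is chosen here as an example, and the helper names are hypothetical.

```python
from fractions import Fraction


def partial_sum(term, n0, p):
    """S_p: the sum of the first p terms a_{n0}, ..., a_{n0 + p - 1}."""
    return sum(term(n) for n in range(n0, n0 + p))


def telescoping_term(n):
    # a_n = 1/(n(n+1)) = 1/n - 1/(n+1), so consecutive terms cancel.
    return Fraction(1, n * (n + 1))


if __name__ == "__main__":
    for p in (1, 5, 50):
        # Telescoping gives S_p = 1 - 1/(p+1) = p/(p+1).
        print(p, partial_sum(telescoping_term, 1, p))
```

Because S_p = p/(p+1) tends to 1 as p → ∞, the series converges and its sum is 1.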

Sequence and Series of Functions: $\{f_n\} \equiv \{f_n\}_{n_0}^{\infty}$ is a sequence of functions defined on [a, b]

1. Pointwise convergence of $\{f_n\}$ to a function f on [a, b]

2. Uniform convergence of $\{f_n\}$ on [a, b]

3. Theorem: $\{f_n\}$ converges uniformly to f on [a, b] ⇔ $M_n = \sup_{x \in [a,b]} |f_n(x) - f(x)| \to 0$ as n → ∞.

(a) Find f via pointwise limit first

(b) Once pointwise limit f known, compute $M_n$ limit. If limit nonzero, convergence not uniform.

4. Theorem: Cauchy Criterion for uniform convergence

5. Theorem: Weierstrass K-Test
This is for series of functions $\sum_{n_0}^{\infty} u_n(x)$. Let $K_n = \sup_{x \in [a,b]} |u_n(x)|$. Then $\sum_{n_0}^{\infty} K_n$ converges ⇒ series of functions converges uniformly on [a, b]

6. Consequences of Uniform Convergence

(a) Theorem: If a sequence of continuous functions converges uniformly on [a, b], then limit function is continuous

(b) Theorem: If series of continuous functions converges uniformly on [a, b], then its sum is continuous function

(c) Theorem: Interchange of integration and limit
$f_n \to f$ uniformly on [a, b] where each $f_n$ is integrable. Then f is also integrable and $\lim \int_a^b f_n = \int_a^b \lim f_n$ (also Theorem: (each $f_n$ is continuous), Theorem: (each $f_n$ is monotone), Theorem: (applied to series of functions))

(d) When does $f_n' \to f'$?
Theorem: Very technical statement (nice theorem because it pulls together a lot of results)
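
The two-step recipe in item 3 (find the pointwise limit, then examine M_n) can be explored numerically. The sketch below is an illustration only: it takes f_n(x) = nx²/(1 + nx²) as its example, whose pointwise limit is 0 at x = 0 and 1 elsewhere, and it estimates the supremum on a finite grid, which gives a lower bound for the true M_n rather than its exact value. The function names and the grid size are assumptions.

```python
def f_n(n, x):
    return n * x * x / (1 + n * x * x)


def f_limit(x):
    # Pointwise limit of f_n: 0 at x = 0 and 1 everywhere else.
    return 0.0 if x == 0 else 1.0


def sup_gap(n, a, b, samples=10001):
    """Grid estimate (a lower bound) of M_n = sup |f_n - f| on [a, b]."""
    step = (b - a) / (samples - 1)
    xs = [a + i * step for i in range(samples)]
    return max(abs(f_n(n, x) - f_limit(x)) for x in xs)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # On [1, 4] the gap is 1/(1 + n), attained at x = 1, so it shrinks.
        # On [-1, 1] the gap stays near 1 because of points close to 0.
        print(n, sup_gap(n, 1, 4), sup_gap(n, -1, 1))
```

The estimates suggest uniform convergence on [1, 4] and failure of uniform convergence on [−1, 1]; a proof still requires computing the supremum analytically.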

Taylor Series:

1. Power Series

(a) Radius of convergence

(b) Interval of convergence

(c) What happens at endpoints of interval of convergence?

2. Power series converges absolutely inside interval of convergence and diverges outside interval of convergence. Must check behavior at endpoints explicitly.

3. Finding Radius of Convergence

(a) Theorem: Ratio Test:
$q = \lim \left|\frac{a_{n+1}}{a_n}\right|$
R = 0 if q = ∞
R = ∞ if q = 0
$R = \frac{1}{q}$ if q finite

(b) Root test

4. Properties of Power Series

(a) Theorem: Power series converges uniformly on any closed subinterval contained in the interval of convergence

(b) Theorem: Let interval of convergence be (−R, R). Then power series converges at R implies power series converges uniformly on [0, R]. (Similar result for convergence at other endpoint)

(c) Theorem: If power series has interval of convergence (−R, R), so does derived series.

(d) Theorem: If R > 0, derivative of a power series is given by the derived series and the interval of convergence is same for both series. (same type result for integral of power series)

(e) Theorem: If R > 0, power series has derivatives of all orders and interval of convergence is always same

(f) Theorem: $f(x) = \sum a_n x^n$ and radius of convergence R > 0. Then $a_n = \frac{f^{(n)}(0)}{n!}$

(g) Theorem: Power series expansions are unique

5. Operating on Power Series

(a) Theorem: Integrating series

(b) Theorem: Differentiating series

(c) Theorem: Summing Series

6. Taylor Series

(a) Theorem: Taylor's theorem with remainder revisited

(b) Theorem: Taylor series expansion of arbitrary function

7. Finding Taylor Series

(a) Finding pattern in derivatives

(b) Using geometric series tricks

(c) Using recursion trick for ratios of functions

8. Using Taylor Series

(a) Integrating a function

(b) Evaluating a limit
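
The ratio test formula for the radius of convergence can be tried on a concrete coefficient sequence. This is a minimal sketch, assuming the coefficients a_n = 1/(n·3ⁿ) as an example (so q = 1/3 and R = 3); the names `ratio_estimate` and `coeff` are hypothetical. It computes |a_n / a_{n+1}|, which tends to R = 1/q when q is finite and positive.

```python
from fractions import Fraction


def ratio_estimate(coeff, n):
    """Return |a_n / a_{n+1}|, an estimate of R = 1/q at index n."""
    return abs(coeff(n) / coeff(n + 1))


def coeff(n):
    # Illustrative coefficients a_n = 1 / (n * 3**n), defined for n >= 1.
    return Fraction(1, n * 3**n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Exactly 3(n + 1)/n here, which tends to 3 as n grows.
        print(n, float(ratio_estimate(coeff, n)))
```

The estimate says nothing about the endpoints x = ±R, which must be checked separately, as item 2 of the outline notes.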

B-3 Sample Exams Version A

B-3.1 Exam 1A





1. (15 Points) Carefully define and discuss the integral of a bounded function f : [a, b] → R, where [a, b] is a closed and finite interval. You will need to explain in detail all relevant symbols and terms.


1. (7 Points) Let [a, b] be a closed and finite interval and assume f, g : [a, b] → R are not integrable on [a, b]. Is it necessarily true that f + g is not integrable on [a, b]?


1. (17 Points) Consider $f(x) = x^2$ on the interval [0, 1]. Let $P_n$ be the uniformly spaced partition of [0, 1] consisting of n equal subintervals. Find $\overline{S}_{P_n} - \underline{S}_{P_n}$ and show that

$$\overline{S}_{P_n} - \underline{S}_{P_n} \to 0 \text{ as } n \to \infty.$$

Does this prove that $f(x) = x^2$ is integrable on [0, 1]? Explain your answer carefully.

2. Determine whether or not the following functions are integrable on [0, 1] giving full reasons for your answers.

(a) (7 points)

$$f(x) = x^2 - 3x + 7,$$

(b) (7 points)

$$f(x) = \begin{cases} \frac{\pi}{2}, & 0 \le x < .25 \\ 1.0, & x = .25 \\ .6, & .25 < x < .75 \\ .17, & x = .75 \\ .038, & .75 < x \le 1.0 \end{cases}$$

(c) (7 points)

$$f(x) = \begin{cases} 2, & x \in Q \\ -9, & x \in I \end{cases}$$

where Q and I indicate the rational and irrational numbers respectively.

(d) Differentiate the function f : [1, 10] → R defined by,

i. (3 points)

$$f(x) = \int_1^x (1 + t^2)^{-2}\,dt$$

ii. (3 points)

$$f(x) = \int_1^{x^2} (2 + t^3)^{-4}\,dt$$


1. (17 Points) Proposition: Let [a, b] be a finite interval and let f : [a, b] → R be continuous and non-negative on [a, b], i.e. f(x) ≥ 0, a ≤ x ≤ b. Then if $\exists\, x_0$, $a < x_0 < b$ $\ni$ $f(x_0) > 0$, this implies that $\int_a^b f(x)\,dx > 0$.

2. (17 Points) Carefully state and prove a theorem for calculating

$$\frac{d}{dx}\left(\int_a^{u(x)} f(t)\,dt\right)$$



B-3.2 Exam 2A


Part 1: Definitions (24 Points)
Carefully and precisely define the following mathematical phrases:

1. (6 Points)

$$\sum_{n=0}^{\infty} a_n = S.$$

In the phrases below, I is a finite interval, $\{f_n\}$ is a sequence of real valued functions defined on I and f is a real valued function defined on I.

1. (6 Points)

$f_n(x) \to f(x)$ pointwise on I.

2. (6 Points)

$f_n(x) \to f(x)$ uniformly on I.

3. (6 Points)

$$\sum_{n=0}^{\infty} f_n(x)$$

converges uniformly on I to the function f(x).


1. (5 Points) Let I be a finite interval and $\{f_n\}$ a sequence of real valued functions defined on I. If $f_n \to f$ pointwise on I, is it necessarily true that f is continuous on I?

2. (5 Points) If $a_n \to 0$ is it necessarily true that $\sum_{n=0}^{\infty} a_n$ converges?

For each of the following series, tell whether or not it is absolutely convergent, conditionally convergent or divergent, giving full reasons for your answers.

1. (6 Points)

$$\sum_{n=0}^{\infty} \frac{2^n}{3\,(5^{n+1})}$$

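2. (6 Points)

$$\sum_{n=3}^{\infty} \frac{n^3 + 6n - 1}{n^4 - 17}$$

3. (6 Points)

$$\sum_{n=2}^{\infty} \frac{n!}{(2n - 2)!}$$

4. (6 Points)

$$\sum_{n=1}^{\infty} \frac{1}{(2n)^n}$$

5. (6 Points)

$$\sum_{n=0}^{\infty} \left(\frac{-1}{2}\right)^n$$

6. (6 Points)

$$1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{8} - \frac{1}{16} - \frac{1}{32} + \frac{1}{64} - \ldots$$

(plus, then 2 minuses)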


1. (15 Points) Consider for all n ≥ 1,

$$f_n(x) = \frac{n(x^2)}{1 + n(x^2)}$$



Discuss the pointwise and uniform convergence of $\{f_n\}$ on the intervals:

(a) [−1, 1]

(b) [1, 4]


1. (15 Points) If $\sum a_n$ and $\sum b_n$ both converge absolutely, then $\sum (a_n + b_n)$ converges absolutely.

B-3.3 Exam 3A


Part 1: Convergence of Series (36 Points)
Find all the values of x at which the following series converge giving full reasons for your answers.

1. (12 Points)

$$\sum_{n=1}^{\infty} \frac{x^{2n}}{n\,3^n}$$

2. (12 Points)

$$\sum_{n=1}^{\infty} \frac{n!\,x^n}{n^n}$$

3. (12 Points)

$$\sum_{n=1}^{\infty} a_n x^n, \qquad a_n = \begin{cases} 1, & \text{if } n \text{ is even} \\ \frac{1}{n}, & \text{if } n \text{ is odd} \end{cases}$$

Part 2: Taylor Series (48 Points)
Show all your work on the short calculational exercises below.

1. (16 Points) Let $f(x) = \frac{e^x - 1}{x}$, for x ≠ 0.

(a) Find the power series that represents f for x ≠ 0.

(b) Find the power series that represents $f'$ for x ≠ 0.

2. (16 points) Sum the series

$$\sum_{n=1}^{\infty} \frac{x^{n+1}}{n + 1}$$

3. (16 Points) Let $f(x) = \int_0^x t \sin(t)\,dt$. Find the Taylor series centered at x = 0 for f.

Part 3: Proofs (16 Points)
Provide a careful proof of the following proposition. You will be graded on the mathematical correctness of your arguments as well as your use of language, syntax and organization in the proof.

(16 Points) Let $\{f_n\}$ and $\{g_n\}$ be two sequences of real valued functions defined on the finite interval I. If $f_n \to f$ uniformly on I and $g_n \to g$ uniformly on I, then $(f_n + g_n) \to (f + g)$ uniformly on I.

B-3.4 Final A



1. Convergence Issues
Let [a, b] be a finite interval of the real line with $\{f_n\}$ denoting a sequence of real-valued functions, $f_n : [a, b] \to \Re$, and f indicating another real-valued function, $f : [a, b] \to \Re$. Define the following concepts:

(a) (3 points) $f_n \to f$ pointwise for x ∈ [a, b]

(b) (3 points) $f_n \to f$ uniformly on [a, b]

2. Elementary Topology in $\Re^n$: Define the following terms:

(a) (3 points) S is an open set in $\Re^n$

(b) (3 points) S is a closed set in $\Re^n$

(c) (3 points) | x | for $x \in \Re^n$

(d) (3 points) The boundary of a set S in $\Re^n$

3. Integration Theory:
$f : [a, b] \to \Re$ is a bounded function on the finite interval [a, b]. Define the following concepts:

(a) (3 points) P is a partition of [a, b]

(b) (3 Points) $P'$ is a refinement of the partition P of [a, b]

(c) (3 Points) The Upper sum $\overline{S}(P, f)$ of the bounded function f on [a, b] for partition P.

(d) (3 Points) The Lower sum $\underline{S}(P, f)$ of the bounded function f on [a, b] for partition P.

(e) (3 Points) The Upper Integral of f on [a, b]

(f) (3 Points) The Lower Integral of f on [a, b]

(g) (3 Points) The integral of f on [a, b]

4. Series:
Let $\sum_{n_0}^{\infty} a_n$ denote an infinite series. Define the following concepts:

(a) (3 Points) The nth partial sum of the series

(b) (3 Points) The convergence of the series

5. Theorems
State precisely the following mathematical theorems or axioms:

(a) (5 Points) The Cauchy-Schwarz Inequality

(b) (5 Points) The Fundamental Theorem of Calculus

(c) (5 Points) Taylor's Theorem with Remainder

Part 2: Short Answer (50 Points)
Answer the following questions. If the answer is YES or NO, you MUST give us the reason why (e.g. the complete statement of a relevant theorem, a counterexample etc.).

1. Let $f : [a, b] \to \Re$ for the finite interval [a, b].

(a) (5 Points) If f is not continuous on [a, b], does that necessarily imply f is not integrable on [a, b]?

(b) (5 Points) If f is continuous on [a, b], does that necessarily imply that f is integrable on [a, b]?

(c) (5 Points) If f is bounded on [a, b], does that necessarily imply that f is integrable on [a, b]?

(d) (5 Points) If $f^2$ is integrable on [a, b], does that necessarily imply that f is integrable on [a, b]?

2. Let $\sum a_n$ denote an infinite series

(a) (5 Points) If $\lim_{n \to \infty} a_n = 0$, does that necessarily imply that the series $\sum a_n$ converges?

(b) (5 Points) If $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = 1$, does that necessarily imply that the series $\sum a_n$ diverges?

3. Let $f(x) = \sum a_n x^n$ denote a power series and R denote its radius of convergence.

(a) (5 Points) Let R > 0. If the interval [a, b] is strictly contained in the interval (−R, R), is it necessarily true that the power series $\sum a_n x^n$ converges uniformly in [a, b]?

(b) (5 Points) Is it possible for the radius of convergence of the $f'$ series to be different from R?


4. Let [a, b] be a finite interval and let $f : [a, b] \to \Re$ denote a function with derivatives of all orders. Let R be the radius of convergence of the Taylor series of f.

(a) (5 Points) Is it possible for f to have a Taylor series expansion with R = 0?

(b) (5 Points) Suppose f is defined on the interval [−3, 3] and f fails to be continuous at the points −1.7 and 2.2. Is it possible for R to be 2.5?

Part 3: Calculational Exercises (56 Points)
Provide complete solutions with all appropriate detail to the following computational exercises.

1. (14 Points) Let

$$f_n(x) = \begin{cases} \frac{\sin(nx)}{nx}, & x \ne 0 \\ 1, & x = 0 \end{cases}$$

Determine whether the sequence $\{f_n\}$ converges uniformly on the following intervals:

(a) [1, 6]

(b) [−1, 1]

2. (14 Points) Determine whether or not the function

$$f(x, y) = \frac{x^3 \tan(x\,(y^4))}{x^4 + y^4}$$

is continuous at the point (0, 0).

3. (14 Points) Let

$$f(x) = \int_0^x \frac{1}{1 + t^8}\,dt$$

(a) Find $f'(x)$

(b) Express f(.5) without using integrals.

4. (14 Points) Find the Taylor series expansion of $f(x) = \tan^{-1}(x)$ about base point 0 and determine its radius of convergence.

Part 4: Proofs (34 Points)
Provide careful proofs of the following propositions. You will be graded on the mathematical correctness of your arguments as well as your use of language, syntax and organization in the proof.

1. (17 points) Let [a, b] be a finite interval of the real line with $\{f_n\}$ denoting a sequence of real-valued functions, $f_n : [a, b] \to \Re$. Assume $f_n \to 0$ uniformly on [a, b] and the function $g : [a, b] \to \Re$ is bounded on [a, b]. Then $g f_n \to 0$ uniformly on [a, b].

2. (17 points) Let f : R → R be continuous on the finite interval [a, b]. If $\int_a^x f(x)\,dx = 0$ for all x ∈ [a, b], then f(x) = 0 for all x ∈ [a, b].

B-4 Sample Exams Version B

B-4.1 Exam 1B



1. (15 Points) Carefully define and dis-cuss the integral of a bounded func-tion f : [a, b]→ R, where [a, b] is aclosed and finite interval. You willneed to explain in detail all relevantsymbols and terms.

Part 2: Short Answer (51 Points)
You must determine whether or not these statements are true. If the statement is true, you MUST give us the reason why it is true; if the statement is false, you must give us a counterexample.

1. (7 Points) Let [a, b] be a closed and finite interval and assume f, g : [a, b] → R are not integrable on [a, b]. Is it necessarily true that f + g is not integrable on [a, b]?


1. (17 Points) Consider $f(x) = x^2$ on the interval [0, 1]. Let $P_n$ be the uniformly spaced partition of [0, 1] consisting of n equal subintervals. Find $\overline{S}_{P_n} - \underline{S}_{P_n}$ and show that

$$\overline{S}_{P_n} - \underline{S}_{P_n} \to 0 \text{ as } n \to \infty.$$

Does this prove that $f(x) = x^2$ is integrable on [0, 1]? Explain your answer carefully.
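The quantity in this exercise can be explored numerically before attempting the argument. The following is a minimal sketch, assuming the two sums in the exercise are the upper and lower Riemann sums over the uniform partition; the function name `upper_lower_sums` is hypothetical and not part of the text. It uses exact rational arithmetic so that no rounding obscures the pattern.

```python
from fractions import Fraction


def upper_lower_sums(n):
    """Upper and lower sums of f(x) = x^2 on [0, 1] for n equal subintervals.

    Assumption: f is increasing on [0, 1], so on each subinterval the
    supremum sits at the right endpoint and the infimum at the left endpoint.
    """
    h = Fraction(1, n)  # common width of the subintervals
    upper = sum((i * h) ** 2 * h for i in range(1, n + 1))
    lower = sum(((i - 1) * h) ** 2 * h for i in range(1, n + 1))
    return upper, lower


for n in (1, 10, 100, 1000):
    upper, lower = upper_lower_sums(n)
    # The printed difference shrinks as n grows.
    print(n, upper - lower)
```

A numerical table like this only suggests the limit; the exercise still asks for a proof and for a careful explanation of what the limit implies about integrability.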

2. Determine whether or not the following functions are integrable on [0, 1] giving full reasons for your answers.

(a) (7 points)

$$f(x) = x^2 - 3x + 7$$

(b) (7 points)

$$f(x) = \begin{cases} \pi/2, & 0 \le x < .25 \\ 1.0, & x = .25 \\ .6, & .25 < x < .75 \\ .17, & x = .75 \\ .038, & .75 < x \le 1.0 \end{cases}$$

(c) (7 points)

$$f(x) = \begin{cases} 2, & x \in Q \\ -9, & x \in I \end{cases}$$

where Q and I indicate the rational and irrational numbers respectively.

(d) Differentiate the function f : [1, 10] → R defined by

i. (3 points)

$$f(x) = \int_1^x (1 + t^2)^{-2} \, dt$$

ii. (3 points)

$$f(x) = \int_1^{x^2} (2 + t^3)^{-4} \, dt$$


1. (17 Points) Proposition: Let [a, b] be a finite interval and let f : [a, b] → R be non-negative on [a, b], i.e. f(x) ≥ 0, a ≤ x ≤ b. Then if $\exists \, x_0, \ a < x_0 < b \ \ni \ f(x_0) > 0$, this implies that $\int_a^b f(x) \, dx > 0$.

2. (17 Points) Carefully state and prove a theorem for calculating

$$\frac{d}{dx} \left( \int_a^{u(x)} f(t) \, dt \right)$$

B-4.2 Exam 2B


Part 1: Definitions (18 Points)
Carefully and precisely define the following mathematical phrases:

1. (6 Points)

$$\sum_{n=0}^{\infty} a_n = S.$$



In the phrases below, I is a finite interval, $f_n$ is a sequence of real valued functions defined on I and f is a real valued function defined on I.

1. (3 Points) f is uniformly continuous on I.

2. (3 Points)

$$f_n(x) \to f(x) \text{ pointwise on } I.$$

3. (3 Points)

$$f_n(x) \to f(x) \text{ uniformly on } I.$$

4. (3 Points)

$$\sum_{n=0}^{\infty} f_n(x)$$

converges uniformly on I to the function f(x).

Part 2: Theorems (12 Points)
Carefully and precisely state the following theorems:

1. (3 Points) The Comparison Test

2. (3 Points) The Limit Comparison Test

3. (3 Points) The Ratio Test

4. (3 Points) The Weierstrass K Test


1. (5 Points) Let I be a finite interval and $f_n$ a sequence of real valued functions defined on I. If $f_n \to f$ uniformly on I, is it necessarily true that $f_n \to f$ pointwise on I?

2. (5 Points) If $a_n \to 1$, is it necessarily true that $\sum_{n=0}^{\infty} a_n$ converges?

For each of the following series, tell whether or not it is absolutely convergent, conditionally convergent or divergent, giving full reasons for your answers.

1. (6 Points)

∞∑n=0

4n+1 + 5n5n+1 + n2 + 5

2. (6 Points)

$$\sum_{n=3}^{\infty} \frac{(-1)^n (n^2)}{n^3 + n - 2}$$

3. (6 Points)

$$\sum_{n=2}^{\infty} \ln\left( \frac{1}{n^2} \right)$$

Show all your work on the exercises below. You may use a calculator if you wish.

1. (15 Points) Consider for all n ≥ 1,

$$f_n(x) = \frac{n^2 (x^6)}{n + n^2 (x^6)}$$

Discuss the pointwise and uniform convergence of $f_n$ on the intervals:

(a) [−1, 1]
(b) [1/2, 4]


1. (13 Points) If $f_n \to f$ and $g_n \to g$ uniformly on the interval I, then $f_n + 3 g_n \to f + 3g$ uniformly on I.

2. (14 Points) If $\sum a_n$ converges absolutely, then $\sum \frac{a_n}{1 + (a_n)^2}$ converges absolutely.



B-4.3 Exam 3B

Instructions:

This is a closed book and closed notes test. You will need to give me all the details of your arguments as that is the only way I can decide if partial credit is warranted.

Part 1: Definitions (12 Points):
Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of functions and f be a function defined on the finite interval [a, b]. Define the following concepts:

1. $f_n \to f$ pointwise on [a, b].

2. $f_n \to f$ uniformly on [a, b].

3. $\sum_{1}^{\infty} f_n(x)$ converges pointwise to f on [a, b].

4. $\sum_{1}^{\infty} f_n(x)$ converges uniformly to f on [a, b].

Part 2: Theorems (16 Points):
Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of functions and f be a function defined on the finite interval [a, b]. State carefully the following theorems.

1. (6 Points) The interchange of limit and integration

(a) For $f_n \to f$

(b) For $\sum_{1}^{\infty} f_n(x) = f(x)$

2. (10 Points) The interchange of limit and differentiation

(a) For $f_n \to f$

(b) For $\sum_{1}^{\infty} f_n(x) = f(x)$

Part 3: Short Answer: (20 Points):
Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of functions and f be a function defined on the finite interval [a, b].

1. (7 Points) If $f_n \to f$ uniformly on [a, b], each $f_n$ continuous on [a, b], is it necessarily true that f is continuous on [a, b]?

2. (7 Points) If $f_n \to f$ on [a, b] and $\lim_n \int_a^b f_n(x) \, dx = \int_a^b f(x) \, dx$, does it necessarily follow that $f_n \to f$ uniformly on [a, b]?

3. (6 Points) If $\sum_{1}^{\infty} a_n (x - x_0)^n$ has radius of convergence R > 0, is it necessarily true that the series converges at $x_0 + R$?

Part 4: Calculational: (24 Points):

1. (8 Points) Compute

$$\lim_{(x,y) \to (0,0)} \frac{\tan\left( \sqrt{x^2 y^2 + x^4 y^4} \right)}{\sin(|xy|)}$$

if it exists.

2. (8 Points) Sum the series $\sum_{1}^{\infty} \frac{x^{n-1}}{n+1}$.

3. (8 Points) Find the Taylor series expansion about 0 for $f(x) = \int_0^x e^{-t^2} \, dt$.

Part 5: Proofs (28 Points):
Provide careful proofs of the following propositions. You will be graded on the mathematical correctness of your arguments as well as your use of language, syntax and organization in the proof.

1. (14 Points) Prove $\lim_{(x,y) \to (0,0)} \sqrt{x + y + 1} = 1$.

2. (14 Points) Let $f(x) = \sum_{1}^{\infty} n x^n$. Provide all the arguments and details that prove that for x ∈ (−1, 1), $f'(x) = \sum_{1}^{\infty} n^2 x^{n-1}$.


Part XII

Appendix: Linear Analysis Examinations


Appendix C

Linear Analysis Study Guide

Presented below are study guides and course description material from some of the times we have taught the graduate course on linear analysis here, MTHSC 821. Another source of information is our web-based discussion board material. The MTHSC 821 discussion board can be accessed either through my web page by following the links or by the direct route http://www.ces.math.clemson.edu/discus. On that page you will see exercises, worked out problems and so forth in an easy to follow format.

C-1 Course Structure

The intent of this course is to develop your ability to think abstractly using a standard element of the practicing analyst’s toolkit: abstract spaces. In addition, we will continue to develop your abilities to read and write good mathematics at the professional level.

C-2 Sample Exams Version A

C-2.1 Exam 1A

Instructions:

This is a closed book and closed notes test. You will need to give all the details of your arguments as that is the only way partial credit may be warranted.

Part 1: Definitions (30 Points)
Carefully and precisely define the following mathematical concepts:

1. (6 Points) Let X be a set.

(a) (3 Points) $d : X \times X \to \Re$ is a metric on X.

(b) (3 Points) $d : X \to \Re$ is a norm on X.


(a) (3 Points) X is a vector space over the field of real numbers.

(b) (3 Points) X is a metric space over the field of real numbers.

(c) (3 Points) X is a normed space over the field of real numbers.


(a) (3 Points) X is a separable metric space over the field of real numbers.

(b) (3 Points) $x_n$ is a Cauchy Sequence of elements in the normed space X over $\Re$.

(c) (3 Points) X is a complete metric space over the field of real numbers.


C-2. EXAMS VERSION A APPENDIX C. LINEAR ANALYSIS I

4. (6 Points) Let X be a normed space and M be a subset of X.

(a) (3 Points) M is a compact subset of X.

(b) (3 Points) M is a subspace of X properly contained in X and Y is the normed space X/M with the usual induced norm.

Part 2: Short Answer (44 Points)
There are several types of questions here:

True or False: You must determine whether or not these statements are true. If the statement is true, you MUST give us the reason why it is true; if the statement is false, you must give us a counterexample.

Discussion: A careful discussion and reasoned answer to the given question is required.

1. (3 Points) Let X be a normed space and assume that the set $\{ x \in X \mid \| x \| \le 1 \}$ is compact. Is it possible for X to be finite dimensional?

2. (3 Points) Let X be a normed space and let Y be a finite dimensional subspace of X that is not closed. Is this possible?

3. (8 Points) Let X be the set of all continuous functions defined on the finite interval [a, b]. Describe a norm on X which makes X complete and a norm on X which makes X not complete.

4. (3 Points) Is there a procedure which enables us to complete an arbitrary metric space?

5. (3 Points) Is there a procedure which enables us to complete an arbitrary normed space?

6. (6 Points) Give an example of an infinite dimensional metric space which is not separable.

7. (6 Points) Is it necessarily true that the metric in an arbitrary metric space X comes from a norm on X?

8. (12 Points) Let $X = \Re^7$. Let

$$L = \{ \langle x_1, x_2, x_3, x_4, x_5, x_6, x_7 \rangle' \in \Re^7 \mid x_1 + x_5 + 23.0 \, x_7 = 0 \}.$$

(a) (2 Points) Show L is a vector space.

(b) (2 Points) What is the dimension of L?

(c) (2 Points) What is the dimension of X/L?

(d) (2 Points) What are the elements of X/L?

(e) (2 Points) Is L closed?

(f) (2 Points) Is L compact?


1. (13 Points) Let X be the set of all continuous functions defined on the finite closed interval [a, b] with the maximum metric. Let φ be a given element of X and define $T : X \to X$ by

$$(T(x))(t) = \int_a^t \phi(s) \, x(s) \, ds$$

If $x_n$ is a Cauchy sequence in X, prove that there exists an element y in X such that $T(x_n) \to y$.

2. (13 Points) Let X be the set of all continuous functions defined on the finite closed interval [0, 1] with the maximum metric. Let S be the set of functions

$$S = \{ e^t, e^{2t}, e^{3t} \}$$

Prove that S is a linearly independent set in X.



C-2.2 Exam 2A

Instructions:

This is a closed book and closed notes test. You will need to give all the details of your arguments as that is the only way partial credit may be warranted.

Part 1: Definitions and Short Answers (34 Points)

Carefully and precisely define the following mathematical concepts and answer the short questions.

1. (6 Points)

(a) Precisely define the meaning of a metric.

(b) Precisely define the metric space X over the field of real numbers.

(c) Is it required that X be a vector space?

2. (10 Points)

(a) Precisely define the meaning of a norm.

(b) Precisely define the Normed Space X over the real numbers.

(c) Is it required that X be a vector space?

(d) Define the metric on X induced by the norm.

(e) Is it true that all metrics can be derived from a norm?

3. (10 Points)

(a) Precisely define the meaning of an inner product.

(b) Precisely define the inner product space X over the real numbers.

(c) Define the norm on X induced by the inner product.

(d) Define the metric on X induced by the inner product.

(e) State the Schwarz Inequality for this inner product space.

4. (8 Points)

(a) Let X be a vector space. Precisely define the meaning of the algebraic dual space of X, $X^*$.

(b) Let X be a normed space. Precisely define the meaning of the dual space of X, $X'$.

(c) Precisely define the norm of an element of $X'$.

(d) Let X be a normed space. Precisely define the meaning of the dual space of $X'$, $X''$.

Part 2: Short Answer (30 Points)
A careful discussion and reasoned answer to the given question is required.

1. (15 Points) Characterize the bounded linear functionals on $\ell^1$.

2. (15 Points) Characterize the bounded linear functionals on $c_0$.


1. (18 Points) Let X be the set of all continuous functions defined on the finite closed interval [a, b] with the maximum metric. Let φ be a given element of X that satisfies for all positive integers n

$$\| \phi^n \| \le \| \phi \|^n$$

Define for all positive integers n, $T_n : X \to X$ by

$$(T_n(x))(t) = \int_a^t \phi^n(s) \, x(s) \, ds$$

Prove that the sequence of linear operators $T_n$ converges to the zero element in the space of all bounded linear operators from X to X.



2. (18 Points) Assume for all positive integers n,

$$x_n = \{ \xi^n_1, \xi^n_2, \ldots \} \in \ell^1$$

and

$$x = \{ \xi_1, \xi_2, \ldots \} \in \ell^1$$

satisfy

$$f(x_n) \to f(x), \quad \forall f \in (\ell^1)'$$

Prove that for all positive integers k

$$\xi^n_k \to \xi_k$$

C-2.3 Exam 3A

Instructions:




1. (4 Points) Define what it means for an inner product space X to be complete.

2. (15 Points) Completion (No Details Here, Just Sketch!!)

(a) (5 Points) State, without proof, the basic ideas behind completing a metric space.

(b) (5 Points) State, without proof, the extra things that need to be done to complete a normed space.

(c) (5 Points) State, without proof, the extra things that need to be done to complete an inner product space.

3. (16 Points)

(a) (4 Points) Define what it means for the set M to be orthonormal in the inner product space X.

(b) (4 Points) Define what it means for M to be a total orthonormal set in the inner product space X.

(c) (4 Points) State Bessel’s Inequality.

(d) (4 Points) Discuss the conditions under which Parseval’s Relation holds.


1. (5 Points) Let x be an element of the Hilbert space H. Let M be the orthonormal sequence $(e_i)$ in H. Consider the sum $\sum_{i=0}^{\infty} \langle x, e_i \rangle e_i$. Is it necessarily true that this sum converges in H?

2. (5 Points) Let x be an element of the Hilbert space H. Let M be the orthonormal sequence $(e_i)$ in H. Consider the sum $\sum_{i=0}^{\infty} \langle x, e_i \rangle e_i$. Is it necessarily true that if this sum converges, the sum is x?

3. (5 Points) Let Y be a subspace of the Hilbert space H. Is it necessarily true that

$$H = Y \oplus Y^{\perp} \ ?$$

4. (5 Points) Is it necessarily true that an orthonormal set M in an inner product space X is linearly independent?

5. (5 Points) If M is an uncountable orthonormal set in an inner product space X, is it possible that $\langle x, m \rangle \neq 0$ for all m in M?

6. (5 Points) If M is a total subset in the inner product space X, is it necessarily true that $M^{\perp} = \{0\}$?

7. (5 Points) Let M be a nonempty, convex and complete subset of the inner product space X. Let x be an element of X not in M. Is it possible for the minimum distance from x to M to be achieved by two different points of M?


1. (12 Points) Let M be a total orthonormal set in the Hilbert space H. Assume x in H satisfies $\langle x, m \rangle = 0$ for all m in M. Prove that x = 0.

2. (18 Points) Let X be the inner product space C[0, 1] with the usual L2 inner product. Recall that this means

$$\| x \|_2 = \sqrt{\int_0^1 | x(s) |^2 \, ds}$$

Let f and φ be chosen arbitrarily from X and define the operator $T : X \to X$ by

$$(T(x))(t) = \langle x, f \rangle + \int_0^t x(s) \, \phi(s) \, ds$$

Prove that T is a linear operator and that

(a) for all t in [0, 1],

$$| (T(x))(t) | \le \| x \|_2 \, ( \| f \|_2 + \| \phi \|_2 ).$$

(b)

$$\| T \| \le \| f \|_2 + \| \phi \|_2$$

C-3 Sample Exams Version B

C-3.1 Exam 1B

Instructions:
This is a closed book and closed notes test. You will need to give all the details of your arguments as that is the only way partial credit may be warranted.

Part 1: Definitions (33 Points)
Carefully and precisely define the following mathematical concepts:


(a) (2 Points) $X'$, the closure of X.

(b) (2 Points) X is a closed set.

(c) (2 Points) X is an open set.

2. (12 Points) Let X be a normed space.

(a) (3 Points) M contained in X is dense in X.

(b) (3 Points) M is a compact subset of X.

(c) (3 Points) M is a Hamel Basis for X.

(d) (3 Points) M is a Schauder Basis for X.

3. (15 Points)

(a) (3 Points) X is a vector space over the field of real numbers.

(b) (3 Points) X is a metric space over the field of real numbers.

(c) (3 Points) X is a normed space over the field of real numbers.

(d) (3 Points) X is a complete metric space over the field of real numbers.

(e) (3 Points) X is a complete normed space over the field of real numbers.



Part 2: Short Answer (43 Points)
There are several types of questions here:

True or False: You must determine whether or not these statements are true. If the statement is true, you MUST give us the reason why it is true; if the statement is false, you must give us a counterexample.

Discussion: A careful discussion and reasoned answer to the given question is required.

1. (9 Points) Let X be an incomplete metric space. Discuss the process by which we can complete X.

2. (5 Points) Let X be the set of all continuous functions defined on the interval [0, 1] with the maximum norm. Is the set $\{ x \in X \mid \| x \| \le 1 \}$ compact in X?

3. (5 Points) Let X be a normed space and let Y be a subspace of X that is closed. Does Y have to be finite dimensional?

4. (9 Points) Let Y be the set of all continuous functions defined on the finite interval [a, b] such that x(a) = x(b) and let X be the set of all continuous functions defined on [a, b].

(a) Show Y is a subspace of X.

(b) Is Y complete with respect to the maximum norm?

5. (9 Points) Let Y be the set of all continuous functions defined on the finite interval [a, b] such that x(a) = 2 and let X be the set of all continuous functions defined on [a, b].

(a) Is Y a subspace of X?

(b) Is Y a closed set relative to the maximum norm?

6. (6 Points) Give an example of an infinite dimensional metric space which is separable.


1. (12 Points) Let X be the set of all continuous functions defined on the finite closed interval [a, b] with the maximum metric. Let φ be a given element of X and define $T : X \to X$ by

$$(T(x))(t) = x(t) + \int_a^t \phi(s) \, x(s) \, ds$$

If $x_n$ is a Cauchy sequence in X, prove that $T(x_n)$ is also a Cauchy sequence.

2. (12 Points) Let A denote an arbitrary n × n matrix over the reals. Let X be the set of all such matrices endowed with the norm:

$$\| A \|^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij}^2$$

Prove

$$\| A^2 \| \le \| A \|^2$$
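A quick numerical spot check can make the inequality in this exercise concrete before proving it. The sketch below assumes the norm is the square root of the sum of squared entries (often called the Frobenius norm) and that $A^2$ means the matrix product of A with itself; the helper names `frobenius_sq` and `matmul` are hypothetical and chosen only for illustration.

```python
import random


def frobenius_sq(a):
    """Sum of the squares of all entries of the square matrix a."""
    return sum(x * x for row in a for x in row)


def matmul(a, b):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


random.seed(0)  # fixed seed so the spot check is repeatable
n = 4
a = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

lhs = frobenius_sq(matmul(a, a)) ** 0.5  # ||A^2|| under the assumed norm
rhs = frobenius_sq(a)                    # ||A||^2 under the assumed norm
print(lhs, rhs, lhs <= rhs)
```

One random matrix proves nothing, of course; the check only illustrates which quantities the inequality compares.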

C-3.2 Exam 2B

Instructions:
This is a closed book and closed notes test. You will need to give all the details of your arguments as that is the only way partial credit may be warranted.





1. (16 Points)

(a) (8 Points) Precisely define what it means for $f : [0, 1] \to \Re$ to be of bounded variation.

(b) (8 Points) Let $f, g : [0, 1] \to \Re$ be continuous functions on [0, 1]. Precisely define the meaning of the Riemann-Stieltjes integral $\int_0^1 f \, dg$.

2. (6 Points)

(a) (3 Points) Precisely define what it means for the set X to be an inner product space over the real numbers.

(b) (3 Points) State the Schwarz Inequality for this inner product space.

3. (6 Points)

(a) (3 Points) Let X be a vector space. Precisely define what it means for X to be algebraically reflexive.

(b) (3 Points) Let X be a normed space. Precisely define what it means for X to be reflexive in the context of bounded linear functionals.

4. (6 Points) State Riesz’s Lemma.


1. (4 Points) Is it necessarily true that every norm arises from an inner product?

2. (8 Points) For the following problem, you must give examples of various types of spaces; no space should be used as an answer twice.

(a) (2 Points) Give an example of a finite dimensional normed space.

(b) (2 Points) Give an example of an infinite dimensional normed space.

(c) (2 Points) Give an example of a finite dimensional inner product space.

(d) (2 Points) Give an example of an infinite dimensional inner product space.

3. (8 Points) For the following problem, you must give examples of various types of operators. Let X and Y be infinite dimensional normed spaces and let T : X → Y be an operator. You will have to tell me what X and Y are for your example.

(a) (2 Points) Give an example of T with ker(T) equal to $\{0\}$.

(b) (2 Points) Give an example of T with ker(T) not equal to $\{0\}$.

(c) (2 Points) Give an example of T that is not bounded.

(d) (2 Points) Give an example of T that is not linear.

4. (4 Points) Is it necessarily true that proper subspaces of any normed space are finite dimensional?

5. (10 Points)

(a) (4 Points) Characterize $(\ell^p)'$, p ≥ 1.

(b) (6 Points) Characterize $(C[0, 1])'$.


1. (16 Points) Let X be the set of all continuous functions defined on the finite closed interval [a, b] with the maximum metric. Let φ be a continuous real-valued function whose domain is all of $\Re$ that satisfies for some positive number K:

$$| \phi(u) - \phi(v) | \le K \, | u - v |.$$



Define the (possibly nonlinear) operator $T : X \to X$ by

$$(T(x))(t) = \int_a^t \phi(x(s)) \, ds$$

Prove using the standard ε − δ formalism that the operator T is continuous as a mapping from X to X.

2. (16 Points) Let X be the normed space $\ell^1$ with the usual norm. Then any x in X is a sequence

$$x = \{ \xi_1, \xi_2, \ldots \}$$

with $\sum_{i=1}^{\infty} | \xi_i | < \infty$. Define the operator $T : \ell^1 \to \ell^2$ by

$$T(x) = \sum_{i=1}^{\infty} | \xi_i |^2.$$

Prove that T is a well-defined linear operator and that $\| T \| = 1$.

C-3.3 Final B

Instructions:




1. (5 Points) A metric space X .

2. (5 Points) A normed space X .

3. (5 Points) An inner product space X.

4. (5 Points) A complete inner product space X.

5. (5 Points) A function of bounded variation on the interval [0, 1].

6. (5 Points) The Riemann-Stieltjes Integral of functions in C[0, 1].

7. (5 Points) The algebraic dual of a vector space X, $X^*$.

8. (5 Points) The continuous dual of a normed space X, $X'$.

9. (5 Points) An orthonormal sequence in an inner product space X.

10. (5 Points) A total orthonormal sequence in a Hilbert space X.

11. (5 Points) A sesquilinear form.

12. (5 Points) The adjoint of an operator in a Hilbert Space H.

Short Answer: (80 Points)

1. (4 Points) Is it necessarily true that a projection operator P defined for the closed subspace X of the Hilbert space H satisfies the equation P(I − P)x = 0 for all x in H?

2. (6 Points) For each of the following spaces X, we know what space is isomorphic to its dual space $X'$. What is that space?

(a) $l^p$, 1 < p < ∞

(b) $l^1$

(c) c

3. (15 Points) For each of the following spaces X, we know how to characterize a given linear functional in $X'$. Describe that characterization carefully:

(a) $f \in (l^p)'$, 1 < p < ∞

(b) $f \in (l^1)'$

(c) $f \in (c_0)'$

(d) $f \in (C[0, 1])'$

(e) $f \in H'$, where H is any Hilbert space.



4. (5 Points) Let H be a Hilbert space and T : H → H an operator. Is it necessarily true that the adjoint of T always exists?

5. (5 Points) Let H be a Hilbert space and T : H → H an operator. Is it possible for the distinct operators $S_1$ and $S_2$ on H to both satisfy for all x, y in H

$$\langle Tx, y \rangle = \langle x, S_1 y \rangle$$

$$\langle Tx, y \rangle = \langle x, S_2 y \rangle \ ?$$

6. (15 Points) We seek solutions x from C[0, 1] to the differential equation below where f is an arbitrary element of C[0, 1].

$$x'(t) + 3 (x'(t))^2 + 2 \int_0^t x(s) \, ds = f(t)$$

Convert this mapping into the form

$$T : \mathrm{dom}(T) \subset X \to X$$

by specifying the information below:

(a) A definition of T.

(b) A definition of the domain of T, dom(T).

Finally, answer the following questions:

(a) Is dom(T) a vector space?

(b) Is T a linear operator?

(c) Is T a linear functional?

7. (15 Points) Let Π be the partition of the interval [0, 1] consisting of P + 1 points given by

$$\Pi = \{ t_0 = 0 < t_1 < \ldots < t_P = 1 \}$$

Let A denote the 2 × 2 matrix

$$A = \begin{bmatrix} 5 & 2 \\ 2 & 2 \end{bmatrix}$$

For a given pair f and g from C[0, 1], let $V(f, g, t_i) \equiv V(f, g, i)$ be the row vector $[f(t_i), g(t_i)]$. Consider the problem of finding f and g to minimize:

$$\sum_{i=0}^{P} V(f, g, i) \, A \, V(f, g, i)'$$

Convert this mapping into the form

$$T : \mathrm{dom}(T) \subset X \to X$$

by specifying the information below:

(a) A definition of T.

(b) A definition of the domain of T, dom(T).

Finally, answer the following questions:

(a) Is dom(T) a vector space?

(b) Is T a linear operator?

(c) Is T a linear functional?

8. (15 Points) Let P be a photograph which has been discretized into an array P of size 640 × 480. Each element $P_{ij}$ in the array can take on one of 256 discrete values which represents a gray scale image of the original photograph. We need to compress this information into a matrix of size 320 × 240. Starting in the upper left hand corner of P, we can compress the original image by looking at 2 × 2 sub-matrices of P and applying the compression operation we choose to convert each of these sub-matrices into some scalar. The 1, 1 entry in the compressed digitized image thus comes from applying our compression operator to the 2 × 2 sub-matrix coming from the upper left corner

$$\begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix}$$

the 1, 2 entry in the compressed array comes from compressing the 2 × 2 sub-matrix

$$\begin{bmatrix} P_{13} & P_{14} \\ P_{23} & P_{24} \end{bmatrix}$$

and so on. For example, given the array

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

a primitive compression operation consists of applying the averaging operation given by computing the real number

$$\frac{a + b + c + d}{4.0}$$

and then mapping that real number to the closest integer in the range $\{0, 1, \ldots, 255\}$.

Convert this averaging mapping of a given discretized photograph P in

$$X = \Re^{640 \times 480}_{256}$$

(the subscript 256 reminds us that these are matrices that can only have entries out of the integers $\{0, 1, \ldots, 255\}$) into the form:

$$T : \mathrm{dom}(T) \subset X \to X$$

by specifying the information below:

(a) A definition of T.

(b) A definition of the domain of T, dom(T).

Finally, answer the following questions:

(a) Is dom(T) a vector space?

(b) Is T a linear operator?

(c) Is T a linear functional?
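The averaging compression described in this problem can be sketched in a few lines. This is only an illustration under stated assumptions: the image is stored as a list of rows with even dimensions, the function name `compress` is hypothetical, and ties (averages ending in .5) are rounded up, since the problem statement does not fix a tie-breaking rule.

```python
def compress(p):
    """Average each 2 x 2 block of p and round to the closest integer.

    p is a list of rows of integers in 0..255 with an even number of rows
    and columns (an assumption of this sketch).
    """
    rows, cols = len(p), len(p[0])
    out = []
    for i in range(0, rows, 2):
        out_row = []
        for j in range(0, cols, 2):
            # The 2 x 2 block with upper left corner at (i, j).
            avg = (p[i][j] + p[i][j + 1] + p[i + 1][j] + p[i + 1][j + 1]) / 4.0
            # avg is non-negative, so int(avg + 0.5) rounds to the nearest
            # integer, with ties going up (a choice made for this sketch).
            out_row.append(int(avg + 0.5))
        out.append(out_row)
    return out


# A small 2 x 4 example: two blocks, so the result has one row and two columns.
small = [[0, 1, 10, 20],
         [2, 3, 30, 40]]
print(compress(small))  # block averages are 1.5 and 25.0
```

Applied to a full 640 × 480 array, the same function returns a 320 × 240 array, which matches the sizes quoted in the problem.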


1. (15 Points) Let X be the set of all 640 × 480 matrices whose entries lie in the set of integers $\{0, 1, \ldots, 255\}$. Define the function $d : X \times X \to \Re$ by assigning to each matrix p and q from X the number d(p, q) defined by

$$d(p, q) = \max | p_{ij} - q_{ij} |$$

Prove that d is a metric on X.
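To get a feel for the function d before proving anything, it can help to compute it on small examples. The sketch below assumes the maximum is taken over all index pairs (i, j) and uses small 4 × 6 matrices rather than 640 × 480 ones for readability; the helper `random_matrix` is hypothetical.

```python
import random


def d(p, q):
    """Largest absolute entrywise difference between matrices p and q."""
    return max(abs(a - b)
               for row_p, row_q in zip(p, q)
               for a, b in zip(row_p, row_q))


def random_matrix(rows, cols, rng):
    """Matrix with integer entries drawn from 0..255."""
    return [[rng.randrange(256) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)  # fixed seed so the spot check is repeatable
p, q, r = (random_matrix(4, 6, rng) for _ in range(3))

print(d(p, p) == 0)                   # distance from a matrix to itself
print(d(p, q) == d(q, p))             # symmetry
print(d(p, r) <= d(p, q) + d(q, r))   # triangle inequality on one triple
```

These checks on one random triple are not a proof; they only show what each metric axiom asks of d.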

2. (15 Points) Let H be a Hilbert space and T : H → H be an operator with nontrivial kernel K. Let P be the projection operator onto K and Q be the projection operator onto $K^{\perp}$. Prove that

(a) $H = K \oplus K^{\perp}$

(b) The operator S = TQ is invertible on H

3. (15 Points) Let $(e_n)$ be a total orthonormal sequence in the Hilbert space H. Let a given nonzero element f have the expansion

$$f = a_0 e_0 + a_1 e_1 + a_2 e_2 + \ldots = \sum_{n=0}^{\infty} a_n e_n$$



Prove that

$$\frac{f}{\| f \|} = \sum_{n=0}^{\infty} b_n e_n$$

where

$$b_n = \frac{a_n}{\sqrt{\sum_{n=0}^{\infty} a_n^2}}$$

4. (15 Points) Let $(f_n)$ be the sequence of functions in C[0, 1] defined by $f_n(t) = t^n$ for all t in the interval [0, 1].

(a) Prove that $f_n$ converges to the function f defined by f(t) = 0 for all t in [0, 1] when C[0, 1] is endowed with the L2 inner product.

(b) Prove that $f_n$ does not converge to f when we simply use pointwise convergence on the interval [0, 1].
