6 Orthogonality and Least Squares

INTRODUCTORY EXAMPLE The North American Datum and GPS Navigation

Imagine starting a massive project that you estimate will take ten years and require the efforts of scores of people to construct and solve a 1,800,000 by 900,000 system of linear equations. That is exactly what the National Geodetic Survey did in 1974, when it set out to update the North American Datum (NAD)—a network of 268,000 precisely located reference points that span the entire North American continent, together with Greenland, Hawaii, the Virgin Islands, Puerto Rico, and other Caribbean islands. The recorded latitudes and longitudes in the NAD must be determined to within a few centimeters because they form the basis for all surveys, maps, legal property boundaries, and layouts of civil engineering projects such as highways and public utility lines. However, more than 200,000 new points had been added to the datum since the last adjustment in 1927, and errors had gradually accumulated over the years, due to imprecise measurements and shifts in the earth's crust. Data gathering for the NAD readjustment was completed in 1983.

The system of equations for the NAD had no solution in the ordinary sense, but rather had a least-squares solution, which assigned latitudes and longitudes to the reference points in a way that corresponded best to the 1.8 million observations. The least-squares solution was found in 1986 by solving a related system of so-called normal equations, which involved 928,735 equations in 928,735 variables.¹

More recently, knowledge of reference points on the ground has become crucial for accurately determining the locations of satellites in the satellite-based Global Positioning System (GPS). A GPS satellite calculates its position relative to the earth by measuring the time it takes for signals to arrive from three ground transmitters.
To do this, the satellites use precise atomic clocks that have been synchronized with ground stations (whose locations are known accurately because of the NAD). The Global Positioning System is used both for determining the locations of new reference points on the ground and for finding a user's position on the ground relative to established maps. When a car driver (or a mountain climber) turns on a GPS receiver, the receiver measures the relative arrival times of signals from at least three satellites. This information, together with the transmitted data about the satellites' locations and message times, is used to adjust the GPS receiver's time and to determine its approximate location on the earth. Given information from a fourth satellite, the GPS receiver can even establish its approximate altitude.

¹ A mathematical discussion of the solution strategy (along with details of the entire NAD project) appears in North American Datum of 1983, Charles R. Schwarz (ed.), National Geodetic Survey, National Oceanic and Atmospheric Administration (NOAA) Professional Paper NOS 2, 1989.
… system of equations. A careful explanation of this apparent contradiction will require ideas developed in the first five sections of this chapter.
In order to find an approximate solution to an inconsistent system of equations that has no actual solution, a well-defined notion of nearness is needed. Section 6.1 introduces the concepts of distance and orthogonality in a vector space. Sections 6.2 and 6.3 show how orthogonality can be used to identify the point within a subspace W that is nearest to a point y lying outside of W. By taking W to be the column space of a matrix, Section 6.5 develops a method for producing approximate ("least-squares") solutions for inconsistent linear systems, such as the system solved for the NAD report.

Section 6.4 provides another opportunity to see orthogonal projections at work, creating a matrix factorization widely used in numerical linear algebra. The remaining sections examine some of the many least-squares problems that arise in applications, including those in vector spaces more general than ℝⁿ.

These concepts provide powerful geometric tools for solving many applied problems, including the least-squares problems mentioned above. All three notions are defined in terms of the inner product of two vectors.
The Inner Product

If u and v are vectors in ℝⁿ, then we regard u and v as n × 1 matrices. The transpose uᵀ is a 1 × n matrix, and the matrix product uᵀv is a 1 × 1 matrix, which we write as a single real number (a scalar) without brackets. The number uᵀv is called the inner product of u and v, and often it is written as u·v. This inner product, mentioned in the exercises for Section 2.1, is also referred to as a dot product. If

\[ u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \quad\text{and}\quad v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \]

then the inner product of u and v is

\[ \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n \]
6.1 Inner Product, Length, and Orthogonality
EXAMPLE 1 Compute u·v and v·u for u = (2, −5, −1) and v = (3, 2, −3).

SOLUTION

\[ u\cdot v = u^T v = \begin{bmatrix} 2 & -5 & -1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ -3 \end{bmatrix} = (2)(3) + (-5)(2) + (-1)(-3) = -1 \]

\[ v\cdot u = v^T u = \begin{bmatrix} 3 & 2 & -3 \end{bmatrix} \begin{bmatrix} 2 \\ -5 \\ -1 \end{bmatrix} = (3)(2) + (2)(-5) + (-3)(-1) = -1 \]
It is clear from the calculations in Example 1 why u·v = v·u. This commutativity of the inner product holds in general. The following properties of the inner product are easily deduced from properties of the transpose operation in Section 2.1. (See Exercises 21 and 22 at the end of this section.)

THEOREM 1 Let u, v, and w be vectors in ℝⁿ, and let c be a scalar. Then
a. u·v = v·u
b. (u + v)·w = u·w + v·w
c. (cu)·v = c(u·v) = u·(cv)
DEFINITION The length (or norm) of v is the nonnegative scalar ‖v‖ defined by

\[ \|v\| = \sqrt{v\cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}, \quad\text{and}\quad \|v\|^2 = v\cdot v \]
Suppose v is in ℝ², say, v = (a, b). If we identify v with a geometric point in the plane, as usual, then ‖v‖ coincides with the standard notion of the length of the line segment from the origin to v. This follows from the Pythagorean Theorem applied to a triangle such as the one in Fig. 1.

FIGURE 1 Interpretation of ‖v‖ as length.

A similar calculation with the diagonal of a rectangular box shows that the definition of length of a vector v in ℝ³ coincides with the usual notion of length. For any scalar c, the length of cv is |c| times the length of v. That is,

\[ \|cv\| = |c|\,\|v\| \]

(To see this, compute ‖cv‖² = (cv)·(cv) = c²v·v = c²‖v‖² and take square roots.)
A vector whose length is 1 is called a unit vector. If we divide a nonzero vector v by its length—that is, multiply by 1/‖v‖—we obtain a unit vector u because the length of u is (1/‖v‖)‖v‖. The process of creating u from v is sometimes called normalizing v, and we say that u is in the same direction as v.
EXAMPLE 2 Let v = (1, −2, 2, 0). Find a unit vector u in the same direction as v.

SOLUTION First, compute the length of v:

\[ \|v\|^2 = v\cdot v = (1)^2 + (-2)^2 + (2)^2 + (0)^2 = 9, \qquad \|v\| = \sqrt{9} = 3 \]

Then, multiply v by 1/‖v‖ to obtain

\[ u = \frac{1}{\|v\|}\, v = \frac{1}{3}\, v = \frac{1}{3} \begin{bmatrix} 1 \\ -2 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 1/3 \\ -2/3 \\ 2/3 \\ 0 \end{bmatrix} \]

To check that ‖u‖ = 1, it suffices to show that ‖u‖² = 1:

\[ \|u\|^2 = u\cdot u = \left(\tfrac{1}{3}\right)^2 + \left(-\tfrac{2}{3}\right)^2 + \left(\tfrac{2}{3}\right)^2 + (0)^2 = \tfrac{1}{9} + \tfrac{4}{9} + \tfrac{4}{9} + 0 = 1 \]
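Normalization is easy to replay numerically. A short sketch of Example 2, assuming NumPy:

```python
import numpy as np

# Normalizing v = (1, -2, 2, 0) as in Example 2: divide by its length
# to get a unit vector in the same direction.
v = np.array([1.0, -2.0, 2.0, 0.0])
length = np.linalg.norm(v)        # sqrt(1 + 4 + 4 + 0) = 3
u = v / length
print(u)                          # [ 0.333... -0.666...  0.666...  0. ]
print(np.linalg.norm(u))          # 1.0
```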
EXAMPLE 3 Let W be the subspace of ℝ² spanned by x = (2/3, 1). Find a unit vector z that is a basis for W.

SOLUTION W consists of all multiples of x, as in Fig. 2(a). Any nonzero vector in W is a basis for W. To simplify the calculation, "scale" x to eliminate fractions. That is, multiply x by 3 to get

\[ y = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \]

Now compute ‖y‖² = 2² + 3² = 13, ‖y‖ = √13, and normalize y to get

\[ z = \frac{1}{\sqrt{13}} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 2/\sqrt{13} \\ 3/\sqrt{13} \end{bmatrix} \]

See Fig. 2(b). Another unit vector is (−2/√13, −3/√13).
FIGURE 2 Normalizing a vector to produce a unit vector.
Distance in ℝⁿ

We are ready now to describe how close one vector is to another. Recall that if a and b are real numbers, the distance on the number line between a and b is the number |a − b|. Two examples are shown in Fig. 3. This definition of distance in ℝ has a direct analogue in ℝⁿ.

DEFINITION For u and v in ℝⁿ, the distance between u and v, written as dist(u, v), is the length of the vector u − v. That is,

\[ \operatorname{dist}(u, v) = \|u - v\| \]
EXAMPLE 4 Compute the distance between the vectors u = (7, 1) and v = (3, 2).

SOLUTION Calculate

\[ u - v = \begin{bmatrix} 7 \\ 1 \end{bmatrix} - \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ -1 \end{bmatrix}, \qquad \|u - v\| = \sqrt{4^2 + (-1)^2} = \sqrt{17} \]
The vectors u, v, and u − v are shown in Fig. 4. When the vector u − v is added to v, the result is u. Notice that the parallelogram in Fig. 4 shows that the distance from u to v is the same as the distance from u − v to 0.

FIGURE 4 The distance between u and v is the length of u − v.
EXAMPLE 5 If u = (u₁, u₂, u₃) and v = (v₁, v₂, v₃), then

\[ \operatorname{dist}(u, v) = \|u - v\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + (u_3 - v_3)^2} \]
Orthogonal Vectors

Consider ℝ² or ℝ³ and two lines through the origin determined by vectors u and v. The two lines shown in Fig. 5 are geometrically perpendicular if and only if the distance from u to v is the same as the distance from u to −v. This is the same as requiring the squares of the distances to be the same. Now
\[
\begin{aligned}
[\operatorname{dist}(u, -v)]^2 &= \|u - (-v)\|^2 = \|u + v\|^2 \\
&= (u + v)\cdot(u + v) \\
&= u\cdot(u + v) + v\cdot(u + v) && \text{Theorem 1(b)} \\
&= u\cdot u + u\cdot v + v\cdot u + v\cdot v && \text{Theorem 1(a), (b)} \\
&= \|u\|^2 + \|v\|^2 + 2u\cdot v && \text{Theorem 1(a)}
\end{aligned}
\tag{1}
\]
The same calculations with v and −v interchanged show that

\[ [\operatorname{dist}(u, v)]^2 = \|u\|^2 + \|{-v}\|^2 + 2u\cdot(-v) = \|u\|^2 + \|v\|^2 - 2u\cdot v \]

The two squared distances are equal if and only if 2u·v = −2u·v, which happens if and only if u·v = 0.

This calculation shows that when vectors u and v are identified with geometric points, the corresponding lines through the points and the origin are perpendicular if and only if u·v = 0. The following definition generalizes to ℝⁿ this notion of perpendicularity (or orthogonality, as it is commonly called in linear algebra).
DEFINITION Two vectors u and v in ℝⁿ are orthogonal (to each other) if u·v = 0.

Observe that the zero vector is orthogonal to every vector in ℝⁿ because 0ᵀv = 0 for all v.

THEOREM 2 The Pythagorean Theorem
Two vectors u and v are orthogonal if and only if ‖u + v‖² = ‖u‖² + ‖v‖².
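The Pythagorean Theorem is easy to verify numerically. The vectors below are chosen only for illustration (u·v = 6 − 4 − 2 = 0):

```python
import numpy as np

# Two vectors are orthogonal exactly when ||u + v||^2 = ||u||^2 + ||v||^2.
u = np.array([3.0, 1.0, -2.0])
v = np.array([2.0, -4.0, 1.0])    # u.v = 6 - 4 - 2 = 0, so orthogonal

assert u @ v == 0
lhs = np.linalg.norm(u + v) ** 2
rhs = np.linalg.norm(u) ** 2 + np.linalg.norm(v) ** 2
print(lhs, rhs)                   # equal, up to roundoff
```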
Orthogonal Complements

To provide practice using inner products, we introduce a concept here that will be of use in Section 6.3 and elsewhere in the chapter. If a vector z is orthogonal to every vector in a subspace W of ℝⁿ, then z is said to be orthogonal to W. The set of all vectors z that are orthogonal to W is called the orthogonal complement of W and is denoted by W⊥ (and read as "W perpendicular" or simply "W perp").
EXAMPLE 6 Let W be a plane through the origin in ℝ³, and let L be the line through the origin and perpendicular to W. If z and w are nonzero, z is on L, and w is in W, then the line segment from 0 to z is perpendicular to the line segment from 0 to w; that is, z·w = 0. See Fig. 7. So each vector on L is orthogonal to every w in W. In fact, L consists of all vectors that are orthogonal to the w's in W, and W consists of all vectors orthogonal to the z's in L. That is,

\[ L = W^{\perp} \quad\text{and}\quad W = L^{\perp} \]

FIGURE 7 A plane and line through 0 as orthogonal complements.
The following two facts about W⊥, with W a subspace of ℝⁿ, are needed later in the chapter. Proofs are outlined in Exercises 29 and 30.

1. A vector x is in W⊥ if and only if x is orthogonal to every vector in a set that spans W.
2. W⊥ is a subspace of ℝⁿ.

FIGURE 8 The fundamental subspaces determined by an m × n matrix A.
THEOREM 3 Let A be an m × n matrix. The orthogonal complement of the row space of A is the null space of A, and the orthogonal complement of the column space of A is the null space of Aᵀ:

\[ (\operatorname{Row} A)^{\perp} = \operatorname{Nul} A \quad\text{and}\quad (\operatorname{Col} A)^{\perp} = \operatorname{Nul} A^{T} \]

PROOF The row–column rule for computing Ax shows that if x is in Nul A, then x is orthogonal to each row of A (with the rows treated as vectors in ℝⁿ). Since the rows of A span the row space, x is orthogonal to Row A. Conversely, if x is orthogonal to Row A, then x is certainly orthogonal to each row of A, and hence Ax = 0. This proves the first statement of the theorem. Since this statement is true for any matrix, it is true for Aᵀ. That is, the orthogonal complement of the row space of Aᵀ is the null space of Aᵀ. This proves the second statement, because Row Aᵀ = Col A.
Angles in ℝ² and ℝ³ (Optional)

If u and v are nonzero vectors in either ℝ² or ℝ³, then there is a nice connection between their inner product and the angle ϑ between the two line segments from the origin to the points identified with u and v. The formula is

\[ u\cdot v = \|u\|\,\|v\| \cos \vartheta \tag{2} \]
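Equation (2) gives a direct way to compute the angle between two vectors. A small sketch with illustrative vectors:

```python
import numpy as np

# cos(theta) between u and v, from u.v = ||u|| ||v|| cos(theta).
u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])

cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.degrees(np.arccos(cos_theta))
print(theta)                      # 45.0 (degrees)
```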
22. Let u = (u₁, u₂, u₃). Explain why u·u ≥ 0. When is u·u = 0?
23. Let u = (2, −5, −1) and v = (−7, −4, 6). Compute and compare u·v, ‖u‖², ‖v‖², and ‖u + v‖². Do not use the Pythagorean Theorem.
24. Verify the parallelogram law for vectors u and v in ℝⁿ:

\[ \|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2 \]
25. Let v = (a, b). Describe the set H of vectors (x, y) that are orthogonal to v. [Hint: Consider v = 0 and v ≠ 0.]
26. Let u = (5, −6, 7), and let W be the set of all x in ℝ³ such that u·x = 0. What theorem in Chapter 4 can be used to show that W is a subspace of ℝ³? Describe W in geometric language.

27. Suppose a vector y is orthogonal to vectors u and v. Show that y is orthogonal to the vector u + v.

28. Suppose y is orthogonal to u and v. Show that y is orthogonal to every w in Span{u, v}. [Hint: An arbitrary w in Span{u, v} has the form w = c₁u + c₂v. Show that y is orthogonal to such a vector w.]
29. Let W = Span{v₁, …, vₚ}. Show that if x is orthogonal to each vⱼ, for 1 ≤ j ≤ p, then x is orthogonal to every vector in W.

30. Let W be a subspace of ℝⁿ, and let W⊥ be the set of all vectors orthogonal to W. Show that W⊥ is a subspace of ℝⁿ using the following steps.
a. Take z in W⊥, and let u represent any element of W. Then z·u = 0. Take any scalar c and show that cz is orthogonal to u. (Since u was an arbitrary element of W, this will show that cz is in W⊥.)
b. Take z₁ and z₂ in W⊥, and let u be any element of W. Show that z₁ + z₂ is orthogonal to u. What can you conclude about z₁ + z₂? Why?
c. Finish the proof that W⊥ is a subspace of ℝⁿ.

31. Show that if x is in both W and W⊥, then x = 0.
32. [M] Construct a pair u, v of random vectors in ℝ⁴, and let

\[ A = \begin{bmatrix} .5 & .5 & .5 & .5 \\ .5 & .5 & -.5 & -.5 \\ .5 & -.5 & .5 & -.5 \\ .5 & -.5 & -.5 & .5 \end{bmatrix} \]

a. Denote the columns of A by a₁, …, a₄. Compute the length of each column, and compute a₁·a₂, a₁·a₃, a₁·a₄, a₂·a₃, a₂·a₄, and a₃·a₄.
b. Compute and compare the lengths of u, Au, v, and Av.
c. Use equation (2) in this section to compute the cosine of the angle between u and v. Compare this with the cosine of the angle between Au and Av.
d. Repeat parts (b) and (c) for two other pairs of random vectors. What do you conjecture about the effect of A on vectors?
33. [M] Generate random vectors x, y, and v in ℝ⁴ with integer entries (and v ≠ 0), and compute the quantities

\[ \left(\frac{x\cdot v}{v\cdot v}\right) v, \quad \left(\frac{y\cdot v}{v\cdot v}\right) v, \quad \frac{(x + y)\cdot v}{v\cdot v}\, v, \quad \frac{(10x)\cdot v}{v\cdot v}\, v \]

Repeat the computations with new random vectors x and y. What do you conjecture about the mapping x ↦ T(x) = ((x·v)/(v·v))v (for v ≠ 0)? Verify your conjecture algebraically.
34. [M] Let

\[ A = \begin{bmatrix} -6 & 3 & -27 & -33 & -13 \\ 6 & -5 & 25 & 28 & 14 \\ 8 & -6 & 34 & 38 & 18 \\ 12 & -10 & 50 & 41 & 23 \\ 14 & -21 & 49 & 29 & 33 \end{bmatrix} \]

Construct a matrix N whose columns form a basis for Nul A, and construct a matrix R whose rows form a basis for Row A (see Section 4.6 for details). Perform a matrix computation with N and R that illustrates a fact from Theorem 3.
SOLUTIONS TO PRACTICE PROBLEMS
1. a·b = 7, a·a = 5. Hence

\[ \frac{a\cdot b}{a\cdot a} = \frac{7}{5}, \quad\text{and}\quad \left(\frac{a\cdot b}{a\cdot a}\right) a = \frac{7}{5}\, a = \begin{bmatrix} -14/5 \\ 7/5 \end{bmatrix} \]
2. Scale c, multiplying by 3 to get y = (4, −3, 2). Compute ‖y‖² = 29 and ‖y‖ = √29. The unit vector in the direction of both c and y is

\[ u = \frac{1}{\|y\|}\, y = \begin{bmatrix} 4/\sqrt{29} \\ -3/\sqrt{29} \\ 2/\sqrt{29} \end{bmatrix} \]
3. d is orthogonal to c, because

\[ d\cdot c = \begin{bmatrix} 5 \\ 6 \\ -1 \end{bmatrix} \cdot \begin{bmatrix} 4/3 \\ -1 \\ 2/3 \end{bmatrix} = \frac{20}{3} - 6 - \frac{2}{3} = 0 \]

4. d is orthogonal to u because u has the form kc for some k, and

\[ d\cdot u = d\cdot(kc) = k(d\cdot c) = k(0) = 0 \]
An Orthogonal Projection

Given a nonzero vector u in ℝⁿ, consider the problem of decomposing a vector y in ℝⁿ into the sum of two vectors, one a multiple of u and the other orthogonal to u. We wish to write

\[ y = \hat{y} + z \tag{1} \]
where ŷ = αu for some scalar α and z is some vector orthogonal to u. See Fig. 2.

FIGURE 2 Finding α to make y − ŷ orthogonal to u.

Given any scalar α, let z = y − αu, so that (1) is satisfied. Then y − ŷ is orthogonal to u if and only if

\[ 0 = (y - \alpha u)\cdot u = y\cdot u - (\alpha u)\cdot u = y\cdot u - \alpha(u\cdot u) \]

That is, (1) is satisfied with z orthogonal to u if and only if α = (y·u)/(u·u) and ŷ = ((y·u)/(u·u))u. The vector ŷ is called the orthogonal projection of y onto u, and the vector z is called the component of y orthogonal to u.

If c is any nonzero scalar and if u is replaced by cu in the definition of ŷ, then the orthogonal projection of y onto cu is exactly the same as the orthogonal projection of y onto u (Exercise 31). Hence this projection is determined by the subspace L spanned by u (the line through u and 0). Sometimes ŷ is denoted by proj_L y and is called the orthogonal projection of y onto L. That is,

\[ \hat{y} = \operatorname{proj}_L y = \frac{y\cdot u}{u\cdot u}\, u \tag{2} \]
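Formula (2) translates directly into code. A minimal sketch using the vectors of Example 3 below (the helper name `proj_line` is ours, not the text's):

```python
import numpy as np

def proj_line(y, u):
    """Orthogonal projection of y onto L = Span{u} (u nonzero),
    following formula (2): proj_L y = (y.u / u.u) u."""
    return (y @ u) / (u @ u) * u

# Example 3's numbers: y = (7, 6), u = (4, 2).
y = np.array([7.0, 6.0])
u = np.array([4.0, 2.0])
y_hat = proj_line(y, u)
z = y - y_hat                      # component of y orthogonal to u
print(y_hat, z)                    # [8. 4.] [-1.  2.]
```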
EXAMPLE 3 Let y = (7, 6) and u = (4, 2). Find the orthogonal projection of y onto u. Then write y as the sum of two orthogonal vectors, one in Span{u} and one orthogonal to u.

SOLUTION Compute

\[ y\cdot u = \begin{bmatrix} 7 \\ 6 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ 2 \end{bmatrix} = 40, \qquad u\cdot u = \begin{bmatrix} 4 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ 2 \end{bmatrix} = 20 \]

The orthogonal projection of y onto u is

\[ \hat{y} = \frac{y\cdot u}{u\cdot u}\, u = \frac{40}{20}\, u = 2 \begin{bmatrix} 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ 4 \end{bmatrix} \]

and the component of y orthogonal to u is

\[ y - \hat{y} = \begin{bmatrix} 7 \\ 6 \end{bmatrix} - \begin{bmatrix} 8 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \end{bmatrix} \]

The sum of these two vectors is y. That is,

\[ \underbrace{\begin{bmatrix} 7 \\ 6 \end{bmatrix}}_{y} = \underbrace{\begin{bmatrix} 8 \\ 4 \end{bmatrix}}_{\hat{y}} + \underbrace{\begin{bmatrix} -1 \\ 2 \end{bmatrix}}_{y - \hat{y}} \]

This decomposition of y is illustrated in Fig. 3. Note: If the calculations above are correct, then {ŷ, y − ŷ} will be an orthogonal set. As a check, compute

\[ \hat{y}\cdot(y - \hat{y}) = \begin{bmatrix} 8 \\ 4 \end{bmatrix} \cdot \begin{bmatrix} -1 \\ 2 \end{bmatrix} = -8 + 8 = 0 \]
Since the line segment in Fig. 3 between y and ŷ is perpendicular to L, by construction of ŷ, the point identified with ŷ is the closest point of L to y. (This can be proved from geometry. We will assume this for ℝ² now and prove it for ℝⁿ in Section 6.3.)
6.2 Orthogonal Sets
FIGURE 3 The orthogonal projection of y onto a line L through the origin.
EXAMPLE 4 Find the distance in Fig. 3 from y to L.

SOLUTION The distance from y to L is the length of the perpendicular line segment from y to the orthogonal projection ŷ. This length equals the length of y − ŷ. Thus the distance is

\[ \|y - \hat{y}\| = \sqrt{(-1)^2 + 2^2} = \sqrt{5} \]
A Geometric Interpretation of Theorem 5

The formula for the orthogonal projection ŷ in (2) has the same appearance as each of the terms in Theorem 5. Thus Theorem 5 decomposes a vector y into a sum of orthogonal projections onto one-dimensional subspaces.

It is easy to visualize the case in which W = ℝ² = Span{u₁, u₂}, with u₁ and u₂ orthogonal. Any y in ℝ² can be written in the form

\[ y = \frac{y\cdot u_1}{u_1\cdot u_1}\, u_1 + \frac{y\cdot u_2}{u_2\cdot u_2}\, u_2 \tag{3} \]
The first term in (3) is the projection of y onto the subspace spanned by u₁ (the line through u₁ and the origin), and the second term is the projection of y onto the subspace spanned by u₂. Thus (3) expresses y as the sum of its projections onto the (orthogonal) axes determined by u₁ and u₂. See Fig. 4.

FIGURE 4 A vector decomposed into the sum of two projections.

Theorem 5 decomposes each y in Span{u₁, …, uₚ} into the sum of p projections onto one-dimensional subspaces that are mutually orthogonal.
Decomposing a Force into Component Forces

The decomposition in Fig. 4 can occur in physics when some sort of force is applied to an object. Choosing an appropriate coordinate system allows the force to be represented by a vector y in ℝ² or ℝ³. Often the problem involves some particular direction of interest, which is represented by another vector u. For instance, if the object is moving in a straight line when the force is applied, the vector u might point in the direction of movement, as in Fig. 5. A key step in the problem is to decompose the force into a component in the direction of u and a component orthogonal to u. The calculations would be analogous to those made in Example 3 above.

FIGURE 5
Orthonormal Sets

A set {u₁, …, uₚ} is an orthonormal set if it is an orthogonal set of unit vectors. If W is the subspace spanned by such a set, then {u₁, …, uₚ} is an orthonormal basis for W, since the set is automatically linearly independent.
THEOREM 6 An m × n matrix U has orthonormal columns if and only if UᵀU = I.

PROOF To simplify notation, we suppose that U has only three columns, each a vector in ℝᵐ. The proof of the general case is essentially the same. Let U = [u₁ u₂ u₃] and compute

\[ U^T U = \begin{bmatrix} u_1^T \\ u_2^T \\ u_3^T \end{bmatrix} \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} = \begin{bmatrix} u_1^T u_1 & u_1^T u_2 & u_1^T u_3 \\ u_2^T u_1 & u_2^T u_2 & u_2^T u_3 \\ u_3^T u_1 & u_3^T u_2 & u_3^T u_3 \end{bmatrix} \tag{4} \]

The entries in the matrix at the right are inner products, using transpose notation. The columns of U are orthogonal if and only if

\[ u_1^T u_2 = u_2^T u_1 = 0, \quad u_1^T u_3 = u_3^T u_1 = 0, \quad u_2^T u_3 = u_3^T u_2 = 0 \tag{5} \]

The columns of U all have unit length if and only if

\[ u_1^T u_1 = 1, \quad u_2^T u_2 = 1, \quad u_3^T u_3 = 1 \tag{6} \]

The theorem follows immediately from (4)–(6).
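Theorem 6 is a one-line check in code. A sketch using the 3 × 2 matrix U of Example 6 below:

```python
import numpy as np

# Theorem 6: U has orthonormal columns exactly when U^T U = I.
U = np.array([[1/np.sqrt(2),  2/3],
              [1/np.sqrt(2), -2/3],
              [0.0,           1/3]])

print(np.allclose(U.T @ U, np.eye(2)))   # True
```

Note that for a rectangular U like this one, UUᵀ is not the identity; only UᵀU is.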
THEOREM 7 Let U be an m × n matrix with orthonormal columns, and let x and y be in ℝⁿ. Then
a. ‖Ux‖ = ‖x‖
b. (Ux)·(Uy) = x·y
c. (Ux)·(Uy) = 0 if and only if x·y = 0

Properties (a) and (c) say that the linear mapping x ↦ Ux preserves lengths and orthogonality. These properties are crucial for many computer algorithms. See Exercise 25 for the proof of Theorem 7.
EXAMPLE 6 Let

\[ U = \begin{bmatrix} 1/\sqrt{2} & 2/3 \\ 1/\sqrt{2} & -2/3 \\ 0 & 1/3 \end{bmatrix} \quad\text{and}\quad x = \begin{bmatrix} \sqrt{2} \\ 3 \end{bmatrix} \]

Notice that U has orthonormal columns and

\[ U^T U = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ 2/3 & -2/3 & 1/3 \end{bmatrix} \begin{bmatrix} 1/\sqrt{2} & 2/3 \\ 1/\sqrt{2} & -2/3 \\ 0 & 1/3 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

Verify that ‖Ux‖ = ‖x‖.
SOLUTION

\[ U x = \begin{bmatrix} 1/\sqrt{2} & 2/3 \\ 1/\sqrt{2} & -2/3 \\ 0 & 1/3 \end{bmatrix} \begin{bmatrix} \sqrt{2} \\ 3 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ 1 \end{bmatrix} \]

\[ \|U x\| = \sqrt{9 + 1 + 1} = \sqrt{11}, \qquad \|x\| = \sqrt{2 + 9} = \sqrt{11} \]
Theorems 6 and 7 are particularly useful when applied to square matrices. An orthogonal matrix is a square invertible matrix U such that U⁻¹ = Uᵀ. By Theorem 6, such a matrix has orthonormal columns.¹ It is easy to see that any square matrix with orthonormal columns is an orthogonal matrix. Surprisingly, such a matrix must have orthonormal rows, too. See Exercises 27 and 28. Orthogonal matrices will appear frequently in Chapter 7.
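A rotation matrix is a familiar concrete instance of an orthogonal matrix. The sketch below (the angle is chosen arbitrarily) checks that both its columns and its rows are orthonormal, and that U⁻¹ = Uᵀ:

```python
import numpy as np

# A 2x2 rotation matrix is a square orthogonal matrix.
t = 0.3                                   # any angle works here
U = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

print(np.allclose(U.T @ U, np.eye(2)))    # True: orthonormal columns
print(np.allclose(U @ U.T, np.eye(2)))    # True: orthonormal rows
print(np.allclose(np.linalg.inv(U), U.T)) # True: U^{-1} = U^T
```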
23. a. Not every linearly independent set in ℝⁿ is an orthogonal set.
b. If y is a linear combination of nonzero vectors from an orthogonal set, then the weights in the linear combination can be computed without row operations on a matrix.
c. If the vectors in an orthogonal set of nonzero vectors are normalized, then some of the new vectors may not be orthogonal.
d. A matrix with orthonormal columns is an orthogonal matrix.
e. If L is a line through 0 and if ŷ is the orthogonal projection of y onto L, then ‖ŷ‖ gives the distance from y to L.

24. a. Not every orthogonal set in ℝⁿ is linearly independent.
b. If a set S = {u₁, …, uₚ} has the property that uᵢ·uⱼ = 0 whenever i ≠ j, then S is an orthonormal set.
c. If the columns of an m × n matrix A are orthonormal, then the linear mapping x ↦ Ax preserves lengths.
d. The orthogonal projection of y onto v is the same as the orthogonal projection of y onto cv whenever c ≠ 0.
e. An orthogonal matrix is invertible.
25. Prove Theorem 7. [Hint: For (a), compute ‖Ux‖², or prove (b) first.]

26. Suppose W is a subspace of ℝⁿ spanned by n nonzero orthogonal vectors. Explain why W = ℝⁿ.

27. Let U be a square matrix with orthonormal columns. Explain why U is invertible. (Mention the theorems you use.)

28. Let U be an n × n orthogonal matrix. Show that the rows of U form an orthonormal basis of ℝⁿ.

29. Let U and V be n × n orthogonal matrices. Explain why UV is an orthogonal matrix. [That is, explain why UV is invertible and its inverse is (UV)ᵀ.]

30. Let U be an orthogonal matrix, and construct V by interchanging some of the columns of U. Explain why V is an orthogonal matrix.

31. Show that the orthogonal projection of a vector y onto a line L through the origin in ℝ² does not depend on the choice of the nonzero u in L used in the formula for ŷ. To do this, suppose y and u are given and ŷ has been computed by formula (2) in this section. Replace u in that formula by cu, where c is an unspecified nonzero scalar. Show that the new formula gives the same ŷ.

33. Given u ≠ 0 in ℝⁿ, let L = Span{u}. Show that the mapping x ↦ proj_L x is a linear transformation.

34. Given u ≠ 0 in ℝⁿ, let L = Span{u}. For y in ℝⁿ, the reflection of y in L is the point refl_L y defined by
\[ \operatorname{refl}_L y = 2\,\operatorname{proj}_L y - y \]

See the figure, which shows that refl_L y is the sum of ŷ = proj_L y and ŷ − y. Show that the mapping y ↦ refl_L y is a linear transformation.

The reflection of y in a line through the origin.
35. [M] Show that the columns of the matrix A are orthogonal by making an appropriate matrix calculation. State the calculation you use.

\[ A = \begin{bmatrix} -6 & -3 & 6 & 1 \\ -1 & 2 & 1 & -6 \\ 3 & 6 & 3 & -2 \\ 6 & -3 & 6 & -1 \\ 2 & -1 & 2 & 3 \\ -3 & 6 & 3 & 2 \\ -2 & -1 & 2 & -3 \\ 1 & 2 & 1 & 6 \end{bmatrix} \]
36. [M] In parts (a)–(d), let U be the matrix formed by normalizing each column of the matrix A in Exercise 35.
a. Compute UᵀU and UUᵀ. How do they differ?
b. Generate a random vector y in ℝ⁸, and compute p = UUᵀy and z = y − p. Explain why p is in Col A. Verify that z is orthogonal to p.
c. Verify that z is orthogonal to each column of U.
d. Notice that y = p + z, with p in Col A. Explain why z is in (Col A)⊥. (The significance of this decomposition of y will be explained in the next section.)
SOLUTIONS TO PRACTICE PROBLEMS
1. The vectors are orthogonal because

\[ u_1\cdot u_2 = -2/5 + 2/5 = 0 \]

They are unit vectors because

\[ \|u_1\|^2 = (-1/\sqrt{5})^2 + (2/\sqrt{5})^2 = 1/5 + 4/5 = 1 \]
\[ \|u_2\|^2 = (2/\sqrt{5})^2 + (1/\sqrt{5})^2 = 4/5 + 1/5 = 1 \]

In particular, the set {u₁, u₂} is linearly independent, and hence is a basis for ℝ² since there are two vectors in the set.
2. When y = (7, 6) and u = (2, 1),

\[ \hat{y} = \frac{y\cdot u}{u\cdot u}\, u = \frac{20}{5} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = 4 \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 8 \\ 4 \end{bmatrix} \]

This is the same ŷ found in Example 3. The orthogonal projection does not seem to depend on the u chosen on the line. See Exercise 31.
3. Compute

\[ U y = \begin{bmatrix} 1/\sqrt{2} & 2/3 \\ 1/\sqrt{2} & -2/3 \\ 0 & 1/3 \end{bmatrix} \begin{bmatrix} -3\sqrt{2} \\ 6 \end{bmatrix} = \begin{bmatrix} 1 \\ -7 \\ 2 \end{bmatrix} \]

Also, from Example 6, x = (√2, 3) and Ux = (3, −1, 1). Hence

\[ Ux\cdot Uy = 3 + 7 + 2 = 12, \quad\text{and}\quad x\cdot y = -6 + 18 = 12 \]
… ℝⁿ, there is a vector ŷ in W such that (1) ŷ is the unique vector in W for which y − ŷ is orthogonal to W, and (2) ŷ is the unique vector in W closest to y. See Fig. 1. These two properties of ŷ provide the key to finding least-squares solutions of linear systems, mentioned in the introductory example for this chapter. The full story will be told in Section 6.5.
To prepare for the first theorem, observe that whenever a vector y is written as a linear combination of vectors u₁, …, uₙ in ℝⁿ, the terms in the sum for y can be grouped into two parts so that y can be written as

\[ y = z_1 + z_2 \]

where z₁ is a linear combination of some of the uᵢ and z₂ is a linear combination of the rest of the uᵢ. This idea is particularly useful when {u₁, …, uₙ} is an orthogonal basis. Recall from Section 6.1 that W⊥ denotes the set of all vectors orthogonal to a subspace W.
EXAMPLE 1 Let {u₁, …, u₅} be an orthogonal basis for ℝ⁵ and let

\[ y = c_1 u_1 + \cdots + c_5 u_5 \]

Consider the subspace W = Span{u₁, u₂}, and write y as the sum of a vector z₁ in W and a vector z₂ in W⊥.

SOLUTION Write

\[ y = \underbrace{c_1 u_1 + c_2 u_2}_{z_1} + \underbrace{c_3 u_3 + c_4 u_4 + c_5 u_5}_{z_2} \]

where z₁ = c₁u₁ + c₂u₂ is in Span{u₁, u₂} and z₂ = c₃u₃ + c₄u₄ + c₅u₅ is in Span{u₃, u₄, u₅}.

To show that z₂ is in W⊥, it suffices to show that z₂ is orthogonal to the vectors in the basis {u₁, u₂} for W. (See Section 6.1.) Using properties of the inner product, compute

\[ z_2\cdot u_1 = (c_3 u_3 + c_4 u_4 + c_5 u_5)\cdot u_1 = c_3\, u_3\cdot u_1 + c_4\, u_4\cdot u_1 + c_5\, u_5\cdot u_1 = 0 \]

because u₁ is orthogonal to u₃, u₄, and u₅. A similar calculation shows that z₂·u₂ = 0. Thus z₂ is in W⊥.
The next theorem shows that the decomposition y = z₁ + z₂ in Example 1 can be computed without having an orthogonal basis for ℝⁿ. It is enough to have an orthogonal basis only for W.
THEOREM 8 The Orthogonal Decomposition Theorem
Let W be a subspace of ℝⁿ. Then each y in ℝⁿ can be written uniquely in the form

\[ y = \hat{y} + z \tag{1} \]

where ŷ is in W and z is in W⊥. In fact, if {u₁, …, uₚ} is any orthogonal basis of W, then

\[ \hat{y} = \frac{y\cdot u_1}{u_1\cdot u_1}\, u_1 + \cdots + \frac{y\cdot u_p}{u_p\cdot u_p}\, u_p \tag{2} \]

and z = y − ŷ.
The vector ŷ in (1) is called the orthogonal projection of y onto W and often is written as proj_W y. See Fig. 2. When W is a one-dimensional subspace, the formula for ŷ matches the formula given in Section 6.2.

FIGURE 2 The orthogonal projection of y onto W.
PROOF Let {u₁, …, uₚ} be any orthogonal basis for W, and define ŷ by (2).¹ Then ŷ is in W because ŷ is a linear combination of the basis u₁, …, uₚ. Let z = y − ŷ. Since u₁ is orthogonal to u₂, …, uₚ, it follows from (2) that

\[ z\cdot u_1 = (y - \hat{y})\cdot u_1 = y\cdot u_1 - \left(\frac{y\cdot u_1}{u_1\cdot u_1}\right) u_1\cdot u_1 - 0 - \cdots - 0 = y\cdot u_1 - y\cdot u_1 = 0 \]

Thus z is orthogonal to u₁. Similarly, z is orthogonal to each uⱼ in the basis for W. Hence z is orthogonal to every vector in W. That is, z is in W⊥.

To show that the decomposition in (1) is unique, suppose y can also be written as y = ŷ₁ + z₁, with ŷ₁ in W and z₁ in W⊥. Then ŷ + z = ŷ₁ + z₁ (since both sides equal y), and so

\[ \hat{y} - \hat{y}_1 = z_1 - z \]

This equality shows that the vector v = ŷ − ŷ₁ is in W and in W⊥ (because z₁ and z are both in W⊥, and W⊥ is a subspace). Hence v·v = 0, which shows that v = 0. This proves that ŷ = ŷ₁ and also z₁ = z.

The uniqueness of the decomposition (1) shows that the orthogonal projection ŷ depends only on W and not on the particular basis used in (2).
¹ We may assume that W is not the zero subspace, for otherwise W⊥ = ℝⁿ and (1) is simply y = 0 + y. The next section will show that any nonzero subspace of ℝⁿ has an orthogonal basis.
6.3 Orthogonal Projections
EXAMPLE 2 Let u₁ = (2, 5, −1), u₂ = (−2, 1, 1), and y = (1, 2, 3). Observe that {u₁, u₂} is an orthogonal basis for W = Span{u₁, u₂}. Write y as the sum of a vector in W and a vector orthogonal to W.

SOLUTION The orthogonal projection of y onto W is

\[ \hat{y} = \frac{y\cdot u_1}{u_1\cdot u_1}\, u_1 + \frac{y\cdot u_2}{u_2\cdot u_2}\, u_2 = \frac{9}{30} \begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix} + \frac{3}{6} \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} = \frac{9}{30} \begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix} + \frac{15}{30} \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -2/5 \\ 2 \\ 1/5 \end{bmatrix} \]
Also

\[ y - \hat{y} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} - \begin{bmatrix} -2/5 \\ 2 \\ 1/5 \end{bmatrix} = \begin{bmatrix} 7/5 \\ 0 \\ 14/5 \end{bmatrix} \]
Theorem 8 ensures that y − ŷ is in W⊥. To check the calculations, however, it is a good idea to verify that y − ŷ is orthogonal to both u₁ and u₂ and hence to all of W. The desired decomposition of y is

\[ y = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} -2/5 \\ 2 \\ 1/5 \end{bmatrix} + \begin{bmatrix} 7/5 \\ 0 \\ 14/5 \end{bmatrix} \]
A Geometric Interpretation of the Orthogonal Projection

When W is a one-dimensional subspace, the formula (2) for proj_W y contains just one term. Thus, when dim W > 1, each term in (2) is itself an orthogonal projection of y onto a one-dimensional subspace spanned by one of the u's in the basis for W. Figure 3 illustrates this when W is a subspace of ℝ³ spanned by u₁ and u₂. Here ŷ₁ and ŷ₂ denote the projections of y onto the lines spanned by u₁ and u₂, respectively. The orthogonal projection ŷ of y onto W is the sum of the projections of y onto one-dimensional subspaces that are orthogonal to each other. The vector ŷ in Fig. 3 corresponds to the vector y in Fig. 4 of Section 6.2, because now it is ŷ that is in W.

FIGURE 3 The orthogonal projection of y is the sum of its projections onto one-dimensional subspaces that are mutually orthogonal.
Properties of Orthogonal Projections

If {u₁, …, uₚ} is an orthogonal basis for W and if y happens to be in W, then the formula for proj_W y is exactly the same as the representation of y given in Theorem 5 in Section 6.2. In this case, proj_W y = y.

If y is in W = Span{u₁, …, uₚ}, then proj_W y = y.

This fact also follows from the next theorem.
THEOREM 9 The Best Approximation Theorem
Let W be a subspace of ℝⁿ, let y be any vector in ℝⁿ, and let ŷ be the orthogonal projection of y onto W. Then ŷ is the closest point in W to y, in the sense that

\[ \|y - \hat{y}\| < \|y - v\| \tag{3} \]

for all v in W distinct from ŷ.

The vector ŷ in Theorem 9 is called the best approximation to y by elements of W. Later sections in the text will examine problems where a given y must be replaced, or approximated, by a vector v in some fixed subspace W. The distance from y to v, given by ‖y − v‖, can be regarded as the "error" of using v in place of y. Theorem 9 says that this error is minimized when v = ŷ.

Inequality (3) leads to a new proof that ŷ does not depend on the particular orthogonal basis used to compute it. If a different orthogonal basis for W were used to construct an orthogonal projection of y, then this projection would also be the closest point in W to y, namely, ŷ.
PROOF Take v in W distinct from ŷ. See Fig. 4. Then ŷ − v is in W. By the Orthogonal Decomposition Theorem, y − ŷ is orthogonal to W. In particular, y − ŷ is orthogonal to ŷ − v (which is in W). Since

\[ y - v = (y - \hat{y}) + (\hat{y} - v) \]

the Pythagorean Theorem gives

\[ \|y - v\|^2 = \|y - \hat{y}\|^2 + \|\hat{y} - v\|^2 \]

(See the colored right triangle in Fig. 4. The length of each side is labeled.) Now ‖ŷ − v‖² > 0 because ŷ − v ≠ 0, and so inequality (3) follows immediately.

FIGURE 4 The orthogonal projection of y onto W is the closest point in W to y.
6.3 Orthogonal Projections 351
EXAMPLE 3 If u₁ = (2, 5, −1), u₂ = (−2, 1, 1), y = (1, 2, 3), and W = Span{u₁, u₂}, as in Example 2, then the closest point in W to y is

\[ \hat{y} = \frac{y\cdot u_1}{u_1\cdot u_1}\, u_1 + \frac{y\cdot u_2}{u_2\cdot u_2}\, u_2 = \begin{bmatrix} -2/5 \\ 2 \\ 1/5 \end{bmatrix} \]
EXAMPLE 4 The distance from a point $\mathbf{y}$ in $\mathbb{R}^n$ to a subspace $W$ is defined as the distance from $\mathbf{y}$ to the nearest point in $W$. Find the distance from $\mathbf{y}$ to $W = \mathrm{Span}\{\mathbf{u}_1, \mathbf{u}_2\}$, where
$$\mathbf{y} = \begin{bmatrix} -1 \\ -5 \\ 10 \end{bmatrix}, \quad \mathbf{u}_1 = \begin{bmatrix} 5 \\ -2 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}$$

SOLUTION By the Best Approximation Theorem, the distance from $\mathbf{y}$ to $W$ is $\|\mathbf{y} - \hat{\mathbf{y}}\|$, where $\hat{\mathbf{y}} = \mathrm{proj}_W \mathbf{y}$. Since $\{\mathbf{u}_1, \mathbf{u}_2\}$ is an orthogonal basis for $W$,
$$\hat{\mathbf{y}} = \frac{15}{30}\mathbf{u}_1 + \frac{-21}{6}\mathbf{u}_2 = \frac{1}{2}\begin{bmatrix} 5 \\ -2 \\ 1 \end{bmatrix} - \frac{7}{2}\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} -1 \\ -8 \\ 4 \end{bmatrix}$$
$$\mathbf{y} - \hat{\mathbf{y}} = \begin{bmatrix} -1 \\ -5 \\ 10 \end{bmatrix} - \begin{bmatrix} -1 \\ -8 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \\ 6 \end{bmatrix}$$
$$\|\mathbf{y} - \hat{\mathbf{y}}\|^2 = 3^2 + 6^2 = 45$$
The distance from $\mathbf{y}$ to $W$ is $\sqrt{45} = 3\sqrt{5}$.
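The distance computation in Example 4 is easy to confirm with NumPy (a minimal sketch of the same projection formula):

```python
import numpy as np

y = np.array([-1.0, -5.0, 10.0])
u1 = np.array([5.0, -2.0, 1.0])
u2 = np.array([1.0, 2.0, -1.0])

# proj_W y for the orthogonal basis {u1, u2}
y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2

dist = np.linalg.norm(y - y_hat)
print(y_hat)   # [-1. -8.  4.]
print(dist)    # 6.7082... (= 3*sqrt(5))
```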
The final theorem in this section shows how formula (2) for $\mathrm{proj}_W \mathbf{y}$ is simplified when the basis for $W$ is an orthonormal set.

THEOREM 10

If $\{\mathbf{u}_1, \dots, \mathbf{u}_p\}$ is an orthonormal basis for a subspace $W$ of $\mathbb{R}^n$, then
$$\mathrm{proj}_W \mathbf{y} = (\mathbf{y}\cdot\mathbf{u}_1)\mathbf{u}_1 + (\mathbf{y}\cdot\mathbf{u}_2)\mathbf{u}_2 + \cdots + (\mathbf{y}\cdot\mathbf{u}_p)\mathbf{u}_p \qquad (4)$$
If $U = [\,\mathbf{u}_1\ \mathbf{u}_2\ \cdots\ \mathbf{u}_p\,]$, then
$$\mathrm{proj}_W \mathbf{y} = UU^T\mathbf{y} \quad \text{for all } \mathbf{y} \text{ in } \mathbb{R}^n \qquad (5)$$

PROOF Formula (4) follows immediately from (2) in Theorem 8. Also, (4) shows that $\mathrm{proj}_W \mathbf{y}$ is a linear combination of the columns of $U$ using the weights $\mathbf{y}\cdot\mathbf{u}_1, \mathbf{y}\cdot\mathbf{u}_2, \dots, \mathbf{y}\cdot\mathbf{u}_p$. The weights can be written as $\mathbf{u}_1^T\mathbf{y}, \mathbf{u}_2^T\mathbf{y}, \dots, \mathbf{u}_p^T\mathbf{y}$, showing that they are the entries in $U^T\mathbf{y}$ and justifying (5).
Suppose $U$ is an $n \times p$ matrix with orthonormal columns, and let $W$ be the column space of $U$. Then
$$U^TU\mathbf{x} = I_p\mathbf{x} = \mathbf{x} \quad \text{for all } \mathbf{x} \text{ in } \mathbb{R}^p \qquad \text{(Theorem 6)}$$
$$UU^T\mathbf{y} = \mathrm{proj}_W \mathbf{y} \quad \text{for all } \mathbf{y} \text{ in } \mathbb{R}^n \qquad \text{(Theorem 10)}$$
If $U$ is an $n \times n$ (square) matrix with orthonormal columns, then $U$ is an orthogonal matrix, the column space $W$ is all of $\mathbb{R}^n$, and $UU^T\mathbf{y} = I\mathbf{y} = \mathbf{y}$ for all $\mathbf{y}$ in $\mathbb{R}^n$.
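Both displayed identities can be illustrated in a few lines of NumPy. A sketch using the orthonormal vectors of Exercise 17 below:

```python
import numpy as np

# Orthonormal vectors (the u1, u2 of Exercise 17 in this section)
u1 = np.array([2/3, 1/3, 2/3])
u2 = np.array([-2/3, 2/3, 1/3])
U = np.column_stack([u1, u2])   # 3 x 2 matrix with orthonormal columns

y = np.array([4.0, 8.0, 1.0])

# U^T U = I_p  (Theorem 6)
assert np.allclose(U.T @ U, np.eye(2))

# UU^T y equals the projection formula (4)  (Theorem 10)
proj = U @ (U.T @ y)
proj_by_formula = (y @ u1) * u1 + (y @ u2) * u2
assert np.allclose(proj, proj_by_formula)
```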
In Exercises 3–6, verify that $\{\mathbf{u}_1, \mathbf{u}_2\}$ is an orthogonal set, and then find the orthogonal projection of $\mathbf{y}$ onto $\mathrm{Span}\{\mathbf{u}_1, \mathbf{u}_2\}$.

3. $\mathbf{y} = \begin{bmatrix} -1 \\ 4 \\ 3 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$

4. $\mathbf{y} = \begin{bmatrix} 6 \\ 3 \\ -2 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} -4 \\ 3 \\ 0 \end{bmatrix}$

5. $\mathbf{y} = \begin{bmatrix} -1 \\ 2 \\ 6 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} 1 \\ -1 \\ -2 \end{bmatrix}$

6. $\mathbf{y} = \begin{bmatrix} 6 \\ 4 \\ 1 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} -4 \\ -1 \\ 1 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$
In Exercises 7–10, let $W$ be the subspace spanned by the $\mathbf{u}$'s, and write $\mathbf{y}$ as the sum of a vector in $W$ and a vector orthogonal to $W$.

7. $\mathbf{y} = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} 5 \\ 1 \\ 4 \end{bmatrix}$

8. $\mathbf{y} = \begin{bmatrix} -1 \\ 4 \\ 3 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} -1 \\ 3 \\ -2 \end{bmatrix}$

9. $\mathbf{y} = \begin{bmatrix} 4 \\ 3 \\ 3 \\ -1 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} -1 \\ 3 \\ 1 \\ -2 \end{bmatrix}$, $\mathbf{u}_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \\ 1 \end{bmatrix}$

10. $\mathbf{y} = \begin{bmatrix} 3 \\ 4 \\ 5 \\ 6 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ -1 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}$, $\mathbf{u}_3 = \begin{bmatrix} 0 \\ -1 \\ 1 \\ -1 \end{bmatrix}$
In Exercises 11 and 12, find the closest point to $\mathbf{y}$ in the subspace $W$ spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$.

11. $\mathbf{y} = \begin{bmatrix} 3 \\ 1 \\ 5 \\ 1 \end{bmatrix}$, $\mathbf{v}_1 = \begin{bmatrix} 3 \\ 1 \\ -1 \\ 1 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \end{bmatrix}$

12. $\mathbf{y} = \begin{bmatrix} 3 \\ -1 \\ 1 \\ 13 \end{bmatrix}$, $\mathbf{v}_1 = \begin{bmatrix} 1 \\ -2 \\ -1 \\ 2 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} -4 \\ 1 \\ 0 \\ 3 \end{bmatrix}$
In Exercises 13 and 14, find the best approximation to $\mathbf{z}$ by vectors of the form $c_1\mathbf{v}_1 + c_2\mathbf{v}_2$.

13. $\mathbf{z} = \begin{bmatrix} 3 \\ -7 \\ 2 \\ 3 \end{bmatrix}$, $\mathbf{v}_1 = \begin{bmatrix} 2 \\ -1 \\ -3 \\ 1 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ -1 \end{bmatrix}$

14. $\mathbf{z} = \begin{bmatrix} 2 \\ 4 \\ 0 \\ -1 \end{bmatrix}$, $\mathbf{v}_1 = \begin{bmatrix} 2 \\ 0 \\ -1 \\ -3 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} 5 \\ -2 \\ 4 \\ 2 \end{bmatrix}$
15. Let $\mathbf{y} = \begin{bmatrix} 5 \\ -9 \\ 5 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} -3 \\ -5 \\ 1 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} -3 \\ 2 \\ 1 \end{bmatrix}$. Find the distance from $\mathbf{y}$ to the plane in $\mathbb{R}^3$ spanned by $\mathbf{u}_1$ and $\mathbf{u}_2$.

16. Let $\mathbf{y}$, $\mathbf{v}_1$, and $\mathbf{v}_2$ be as in Exercise 12. Find the distance from $\mathbf{y}$ to the subspace of $\mathbb{R}^4$ spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$.
17. Let $\mathbf{y} = \begin{bmatrix} 4 \\ 8 \\ 1 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 2/3 \\ 1/3 \\ 2/3 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} -2/3 \\ 2/3 \\ 1/3 \end{bmatrix}$, and $W = \mathrm{Span}\{\mathbf{u}_1, \mathbf{u}_2\}$.
a. Let $U = [\,\mathbf{u}_1\ \mathbf{u}_2\,]$. Compute $U^TU$ and $UU^T$.
b. Compute $\mathrm{proj}_W \mathbf{y}$ and $(UU^T)\mathbf{y}$.

18. Let $\mathbf{y} = \begin{bmatrix} 7 \\ 9 \end{bmatrix}$, $\mathbf{u}_1 = \begin{bmatrix} 1/\sqrt{10} \\ -3/\sqrt{10} \end{bmatrix}$, and $W = \mathrm{Span}\{\mathbf{u}_1\}$.
a. Let $U$ be the $2 \times 1$ matrix whose only column is $\mathbf{u}_1$. Compute $U^TU$ and $UU^T$.
b. Compute $\mathrm{proj}_W \mathbf{y}$ and $(UU^T)\mathbf{y}$.
19. Let $\mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}$, $\mathbf{u}_2 = \begin{bmatrix} 5 \\ -1 \\ 2 \end{bmatrix}$, and $\mathbf{u}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$. Note that $\mathbf{u}_1$ and $\mathbf{u}_2$ are orthogonal but that $\mathbf{u}_3$ is not orthogonal to $\mathbf{u}_1$ or $\mathbf{u}_2$. It can be shown that $\mathbf{u}_3$ is not in the subspace $W$ spanned by $\mathbf{u}_1$ and $\mathbf{u}_2$. Use this fact to construct a nonzero vector $\mathbf{v}$ in $\mathbb{R}^3$ that is orthogonal to $\mathbf{u}_1$ and $\mathbf{u}_2$.

20. Let $\mathbf{u}_1$ and $\mathbf{u}_2$ be as in Exercise 19, and let $\mathbf{u}_4 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$. It can be shown that $\mathbf{u}_4$ is not in the subspace $W$ spanned by $\mathbf{u}_1$ and $\mathbf{u}_2$. Use this fact to construct a nonzero vector $\mathbf{v}$ in $\mathbb{R}^3$ that is orthogonal to $\mathbf{u}_1$ and $\mathbf{u}_2$.
In Exercises 21 and 22, all vectors and subspaces are in $\mathbb{R}^n$. Mark each statement True or False. Justify each answer.

21. a. If $\mathbf{z}$ is orthogonal to $\mathbf{u}_1$ and to $\mathbf{u}_2$ and if $W = \mathrm{Span}\{\mathbf{u}_1, \mathbf{u}_2\}$, then $\mathbf{z}$ must be in $W^\perp$.
b. For each $\mathbf{y}$ and each subspace $W$, the vector $\mathbf{y} - \mathrm{proj}_W\mathbf{y}$ is orthogonal to $W$.
c. The orthogonal projection $\hat{\mathbf{y}}$ of $\mathbf{y}$ onto a subspace $W$ can sometimes depend on the orthogonal basis for $W$ used to compute $\hat{\mathbf{y}}$.
d. If $\mathbf{y}$ is in a subspace $W$, then the orthogonal projection of $\mathbf{y}$ onto $W$ is $\mathbf{y}$ itself.
e. If the columns of an $n \times p$ matrix $U$ are orthonormal, then $UU^T\mathbf{y}$ is the orthogonal projection of $\mathbf{y}$ onto the column space of $U$.

22. a. If $W$ is a subspace of $\mathbb{R}^n$ and if $\mathbf{v}$ is in both $W$ and $W^\perp$, then $\mathbf{v}$ must be the zero vector.
b. In the Orthogonal Decomposition Theorem, each term in formula (2) for $\hat{\mathbf{y}}$ is itself an orthogonal projection of $\mathbf{y}$ onto a subspace of $W$.
c. If $\mathbf{y} = \mathbf{z}_1 + \mathbf{z}_2$, where $\mathbf{z}_1$ is in a subspace $W$ and $\mathbf{z}_2$ is in $W^\perp$, then $\mathbf{z}_1$ must be the orthogonal projection of $\mathbf{y}$ onto $W$.
d. The best approximation to $\mathbf{y}$ by elements of a subspace $W$ is given by the vector $\mathbf{y} - \mathrm{proj}_W\mathbf{y}$.
e. If an $n \times p$ matrix $U$ has orthonormal columns, then $UU^T\mathbf{x} = \mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^n$.

23. Let $A$ be an $m \times n$ matrix. Prove that every vector $\mathbf{x}$ in $\mathbb{R}^n$ can be written in the form $\mathbf{x} = \mathbf{p} + \mathbf{u}$, where $\mathbf{p}$ is in $\mathrm{Row}\,A$ and $\mathbf{u}$ is in $\mathrm{Nul}\,A$. Also, show that if the equation $A\mathbf{x} = \mathbf{b}$ is consistent, then there is a unique $\mathbf{p}$ in $\mathrm{Row}\,A$ such that $A\mathbf{p} = \mathbf{b}$.

24. Let $W$ be a subspace of $\mathbb{R}^n$ with an orthogonal basis

25. [M] Let $U$ be the $8 \times 4$ matrix in Exercise 36 in Section 6.2. Find the closest point to $\mathbf{y} = (1, 1, 1, 1, 1, 1, 1, 1)$ in $\mathrm{Col}\,U$. Write the keystrokes or commands you use to solve this problem.

26. [M] Let $U$ be the matrix in Exercise 25. Find the distance from $\mathbf{b} = (1, 1, 1, 1, -1, -1, -1, -1)$ to $\mathrm{Col}\,U$.
SOLUTION TO PRACTICE PROBLEM

Compute
$$\mathrm{proj}_W \mathbf{y} = \frac{\mathbf{y}\cdot\mathbf{u}_1}{\mathbf{u}_1\cdot\mathbf{u}_1}\mathbf{u}_1 + \frac{\mathbf{y}\cdot\mathbf{u}_2}{\mathbf{u}_2\cdot\mathbf{u}_2}\mathbf{u}_2 = \frac{88}{66}\mathbf{u}_1 + \frac{-2}{6}\mathbf{u}_2$$
$$= \frac{4}{3}\begin{bmatrix} -7 \\ 1 \\ 4 \end{bmatrix} - \frac{1}{3}\begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix} = \begin{bmatrix} -9 \\ 1 \\ 6 \end{bmatrix} = \mathbf{y}$$
In this case, $\mathbf{y}$ happens to be a linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$, so $\mathbf{y}$ is in $W$. The closest point in $W$ to $\mathbf{y}$ is $\mathbf{y}$ itself.
6.4 THE GRAM–SCHMIDT PROCESS

The Gram–Schmidt process is a simple algorithm for producing an orthogonal or orthonormal basis for any nonzero subspace of $\mathbb{R}^n$. The first two examples of the process are aimed at hand calculation.
EXAMPLE 1 Let $W = \mathrm{Span}\{\mathbf{x}_1, \mathbf{x}_2\}$, where $\mathbf{x}_1 = \begin{bmatrix} 3 \\ 6 \\ 0 \end{bmatrix}$ and $\mathbf{x}_2 = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$. Construct an orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2\}$ for $W$.

FIGURE 1 Construction of an orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2\}$.

SOLUTION The subspace $W$ is shown in Fig. 1, along with $\mathbf{x}_1$, $\mathbf{x}_2$, and the projection $\mathbf{p}$ of $\mathbf{x}_2$ onto $\mathbf{x}_1$. The component of $\mathbf{x}_2$ orthogonal to $\mathbf{x}_1$ is $\mathbf{x}_2 - \mathbf{p}$, which is in $W$ because it is formed from $\mathbf{x}_2$ and a multiple of $\mathbf{x}_1$. Let $\mathbf{v}_1 = \mathbf{x}_1$ and
$$\mathbf{v}_2 = \mathbf{x}_2 - \mathbf{p} = \mathbf{x}_2 - \frac{\mathbf{x}_2\cdot\mathbf{x}_1}{\mathbf{x}_1\cdot\mathbf{x}_1}\mathbf{x}_1 = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} - \frac{15}{45}\begin{bmatrix} 3 \\ 6 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix}$$
Then $\{\mathbf{v}_1, \mathbf{v}_2\}$ is an orthogonal set of nonzero vectors in $W$. Since $\dim W = 2$, the set $\{\mathbf{v}_1, \mathbf{v}_2\}$ is a basis for $W$.
EXAMPLE 2 Let $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$, $\mathbf{x}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}$, and $\mathbf{x}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}$. This set is clearly linearly independent and thus is a basis for a subspace $W$ of $\mathbb{R}^4$. Construct an orthogonal basis for $W$.

SOLUTION

Step 1. Let $\mathbf{v}_1 = \mathbf{x}_1$ and $W_1 = \mathrm{Span}\{\mathbf{x}_1\} = \mathrm{Span}\{\mathbf{v}_1\}$.

Step 2. Let $\mathbf{v}_2$ be the vector produced by subtracting from $\mathbf{x}_2$ its projection onto the subspace $W_1$. That is, let
$$\mathbf{v}_2 = \mathbf{x}_2 - \mathrm{proj}_{W_1}\mathbf{x}_2 = \mathbf{x}_2 - \frac{\mathbf{x}_2\cdot\mathbf{v}_1}{\mathbf{v}_1\cdot\mathbf{v}_1}\mathbf{v}_1 \qquad \text{(since } \mathbf{v}_1 = \mathbf{x}_1\text{)}$$
$$= \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} - \frac{3}{4}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -3/4 \\ 1/4 \\ 1/4 \\ 1/4 \end{bmatrix}$$
As in Example 1, $\mathbf{v}_2$ is the component of $\mathbf{x}_2$ orthogonal to $\mathbf{x}_1$, and $\{\mathbf{v}_1, \mathbf{v}_2\}$ is an orthogonal basis for the subspace $W_2$ spanned by $\mathbf{x}_1$ and $\mathbf{x}_2$.

Step 2′ (optional). If appropriate, scale $\mathbf{v}_2$ to simplify later computations. Since $\mathbf{v}_2$ has fractional entries, it is convenient to scale it by a factor of 4 and replace $\{\mathbf{v}_1, \mathbf{v}_2\}$ by the orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2'\}$, where $\mathbf{v}_2' = 4\mathbf{v}_2 = (-3, 1, 1, 1)$.
Then $\mathbf{v}_3$ is the component of $\mathbf{x}_3$ orthogonal to $W_2$, namely,
$$\mathbf{v}_3 = \mathbf{x}_3 - \mathrm{proj}_{W_2}\mathbf{x}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 2/3 \\ 2/3 \\ 2/3 \end{bmatrix} = \begin{bmatrix} 0 \\ -2/3 \\ 1/3 \\ 1/3 \end{bmatrix}$$
See Fig. 2 for a diagram of this construction. Observe that $\mathbf{v}_3$ is in $W$, because $\mathbf{x}_3$ and $\mathrm{proj}_{W_2}\mathbf{x}_3$ are both in $W$. Thus $\{\mathbf{v}_1, \mathbf{v}_2', \mathbf{v}_3\}$ is an orthogonal set of nonzero vectors and hence a linearly independent set in $W$. Note that $W$ is three-dimensional since it was defined by a basis of three vectors. Hence, by the Basis Theorem in Section 4.5, $\{\mathbf{v}_1, \mathbf{v}_2', \mathbf{v}_3\}$ is an orthogonal basis for $W$.
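The construction in Examples 1 and 2 can be sketched as a short function. A minimal Python/NumPy version of classical Gram–Schmidt as performed above (the helper name gram_schmidt is ours; as the Numerical Notes later in this section explain, production code usually uses a different, QR-based algorithm):

```python
import numpy as np

def gram_schmidt(X):
    """Given linearly independent columns x1..xp of X, return a matrix
    whose columns v1..vp form an orthogonal basis for Col X."""
    V = []
    for x in X.T:
        v = x.astype(float)
        for w in V:
            # subtract the projection of x onto each earlier v
            v = v - (x @ w) / (w @ w) * w
        V.append(v)
    return np.column_stack(V)

# The basis of Example 2: x1, x2, x3 as the columns of X
X = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1],
              [1, 1, 1]], dtype=float)
V = gram_schmidt(X)
print(V[:, 1])   # [-0.75  0.25  0.25  0.25], the v2 of Example 2
```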
THEOREM 11 The Gram–Schmidt Process

Given a basis $\{\mathbf{x}_1, \dots, \mathbf{x}_p\}$ for a nonzero subspace $W$ of $\mathbb{R}^n$, define
$$\begin{aligned}
\mathbf{v}_1 &= \mathbf{x}_1 \\
\mathbf{v}_2 &= \mathbf{x}_2 - \frac{\mathbf{x}_2\cdot\mathbf{v}_1}{\mathbf{v}_1\cdot\mathbf{v}_1}\mathbf{v}_1 \\
\mathbf{v}_3 &= \mathbf{x}_3 - \frac{\mathbf{x}_3\cdot\mathbf{v}_1}{\mathbf{v}_1\cdot\mathbf{v}_1}\mathbf{v}_1 - \frac{\mathbf{x}_3\cdot\mathbf{v}_2}{\mathbf{v}_2\cdot\mathbf{v}_2}\mathbf{v}_2 \\
&\ \,\vdots \\
\mathbf{v}_p &= \mathbf{x}_p - \frac{\mathbf{x}_p\cdot\mathbf{v}_1}{\mathbf{v}_1\cdot\mathbf{v}_1}\mathbf{v}_1 - \frac{\mathbf{x}_p\cdot\mathbf{v}_2}{\mathbf{v}_2\cdot\mathbf{v}_2}\mathbf{v}_2 - \cdots - \frac{\mathbf{x}_p\cdot\mathbf{v}_{p-1}}{\mathbf{v}_{p-1}\cdot\mathbf{v}_{p-1}}\mathbf{v}_{p-1}
\end{aligned}$$
Then $\{\mathbf{v}_1, \dots, \mathbf{v}_p\}$ is an orthogonal basis for $W$. In addition,
$$\mathrm{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} = \mathrm{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_k\} \quad \text{for } 1 \le k \le p \qquad (1)$$
PROOF For $1 \le k \le p$, let $W_k = \mathrm{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_k\}$. Set $\mathbf{v}_1 = \mathbf{x}_1$, so that $\mathrm{Span}\{\mathbf{v}_1\} = \mathrm{Span}\{\mathbf{x}_1\}$. Suppose, for some $k < p$, we have constructed $\mathbf{v}_1, \dots, \mathbf{v}_k$ so that $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is an orthogonal basis for $W_k$. Define
THEOREM 12 The QR Factorization

If $A$ is an $m \times n$ matrix with linearly independent columns, then $A$ can be factored as $A = QR$, where $Q$ is an $m \times n$ matrix whose columns form an orthonormal basis for $\mathrm{Col}\,A$ and $R$ is an $n \times n$ upper triangular invertible matrix with positive entries on its diagonal.

PROOF The columns of $A$ form a basis $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ for $\mathrm{Col}\,A$. Construct an orthonormal basis $\{\mathbf{u}_1, \dots, \mathbf{u}_n\}$ for $W = \mathrm{Col}\,A$ with property (1) in Theorem 11. This basis may be constructed by the Gram–Schmidt process or some other means. Let
$$Q = [\,\mathbf{u}_1\ \mathbf{u}_2\ \cdots\ \mathbf{u}_n\,]$$
For $k = 1, \dots, n$, $\mathbf{x}_k$ is in $\mathrm{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_k\} = \mathrm{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$. So there are constants $r_{1k}, \dots, r_{kk}$ such that
$$\mathbf{x}_k = r_{1k}\mathbf{u}_1 + \cdots + r_{kk}\mathbf{u}_k + 0\cdot\mathbf{u}_{k+1} + \cdots + 0\cdot\mathbf{u}_n$$
We may assume that $r_{kk} \ge 0$. (If $r_{kk} < 0$, multiply both $r_{kk}$ and $\mathbf{u}_k$ by $-1$.) This shows that $\mathbf{x}_k$ is a linear combination of the columns of $Q$ using as weights the entries in the vector
$$\mathbf{r}_k = \begin{bmatrix} r_{1k} \\ \vdots \\ r_{kk} \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
That is, $\mathbf{x}_k = Q\mathbf{r}_k$ for $k = 1, \dots, n$. Let $R = [\,\mathbf{r}_1\ \cdots\ \mathbf{r}_n\,]$. Then
$$A = [\,\mathbf{x}_1\ \cdots\ \mathbf{x}_n\,] = [\,Q\mathbf{r}_1\ \cdots\ Q\mathbf{r}_n\,] = QR$$
SOLUTION The columns of $A$ are the vectors $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{x}_3$ in Example 2. An orthogonal basis for $\mathrm{Col}\,A = \mathrm{Span}\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$ was found in that example:
$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{v}_2' = \begin{bmatrix} -3 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} 0 \\ -2/3 \\ 1/3 \\ 1/3 \end{bmatrix}$$
To simplify the arithmetic that follows, scale $\mathbf{v}_3$ by letting $\mathbf{v}_3' = 3\mathbf{v}_3$. Then normalize the three vectors to obtain $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$, and use these vectors as the columns of $Q$:
$$Q = \begin{bmatrix} 1/2 & -3/\sqrt{12} & 0 \\ 1/2 & 1/\sqrt{12} & -2/\sqrt{6} \\ 1/2 & 1/\sqrt{12} & 1/\sqrt{6} \\ 1/2 & 1/\sqrt{12} & 1/\sqrt{6} \end{bmatrix}$$
By construction, the first $k$ columns of $Q$ are an orthonormal basis of $\mathrm{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_k\}$. From the proof of Theorem 12, $A = QR$ for some $R$. To find $R$, observe that $Q^TQ = I$, because the columns of $Q$ are orthonormal. Hence
$$Q^TA = Q^T(QR) = IR = R$$
and
$$R = \begin{bmatrix} 1/2 & 1/2 & 1/2 & 1/2 \\ -3/\sqrt{12} & 1/\sqrt{12} & 1/\sqrt{12} & 1/\sqrt{12} \\ 0 & -2/\sqrt{6} & 1/\sqrt{6} & 1/\sqrt{6} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 3/2 & 1 \\ 0 & 3/\sqrt{12} & 2/\sqrt{12} \\ 0 & 0 & 2/\sqrt{6} \end{bmatrix}$$
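The hand computation of $Q$ and $R$ above can be cross-checked against a library QR factorization. A NumPy sketch (np.linalg.qr may return a factorization whose signs differ from the hand computation; flipping signs so that the diagonal of R is positive recovers the factorization of Theorem 12):

```python
import numpy as np

A = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1],
              [1, 1, 1]], dtype=float)

Q, R = np.linalg.qr(A)   # "reduced" QR: Q is 4x3, R is 3x3

# Flip signs so R has a positive diagonal, as Theorem 12 requires
signs = np.sign(np.diag(R))
Q, R = Q * signs, (R.T * signs).T

assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(3))   # orthonormal columns
print(np.round(R, 4))   # first row is [2, 1.5, 1], matching the hand computation
```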
NUMERICAL NOTES

1. When the Gram–Schmidt process is run on a computer, roundoff error can build up as the vectors $\mathbf{u}_k$ are calculated, one by one. For $j$ and $k$ large but unequal, the inner products $\mathbf{u}_j^T\mathbf{u}_k$ may not be sufficiently close to zero. This loss of orthogonality can be reduced substantially by rearranging the order of the calculations. However, a different computer-based QR factorization is usually preferred to this modified Gram–Schmidt method because it yields a more accurate orthonormal basis, even though the factorization requires about twice as much arithmetic.

2. To produce a QR factorization of a matrix $A$, a computer program usually left-multiplies $A$ by a sequence of orthogonal matrices until $A$ is transformed into an upper triangular matrix. This construction is analogous to the left-multiplication by elementary matrices that produces an LU factorization of $A$.
PRACTICE PROBLEM

Let $W = \mathrm{Span}\{\mathbf{x}_1, \mathbf{x}_2\}$, where $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $\mathbf{x}_2 = \begin{bmatrix} 1/3 \\ 1/3 \\ -2/3 \end{bmatrix}$. Construct an orthonormal basis for $W$.
6.4 EXERCISES

In Exercises 1–6, the given set is a basis for a subspace $W$. Use the Gram–Schmidt process to produce an orthogonal basis for $W$.

In Exercises 13 and 14, the columns of $Q$ were obtained by applying the Gram–Schmidt process to the columns of $A$. Find an upper triangular matrix $R$ such that $A = QR$. Check your work.

b. If $\mathbf{x}$ is not in a subspace $W$, then $\mathbf{x} - \mathrm{proj}_W\mathbf{x}$ is not zero.
c. In a QR factorization, say $A = QR$ (when $A$ has linearly independent columns), the columns of $Q$ form an orthonormal basis for the column space of $A$.
19. Suppose $A = QR$, where $Q$ is $m \times n$ and $R$ is $n \times n$. Show that if the columns of $A$ are linearly independent, then $R$ must be invertible. [Hint: Study the equation $R\mathbf{x} = \mathbf{0}$ and use the fact that $A = QR$.]

20. Suppose $A = QR$, where $R$ is an invertible matrix. Show that $A$ and $Q$ have the same column space. [Hint: Given $\mathbf{y}$ in $\mathrm{Col}\,A$, show that $\mathbf{y} = Q\mathbf{x}$ for some $\mathbf{x}$. Also, given $\mathbf{y}$ in $\mathrm{Col}\,Q$, show that $\mathbf{y} = A\mathbf{x}$ for some $\mathbf{x}$.]

21. Given $A = QR$ as in Theorem 12, describe how to find an orthogonal $m \times m$ (square) matrix $Q_1$ and an invertible $n \times n$ upper triangular matrix $R$ such that
$$A = Q_1\begin{bmatrix} R \\ 0 \end{bmatrix}$$
The MATLAB qr command supplies this "full" QR factorization when $\mathrm{rank}\,A = n$.

22. Let $\mathbf{u}_1, \dots, \mathbf{u}_p$ be an orthogonal basis for a subspace $W$ of $\mathbb{R}^n$, and let $T: \mathbb{R}^n \to \mathbb{R}^n$ be defined by $T(\mathbf{x}) = \mathrm{proj}_W\mathbf{x}$. Show that $T$ is a linear transformation.

23. Suppose $A = QR$ is a QR factorization of an $m \times n$ matrix $A$ (with linearly independent columns). Partition $A$ as $[A_1\ A_2]$, where $A_1$ has $p$ columns. Show how to obtain a QR factorization of $A_1$, and explain why your factorization has the appropriate properties.

24. [M] Use the Gram–Schmidt process as in Example 2 to produce an orthogonal basis for the column space of

26. [M] For a matrix program, the Gram–Schmidt process works better with orthonormal vectors. Starting with $\mathbf{x}_1, \dots, \mathbf{x}_p$ as in Theorem 11, let $A = [\,\mathbf{x}_1\ \cdots\ \mathbf{x}_p\,]$. Suppose $Q$ is an $n \times k$ matrix whose columns form an orthonormal basis for the subspace $W_k$ spanned by the first $k$ columns of $A$. Then for $\mathbf{x}$ in $\mathbb{R}^n$, $QQ^T\mathbf{x}$ is the orthogonal projection of $\mathbf{x}$ onto $W_k$ (Theorem 10 in Section 6.3). If $\mathbf{x}_{k+1}$ is the next column of $A$, then equation (2) in the proof of Theorem 11 becomes
$$\mathbf{v}_{k+1} = \mathbf{x}_{k+1} - Q(Q^T\mathbf{x}_{k+1})$$
(The parentheses above reduce the number of arithmetic operations.) Let $\mathbf{u}_{k+1} = \mathbf{v}_{k+1}/\|\mathbf{v}_{k+1}\|$. The new $Q$ for the next step is $[\,Q\ \mathbf{u}_{k+1}\,]$. Use this procedure to compute the QR factorization of the matrix in Exercise 24. Write the keystrokes or commands you use.
SOLUTION TO PRACTICE PROBLEM

Let $\mathbf{v}_1 = \mathbf{x}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $\mathbf{v}_2 = \mathbf{x}_2 - \dfrac{\mathbf{x}_2\cdot\mathbf{v}_1}{\mathbf{v}_1\cdot\mathbf{v}_1}\mathbf{v}_1 = \mathbf{x}_2 - 0\mathbf{v}_1 = \mathbf{x}_2$. So $\{\mathbf{x}_1, \mathbf{x}_2\}$ is already orthogonal. All that is needed is to normalize the vectors. Let
$$\mathbf{u}_1 = \frac{1}{\|\mathbf{v}_1\|}\mathbf{v}_1 = \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}$$
Instead of normalizing $\mathbf{v}_2$ directly, normalize $\mathbf{v}_2' = 3\mathbf{v}_2$ instead:
$$\mathbf{u}_2 = \frac{1}{\|\mathbf{v}_2'\|}\mathbf{v}_2' = \frac{1}{\sqrt{1^2 + 1^2 + (-2)^2}}\begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{bmatrix}$$
Then $\{\mathbf{u}_1, \mathbf{u}_2\}$ is an orthonormal basis for $W$.
6.5 LEAST-SQUARES PROBLEMS

The chapter's introductory example described a massive problem $A\mathbf{x} = \mathbf{b}$ that had no solution. Inconsistent systems arise often in applications, though usually not with such an enormous coefficient matrix. When a solution is demanded and none exists, the best one can do is to find an $\mathbf{x}$ that makes $A\mathbf{x}$ as close as possible to $\mathbf{b}$.

Think of $A\mathbf{x}$ as an approximation to $\mathbf{b}$. The smaller the distance between $\mathbf{b}$ and $A\mathbf{x}$, given by $\|\mathbf{b} - A\mathbf{x}\|$, the better the approximation. The general least-squares problem is to find an $\mathbf{x}$ that makes $\|\mathbf{b} - A\mathbf{x}\|$ as small as possible. The adjective "least-squares" arises from the fact that $\|\mathbf{b} - A\mathbf{x}\|$ is the square root of a sum of squares.

DEFINITION If $A$ is $m \times n$ and $\mathbf{b}$ is in $\mathbb{R}^m$, a least-squares solution of $A\mathbf{x} = \mathbf{b}$ is an $\hat{\mathbf{x}}$ in $\mathbb{R}^n$ such that
$$\|\mathbf{b} - A\hat{\mathbf{x}}\| \le \|\mathbf{b} - A\mathbf{x}\|$$
for all $\mathbf{x}$ in $\mathbb{R}^n$.

The most important aspect of the least-squares problem is that no matter what $\mathbf{x}$ we select, the vector $A\mathbf{x}$ will necessarily be in the column space, $\mathrm{Col}\,A$. So we seek an $\mathbf{x}$ that makes $A\mathbf{x}$ the closest point in $\mathrm{Col}\,A$ to $\mathbf{b}$. See Fig. 1. (Of course, if $\mathbf{b}$ happens to be in $\mathrm{Col}\,A$, then $\mathbf{b}$ is $A\mathbf{x}$ for some $\mathbf{x}$, and such an $\mathbf{x}$ is a "least-squares solution.")
FIGURE 1 The vector $\mathbf{b}$ is closer to $A\hat{\mathbf{x}}$ than to $A\mathbf{x}$ for other $\mathbf{x}$.
Solution of the General Least-Squares Problem

Given $A$ and $\mathbf{b}$ as above, apply the Best Approximation Theorem in Section 6.3 to the subspace $\mathrm{Col}\,A$. Let
$$\hat{\mathbf{b}} = \mathrm{proj}_{\mathrm{Col}\,A}\,\mathbf{b}$$
Because $\hat{\mathbf{b}}$ is in the column space of $A$, the equation $A\mathbf{x} = \hat{\mathbf{b}}$ is consistent, and there is an $\hat{\mathbf{x}}$ in $\mathbb{R}^n$ such that
$$A\hat{\mathbf{x}} = \hat{\mathbf{b}} \qquad (1)$$
Since $\hat{\mathbf{b}}$ is the closest point in $\mathrm{Col}\,A$ to $\mathbf{b}$, a vector $\hat{\mathbf{x}}$ is a least-squares solution of $A\mathbf{x} = \mathbf{b}$ if and only if $\hat{\mathbf{x}}$ satisfies (1). Such an $\hat{\mathbf{x}}$ in $\mathbb{R}^n$ is a list of weights that will build $\hat{\mathbf{b}}$ out of the columns of $A$. See Fig. 2. [There are many solutions of (1) if the equation has free variables.]
FIGURE 2 The least-squares solution $\hat{\mathbf{x}}$ is in $\mathbb{R}^n$.
Suppose $\hat{\mathbf{x}}$ satisfies $A\hat{\mathbf{x}} = \hat{\mathbf{b}}$. By the Orthogonal Decomposition Theorem in Section 6.3, the projection $\hat{\mathbf{b}}$ has the property that $\mathbf{b} - \hat{\mathbf{b}}$ is orthogonal to $\mathrm{Col}\,A$, so $\mathbf{b} - A\hat{\mathbf{x}}$ is orthogonal to each column of $A$. If $\mathbf{a}_j$ is any column of $A$, then $\mathbf{a}_j\cdot(\mathbf{b} - A\hat{\mathbf{x}}) = 0$, and $\mathbf{a}_j^T(\mathbf{b} - A\hat{\mathbf{x}}) = 0$. Since each $\mathbf{a}_j^T$ is a row of $A^T$,
$$A^T(\mathbf{b} - A\hat{\mathbf{x}}) = \mathbf{0} \qquad (2)$$
These calculations show that each least-squares solution of $A\mathbf{x} = \mathbf{b}$ satisfies the equation
$$A^TA\mathbf{x} = A^T\mathbf{b} \qquad (3)$$
The matrix equation (3) represents a system of equations called the normal equations for $A\mathbf{x} = \mathbf{b}$. A solution of (3) is often denoted by $\hat{\mathbf{x}}$.

THEOREM 13

The set of least-squares solutions of $A\mathbf{x} = \mathbf{b}$ coincides with the nonempty set of solutions of the normal equations $A^TA\mathbf{x} = A^T\mathbf{b}$.

PROOF As shown above, the set of least-squares solutions is nonempty and each least-squares solution $\hat{\mathbf{x}}$ satisfies the normal equations. Conversely, suppose $\hat{\mathbf{x}}$ satisfies $A^TA\hat{\mathbf{x}} = A^T\mathbf{b}$. Then $\hat{\mathbf{x}}$ satisfies (2) above, which shows that $\mathbf{b} - A\hat{\mathbf{x}}$ is orthogonal to the
rows of $A^T$ and hence is orthogonal to the columns of $A$. Since the columns of $A$ span $\mathrm{Col}\,A$, the vector $\mathbf{b} - A\hat{\mathbf{x}}$ is orthogonal to all of $\mathrm{Col}\,A$. Hence the equation
$$\mathbf{b} = A\hat{\mathbf{x}} + (\mathbf{b} - A\hat{\mathbf{x}})$$
is a decomposition of $\mathbf{b}$ into the sum of a vector in $\mathrm{Col}\,A$ and a vector orthogonal to $\mathrm{Col}\,A$. By the uniqueness of the orthogonal decomposition, $A\hat{\mathbf{x}}$ must be the orthogonal projection of $\mathbf{b}$ onto $\mathrm{Col}\,A$. That is, $A\hat{\mathbf{x}} = \hat{\mathbf{b}}$, and $\hat{\mathbf{x}}$ is a least-squares solution.
EXAMPLE 1 Find a least-squares solution of the inconsistent system $A\mathbf{x} = \mathbf{b}$ for
$$A = \begin{bmatrix} 4 & 0 \\ 0 & 2 \\ 1 & 1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 2 \\ 0 \\ 11 \end{bmatrix}$$

SOLUTION To use normal equations (3), compute:
$$A^TA = \begin{bmatrix} 4 & 0 & 1 \\ 0 & 2 & 1 \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 2 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 17 & 1 \\ 1 & 5 \end{bmatrix}$$
$$A^T\mathbf{b} = \begin{bmatrix} 4 & 0 & 1 \\ 0 & 2 & 1 \end{bmatrix}\begin{bmatrix} 2 \\ 0 \\ 11 \end{bmatrix} = \begin{bmatrix} 19 \\ 11 \end{bmatrix}$$
Then the equation $A^TA\mathbf{x} = A^T\mathbf{b}$ becomes
$$\begin{bmatrix} 17 & 1 \\ 1 & 5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 19 \\ 11 \end{bmatrix}$$
Row operations can be used to solve this system, but since $A^TA$ is invertible and $2 \times 2$, it is probably faster to compute
$$(A^TA)^{-1} = \frac{1}{84}\begin{bmatrix} 5 & -1 \\ -1 & 17 \end{bmatrix}$$
and then to solve $A^TA\mathbf{x} = A^T\mathbf{b}$ as
$$\hat{\mathbf{x}} = (A^TA)^{-1}A^T\mathbf{b} = \frac{1}{84}\begin{bmatrix} 5 & -1 \\ -1 & 17 \end{bmatrix}\begin{bmatrix} 19 \\ 11 \end{bmatrix} = \frac{1}{84}\begin{bmatrix} 84 \\ 168 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
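The normal-equation computation in Example 1 takes only a few lines in NumPy (a sketch; np.linalg.lstsq solves the same problem through a more stable factorization):

```python
import numpy as np

A = np.array([[4, 0],
              [0, 2],
              [1, 1]], dtype=float)
b = np.array([2, 0, 11], dtype=float)

# Normal equations: (A^T A) x = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)   # [1. 2.]

# Same answer from the library least-squares routine
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_hat, x_ls)
```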
In many calculations, $A^TA$ is invertible, but this is not always the case. The next example involves a matrix of the sort that appears in what are called analysis of variance problems in statistics.
EXAMPLE 2 Find a least-squares solution of $A\mathbf{x} = \mathbf{b}$ for
$$A = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} -3 \\ -1 \\ 0 \\ 2 \\ 5 \\ 1 \end{bmatrix}$$
SOLUTION Compute
$$A^TA = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 6 & 2 & 2 & 2 \\ 2 & 2 & 0 & 0 \\ 2 & 0 & 2 & 0 \\ 2 & 0 & 0 & 2 \end{bmatrix}$$
$$A^T\mathbf{b} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}\begin{bmatrix} -3 \\ -1 \\ 0 \\ 2 \\ 5 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ -4 \\ 2 \\ 6 \end{bmatrix}$$
The augmented matrix for $A^TA\mathbf{x} = A^T\mathbf{b}$ is
$$\begin{bmatrix} 6 & 2 & 2 & 2 & 4 \\ 2 & 2 & 0 & 0 & -4 \\ 2 & 0 & 2 & 0 & 2 \\ 2 & 0 & 0 & 2 & 6 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & 1 & 3 \\ 0 & 1 & 0 & -1 & -5 \\ 0 & 0 & 1 & -1 & -2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
The general solution is $x_1 = 3 - x_4$, $x_2 = -5 + x_4$, $x_3 = -2 + x_4$, and $x_4$ is free. So the general least-squares solution of $A\mathbf{x} = \mathbf{b}$ has the form
$$\hat{\mathbf{x}} = \begin{bmatrix} 3 \\ -5 \\ -2 \\ 0 \end{bmatrix} + x_4\begin{bmatrix} -1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$
The next theorem gives useful criteria for determining when there is only one least-squares solution of $A\mathbf{x} = \mathbf{b}$. (Of course, the orthogonal projection $\hat{\mathbf{b}}$ is always unique.)

THEOREM 14

Let $A$ be an $m \times n$ matrix. The following statements are logically equivalent:

a. The equation $A\mathbf{x} = \mathbf{b}$ has a unique least-squares solution for each $\mathbf{b}$ in $\mathbb{R}^m$.
b. The columns of $A$ are linearly independent.
c. The matrix $A^TA$ is invertible.

When these statements are true, the least-squares solution $\hat{\mathbf{x}}$ is given by
$$\hat{\mathbf{x}} = (A^TA)^{-1}A^T\mathbf{b} \qquad (4)$$

The main elements of a proof of Theorem 14 are outlined in Exercises 19–21, which also review concepts from Chapter 4. Formula (4) for $\hat{\mathbf{x}}$ is useful mainly for theoretical purposes and for hand calculations when $A^TA$ is a $2 \times 2$ invertible matrix.
When a least-squares solution $\hat{\mathbf{x}}$ is used to produce $A\hat{\mathbf{x}}$ as an approximation to $\mathbf{b}$, the distance from $\mathbf{b}$ to $A\hat{\mathbf{x}}$ is called the least-squares error of this approximation.

EXAMPLE 3 Given $A$ and $\mathbf{b}$ as in Example 1, determine the least-squares error in the least-squares solution of $A\mathbf{x} = \mathbf{b}$.
SOLUTION From Example 1,
$$\mathbf{b} = \begin{bmatrix} 2 \\ 0 \\ 11 \end{bmatrix} \quad \text{and} \quad A\hat{\mathbf{x}} = \begin{bmatrix} 4 & 0 \\ 0 & 2 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \\ 3 \end{bmatrix}$$
Hence
$$\mathbf{b} - A\hat{\mathbf{x}} = \begin{bmatrix} 2 \\ 0 \\ 11 \end{bmatrix} - \begin{bmatrix} 4 \\ 4 \\ 3 \end{bmatrix} = \begin{bmatrix} -2 \\ -4 \\ 8 \end{bmatrix}$$
and
$$\|\mathbf{b} - A\hat{\mathbf{x}}\| = \sqrt{(-2)^2 + (-4)^2 + 8^2} = \sqrt{84}$$
The least-squares error is $\sqrt{84}$. For any $\mathbf{x}$ in $\mathbb{R}^2$, the distance between $\mathbf{b}$ and the vector $A\mathbf{x}$ is at least $\sqrt{84}$. See Fig. 3.

FIGURE 3
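The least-squares error of Example 3 is easy to verify numerically. A NumPy sketch, using $A$, $\mathbf{b}$, and $\hat{\mathbf{x}}$ from Example 1:

```python
import numpy as np

A = np.array([[4, 0],
              [0, 2],
              [1, 1]], dtype=float)
b = np.array([2, 0, 11], dtype=float)
x_hat = np.array([1.0, 2.0])   # least-squares solution from Example 1

residual = b - A @ x_hat
print(residual)                # [-2. -4.  8.]
print(residual @ residual)     # 84.0, so the error is sqrt(84)
```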
Alternative Calculations of Least-Squares Solutions

The next example shows how to find a least-squares solution of $A\mathbf{x} = \mathbf{b}$ when the columns of $A$ are orthogonal. Such matrices often appear in linear regression problems, discussed in the next section.

EXAMPLE 4 Find a least-squares solution of $A\mathbf{x} = \mathbf{b}$ for
$$A = \begin{bmatrix} 1 & -6 \\ 1 & -2 \\ 1 & 1 \\ 1 & 7 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} -1 \\ 2 \\ 1 \\ 6 \end{bmatrix}$$

SOLUTION Because the columns $\mathbf{a}_1$ and $\mathbf{a}_2$ of $A$ are orthogonal, the orthogonal projection of $\mathbf{b}$ onto $\mathrm{Col}\,A$ is given by
$$\hat{\mathbf{b}} = \frac{\mathbf{b}\cdot\mathbf{a}_1}{\mathbf{a}_1\cdot\mathbf{a}_1}\mathbf{a}_1 + \frac{\mathbf{b}\cdot\mathbf{a}_2}{\mathbf{a}_2\cdot\mathbf{a}_2}\mathbf{a}_2 = \frac{8}{4}\mathbf{a}_1 + \frac{45}{90}\mathbf{a}_2 \qquad (5)$$
$$= \begin{bmatrix} 2 \\ 2 \\ 2 \\ 2 \end{bmatrix} + \begin{bmatrix} -3 \\ -1 \\ 1/2 \\ 7/2 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \\ 5/2 \\ 11/2 \end{bmatrix}$$
Now that $\hat{\mathbf{b}}$ is known, we can solve $A\hat{\mathbf{x}} = \hat{\mathbf{b}}$. But this is trivial, since we already know what weights to place on the columns of $A$ to produce $\hat{\mathbf{b}}$. It is clear from (5) that
$$\hat{\mathbf{x}} = \begin{bmatrix} 8/4 \\ 45/90 \end{bmatrix} = \begin{bmatrix} 2 \\ 1/2 \end{bmatrix}$$
In some cases, the normal equations for a least-squares problem can be ill-conditioned; that is, small errors in the calculations of the entries of $A^TA$ can sometimes cause relatively large errors in the solution $\hat{\mathbf{x}}$. If the columns of $A$ are linearly independent, the least-squares solution can often be computed more reliably through a QR factorization of $A$ (described in Section 6.4).

THEOREM 15

Given an $m \times n$ matrix $A$ with linearly independent columns, let $A = QR$ be a QR factorization of $A$ as in Theorem 12. Then, for each $\mathbf{b}$ in $\mathbb{R}^m$, the equation $A\mathbf{x} = \mathbf{b}$ has a unique least-squares solution, given by
$$\hat{\mathbf{x}} = R^{-1}Q^T\mathbf{b} \qquad (6)$$

PROOF Let $\hat{\mathbf{x}} = R^{-1}Q^T\mathbf{b}$. Then
$$A\hat{\mathbf{x}} = QR\hat{\mathbf{x}} = QRR^{-1}Q^T\mathbf{b} = QQ^T\mathbf{b}$$
By Theorem 12, the columns of $Q$ form an orthonormal basis for $\mathrm{Col}\,A$. Hence, by Theorem 10, $QQ^T\mathbf{b}$ is the orthogonal projection $\hat{\mathbf{b}}$ of $\mathbf{b}$ onto $\mathrm{Col}\,A$. Then $A\hat{\mathbf{x}} = \hat{\mathbf{b}}$, which shows that $\hat{\mathbf{x}}$ is a least-squares solution of $A\mathbf{x} = \mathbf{b}$. The uniqueness of $\hat{\mathbf{x}}$ follows from Theorem 14.

NUMERICAL NOTE

Since $R$ in Theorem 15 is upper triangular, $\hat{\mathbf{x}}$ should be calculated as the exact solution of the equation
$$R\mathbf{x} = Q^T\mathbf{b}$$
which can be solved quickly by back-substitution, rather than by computing $R^{-1}$ and using (6).
In Exercises 9–12, find (a) the orthogonal projection of $\mathbf{b}$ onto $\mathrm{Col}\,A$ and (b) a least-squares solution of $A\mathbf{x} = \mathbf{b}$.

9. $A = \begin{bmatrix} 1 & 5 \\ 3 & 1 \\ -2 & 4 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 4 \\ -2 \\ -3 \end{bmatrix}$

10. $A = \begin{bmatrix} 1 & 2 \\ -1 & 4 \\ 1 & 2 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 3 \\ -1 \\ 5 \end{bmatrix}$

11. $A = \begin{bmatrix} 4 & 0 & 1 \\ 1 & -5 & 1 \\ 6 & 1 & 0 \\ 1 & -1 & -5 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 9 \\ 0 \\ 0 \\ 0 \end{bmatrix}$

12. $A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \\ -1 & 1 & -1 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 2 \\ 5 \\ 6 \\ 6 \end{bmatrix}$
13. Let $A = \begin{bmatrix} 3 & 4 \\ -2 & 1 \\ 3 & 4 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 11 \\ -9 \\ 5 \end{bmatrix}$, $\mathbf{u} = \begin{bmatrix} 5 \\ -1 \end{bmatrix}$, and $\mathbf{v} = \begin{bmatrix} 5 \\ -2 \end{bmatrix}$. Compute $A\mathbf{u}$ and $A\mathbf{v}$, and compare them with $\mathbf{b}$. Could $\mathbf{u}$ possibly be a least-squares solution of $A\mathbf{x} = \mathbf{b}$? (Answer this without computing a least-squares solution.)

14. Let $A = \begin{bmatrix} 2 & 1 \\ -3 & -4 \\ 3 & 2 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 5 \\ 4 \\ 4 \end{bmatrix}$, $\mathbf{u} = \begin{bmatrix} 4 \\ -5 \end{bmatrix}$, and $\mathbf{v} = \begin{bmatrix} 6 \\ -5 \end{bmatrix}$. Compute $A\mathbf{u}$ and $A\mathbf{v}$, and compare them with $\mathbf{b}$. Is it possible that at least one of $\mathbf{u}$ or $\mathbf{v}$ could be a least-squares solution of $A\mathbf{x} = \mathbf{b}$? (Answer this without computing a least-squares solution.)
In Exercises 15 and 16, use the factorization $A = QR$ to find the least-squares solution of $A\mathbf{x} = \mathbf{b}$.

15. $A = \begin{bmatrix} 2 & 3 \\ 2 & 4 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2/3 & -1/3 \\ 2/3 & 2/3 \\ 1/3 & -2/3 \end{bmatrix}\begin{bmatrix} 3 & 5 \\ 0 & 1 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 7 \\ 3 \\ 1 \end{bmatrix}$

16. $A = \begin{bmatrix} 1 & -1 \\ 1 & 4 \\ 1 & -1 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 1/2 & -1/2 \\ 1/2 & 1/2 \\ 1/2 & -1/2 \\ 1/2 & 1/2 \end{bmatrix}\begin{bmatrix} 2 & 3 \\ 0 & 5 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} -1 \\ 6 \\ 5 \\ 7 \end{bmatrix}$
In Exercises 17 and 18, $A$ is an $m \times n$ matrix and $\mathbf{b}$ is in $\mathbb{R}^m$. Mark each statement True or False. Justify each answer.

17. a. The general least-squares problem is to find an $\mathbf{x}$ that makes $A\mathbf{x}$ as close as possible to $\mathbf{b}$.
b. A least-squares solution of $A\mathbf{x} = \mathbf{b}$ is a vector $\hat{\mathbf{x}}$ that satisfies $A\hat{\mathbf{x}} = \hat{\mathbf{b}}$, where $\hat{\mathbf{b}}$ is the orthogonal projection of $\mathbf{b}$ onto $\mathrm{Col}\,A$.
c. A least-squares solution of $A\mathbf{x} = \mathbf{b}$ is a vector $\hat{\mathbf{x}}$ such that $\|\mathbf{b} - A\mathbf{x}\| \le \|\mathbf{b} - A\hat{\mathbf{x}}\|$ for all $\mathbf{x}$ in $\mathbb{R}^n$.
d. Any solution of $A^TA\mathbf{x} = A^T\mathbf{b}$ is a least-squares solution of $A\mathbf{x} = \mathbf{b}$.
e. If the columns of $A$ are linearly independent, then the equation $A\mathbf{x} = \mathbf{b}$ has exactly one least-squares solution.

18. a. If $\mathbf{b}$ is in the column space of $A$, then every solution of $A\mathbf{x} = \mathbf{b}$ is a least-squares solution.
b. The least-squares solution of $A\mathbf{x} = \mathbf{b}$ is the point in the column space of $A$ closest to $\mathbf{b}$.
c. A least-squares solution of $A\mathbf{x} = \mathbf{b}$ is a list of weights that, when applied to the columns of $A$, produces the orthogonal projection of $\mathbf{b}$ onto $\mathrm{Col}\,A$.
d. If $\hat{\mathbf{x}}$ is a least-squares solution of $A\mathbf{x} = \mathbf{b}$, then $\hat{\mathbf{x}} = (A^TA)^{-1}A^T\mathbf{b}$.
e. The normal equations always provide a reliable method for computing least-squares solutions.
f. If $A$ has a QR factorization, say $A = QR$, then the best way to find the least-squares solution of $A\mathbf{x} = \mathbf{b}$ is to compute $\hat{\mathbf{x}} = R^{-1}Q^T\mathbf{b}$.

19. Let $A$ be an $m \times n$ matrix. Use the steps below to show that a vector $\mathbf{x}$ in $\mathbb{R}^n$ satisfies $A\mathbf{x} = \mathbf{0}$ if and only if $A^TA\mathbf{x} = \mathbf{0}$. This will show that $\mathrm{Nul}\,A = \mathrm{Nul}\,A^TA$.
a. Show that if $A\mathbf{x} = \mathbf{0}$, then $A^TA\mathbf{x} = \mathbf{0}$.
b. Suppose $A^TA\mathbf{x} = \mathbf{0}$. Explain why $\mathbf{x}^TA^TA\mathbf{x} = 0$, and use this to show that $A\mathbf{x} = \mathbf{0}$.

20. Let $A$ be an $m \times n$ matrix such that $A^TA$ is invertible. Show that the columns of $A$ are linearly independent. [Careful: You may not assume that $A$ is invertible; it may not even be square.]

21. Let $A$ be an $m \times n$ matrix whose columns are linearly independent. [Careful: $A$ need not be square.]
a. Use Exercise 19 to show that $A^TA$ is an invertible matrix.
b. Explain why $A$ must have at least as many rows as columns.
c. Determine the rank of $A$.

22. Use Exercise 19 to show that $\mathrm{rank}\,A^TA = \mathrm{rank}\,A$. [Hint: How many columns does $A^TA$ have? How is this connected with the rank of $A^TA$?]

23. Suppose $A$ is $m \times n$ with linearly independent columns and $\mathbf{b}$ is in $\mathbb{R}^m$. Use the normal equations to produce a formula for $\hat{\mathbf{b}}$, the projection of $\mathbf{b}$ onto $\mathrm{Col}\,A$. [Hint: Find $\hat{\mathbf{x}}$ first. The formula does not require an orthogonal basis for $\mathrm{Col}\,A$.]

24. Find a formula for the least-squares solution of $A\mathbf{x} = \mathbf{b}$ when the columns of $A$ are orthonormal.
25. Describe all least-squares solutions of the system
$$\begin{aligned} x + y &= 2 \\ x + y &= 4 \end{aligned}$$

26. [M] Example 3 in Section 4.8 displayed a low-pass linear filter that changed a signal $\{y_k\}$ into $\{y_{k+1}\}$ and changed a higher-frequency signal $\{w_k\}$ into the zero signal, where $y_k = \cos(\pi k/4)$ and $w_k = \cos(3\pi k/4)$. The following calculations will design a filter with approximately those properties. The filter equation is
$$a_0 y_{k+2} + a_1 y_{k+1} + a_2 y_k = z_k \quad \text{for all } k \qquad (8)$$
Because the signals are periodic, with period 8, it suffices to study equation (8) for $k = 0, \dots, 7$. The action on the two signals described above translates into two sets of eight equations, with rows corresponding to $k = 0, 1, \dots, 7$ and columns containing the values of $y_{k+2}, y_{k+1}, y_k$ and $w_{k+2}, w_{k+1}, w_k$, respectively:
$$\begin{bmatrix} 0 & .7 & 1 \\ -.7 & 0 & .7 \\ -1 & -.7 & 0 \\ -.7 & -1 & -.7 \\ 0 & -.7 & -1 \\ .7 & 0 & -.7 \\ 1 & .7 & 0 \\ .7 & 1 & .7 \end{bmatrix}\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} .7 \\ 0 \\ -.7 \\ -1 \\ -.7 \\ 0 \\ .7 \\ 1 \end{bmatrix} \qquad \begin{bmatrix} 0 & -.7 & 1 \\ .7 & 0 & -.7 \\ -1 & .7 & 0 \\ .7 & -1 & .7 \\ 0 & .7 & -1 \\ -.7 & 0 & .7 \\ 1 & -.7 & 0 \\ -.7 & 1 & -.7 \end{bmatrix}\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$
Write an equation $A\mathbf{x} = \mathbf{b}$, where $A$ is a $16 \times 3$ matrix formed from the two coefficient matrices above and where $\mathbf{b}$ in $\mathbb{R}^{16}$ is formed from the two right sides of the equations. Find $a_0$, $a_1$, and $a_2$ given by the least-squares solution of $A\mathbf{x} = \mathbf{b}$. (The .7 in the data above was used as an approximation for $\sqrt{2}/2$, to illustrate how a typical computation in an applied problem might proceed. If .707 were used instead, the resulting filter coefficients would agree to at least seven decimal places with $\sqrt{2}/4$, $1/2$, and $\sqrt{2}/4$, the values produced by exact arithmetic calculations.)
SOLUTIONS TO PRACTICE PROBLEMS

1. First, compute
$$A^TA = \begin{bmatrix} 1 & 1 & 1 \\ -3 & 5 & 7 \\ -3 & 1 & 2 \end{bmatrix}\begin{bmatrix} 1 & -3 & -3 \\ 1 & 5 & 1 \\ 1 & 7 & 2 \end{bmatrix} = \begin{bmatrix} 3 & 9 & 0 \\ 9 & 83 & 28 \\ 0 & 28 & 14 \end{bmatrix}$$
$$A^T\mathbf{b} = \begin{bmatrix} 1 & 1 & 1 \\ -3 & 5 & 7 \\ -3 & 1 & 2 \end{bmatrix}\begin{bmatrix} 5 \\ -3 \\ -5 \end{bmatrix} = \begin{bmatrix} -3 \\ -65 \\ -28 \end{bmatrix}$$
Next, row reduce the augmented matrix for the normal equations, $A^TA\mathbf{x} = A^T\mathbf{b}$:
$$\begin{bmatrix} 3 & 9 & 0 & -3 \\ 9 & 83 & 28 & -65 \\ 0 & 28 & 14 & -28 \end{bmatrix} \sim \begin{bmatrix} 1 & 3 & 0 & -1 \\ 0 & 56 & 28 & -56 \\ 0 & 28 & 14 & -28 \end{bmatrix} \sim \cdots \sim \begin{bmatrix} 1 & 0 & -3/2 & 2 \\ 0 & 1 & 1/2 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
The general least-squares solution is $x_1 = 2 + \frac{3}{2}x_3$, $x_2 = -1 - \frac{1}{2}x_3$, with $x_3$ free. For one specific solution, take $x_3 = 0$ (for example), and get
$$\hat{\mathbf{x}} = \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix}$$
To find the least-squares error, compute
$$\hat{\mathbf{b}} = A\hat{\mathbf{x}} = \begin{bmatrix} 1 & -3 & -3 \\ 1 & 5 & 1 \\ 1 & 7 & 2 \end{bmatrix}\begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 5 \\ -3 \\ -5 \end{bmatrix}$$
It turns out that $\hat{\mathbf{b}} = \mathbf{b}$, so $\|\mathbf{b} - \hat{\mathbf{b}}\| = 0$. The least-squares error is zero because $\mathbf{b}$ happens to be in $\mathrm{Col}\,A$.

2. If $\mathbf{b}$ is orthogonal to the columns of $A$, then the projection of $\mathbf{b}$ onto the column space of $A$ is $\mathbf{0}$. In this case, a least-squares solution $\hat{\mathbf{x}}$ of $A\mathbf{x} = \mathbf{b}$ satisfies $A\hat{\mathbf{x}} = \mathbf{0}$.
6.6 APPLICATIONS TO LINEAR MODELS

A common task in science and engineering is to analyze and understand relationships among several quantities that vary. This section describes a variety of situations in which data are used to build or verify a formula that predicts the value of one variable as a function of other variables. In each case, the problem will amount to solving a least-squares problem.

For easy application of the discussion to real problems that you may encounter later in your career, we choose notation that is commonly used in the statistical analysis of scientific and engineering data. Instead of $A\mathbf{x} = \mathbf{b}$, we write $X\boldsymbol{\beta} = \mathbf{y}$ and refer to $X$ as the design matrix, $\boldsymbol{\beta}$ as the parameter vector, and $\mathbf{y}$ as the observation vector.

Least-Squares Lines

The simplest relation between two variables $x$ and $y$ is the linear equation $y = \beta_0 + \beta_1 x$.¹ Experimental data often produce points $(x_1, y_1), \dots, (x_n, y_n)$ that, when graphed, seem to lie close to a line. We want to determine the parameters $\beta_0$ and $\beta_1$ that make the line as "close" to the points as possible.

Suppose $\beta_0$ and $\beta_1$ are fixed, and consider the line $y = \beta_0 + \beta_1 x$ in Fig. 1. Corresponding to each data point $(x_j, y_j)$ there is a point $(x_j, \beta_0 + \beta_1 x_j)$ on the line with the same $x$-coordinate. We call $y_j$ the observed value of $y$ and $\beta_0 + \beta_1 x_j$ the predicted $y$-value (determined by the line). The difference between an observed $y$-value and a predicted $y$-value is called a residual.

¹This notation is commonly used for least-squares lines instead of $y = mx + b$.
FIGURE 1 Fitting a line to experimental data.
There are several ways to measure how "close" the line is to the data. The usual choice (primarily because the mathematical calculations are simple) is to add the squares of the residuals. The least-squares line is the line $y = \beta_0 + \beta_1 x$ that minimizes the sum of the squares of the residuals. This line is also called a line of regression of $y$ on $x$, because any errors in the data are assumed to be only in the $y$-coordinates. The coefficients $\beta_0$, $\beta_1$ of the line are called (linear) regression coefficients.²

If the data points were on the line, the parameters $\beta_0$ and $\beta_1$ would satisfy the equations
$$\begin{aligned}
\beta_0 + \beta_1 x_1 &= y_1 \\
\beta_0 + \beta_1 x_2 &= y_2 \\
&\ \,\vdots \\
\beta_0 + \beta_1 x_n &= y_n
\end{aligned}$$
(each left side is a predicted $y$-value and each right side an observed $y$-value).
We can write this system as
$$X\boldsymbol{\beta} = \mathbf{y}, \quad \text{where } X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \qquad (1)$$
Of course, if the data points don't lie on a line, then there are no parameters $\beta_0$, $\beta_1$ for which the predicted $y$-values in $X\boldsymbol{\beta}$ equal the observed $y$-values in $\mathbf{y}$, and $X\boldsymbol{\beta} = \mathbf{y}$ has no solution. This is a least-squares problem, $A\mathbf{x} = \mathbf{b}$, with different notation!

The square of the distance between the vectors $X\boldsymbol{\beta}$ and $\mathbf{y}$ is precisely the sum of the squares of the residuals. The $\boldsymbol{\beta}$ that minimizes this sum also minimizes the distance between $X\boldsymbol{\beta}$ and $\mathbf{y}$. Computing the least-squares solution of $X\boldsymbol{\beta} = \mathbf{y}$ is equivalent to finding the $\boldsymbol{\beta}$ that determines the least-squares line in Fig. 1.

²If the measurement errors are in $x$ instead of $y$, simply interchange the coordinates of the data $(x_j, y_j)$.
EXAMPLE 1 Find the equation $y = \beta_0 + \beta_1 x$ of the least-squares line that best fits the data points $(2, 1)$, $(5, 2)$, $(7, 3)$, and $(8, 3)$.

SOLUTION Use the $x$-coordinates of the data to build the design matrix $X$ in (1) and the $y$-coordinates to build the observation vector $\mathbf{y}$:
$$X = \begin{bmatrix} 1 & 2 \\ 1 & 5 \\ 1 & 7 \\ 1 & 8 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 3 \end{bmatrix}$$
For the least-squares solution of $X\boldsymbol{\beta} = \mathbf{y}$, obtain the normal equations (with the new notation):
$$X^TX\boldsymbol{\beta} = X^T\mathbf{y}$$
That is, compute
$$X^TX = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 5 & 7 & 8 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 1 & 5 \\ 1 & 7 \\ 1 & 8 \end{bmatrix} = \begin{bmatrix} 4 & 22 \\ 22 & 142 \end{bmatrix}$$
$$X^T\mathbf{y} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 5 & 7 & 8 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 9 \\ 57 \end{bmatrix}$$
The normal equations are
$$\begin{bmatrix} 4 & 22 \\ 22 & 142 \end{bmatrix}\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} 9 \\ 57 \end{bmatrix}$$
Hence
$$\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} 4 & 22 \\ 22 & 142 \end{bmatrix}^{-1}\begin{bmatrix} 9 \\ 57 \end{bmatrix} = \frac{1}{84}\begin{bmatrix} 142 & -22 \\ -22 & 4 \end{bmatrix}\begin{bmatrix} 9 \\ 57 \end{bmatrix} = \frac{1}{84}\begin{bmatrix} 24 \\ 30 \end{bmatrix} = \begin{bmatrix} 2/7 \\ 5/14 \end{bmatrix}$$
Thus the least-squares line has the equation
$$y = \frac{2}{7} + \frac{5}{14}x$$
See Fig. 2.
FIGURE 2 The least-squares line $y = \frac{2}{7} + \frac{5}{14}x$.
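The arithmetic of Example 1 can be reproduced with NumPy; a minimal sketch (np.polyfit(x, y, 1) would return the same two coefficients, highest power first):

```python
import numpy as np

x = np.array([2.0, 5.0, 7.0, 8.0])
y = np.array([1.0, 2.0, 3.0, 3.0])

# Design matrix with a column of ones and a column of x-values, as in (1)
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations X^T X beta = X^T y
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)   # approximately [0.2857, 0.3571], i.e. [2/7, 5/14]
```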
A common practice before computing a least-squares line is to compute the average $\bar{x}$ of the original $x$-values and form a new variable $x^* = x - \bar{x}$. The new $x$-data are said to be in mean-deviation form. In this case, the two columns of the design matrix will be orthogonal. Solution of the normal equations is simplified, just as in Example 4 in Section 6.5. See Exercises 17 and 18.
The General Linear ModelInsomeapplications, itisnecessarytofitdatapointswithsomethingotherthanastraightline. Intheexamplesthatfollow, thematrixequationisstill Xˇ D y, butthespecificformof X changesfromoneproblemtothenext. Statisticiansusuallyintroducearesidualvector �, definedby � D y � Xˇ, andwrite
y D Xˇ C �
Anyequationofthisformisreferredtoasa linearmodel. OnceX and y aredetermined,thegoalistominimizethelengthof �, whichamountstofindingaleast-squaressolutionof Xˇ D y. Ineachcase, theleast-squaressolution O isasolutionofthenormalequations
XTXˇ D XTy
Least-Squares Fitting of Other Curves

When data points $(x_1, y_1), \ldots, (x_n, y_n)$ on a scatter plot do not lie close to any line, it may be appropriate to postulate some other functional relationship between $x$ and $y$. The next two examples show how to fit data by curves that have the general form
$$y = \beta_0 f_0(x) + \beta_1 f_1(x) + \cdots + \beta_k f_k(x) \tag{2}$$
where $f_0, \ldots, f_k$ are known functions and $\beta_0, \ldots, \beta_k$ are parameters that must be determined. As we will see, equation (2) describes a linear model because it is linear in the unknown parameters.
Multiple Regression

Suppose an experiment involves two independent variables, say $u$ and $v$, and one dependent variable, $y$. A simple equation for predicting $y$ from $u$ and $v$ has the form
$$y = \beta_0 + \beta_1 u + \beta_2 v \tag{4}$$
Equations (4) and (5) both lead to a linear model because they are linear in the unknown parameters (even though $u$ and $v$ are multiplied). In general, a linear model will arise whenever $y$ is to be predicted by an equation of the form
$$y = \beta_0 f_0(u,v) + \beta_1 f_1(u,v) + \cdots + \beta_k f_k(u,v)$$
with $f_0, \ldots, f_k$ any sort of known functions and $\beta_0, \ldots, \beta_k$ unknown weights.
EXAMPLE 4 In geography, local models of terrain are constructed from data $(u_1, v_1, y_1), \ldots, (u_n, v_n, y_n)$, where $u_j$, $v_j$, and $y_j$ are latitude, longitude, and altitude, respectively. Describe the linear model based on (4) that gives a least-squares fit to such data. The solution is called the least-squares plane. See Fig. 6.

SOLUTION We expect the data to satisfy the equations $y_j = \beta_0 + \beta_1 u_j + \beta_2 v_j + \epsilon_j$ for $j = 1, \ldots, n$. This system has the matrix form $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$, where $\mathbf{y}$ is the observation vector, $X$ the design matrix, $\boldsymbol{\beta}$ the parameter vector, and $\boldsymbol{\epsilon}$ the residual vector:
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \qquad X = \begin{bmatrix} 1 & u_1 & v_1 \\ 1 & u_2 & v_2 \\ \vdots & \vdots & \vdots \\ 1 & u_n & v_n \end{bmatrix}, \qquad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}, \qquad \boldsymbol{\epsilon} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}$$
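A least-squares plane is fit exactly as in the simple-regression case; only the design matrix changes. The sketch below (NumPy assumed) uses small made-up terrain data, not data from the text, and checks that the residual vector is orthogonal to the column space of $X$:

```python
import numpy as np

# Hypothetical terrain data (u_j, v_j, y_j): latitude, longitude, altitude.
u = np.array([0.0, 1.0, 0.0, 1.0, 0.5])
v = np.array([0.0, 0.0, 1.0, 1.0, 0.5])
y = np.array([10.0, 12.0, 11.0, 13.5, 11.5])

# Design matrix with columns 1, u, v, as in Example 4.
X = np.column_stack([np.ones_like(u), u, v])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

resid = y - X @ beta        # residual vector epsilon
print(X.T @ resid)          # ~ zero: epsilon is orthogonal to Col X
```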
Example 4 shows that the linear model for multiple regression has the same abstract form as the model for the simple regression in the earlier examples. Linear algebra gives us the power to understand the general principle behind all the linear models. Once $X$ is defined, the normal equations take the same form as before.

PRACTICE PROBLEM Suppose monthly sales data are fit by a curve of the form
$$y = \beta_0 + \beta_1 x + \beta_2 \sin(2\pi x/12)$$
where $x$ is the time in months. The term $\beta_0 + \beta_1 x$ gives the basic sales trend, and the sine term reflects the seasonal changes in sales. Give the design matrix and the parameter vector for the linear model that leads to a least-squares fit of the equation above. Assume the data are $(x_1, y_1), \ldots, (x_n, y_n)$.
374 CHAPTER 6 Orthogonality and Least Squares
6.6 EXERCISES

In Exercises 1–4, find the equation $y = \beta_0 + \beta_1 x$ of the least-squares line that best fits the given data points.

1. $(0,1)$, $(1,1)$, $(2,2)$, $(3,2)$
2. $(1,0)$, $(2,1)$, $(4,2)$, $(5,3)$
3. $(-1,0)$, $(0,1)$, $(1,2)$, $(2,4)$
4. $(2,3)$, $(3,2)$, $(5,1)$, $(6,0)$
5. Let $X$ be the design matrix used to find the least-squares line to fit data $(x_1, y_1), \ldots, (x_n, y_n)$. Use a theorem in Section 6.5 to show that the normal equations have a unique solution if and only if the data include at least two data points with different x-coordinates.
6. Let $X$ be the design matrix in Example 2 corresponding to a least-squares fit of a parabola to data $(x_1, y_1), \ldots, (x_n, y_n)$. Suppose $x_1$, $x_2$, and $x_3$ are distinct. Explain why there is only one parabola that fits the data best, in a least-squares sense. (See Exercise 5.)
7. A certain experiment produces the data $(1, 1.8)$, $(2, 2.7)$, $(3, 3.4)$, $(4, 3.8)$, $(5, 3.9)$. Describe the model that produces a least-squares fit of these points by a function of the form
$$y = \beta_1 x + \beta_2 x^2$$
Such a function might arise, for example, as the revenue from the sale of $x$ units of a product, when the amount offered for sale affects the price to be set for the product.
a. Give the design matrix, the observation vector, and the unknown parameter vector.
9. A certain experiment produces the data $(1, 7.9)$, $(2, 5.4)$, and $(3, -0.9)$. Describe the model that produces a least-squares fit of these points by a function of the form
$$y = A\cos x + B\sin x$$
10. Suppose radioactive substances A and B have decay constants of .02 and .07, respectively. If a mixture of these two substances at time $t = 0$ contains $M_A$ grams of A and $M_B$ grams of B, then a model for the total amount $y$ of the mixture present at time $t$ is
$$y = M_A e^{-.02t} + M_B e^{-.07t} \tag{6}$$
Suppose the initial amounts $M_A$ and $M_B$ are unknown, but a scientist is able to measure the total amounts present at several times and records the following points $(t_i, y_i)$: $(10, 21.34)$, $(11, 20.68)$, $(12, 20.05)$, $(14, 18.87)$, and $(15, 18.30)$.
a. Describe a linear model that can be used to estimate $M_A$ and $M_B$.
b. [M] Find the least-squares curve based on (6).
13. [M] To measure the takeoff performance of an airplane, the horizontal position of the plane was measured every second, from $t = 0$ to $t = 12$. The positions (in feet) were: 0, 8.8, 29.9, 62.0, 104.7, 159.1, 222.0, 294.5, 380.4, 471.1, 571.7, 686.8, and 809.2.
a. Find the least-squares cubic curve $y = \beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 t^3$ for these data.

Show that the least-squares line for the data $(x_1, y_1), \ldots, (x_n, y_n)$ must pass through $(\bar{x}, \bar{y})$. That is, show that $\bar{x}$ and $\bar{y}$ satisfy the linear equation $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$. [Hint: Derive this equation from the vector equation $\mathbf{y} = X\hat{\boldsymbol{\beta}} + \boldsymbol{\epsilon}$. Denote the first column of $X$ by $\mathbf{1}$. Use the fact that the residual vector $\boldsymbol{\epsilon}$ is orthogonal to the column space of $X$ and hence is orthogonal to $\mathbf{1}$.]
Every statistics text that discusses regression and the linear model $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\epsilon}$ introduces these numbers, though terminology and notation vary somewhat. To simplify matters, assume that the mean of the y-values is zero. In this case, SS(T) is proportional to what is called the variance of the set of y-values.

19. Justify the equation SS(T) = SS(R) + SS(E). [Hint: Use a theorem, and explain why the hypotheses of the theorem are satisfied.] This equation is extremely important in statistics, both in regression theory and in the analysis of variance.
20. Show that $\|X\hat{\boldsymbol{\beta}}\|^2 = \hat{\boldsymbol{\beta}}^T X^T \mathbf{y}$. [Hint: Rewrite the left side and use the fact that $\hat{\boldsymbol{\beta}}$ satisfies the normal equations.] This formula for SS(R) is used in statistics. From this and from Exercise 19, obtain the standard formula for SS(E):
$$\text{SS(E)} = \mathbf{y}^T\mathbf{y} - \hat{\boldsymbol{\beta}}^T X^T \mathbf{y}$$
SOLUTION TO PRACTICE PROBLEM

Construct $X$ and $\boldsymbol{\beta}$ so that the $k$th row of $X\boldsymbol{\beta}$ is the predicted y-value that corresponds to the data point $(x_k, y_k)$, namely,
$$\beta_0 + \beta_1 x_k + \beta_2 \sin(2\pi x_k/12)$$
It should be clear that
$$X = \begin{bmatrix} 1 & x_1 & \sin(2\pi x_1/12) \\ \vdots & \vdots & \vdots \\ 1 & x_n & \sin(2\pi x_n/12) \end{bmatrix}, \qquad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}$$

[Figure: Sales trend with seasonal fluctuations.]
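The design matrix above can be tested on synthetic data. In the sketch below the monthly sales values and the parameter values 100, 2, and 15 are invented for illustration; NumPy's least-squares solver recovers them approximately from noisy observations:

```python
import numpy as np

# Synthetic monthly sales: trend 100 + 2x plus a seasonal sine term
# (all numbers invented for illustration).
rng = np.random.default_rng(0)
x = np.arange(1.0, 37.0)                                  # 36 months
y = 100 + 2.0 * x + 15 * np.sin(2 * np.pi * x / 12)
y = y + rng.normal(scale=1.0, size=x.size)                # measurement noise

# Design matrix with columns 1, x, sin(2 pi x / 12), as in the solution above.
X = np.column_stack([np.ones_like(x), x, np.sin(2 * np.pi * x / 12)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)   # estimates near 100, 2, and 15
```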
6.7 INNER PRODUCT SPACES

Notions of length, distance, and orthogonality are often important in applications involving a vector space. For $\mathbb{R}^n$, these concepts were based on the properties of the inner product listed in Theorem 1 of Section 6.1. For other spaces, we need analogues of the inner product with the same properties. The conclusions of Theorem 1 now become axioms in the following definition.

DEFINITION An inner product on a vector space $V$ is a function that, to each pair of vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$, associates a real number $\langle \mathbf{u}, \mathbf{v} \rangle$ and satisfies the following axioms, for all $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $V$ and all scalars $c$:

1. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$
2. $\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$
3. $\langle c\mathbf{u}, \mathbf{v} \rangle = c\langle \mathbf{u}, \mathbf{v} \rangle$
4. $\langle \mathbf{u}, \mathbf{u} \rangle \ge 0$, and $\langle \mathbf{u}, \mathbf{u} \rangle = 0$ if and only if $\mathbf{u} = \mathbf{0}$

A vector space with an inner product is called an inner product space.

The vector space $\mathbb{R}^n$ with the standard inner product is an inner product space.
From now on, when an inner product space involves polynomials or other functions, we will write the functions in the familiar way, rather than use the boldface type for vectors. Nevertheless, it is important to remember that each function is a vector when it is treated as an element of a vector space.
EXAMPLE 2 Let $t_0, \ldots, t_n$ be distinct real numbers. For $p$ and $q$ in $\mathbb{P}_n$, define
$$\langle p, q \rangle = p(t_0)q(t_0) + p(t_1)q(t_1) + \cdots + p(t_n)q(t_n) \tag{2}$$
Inner product Axioms 1–3 are readily checked. For Axiom 4, note that
$$\langle p, p \rangle = [p(t_0)]^2 + [p(t_1)]^2 + \cdots + [p(t_n)]^2 \ge 0$$
Also, $\langle \mathbf{0}, \mathbf{0} \rangle = 0$. (The boldface zero here denotes the zero polynomial, the zero vector in $\mathbb{P}_n$.) If $\langle p, p \rangle = 0$, then $p$ must vanish at $n+1$ points: $t_0, \ldots, t_n$. This is possible only if $p$ is the zero polynomial, because the degree of $p$ is less than $n+1$. Thus (2) defines an inner product on $\mathbb{P}_n$.

EXAMPLE 3 Let $V$ be $\mathbb{P}_2$, with the inner product from Example 2, where $t_0 = 0$, $t_1 = \frac{1}{2}$, and $t_2 = 1$. Let $p(t) = 12t^2$ and $q(t) = 2t - 1$. Compute $\langle p, q \rangle$ and $\langle q, q \rangle$.
SOLUTION
$$\langle p, q \rangle = p(0)q(0) + p\!\left(\tfrac{1}{2}\right)q\!\left(\tfrac{1}{2}\right) + p(1)q(1) = (0)(-1) + (3)(0) + (12)(1) = 12$$
$$\langle q, q \rangle = [q(0)]^2 + \left[q\!\left(\tfrac{1}{2}\right)\right]^2 + [q(1)]^2 = (-1)^2 + (0)^2 + (1)^2 = 2$$
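The evaluation inner product (2) is straightforward to code. A sketch for the points $t_0 = 0$, $t_1 = 1/2$, $t_2 = 1$ of Example 3 (the helper name `ip` is arbitrary):

```python
# Inner product on P2 given by evaluation at t0 = 0, t1 = 1/2, t2 = 1
# (Examples 2 and 3).
def ip(p, q, pts=(0.0, 0.5, 1.0)):
    """<p, q> = sum of p(t_i) * q(t_i) over the evaluation points."""
    return sum(p(t) * q(t) for t in pts)

p = lambda t: 12 * t**2
q = lambda t: 2 * t - 1

print(ip(p, q))   # 12.0, as computed in Example 3
print(ip(q, q))   # 2.0
print(ip(p, p))   # 153.0 = ||p||^2, as in Example 4 below
```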
Lengths, Distances, and Orthogonality

Let $V$ be an inner product space, with the inner product denoted by $\langle \mathbf{u}, \mathbf{v} \rangle$. Just as in $\mathbb{R}^n$, we define the length, or norm, of a vector $\mathbf{v}$ to be the scalar
$$\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$$
Equivalently, $\|\mathbf{v}\|^2 = \langle \mathbf{v}, \mathbf{v} \rangle$. (This definition makes sense because $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$, but the definition does not say that $\langle \mathbf{v}, \mathbf{v} \rangle$ is a "sum of squares," because $\mathbf{v}$ need not be an element of $\mathbb{R}^n$.)

A unit vector is one whose length is 1. The distance between $\mathbf{u}$ and $\mathbf{v}$ is $\|\mathbf{u} - \mathbf{v}\|$. Vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal if $\langle \mathbf{u}, \mathbf{v} \rangle = 0$.
EXAMPLE 4 Let $\mathbb{P}_2$ have the inner product (2) of Example 3. Compute the lengths of the vectors $p(t) = 12t^2$ and $q(t) = 2t - 1$.

SOLUTION
$$\|p\|^2 = \langle p, p \rangle = [p(0)]^2 + \left[p\!\left(\tfrac{1}{2}\right)\right]^2 + [p(1)]^2 = 0 + [3]^2 + [12]^2 = 153$$
so $\|p\| = \sqrt{153}$. From Example 3, $\langle q, q \rangle = 2$. Hence $\|q\| = \sqrt{2}$.
The Gram–Schmidt Process

The existence of orthogonal bases for finite-dimensional subspaces of an inner product space can be established by the Gram–Schmidt process, just as in $\mathbb{R}^n$. Certain orthogonal bases that arise frequently in applications can be constructed by this process.

The orthogonal projection of a vector onto a subspace $W$ with an orthogonal basis can be constructed as usual. The projection does not depend on the choice of orthogonal basis, and it has the properties described in the Orthogonal Decomposition Theorem and the Best Approximation Theorem.
EXAMPLE 5 Let $V$ be $\mathbb{P}_4$ with the inner product in Example 2, involving evaluation of polynomials at $-2$, $-1$, 0, 1, and 2, and view $\mathbb{P}_2$ as a subspace of $V$. Produce an orthogonal basis for $\mathbb{P}_2$ by applying the Gram–Schmidt process to the polynomials 1, $t$, and $t^2$.

SOLUTION The inner product depends only on the values of a polynomial at $-2, \ldots, 2$, so we list the values of each polynomial as a vector in $\mathbb{R}^5$, underneath the name of the polynomial:

Polynomial: $1$, $t$, $t^2$

Vector of values:
$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} -2 \\ -1 \\ 0 \\ 1 \\ 2 \end{bmatrix}, \qquad \begin{bmatrix} 4 \\ 1 \\ 0 \\ 1 \\ 4 \end{bmatrix}$$
The inner product of two polynomials in $V$ equals the (standard) inner product of their corresponding vectors in $\mathbb{R}^5$. Observe that $t$ is orthogonal to the constant function 1. So take $p_0(t) = 1$ and $p_1(t) = t$. For $p_2$, use the vectors in $\mathbb{R}^5$ to compute the projection of $t^2$ onto $\mathrm{Span}\{p_0, p_1\}$:
$$\langle t^2, p_0 \rangle = \langle t^2, 1 \rangle = 4 + 1 + 0 + 1 + 4 = 10$$
$$\langle p_0, p_0 \rangle = 5$$
$$\langle t^2, p_1 \rangle = \langle t^2, t \rangle = -8 + (-1) + 0 + 1 + 8 = 0$$
The orthogonal projection of $t^2$ onto $\mathrm{Span}\{1, t\}$ is $\frac{10}{5}p_0 + 0p_1$. Thus
$$p_2(t) = t^2 - 2p_0(t) = t^2 - 2$$
An orthogonal basis for the subspace $\mathbb{P}_2$ of $V$ is:

Polynomial: $p_0$, $p_1$, $p_2$

Vector of values:
$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} -2 \\ -1 \\ 0 \\ 1 \\ 2 \end{bmatrix}, \qquad \begin{bmatrix} 2 \\ -1 \\ -2 \\ -1 \\ 2 \end{bmatrix} \tag{3}$$
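Because the inner product in Example 5 matches the standard dot product of value vectors in $\mathbb{R}^5$, the whole computation can be run through an ordinary Gram–Schmidt routine. A sketch (NumPy assumed):

```python
import numpy as np

# Values of the polynomials 1, t, t^2 at t = -2, -1, 0, 1, 2 (Example 5).
ts = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
vectors = [ts**0, ts**1, ts**2]

def gram_schmidt(vs):
    """Orthogonalize a list of vectors under the standard dot product."""
    out = []
    for v in vs:
        w = v.astype(float).copy()
        for u in out:
            # subtract the projection of v onto each earlier orthogonal vector
            w = w - (v @ u) / (u @ u) * u
        out.append(w)
    return out

p0, p1, p2 = gram_schmidt(vectors)
print(p2)   # [ 2. -1. -2. -1.  2.], the values of t^2 - 2
```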
Best Approximation in Inner Product Spaces

A common problem in applied mathematics involves a vector space $V$ whose elements are functions. The problem is to approximate a function $f$ in $V$ by a function $g$ from a specified subspace $W$ of $V$. The "closeness" of the approximation of $f$ depends on the way $\|f - g\|$ is defined. We will consider only the case in which the distance between $f$ and $g$ is determined by an inner product. In this case, the best approximation to $f$ by functions in $W$ is the orthogonal projection of $f$ onto the subspace $W$.

EXAMPLE 6 Let $V$ be $\mathbb{P}_4$ with the inner product in Example 5, and let $p_0$, $p_1$, and $p_2$ be the orthogonal basis found in Example 5 for the subspace $\mathbb{P}_2$. Find the best approximation to $p(t) = 5 - \frac{1}{2}t^4$ by polynomials in $\mathbb{P}_2$.

The polynomials $p_0$, $p_1$, and $p_2$ in Examples 5 and 6 belong to a class of polynomials that are referred to in statistics as orthogonal polynomials.² The orthogonality refers to the type of inner product described in Example 2.
Two Inequalities

Given a vector $\mathbf{v}$ in an inner product space $V$ and given a finite-dimensional subspace $W$, we may apply the Pythagorean Theorem to the orthogonal decomposition of $\mathbf{v}$ with respect to $W$ and obtain
$$\|\mathbf{v}\|^2 = \|\mathrm{proj}_W \mathbf{v}\|^2 + \|\mathbf{v} - \mathrm{proj}_W \mathbf{v}\|^2$$
See Fig. 2. In particular, this shows that the norm of the projection of $\mathbf{v}$ onto $W$ does not exceed the norm of $\mathbf{v}$ itself. This simple observation leads to the following important inequality, the Cauchy–Schwarz inequality: for all $\mathbf{u}$, $\mathbf{v}$ in $V$,
$$|\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\|\,\|\mathbf{v}\| \tag{4}$$

PROOF If $\mathbf{u} = \mathbf{0}$, then both sides of (4) are zero, and hence the inequality is true in this case. (See Practice Problem 1.) If $\mathbf{u} \ne \mathbf{0}$, let $W$ be the subspace spanned by $\mathbf{u}$. Recall that $\|c\mathbf{u}\| = |c|\,\|\mathbf{u}\|$ for any scalar $c$. Thus
$$\|\mathrm{proj}_W \mathbf{v}\| = \left\| \frac{\langle \mathbf{v}, \mathbf{u} \rangle}{\langle \mathbf{u}, \mathbf{u} \rangle}\,\mathbf{u} \right\| = \frac{|\langle \mathbf{v}, \mathbf{u} \rangle|}{|\langle \mathbf{u}, \mathbf{u} \rangle|}\,\|\mathbf{u}\| = \frac{|\langle \mathbf{v}, \mathbf{u} \rangle|}{\|\mathbf{u}\|^2}\,\|\mathbf{u}\| = \frac{|\langle \mathbf{u}, \mathbf{v} \rangle|}{\|\mathbf{u}\|}$$
Since $\|\mathrm{proj}_W \mathbf{v}\| \le \|\mathbf{v}\|$, we have $\dfrac{|\langle \mathbf{u}, \mathbf{v} \rangle|}{\|\mathbf{u}\|} \le \|\mathbf{v}\|$, which gives (4).
The Cauchy–Schwarz inequality is useful in many branches of mathematics. A few simple applications are presented in the exercises. Our main need for this inequality here is to prove another fundamental inequality involving norms of vectors. See Fig. 3.
THEOREM 17 The Triangle Inequality

For all $\mathbf{u}$, $\mathbf{v}$ in $V$,
$$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$$

PROOF
$$\|\mathbf{u} + \mathbf{v}\|^2 = \langle \mathbf{u} + \mathbf{v}, \mathbf{u} + \mathbf{v} \rangle = \langle \mathbf{u}, \mathbf{u} \rangle + 2\langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{v}, \mathbf{v} \rangle$$
$$\le \|\mathbf{u}\|^2 + 2|\langle \mathbf{u}, \mathbf{v} \rangle| + \|\mathbf{v}\|^2 \le \|\mathbf{u}\|^2 + 2\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2 \qquad \text{(Cauchy–Schwarz)}$$
$$= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2$$
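Both inequalities are easy to test numerically for the standard inner product on $\mathbb{R}^n$; a sketch with random vectors (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.normal(size=6)
v = rng.normal(size=6)

# Cauchy-Schwarz: |<u, v>| <= ||u|| ||v||
assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

# Triangle inequality: ||u + v|| <= ||u|| + ||v||
assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12
```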
An Inner Product for $C[a,b]$ (Calculus required)

Probably the most widely used inner product space for applications is the vector space $C[a,b]$ of all continuous functions on an interval $a \le t \le b$, with an inner product that we will describe.

We begin by considering a polynomial $p$ and any integer $n$ larger than or equal to the degree of $p$. Then $p$ is in $\mathbb{P}_n$, and we may compute a "length" for $p$ using the inner product of Example 2 involving evaluation at $n+1$ points in $[a,b]$. However, this length of $p$ captures the behavior at only those $n+1$ points. Since $p$ is in $\mathbb{P}_n$ for all large $n$, we could use a much larger $n$, with many more points for the "evaluation" inner product. See Fig. 4.

Let us partition $[a,b]$ into $n+1$ subintervals of length $\Delta t = (b-a)/(n+1)$, and let $t_0, \ldots, t_n$ be arbitrary points in these subintervals.

If $n$ is large, the inner product on $\mathbb{P}_n$ determined by $t_0, \ldots, t_n$ will tend to give a large value to $\langle p, p \rangle$, so we scale it down and divide by $n+1$. Observe that $1/(n+1) = \Delta t/(b-a)$, and define
$$\langle p, q \rangle = \frac{1}{n+1}\sum_{j=0}^{n} p(t_j)q(t_j) = \frac{1}{b-a}\left[\sum_{j=0}^{n} p(t_j)q(t_j)\,\Delta t\right]$$
Now let $n$ increase without bound. Since polynomials $p$ and $q$ are continuous functions, the expression in brackets is a Riemann sum that approaches a definite integral, and we are led to consider the average value of $p(t)q(t)$ on the interval $[a,b]$:
$$\frac{1}{b-a}\int_a^b p(t)q(t)\,dt$$

EXAMPLE 7 Dropping the constant factor, define, for $f$ and $g$ in $C[a,b]$,
$$\langle f, g \rangle = \int_a^b f(t)g(t)\,dt \tag{5}$$
Inner product Axioms 1–3 follow from elementary properties of the definite integral. For Axiom 4, observe that the function $[f(t)]^2$ is continuous and nonnegative on $[a,b]$. If the definite integral of $[f(t)]^2$ is zero, then $[f(t)]^2$ must be identically zero on $[a,b]$, by a theorem in advanced calculus, in which case $f$ is the zero function. Thus $\langle f, f \rangle = 0$ implies that $f$ is the zero function on $[a,b]$. So (5) defines an inner product on $C[a,b]$.
EXAMPLE 8 Let $V$ be the space $C[0,1]$ with the inner product of Example 7, and let $W$ be the subspace spanned by the polynomials $p_1(t) = 1$, $p_2(t) = 2t - 1$, and $p_3(t) = 12t^2$. Use the Gram–Schmidt process to find an orthogonal basis for $W$.

SOLUTION Let $q_1 = p_1$, and compute
$$\langle p_2, q_1 \rangle = \int_0^1 (2t - 1)(1)\,dt = \left(t^2 - t\right)\Big|_0^1 = 0$$
So $p_2$ is already orthogonal to $q_1$, and we can take $q_2 = p_2$. For the projection of $p_3$ onto $W_2 = \mathrm{Span}\{q_1, q_2\}$, compute
$$\langle p_3, q_1 \rangle = \int_0^1 12t^2 \cdot 1\,dt = 4t^3\Big|_0^1 = 4$$
$$\langle q_1, q_1 \rangle = \int_0^1 1 \cdot 1\,dt = t\Big|_0^1 = 1$$
$$\langle p_3, q_2 \rangle = \int_0^1 12t^2(2t - 1)\,dt = \int_0^1 (24t^3 - 12t^2)\,dt = 2$$
$$\langle q_2, q_2 \rangle = \int_0^1 (2t - 1)^2\,dt = \frac{1}{6}(2t - 1)^3\Big|_0^1 = \frac{1}{3}$$
Then
$$\mathrm{proj}_{W_2} p_3 = \frac{\langle p_3, q_1 \rangle}{\langle q_1, q_1 \rangle}q_1 + \frac{\langle p_3, q_2 \rangle}{\langle q_2, q_2 \rangle}q_2 = \frac{4}{1}q_1 + \frac{2}{1/3}q_2 = 4q_1 + 6q_2$$
and
$$q_3 = p_3 - \mathrm{proj}_{W_2} p_3 = p_3 - 4q_1 - 6q_2$$
As a function, $q_3(t) = 12t^2 - 4 - 6(2t - 1) = 12t^2 - 12t + 2$. The orthogonal basis for the subspace $W$ is $\{q_1, q_2, q_3\}$.
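The integral inner product (5) can be computed exactly from polynomial coefficient arrays, so Example 8 can be checked by machine. A sketch using NumPy's polynomial helpers (coefficients listed from the constant term up):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def ip(p, q):
    """<p, q> = integral over [0, 1] of p(t) q(t) dt, for coefficient arrays."""
    anti = P.polyint(P.polymul(p, q))         # antiderivative of the product
    return P.polyval(1.0, anti) - P.polyval(0.0, anti)

p1 = np.array([1.0])              # 1
p2 = np.array([-1.0, 2.0])        # 2t - 1
p3 = np.array([0.0, 0.0, 12.0])   # 12t^2

q1, q2 = p1, p2                   # p2 is already orthogonal to q1
proj = P.polyadd(ip(p3, q1) / ip(q1, q1) * q1,
                 ip(p3, q2) / ip(q2, q2) * q2)
q3 = P.polysub(p3, proj)
print(q3)   # [  2. -12.  12.], i.e. q3(t) = 12t^2 - 12t + 2
```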
PRACTICE PROBLEMS Use the axioms to verify the following statements.
1. $\langle \mathbf{v}, \mathbf{0} \rangle = \langle \mathbf{0}, \mathbf{v} \rangle = 0$.
2. $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$.
6.7 EXERCISES

1. Let $\mathbb{R}^2$ have the inner product of Example 1, and let $\mathbf{x} = (1,1)$ and $\mathbf{y} = (5,-1)$.
a. Find $\|\mathbf{x}\|$, $\|\mathbf{y}\|$, and $|\langle \mathbf{x}, \mathbf{y} \rangle|^2$.
b. Describe all vectors $(z_1, z_2)$ that are orthogonal to $\mathbf{y}$.
2. Let $\mathbb{R}^2$ have the inner product of Example 1. Show that the Cauchy–Schwarz inequality holds for $\mathbf{x} = (3,-2)$ and $\mathbf{y} = (-2,1)$. [Suggestion: Study $|\langle \mathbf{x}, \mathbf{y} \rangle|^2$.]
3. Compute $\langle p, q \rangle$, where $p(t) = 4 + t$, $q(t) = 5 - 4t^2$.
4. Compute $\langle p, q \rangle$, where $p(t) = 3t - t^2$, $q(t) = 3 + 2t^2$.
5. Compute $\|p\|$ and $\|q\|$, for $p$ and $q$ in Exercise 3.
6. Compute $\|p\|$ and $\|q\|$, for $p$ and $q$ in Exercise 4.
7. Compute the orthogonal projection of $q$ onto the subspace spanned by $p$, for $p$ and $q$ in Exercise 3.
8. Compute the orthogonal projection of $q$ onto the subspace spanned by $p$, for $p$ and $q$ in Exercise 4.
9. Let $\mathbb{P}_3$ have the inner product given by evaluation at $-3$, $-1$, 1, and 3. Let $p_0(t) = 1$, $p_1(t) = t$, and $p_2(t) = t^2$.
a. Compute the orthogonal projection of $p_2$ onto the subspace spanned by $p_0$ and $p_1$.
b. Find a polynomial $q$ that is orthogonal to $p_0$ and $p_1$, such that $\{p_0, p_1, q\}$ is an orthogonal basis for $\mathrm{Span}\{p_0, p_1, p_2\}$. Scale the polynomial $q$ so that its vector of values at $(-3,-1,1,3)$ is $(1,-1,-1,1)$.
10. Let $\mathbb{P}_3$ have the inner product as in Exercise 9, with $p_0$, $p_1$, and $q$ the polynomials described there. Find the best approximation to $p(t) = t^3$ by polynomials in $\mathrm{Span}\{p_0, p_1, q\}$.
11. Let $p_0$, $p_1$, and $p_2$ be the orthogonal polynomials described in Example 5, where the inner product on $\mathbb{P}_4$ is given by evaluation at $-2$, $-1$, 0, 1, and 2. Find the orthogonal projection of $t^3$ onto $\mathrm{Span}\{p_0, p_1, p_2\}$.
12. Find a polynomial $p_3$ such that $\{p_0, p_1, p_2, p_3\}$ (see Exercise 11) is an orthogonal basis for the subspace $\mathbb{P}_3$ of $\mathbb{P}_4$. Scale the polynomial $p_3$ so that its vector of values is $(-1, 2, 0, -2, 1)$.
6.8 Applications of Inner Product Spaces 383
13. Let $A$ be any invertible $n \times n$ matrix. Show that for $\mathbf{u}$, $\mathbf{v}$ in $\mathbb{R}^n$, the formula $\langle \mathbf{u}, \mathbf{v} \rangle = (A\mathbf{u})\cdot(A\mathbf{v}) = (A\mathbf{u})^T(A\mathbf{v})$ defines an inner product on $\mathbb{R}^n$.
14. Let $T$ be a one-to-one linear transformation from a vector space $V$ into $\mathbb{R}^n$. Show that for $\mathbf{u}$, $\mathbf{v}$ in $V$, the formula $\langle \mathbf{u}, \mathbf{v} \rangle = T(\mathbf{u})\cdot T(\mathbf{v})$ defines an inner product on $V$.
Exercises 21–24 refer to $V = C[0,1]$, with the inner product given by an integral, as in Example 7.

21. Compute $\langle f, g \rangle$, where $f(t) = 1 - 3t^2$ and $g(t) = t - t^3$.
22. Compute $\langle f, g \rangle$, where $f(t) = 5t - 3$ and $g(t) = t^3 - t^2$.
23. Compute $\|f\|$ for $f$ in Exercise 21.
24. Compute $\|g\|$ for $g$ in Exercise 22.
25. Let $V$ be the space $C[-1,1]$ with the inner product of Example 7. Find an orthogonal basis for the subspace spanned by the polynomials 1, $t$, and $t^2$. The polynomials in this basis are called Legendre polynomials.
26. Let $V$ be the space $C[-2,2]$ with the inner product of Example 7. Find an orthogonal basis for the subspace spanned by the polynomials 1, $t$, and $t^2$.
27. [M] Let $\mathbb{P}_4$ have the inner product as in Example 5, and let $p_0$, $p_1$, $p_2$ be the orthogonal polynomials from that example. Using your matrix program, apply the Gram–Schmidt process to the set $\{p_0, p_1, p_2, t^3, t^4\}$ to create an orthogonal basis for $\mathbb{P}_4$.
28. [M] Let $V$ be the space $C[0, 2\pi]$ with the inner product of Example 7. Use the Gram–Schmidt process to create an orthogonal basis for the subspace spanned by $\{1, \cos t, \cos^2 t, \cos^3 t\}$. Use a matrix program or computational program to compute the appropriate definite integrals.
SOLUTIONS TO PRACTICE PROBLEMS

1. By Axiom 1, $\langle \mathbf{v}, \mathbf{0} \rangle = \langle \mathbf{0}, \mathbf{v} \rangle$. Then $\langle \mathbf{0}, \mathbf{v} \rangle = \langle 0\mathbf{v}, \mathbf{v} \rangle = 0\langle \mathbf{v}, \mathbf{v} \rangle$, by Axiom 3, so $\langle \mathbf{0}, \mathbf{v} \rangle = 0$.
2. By Axioms 1, 2, and then 1 again, $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{v} + \mathbf{w}, \mathbf{u} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle + \langle \mathbf{w}, \mathbf{u} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$.
Weighted Least-Squares

Let $\mathbf{y}$ be a vector of $n$ observations, $y_1, \ldots, y_n$, and suppose we wish to approximate $\mathbf{y}$ by a vector $\hat{\mathbf{y}}$ that belongs to some specified subspace of $\mathbb{R}^n$. (In Section 6.5, $\hat{\mathbf{y}}$ was written as $A\mathbf{x}$ so that $\hat{\mathbf{y}}$ was in the column space of $A$.) Denote the entries in $\hat{\mathbf{y}}$ by $\hat{y}_1, \ldots, \hat{y}_n$. Then the sum of the squares for error, or SS(E), in approximating $\mathbf{y}$ by $\hat{\mathbf{y}}$ is
$$\text{SS(E)} = (y_1 - \hat{y}_1)^2 + \cdots + (y_n - \hat{y}_n)^2 \tag{1}$$
This is simply $\|\mathbf{y} - \hat{\mathbf{y}}\|^2$, using the standard length in $\mathbb{R}^n$.

Now suppose the measurements that produced the entries in $\mathbf{y}$ are not equally reliable. (This was the case for the North American Datum, since measurements were made over a period of 140 years. As another example, the entries in $\mathbf{y}$ might be computed from various samples of measurements, with unequal sample sizes.) Then it becomes appropriate to weight the squared errors in (1) in such a way that more importance is assigned to the more reliable measurements.¹ If the weights are denoted by $w_1^2, \ldots, w_n^2$, then the weighted sum of the squares for error is
$$\text{Weighted SS(E)} = w_1^2(y_1 - \hat{y}_1)^2 + \cdots + w_n^2(y_n - \hat{y}_n)^2 \tag{2}$$
EXAMPLE 1 Find the least-squares line $y = \beta_0 + \beta_1 x$ that best fits the data $(-2,3)$, $(-1,5)$, $(0,5)$, $(1,4)$, and $(2,3)$. Suppose the errors in measuring the y-values of the last two data points are greater than for the other points. Weight these data half as much as the rest of the data.

¹ Note for readers with a background in statistics: Suppose the errors in measuring the $y_i$ are independent random variables with means equal to zero and variances of $\sigma_1^2, \ldots, \sigma_n^2$. Then the appropriate weights in (2) are $w_i^2 = 1/\sigma_i^2$. The larger the variance of the error, the smaller the weight.
SOLUTION As in Section 6.6, write $X$ for the matrix $A$ and $\boldsymbol{\beta}$ for the vector $\mathbf{x}$, and obtain
$$X = \begin{bmatrix} 1 & -2 \\ 1 & -1 \\ 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}, \qquad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \qquad \mathbf{y} = \begin{bmatrix} 3 \\ 5 \\ 5 \\ 4 \\ 3 \end{bmatrix}$$
For a weighting matrix, choose $W$ with diagonal entries 2, 2, 2, 1, and 1. Left-multiplication by $W$ scales the rows of $X$ and $\mathbf{y}$:
$$WX = \begin{bmatrix} 2 & -4 \\ 2 & -2 \\ 2 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}, \qquad W\mathbf{y} = \begin{bmatrix} 6 \\ 10 \\ 10 \\ 4 \\ 3 \end{bmatrix}$$
For the normal equation, compute
$$(WX)^T WX = \begin{bmatrix} 14 & -9 \\ -9 & 25 \end{bmatrix} \qquad \text{and} \qquad (WX)^T W\mathbf{y} = \begin{bmatrix} 59 \\ -34 \end{bmatrix}$$
and solve
$$\begin{bmatrix} 14 & -9 \\ -9 & 25 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} 59 \\ -34 \end{bmatrix}$$
The solution of the normal equation is (to two significant digits) $\beta_0 = 4.3$ and $\beta_1 = 0.20$. The desired line is
$$y = 4.3 + 0.20x$$
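The weighted computation can be reproduced directly: scale the rows of $X$ and $\mathbf{y}$ by the diagonal of $W$ and solve the ordinary normal equations for $WX$ and $W\mathbf{y}$. A sketch (NumPy assumed):

```python
import numpy as np

# Data and weights from Example 1: the last two points count half as much.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([3.0, 5.0, 5.0, 4.0, 3.0])
w = np.array([2.0, 2.0, 2.0, 1.0, 1.0])    # diagonal of W

X = np.column_stack([np.ones_like(x), x])
WX = w[:, None] * X                        # left-multiplying by W scales the rows
Wy = w * y

beta = np.linalg.solve(WX.T @ WX, WX.T @ Wy)
print(np.round(beta, 2))                   # [4.35  0.2]: y = 4.3 + 0.20x
```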
Trend Analysis of Data

Let $f$ represent an unknown function whose values are known (perhaps only approximately) at $t_0, \ldots, t_n$. If there is a "linear trend" in the data $f(t_0), \ldots, f(t_n)$, then we might expect to approximate the values of $f$ by a function of the form $\beta_0 + \beta_1 t$. If there is a "quadratic trend" to the data, then we would try a function of the form $\beta_0 + \beta_1 t + \beta_2 t^2$. This was discussed in Section 6.6, from a different point of view.

In some statistical problems, it is important to be able to separate the linear trend from the quadratic trend (and possibly cubic or higher-order trends). For instance, suppose engineers are analyzing the performance of a new car, and $f(t)$ represents the distance between the car at time $t$ and some reference point. If the car is traveling at constant velocity, then the graph of $f(t)$ should be a straight line whose slope is the car's velocity. If the gas pedal is suddenly pressed to the floor, the graph of $f(t)$ will change to include a quadratic term and possibly a cubic term (due to the acceleration). To analyze the ability of the car to pass another car, for example, engineers may want to separate the quadratic and cubic components from the linear term.

If the function is approximated by a curve of the form $y = \beta_0 + \beta_1 t + \beta_2 t^2$, the coefficient $\beta_2$ may not give the desired information about the quadratic trend in the data, because it may not be "independent" in a statistical sense from the other $\beta_i$. To make what is known as a trend analysis of the data, we introduce an inner product on the space $\mathbb{P}_n$ analogous to that given in Example 2 in Section 6.7. For $p$, $q$ in $\mathbb{P}_n$, define
$$\langle p, q \rangle = p(t_0)q(t_0) + \cdots + p(t_n)q(t_n)$$
In practice, statisticians seldom need to consider trends in data of degree higher than cubic or quartic. So let $p_0$, $p_1$, $p_2$, $p_3$ denote an orthogonal basis of the subspace $\mathbb{P}_3$ of $\mathbb{P}_n$, obtained by applying the Gram–Schmidt process to the polynomials 1, $t$, $t^2$, and $t^3$. By Supplementary Exercise 11 in Chapter 2, there is a polynomial $g$ in $\mathbb{P}_n$ whose values at $t_0, \ldots, t_n$ coincide with those of the unknown function $f$. Let $\hat{g}$ be the orthogonal projection (with respect to the given inner product) of $g$ onto $\mathbb{P}_3$, say,
$$\hat{g} = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3$$
Then $\hat{g}$ is called a cubic trend function, and $c_0, \ldots, c_3$ are the trend coefficients of the data. The coefficient $c_1$ measures the linear trend, $c_2$ the quadratic trend, and $c_3$ the cubic trend. It turns out that if the data have certain properties, these coefficients are statistically independent.

Since $p_0, \ldots, p_3$ are orthogonal, the trend coefficients may be computed one at a time, independently of one another. (Recall that $c_i = \langle g, p_i \rangle / \langle p_i, p_i \rangle$.) We can ignore $p_3$ and $c_3$ if we want only the quadratic trend. And if, for example, we needed to determine the quartic trend, we would have to find (via Gram–Schmidt) only a polynomial $p_4$ in $\mathbb{P}_4$ that is orthogonal to $\mathbb{P}_3$ and compute $\langle g, p_4 \rangle / \langle p_4, p_4 \rangle$.
Fourier Series (Calculus required)

Continuous functions are often approximated by linear combinations of sine and cosine functions. For instance, a continuous function might represent a sound wave, an electric signal of some type, or the movement of a vibrating mechanical system.

For simplicity, we consider functions on $0 \le t \le 2\pi$. It turns out that any function in $C[0, 2\pi]$ can be approximated as closely as desired by a function of the form
$$\frac{a_0}{2} + a_1\cos t + \cdots + a_n\cos nt + b_1\sin t + \cdots + b_n\sin nt \tag{4}$$
for a sufficiently large value of $n$. The function (4) is called a trigonometric polynomial. If $a_n$ and $b_n$ are not both zero, the polynomial is said to be of order $n$. The connection between trigonometric polynomials and other functions in $C[0, 2\pi]$ depends on the fact that for any $n \ge 1$, the set
$$\{1, \cos t, \cos 2t, \ldots, \cos nt, \sin t, \sin 2t, \ldots, \sin nt\} \tag{5}$$
is orthogonal with respect to the inner product
$$\langle f, g \rangle = \int_0^{2\pi} f(t)g(t)\,dt \tag{6}$$
EXAMPLE 3 Let $C[0, 2\pi]$ have the inner product (6), and let $m$ and $n$ be unequal positive integers. Show that $\cos mt$ and $\cos nt$ are orthogonal.

SOLUTION Use a trigonometric identity. When $m \ne n$,
$$\langle \cos mt, \cos nt \rangle = \int_0^{2\pi} \cos mt\,\cos nt\,dt = \frac{1}{2}\int_0^{2\pi} [\cos(mt + nt) + \cos(mt - nt)]\,dt$$
$$= \frac{1}{2}\left[\frac{\sin(mt + nt)}{m + n} + \frac{\sin(mt - nt)}{m - n}\right]_0^{2\pi} = 0$$
Let $W$ be the subspace of $C[0, 2\pi]$ spanned by the functions in (5). Given $f$ in $C[0, 2\pi]$, the best approximation to $f$ by functions in $W$ is called the $n$th-order Fourier approximation to $f$ on $[0, 2\pi]$. Since the functions in (5) are orthogonal, the best approximation is given by the orthogonal projection onto $W$. In this case, the coefficients $a_k$ and $b_k$ in (4) are called the Fourier coefficients of $f$. The standard formula for an orthogonal projection shows that
$$a_k = \frac{\langle f, \cos kt \rangle}{\langle \cos kt, \cos kt \rangle}, \qquad b_k = \frac{\langle f, \sin kt \rangle}{\langle \sin kt, \sin kt \rangle}, \qquad k \ge 1$$
Exercise 7 asks you to show that $\langle \cos kt, \cos kt \rangle = \pi$ and $\langle \sin kt, \sin kt \rangle = \pi$. Thus
$$a_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\cos kt\,dt, \qquad b_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\sin kt\,dt \tag{7}$$
where $a_0$ is defined by (7) for $k = 0$. This explains why the constant term in (4) is written as $a_0/2$.
EXAMPLE 4 Find the $n$th-order Fourier approximation to the function $f(t) = t$ on the interval $[0, 2\pi]$.

SOLUTION Compute
$$\frac{a_0}{2} = \frac{1}{2}\cdot\frac{1}{\pi}\int_0^{2\pi} t\,dt = \frac{1}{2\pi}\left[\frac{1}{2}t^2\right]_0^{2\pi} = \pi$$
and for $k > 0$, using integration by parts,
$$a_k = \frac{1}{\pi}\int_0^{2\pi} t\cos kt\,dt = \frac{1}{\pi}\left[\frac{1}{k^2}\cos kt + \frac{t}{k}\sin kt\right]_0^{2\pi} = 0$$
$$b_k = \frac{1}{\pi}\int_0^{2\pi} t\sin kt\,dt = \frac{1}{\pi}\left[\frac{1}{k^2}\sin kt - \frac{t}{k}\cos kt\right]_0^{2\pi} = -\frac{2}{k}$$
Thus the $n$th-order Fourier approximation of $f(t) = t$ is
$$\pi - 2\sin t - \sin 2t - \frac{2}{3}\sin 3t - \cdots - \frac{2}{n}\sin nt$$
Figure 3 shows the third- and fourth-order Fourier approximations of $f$.
FIGURE 3 Fourier approximations of the function $f(t) = t$: (a) third order; (b) fourth order.
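The coefficients found in Example 4 can be confirmed by approximating the integrals in (7) numerically; a trapezoidal-rule sketch (NumPy assumed):

```python
import numpy as np

# Fourier coefficients of f(t) = t on [0, 2*pi], via the trapezoidal rule.
N = 100000
t = np.linspace(0.0, 2 * np.pi, N + 1)
dt = 2 * np.pi / N
f = t

def integral(g):
    """Trapezoidal rule for the integral of the sampled function g over [0, 2*pi]."""
    return (np.sum(g) - 0.5 * (g[0] + g[-1])) * dt

a0_over_2 = integral(f) / (2 * np.pi)                          # ~ pi
a = [integral(f * np.cos(k * t)) / np.pi for k in (1, 2, 3)]   # ~ 0
b = [integral(f * np.sin(k * t)) / np.pi for k in (1, 2, 3)]   # ~ -2/k
print(a0_over_2, a, b)
```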
The norm of the difference between $f$ and a Fourier approximation is called the mean square error in the approximation. (The term mean refers to the fact that the norm is determined by an integral.) It can be shown that the mean square error approaches zero as the order of the Fourier approximation increases. For this reason, it is common to write
$$f(t) = \frac{a_0}{2} + \sum_{m=1}^{\infty}(a_m\cos mt + b_m\sin mt)$$
This expression for $f(t)$ is called the Fourier series for $f$ on $[0, 2\pi]$. The term $a_m\cos mt$, for example, is the projection of $f$ onto the one-dimensional subspace spanned by $\cos mt$.
PRACTICE PROBLEMS
1. Let $q_1(t) = 1$, $q_2(t) = t$, and $q_3(t) = 3t^2 - 4$. Verify that $\{q_1, q_2, q_3\}$ is an orthogonal set in $C[-2, 2]$ with the inner product of Example 7 in Section 6.7 (integration from $-2$ to 2).
2. Find the first-order and third-order Fourier approximations to
$$f(t) = 3 - 2\sin t + 5\sin 2t - 6\cos 2t$$
6.8 EXERCISES

1. Find the least-squares line $y = \beta_0 + \beta_1 x$ that best fits the data $(-2,0)$, $(-1,0)$, $(0,2)$, $(1,4)$, and $(2,4)$, assuming that the first and last data points are less reliable. Weight them half as much as the three interior points.
14. Suppose the first few Fourier coefficients of some function $f$ in $C[0, 2\pi]$ are $a_0$, $a_1$, $a_2$, and $b_1$, $b_2$, $b_3$. Which of the following trigonometric polynomials is closer to $f$? Defend your answer.
$$g(t) = \frac{a_0}{2} + a_1\cos t + a_2\cos 2t + b_1\sin t$$
$$h(t) = \frac{a_0}{2} + a_1\cos t + a_2\cos 2t + b_1\sin t + b_2\sin 2t$$
15. [M] Refer to the data in Exercise 13 in Section 6.6, concerning the takeoff performance of an airplane. Suppose the possible measurement errors become greater as the speed of the airplane increases, and let $W$ be the diagonal weighting matrix whose diagonal entries are 1, 1, 1, .9, .9, .8, .7, .6, .5, .4, .3, .2, and .1. Find the cubic curve that fits the data with minimum weighted least-squares error, and use it to estimate the velocity of the plane when $t = 4.5$ seconds.
16. [M] Let $f_4$ and $f_5$ be the fourth-order and fifth-order Fourier approximations in $C[0, 2\pi]$ to the square wave function in Exercise 10. Produce separate graphs of $f_4$ and $f_5$ on the interval $[0, 2\pi]$, and produce a graph of $f_5$ on $[-2\pi, 2\pi]$.
SG The Linearity of an Orthogonal Projection 6–25
SOLUTIONS TO PRACTICE PROBLEMS
1. Compute
$$\langle q_1, q_2 \rangle = \int_{-2}^{2} 1\cdot t\,dt = \frac{1}{2}t^2\Big|_{-2}^{2} = 0$$
$$\langle q_1, q_3 \rangle = \int_{-2}^{2} 1\cdot(3t^2 - 4)\,dt = \left(t^3 - 4t\right)\Big|_{-2}^{2} = 0$$
$$\langle q_2, q_3 \rangle = \int_{-2}^{2} t\cdot(3t^2 - 4)\,dt = \left[\frac{3}{4}t^4 - 2t^2\right]_{-2}^{2} = 0$$
2. The third-order Fourier approximation to $f$ is the best approximation in $C[0, 2\pi]$ to $f$ by functions (vectors) in the subspace spanned by 1, $\cos t$, $\cos 2t$, $\cos 3t$, $\sin t$, $\sin 2t$, and $\sin 3t$. But $f$ is obviously in this subspace, so $f$ is its own best approximation:
$$f(t) = 3 - 2\sin t + 5\sin 2t - 6\cos 2t$$
For the first-order approximation, the closest function to $f$ in the subspace $W = \mathrm{Span}\{1, \cos t, \sin t\}$ is $3 - 2\sin t$. The other two terms in the formula for $f(t)$ are orthogonal to the functions in $W$, so they contribute nothing to the integrals that give the Fourier coefficients for a first-order approximation.

[Figure: First- and third-order approximations to $f(t)$, showing $y = f(t)$ and $y = 3 - 2\sin t$ on $[0, 2\pi]$.]
CHAPTER 6 SUPPLEMENTARY EXERCISES

1. The following statements refer to vectors in $\mathbb{R}^n$ (or $\mathbb{R}^m$) with the standard inner product. Mark each statement True or False. Justify each answer.
a. The length of every vector is a positive number.
b. A vector $\mathbf{v}$ and its negative $-\mathbf{v}$ have equal lengths.
c. The distance between $\mathbf{u}$ and $\mathbf{v}$ is $\|\mathbf{u} - \mathbf{v}\|$.
d. If $r$ is any scalar, then $\|r\mathbf{v}\| = r\|\mathbf{v}\|$.
e. If two vectors are orthogonal, they are linearly independent.
f. If $\mathbf{x}$ is orthogonal to both $\mathbf{u}$ and $\mathbf{v}$, then $\mathbf{x}$ must be orthogonal to $\mathbf{u} - \mathbf{v}$.
g. If $\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$, then $\mathbf{u}$ and $\mathbf{v}$ are orthogonal.
h. If $\|\mathbf{u} - \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$, then $\mathbf{u}$ and $\mathbf{v}$ are orthogonal.
i. The orthogonal projection of $\mathbf{y}$ onto $\mathbf{u}$ is a scalar multiple of $\mathbf{y}$.
j. If a vector $\mathbf{y}$ coincides with its orthogonal projection onto a subspace $W$, then $\mathbf{y}$ is in $W$.
k. The set of all vectors in $\mathbb{R}^n$ orthogonal to one fixed vector is a subspace of $\mathbb{R}^n$.
l. If $W$ is a subspace of $\mathbb{R}^n$, then $W$ and $W^\perp$ have no vectors in common.
m. If $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is an orthogonal set and if $c_1$, $c_2$, and $c_3$ are scalars, then $\{c_1\mathbf{v}_1, c_2\mathbf{v}_2, c_3\mathbf{v}_3\}$ is an orthogonal set.
n. If a matrix $U$ has orthonormal columns, then $UU^T = I$.
o. A square matrix with orthogonal columns is an orthogonal matrix.
Prove the following inequality, called Bessel's inequality, which is true for each $\mathbf{x}$ in $\mathbb{R}^n$:
$$\|\mathbf{x}\|^2 \ge |\mathbf{x}\cdot\mathbf{v}_1|^2 + |\mathbf{x}\cdot\mathbf{v}_2|^2 + \cdots + |\mathbf{x}\cdot\mathbf{v}_p|^2$$
4. Let $U$ be an $n \times n$ orthogonal matrix. Show that if $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is an orthonormal basis for $\mathbb{R}^n$, then so is $\{U\mathbf{v}_1, \ldots, U\mathbf{v}_n\}$.
5. Show that if an $n \times n$ matrix $U$ satisfies $(U\mathbf{x})\cdot(U\mathbf{y}) = \mathbf{x}\cdot\mathbf{y}$ for all $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, then $U$ is an orthogonal matrix.
6. Show that if $U$ is an orthogonal matrix, then any real eigenvalue of $U$ must be $\pm 1$.
7. A Householder matrix, or an elementary reflector, has the form $Q = I - 2\mathbf{u}\mathbf{u}^T$, where $\mathbf{u}$ is a unit vector. (See Exercise 13 in the Supplementary Exercises for Chapter 2.) Show that $Q$ is an orthogonal matrix. (Elementary reflectors are often used in computer programs to produce a QR factorization of a matrix $A$. If $A$ has linearly independent columns, then left-multiplication by a sequence of elementary reflectors can produce an upper triangular matrix.)
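The key identity behind Exercise 7 is that $Q = I - 2\mathbf{u}\mathbf{u}^T$ is symmetric and $Q^2 = I$, so $Q^TQ = I$. A quick numerical illustration (NumPy assumed; the random vector is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.normal(size=4)
u = u / np.linalg.norm(u)              # unit vector

Q = np.eye(4) - 2.0 * np.outer(u, u)   # elementary reflector

# Q is symmetric and Q^2 = I, hence Q^T Q = I: Q is orthogonal.
print(np.allclose(Q, Q.T), np.allclose(Q @ Q, np.eye(4)))
```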
8. Let $T: \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation that preserves lengths; that is, $\|T(\mathbf{x})\| = \|\mathbf{x}\|$ for all $\mathbf{x}$ in $\mathbb{R}^n$.
a. Show that $T$ also preserves orthogonality; that is, $T(\mathbf{x})\cdot T(\mathbf{y}) = 0$ whenever $\mathbf{x}\cdot\mathbf{y} = 0$.
b. Show that the standard matrix of $T$ is an orthogonal matrix.
9. Let $\mathbf{u}$ and $\mathbf{v}$ be linearly independent vectors in $\mathbb{R}^n$ that are not orthogonal. Describe how to find the best approximation to $\mathbf{z}$ in $\mathbb{R}^n$ by vectors of the form $x_1\mathbf{u} + x_2\mathbf{v}$ without first constructing an orthogonal basis for $\mathrm{Span}\{\mathbf{u}, \mathbf{v}\}$.
10. Suppose the columns of $A$ are linearly independent. Determine what happens to the least-squares solution $\hat{\mathbf{x}}$ of $A\mathbf{x} = \mathbf{b}$ when $\mathbf{b}$ is replaced by $c\mathbf{b}$ for some nonzero scalar $c$.
11. If $a$, $b$, and $c$ are distinct numbers, then the following system is inconsistent because the graphs of the equations are parallel planes. Show that the set of all least-squares solutions of the system is precisely the plane whose equation is $x - 2y + 5z = (a + b + c)/3$.
$$x - 2y + 5z = a$$
$$x - 2y + 5z = b$$
$$x - 2y + 5z = c$$
12. Consider the problem of finding an eigenvalue of an $n \times n$ matrix $A$ when an approximate eigenvector $\mathbf{v}$ is known. Since $\mathbf{v}$ is not exactly correct, the equation
$$A\mathbf{v} = \lambda\mathbf{v} \tag{1}$$
will probably not have a solution. However, $\lambda$ can be estimated by a least-squares solution when (1) is viewed properly. Think of $\mathbf{v}$ as an $n \times 1$ matrix $V$, think of $\lambda$ as a vector in $\mathbb{R}^1$, and denote the vector $A\mathbf{v}$ by the symbol $\mathbf{b}$. Then (1) becomes $\mathbf{b} = \lambda V$, which may also be written as $V\lambda = \mathbf{b}$. Find the least-squares solution of this system of $n$ equations in the one unknown $\lambda$.
13. Use the steps below to prove the following relations among the four fundamental subspaces determined by an $m \times n$ matrix $A$.
$$\mathrm{Row}\,A = (\mathrm{Nul}\,A)^\perp, \qquad \mathrm{Col}\,A = (\mathrm{Nul}\,A^T)^\perp$$
a. Show that $\mathrm{Row}\,A$ is contained in $(\mathrm{Nul}\,A)^\perp$. (Show that if $\mathbf{x}$ is in $\mathrm{Row}\,A$, then $\mathbf{x}$ is orthogonal to every $\mathbf{u}$ in $\mathrm{Nul}\,A$.)
b. Suppose $\mathrm{rank}\,A = r$. Find $\dim \mathrm{Nul}\,A$ and $\dim(\mathrm{Nul}\,A)^\perp$, and then deduce from part (a) that $\mathrm{Row}\,A = (\mathrm{Nul}\,A)^\perp$. [Hint: Study the exercises for Section 6.3.]
c. Explain why $\mathrm{Col}\,A = (\mathrm{Nul}\,A^T)^\perp$.
14. Explain why an equation $A\mathbf{x} = \mathbf{b}$ has a solution if and only if $\mathbf{b}$ is orthogonal to all solutions of the equation $A^T\mathbf{x} = \mathbf{0}$.

Exercises 15 and 16 concern the (real) Schur factorization of an $n \times n$ matrix $A$ in the form $A = URU^T$, where $U$ is an orthogonal matrix and $R$ is an $n \times n$ upper triangular matrix.¹

15. Show that if $A$ admits a (real) Schur factorization, $A = URU^T$, then $A$ has $n$ real eigenvalues, counting multiplicities.
16. Let $A$ be an $n \times n$ matrix with $n$ real eigenvalues, counting multiplicities, denoted by $\lambda_1, \ldots, \lambda_n$. It can be shown that $A$ admits a (real) Schur factorization. Parts (a) and (b) show the key ideas in the proof. The rest of the proof amounts to repeating (a) and (b) for successively smaller matrices, and then piecing together the results.
a. Let $\mathbf{u}_1$ be a unit eigenvector corresponding to $\lambda_1$, let $\mathbf{u}_2, \ldots, \mathbf{u}_n$ be any other vectors such that $\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}$ is an orthonormal basis for $\mathbb{R}^n$, and then let $U = [\,\mathbf{u}_1\ \mathbf{u}_2\ \cdots\ \mathbf{u}_n\,]$. Show that the first column of $U^TAU$ is $\lambda_1\mathbf{e}_1$, where $\mathbf{e}_1$ is the first column of the $n \times n$ identity matrix.
b. Part (a) implies that $U^TAU$ has the form shown below.
[M] When the right side of an equation $A\mathbf{x} = \mathbf{b}$ is changed slightly, say, to $A\mathbf{x} = \mathbf{b} + \Delta\mathbf{b}$ for some vector $\Delta\mathbf{b}$, the solution changes from $\mathbf{x}$ to $\mathbf{x} + \Delta\mathbf{x}$, where $\Delta\mathbf{x}$ satisfies $A(\Delta\mathbf{x}) = \Delta\mathbf{b}$. The quotient $\|\Delta\mathbf{b}\|/\|\mathbf{b}\|$ is called the relative change in $\mathbf{b}$ (or the relative error in $\mathbf{b}$ when $\Delta\mathbf{b}$ represents possible error in the entries of $\mathbf{b}$). The relative change in the solution is $\|\Delta\mathbf{x}\|/\|\mathbf{x}\|$. When $A$ is invertible, the condition number of $A$, written as $\mathrm{cond}(A)$, produces a bound on how large the relative change in $\mathbf{x}$ can be:
$$\frac{\|\Delta\mathbf{x}\|}{\|\mathbf{x}\|} \le \mathrm{cond}(A)\cdot\frac{\|\Delta\mathbf{b}\|}{\|\mathbf{b}\|} \tag{2}$$
In Exercises 17–20, solve $A\mathbf{x} = \mathbf{b}$ and $A(\Delta\mathbf{x}) = \Delta\mathbf{b}$, and show that the inequality (2) holds in each case. (See the discussion of ill-conditioned matrices in Exercises 41–43 in Section 2.3.)
17. $A = \begin{bmatrix} 4.5 & 3.1 \\ 1.6 & 1.1 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 19.249 \\ 6.843 \end{bmatrix}$, $\Delta\mathbf{b} = \begin{bmatrix} .001 \\ -.003 \end{bmatrix}$

18. $A = \begin{bmatrix} 4.5 & 3.1 \\ 1.6 & 1.1 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} .500 \\ -1.407 \end{bmatrix}$, $\Delta\mathbf{b} = \begin{bmatrix} .001 \\ -.003 \end{bmatrix}$
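Exercise 17 can be worked by machine; a sketch (NumPy assumed) that solves both systems and checks inequality (2) with the 2-norm condition number:

```python
import numpy as np

A = np.array([[4.5, 3.1],
              [1.6, 1.1]])
b = np.array([19.249, 6.843])
db = np.array([0.001, -0.003])

x = np.linalg.solve(A, b)        # ~ [3.94, 0.49]
dx = np.linalg.solve(A, db)

rel_x = np.linalg.norm(dx) / np.linalg.norm(x)
rel_b = np.linalg.norm(db) / np.linalg.norm(b)
cond = np.linalg.cond(A)         # 2-norm condition number; large, since det(A) = -0.01

print(rel_x <= cond * rel_b)     # True: inequality (2) holds
```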
1 Ifcomplexnumbersareallowed, every n� n matrix A admitsa(complex)Schurfactorization,A D URU�1, whereR isuppertriangularand U�1 isthe conjugate transposeof U . Thisveryusefulfactisdiscussedin MatrixAnalysis, byRogerA. HornandCharlesR. Johnson(Cambridge: CambridgeUniversityPress, 1985), pp. 79–100.