Welcome to Mathematics Class

Welcome to Quantitative Methods Class
Instructor: Ms. Ayesha N. Rao

Instructor's Profile
Has completed several time-series projects with the Pakistan Institute of Development Economics (PIDE), the Central Board of Revenue (CBR) and the Federal Bureau of Statistics (FBS).
PhD in progress
Holds an M.Phil in Statistics
Holds an M.Sc in Statistics

Publications
Zahid Asghar and Ayesha Nazuk. Iran-Pakistan-India Gas Pipeline: An Economic Analysis in a Game Theoretic Framework. The Pakistan Development Review, Vol. 46, No. 4, Part II (2007), pp. 537-550.
Ayesha Nazuk and Javid Shabbir. A New Mixed Randomized Response Model. Proceedings of the European Conference on Quality in Official Statistics (Q2010), held in Finlandia Hall, Helsinki, Finland (4th-6th May 2010).
Estimating the proportion of liars in NUST, NUST Journal of Business and Finance.
For details, search for AYESHA NAZUK on Google.

More Publications
Ayesha Nazuk, Fiza Amer, Quratulain Tanvir, Saba Nawaz, Sahar Zahid Siddiqui, Shahwaiz Alvi (2013). Entrepreneurial Education in Public Sector Institutes of Rawalpindi/Islamabad. International Journal of Management Sciences and Business Research, Vol. 2, Issue 2, pp. 56-72.
Ayesha Nazuk, Sadia Nadir and Javid Shabbir (2013). Adjustment of the auxiliary variable for estimation of a finite population mean. Article accepted for publication in Lahore Journal of Operations Research and Statistics.
Ayesha Nazuk, Yusra Siddiquii, Maha Gul, Rana Iradat Shareef, Meraj Murtaza and Raza Abbas Rajput. Analysis of Cheating disorder among university students through Randomized Response Technique. International Journal of Business and Behavioral Sciences, Vol. 3, No. 3; 2013, pp. 15-22.
Book review of "Bio-statistical Analysis" by Jerrold H. Zar, NUST Journal of Business and Economics, Vol. 2, No. 2, pp. 98-99.

Workshops Conducted
Registered with PDC NUST and have organized several trainings on statistics and statistical software.
Trained the faculty of NUST Business School in the tools of econometrics.
Trained the faculty of AIOU in SPSS.
Have delivered guest lectures in various universities/organizations.

Student Consultation
Students are expected to go through the class lectures and notes on a continuous basis. If you have a problem, you are welcome to contact me during appointment hours, which are posted on LMS, or at any other time subject to prior appointment made through the office phone, i.e. 90853560.

Marking Scheme
Mid-Term = 30
Terminal Exam = 40
Assignments = 15
Quizzes = 15
If a term paper is assigned, the marks for assignments will be scaled down.

Contact Details
Ms. Ayesha N. Rao
Office: Room 310, NBS Faculty Block.
Phone: +92-51-9085-3560
E-mail: [email protected] or [email protected]

COURSE OBJECTIVES
This course provides an introduction to theoretical and applied statistics and mathematics for business and economics. The main objective is to stress the importance of applying statistical analysis to the solution of common business problems.

Text Book- will be uploaded on LMS soon.

Assignments
Students are advised to form study groups (each consisting of 3 to 4 students) and are strongly encouraged to study together to solve homework problems. Submit assignments as a group. NO INDIVIDUAL ASSIGNMENTS.

Examination and Quizzes
2-3 quizzes, a mid-term exam and a comprehensive final exam will be given in class during the semester. Quizzes, of course, are to be solved independently.

Review Worksheets
If required, review worksheets will be posted on LMS. You are encouraged to solve these and discuss them among yourselves and with the instructor.

Make-Up Quiz
As a rule, there will be no make-ups for missed quizzes regardless of reason, and late assignments will not be accepted. DO NOT REQUEST THIS. A make-up quiz may be given only under extreme circumstances; for such a request, please submit a written application.
Absentees are expected to cover the previous lecture and must not come to class unprepared. It has been noticed in the past few semesters that absentees tend to impede the pace of the lecture. Such behavior is not at all welcome.

CLASSROOM POLICY

I expect you to conduct yourself with professional courtesy in the classroom. You should not talk to other students during lectures unless directed to do so by the instructor. Brief discussions, in a decent low voice, to ensure understanding may be done. Please turn off Cell Phones, Beepers, i-pods or Pagers.

Course Outline

Measures of Central Tendency & Dispersion
The Arithmetic Mean, The Mode, The Median
Range, Skewness, Kurtosis, Variance & Standard Deviation

Probability
Basic Definitions: Events, Sample Space & Probability
Rules of Probability
The Rule of Complements
Addition Law & Mutually Exclusive Events
Conditional Probability
Independence of Events
Product Rules for Independent Events

Course Outline (continued)

Probability Distributions
Normal distribution
Student-t distribution

Hypothesis Testing
The Concept of Hypothesis Testing
Type I & Type II Errors, Computing the p-Value
One-tailed & Two-tailed Tests
Tests of the Mean of a Normal Distribution: Population variance known
Tests of the Mean of a Normal Distribution: Population variance unknown

Hypothesis Testing II
Tests of the Difference Between Two Population Means
Tests of the Difference Between Two Population Proportions
Tests of the Equality of the Variances Between Two Normally Distributed Populations

Regression & Correlation Analysis
Regression versus Correlation
Regression versus Causation
Classical Linear Regression Model & Assumptions
Method of Ordinary Least Squares
Method of Logistic Regression

Differentiation
Concepts of Derivatives
Rules of Derivatives
Examples & Practice
Applications in Business

Optimization
Concavity & Inflection Points
Identification of Maxima & Minima
Business Applications

Depreciation &/or Annuities
Straight-line Method, Sum-of-year-digit Method, Declining Balance Method, Units of Production Method & The MARC Method
Annuities, Sinking Funds

Markup
Markup on Cost
Markup on Selling Price, Relationships between markups
Markdown and Shrinkage

Discounts
Trade discount, Trade discount series
Cash discounts, Discounts and Freight terms

Scales of Measurement
Nominal data is just for naming, e.g. our names, names of cities, CNIC numbers, and roll numbers of students provided that they are not assigned as per merit. All arithmetic operations are invalid.
Ordinal data is for naming with a sense of ranking, e.g. level of management from low to high, or numbers on the back of cricketers provided that they are based on ICC ranking.
Interval data is purely numeric but has no true zero point. The Fahrenheit and Celsius scales of temperature are both examples of data at the interval level of measurement.
You can talk about 30 degrees being 60 degrees less than 90 degrees, so differences do make sense. However, 0 degrees (on both scales), cold as it may be, does not represent the total absence of temperature.

PDC NUST workshop on SPSS - Trainer: Ms. Ayesha Nazuk Rao (Assistant Professor, NUST) - July 29, 2013 to July 31, 2013.

Ratio-scale data is purely numeric and has a true zero point. Distances, in any system of measurement, give us data at the ratio level. A measurement such as 0 feet does make sense, as it represents no length. Furthermore, 2 feet is twice as long as 1 foot, so ratios can be formed between the data.

Calculation of Progressive Average
Calculate the progressive average of the data.
Solution:

Year    Sale (in lakhs of $)    Progressive total    Progressive average
1972    8                       8                    8.0
1973    9                       17                   8.5
1974    8                       25                   8.3
1975    7                       32                   8.0
1976    8                       40                   8.0
1977    9                       49                   8.1
1978    10                      59                   8.4
1979    11                      70                   8.7
1980    11                      81                   9.0
1981    12                      93                   9.3
1982    10                      103                  9.3

Each progressive average is the progressive total divided by the number of years so far, e.g. for 1973 it is 17/2 = 8.5.

Composite Average - Use
The Dow Jones Composite Average is a stock index from Dow Jones Indexes that tracks 65 prominent companies. The average's components are every stock from the Dow Jones Industrial Average, the Dow Jones Transportation Average, and the Dow Jones Utility Average.
C.A = Sum of all averages / Number of averages

The Arithmetic Mean is not independent of origin and scale
Let Y = aX + b. Then Mean of Y = a(Mean of X) + b.
Let Y = aX - b. Then Mean of Y = a(Mean of X) - b.
Addition/subtraction changes the origin and multiplication/division changes the scale.
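The effect of a change of origin and scale on the arithmetic mean can be checked numerically. The following is a minimal sketch, assuming illustrative data values and constants a and b that are not taken from the slides; it relies on the standard result that the mean of aX + b equals a(Mean of X) + b.

```python
from statistics import mean


def transformed_mean(xs, a, b):
    """Mean of Y = a*X + b, computed directly from the transformed values."""
    return mean(a * x + b for x in xs)


# Illustrative data and constants (assumptions, not from the slides).
xs = [8, 9, 8, 7, 8]
a, b = 3, 2

direct = transformed_mean(xs, a, b)   # average the transformed values
via_rule = a * mean(xs) + b           # apply the rule to the original mean

print(direct, via_rule)               # both routes give the same number

# Subtraction is the same rule with a negative constant: Y = a*X - b.
direct_minus = transformed_mean(xs, a, -b)
via_rule_minus = a * mean(xs) - b
print(direct_minus, via_rule_minus)
```

Multiplying by a rescales the mean and adding or subtracting b shifts it, which is why the mean depends on both origin and scale.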

For a sample of 10 numbers (from x1, the smallest, to x10, the largest) the 20% Winsorized mean, with the lowest and the highest 10% of values replaced, is

(x2 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x9) / 10

The key is in the repetition of x2 and x9: the extras substitute for the original values x1 and x10, which have been discarded and replaced.

Example: Q.M (quadratic mean)

X       X**2
-100    10000
0       0
100     10000

A.M = 0
Q.M = Square root of (20000/3) = 81.64966

Data Type and Average Used
Ratios of change, proportions, percentages etc.: G.M
Rate of change per unit of time, such as speed, number of items produced per day etc.: H.M
Well-behaved (that is, outlier-free) data which is purely quantitative: A.M
Nominal: Mode
Ordinal: Median

Median
The middle observation of an arrayed data set (that is, one arranged in either ascending or descending order) is called the median.
For ungrouped data it is item number (n+1)/2. E.g. if the data set is 4, 6, 1, 5, 4, 8, we array it as 1, 4, 4, 5, 6, 8; median = item at position (6+1)/2 = 3.5th item = (3rd item + 4th item)/2 = (4+5)/2 = 4.5. So the median is 4.5.
If the data is 1, 3, 7, the median is the (3+1)/2 = 2nd item = 3.

Median for grouped data
Median = l + (h/f) x (n/2 - C), where l = lower class boundary of the median class, h = width of the median class, f = frequency of the median class, C = cumulative frequency of the class preceding the median class and n = total frequency.
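The (n+1)/2 positional rule for the median of ungrouped data can be sketched in code. This is a minimal illustration, assuming numeric data; the function name median_by_position is hypothetical and chosen only for this example.

```python
def median_by_position(values):
    """Median of ungrouped data using the (n+1)/2 positional rule."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    data = sorted(values)          # array the data in ascending order
    n = len(data)
    position = (n + 1) / 2         # 1-based position of the median
    lower = int(position)          # whole part of the position
    if position == lower:
        # Odd n: the position is a whole number, so take that item.
        return data[lower - 1]
    # Even n: the position ends in .5, so average the two neighbouring items.
    return (data[lower - 1] + data[lower]) / 2


print(median_by_position([4, 6, 1, 5, 4, 8]))   # the 3.5th item
print(median_by_position([1, 3, 7]))            # the 2nd item
```

The two calls reproduce the worked examples from the slides: the first averages the 3rd and 4th items and the second picks the 2nd item.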

Median can be found for ordinal data
For example, in a shop there are clothes in different shades of a color. One can arrange the clothes from lighter shades to darker as follows: S1, S2, S2, S3, S4, S5, S6.
The median shade is at position (7+1)/2 = 4th shade = S3.

Grouped Data Median Calculations

Interval    f     C.F    C.B
0-1         5     5      -0.5 to 1.5
2-3         6     11     1.5 to 3.5     (median class, as 6.5 lies here)
4-5         2     13     3.5 to 5.5
Total       13

Locate the class whose C.F covers n/2 = sum(f)/2 = 13/2 = 6.5.
Median = 1.5 + (2/6)(6.5 - 5) = 2.0
So the value 2.0 cuts the data into two equal halves.

Mode for grouped data
l = the lower limit of the modal class = 15
f1 = frequency of the modal class = 7
fo = frequency of the class preceding the modal class = 5
f2 = frequency of the class succeeding the modal class = 2
h = size of the class interval = 5

Example Continued
Mode = l + [(f1 - fo) / (2 x f1 - fo - f2)] x h
Mode = 15 + [(7 - 5) / (2 x 7 - 5 - 2)] x 5
Mode = 15 + [2 / (14 - 7)] x 5
Mode = 15 + (2 / 7) x 5
Mode = 15 + (10 / 7)
Mode = 15 + 1.43
Mode = 16.43
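The grouped-data calculations above can be sketched as two small functions. This is a minimal illustration, assuming the standard interpolation formulas Median = l + (h/f)(n/2 - C) and Mode = l + [(f1 - fo)/(2 f1 - fo - f2)] x h; the function and parameter names are hypothetical. For the median table, n/2 = 13/2 = 6.5, which gives a median of 2.0.

```python
def grouped_median(lower_boundary, width, freq, cum_freq_before, n):
    """Interpolated median: l + (h / f) * (n / 2 - C)."""
    return lower_boundary + (width / freq) * (n / 2 - cum_freq_before)


def grouped_mode(lower_limit, f1, f0, f2, width):
    """Interpolated mode: l + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    return lower_limit + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Median class 1.5 to 3.5: width 2, frequency 6, cumulative frequency 5 before it, n = 13.
median = grouped_median(lower_boundary=1.5, width=2, freq=6, cum_freq_before=5, n=13)

# Modal class values from the slide: l = 15, f1 = 7, fo = 5, f2 = 2, h = 5.
mode = grouped_mode(lower_limit=15, f1=7, f0=5, f2=2, width=5)

print(round(median, 2), round(mode, 2))
```

Both formulas assume the observations are spread evenly within the class, so the results are estimates rather than exact values from the raw data.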

Summary of Central Tendency Measures

Measure    Equation              Description
Mean       Sum of Xi / n         Balance point
Median     (n+1)/2 position      Middle value when ordered
Mode       none                  Most frequent

Quantiles: Quartiles, Deciles and Percentiles
Quantiles are break points that divide arrayed data into i equal parts.
Quartiles divide data into four equal parts. There are three quartiles: Q1, Q2 and Q3.
Q1 = [(n+1)/4]th item; 25% of the data lies before it.
Q2 = [2(n+1)/4]th item = [(n+1)/2]th item, which is the median; 50% of the data lies before Q2.
Q3 = [3(n+1)/4]th item; 75% of the data lies before it.

Quartiles
1. Measure of non-central tendency
2. Split ordered data into 4 quarters of 25% each, separated by Q1, Q2 and Q3
3. Position of i-th quartile: positioning point of Qi = i(n+1)/4

Deciles
Deciles are 9 break points that divide arrayed data into 10 equal parts.
D1 = [(n+1)/10]th item; 10% of the data lies before it.
D2 = [2(n+1)/10]th item; 20% of the data lies before it.
D9 = [9(n+1)/10]th item; 90% of the data lies before it.
Note that D5 = Q2 = Median.

Percentiles
Percentiles are 99 break points that divide arrayed data into 100 equal parts.
P25 = Q1
P75 = Q3
P10 = D1
Other relationships may easily be seen.

Measures of Variation

Data Summary; A Glance
To recall the data compaction process:
To summarize the data we use graphs and charts.
For more technical analysis, a frequency distribution is made.
To report a summary value that may represent the data, we find a measure of central tendency.
BUT there may be data sets that have the same value of central tendency but differ in terms of variation/scatter around the central value.

Illustrative example
On average, about 31 patients get satisfactory treatment from both D1 and D2. However, the data, the number of patients that come to D1 or D2, is very different.

Doctor    Mean     Values (No. of patients)
D1        30.75    12, 35, 36, 40
D2        30.75    1, 4, 3, 115

Measures of Central Tendency are Insufficient
A measure of central tendency cannot convey the full picture of the data. Specifically, it cannot tell us the amount of scatter in the data. If a measure of scatter (variation) accompanies a measure of central tendency, then the data can be described more efficiently.

Measure of Variation
Definition of Measure of Variation
A measure of variation describes how spread out or scattered a set of data is. It is also known as a measure of dispersion or a measure of spread.
Examples of Measure of Variation
Some important measures of variation: the range, the variance, and the standard deviation.

Range
The range is the distance between the lowest data point and the highest data point. The range can be misleading since it does not take every value into consideration. Consider each of the following data sets: 1, 10, 10, 10, 10 and 1, 2, 5, 8, 10. Both have a range of 9, yet the first data set is clearly not as dispersed as the second.

Variance & Standard Deviation
1. Measures of dispersion
2. Most common measures
3. Consider how data are distributed
4. Show variation about the mean (X-bar or mu)
[Figure: number line from 4 to 12 with the mean marked at X-bar = 8.3]

Sample Variance Formula
S^2 = Sum of (Xi - X-bar)^2 / (n - 1)
n - 1 in the denominator! (Use N if population variance)

Standard Deviation
The standard deviation of a set of scores is a measure of variation of scores about the mean. It is calculated by
S = Square root of [Sum of (Xi - X-bar)^2 / (n - 1)]

Procedure for finding the standard deviation (ungrouped data)
1. Find the mean of the scores.
2. Subtract the mean from each individual score.
3. Square each of the values in step 2.
4. Add up all the squares obtained in step 3.
5. Divide the total in step 4 by n-1.
6. Find the square root of the result of step 5.

Ungrouped Sample Data; S.D
Find the standard deviation of the data 1, 2, 12, 3, 6 and 11.
The mean of X is 5.83.
Variance is (110.833/5) = 22.167 and S.D is 4.708.

X      X - Mean    (X - Mean)^2
1      -4.83       23.36
2      -3.83       14.69
12     6.17        38.03
3      -2.83       8.03
6      0.17        0.03
11     5.17        26.69
Sum                110.833
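The six-step procedure for the sample standard deviation can be sketched as follows. This is a minimal illustration that follows the steps one by one; the function name sample_std is hypothetical, and the data are the six scores from the worked example.

```python
import math


def sample_std(scores):
    """Sample standard deviation via the six-step procedure (n - 1 in the denominator)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    mean = sum(scores) / n                      # step 1: mean of the scores
    deviations = [x - mean for x in scores]     # step 2: subtract the mean from each score
    squares = [d * d for d in deviations]       # step 3: square each deviation
    total = sum(squares)                        # step 4: add up the squares
    variance = total / (n - 1)                  # step 5: divide by n - 1
    return math.sqrt(variance)                  # step 6: take the square root


scores = [1, 2, 12, 3, 6, 11]
sd = sample_std(scores)
print(round(sd, 3))
```

Squaring the returned value gives the sample variance back, which is a quick way to check the intermediate figure from step 5.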

Standard Deviation: Grouped Sample Data

Interval    f     X      X - A.M    (X - A.M)^2    f(X - A.M)^2
0-1         5     0.5    -1.538     2.3654         11.82
2-3         6     2.5    0.462      0.2134         1.2804
4-5         2     4.5    2.462      6.0614         12.12
Total       13                                     25.23

A.M = sum of f*X / sum of f = 26.50/13 = 2.038
S.D = Square root of [25.23/(13-1)] = Square root of 2.1025 = 1.45

Variance
Variance is the square of the S.D. Because the differences are squared, the units of variance are not the same as the units of the data. Therefore, the standard deviation is reported as the square root of the variance, and its units then correspond to those of the data set.
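The grouped-data calculation can be sketched in code as well. This is a minimal illustration, assuming class midpoints stand in for the observations in each class and that the sample formula with n - 1 in the denominator is intended; the function name is hypothetical. With the table above, 25.23/(13 - 1) is about 2.10, so the standard deviation comes out at about 1.45.

```python
import math


def grouped_mean_and_std(midpoints, freqs):
    """Mean and sample standard deviation of grouped data from class midpoints."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    # Weighted sum of squared deviations of each midpoint from the mean.
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    return mean, math.sqrt(squared_dev / (n - 1))


midpoints = [0.5, 2.5, 4.5]   # midpoints of the classes 0-1, 2-3, 4-5
freqs = [5, 6, 2]

am, sd = grouped_mean_and_std(midpoints, freqs)
print(round(am, 3), round(sd, 2))
```

Using midpoints is an approximation: the true standard deviation of the raw data may differ slightly because the spread within each class is ignored.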

Interpretation of Standard Deviation:

There are some ideas to remember about standard deviation and variance:
A small standard deviation means the data is close together; a large standard deviation means the data is widely spread.
At least 75% of all scores fall within 2 standard deviations of the mean and at least 89% fall within 3 standard deviations of the mean.

Inter-quartile Range
When there are extreme values in a distribution or when the distribution is skewed, variance and standard deviation are not reliable measures of spread. In these situations the inter-quartile range or the semi-inter-quartile range is the preferred measure of spread.
The inter-quartile range is the difference between Q3 and Q1, i.e. Q3 - Q1.
The semi-inter-quartile range is half of that difference, i.e. (Q3 - Q1)/2.

Summary of Variation Measures

Measure                             Equation                                                Description
Range                               Xlargest - Xsmallest                                    Total spread
Interquartile Range                 Q3 - Q1                                                 Spread of middle 50%
Standard Deviation (Sample)         Square root of [Sum of (Xi - X-bar)^2 / (n - 1)]        Dispersion about sample mean
Standard Deviation (Population)     Square root of [Sum of (Xi - mu)^2 / N]                 Dispersion about population mean
Variance (Sample)                   Sum of (Xi - X-bar)^2 / (n - 1)                         Squared dispersion about sample mean

Relative Measure of Dispersion
Comparison of data sets
Up till now we have been analyzing a single data set. Direct comparison of variances or standard deviations across data sets is not valid, because they depend on the unit of measurement. For example, suppose we have one data set on weights of potatoes and another on weights of milk cartons. Then a standard deviation of 0.1 kg may be considered large for potatoes but small for milk cartons.
Relative measures of variation are those that help in comparing two or more data sets, to see which data set is more scattered/dispersed/variable.

Coefficient of Variation
It is defined as the ratio of the standard deviation to the mean:
C.V = S.D / Mean
This is only defined for a non-zero mean, and is most useful for variables that are always positive. It does not have any meaning for data on an interval scale.

S.D vs C.V
For example, the value of the standard deviation of a set of weights will be different depending on whether they are measured in kilograms or pounds. The coefficient of variation, however, will be the same in both cases as it does not depend on the unit of measurement.

C.V Interpretation
The lesser the C.V, the lesser the variability in the data.

C.V Pros and Cons
Advantages
The coefficient of variation is a dimensionless number. So when comparing data sets with different units or wildly different means, one should use the coefficient of variation for comparison instead of the standard deviation.
Disadvantages
When the mean value is near zero, the coefficient of variation is sensitive to small changes in the mean, limiting its usefulness.

Z-Scores
Z-scores are a means of answering the question "how many standard deviations away from the mean is this observation?" If our observation X is from a population with mean mu and standard deviation sigma, then
Z = (X - mu) / sigma

Z Score for Sample
On the other hand, if the observation X is from a sample with mean X-bar and standard deviation s, then
Z = (X - X-bar) / s

Z Score Interpretation
A positive (negative) Z-score indicates that the observation is greater than (less than) the mean.

Example
In a certain city the mean price of a quart of milk is 63 cents and the standard deviation is 8 cents. The average price of a package of bacon is $1.80 and the standard deviation is 15 cents. If we pay $0.89 for a quart of milk and $2.19 for a package of bacon at a 24-hour convenience store, which is relatively more expensive? To answer this, we compute Z-scores for each.

Solution
Z (Milk) = (0.89 - 0.63)/0.08 = 3.25
Z (Bacon) = (2.19 - 1.80)/0.15 = 2.60
Our Z-scores show us that we are overpaying quite a bit more for the milk than we are for the bacon.
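The milk and bacon comparison can be reproduced with a short sketch. This is a minimal illustration, assuming all prices are expressed in dollars; the function name z_score is hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Prices in dollars, taken from the example.
z_milk = z_score(0.89, mean=0.63, sd=0.08)
z_bacon = z_score(2.19, mean=1.80, sd=0.15)

print(round(z_milk, 2), round(z_bacon, 2))

# The item with the larger Z-score is relatively more expensive.
more_expensive = "milk" if z_milk > z_bacon else "bacon"
print(more_expensive)
```

Because each Z-score is a ratio of two quantities in the same unit, the two items can be compared directly even though their prices differ in level and in spread.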