
Introduction to STATISTICS-

Apr 05, 2018


Nitin Mishra
  • 8/2/2019 Introduction to STATISTICS-


    Introduction to STATISTICS


    Statistics is the science of conducting studies to collect, organize, summarize, analyze, present, interpret, and draw conclusions from data.


    What is data?

    Data is a collection of facts, concepts, or instructions in a formalized manner suitable for communication or processing by humans. A collection of data is known as a dataset, and a single observation is a data point.


    Statistics - Introduction

    Most people become familiar with probability and statistics through radio, television, newspapers, and magazines. For example, the following statements were found in newspapers.

    o Based on the 2000 census, 40.5 million households have two vehicles.
    o The average age of the top 50 powerful persons in India decreased from 58 years in 2003 to 54 years in 2006.
    o The average cost of a wedding is nearly Rs 10,00,000.
    o Women who eat fish once a week are 29% less likely to develop heart disease.


    Basic Concepts

    Data: Information coming from observations, counts, measurements, or responses.

    Population: The complete collection of measurements, outcomes, objects, or individuals under study.

    Sample: A subset of a population, containing the objects or outcomes that are actually observed.

    Parameter: A number that describes a population characteristic.

    Statistic: A number that describes a sample characteristic.

    The basic idea behind all statistical methods of data analysis is to make inferences about a population by studying a small sample chosen from it.
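
    The relationship between population, sample, parameter, and statistic can be sketched in Python. This is only an illustration: the population of heights below is simulated, and its size, mean, spread, and the sample size of 100 are assumptions, not values from the slides.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated heights (cm) of 10,000 individuals.
population = [random.gauss(165, 10) for _ in range(10_000)]

# Parameter: a number that describes the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population that is actually observed.
sample = random.sample(population, 100)

# Statistic: a number that describes the sample.
sample_mean = statistics.mean(sample)

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
```

    The sample mean will usually be close to, but not exactly equal to, the population mean; inference is about reasoning from the former to the latter.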


    Samples and Populations


    Descriptive Statistics

    Consists of the collection, organization, classification, summarization, and presentation of data obtained from the sample.

    Used to describe the characteristics of the sample.

    Used to determine whether the sample represents the target population by comparing the sample statistic with the population parameter.


    Inferential Statistics

    Consists of generalizing from samples to populations, performing estimations and hypothesis testing, determining relationships among variables, and making predictions.

    Used when we want to draw a conclusion from the data obtained from the sample.

    Used to describe, infer, estimate,

    approximate the characteristics of the


    Inferences

    Consider:
    o Average length of females and males: 90 cm and 100 cm respectively.
    o Descriptive statistics: the values.
    o Inference: males are (in general) larger than females.


    An overview of descriptive statistics and statistical inference

    [Flowchart with boxes "Statistical Inference" and "Descriptive Statistics" and branches labelled "Yes" and "No"]


    Data Collection

    Collect data
    o e.g. Survey

    Present data
    o e.g. Tables and graphs

    Characterize data
    o e.g. Sample mean = sum of the observations / number of observations

    Mean weight is 120 pounds
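
    As a small worked example of characterizing data with a sample mean, the sketch below uses five made-up weights (an assumption chosen so that the mean works out to 120 pounds; the slides do not give the underlying data).

```python
import statistics

# Hypothetical survey responses: weights in pounds (made-up values).
weights = [110, 125, 118, 130, 117]

# Sample mean = sum of the observations divided by the number of observations.
sample_mean = sum(weights) / len(weights)

# statistics.mean from the standard library gives the same result.
assert sample_mean == statistics.mean(weights)

print(f"Mean weight is {sample_mean:.0f} pounds")
```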


    Types of data

    Qualitative/Categorical and Quantitative/Numerical
    o Nominal, Ordinal, Interval and Ratio
    o Discrete
      -- Nominal and ordinal
    o Continuous
      -- Interval and ratio

    Cross-sectional, Temporal and Spatial


    Data Types

    Data:                   Qualitative          Quantitative
    Levels of measurement:  Nominal, Ordinal     Interval, Ratio
                            Discrete             Discrete or continuous


    Qualitative/Categorical variables

    Here, data are classified on the basis of some attribute or quality such as gender, literacy, religion, employment, etc.

    These attributes under study cannot be measured.

    One can only find out whether the attribute is present or absent in the units of the population under study.


    Example

    Attribute under study: blindness

    Here, we can determine how many persons are blind in a given population.

    It is not possible to measure the degree of blindness in each case.

    Attributes can be:
    o Gender (males and females)
    o Literacy (literates and illiterates)
    o Employment (employed and unemployed)


    Two types of categorical variables

    o Nominal
    o Ordinal


    Nominal data

    Nominal data are labels or assigned numbers.
    o Car number
    o Roll number
    o STD code
    o Color of bike
    o House number

    Such data are used for identifying individuals and places.


    Ordinal data

    Ordinal data can be arranged in order, such as worst to best or best to worst.

    Same as nominal data, but there is an order within the groups into which the data are classified.

    We cannot say by how much the groups differ from each other.
    -- Rating of hotels, restaurants and movies.


    Quantitative/Numerical variables

    Here, the data are classified on the basis of some characteristic capable of quantitative measurement, such as:
    o Marks scored by students in class
    o Height of individuals
    o Income of individuals
    o Age of individuals
    o Expenditure of individuals


    Two types of Quantitative variables

    o Interval data
    o Ratio data

    Quantitative variables can be discrete or continuous.


    Interval data

    Interval data are measured on a numerical scale.

    The zero point does not mean absence of the property.
    o Temperature


    Ratio data

    Ratio data possess all the properties of interval data, with a meaningful ratio of two values.

    Ratio data differ from interval data in that there is a definite zero point (nothing exists for the variable at the zero point).
    o Height
    o Weight
    o Price
    o Length
    o Sales revenue
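
    The difference between interval and ratio data can be illustrated with a short calculation. The sketch assumes temperature is recorded in degrees Celsius (the slides only say "Temperature") and uses made-up values; the Kelvin conversion is added here for contrast and is not part of the original slides.

```python
# Temperatures in degrees Celsius: an interval scale (zero is arbitrary).
temp_a_c, temp_b_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference_c = temp_b_c - temp_a_c      # 10 degrees warmer

# The naive ratio is NOT meaningful: 20 C is not "twice as hot" as 10 C.
naive_ratio = temp_b_c / temp_a_c       # 2.0, but misleading

# Kelvin has a true zero, so the same temperatures behave as ratio data.
temp_a_k, temp_b_k = temp_a_c + 273.15, temp_b_c + 273.15
kelvin_ratio = temp_b_k / temp_a_k      # roughly 1.035

# Weight (kg) is ratio data: zero means "no weight", so ratios make sense.
weight_ratio = 80.0 / 40.0              # 2.0: twice as heavy

print(difference_c, naive_ratio, round(kelvin_ratio, 3), weight_ratio)
```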


    Discrete variables

    A variable is said to be discrete if it assumes only some specific values.

    Discrete variables arise in situations where counting is involved.
    o number of credit cards held by an individual
    o number of defective items in boxes of 100 items
    o number of students in the class


    Continuous variables

    Continuous variables arise in situations where some sort of measurement over a range is involved.
    o life of an electric bulb
    o waiting time for customers at a bank's counter
    o rainfall
    o temperature
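
    The contrast between counted (discrete) and measured (continuous) variables can be sketched as follows. The data values and the helper name `looks_discrete` are hypothetical, and the check is only a rough heuristic: a measured quantity that happens to be rounded to whole numbers would also pass it.

```python
# Discrete variables arise from counting: only specific values are possible.
credit_cards_held = [0, 2, 1, 3, 1]          # hypothetical counts per person

# Continuous variables arise from measuring: any value in a range is possible.
waiting_time_min = [2.5, 7.25, 0.8, 4.125]   # hypothetical waiting times (minutes)


def looks_discrete(values):
    """Rough heuristic: True if every value is a whole number."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(credit_cards_held))
print(looks_discrete(waiting_time_min))
```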


    Case Let

    The ABC Marketing Corporation has asked you for information about the car you drive. For each question, identify the type of data requested as either qualitative or quantitative. When numeric data are requested, identify the variable as discrete or continuous.
    1. What is the weight of your car?
    2. In which city was your car made?
    3. How many people can be seated in your car?
    4. What's the distance traveled from your home to your school?
    5. What's the color of your car?
    6. How many cars are in your household?
    7. What's the length of your car?


    Levels of Measurement

    Level      Put in categories   Arrange in order   Subtract values   Divide values
    Nominal    Yes                 No                 No                No
    Ordinal    Yes                 Yes                No                No
    Interval   Yes                 Yes                Yes               No
    Ratio      Yes                 Yes                Yes               Yes
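
    The table can also be read as a rule: each operation becomes meaningful from a certain level upward. A minimal sketch of that rule follows; the names `Level`, `MIN_LEVEL`, and `allowed` are illustrative choices, not part of the slides.

```python
from enum import IntEnum


class Level(IntEnum):
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Lowest level at which each operation is meaningful (taken from the table).
MIN_LEVEL = {
    "categorize": Level.NOMINAL,
    "order": Level.ORDINAL,
    "subtract": Level.INTERVAL,
    "divide": Level.RATIO,
}


def allowed(level: Level, operation: str) -> bool:
    """Return True if the operation is meaningful at this level of measurement."""
    return level >= MIN_LEVEL[operation]


for level in Level:
    row = ["Yes" if allowed(level, op) else "No" for op in MIN_LEVEL]
    print(f"{level.name.title():<9}", *row)
```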


    Cross-sectional Data

    Cross-sectional data comprise a variable recorded at the same point or period of time for many individuals, organizations, places, etc.
    o Ages of all students at the time of joining IMS, in the year 2008.
    o Number of students enrolled in IIM, in the year 2008.
    o Stock prices of Infosys Technologies, TCS, and Wipro on 31st March 2008.
    o Population of Delhi, Mumbai, Chennai and Kolkata as per the 2001 census.


    Temporal Data

    Temporal data, also referred to as time-series data, are data about an individual, organization, place, etc. over a period of time.
    o Marks obtained by a student from standard I to XII.
    o Total business of ICICI Bank at the end of each of the last five years.
    o Population of India from the year 1931 to 2001.


    Spatial Data

    Spatial data are data organized by geographical location.
    o Income tax collection from various states
    o Sales of Times Of India in Delhi
    o Production of wheat in different states of the country
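
    One way to see how cross-sectional and temporal data relate is to slice the same table two ways. The sketch below is an assumption-laden illustration: the `sales` figures are invented, and only the city names are borrowed from the examples in the slides.

```python
# Hypothetical sales figures keyed by (city, year). All numbers are made up.
sales = {
    ("Delhi", 2006): 120, ("Delhi", 2007): 135, ("Delhi", 2008): 150,
    ("Mumbai", 2006): 200, ("Mumbai", 2007): 210, ("Mumbai", 2008): 230,
    ("Chennai", 2006): 90, ("Chennai", 2007): 95, ("Chennai", 2008): 110,
}

# Cross-sectional: many units observed at the same point in time.
# Because the units here are cities, this slice is also organized by location.
cross_section_2008 = {
    city: value for (city, year), value in sales.items() if year == 2008
}

# Temporal (time series): one unit observed over a period of time.
delhi_series = {
    year: value for (city, year), value in sales.items() if city == "Delhi"
}

print(cross_section_2008)
print(delhi_series)
```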


    Data Collection Techniques

    Method of Data Collection
    o Data collected and recorded by others (secondary data)
    o Data collected directly from the field of enquiry (primary data)


    Primary Data

    Data originally collected in the process of an investigation are known as primary data.

    Primary data consist of figures collected at first hand in order to satisfy the purpose of a particular statistical enquiry.

    Merits:
    o Original in nature
    o More reliable and accurate
    o Can be used with greater confidence because the enquirer knows their origin.
    o Exactly match the needs of the project.


    Demerits:
    o Expensive
    o Time-consuming
    o Collection of data involves creating new definitions and measuring instruments, such as questionnaires or interview forms, and training people to use these specifically designed instruments.


    Data Collection Techniques

    Collection of Primary Data
    o Direct Personal Investigation
      -- Observation
      -- Interview
    o Indirect Oral Observation
    o Mailed Questionnaire Method
    o Schedule Sent Through Investigator


    Collection of primary data

    Direct personal investigation
    o Personal interview (the investigator personally approaches each informant and gathers the required information)
    o Personal observation (here, rather than asking anybody, the investigator personally observes and records the information related to a particular field)

    Indirect oral observation (here, instead of directly approaching the actual field or person, data are collected from a third-party informant)

    Questionnaire method ( here, a well-prepared questionnaire is given to a list of

    persons with the request to return them duly


    Designing a Questionnaire

    The number of questions should be as few as possible.

    Questions should be of objective type. Yes-or-no type or simple tick-marking answers are preferred.

    Questions should be properly arranged to have a systematic and easy flow of answers.

    Questions affecting the sentiment and pride of the respondent should be avoided.

    Necessary instructions and guidelines


    Types of Questionnaires

    Structured or Non-structured questionnaire.

    Disguised and Non-disguised questionnaire.


    Structured or Non-structured questionnaire

    Structured questionnaire: consists of a set of questions arranged in a predetermined order. Each question requires the respondent to make a choice among a few given predetermined responses.

    Example: How frequently do you go to watch a movie?
    Choices (Very frequently, often, sometimes, never)

    Such questions are called closed questions.
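
    Because a closed question has a fixed set of answers, responses are easy to validate and tabulate. A minimal sketch, assuming six invented responses to the movie question above (the variable names are illustrative):

```python
from collections import Counter

# Predetermined choices for the closed question.
CHOICES = ["Very frequently", "Often", "Sometimes", "Never"]

# Hypothetical responses collected from six respondents.
responses = ["Often", "Never", "Sometimes", "Often", "Very frequently", "Often"]

# Every response to a closed question must be one of the given choices.
invalid = [r for r in responses if r not in CHOICES]

# Count how many respondents picked each choice.
tally = Counter(responses)

for choice in CHOICES:
    print(f"{choice:<16} {tally[choice]}")
```

    Open-ended answers, by contrast, would need to be read and coded into categories before they could be counted this way.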


    Non-structured questionnaire: consists of what are called open-ended questions.

    Example:
    o How do you spend your free time?
    o How do you describe the ambience of the new store?

    Such questions give the respondent freedom to answer according to their views and opinions.


    Disguised and Non-disguised questionnaire

    Non-disguised questionnaire: here, the purpose or objectives of the study are made known to the respondent.

    Disguised questionnaire: here, respondents are not taken into confidence regarding the purpose or objectives of the study.

    The disguised questionnaire is not very popular, as respondents may not be forthcoming in their answers when they do not know the objectives or relevance of the questions or the study.


    Secondary data

    Secondary data consist of figures which were collected originally to satisfy a particular enquiry but are now being used for a different enquiry.

    Sources of secondary data:
    o Journals
    o Reports
    o Government and non-Government publications.


    Data Collection Techniques

    Collection of Secondary Data
    o Journals, Newspapers
    o Publications by Government / International Organizations
    o Universities and Research Organizations
    o Books
    o Internet


    Merits:
    o Readily available
    o Less expensive compared to primary data
    o Less time-consuming compared to primary data

    Demerits:
    o These may not be relevant in the present context.
    o These may not have the needed accuracy or reliability.
    o These may not be adequate.


    Types of secondary data

    Internal or external

    Internal
    o Company reports, intranet

    External
    o Newspapers, magazines, websites, RBI publications

    Summary


    The two major areas of statistics are descriptive and inferential.

    When the populations to be studied are large, statisticians use subgroups called samples.

    Data can be classified as qualitative or quantitative.

    The four basic types of measurement are nominal, ordinal, interval, and ratio.
