Probability and Statistics Chapter 1 Notes
Dec 16, 2015
I. Section 1-1
A. Definition of Statistics
1. Statistics is the science of collecting, organizing, analyzing,
and interpreting data in order to make decisions.
a. Data – Information coming from observations, counts, measurements,
or responses.
1) There are 2 types of data sets.
a) Population – the collection of all outcomes,
responses, measurements, or counts that are of
interest.
1. In other words, the set of all possible
measurements, counts or observations that are
of interest in a particular study.
b) Sample – A subset of the population.
1. Since it is usually impractical or even impossible
in terms of time or money to obtain every
possible response, we must often rely on
information obtained from a sample.
a. Random Sample – A sample in which every
member of the population has an equal
chance of being selected.
2) A central theme in the study of statistics is
using information obtained from a sample to
make decisions or inferences about the entire
population from which the sample was drawn.
1. We will study techniques that enable us
to do this with a high level of reliability.
3) There are 2 types of numerical descriptions.
a) Parameter – A numerical description of a
population characteristic.
b) Statistic – A numerical description of a sample
characteristic.
B. Branches of Statistics
1. Descriptive Statistics
a. The branch of statistics that involves the organization,
summarization, and display of data.
2. Inferential Statistics
a. The branch of statistics that involves using a sample to
draw conclusions about a population.
1) A basic tool in the study of inferential statistics is
probability.
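The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 heights is simulated, and the names (population, sample, population_mean, sample_mean) are made up for this example.

```python
import random
import statistics

# Hypothetical population: heights (in cm) of all 1,000 members of a group.
# The values are simulated for illustration only.
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(1000)]

# Parameter: a numerical description of the whole population.
population_mean = statistics.mean(population)

# Random sample: every member has an equal chance of being selected.
sample = rng.sample(population, k=50)

# Statistic: a numerical description of the sample.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values will usually be close but not equal. Inferential statistics is about judging how far a statistic is likely to be from the parameter it estimates.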
II. Section 1-2
A. Types of Data
1. Qualitative Data
a. Attributes, labels, or nonnumerical entries.
2. Quantitative Data
a. Numerical measurements or counts.
B. Levels of Measurement
1. Nominal Data
a. Consists of names, categories, qualities, or labels.
Example: type of car you drive.
b. Can put data into categories, but we are unable to determine if one
piece of data is better or higher than another.
c. When numbers are used as labels, such as on an athletic jersey,
they are classified as nominal data.
1) It is of no use whatsoever to know the average of all
jersey numbers of the King’s Fork field hockey team.
2. Ordinal Data
a. Designations or numerical rankings which can be arranged
in ascending or descending order.
1) TV ratings for #1 show, #2 show, etc.
b. We can compare rankings as to which is higher; however, it does
not make sense to subtract one rank value from another.
1) Differences in rankings are not meaningful
computations.
a) If there are three candidates for a job, they can be
ranked 1, 2, and 3, but there is no way to tell how
far ahead of the second candidate the first
candidate is.
3. Interval Data
a. Can be subtracted to find the difference between two values,
put in order, and put into categories.
b. Data is numerical; 0 can be used to indicate a position in time
or space; however, the zero at this level does not correspond to
“none” of the specific variable being measured.
1) The position on the thermometer of zero degrees
does not indicate that there is absolutely no heat present.
c. Differences between data values are meaningful, but it does not
make sense to compare one data value as being twice (or any
multiple of) another.
1) A temperature of 2 degrees is not twice as warm as a
temperature of 1 degree.
4. Ratio Data
a. The highest level of measurement.
1) The number of gallons of gasoline you put into your
car today.
b. There is a zero on this scale which is interpreted as “none”
of the variable in question.
1) It is possible to put zero gallons of gas into your tank today.
2) This is called an “inherent” zero.
c. It is meaningful to say one measure is two times, or three times,
as much as another.
1) You may have put twice as much gas in your car today
as you did last week.
5. How to tell Interval data from Ratio data.
a. Does the expression “twice as much” have any meaning in
the context of the data?
1) $2 is twice as much as $1, so these data points are at the
ratio level.
2) A temperature of 2 degrees is NOT twice as warm as 1
degree, so these data points are at the interval level.
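One way to see why “twice as much” fails at the interval level is to change units and check whether the ratio survives. The sketch below assumes the temperatures are in degrees Celsius (the notes do not name a scale) and uses gallons of gasoline as the ratio-level comparison; the function and variable names are made up for this example.

```python
import math


def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


def gallons_to_liters(g):
    """Convert US gallons to liters."""
    return g * 3.785411784


# Interval level: the zero point is arbitrary, so ratios depend on the unit.
c1, c2 = 1.0, 2.0
ratio_celsius = c2 / c1
ratio_fahrenheit = celsius_to_fahrenheit(c2) / celsius_to_fahrenheit(c1)

# Differences still carry over: a 1 degree C gap is always a 1.8 degree F gap.
gap_fahrenheit = celsius_to_fahrenheit(c2) - celsius_to_fahrenheit(c1)

# Ratio level: zero means "none", so ratios are the same in every unit.
g1, g2 = 5.0, 10.0
ratio_gallons = g2 / g1
ratio_liters = gallons_to_liters(g2) / gallons_to_liters(g1)

print(f"temperature ratio: {ratio_celsius:.3f} (C) vs {ratio_fahrenheit:.3f} (F)")
print(f"gasoline ratio:    {ratio_gallons:.3f} (gal) vs {ratio_liters:.3f} (L)")
```

The temperature ratio changes when the unit changes, so “twice as warm” is not a real property of the data. The gasoline ratio stays at 2 in both units.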
III. Section 1-3
A. Design of a Statistical Study
1. Identify the variable(s) of interest (the focus) and the
population of the study.
2. Develop a detailed plan for collecting data.
3. Collect the data.
4. Describe the data, using descriptive statistics techniques.
5. Interpret the data and make decisions about the population
using inferential statistics.
6. Identify any possible errors.
B. Data Collection
1. Do an Observational Study
a. Observe and measure characteristics of interest of part of a
population, but do NOT change existing conditions.
2. Do an Experiment
a. Apply a treatment to part of a population and observe
responses or results.
b. Observe another part of the population as a control group.
1) May use a placebo in place of the treatment being
tested.
3. Use a simulation
a. Use a mathematical or physical model to reproduce the
conditions of a situation or process.
1) Simulations allow us to study situations that are
impractical or even dangerous to create in real life.
a) Testing the effects of alcohol on a pilot’s ability to
fly is best done in a flight simulator.
2) Simulations often save time and/or money.
4. Use a survey (census)
a. A survey is an investigation of one or more
characteristics of a population.
1) Usually carried out on people by asking them to
respond to questions.
b. It’s important to word the questions so that they do not lead
to biased results.
C. Experimental Design
1. Experiments must be carefully designed in order to produce
meaningful, unbiased results.
a. The Hawthorne effect occurs in an experiment when
subjects change their behavior simply because they know
they are participating in an experiment.
2. Three key elements of a well-designed experiment are control,
randomization, and replication.
a. Control
1) It is important to control as many influential factors as
possible in a study.
2) When an experimenter cannot tell the difference
between the effects of different factors in an
experiment, a confounding variable has occurred.
3) The placebo effect occurs when a subject reacts favorably
to a placebo when in fact he or she has been given no
medical treatment at all.
a) Blinding is a technique in which the subject
does not know whether he or she is receiving a real
treatment or a placebo.
b) Double-blind experiments occur when neither the
subjects nor the experimenter knows which
individual subjects are receiving a treatment or a
placebo.
1. The experimenter only finds out which subjects
are which after all the data have been collected.
b. Randomization is a process of randomly assigning
subjects to different treatment groups.
1) Randomized block design – Divide subjects with
similar characteristics into blocks, and then randomly
split each block up into different treatment groups.
2) Matched-pairs design – Subjects are paired up
according to a similarity.
a) One subject in each pair is randomly selected to
receive one treatment, while the other one gets
another, different treatment.
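The two randomization designs above can be sketched in Python. This is a minimal illustration, not a standard library routine: the subjects, the age blocks, the twin pairs, and both function names are made up for the example.

```python
import random
from collections import defaultdict


def randomized_block_assignment(subjects, block_of, treatments, rng):
    """Group subjects into blocks, then randomly split each block
    among the treatments."""
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)

    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)  # random order within the block
        for i, subject in enumerate(shuffled):
            # Deal treatments out in turn so each block is split evenly.
            assignment[subject] = treatments[i % len(treatments)]
    return assignment


def matched_pairs_assignment(pairs, treatments, rng):
    """For each pair, randomly decide which member gets which treatment.
    Assumes exactly two treatments."""
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)
        assignment[first], assignment[second] = order
    return assignment


rng = random.Random(7)
treatments = ("treatment", "placebo")

# Hypothetical subjects: (id, age block). Four subjects in each block.
subjects = [(f"s{i:02d}", "under 40" if i <= 4 else "40 and over")
            for i in range(1, 9)]
block_assignment = randomized_block_assignment(
    subjects, lambda s: s[1], treatments, rng)

# Hypothetical matched pairs (for example, twins).
pairs = [("twin 1a", "twin 1b"), ("twin 2a", "twin 2b"), ("twin 3a", "twin 3b")]
pair_assignment = matched_pairs_assignment(pairs, treatments, rng)

for subject, treatment in sorted(block_assignment.items()):
    print(subject, treatment)
```

In the block design each age block ends up with two subjects on the treatment and two on the placebo, so age cannot be confused with the effect of the treatment.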
c. Replication is the repetition of an experiment using a
large group of subjects.
1) The larger the sample size, the better.
D. Sampling Techniques
1. Census – a count or measure of an entire population.
a. Provides complete information, but is often too costly or
difficult to perform.
2. Sampling – a count or measure of part of a population.
a. Researcher must ensure that the sample is
representative of the population.
1) This is necessary to ensure that inferences about a
population are valid.
a) Sampling error – the difference between the
results of a sample and those of the population.
b. Random sample – a sample in which every member of
the population has an equal chance of being selected.
1) Methods of sampling randomly
a) Simple Random Sample – assign each member of
the population a number and then randomly select
the numbers that you will survey.
1. Random number table (Appendix B of the book)
a. Randomly pick a starting point
b. Count off digits in groups whose length matches the
number of digits in your population size.
c. Record the numbers, ignoring those that are
larger than the population size.
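The random number table procedure can be sketched in Python as follows. The digit string stands in for one row of a table (it is made up, not copied from Appendix B), and the function name is hypothetical. The sketch also assumes members are labeled 1 through the population size and that repeated labels are skipped, which the steps above do not spell out.

```python
def sample_from_digit_table(digits, population_size, sample_size, start=0):
    """Read a string of random digits in fixed-width groups and keep
    the labels that fall between 1 and population_size."""
    width = len(str(population_size))  # digits per group
    chosen = []
    pos = start
    while len(chosen) < sample_size and pos + width <= len(digits):
        label = int(digits[pos:pos + width])
        pos += width
        # Ignore labels larger than the population size.
        # Assumption: also skip 0 and labels already chosen.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
    return chosen


# Made-up row of digits standing in for a random number table.
table_row = "9263078240192743"

# Population of 85 members, so digits are read two at a time:
# 92 (too large), 63, 07, 82, 40, 19, ...
picked = sample_from_digit_table(table_row, population_size=85, sample_size=5)
print(picked)  # [63, 7, 82, 40, 19]
```

In practice the starting point would be chosen at random rather than fixed at the beginning of the row; start=0 is used here only so the result can be checked by hand.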
2. Calculator
a. Press MATH, select PRB, press 5 (randInt)
b. Enter the number that you started with when
assigning labels to your population, then a
comma, then the last number you assigned,
comma, and the sample size you wish to use.
1) The calculator will generate the requested
quantity of random numbers.
3. If you do not want to have any member of the
population included in the sample twice, the
sampling process is said to be without
replacement.
4. If you don’t care if a member of the population
is included twice, the sampling process is said to
be with replacement.
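For comparison, here is a rough Python counterpart of the calculator steps, showing sampling with and without replacement side by side. The label range (1 to 30) and the sample size (8) are made-up values for illustration.

```python
import random

rng = random.Random(2015)

# Hypothetical labels: population members are numbered 1 through 30.
first_label, last_label, sample_size = 1, 30, 8

# With replacement: each draw is independent, so a label can repeat.
with_replacement = [rng.randint(first_label, last_label)
                    for _ in range(sample_size)]

# Without replacement: no label can be chosen twice.
without_replacement = rng.sample(range(first_label, last_label + 1),
                                 k=sample_size)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

Generating the numbers one draw at a time can produce repeats, which is sampling with replacement. To sample without replacement by that method, discard any repeated label and draw again.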
b) Stratified Sample
1. Separate population into two or more subsets,
called strata, using some similar characteristic.
a. Randomly select members of each stratum to
make up your sample.
c) Cluster Sample
1. When the population is already divided into
subsets that are very similar to each other, you
could randomly select a number of entire
groups (not all the groups) and do your data
collection on those groups.
a. We call these groups clusters.
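Stratified and cluster samples are easy to mix up, so here is a small Python sketch of both. The school data (4 grades, 3 homerooms per grade, 5 students per homeroom) and all function names are made up for this example.

```python
import random
from collections import defaultdict


def group_by(members, key):
    """Split members into groups according to key(member)."""
    groups = defaultdict(list)
    for member in members:
        groups[key(member)].append(member)
    return dict(groups)


def stratified_sample(members, stratum_of, per_stratum, rng):
    """Randomly pick some members from EVERY stratum."""
    sample = []
    for stratum_members in group_by(members, stratum_of).values():
        sample.extend(rng.sample(stratum_members, k=per_stratum))
    return sample


def cluster_sample(members, cluster_of, n_clusters, rng):
    """Randomly pick a few whole clusters and keep EVERYONE in them."""
    clusters = group_by(members, cluster_of)
    chosen = rng.sample(sorted(clusters), k=n_clusters)
    return [member for name in chosen for member in clusters[name]]


rng = random.Random(1)

# Hypothetical students: (id, grade, homeroom).
students = [(f"{grade}{room}-{n}", grade, f"{grade}{room}")
            for grade in (9, 10, 11, 12)
            for room in "ABC"
            for n in range(1, 6)]

# Strata = grades: 3 students from each of the 4 grades.
by_grade = stratified_sample(students, lambda s: s[1], per_stratum=3, rng=rng)

# Clusters = homerooms: every student in 2 randomly chosen homerooms.
by_homeroom = cluster_sample(students, lambda s: s[2], n_clusters=2, rng=rng)

print(len(by_grade), "students in the stratified sample")
print(len(by_homeroom), "students in the cluster sample")
```

The stratified sample includes some members of every group, while the cluster sample includes all members of only some groups.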
d) Systematic Sample
1) Each member of the population is assigned a
number.
a. Put the members of the population in order
somehow.
b. Randomly select a starting point.
c. Randomly select an interval.
d. Survey every nth member of the population
from your starting point.
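A systematic sample can be sketched in Python with list slicing. The function name and the population of 100 numbered members are made up. The sketch takes the interval as an input; a common choice, assumed here, is the population size divided by the desired sample size, with the random starting point taken from the first interval.

```python
import random


def systematic_sample(ordered_members, interval, rng):
    """Pick a random starting point within the first interval, then
    take every interval-th member from there."""
    start = rng.randrange(interval)  # index 0 .. interval - 1
    return ordered_members[start::interval]


rng = random.Random(3)

# Hypothetical population: members already in order and numbered 1 to 100.
members = list(range(1, 101))

# Assumption: a sample of about 10 is wanted, so the interval is 100 // 10.
interval = len(members) // 10
every_tenth = systematic_sample(members, interval, rng)

print(every_tenth)
```

Every selected member is exactly 10 places after the previous one. The method is easy to carry out, but it can give a biased sample if the ordering has a pattern that repeats at the same interval.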
e) Convenience Sample
1) NOT RECOMMENDED!!
a. Simply select those members of the
population who are readily available.
QUIZ on Chapter 1 Sections 1 and 2 during next class block:
Friday (ODD) and Monday (EVEN)
TEST on Chapter 1 next week: Tuesday (ODD) and Wednesday (EVEN)