Correlations and t Scores Dr. Pedro L. Martinez Introduction to Educational Research and Assessment

Correlations and t scores (2)

May 22, 2015

Education

Pedro Martinez

Help and Guide Study for Students Struggling with basic research concepts
Transcript
Page 1: Correlations and t scores (2)

Correlations and t Scores

Dr. Pedro L. Martinez

Introduction to Educational Research and Assessment

Page 2: Correlations and t scores (2)

Populations and Samples

Population - all the individuals of interest in a particular study (a characteristic that describes a population is called a parameter)

Sample - set of individuals selected from a population, usually intended to represent the population (a characteristic that describes a sample is called a statistic)

Understanding the relationship between a population and a sample from that population is essential to understanding inferential statistics and why the formulas for each may differ. Why don’t we just study the population?
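
To make the parameter/statistic distinction concrete, here is a minimal sketch. The population of 1,000 exam scores is simulated and purely hypothetical; the point is only that the variance formula divides by N for a population and by n - 1 for a sample.

```python
import random
import statistics

# Hypothetical population: exam scores for all 1,000 students of interest.
rng = random.Random(0)
population = [rng.gauss(70, 10) for _ in range(1000)]

# Parameters describe the population.
mu = statistics.mean(population)
sigma_sq = statistics.pvariance(population)  # squared deviations divided by N

# A sample is a set of individuals selected from the population.
sample = rng.sample(population, 25)

# Statistics describe the sample.
x_bar = statistics.mean(sample)
s_sq = statistics.variance(sample)  # squared deviations divided by n - 1

print(f"parameter: mu = {mu:.2f}, sigma^2 = {sigma_sq:.2f}")
print(f"statistic: x_bar = {x_bar:.2f}, s^2 = {s_sq:.2f}")
```

The sample values will usually be close to, but not equal to, the population values.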

Page 3: Correlations and t scores (2)

Populations and Samples

Population - all the individuals of interest in a particular study (a characteristic that describes a population is called a parameter)

Sample - set of individuals selected from a population, usually intended to represent the population (a characteristic that describes a sample is called a statistic)

Understanding the relationship between the population and a sample from that population.

If we are mainly interested in the population, why don’t we just study the population?

Page 4: Correlations and t scores (2)

Variables and Data

Variable - characteristic or condition that changes or has different values for different individuals.

Data - measurements or observations (plural)

Data set - collection of measurements or observations

Datum - single measurement (also called score or raw score)

Page 5: Correlations and t scores (2)

Inferential Statistics

Inferential Statistics - techniques that allow us to study samples and then make generalizations about the populations from which they were selected.

z scores helped us to make comparisons between different distributions by standardization

Sampling error - discrepancy that exists between a sample statistic and the corresponding population parameter
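
Sampling error can be sketched in a few lines. The population of 500 scores below is simulated and hypothetical, and `sampling_error` is an illustrative helper name, not a standard function.

```python
import random
import statistics

# Hypothetical population of 500 test scores.
rng = random.Random(1)
population = [rng.randint(40, 100) for _ in range(500)]
mu = statistics.mean(population)  # population parameter


def sampling_error(sample, parameter):
    """Sample statistic (the mean) minus the corresponding population parameter."""
    return statistics.mean(sample) - parameter


# Draw samples of increasing size and report the discrepancy for each.
for n in (5, 50, 500):
    sample = rng.sample(population, n)
    print(n, round(sampling_error(sample, mu), 2))
```

When the sample is the entire population (n = 500 here), the sampling error is zero; for smaller samples it is usually nonzero.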

Page 6: Correlations and t scores (2)

Research Methods

Correlational Method - two different variables are observed to determine whether there is a relationship between them

Experimental Method - one variable is manipulated while another variable is observed and measured. Used to establish a cause-and-effect relationship between variables

Page 7: Correlations and t scores (2)

Correlational Designs

A correlation represents the strength of the relationship between two variables
e.g., # of hours of studying and test grades
e.g., motivation and persistence
e.g., ability to delay gratification as a child and success in college

Scatter plot

Correlation coefficient (“r”) ranges from -1 to +1
e.g., r = +.34
e.g., r = -.52
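
As a rough sketch of how r is obtained, the function below computes the Pearson correlation from deviation scores. The hours and grades are made-up numbers for illustration, and `pearson_r` is a name chosen here, not part of the slides.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical data: hours of studying and test grades for six students.
hours = [1, 2, 3, 4, 5, 6]
grades = [55, 60, 58, 70, 72, 80]
r = pearson_r(hours, grades)
print(f"r = {r:+.2f}")
```

Whatever data are supplied, the result always falls between -1 and +1.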

Page 8: Correlations and t scores (2)

Interpreting Correlations

Positive correlation: an increase in studying is associated with an increase in test grade scores

Negative correlation: an increase in studying is associated with a decrease in test grade scores

No correlation (0 correlation): the variables are not (linearly) related

A correlation may exist but might be due to another variable (e.g., persistence and number of hours spent studying a subject)

Page 9: Correlations and t scores (2)

Correlations: Positive, Negative, and None

Page 10: Correlations and t scores (2)

Correlation ≠ Causation

Page 11: Correlations and t scores (2)

Why can’t we infer causality?

Reverse-Causality Problem: X → Y or Y → X?

Is there a relationship between exposure to violent films and aggression?

Page 12: Correlations and t scores (2)

Why can’t we infer causality?

Reverse-Causality Problem: X → Y or Y → X?

Third-variable problem: A → X and A → Y

e.g., ice cream sales and violence (r = +.29)

VERY IMPORTANT FOR INTERPRETING NEWS ABOUT HEALTH RESEARCH!!!
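
The third-variable problem can be simulated. In this sketch X and Y never influence each other; both depend only on A. The data are simulated, the temperature label for A is an assumption added for illustration, and the result is not meant to reproduce the r = +.29 quoted above. The partial correlation formula used to hold A constant is the standard first-order one.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation computed from deviation scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)


def partial_r(r_xy, r_xa, r_ya):
    """Correlation between X and Y with the third variable A held constant."""
    return (r_xy - r_xa * r_ya) / math.sqrt((1 - r_xa ** 2) * (1 - r_ya ** 2))


rng = random.Random(3)
n = 2000
a = [rng.gauss(0, 1) for _ in range(n)]   # third variable A (assumed: temperature)
x = [ai + rng.gauss(0, 1) for ai in a]    # A -> X (e.g., ice cream sales)
y = [ai + rng.gauss(0, 1) for ai in a]    # A -> Y (e.g., violence)

r_xy = pearson_r(x, y)
r_xy_given_a = partial_r(r_xy, pearson_r(x, a), pearson_r(y, a))

print(f"r(X, Y)             = {r_xy:+.2f}")
print(f"r(X, Y) holding A   = {r_xy_given_a:+.2f}")
```

X and Y are clearly correlated even though neither causes the other, and the correlation largely disappears once A is held constant.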

Page 13: Correlations and t scores (2)

Explaining Correlations: Three Possibilities

Page 14: Correlations and t scores (2)

Not Causality

It is essential that you remember this point: A nonexperimental research study could be seriously flawed if you are interested in concluding that an observed relationship is a causal relationship.

That's because "observing a relationship between two variables is not sufficient grounds for concluding that the relationship is a causal relationship."  (Remember this important point!)

Page 15: Correlations and t scores (2)

The Three Necessary Conditions for Cause-and-Effect Relationships

It is essential that you remember that researchers must establish three conditions if they are to make a defensible conclusion that changes in variable A cause changes in variable B. Here are the conditions (which have been agreed upon by researchers) in a summary table:

Page 16: Correlations and t scores (2)

Three Necessary Conditions for Causation

Condition #1: Variable A and Variable B must be related.

Condition #2: Proper time sequence must be established (temporal antecedent condition).

Condition #3: The relationship between Variables A and B must not be due to a confounding extraneous variable (the third-variable, or lack of an alternative explanation, condition).

Page 17: Correlations and t scores (2)

Applying the Three Necessary Conditions for Causation in Nonexperimental Research

Nonexperimental research is much weaker than strong experimental and quasi-experimental research for making justified judgments about cause and effect.

It is, however, quite easy to establish condition 1 in nonexperimental research: just see if the variables are related. For example, are the variables correlated? Or is there a difference between the means?

It is much more difficult to establish conditions 2 and 3 (especially 3).

Page 18: Correlations and t scores (2)

Designs Used to Avoid Pitfalls

When attempting to establish condition 3, researchers use logic and theory (e.g., make a list of extraneous variables that you want to measure in your research study), control techniques (such as statistical control and matching), and design approaches (such as using a longitudinal design rather than a cross-sectional design).

Page 19: Correlations and t scores (2)

Real Life Example

Here is one more example of controlling for a variable: There is a relationship between gender and income in the United States. In particular, men earn more money than women. Perhaps this relationship would disappear if we controlled for the amount of education people had. What do you think? To test this alternative explanation (i.e., it is due not to gender but to education) you could examine the average income levels of males and females at each of the levels of education (i.e., to see if males and females who have equal amounts of education differ in income levels). If gender and income are still related (i.e., if men earn more money than women at each level of education) then you would draw this conclusion: “After controlling for education, there is still a relationship between gender and income.” And, by the way, that is exactly what happens if you examine the real data (actually the relationship becomes a little smaller but there is still a relationship). Can you think of any additional variables you would like to control for? That is, are there any other variables that you think will eliminate the relationship between gender and income?
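
The procedure described above (comparing group means within each level of the control variable) might be sketched as follows. The records are invented for illustration and are not real income data; only the procedure matters. `mean_income_by` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Invented illustrative records: (gender, education level, income in $1000s).
records = [
    ("M", "high school", 40), ("M", "high school", 44),
    ("M", "college", 62), ("M", "college", 66),
    ("M", "college", 64), ("M", "college", 64),
    ("F", "high school", 36), ("F", "high school", 38),
    ("F", "high school", 40), ("F", "high school", 38),
    ("F", "college", 60), ("F", "college", 62),
]


def mean_income_by(records, key):
    """Average income for each group, where key(gender, education) names the group."""
    groups = defaultdict(list)
    for gender, education, income in records:
        groups[key(gender, education)].append(income)
    return {group: mean(values) for group, values in groups.items()}


# Without controlling for education: compare men and women overall.
overall = mean_income_by(records, lambda g, e: g)

# Controlling for education: compare men and women at each education level.
within = mean_income_by(records, lambda g, e: (e, g))

print("overall:", overall)
for level in ("high school", "college"):
    print(level, "M:", within[(level, "M")], "F:", within[(level, "F")])
```

If a difference remains at every education level, the relationship is still present after controlling for education.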

Page 20: Correlations and t scores (2)

Other conditions…

When attempting to establish condition 2, researchers use logic and theory (e.g., we know that biological sex occurs before achievement on a math test). A design approach used for this condition is longitudinal research, because it establishes proper time order.

Condition 3 is a serious problem in nonexperimental research because it is always possible that an observed relationship is "spurious" (i.e., due to some confounding extraneous variable or "third variable").

Page 21: Correlations and t scores (2)

Advantages of Correlational Methods

Allow assessment of behavior as it occurs in people’s everyday lives

Allow study of variables that can’t be studied in experimental designs (e.g., smoking, cancer)

Establish that a relationship between 2 variables exists

One very serious disadvantage

CORRELATION IS NOT CAUSATION!

Page 22: Correlations and t scores (2)

Experimental Method

Cornerstone of psychological research. Used to examine cause-and-effect relationships. Two essential characteristics:

Researcher has control over the experimental procedures to make sure that everything (but the variable being manipulated) stays the same.

Researcher manipulates one variable by changing its value from one level to another. A second variable is observed (measured) to determine whether the manipulation causes changes to occur.

Participants are randomly assigned to different treatment conditions (they cannot self-select).

Page 23: Correlations and t scores (2)

Participant and Environmental Variables

Participant Variables - things like gender, age, and intelligence. Vary from one individual to another

Environmental Variables - lighting, time of day, weather. Must be the same across conditions

Page 24: Correlations and t scores (2)

Random Sampling vs. Random Assignment

Random Sampling

Selecting Ps (participants) to be in the study so that everyone in the population has an equal chance of being in the study.

Representative samples
Generalization

Is it possible?

Random Assignment

Assigning Ps (who are already in the study) to the different conditions so that each P has an equal chance of being in any of the conditions.

Equalizes the conditions of experiment so that it is unlikely that conditions differ because of pre-existing differences

Required for inferences of causality.
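
The two procedures are easy to confuse, so here is a minimal sketch of both. The population of 200 student IDs, the sample size of 20, and the two condition names are assumptions made for illustration.

```python
import random

rng = random.Random(42)

# Hypothetical population of 200 student IDs.
population = [f"student_{i:03d}" for i in range(200)]

# Random sampling: decides WHO is in the study.
# Every member of the population has an equal chance of being selected.
participants = rng.sample(population, 20)

# Random assignment: decides WHICH CONDITION each participant is in.
# Shuffle the people already in the study, then split them into groups.
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
conditions = {
    "experimental": shuffled[:half],
    "control": shuffled[half:],
}

for name, members in conditions.items():
    print(name, len(members), members[:3], "...")
```

Sampling affects how well results generalize to the population; assignment affects whether the conditions are comparable before the treatment.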

Page 25: Correlations and t scores (2)

Variables

Independent Variable
variable that we expect causes an outcome
the antecedent event
variable that the experimenter can control and manipulate

Dependent Variable
the “effect”
the outcome variable
its value depends on the changes introduced by the IV

Page 26: Correlations and t scores (2)

IVs and Conditions

Must have at least two conditions (also called “levels”) of the IV in order to demonstrate that the IV has an effect on the DV. Otherwise, it wouldn’t be a ‘variable’.

Experimental condition (IV present) vs. control condition (IV not present)

Those in the control condition receive no treatment or receive a neutral, placebo treatment. Provides a baseline for comparison with the experimental condition.

Example: interested in mood and helping
experimental group – told they received “A” or “F”
control group – does not receive grade feedback

Page 27: Correlations and t scores (2)

Laboratory Experiments

Conducted in settings in which:

The environment can be controlled. E.g., temperature, amount of light in room

The participants can be carefully studied. E.g., Ps remain in the same seat

Page 28: Correlations and t scores (2)

Field Experiments

Conducted in real-world settings.

Advantage: People are more likely to act naturally.

Disadvantage: Experimenter has less control (“quasi-experiments”).

Quasi-independent variable - independent variable in a nonexperimental study. Typically something the experimenter cannot manipulate, such as gender or smoking

Page 29: Correlations and t scores (2)

Page 30: Correlations and t scores (2)

Researchers are interested in influences on self-esteem. Specifically, researchers want to assess how performing a difficult task under pressure influences college students’ self-esteem. Ps are given a set of anagrams to solve. Half are randomly assigned to receive very easy anagrams, and half are given difficult ones. Crossed with this, half are randomly assigned to be given 10 minutes to complete the anagrams, and half are given 30 minutes to complete the task. After completing as many of the anagrams as they can, Ps are given a questionnaire labeled “Thoughts and Feelings Questionnaire” that is really a measure of self-esteem.

Constructs
IV1: task difficulty
IV2: pressure
DV: self-esteem

Operational definitions
IV1: easy vs. hard
IV2: 10 vs. 30 minutes
DV: score on questionnaire
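
The crossed design in this example has 2 x 2 = 4 conditions. A minimal sketch of building the conditions and randomly assigning participants to them is shown below; the count of 40 participants and the ID format are assumptions, since the slide does not give a sample size.

```python
import itertools
import random
from collections import Counter

difficulty_levels = ["easy", "hard"]   # IV1: task difficulty
time_limits = [10, 30]                 # IV2: pressure (minutes allowed)

# Crossing the two IVs gives every combination of their levels.
cells = list(itertools.product(difficulty_levels, time_limits))

# Hypothetical participants, shuffled so assignment to cells is random.
rng = random.Random(7)
participants = [f"P{i:02d}" for i in range(1, 41)]
rng.shuffle(participants)

# Deal the shuffled participants into the four cells in turn (equal cell sizes).
assignment = {p: cells[i % len(cells)] for i, p in enumerate(participants)}
cell_sizes = Counter(assignment.values())

for cell in cells:
    print(cell, cell_sizes[cell])
```

Each participant ends up with one level of each IV; the DV (the questionnaire score) would then be recorded per participant.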

Page 31: Correlations and t scores (2)

Discrete and Continuous Variables

Discrete Variable - consists of separate, indivisible categories. No values can exist between two neighboring categories.

One choice in a five-point scale (1, 2, 3, 4, 5).

Continuous Variable - infinite number of possible values that fall between any two observed values. Divisible into an infinite number of fractional parts

Low Normal High

Agree, strongly agree, disagree, strongly disagree

Page 32: Correlations and t scores (2)

Scales of Measurement

Nominal Scale
Ordinal Scale
Interval Scale
Ratio Scale

Page 33: Correlations and t scores (2)

Nominal Scales

Nominal scales are the lowest scales of measurement. Numbers are assigned to categories as "names". Which number is assigned to which category is completely arbitrary. Therefore, the only number property of the nominal scale of measurement is identity. The number gives us the identity of the category assigned. The only mathematical operation we can perform with nominal data is to count.

Classifying people according to gender is a common application of a nominal scale. In the example below, the number "1" is assigned to "male" and the number "2" is assigned to "female". We can just as easily assign the number "1" to "female" and "2" to "male". The purpose of the number is merely to name the characteristic or give it "identity".

Page 34: Correlations and t scores (2)

Nominal Scale

As we can see from the graphs, changing the number assigned to "male" and "female" does not have any impact on the data -- we still have the same number of men and women in the data set.
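
The same point can be checked without the graphs. In this sketch the five-person data set is hypothetical and `counts_by_label` is an illustrative helper; the two codings simply swap which number names which category.

```python
from collections import Counter

# Hypothetical data set of five people.
labels = ["male", "female", "female", "male", "female"]

coding_a = {"male": 1, "female": 2}
coding_b = {"female": 1, "male": 2}   # the numbers swapped


def counts_by_label(labels, coding):
    """Code the data numerically, count each code, then translate back to labels."""
    codes = [coding[label] for label in labels]
    decode = {number: label for label, number in coding.items()}
    return {decode[code]: count for code, count in Counter(codes).items()}


print(counts_by_label(labels, coding_a))
print(counts_by_label(labels, coding_b))
```

Both codings give the same counts, because on a nominal scale the numbers only name the categories.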

Page 35: Correlations and t scores (2)

Ordinal Scale

Ordinal scales have the property of magnitude as well as identity. The numbers represent a quality being measured (identity) and can tell us whether a case has more of the quality measured or less of the quality measured than another case (magnitude). The distance between scale points is not equal. Ranked preferences are presented as an example of ordinal scales encountered in everyday life. We also address the concept of unequal distance between scale points.

Page 36: Correlations and t scores (2)

Coke, Pepsi or Sprite

Ranked Preferences

We are often interested in preferences for different tastes, especially if we are planning a party. Let's say that we asked the three students pictured below to rank their preferences for four different sodas. We usually rank our strongest preference as "1". With four sodas, our lowest preference would be "4". For each soda, we assign a rank that tells us the order (magnitude) of the preference for that particular soda (identity). The number simply tells us that we prefer one soda over another, not "how much" more we prefer the soda.

Changing the numbers changes the meaning of the preferences.

Page 37: Correlations and t scores (2)

Interval Scales

Interval scales have the properties of:
identity
magnitude
equal distance

The equal distance between scale points allows us to know how many units greater than, or less than, one case is from another on the measured characteristic. So, we can always be confident that the meaning of the distance between 25 and 35 is the same as the distance between 65 and 75. Interval scales DO NOT have a true zero point; the number "0" is arbitrary.

Page 38: Correlations and t scores (2)

IQ Scores are examples of an Interval Scale

Page 39: Correlations and t scores (2)

Ratio Scales

Ratio scales of measurement have all of the properties of the abstract number system:
identity
magnitude
equal distance
absolute/true zero

These properties allow us to apply all of the possible mathematical operations (addition, subtraction, multiplication, and division) in data analysis. The absolute/true zero allows us to know how many times greater one case is than another. Scales with an absolute zero and equal interval are considered ratio scales.
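
A short sketch of why the true zero matters. Temperature (Celsius vs. Fahrenheit) and length (centimeters vs. inches) are examples chosen here, not taken from the slides: temperature in these units is an interval scale with an arbitrary zero, while length is a ratio scale.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit (both have an arbitrary zero point)."""
    return celsius * 9 / 5 + 32


def cm_to_in(cm):
    """Convert centimeters to inches (both share a true zero: no length)."""
    return cm / 2.54


# Interval scale: equal distances stay equal after changing units...
gap_low = c_to_f(35) - c_to_f(25)
gap_high = c_to_f(75) - c_to_f(65)
print("gaps in Fahrenheit:", gap_low, gap_high)

# ...but "how many times greater" depends on the units, so it is not meaningful.
ratio_c = 20.0 / 10.0
ratio_f = c_to_f(20.0) / c_to_f(10.0)
print("temperature ratios:", ratio_c, ratio_f)

# Ratio scale: "how many times greater" is the same in any units.
ratio_cm = 20.0 / 10.0
ratio_in = cm_to_in(20.0) / cm_to_in(10.0)
print("length ratios:", ratio_cm, round(ratio_in, 6))
```

On the interval scale only differences survive a change of units; on the ratio scale both differences and ratios do.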

Page 40: Correlations and t scores (2)
Page 41: Correlations and t scores (2)
Page 42: Correlations and t scores (2)

Where do t scores come in?

We have studied single variables, using measures such as central tendency.

However, most researchers look for relationships between variables (at least two)

The foundation to understanding the relationship between them is the correlation coefficient.

Page 43: Correlations and t scores (2)

More than just one correlation

The most widely used in education is the Pearson Product Moment correlation coefficient.

1) When do we use it?
2) What does it tell us?

1. We use coefficients to see how variables are related to one another.
2. We use Pearson PMC when the variables being examined are ratio or interval variables (continuous variables)
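
For reference, the Pearson coefficient is commonly written in deviation-score form as below. The slides do not give a formula, so the notation (X and Y for the two variables, bars for their means) is an assumption.

```latex
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^{2} \, \sum (Y - \bar{Y})^{2}}}
```

The numerator measures how much X and Y vary together; the denominator measures how much they vary separately. Both rely on distances from the mean, which is why interval or ratio measurement is needed.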

Page 44: Correlations and t scores (2)

Examples

1. How is the time you spend studying related to your exam scores?

Assumptions for questions in your research:

A. More time spent studying yields higher scores on the exam.

B. Students will do well on an exam if they spend more time studying.

C. Some students will not do well on the exam despite the hours spent studying.


Page 45: Correlations and t scores (2)

Consider other possibilities

Students may study longer hours because they do not understand the material.

Students will do well in the exam regardless of the time spent studying.

Are the above exceptions to my rule?


Page 46: Correlations and t scores (2)

How do we address this dilemma?

Two fundamental characteristics of correlations are:

A) Direction: positive (+) or negative (-)

B) Strength: r = .80 (is this low, moderate, or high?)

Going back to our example, if students study more, then what is the direction of the correlation? What if they spend less time studying?


Page 47: Correlations and t scores (2)

What happens when…

Students spend more time studying and the scores go down. Why?

Students spend the least time studying and the scores go higher. Why?

So what is the direction in these cases, and how do we describe this in terms of the direction of my correlation?

Draw a simple scattergram showing each of the above examples.


Page 48: Correlations and t scores (2)

Which is which?

Page 49: Correlations and t scores (2)

Strength (Magnitude) of Correlations

Correlations may range from -1.0 to +1.0. A correlation of 0 indicates no relationship.

The closer the correlation is to -1.0 or +1.0, the stronger the relationship.

Perfect correlations are almost never found in practice. Instead we find:

Between -.20 and +.20 = weak

Above .20 to .50 = moderate

Above .50 to .70 = strong

Some correlations, despite their magnitude, do not yield any value. However, weak correlations may be of great significance.
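The cut-offs above can be turned into a small helper. This is only a sketch: the function name `describe_correlation` is made up for illustration, and the label used above .70 is an assumption, since the slide gives no label for that range.

```python
def describe_correlation(r):
    """Return (direction, strength) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and +1.0")

    # Direction comes from the sign of r.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    # Strength comes from the size of r, ignoring the sign.
    size = abs(r)
    if size <= 0.20:
        strength = "weak"
    elif size <= 0.50:
        strength = "moderate"
    elif size <= 0.70:
        strength = "strong"
    else:
        strength = "very strong"  # assumption: the slide gives no label above .70

    return direction, strength


print(describe_correlation(0.80))   # ('positive', 'very strong')
print(describe_correlation(-0.35))  # ('negative', 'moderate')
```

Keep in mind that these labels are rules of thumb; whether a correlation matters depends on the research question.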


Page 50: Correlations and t scores (2)

Examine the following Example

r is the symbol for correlations.

Student      Hours Spent Studying (X)   Exam Score (Y)
Student 1    5                          80
Student 2    6                          85
Student 3    7                          70
Student 4    8                          90
Student 5    9                          85

*Scores must be paired because each student's X and Y scores are converted to z scores (by subtracting the mean and dividing by the standard deviation) and the two z scores are then multiplied together.

Page 51: Correlations and t scores (2)

Consider the Following Scenarios

One set of high scores in variable x is associated with a set of high scores in variable y.

One set of low scores in variable x is associated with a set of low scores in variable y.

When two factors are positive, you get a positive product.

When two factors are negative, you also get a positive product.

Explain what these statements mean with respect to correlations.


Page 52: Correlations and t scores (2)

What happens when…

High scores in one variable are associated with low scores in the other variable?

Do not read more into the association between the variables than is there!

Correlation simply means that variations in the scores of one variable correspond to variations in the scores of another variable.


Page 53: Correlations and t scores (2)

Are correlations always linear?

A curvilinear relationship may begin positive and then turn negative. There are some explanations for this when it happens. Consider anxiety before a test: some anxiety may help to improve the scores. However, too much of it can lower the scores. A curvilinear relationship can also show up as a weak correlation.

Page 54: Correlations and t scores (2)

Correlations can be calculated in various ways:

X Values   Y Values
5          80
6          85
7          70
8          90
9          85

X Value   Y Value   X*Y             X*X (X²)      Y*Y (Y²)
5         80        5 * 80 = 400    5 * 5 = 25    80 * 80 = 6400
6         85        6 * 85 = 510    6 * 6 = 36    85 * 85 = 7225
7         70        7 * 70 = 490    7 * 7 = 49    70 * 70 = 4900
8         90        8 * 90 = 720    8 * 8 = 64    90 * 90 = 8100
9         85        9 * 85 = 765    9 * 9 = 81    85 * 85 = 7225

Step 1: Find ΣX, ΣY, ΣXY, ΣX², ΣY².

ΣX =             ΣY =             ΣXY =             ΣX² =             ΣY² =

Step 2: Now substitute into the formula:

$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$$
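If you want to check your hand calculation, the two steps can be sketched in a few lines of Python. The function name `pearson_r` and the variable names are chosen here for illustration; the data are the five pairs of X and Y values from the table.

```python
import math

hours = [5, 6, 7, 8, 9]         # X values from the table
scores = [80, 85, 70, 90, 85]   # Y values from the table


def pearson_r(x, y):
    """Pearson correlation using the raw-score (sums) formula."""
    if len(x) != len(y):
        raise ValueError("scores must be paired: x and y need the same length")
    n = len(x)

    # Step 1: find the five sums.
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)

    # Step 2: substitute the sums into the formula.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(hours, scores)
print(round(r, 2))  # 0.31
```

Try Step 1 by hand first, then compare your sums with the ones the function computes.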

Page 55: Correlations and t scores (2)

Review language for formulas

Hinton page 259

Page 56: Correlations and t scores (2)

This is another way

This means:

• Calculate everybody's Z score.

• Multiply your Zx by your Zy.

• Add up these pairs for everyone.

• Divide by the number of people or observations.

So if we multiply your Z scores together, sum all these pairs, and get a positive sum (positive scores for X multiplied by positive scores for Y, and negative scores for X multiplied by negative scores for Y), we have a positive correlation.

If you have a negative relationship, you will have the sum of positive Zxs times negative Zys and vice versa. Add these up for a negative sum.

If you have no correlation, then you get equal numbers of positive Zx times negative Zy, positive Zx times positive Zy, negative Zx times negative Zy, and negative Zx times positive Zy. Add these up and you get zero.
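The four steps above can be sketched in Python as follows. One assumption to flag: dividing the sum by the number of observations gives Pearson's r only when the z scores use the population standard deviation (dividing by N, not N - 1), so that is what this sketch does. The function names are made up for illustration, and the data are the five hours and exam scores used earlier.

```python
import math


def z_scores(values):
    """Convert raw scores to z scores using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_r_from_z(x, y):
    """Multiply each pair of z scores, add the products, divide by N."""
    zx = z_scores(x)
    zy = z_scores(y)
    products = [a * b for a, b in zip(zx, zy)]
    return sum(products) / len(x)


hours = [5, 6, 7, 8, 9]
scores = [80, 85, 70, 90, 85]

r = pearson_r_from_z(hours, scores)
print(round(r, 2))  # 0.31
```

Most of the products of z scores here are positive, which is why the sum, and therefore r, comes out positive.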

Page 57: Correlations and t scores (2)

Correlations may be used when…

In the simple case of correlational research you have one quantitative IV (e.g., level of motivation) and one quantitative DV (performance on a math test).

· The researcher checks to see if the observed correlation is statistically significant (i.e., unlikely to be due to chance alone) using the "t-test for correlation coefficients", which tells you whether the relationship is statistically significant.

· Remember that the commonly used correlation coefficient (i.e., the Pearson correlation) only detects linear relationships.


Page 58: Correlations and t scores (2)

Let’s go back to our example

Student   No. of Hours Spent Studying   Exam Score
Joyce     0                             95
Ashley    2                             95
Jeff      4                             100
Sean      7                             95
Pedro     10                            100

What can we conclude from the above examples?

Page 59: Correlations and t scores (2)

We want to know…

1. Is there a correlation?

2. How strong is the correlation, and what is its direction?

3. Is it statistically significant?

4. Is there a possibility of another extraneous factor that gave us a false correlation?


Page 60: Correlations and t scores (2)

If you were testing the null hypothesis…

This is where the t test comes along.

Your null hypothesis is H0: ρ = 0 (rho, the population correlation coefficient).

Your alternative hypothesis is H1: ρ ≠ 0.

The t distribution is used to test whether a correlation coefficient is statistically significant.

The t test, just like z scores, consists of a ratio:

$$t = \frac{r - \rho}{s_r}$$


Page 61: Correlations and t scores (2)

t tests

$$t = \frac{r - \rho}{s_r}$$

r is the sample correlation coefficient

ρ is the population correlation coefficient

s_r is the standard error of the sample correlation coefficient

We do not need to calculate the standard error of the sample correlation coefficient separately, because algebra allows us to use this formula:

$$s_r = \sqrt{\frac{1 - r^2}{N - 2}}$$
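To see how the ratio and the standard error fit together, here is a short sketch of the algebra. It assumes the null hypothesis value ρ = 0, so the numerator is simply r.

```latex
t = \frac{r - \rho}{s_r}
  = \frac{r - 0}{\sqrt{\dfrac{1 - r^2}{N - 2}}}
  = r\sqrt{\frac{N - 2}{1 - r^2}}
```

This is why only r and N are needed to find t, with N - 2 degrees of freedom.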


Page 62: Correlations and t scores (2)

Find a t value

Go back to your formula, where

$$t = (.25)\sqrt{\frac{100 - 2}{1 - .25^2}}$$

You can skip the math now, and we get: t = 2.56, df = 98

Using Appendix B we find that a t score of 2.56 has a probability between .01 and .02 of occurring by chance. If the researcher has used an alpha value of .05, then the null hypothesis is rejected and you can conclude that the correlation is significant by accepting the alternative hypothesis.

Thus r = .25, t(98) = 2.56, p < .05
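For those who would rather not skip the math, the calculation can be sketched in Python using only the standard library. The function names are made up for illustration. Instead of looking up Appendix B, the sketch approximates the two-tailed probability by numerically integrating the t distribution with Simpson's rule; that is a stand-in chosen here, and a statistics package would normally do this lookup for you.

```python
import math


def t_for_correlation(r, n):
    """t statistic for testing H0: rho = 0, with df = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def t_pdf(x, df):
    """Height of Student's t distribution at x."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log(1 + x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed probability, using Simpson's rule on the interval [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # probability between 0 and |t|
    return 1 - 2 * area    # what is left in the two tails


r, n = 0.25, 100
t = t_for_correlation(r, n)
df = n - 2
p = two_tailed_p(t, df)

print(round(t, 2), df)   # 2.56 98
print(0.01 < p < 0.02)   # True
```

The result agrees with the table lookup: the probability falls between .01 and .02, which is below an alpha of .05.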


Page 63: Correlations and t scores (2)

You will learn more about alpha values.

The alpha value is the probability of making a Type I error. In a hypothesis test, a Type I error occurs when statistically unlikely test results lead to the incorrect conclusion that the null hypothesis should be rejected. The alpha value is conventionally designated by the symbol alpha (α).

The alpha value is also called the 'level of significance' and is selected based on the importance of the test. A value of 0.05 is common, with 0.01 for tests that are critical, and even 0.001 in some cases.

A hypothesis test involves comparing the calculated p-value with the previously agreed alpha value. If the p-value is less than the alpha value, the null hypothesis is rejected and the alternative hypothesis is accepted.
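The comparison itself is simple enough to sketch in Python. The function name `decide` is made up for illustration, and the p-value of 0.012 is an assumed example value, not a result from any data set.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the alpha value agreed on before the test."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


p_value = 0.012  # illustrative value only

for alpha in (0.05, 0.01, 0.001):
    print(alpha, decide(p_value, alpha))
# 0.05 reject H0
# 0.01 fail to reject H0
# 0.001 fail to reject H0
```

The same p-value leads to different decisions under different alpha values, which is why alpha should be chosen before looking at the results.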

 


Page 64: Correlations and t scores (2)

The formula for calculating the t value is

$$t = r\sqrt{\frac{N - 2}{1 - r^2}}$$

Consider the following example: I randomly select 100 people living at different latitudes to measure whether the number of hours of exposure to sunlight is related to seasonal affective disorder (SAD), measured by a mood scale of 1-10 where 1 = “very sad” and 10 = “very happy”. Suppose that I have computed a Pearson correlation with my data and I find that the above variables give me a correlation of .25. I want to know if this is statistically significant. So what do I do?


Page 65: Correlations and t scores (2)

Summary: A picture is worth a thousand words.

Page 66: Correlations and t scores (2)

My Easy Way…

http://easycalculation.com/statistics/correlation.php

http://easycalculation.com/statistics/learn-correlation.php

http://www.wadsworth.com/psychology_d/templates/student_resources/workshops/stat_workshp/scale/scale_08.html
