PASW Statistics (SPSS)
1. chapter i(pasw)

Jan 17, 2017

Chhom Karath
Transcript
Page 1: 1. chapter i(pasw)

PASW Statistics (SPSS)

Page 2: 1. chapter i(pasw)

What is PASW?

Predictive

Analytics

Software

Page 3: 1. chapter i(pasw)

What is Statistics?

• Statistics is a set of mathematical techniques used to:
– Summarize research data.
– Determine whether the data supports the researcher’s hypothesis.

Research Stages

1. Planning and Designing

2. Data Collecting

3. Data Analyzing

4. Data Reporting

5. Deployment

Page 4: 1. chapter i(pasw)

SPSS
• SPSS was acquired by IBM in 2009.
• It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations, data miners, and others.
• Companion products in the same family are used for survey authoring and deployment (IBM SPSS Data Collection), data mining (IBM SPSS Modeler), text analytics, and collaboration and deployment (batch and automated scoring services).
• The software name stands for Statistical Package for the Social Sciences (SPSS), reflecting the original market, although the software is now popular in other fields as well, including the health sciences and marketing.

Page 5: 1. chapter i(pasw)

SPSS
• Run the tutorial: open an existing sample lesson
• Type in data: enter the data ourselves
• Run an existing query: reuse a query that already exists
• Create new query using Database Wizard: import a database table
• Open
• SPSS Data Editor
• Open => Data
• Column header = variable
• Row header = case number
• The Data Editor provides two views of your data:

– Data View. This view displays the actual data values or defined value labels.

– Variable View. This view displays variable definition information, including defined variable and value labels, data type (for example, string, date, or numeric), measurement level (nominal, ordinal, or scale), and user-defined missing values.

Page 6: 1. chapter i(pasw)

Data View
• Data View is similar to the features found in spreadsheet applications.
• Rows are cases. Each row represents a case or an observation. For example, each individual respondent to a questionnaire is a case.
• Columns are variables. Each column represents a variable or characteristic that is being measured. For example, each item on a questionnaire is a variable.

• Cells contain values. Each cell contains a single value of a variable for a case. The cell is where the case and the variable intersect. Cells contain only data values. Unlike spreadsheet programs, cells in the Data Editor cannot contain formulas.

• The data file is rectangular. The dimensions of the data file are determined by the number of cases and variables. You can enter data in any cell. If you enter data in a cell outside the boundaries of the defined data file, the data rectangle is extended to include any rows and/or columns between that cell and the file boundaries. There are no "empty" cells within the boundaries of the data file. For numeric variables, blank cells are converted to the system-missing value. For string variables, a blank is considered a valid value.

Page 7: 1. chapter i(pasw)

Variable View
• Column Name: name of the variable (Name, Age, Sex, Edu, Income)
– Each variable name must be unique; duplication is not allowed.
– Variable names can be up to 64 bytes long, and the first character must be a letter or one of the characters @, #, or $.
– Variable names cannot contain spaces.
– Reserved keywords cannot be used as variable names. Reserved keywords are ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, and WITH.

• Column Type/Variable type: data type of the variable
– Numeric: A variable whose values are numbers. Values are displayed in standard numeric format.
– Comma: A numeric variable whose values are displayed with commas delimiting every three places and with the period as a decimal delimiter.
– Dot: A numeric variable whose values are displayed with periods delimiting every three places and with the comma as a decimal delimiter.

Page 8: 1. chapter i(pasw)

Variable View
• Nominal
– A variable can be treated as nominal when its values represent categories with no intrinsic ranking (for example, the department of the company in which an employee works). Examples of nominal variables include region, zip code, and religious affiliation.
• Ordinal
– A variable can be treated as ordinal when its values represent categories with some intrinsic ranking (for example, levels of service satisfaction from highly dissatisfied to highly satisfied). Examples of ordinal variables include attitude scores representing degree of satisfaction or confidence and preference rating scores.
• Scale
– A variable can be treated as scale (continuous) when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars.

Page 9: 1. chapter i(pasw)

Variable View
• Column Type/Variable type: data type of the variable

– Scientific notation: A numeric variable whose values are displayed with an embedded E and a signed power-of-10 exponent. The exponent can be preceded by E or D with an optional sign or by the sign alone--for example, 123, 1.23E2, 1.23D2, 1.23E+2, and 1.23+2.

– Date. A numeric variable whose values are displayed in one of several calendar-date or clock-time formats.

– Dollar. A numeric variable displayed with a leading dollar sign ($), commas delimiting every three places, and a period as the decimal delimiter.

– Custom currency. A numeric variable whose values are displayed in one of the custom currency formats that you have defined on the Currency tab of the Options dialog box.

– String. A variable whose values are not numeric and therefore are not used in calculations.

– Restricted numeric. A variable whose values are restricted to non-negative integers.

Page 10: 1. chapter i(pasw)

Variable View
• Variable labels
– You can assign descriptive variable labels up to 256 characters (128 characters in double-byte languages). Variable labels can contain spaces and reserved characters that are not allowed in variable names.
• Value labels
– You can assign descriptive value labels for each value of a variable. This process is particularly useful if your data file uses numeric codes to represent non-numeric categories (for example, codes of 1 and 2 for male and female).

– Value labels are saved with the data file. You do not need to redefine value labels each time you open a data file. Value labels can be up to 120 bytes.

Page 11: 1. chapter i(pasw)

Variable View
• Missing values

Page 12: 1. chapter i(pasw)

Replace Missing Values

(age1 + age2 + age3 + age4)/4

Page 13: 1. chapter i(pasw)

Import an Excel File
• File => Open => Data => select the Excel file type

Page 14: 1. chapter i(pasw)

Run New Query
• File => Open Database => New Query
– Select the table to import

Page 15: 1. chapter i(pasw)

Question #1
Data View

Variable View

1. Value labels: 1 = Male, 2 = Female

2. Measure: Gender = Nominal, Height = Scale

3. Analyze => Descriptive Statistics => Frequencies; Variable = Gender

4. Analyze => Descriptive Statistics => Frequencies; Variable = Height; then click Statistics

5. Analyze => Compare Means => Means

Page 16: 1. chapter i(pasw)

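• Frequency: the number of cases that have each value of the variable
• Percent: the percentage based on all cases, valid and missing
• Valid Percent: the percentage based on valid cases only
• Cumulative Percent: the running sum of the valid percents
• Ex: 14.3; 14.3+14.3=28.6; 14.3+14.3+42.9=71.4; 14.3+14.3+42.9+28.6=100.0 (SPSS sums the unrounded percentages, so a displayed value can differ by 0.1 from the sum of the rounded ones)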
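
The columns of a frequency table can be reproduced by hand. The sketch below is only an illustration in Python, not SPSS output: the data values are hypothetical (seven valid cases plus one missing value, written as None), chosen so that the valid percents match the example figures 14.3, 14.3, 42.9 and 28.6.

```python
from collections import Counter

# Hypothetical data: 7 valid cases plus 1 missing value (None).
values = [1, 2, 3, 3, 3, 4, 4, None]

valid = [v for v in values if v is not None]
counts = Counter(valid)

n_total = len(values)  # valid + missing cases
n_valid = len(valid)   # valid cases only

rows = []
cumulative_count = 0
for value in sorted(counts):
    frequency = counts[value]
    cumulative_count += frequency
    rows.append({
        "value": value,
        "frequency": frequency,
        # Percent uses all cases, including missing ones
        "percent": round(100 * frequency / n_total, 1),
        # Valid percent uses valid cases only
        "valid_percent": round(100 * frequency / n_valid, 1),
        # Cumulative percent is computed from the running count,
        # not from the already-rounded valid percents
        "cumulative_percent": round(100 * cumulative_count / n_valid, 1),
    })

for row in rows:
    print(row)
```

Because the cumulative column is built from the running count, the third row comes out as 71.4 even though the rounded valid percents add up to 71.5.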

Page 17: 1. chapter i(pasw)

Standard deviation
• The standard deviation (SD, sigma, σ) measures the amount of variation or dispersion from the average.
• A low standard deviation indicates that the data points tend to be very close to the mean (also called the expected value).
• A high standard deviation indicates that the data points are spread out over a large range of values.
• For example, consider a population consisting of the following eight values: 2, 4, 4, 4, 5, 5, 7, 9
• These eight data points have a mean (average) of 5: (2+4+4+4+5+5+7+9)/8 = 5
• Squared difference of each data point from the mean:
(2-5)² = 9   (5-5)² = 0
(4-5)² = 1   (5-5)² = 0
(4-5)² = 1   (7-5)² = 4
(4-5)² = 1   (9-5)² = 16
• Variance = (9+1+1+1+0+0+4+16)/8 = 4
• Standard deviation = sqrt(4) = 2
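
The same calculation can be checked in a few lines of Python. This is a sketch for verification only; it treats the eight values as a whole population (dividing by N), as the example does. The last line is an addition for comparison: it computes the sample standard deviation (dividing by N-1), which is the version statistical packages such as SPSS usually report in their descriptive output.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
squared_diffs = [(x - mean) ** 2 for x in data]

# Population variance: average of the squared differences (divide by N)
variance = sum(squared_diffs) / len(data)
std_dev = math.sqrt(variance)

# Sample standard deviation (divide by N - 1), shown only for comparison
sample_sd = statistics.stdev(data)

print(mean, variance, std_dev, round(sample_sd, 3))
```

For these eight values the population standard deviation is 2, while the sample version is slightly larger (about 2.138).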

Page 18: 1. chapter i(pasw)

Finding the Mode
• First put the numbers in order.
• Then count how many times each number appears.
• EX1: 3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
– In order, the numbers are: 3, 5, 7, 12, 13, 14, 20, 23, 23, 23, 23, 29, 39, 40, 56
– The number that appears most often: 23
• EX2: {19, 8, 29, 35, 19, 28, 15}
– In order, the numbers are: {8, 15, 19, 19, 28, 29, 35}
– The number that appears most often: 19
• Note: there can be more than one mode
• EX3: {1, 3, 3, 3, 4, 4, 6, 6, 6, 9}
– So there are two modes: 3 and 6
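
The three examples can be checked with Python's standard library. This is a small sketch, assuming Python 3.8 or later, where statistics.multimode returns every value that ties for the highest count.

```python
from statistics import multimode

ex1 = [3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
ex2 = [19, 8, 29, 35, 19, 28, 15]
ex3 = [1, 3, 3, 3, 4, 4, 6, 6, 6, 9]

# multimode returns a list, so a data set with two modes yields two values
modes_ex1 = multimode(ex1)
modes_ex2 = multimode(ex2)
modes_ex3 = multimode(ex3)

print(modes_ex1, modes_ex2, modes_ex3)
```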

Page 19: 1. chapter i(pasw)

Question #2

1. Select Transform => Compute Variable
• Target Variable: average
• Numeric Expression: (v1+v2)/2
• OK

2. Save => Open existing data (*.sav)

3. Transform => Recode into Different Variables
• Double-click Average
• Name: grade
• Click: Change
• Old and New Values
• Range ... through ...

Range   Through   Value
0       14.9      0
15      19.9      1
20      24.9      2
25      30        3
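
The compute and recode steps above can be sketched in Python to make the logic explicit. The function names compute_average and recode_grade are hypothetical, introduced only for this illustration. The sketch assumes that a value matching none of the ranges (for example 14.95, which falls between 14.9 and 15) gets no grade, represented here as None.

```python
def compute_average(v1, v2):
    """Mirror the numeric expression (v1+v2)/2."""
    return (v1 + v2) / 2


def recode_grade(average):
    """Map an average score to a grade code using the ranges in the table."""
    bands = [
        (0, 14.9, 0),
        (15, 19.9, 1),
        (20, 24.9, 2),
        (25, 30, 3),
    ]
    for low, high, grade in bands:
        if low <= average <= high:
            return grade
    return None  # no range matched: the value is in a gap or outside 0-30


# Hypothetical exam scores for three cases
scores = [(12, 16), (18, 22), (28, 30)]
grades = [recode_grade(compute_average(v1, v2)) for v1, v2 in scores]
print(grades)
```

The gap between 14.9 and 15 is worth noticing: if averages can take values such as 14.95, the ranges should be widened so that every possible value is covered.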

Page 20: 1. chapter i(pasw)

Question #3
1. Analyze => Descriptive Statistics => Descriptives => Variables:
• Age
• Exam1
• Exam2
• Average
• Grade

Page 21: 1. chapter i(pasw)

Median value
• The median is the "middle number" in a sorted list of numbers.
• For example:
• Example 1: find the median of 12, 3 and 5
– Put them in order: 3, 5, 12
– The middle number is 5, so the median is 5
• Example 2: 3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
– Put them in order: 3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56
– There are 15 numbers, so the middle one is the 8th: 23. The median is 23.
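
Both median examples can be verified with a short Python sketch. The manual version sorts the list and picks the middle position, which works as written only for lists with an odd number of values, as in these two examples.

```python
from statistics import median

example1 = [12, 3, 5]
example2 = [3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]

# Manual approach for an odd-length list: sort, then take the middle item
ordered = sorted(example2)
middle_value = ordered[len(ordered) // 2]

# Library approach (also handles even-length lists)
median1 = median(example1)
median2 = median(example2)

print(median1, median2, middle_value)
```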

Page 22: 1. chapter i(pasw)

Question #4
• Graphs => Legacy Dialogs => Histogram
• Variable = Grade

Page 23: 1. chapter i(pasw)

Variance
• The variance is the average of the squared differences from the mean.
• To calculate the variance, follow these steps:
– Work out the mean (the simple average of the numbers).
– Then for each number: subtract the mean and square the result (the squared difference).
– Then work out the average of those squared differences.
• Example: Mean = (600 + 470 + 170 + 430 + 300)/5 = 394
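
The slide stops at the mean, so the remaining steps are carried through here as a Python sketch. It uses the same five numbers and divides by the number of values, as the steps describe; the intermediate figures are computed, not taken from the slides.

```python
values = [600, 470, 170, 430, 300]

# Step 1: the mean
mean = sum(values) / len(values)

# Step 2: subtract the mean from each number and square the result
squared_diffs = [(v - mean) ** 2 for v in values]

# Step 3: average the squared differences
variance = sum(squared_diffs) / len(values)

print(mean)           # 394.0
print(squared_diffs)  # [42436.0, 5776.0, 50176.0, 1296.0, 8836.0]
print(variance)       # 21704.0
```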

Page 24: 1. chapter i(pasw)

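Variance
• Frequency: the count or number of cases
• Valid: cases with a valid value, that is, neither system-missing nor user-missing
• Missing: user-missing values or system-missing values (shown as .)
• Percent: the percentage of all cases, including those with missing data
• Valid percent: the percentage of cases among non-missing values only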

Page 25: 1. chapter i(pasw)

Cumulative percentage
• Another way of expressing a frequency distribution.
• Cumulative percentage = (cumulative frequency ÷ n) × 100

Page 26: 1. chapter i(pasw)

Question #5

• Data => Select Cases
• If condition is satisfied
• Click If
• Gender = “1” (use quotation marks if Gender is a string variable)

Page 27: 1. chapter i(pasw)

Question #6

- Transform => Compute Variable
- Numeric Expression: MEAN(q1, q2, q3, q4)
- OK

• This computes a new variable using a mathematical operation.
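
What a per-case mean of q1 to q4 does can be sketched in Python. The helper mean_of_valid, the quiz_mean key, and the data are hypothetical. The sketch assumes that missing answers (None here) are skipped and the mean is taken over the remaining valid values, which is how a MEAN-style function differs from writing (q1+q2+q3+q4)/4.

```python
def mean_of_valid(*scores):
    """Mean of the non-missing scores; None if every score is missing."""
    valid = [s for s in scores if s is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical cases; None marks a missing quiz score
cases = [
    {"q1": 8, "q2": 6, "q3": 10, "q4": 8},
    {"q1": 7, "q2": None, "q3": 9, "q4": 5},
]

for case in cases:
    case["quiz_mean"] = mean_of_valid(
        case["q1"], case["q2"], case["q3"], case["q4"]
    )

print([case["quiz_mean"] for case in cases])
```

The second case has only three valid scores, so its mean is 21/3 rather than 21/4.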

Page 28: 1. chapter i(pasw)
Page 29: 1. chapter i(pasw)

Question #6 (continued)
Selecting Cases
• You can select cases either by filtering (which keeps all the cases but limits further analyses to the selected cases) or by removing the cases that do not meet your criteria.
- Data => Select Cases => If condition is satisfied => If => sex=1 => Continue => OK
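
The difference between filtering and removing cases can be shown with a small Python sketch. The data and the selected flag are hypothetical and only illustrate the idea: filtering keeps every case and marks which ones take part in the analysis, while removing discards the rest.

```python
# Hypothetical data set: sex coded 1 or 2
cases = [
    {"id": 1, "sex": 1, "age": 20},
    {"id": 2, "sex": 2, "age": 22},
    {"id": 3, "sex": 1, "age": 19},
]

# Option 1: filter. Every case stays in the data; a flag marks the selection.
for case in cases:
    case["selected"] = case["sex"] == 1
analysis_cases = [c for c in cases if c["selected"]]

# Option 2: remove. Cases that fail the condition are dropped from the copy.
remaining = [c for c in cases if c["sex"] == 1]

print(len(cases), len(analysis_cases), len(remaining))
```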

Page 30: 1. chapter i(pasw)

Question #6 (continued)
Sorting Cases
• We can sort on one or more variables. For example, we may want to sort the records in our dataset by age and sex.
- Data => Sort Cases => select the variables to sort by
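
Sorting on more than one variable means the second variable only breaks ties in the first. A minimal Python sketch with hypothetical records, sorted by age and then by sex:

```python
# Hypothetical records
records = [
    {"name": "A", "age": 21, "sex": 2},
    {"name": "B", "age": 19, "sex": 1},
    {"name": "C", "age": 21, "sex": 1},
    {"name": "D", "age": 19, "sex": 2},
]

# Sort by age first; within the same age, sort by sex
sorted_records = sorted(records, key=lambda r: (r["age"], r["sex"]))

print([r["name"] for r in sorted_records])
```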

Page 31: 1. chapter i(pasw)

Question #6 (continued)
Splitting a File
• Splitting a file creates separate "layers" for the grouping variables.
• Data => Split File => Organize output by groups (Sex of student)

Page 32: 1. chapter i(pasw)

Question #6 (continued)
Descriptives
• To calculate the means and standard deviations for age, all quizzes, and the average quiz score:
• Analyze => Descriptive Statistics => Descriptives

Page 33: 1. chapter i(pasw)

Question #6 (continued)
Exploring Means for Different Groups
• When you have two or more groups, you may want to examine the mean for each group as well as the overall mean.
• Select Analyze => Compare Means => Means
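
Comparing group means with the overall mean can be sketched in Python as follows. The student data and the variable names are hypothetical; the sketch groups scores by sex and reports one mean per group plus the mean over all cases.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical students: sex coded 1 or 2, with an average quiz score
students = [
    {"sex": 1, "average": 20.0},
    {"sex": 1, "average": 24.0},
    {"sex": 2, "average": 18.0},
    {"sex": 2, "average": 26.0},
    {"sex": 2, "average": 25.0},
]

# Collect the scores of each group
groups = defaultdict(list)
for student in students:
    groups[student["sex"]].append(student["average"])

group_means = {sex: mean(scores) for sex, scores in groups.items()}
overall_mean = mean(s["average"] for s in students)

print(group_means, overall_mean)
```

The overall mean is not the average of the group means when the groups differ in size, which is why it is worth looking at both.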

Page 34: 1. chapter i(pasw)

Question #6 (continued)
Frequency Distributions and Histograms
• Select Analyze => Descriptive Statistics => Frequencies
• Click Charts => Histograms