1 Revisiting salary discrimination @ Acme Bank: Background
• A bank is facing a discrimination suit in which it is accused of paying its female employees less than its male employees
• The bank had 208 employees in 1995: 140 females and 68 males
Difference in salaries
• Female salaries are lower than male salaries on average by $8,295 (coefficient of Gender; Gender=0 for male and Gender=1 for female)
• Although R-square is low (12%), the low p-value indicates that the difference in salaries is statistically significant
• Simple regression only looks at things at the gross or surface level
• Multiple regression helps us net out the effects of other important variables, such as prior experience, job grade, educational background etc. We need to include these additional variables (information) in the analysis to see if salaries still differ once they are taken into account
• We expand the model by including prior experience (YrsPrior) in the banking industry, and years at this bank (YrsExp)
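The contrast between the gross (simple regression) and net (multiple regression) Gender effect can be sketched in code. The example below is only an illustration on made-up numbers, not the Acme Bank data; the helper `fit_ols` and all values are assumptions. With a single 0/1 Gender variable, the coefficient equals the difference in group mean salaries, and adding YrsExp changes it to the gap that remains at equal experience.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ [1, X]; the intercept comes first."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic, illustrative data (NOT the Acme Bank data); salary in $1000s.
gender = np.array([0, 0, 0, 0, 1, 1, 1, 1])        # 0 = male, 1 = female
yrs_exp = np.array([10, 12, 14, 16, 2, 4, 6, 8])   # years at the bank
salary = 30 + 1.0 * yrs_exp - 2.0 * gender          # built-in Gender effect of -2

# Simple regression: Salary vs Gender only.
simple = fit_ols(gender, salary)
# Multiple regression: Salary vs Gender and YrsExp.
multiple = fit_ols(np.column_stack([gender, yrs_exp]), salary)

gap_in_means = salary[gender == 1].mean() - salary[gender == 0].mean()
print("simple Gender coefficient:        ", round(float(simple[1]), 3))
print("difference in group mean salaries:", round(float(gap_in_means), 3))
print("Gender coefficient net of YrsExp: ", round(float(multiple[1]), 3))
```

In this made-up data the women also have fewer years at the bank, so the simple regression attributes the whole gap to Gender, while the multiple regression separates the two effects.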
5
MR1: Salary vs Gender, years at the current bank (YrsExp), and prior experience (YrsPrior)
Questions
1. Is MR1 better than the simple regression model (ensure that the model is significant, and compare R-squared values)?
2. Interpret the coefficient of Gender in MR1
3. Interpret the coefficients of YrsExp and YrsPrior
7
Further expansion
• Next, we add Job Grade (see slide #2) to the MR1 model
• Since Job Grade is a categorical variable with 6 levels, 5 dummy variables are created to represent these levels (Job_2 through Job_6)
• Job_2 was set to 1 if Job Grade was 2, and zero otherwise; a similar approach was used in coding Job_3 through Job_6
4. Can you tell which Job Grade is represented by the default setting where Job_2, Job_3, Job_4, Job_5 and Job_6 equal zero? Please ask me to elaborate if you are not clear on this.
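The dummy coding described above can be written out as a short sketch. The function name `job_dummies` is hypothetical; only the variable names Job_2 through Job_6 and the coding rule come from the slide.

```python
def job_dummies(job_grade):
    """Code a Job Grade in 1..6 as five 0/1 indicators, Job_2 through Job_6."""
    if job_grade not in range(1, 7):
        raise ValueError("job_grade must be an integer from 1 to 6")
    # Job_k is 1 only when the employee's grade equals k, and 0 otherwise.
    return {f"Job_{level}": int(job_grade == level) for level in range(2, 7)}

for grade in range(1, 7):
    print(grade, job_dummies(grade))
```

Printing all six grades shows that exactly one grade has no indicator of its own. Under this coding, each Job_k coefficient is read as a difference relative to that grade.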
8
MR2
Regression Statistics
Multiple R          0.861585
R Square            0.742329
Adjusted R Square   0.73197
Standard Error      5.827492
Observations        208
Questions
5. Is this model better than the first two models (check that the model is significant; please also look at R-squared)?
6. Are female salaries significantly lower than male salaries at the 10% level?
7. Are you clear on the interpretation of coefficients for the Job Grade dummy variables? Can you determine the difference in salaries for a person moving from Job Grade 2 to 3, all else equal?
10
Full Model
• More information was added to further expand the model
• Four dummy variables (Ed_2 through Ed_5) were added to represent the 5 education levels
• Age for each employee was included
11
MR3
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.866994
R Square            0.751678
Adjusted R Square   0.735038
Standard Error      5.794046
Observations        208
Questions
8. Is the most recent model, MR3 (slide #11), better than MR2 (slide #8)?
9. Are female salaries significantly lower than male salaries at 10%?
10. Using p-values, identify variables that do not appear to contribute significantly to MR3
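When comparing models with different numbers of variables, Adjusted R Square is usually more informative than R Square, because R Square cannot fall when variables are added. The sketch below applies the standard adjustment formula to the figures reported for MR2 and MR3. The predictor counts are assumptions inferred from the variable lists on the slides: 8 for MR2 (Gender, YrsExp, YrsPrior, Job_2 through Job_6) and 13 for MR3 (those 8 plus Ed_2 through Ed_5 and Age).

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# R Square and Observations are taken from the MR2 and MR3 outputs;
# the predictor counts (8 and 13) are assumptions, see the text above.
mr2 = adjusted_r_squared(0.742329, 208, 8)
mr3 = adjusted_r_squared(0.751678, 208, 13)

print("MR2 adjusted R-square:", round(mr2, 6))
print("MR3 adjusted R-square:", round(mr3, 6))
print("gain in R-square:         ", round(0.751678 - 0.742329, 6))
print("gain in adjusted R-square:", round(mr3 - mr2, 6))
```

With these counts the formula reproduces the reported Adjusted R Square values, and it shows that the five extra variables in MR3 raise Adjusted R Square by much less than they raise R Square.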
13
Refining the regression model – removing variables
• Age does not appear to contribute significantly to the MR3 model
• Most education levels (with the exception of Ed_5) also do not appear to contribute significantly to the MR3 model
• Thus, we should exclude Age and Education Level variables from further analysis (this gives us the same model as MR2, but it is shown again as MR4 on the next slide for easy reference)
14
MR4 (Same as MR2)
Regression Statistics
Multiple R          0.861585
R Square            0.742329
Adjusted R Square   0.73197
Standard Error      5.827492
Observations        208
Gender and experience – is there an interaction?
            Males    Females
YrsExp      0.8877   0.2354
JobGrade    0.8210   0.7147
• The above table shows the correlations of Salary with Yrs of Experience and with Job Grade, computed separately for Males and Females
• The correlation between Yrs of Experience and Salary appears to be much stronger for males than for females. This suggests that male employees are moving up the salary ladder faster than female employees; the analyst felt this may be the source of salary discrimination at Acme Bank
• Thus, there appears to be an interaction: The effect that Yrs of Experience has on Salary depends on whether the employee is male or female
• Regression analysis can be improved by adding an interaction term in the model – the method is described in the next slide
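A table like the one above comes from splitting the employees by Gender and computing the correlation with Salary within each group. The sketch below shows one way to do this with NumPy. The function name `correlation_by_gender` and the data are made up for illustration and do not reproduce the Acme figures.

```python
import numpy as np

def correlation_by_gender(gender, x, salary):
    """Pearson correlation between x and salary, separately for each Gender code."""
    gender = np.asarray(gender)
    x = np.asarray(x, dtype=float)
    salary = np.asarray(salary, dtype=float)
    result = {}
    for label, code in (("Males", 0), ("Females", 1)):
        mask = gender == code                      # rows belonging to this group
        result[label] = float(np.corrcoef(x[mask], salary[mask])[0, 1])
    return result

# Synthetic, illustrative data (NOT the Acme Bank data); salary in $1000s.
gender = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
yrs_exp = np.array([1, 3, 5, 7, 9, 1, 3, 5, 7, 9])
salary = np.array([30, 36, 41, 48, 53, 29, 33, 30, 34, 31])

corr = correlation_by_gender(gender, yrs_exp, salary)
for label, value in corr.items():
    print(f"{label:8s} corr(YrsExp, Salary) = {value:.4f}")
```

Note that a correlation measures how tightly salary tracks experience, not how steeply it rises. The slope for each group is what the interaction model estimates.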
18
Variables for the model
• To capture the interaction between Yrs Experience and Gender, a new variable called Gen*YrsExp was created by multiplying the value of Gender (0 or 1) by the employee’s experience (YrsExp) at Acme
• Thus the MR model with interaction is: Salary against Gender, YrsExp, and Gen*YrsExp
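As a rough sketch of how the interaction variable works, the example below builds Gen*YrsExp by multiplication and fits Salary against Gender, YrsExp, and Gen*YrsExp with NumPy least squares. The data are synthetic and noise-free, and the coefficient names `b0`, `b_gender`, `b_yrs`, and `b_inter` are assumptions for illustration. Because Gender is 0 for males, the male line uses only the intercept and the YrsExp slope. For females, the Gender coefficient shifts the intercept and the interaction coefficient shifts the slope.

```python
import numpy as np

# Synthetic, noise-free data (NOT the Acme Bank data); salary in $1000s.
gender = np.array([0, 0, 0, 0, 1, 1, 1, 1])                 # 0 = male, 1 = female
yrs_exp = np.array([1.0, 5.0, 10.0, 20.0, 1.0, 5.0, 10.0, 20.0])
salary = np.where(gender == 0, 30 + 1.5 * yrs_exp, 28 + 0.5 * yrs_exp)

# The interaction variable is the product of Gender and YrsExp.
gen_x_yrsexp = gender * yrs_exp

design = np.column_stack([np.ones(len(salary)), gender, yrs_exp, gen_x_yrsexp])
coef, *_ = np.linalg.lstsq(design, salary, rcond=None)
b0, b_gender, b_yrs, b_inter = (float(c) for c in coef)

def predict(g, years):
    """Predicted salary from the fitted interaction model."""
    return b0 + b_gender * g + b_yrs * years + b_inter * g * years

print("male line:   intercept", round(b0, 3), "slope", round(b_yrs, 3))
print("female line: intercept", round(b0 + b_gender, 3),
      "slope", round(b_yrs + b_inter, 3))
print("predicted salary, female with 6 years:", round(predict(1, 6), 3))
```

If the interaction coefficient were zero, the two lines would be parallel and the salary gap would be the same at every level of experience.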
19
Regression model with interaction
Regression Statistics
Multiple R          0.79913
R Square            0.638609
Adjusted R Square   0.633295
Standard Error      6.816298
Observations        208
Questions
11. What is the regression equation for male employees?
12. What is the regression equation for female employees?
13. How do we interpret the regression coefficients on slide #19?
21
Questions
14. According to the regression model, what is the predicted salary for a male (Gender=0) who has 1 year of experience at this bank?
15. What is the predicted salary for a male (Gender=0) who has 6 years of experience at this bank?
16. Answer the above questions for a female (Gender=1) at this bank
22
Questions
17. Looking at your answers, can you tell if there is an interaction?
18. Can you explain the interaction?
19. Is the interaction significant at the 10% level?
23
How many interaction variables?
• Suppose we want to test the interaction between Gender and Job-Grade
20. How many interaction variables would we need?
21. Let’s add the interaction variable to our best MR model so far (MR5 on slide #16) to see if further improvements are possible. The new model (MR6) is shown in the following slide. Do you think the model with interaction is better (is MR6 better than MR5, and why)?
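One way to think about the Gender and Job-Grade interaction is to multiply Gender by each Job Grade dummy in turn, in the same way Gen*YrsExp was built. The sketch below assumes that construction. The function name and the variable names Gen*Job_2 through Gen*Job_6 are hypothetical, modeled on the Gen*YrsExp naming used earlier.

```python
def gender_job_interactions(gender, job_grade):
    """Products of Gender (0/1) with each Job Grade dummy (Job_2 .. Job_6)."""
    if gender not in (0, 1):
        raise ValueError("gender must be 0 (male) or 1 (female)")
    if job_grade not in range(1, 7):
        raise ValueError("job_grade must be an integer from 1 to 6")
    # One product per dummy variable: Gender * Job_k.
    return {f"Gen*Job_{level}": gender * int(job_grade == level)
            for level in range(2, 7)}

example = gender_job_interactions(gender=1, job_grade=4)
print(len(example), "interaction variables:", example)
```

Each product is 1 only for a female employee in that particular grade, so its coefficient measures how the female-male difference in that grade departs from the difference in the default grade.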
24
MR6: Full model with interaction
Regression Statistics
Multiple R          0.9005
R Square            0.8109
Adjusted R Square   0.8033
Standard Error      4.9916
Observations        208
But your data has an outlier!
• Before we accept that there are significant differences between male and female salaries, we’d like to address the issue of outliers
• Specifically, there is a female employee in the highest job grade (Job Grade 6) who has 33 years of experience at Acme but whose salary is only $30,000; this could be a major source of discrimination at Acme Bank
• To see if this is the case, we remove this employee from our data and redo the regression analysis
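The remove-and-refit step can be sketched as follows. This is only an illustration on made-up numbers with a single predictor, not the full Acme model: the data, the one-variable fit, and the rule used to flag the point are all assumptions. It fits a line with and without one flagged record to show how much a single observation can move the estimates.

```python
import numpy as np

# Synthetic, illustrative data (NOT the Acme Bank data); salary in $1000s.
yrs_exp = np.array([1.0, 5.0, 10.0, 15.0, 20.0, 33.0])
salary = np.array([31.0, 35.0, 40.0, 45.0, 50.0, 30.0])   # last point is far off the trend

# Flag the record to drop: 33 years of experience but a salary of only 30.
is_outlier = (yrs_exp == 33.0) & (salary == 30.0)

# np.polyfit with degree 1 returns [slope, intercept].
slope_with, intercept_with = np.polyfit(yrs_exp, salary, 1)
slope_without, intercept_without = np.polyfit(yrs_exp[~is_outlier],
                                              salary[~is_outlier], 1)

print("observations with / without outlier:", len(salary), "/", int((~is_outlier).sum()))
print("slope with outlier:   ", round(float(slope_with), 4))
print("slope without outlier:", round(float(slope_without), 4))
```

Reporting the results both with and without the flagged record, as the slides do with MR6 and MR7, lets the reader judge how far the conclusions depend on that one employee.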
26
MR7: Regression with outlier removed
Regression Statistics
Multiple R          0.9130
R Square            0.8336
Adjusted R Square   0.8269
Standard Error      4.6857
Observations        207