Volume 05 - Issue 01 January 2016

[www.moj-es.net]

2017

Malaysian Online Journal of Educational

Sciences Volume 5, Issue 1

January 2017

Editor-in-Chief

Professor Datuk Dr. Sufean Hussin

Editors

Assoc. Prof. Datin Dr. Sharifah Norul Akmar Syed Zamri

Assist. Prof. Dr. Onur bulan

Associate Editors

Professor Dr. Omar Abdull Kareem

Associate Prof. Dr. Ibrahem Narongsakhet

Associate Prof. Dr. Mohd Yahya Mohamed Ariffin,

Associate Prof. Dr. Norani Mohd Salleh

Associate Prof. Dr. Wan Hasmah Wan Mamat

Inst. Aydn Kiper

ISSN: 2289-3024

Malaysian Online Journal of Educational Sciences 2017 (Volume 5 - Issue 1)


Copyright 2013 - MALAYSIAN ONLINE JOURNAL OF EDUCATIONAL SCIENCES. All rights reserved. No part of MOJES's articles may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.

Contact Address:

Professor Datuk Dr. Sufean Hussin

MOJES, Editor in Chief, University of Malaya, Malaysia


Message from the editor-in-chief

Malaysian Online Journal of Educational Sciences (MOJES) strives to provide a national and international academic forum to meet the professional interests of individuals in various educational disciplines. It is a professional refereed journal in the interdisciplinary fields sponsored by the Faculty of Education, University of Malaya. This journal serves as a platform for presenting and discussing a wide range of topics in Educational Sciences. It is committed to providing access to quality research, ranging from original research to theoretical articles and concept papers in educational sciences.

In order to produce a high quality journal, extensive effort has been put into selecting valuable research contributions for the journal. I would like to take this opportunity to express my appreciation to the editorial board, reviewers and researchers for their valuable contributions to making this journal a reality.

Professor Datuk Dr. Sufean Hussin, University of Malaya, Malaysia

January 2017

Editor in chief

Message from the editor

Malaysian Online Journal of Educational Sciences (MOJES) seeks to serve as an academic platform to researchers from the vast domains of Educational Sciences. The journal is published electronically four times a year.

MOJES welcomes original, high-quality research on all aspects of Educational Sciences. Topics may include, but are not limited to: pedagogy and educational sciences, adult education, education and curriculum, educational psychology, special education, sociology of education, social science education, art education, language education, educational management, teacher education, distance education, interdisciplinary approaches, and scientific events.

As editors of this journal, it is a great pleasure for us to see its success. On behalf of the editorial team of the Malaysian Online Journal of Educational Sciences (MOJES), we would like to thank all the authors and editors for their contribution to the development of this journal.

Assoc. Prof. Datin Dr. Sharifah Norul Akmar Syed Zamri & Assist. Prof. Dr. Onur bulan

January 2017

Editors



Editor-in-Chief

Professor Datuk Dr. Sufean Hussin, University of Malaya, Malaysia

Editors

Associate Professor Datin Dr. Sharifah Norul Akmar Syed Zamri, University of Malaya, Malaysia

Assist. Prof. Dr. Onur bulan, Sakarya University, Turkey

Associate Editors

Professor Dr. Omar Abdull Kareem, Sultan Idris University of Education, Malaysia

Associate Prof. Dr. Ibrahem Narongsakhet, Prince of Songkla University, Thailand

Associate Prof. Dr. Mohd Yahya Mohamed Ariffin, Islamic Science University of Malaysia

Associate Prof. Dr. Norani Mohd Salleh, University of Malaya, Malaysia

Associate Prof. Dr. Wan Hasmah Wan Mamat, University of Malaya, Malaysia

Inst. Aydn Kiper, Sakarya University, Turkey

Advisory Board

Emeritus Professor Dr. Tian Po Oei, University of Queensland, Australia

Professor Dr. Fatimah Hashim, University of Malaya, Malaysia

Professor Dr. Jinwoong Song, Seoul National University, Korea

Professor Dr. H. Mohammad Ali, M.Pd, M.A., Indonesian University of Education, Indonesia

Professor Dr. Moses Samuel, University of Malaya, Malaysia

Professor Dr. Nik Azis Nik Pa, University of Malaya, Malaysia

Professor Dr. Richard Kiely, the University College of St. Mark and St. John, United Kingdom

Professor Dr. Sufean Hussin, University of Malaya, Malaysia

Dr. Zawawi Bin Ismail, University of Malaya, Malaysia

Editorial Board

Emeritus Professor Dr. Rahim Md. Sail, University Putra of Malaysia, Malaysia

Professor Dr. Abdul Rashid Mohamed, University of Science, Malaysia

Professor Dr. Ananda Kumar Palaniappan, University of Malaya, Malaysia

Professor Dr. Bakhtiar Shabani Varaki, Ferdowsi University of Mashhad, Iran.

Professor Dr. H. Iskandar Wiryokusumo M.Sc, PGRI ADI Buana University, Surabaya, Indonesia

Professor Dr. Ramlee B. Mustapha, Sultan Idris University of Education, Malaysia

Professor Dr. Tamby Subahan Bin Mohd. Meerah, National University of Malaysia, Malaysia

Associate Professor Datin Dr. Sharifah Norul Akmar Syed Zamri, University of Malaya, Malaysia

Associate Professor Dato Dr. Ab Halim Bin Tamuri, National University of Malaysia, Malaysia

Associate Professor Dr. Abdul Jalil Bin Othman, University of Malaya, Malaysia



Associate Professor Dr. Ajmain Bin Safar, University of Technology, Malaysia

Associate Professor Dr. Habib Bin Mat Som, Sultan Idris Education University, Malaysia

Associate Professor Dr. Hj. Izaham Shah Bin Ismail, Mara University of Technology, Malaysia

Associate Professor Dr. Jas Laile Suzana Binti Jaafar, University of Malaya, Malaysia

Associate Professor Dr. Juliana Othman, University of Malaya, Malaysia

Associate Professor Dr. Loh Sau Cheong, University of Malaya, Malaysia

Associate Professor Dr. Mariani Binti Md Nor, University of Malaya, Malaysia

Associate Professor Dr. Mohamad Bin Bilal Ali, University of Technology, Malaysia

Associate Professor Dr. Norazah Mohd Nordin, National University of Malaysia, Malaysia

Associate Professor Dr. Rohaida Mohd Saat, University of Malaya, Malaysia

Associate Professor Dr. Syed Farid Alatas, National University of Singapore, Singapore

Dato Dr. Hussein Hj Ahmad, University of Malaya, Malaysia

Datuk Dr. Abdul Rahman Idris, University of Malaya, Malaysia

Datin Dr. Rahimah Binti Hj Ahmad, University of Malaya, Malaysia

Dr. Abu Talib Bin Putih, University of Malaya, Malaysia

Dr. Abd Razak Bin Zakaria, University of Malaya, Malaysia

Dr. Adelina Binti Asmawi, University of Malaya, Malaysia

Dr. Ahmad Zabidi Abdul Razak, University of Malaya, Malaysia

Dr. Chew Fong Peng, University of Malaya, Malaysia

Dr. Diana Lea Baranovich, University of Malaya, Malaysia

Dr. Fatanah Binti Mohamed, University of Malaya, Malaysia

Dr. Ghazali Bin Darusalam, University of Malaya, Malaysia

Dr. Haslee Sharil Lim Bin Abdullah, University of Malaya, Malaysia

Dr. Husaina Banu Binti Kenayathulla, University of Malaya, Malaysia

Dr. Kazi Enamul Hoque, University of Malaya, Malaysia

Dr. Latifah Binti Ismail, University of Malaya, Malaysia

Dr. Lau Poh Li, University of Malaya, Malaysia

Dr. Leong Kwan Eu, University of Malaya, Malaysia

Dr. Madhyazhagan Ganesan, University of Malaya, Malaysia

Dr. Megat Ahmad Kamaluddin Megat Daud, University of Malaya, Malaysia

Dr. Melati Binti Sumari, University of Malaya, Malaysia

Dr. Mohammed Sani Bin Ibrahim, University of Malaya, Malaysia

Dr. Mohd Rashid Mohd Saad, University of Malaya, Malaysia



Dr. Muhammad Azhar Bin Zailaini, University of Malaya, Malaysia

Dr. Muhammad Faizal Bin A. Ghani, University of Malaya, Malaysia

Dr. Nabeel Abdallah Adedalaziz, University of Malaya, Malaysia

Dr. Norlidah Binti Alias, University of Malaya, Malaysia

Dr. Pradip Kumar Mishra, University of Malaya, Malaysia

Dr. Rafidah Binti Aga Mohd Jaladin, University of Malaya, Malaysia

Dr. Rahmad Sukor Bin Ab Samad, University of Malaya, Malaysia

Dr. Renuka V. Sathasivam, University of Malaya, Malaysia

Dr. Rose Amnah Bt Abd Rauf, University of Malaya, Malaysia

Dr. Selva Ranee Subramaniam, University of Malaya, Malaysia

Dr. Sit Shabeshan Rengasamy, University of Malaya, Malaysia

Dr. Shahrir Bin Jamaluddin, University of Malaya, Malaysia

Dr. Suzieleez Syrene Abdul Rahim, University of Malaya, Malaysia

Dr. Syed Kamaruzaman Syed Ali, University of Malaya, Malaysia

Dr. Vishalache Balakrishnan, University of Malaya, Malaysia

Dr. Wail Muin (Al-Haj Said) Ismail, University of Malaya, Malaysia

Dr. Wong Seet Leng, University of Malaya, Malaysia

Dr. Zahari Bin Ishak, University of Malaya, Malaysia

Dr. Zahra Naimie, University of Malaya, Malaysia

Dr. Zanaton Ikhsan, National University of Malaysia, Malaysia

Dr. Zeliha DEMIR KAYMAK, Sakarya University, Turkey

Cik Umi Kalsum Binti Mohd Salleh, University of Malaya, Malaysia

En. Mohd Faisal Bin Mohamed, University of Malaya, Malaysia

En. Norjoharuddeen Mohd Nor, University of Malaya, Malaysia

En. Rahimi Md Saad, University of Malaya, Malaysia

Pn. Alina A. Ranee, University of Malaya, Malaysia

Pn. Azni Yati Kamaruddin, University of Malaya, Malaysia

Pn. Fatiha Senom, University of Malaya, Malaysia

Pn. Fonny Dameaty Hutagalung, University of Malaya, Malaysia

Pn. Foziah Binti Mahmood, University of Malaya, Malaysia

Pn. Hamidah Binti Sulaiman, University of Malaya, Malaysia

Pn. Huzaina Binti Abdul Halim, University of Malaya, Malaysia

Pn. Ida Hartina Ahmed Tharbe, University of Malaya, Malaysia



Pn. Norini Abas, University of Malaya, Malaysia

Pn. Roselina Johari Binti Md Khir, University of Malaya, Malaysia

Pn. Shanina Sharatol Ahmad Shah, University of Malaya, Malaysia

Pn. Zuwati Binti Hashim, University of Malaya, Malaysia



Table of Contents

CONCEPTIONS OF THE NATURE OF SCIENCE HELD BY UNDERGRADUATE PRE-SERVICE BIOLOGY TEACHERS IN SOUTH-WEST NIGERIA

Adedoyin, A. O, Bello, G

DIFFERENTIAL ITEM FUNCTIONING ANALYSIS OF HIGH-STAKES TEST IN TERMS OF GENDER: A RASCH MODEL APPROACH

Seyed Mohammad Alavi, Soodeh Bordbar

EFFECTIVENESS OF BLENDED LEARNING AND E-LEARNING MODES OF INSTRUCTION ON THE PERFORMANCE OF UNDERGRADUATES IN KWARA STATE, NIGERIA

Amosa Isiaka GAMBARI, Ahmed Tajudeen SHITTU, O. Olufunmilola OGUNLADE, Olourotimi Rufus OSUNLADE

THE EFFECT OF SCHOOL BUREAUCRACY ON THE RELATIONSHIP BETWEEN PRINCIPALS' LEADERSHIP PRACTICES AND TEACHER COMMITMENT IN MALAYSIA SECONDARY SCHOOLS

Teoh Hong Kean, Sathiamoorthy Kannan, Prof Chua Yan Piaw

THE EFFECT OF TIME ON DIFFICULTY OF LEARNING (THE CASE OF PROBLEM SOLVING WITH NATURAL NUMBERS)

Deniz KAYA, Cenk KEAN

THE RELATIONSHIP BETWEEN PROBLEMATIC INTERNET USE, ALEXITHYMIA, DISSOCIATIVE EXPERIENCES AND SELF-ESTEEM IN UNIVERSITY STUDENTS

Murat Iskender, Mustafa Ko, Neslihan Arici, Naciye Gven


Conceptions Of The Nature Of Science Held By Undergraduate Pre-Service Biology Teachers In South-West Nigeria

Adedoyin, A. O [1], Bello, G [1]

[1] Department of Science Education, University of Ilorin, Ilorin 240001, Nigeria *Corresponding author [email protected], Tel. +234(0)8066762605

ABSTRACT

This study investigated the conceptions of the nature of science held by undergraduate pre-service biology teachers in South-West Nigeria. Specifically, the study examined the influence of gender on their conceptions of the nature of science. The study was descriptive research of the survey type. The population for the study comprised all undergraduate pre-service biology teachers in Nigerian universities. A stratified random sampling technique was used to select ninety-nine (99) undergraduate pre-service biology teachers from three universities in South-West Nigeria. The Nature of Science Questionnaire (NoSQ) was used to collect data. Results revealed that the pre-service teachers' gender did not influence their conceptions. It was recommended that biology teacher educators should equip undergraduate pre-service biology teachers with meta-cognitive tools such as Study Technology to enable them to learn for meaningful understanding.

Keywords: Nature of Science, Conceptions, Misconceptions, Correct Conceptions

INTRODUCTION

Since the dawn of civilization, science has been a potent tool for finding solutions to never-ending human problems or, at least, for helping man to manage his challenges well. Science as a field of study and endeavour will always be an important aspect of human lives. Science involves all conscious activities that man engages in to understand nature and its components. Science, according to Abimbola (2013), can be seen as a body of knowledge; it could also mean a way or method of investigation and a way of thinking in an attempt to understand nature. Among other things, the scientific process involves particular skills of inquiry that include observing, classifying, experimenting, measuring, inferring and organizing data.

The nature of science, according to Gess-Newsome (2002), is defined as the epistemological foundations of science, which include its empirical basis, tentativeness, subjectivity, creativity, unification, and its cultural and social embeddedness. The nature of science encapsulates the characteristics of science that help people understand scientific endeavours without acquiring much cumbersome scientific knowledge. The preceding descriptions of the nature of science cannot be exhaustive because science is viewed from different perspectives by researchers and scientists the world over. Any list that attempts to capture all the values, processes, uses, prospects and products of science would be endless.

Scientists and science educators have emphasized the absence of a consensus among researchers and scientists on the meaning of the nature of science. They opined that this is so because the nature of science is multifaceted, ever-changing and convoluted. Like scientific knowledge, conceptions of the nature of science are dynamic and have undergone different transformations throughout the development of science and scientific processes. Nevertheless, despite continuing disagreements about a particular definition of the nature of science, at a certain level of generality and within a set period, there is a shared perspective about the nature of science. There is general agreement on several elements of the nature of science that are used for research purposes (Abd-El-Khalick, 2005; Abd-El-Khalick and Lederman, 2008; Akerson, Morrison & McDuffie, 2006; Lederman, 2007). The main purpose of this study was to gain useful insight into the conceptions of the nature of science held by Nigerian undergraduate pre-service biology teachers so that appropriate action can be taken where necessary. Specifically, the study aimed to find out undergraduate pre-service biology teachers' conceptions of the nature of science, and to determine the influence of gender on the conceptions of the nature of science held by undergraduate pre-service biology teachers in South-West Nigeria.

For any science student to excel in the field of science, adequate knowledge of the nature of science is essential. The structure, epistemology and philosophy of science as described by Abimbola (2013) include its products, processes, and ethics. Only when a student has a good grasp of these concepts does that student stand a good chance of academic success.

Literature shows that relatively little attention has been paid to students' views about the nature of science. This is more so in the case of undergraduate pre-service biology teachers (Kang, Scharmann & Noh, 2005; elikdemir, 2006). There have been many studies of students' views of the nature of scientific knowledge, but those conducted among undergraduate pre-service biology teachers in Nigeria are relatively few. Meanwhile, numerous studies have been carried out among undergraduate pre-service science teachers and practicing science teachers in other parts of the world (Aslan, 2009; Ayvac, 2007; Kenar, 2008; YcelOyman, 2002).

The results of these studies made it clear that the majority of Turkish elementary school students and pre-service science teachers held misconceptions and alternative conceptions of some aspects of the nature of science. Many of the undergraduate science education students had the conception that there is a certain and defined scientific method for developing scientific knowledge (Blbl & Kk, 2007; ahin, Deniz & Grgen, 2006; naloban & Ergin, 2008). However, most of these studies neglected the possible influence of pre-service biology teachers' gender on their conceptions of the nature of science. Also, many of these studies involved pre-service biology teachers from only one university or college, whereas this study focused on pre-service biology teachers from three different Nigerian universities. This is the knowledge gap this study intended to fill.

Bearing in mind the cultural diversity in the Nigerian context and the peculiarities of the Nigerian educational system, considerable attention should be focused on understanding the views of pre-service biology teachers about the nature of science. This knowledge will help stakeholders understand Nigerian biology teachers' conceptions of the nature of science and take appropriate palliative or remedial measures where necessary.

The theoretical foundation upon which this study was based is constructivism. This area of educational interest is a learning theory propounded by cognitive psychologists with constructivist epistemological perspectives such as Jean Piaget (1896-1980). Other writers and philosophers who have enormously influenced constructivism include John Dewey, Maria Montessori and Lev Vygotsky, among others (Wikipedia, 2014). Brooks (2004) defined the term constructivism as an instructional approach that emphasises the active participation of learners in the instructional process. Effective learning takes place when learners are active components of a process of meaning and knowledge construction instead of passively receiving information. Brooks (2004) perceived constructivism as a learning theory based on observation and scientific study. The researcher went further to explain it as a philosophy of learning founded on the belief that by reflecting on human experiences, man can build up an understanding of the world. Learning could also be perceived as the process of adjusting our mental states to accommodate new experience (Brooks, 2004). Read (2004) opined that there is widespread agreement among researchers in education that learners should not be seen as passive recipients of information during the teaching and learning process. Instead, they are active constructors of their knowledge. He explained that before a child begins school, he has a wealth of experiences, and these prior experiences have led him to develop a common-sense understanding of his social and natural environment. These experiences give the learning process a boost because the formation of new knowledge will likely be influenced by pre-existing knowledge. However, there may be a challenge that could arise from learners' personal construction of new knowledge.

The study was guided by the following research questions:

1. What are undergraduate pre-service biology teachers' conceptions of the nature of science?

2. Is there a difference in the number of correct conceptions about the nature of science held by male and female undergraduate pre-service biology teachers?

3. Is there a difference in the number of misconceptions about the nature of science held by male and female undergraduate pre-service biology teachers?

Based on the preceding research questions, it was hypothesised that:

HO1: There is no significant difference in the number of correct conceptions about the nature of science held by male and female undergraduate pre-service biology teachers.

HO2: There is no significant difference in the number of misconceptions about the nature of science held by male and female undergraduate pre-service biology teachers.

The outcome of this study is envisaged to be of importance to the teaching and learning of science. Specifically, it should benefit students, secondary school biology teachers, lecturers in tertiary institutions, teacher educators, curriculum planners and textbook writers.

Secondary school biology teachers and lecturers in higher institutions of learning stand to gain immensely from the findings of this research work. The outcome of this research might help them realize the conceptions held by their students, and take appropriate steps towards improving the meaningful understanding of scientific concepts and the nature of science by the students.

On the part of biology teacher educators, the findings of this research work could help them to tailor biology teacher education practices towards producing biology teachers who can successfully handle students' misconceptions and improve the image of biology among the students. Curriculum experts need to be sensitive to students' preconceptions at various stages of curriculum development. Hence, the results of this study could provide them with useful information on students' conceptions of the nature of science.

Textbook writers might also find the outcome of this study useful because it might illuminate the hidden educational needs of secondary school science students. The outcome of this study would keep them abreast of the conceptions of the nature of science held by students, thereby indicating where they should make improvements in addressing students' misconceptions and alternative conceptions of the nature of science.

A review of literature related to this study was carried out. Bello and Abimbola (1997) conducted a study to determine the impact of gender on students' concept-mapping ability and achievement in evolution. The study showed that gender had no influence on students' concept-mapping ability or their achievement in evolution. In Ghana, Taale (2014) conducted a study to inquire about the influence of biology teachers' gender on their conceptions of the nature of science. The outcome was similar to that of Bello and Abimbola (1997): teachers' gender was not a major factor in their conceptions of the nature of science.

From the reviewed literature, it is evident that researchers share the opinion that various misconceptions and alternative conceptions about the nature of science exist among undergraduate pre-service science teachers, practicing biology teachers and secondary school students alike. Unlike some of the reviewed studies, in which simple random sampling or, in some cases, purposive sampling was used, the present study employed the stratified random sampling technique in selecting its sample.

The need for this research arose from the fact that most studies of the conceptions of the nature of science held by pre-service biology teachers were conducted outside Nigeria, while those conducted in Nigeria are relatively few. A study of this nature is envisaged to help bridge the gap created by the relatively small number of studies of the conceptions of the nature of science held by Nigerian undergraduate pre-service biology teachers. The sample used for the present study was drawn from three different universities, unlike past studies that made use of just one institution. This should also enhance the external validity of the study.

METHODOLOGY

The study was descriptive research of the survey type. The population for this study was all undergraduate pre-service biology teachers in South-West Nigeria. A stratified random sampling technique was adopted to select a representative sample of the population. The universities from which the sample was drawn are government universities that have a long history of graduating biology education students. Specifically, ninety-nine (99) pre-service biology teachers were selected from three universities in South-West Nigeria.
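The sampling step described above can be illustrated with a minimal sketch. It assumes that each university forms one stratum and that the sample is allocated in proportion to stratum size; the text does not state the stratification variable or the allocation rule, and the enrolment figures, university labels and the function name `stratified_sample` are hypothetical.

```python
import random


def stratified_sample(strata, total_n, seed=0):
    """Draw a proportionally allocated stratified random sample.

    strata:  dict mapping a stratum name to a list of member identifiers.
    total_n: desired overall sample size.
    """
    rng = random.Random(seed)
    population = sum(len(members) for members in strata.values())

    # Ideal (fractional) share of the sample for each stratum.
    quotas = {name: total_n * len(members) / population
              for name, members in strata.items()}

    # Round down, then hand leftover places to the largest remainders.
    alloc = {name: int(quota) for name, quota in quotas.items()}
    leftover = total_n - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda name: quotas[name] - alloc[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1

    # Simple random sampling without replacement inside each stratum.
    return {name: rng.sample(strata[name], alloc[name]) for name in strata}


# Hypothetical enrolment lists (assumption, not data from the study).
strata = {
    "University A": [f"A{i}" for i in range(120)],
    "University B": [f"B{i}" for i in range(90)],
    "University C": [f"C{i}" for i in range(87)],
}
sample = stratified_sample(strata, total_n=99)
sizes = {name: len(members) for name, members in sample.items()}
```

With these made-up figures, each university contributes to the sample in proportion to its enrolment, and the three sub-samples together contain 99 respondents.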

The research instrument employed to gather data in this study was the Nature of Science Questionnaire (NoSQ). The researcher adapted the NoSQ from an instrument previously developed and used by Indiana State University (2015). The NoSQ was divided into two sections, A and B. Section A of the questionnaire sought demographic information, while the items in Section B sought the pre-service biology teachers' conceptions about the nature of science. There were 25 items in Section B, some of which were restructured to suit this study. These items were based on the various tenets of the nature of science. Respondents were required to indicate their conceptions about the nature of science by ticking (✓) or crossing (X) the various statements about the tenets of the nature of science. A reliability coefficient of 0.74 was obtained using the Pearson product-moment correlation statistic.
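As an illustration of how a Pearson product-moment coefficient is computed, the sketch below correlates two sets of total scores. It assumes the reliability estimate came from two administrations of the instrument to the same respondents; the text does not say which reliability design was used, and the scores and the function name `pearson_r` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical total scores of eight respondents on two administrations.
first_administration = [18, 20, 15, 22, 17, 19, 21, 16]
second_administration = [17, 21, 16, 20, 18, 18, 22, 15]
r = pearson_r(first_administration, second_administration)
```

A coefficient close to 1 indicates that respondents kept a similar rank order across the two administrations.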

Both descriptive and inferential statistics were employed in the analysis of the collected data. The research questions raised and the hypotheses stated earlier were tested using the chi-square (χ²) statistic. The hypotheses were tested at the 0.05 level of significance.
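For readers unfamiliar with the test, the following is a minimal sketch of a Pearson chi-square test of independence on a 2 x 2 table (gender by conception category), without continuity correction. The counts are hypothetical and are not the study's data, the function name `chi_square_2x2` is an illustrative choice, and the study's own analysis may have used a differently shaped table.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are male and female respondents,
# columns are "correct conception" and "misconception" on one item.
counts = [[30, 15], [35, 19]]
statistic, p_value = chi_square_2x2(counts)
reject_null = p_value < 0.05  # decision at the 0.05 level of significance
```

The null hypothesis of no association between gender and conception is rejected only when the p-value falls below the chosen significance level.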

RESULTS

Research Question 1: What are undergraduate pre-service biology teachers' conceptions of the nature of science?

To answer research question 1, undergraduate pre-service biology teachers were requested to indicate their conceptions of the nature of science. Table 1 and Table 2 show the number and percentage of undergraduate pre-service biology teachers who held correct conceptions and misconceptions, respectively, about various aspects of the nature of science. As shown in Table 1 and Table 2, undergraduate pre-service biology teachers held a mixture of correct conceptions and misconceptions about the nature of science. However, they appeared to hold more misconceptions than correct conceptions about the nature of science. This finding provides the answer to research question 1.

Table 1. Correct Conceptions about the Nature of Science held by Undergraduate Pre-Service Biology Teachers.

| S/N | Pre-service undergraduate biology teachers' conceptions about the nature of science | Correct conceptions: frequency | Correct conceptions: percentage |
|---|---|---|---|
| 1 | Science is primarily concerned with understanding how the natural world works. | 78 | 79.4% |
| 2 | Science requires a lot of creative activity. | 77 | 78.6% |
| 3 | Science typically provides only temporary answers to questions. | 50 | 51.0% |
| 4 | Scientists can believe in God or a supernatural being and still do good science. | 64 | 64.6% |
| 5 | Science can be done poorly. | 40 | 40.4% |
| 6 | Science can study and explain events that happened millions of years ago. | 72 | 73.5% |
| 7 | Knowledge of what science is, what it can and cannot do, and how it works, is important for all educated people. | 85 | 85.9% |
| 8 | Scientists have observed that nature apparently follows the same rules throughout the universe. | 65 | 65.7% |
| 9 | Scientists often try to test or disprove possible explanations. | 78 | 78.8% |
| 10 | Science can be influenced by the race, gender, nationality, or religion of the scientists. | 55 | 55.6% |

Table 2. Misconceptions about the Nature of Science held by Undergraduate Pre-Service Biology Teachers.

| S/N | Pre-service undergraduate biology teachers' misconceptions about the nature of science | Misconceptions: frequency | Misconceptions: percentage |
|---|---|---|---|
| 1 | Science is primarily a search for truth. | 85 | 86.7% |
| 2 | Science can solve any problem or answer any question. | 66 | 66.7% |
| 3 | Science can use supernatural explanations if necessary. | 36 | 36.4% |
| 4 | Astrology (predicting your future from the arrangement of stars and planets) is a science. | 66 | 66.7% |
| 5 | A hypothesis is an educated guess about anything. | 73 | 74.5% |
| 6 | Science is most concerned with collecting facts. | 79 | 82.3% |
| 7 | Most engineers and medical doctors are actually scientists. | 83 | 83.8% |
| 8 | A scientific fact is absolute, fixed, and permanent. | 59 | 59.6% |
| 9 | A scientific theory is a guess. | 47 | 48.5% |
| 10 | Scientists have solved most of the major mysteries of nature. | 67 | 67.7% |
| 11 | Modern scientific experiments usually involve trying something to see what will happen, without predicting a likely result. | 71 | 74.7% |
| 12 | Anything done scientifically is always accurate and reliable. | 69 | 69.7% |
| 13 | All scientific problems must be studied with The Scientific Method. | 73 | 73.7% |
| 14 | Disagreement between scientists is one of the weaknesses of science. | 45 | 45.9% |
| 15 | Any study done carefully and based on observation is scientific. | 73 | 73.7% |

Research Question 2: Is there a difference in the number of correct conceptions about the nature of science held by male and female undergraduate pre-service biology teachers?

HO1: There is no significant difference in the number of correct conceptions about the nature of science held by male and female undergraduate pre-service biology teachers.

Table 3: Chi-square Analysis of Significant Difference in the Number of Correct Conceptions held by Male and Female Pre-service Undergraduate Biology Teachers

| Statistic | Value (Gender) | df | Sig. |
|---|---|---|---|
| Pearson Chi-Square | 25.296 | 1 | .235 |
| Likelihood Ratio | 32.165 | 1 | .056 |
| Linear-by-Linear Association | .326 | 1 | .568 |
| N of Valid Cases | 99 | | |

Not significant at the .05 alpha level of significance.

As shown in Table 3, a chi-square analysis was conducted to compare the correct conceptions about the nature of science held by male and female undergraduate pre-service biology teachers. It was found that there was no significant difference in the number of correct conceptions about the nature of science held by male and female undergraduate pre-service biology teachers [χ²(1, N = 99) = 25.296, p = .235]. Since the p-value (.235) is greater than 0.05 (the level of significance), the null hypothesis (HO1) was not rejected. This finding provides an answer to research question 2. That is to say, there is no gender difference in the number of correct conceptions about the nature of science held by undergraduate pre-service biology teachers. This result suggests that gender does not influence the number of correct conceptions of the nature of science held by undergraduate pre-service biology teachers.

Research Question 3: Is there a difference in the number of misconceptions about the nature of science held by male and female undergraduate pre-service biology teachers?

HO2: There is no significant difference in the number of misconceptions about the nature of science held by male and female undergraduate pre-service biology teachers.

Table 4: Chi-square Analysis of Significant Difference in the Number of Misconceptions held by Male and Female Pre-service Undergraduate Biology Teachers

Not significant at the .05 alpha level of significance.

A chi-square analysis was conducted to compare the number of misconceptions about the nature of science held by male and female undergraduate pre-service biology teachers. As shown in Table 4, there was no significant difference in the number of misconceptions about the nature of science held by undergraduate pre-service biology teachers based on their gender [χ²(1, N = 99) = .009, p = .923]. Since the p-value (.923) is greater than 0.05 (the level of significance), the null hypothesis (HO2) was not rejected. This finding answers research question 3, meaning there is no difference in the number of misconceptions about the nature of science held by undergraduate pre-service biology teachers based on their gender. This result indicates that undergraduate pre-service biology teachers' gender does not have much influence on their conceptions of the nature of science.

DISCUSSION

Findings from the study showed that undergraduate preservice biology teachers held both correct conceptions and misconceptions about the nature of science. The finding suggests that undergraduate preservice biology teachers held more misconceptions about the nature of science than correct conceptions. This is in line with the works of Butler et al. (2014), Hanson (2015), Onijamowo (2010), Sangsaard et al. (2014), Stojanovska, Soptrajanov, and Petrusevski (2012), Tan and Taber (2009), Pinarbasi, Sozbilir, and Canpolat (2009) all these researchers claimed that there exist numerous misconceptions among practicing and preservice biology teachers about the nature of science.

The gender difference in the conception of undergraduate preservice biology teachers was found to be statistically insignificant. This outcome agrees with those of Oluwatayo (2011), Taale (2014) who concluded that there is no significant difference in the number of biology teachers who held correct conceptions and misconceptions about the nature of science regarding gender. Parts of the reasons adduced for this is that both male and female undergraduate preservice biology teachers are trained by the same teachers; under the same conditions; and are taught using the same curriculum. Hence, there is little or no room for a variance in their conceptions.

Consequent upon the fact that there exist various misconceptions about the nature of science among preservice biology teachers which spreads across both genders as reported in this study, the teaching of science and biology in particular in secondary schools and institutions of higher learning is at a disadvantage. Olorundare (2014b) reported that students usually experience difficulty in learning science topics because of the misconceptions held by their science teachers which are easily transferred to the students. Misconceptions are understood to be stubborn and resistant, hence the higher risk of learners, carrying along the same misconceptions about the nature of science through their elementary education till graduation.

Gender Df Sig

Pearson ChiSquare .009 1 .923 Likelihood Ratio .009 1 .923

LinearbyLinear Association .009 1 .924

N of Valid Cases 99

www.moj-es.net

6


The implication of this is that such graduates eventually become teachers and pass the same misconceptions about the nature of science on to another generation of learners, which will make the task of overcoming students' failure in science subjects, and in biology in particular, difficult if not impossible. The findings of this study should stimulate the education authorities to proactively devise methods to arrest the unwanted level of misconceptions among preservice biology teachers and enable them to educate their students effectively for a meaningful understanding of biology and of science generally.

CONCLUSION

The study concluded that undergraduate preservice biology teachers held both correct conceptions and misconceptions about the nature of science; however, they held more misconceptions than correct conceptions. The study further concluded that the gender of preservice biology teachers did not influence their conceptions about the nature of science. This study has shed more light on the conceptions of the nature of science held by undergraduate biology teachers in Nigeria. Irrespective of their gender, preservice biology teachers exhibited similar conceptions of the nature of science. The high number of misconceptions held by these teachers implies that priority should be given to remedial measures that help them reconcile their misconceptions with the appropriate scientific conceptions of the nature of science. The study has also pointed out the urgent need for capacity-building programs on the nature of science for biology teachers in both government and private schools. These programs may include symposia, seminars, and workshops. The findings of this study conform to the international reality that various misconceptions and alternative conceptions of the nature of science exist among biology teachers in many countries. This being the case, international agencies concerned with the advancement of science, and of biology in particular, should devote more attention to minimizing and possibly eradicating misconceptions among biology teachers in Nigeria. This could be done through the introduction of international scientific literacy exchange programs for both preservice and inservice biology teachers.

RECOMMENDATIONS

The following recommendations are considered relevant based on the findings of this study:

1. Biology teacher education curriculum planners should take cognisance of the fact that numerous misconceptions about the nature of science exist among preservice biology teachers; hence, there is a need to introduce the nature of science as a separate course in the Nigerian biology teacher education curricula.

2. There is also a need to retrain practicing biology teachers to help them reconcile their misconceptions about the nature of science with the appropriate scientific conceptions of the nature of science. This will prevent such teachers from passing misconceptions to their students in the science classroom.

3. Biology teacher educators should regularly identify preservice undergraduate science teachers' misconceptions about the nature of science and take appropriate pedagogical measures to help them reconcile the misconceptions with the appropriate scientific conceptions.

4. Biology teacher educators should equip the preservice undergraduate science teachers with metacognitive tools such as Study Technology to enable them to learn how to learn for meaningful learning.

5. Biology education programmes should include Misconceptions in Science as a core course. This will help biology teachers and students to be more sensitive to misconceptions about scientific concepts, and to learn how to avoid misconceptions and reconcile them with the appropriate scientific conceptions.

6. Biology textbook writers should also take note of misconceptions about the nature of science and, hence, guard against statements and assertions that encourage misconceptions.

7. Biology teacher training programmes should make room for the use of instruments such as the nature of science questionnaire as formative tools. This would improve undergraduate biology teachers' awareness of the nature of science and aid their understanding of the processes of scientific inquiry and the scientific enterprise.

This study was specifically carried out on the conceptions of the nature of science held by preservice biology teachers in South-West Nigeria. This kind of study can be carried out in other parts of the country to give a holistic view of the conceptions of the nature of science held by Nigerian undergraduate preservice biology teachers. Variables not covered in this study can also be investigated by other researchers. Further studies can also be conducted to look into the sources of the misconceptions or correct conceptions of the nature of science held by biology teachers in Nigeria. Research can also be conducted to compare the conceptions of the nature of science held by preservice and inservice biology teachers, to see whether their conceptions change with teaching experience.

Further research can also be carried out to determine whether there is any relationship between preservice biology teachers' conceptions of the nature of science and their academic achievement in biology courses. This study can also be replicated among inservice biology teachers to find out whether their conceptions of the nature of science have anything to do with their classroom instruction or with the academic achievement of their students in biology.

REFERENCES

Abd-El-Khalick, F. (2005). Developing deeper understandings of nature of science: The impact of a philosophy of science course on preservice science teachers' views and instructional planning. International Journal of Science Education, 27(1), 15-42.

Abd-El-Khalick, F., & Lederman, N. G. (2008). Improving science teachers' conceptions of the nature of science: A critical review of literature. International Journal of Science Education, 22(7), 665-701.

Abimbola, I. O. (2013). Philosophy of science for degree students. Ilorin: Bamitex printing & publishing.

Akerson, V., Morrison, J., & McDuffie, A. (2006). One course is not enough: Preservice elementary teachers' retention of improved views of nature of science. Journal of Research in Science Teaching, 43(2), 194-213.

Aslan, O. (2009). Science and technology teachers' views on nature of science and the reflexions of these views on classroom activities. Published doctoral dissertation, Gazi University, Ankara.

Ayvacı, H. Ş. (2007). A study toward teaching the nature of science based on different approaches for classroom teachers in gravity content. Published doctoral dissertation, Karadeniz Technical University.

Brooks, J. G. (2004). Workshop: Constructivism as a paradigm for teaching and learning. Educational Broadcasting Corporation. Retrieved 16/9/2014 from http://www.thirteen:org/edonline/concepttoclass/constructivism ndex

Bülbül, K. & Küçük, M. (2007). Investigating elementary students' view about scientific knowledge. Primary School Congress: Primary school education bulletin booklet, Hacettepe University, Ankara.

Butler, J., Simmie, G. M. & O'Grady, A. (2014). An investigation into the prevalence of ecological misconceptions in upper secondary students and implications for preservice teacher education. European Journal of Teacher Education. 14(2)22 Retrieved from: www.emeraldinsiight.com. doi: 10.1080/02619768.2014.943394.

Çelikdemir, M. (2006). Examining middle school students' understanding of the nature of science. Published master thesis, Middle East Technical University, Ankara.

Gess-Newsome, J. (2002). The use and impact of explicit instruction about the nature of science and science inquiry in an elementary science methods course. Science and Education, 11(15), 55-67.


Hanson, R. (2015). Identifying students' alternative concepts in basic chemical bonding: A case study of teacher trainees in the University of Education, Winneba. International Journal of Innovative Research and Development, 4(1), 115-122.

Indiana State University (2015). Science knowledge survey. Retrieved from www.indiana.edu/../sci.tst.html

Kang, S., Scharmann, L. C., & Noh, A. (2005). Examining students' views on the nature of science: Results from Korean 6th, 8th and 10th graders. Science Education, 89(13), 314-334.

Kenar, Z. (2008). Prospective science teachers' views of the nature of science. Published master thesis. Balıkesir University, Balıkesir. Khata.

Lederman, N. (2007). Nature of science: Past, present, and future. In S. L. Abell, N. (Ed.), Handbook of Research on Science Education. (3, pp.18). Mahwah: Lawrence Erlbaum Associates.

Olorundare, A. S. (2014b). Theory into practice: beyond surface curriculum in science education. Proceeding of 147th Inaugural lecture, University of Ilorin.

Oluwatayo, J. A. (2011). Gender difference and performance of secondary school students in mathematics. European Journal of Educational Studies, 3(1), 94-95.

Onijamowo, O. T. (2010). Senior School Chemistry student misconceptions and alternative conceptions of selected chemistry concepts in Kogi State Nigeria. Unpublished master Dissertation. University of Ilorin, Nigeria.

Pinarbasi, T., Sozbilir, M. & Canpolat, N. (2009). Prospective chemistry teachers' misconceptions about colligative properties: Boiling point elevation and freezing point depression. Chemical Education Research and Practice, 9(10), 273-280.

Read, J. R. (2004). Children's misconceptions and conceptual change in science education. Retrieved from: http://acell.chem.usyd.edu.au/ConceptualChange.cfm.

Şahin, N., Deniz, S., & Görgen, I. (2006). Student teachers' attitudes concerning understanding the nature of science in Turkey. International Education Journal, 7(1), 51-55.

Sangsaard, R., Thathong, K. & Chapoo, S. (2014). Examining grade 9 students' conceptions of the nature of science. Procedia Social and Behavioral Sciences, 116(42), 382-388. Retrieved from www.sciencedirect.com.

Stojanovska, M., Soptrajanov, B. & Petrusevski, V. (2012). Addressing misconceptions about the particulate nature of matter among secondary school and high-school students in the Republic of Macedonia. Creative Education, 3, 619-631. doi:10.4236/ce.2012.35091.

Taale, K. D. (2014). Gender and location influence on Ghanaian students' perceptions of energy and classroom learning. International Journal of Education and Practice, 2(3), 51-66.

Tan, K. D. & Taber, K. (2009). Ionization energy: Implications of preservice teachers' conceptions. Journal of Chemical Education, 86(5), 623-629.

Wikipedia (2014). Constructivism. Retrieved September 22 2014 at 6:19pm from http://wikipedia.com/constructivism.



Differential Item Functioning Analysis of High-Stakes Test in Terms of Gender: A Rasch Model Approach

Seyed Mohammad Alavi [1], Soodeh Bordbar [2]

[1] University of Tehran, Tehran, Iran. Email: [email protected]
[2] PhD candidate of TEFL, Tehran University, Tehran, Iran. Email: [email protected]

ABSTRACT

Differential Item Functioning (DIF) analysis is a key element in evaluating educational test fairness and validity. One of the frequently cited sources of construct-irrelevant variance is gender, which plays an important role in the university entrance exam; it can therefore cause bias and consequently undermine test validity. The present study aims at investigating the presence of DIF in terms of gender in a high-stakes language proficiency test in Iran, the National University Entrance Exam for Foreign Languages (NUEEFL). The participants' responses (N = 5000) were selected randomly from a pool of examinees who had taken the NUEEFL in 2015. The results displayed DIF between male and female test takers. Hence, on the basis of the findings, it is concluded that the NUEEFL test scores are not free of construct-irrelevant variance and that the overall fairness of the test is not confirmed. In addition, both Rasch assumptions (i.e., unidimensionality and local independence) hold in the present research.

Keywords: Differential Item Functioning, Dimensionality, Rasch Model

INTRODUCTION

In language testing and educational measurement, discussions about test use and the consequences of tests have increased. Since the National University Entrance Exam for Foreign Languages (NUEEFL) is administered annually to a large number of test takers country-wide in Iran, the consequences of failure on the test are serious. Failure could result in spending one or more years on test preparation and, for males, in two years of military service.

Therefore, it is essential to examine the extent to which the instrument assesses what it is intended to measure (validity), as well as its consistency (reliability) (Pae, 2011), in measuring English ability in a high-stakes test such as the NUEEFL. Nonetheless, despite the heated nature of the debates, there has been little empirical evidence for the validity and fairness of the NUEEFL test. Specifically, there is insufficient evidence of test fairness among male and female test takers. In the absence of such evidence, any talk of the fairness of the selection policy would be doomed to fail.

The present study aims at investigating the validity of a high-stakes test in general and at considering the role of gender as a source of bias in the NUEEFL in particular. Regardless of the content of the debates over the gender issue, it appears that there is no evidence on the effect of gender on performance on the NUEEFL. If gender exerts a large influence, this would be a case of bias and would undermine the validity of the test, because gender is not part of the construct measured by the test and any significant impact of gender is a case of construct-irrelevant variance. As part of the standard process, Differential Item Functioning (DIF) analysis is conducted on the test items, as a main factor in the evaluation of the fairness and validity of educational tests.

In order to investigate the psychometric properties of the high-stakes test (i.e., NUEEFL), the present study will address the following research questions:

1. To what extent do the item responses of NUEEFL form a unidimensional construct according to the Rasch measurement model?

2. Is participants' gender a source of DIF in NUEEFL items?

Review of Related Literature

Differential Item Functioning (DIF)

Test developers deploy several quality control or statistical procedures to ensure that the test items are proper and fair for all examinees (Camilli & Penfield, 1997; Holland & Wainer, 2012; Ramsey, 1993). The statistical procedure aims at identifying items with different statistical features across certain groups of examinees. This is referred to as differential item functioning (DIF), and such items are said to function differentially across groups, which is a potential indicator of item bias (Sireci & Rios, 2013, p. 170).

According to Geranpayeh and Kunnan (2007), even when the fairness issue is considered throughout the design-development-administration-scoring cycle, many problems are still found in this procedure. Geranpayeh and Kunnan (2007) maintained that there are two approaches for solving these problems. One approach is to set up a pilot group in order to examine test scores. If the test has already been conducted, a large sample is available to examine test scores and to investigate how the items function. If items are found to act differently across groups, this difference is known as Differential Item Functioning (DIF).

To characterize the definition of DIF, Wiberg (2007) indicated that identifying problematic items via item analysis plays a key role in a test. It is maintained that item analysis includes using statistical techniques to examine the test takers' performance on the items (Wiberg, 2007, p. 1), and one of the crucial parts of item analysis is detecting differential item functioning. The DIF technique is still a very useful method for identifying potential problem items (Angoff, 1993).

Differential item functioning occurs when an item's properties in one group are different from the item's properties in another group (Furr & Bacharach, 2007, p. 331). Furr and Bacharach (2007) illustrated this point with an example: DIF exists when a particular item has different levels of difficulty for males and females. Put another way, the incidence of differential item functioning means that a male and a female who have the same trait or ability level have different probabilities of answering the item correctly. It is concluded that the presence of DIF between groups shows that the groups cannot be meaningfully compared on the item (Furr & Bacharach, 2007).
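The definition above can be made concrete with a small numerical sketch. It assumes the dichotomous Rasch response function and uses invented ability and difficulty values (they are not NUEEFL estimates); the function name rasch_probability is hypothetical. It only illustrates how two test takers with the same ability can have different success probabilities when an item's difficulty differs by group.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical values, chosen only for illustration (not NUEEFL estimates).
theta = 0.0                # same ability (in logits) for both test takers
difficulty_female = -0.2   # item difficulty as calibrated in the female group
difficulty_male = 0.6      # item difficulty as calibrated in the male group

p_female = rasch_probability(theta, difficulty_female)
p_male = rasch_probability(theta, difficulty_male)

# Equal ability, unequal success probability: the signature of DIF.
print(f"P(correct | female) = {p_female:.3f}")
print(f"P(correct | male)   = {p_male:.3f}")
```

If the two group-specific difficulties were equal, the two probabilities would coincide and the item would show no DIF in this sense.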

DIF procedures are used to determine whether the individual items on a test function in the same way for two or more groups of examinees, usually defined by racial/ethnic background, sex, age/experience, or handicapping condition (Scheuneman & Bleistein, 1989, pp. 255-256). Many DIF analysis techniques have been proposed to date, and a number of studies have categorized them. McNamara and Roever (2006, p. 93) classified methods for detecting DIF into four broad categories:

1. Analyses based on item difficulty: These approaches compare item difficulty estimates.

2. Nonparametric approaches: These procedures use contingency tables, chi-square, and odds ratios.

3. Item-response-theory-based approaches: These include 1-, 2-, and 3-parameter IRT analyses.

4. Other approaches: These include logistic regression, which also employs a model comparison method, as well as generalizability theory and multifaceted measurement, which are less commonly used in classic DIF studies.

A large range of possible techniques is available; however, only a limited number are currently used. The following section considers Item Response Theory (IRT)-based models, specifically the Rasch model, as a method applicable and germane to the present research.

The Rasch Model

IRT is an extension of classical test theory with mathematical roots that penetrate deeply into psychology; the mathematical basis of IRT is embedded in psychological measurement (Ostini & Nering, 2006). Some controversial issues, however, exist in defining the concept of measurement in the human sciences and psychology. The Rasch model is mathematically equivalent to the one-parameter logistic (1PL) IRT model, but the two were developed separately (DeMars, 2010). Controversy surrounds the Rasch model; some specialists believe that the Rasch model and IRT models are structurally different and are used very differently. It is claimed that IRT models are used to describe and fit data: when fit is poor, the model is adapted or discarded in favor of another model. In contrast, the Rasch model is more prescriptive: the data are required to fit the model, and when they do not, items that show misfit are discarded until a satisfactory fit is obtained (Zand Scholten, 2011, p. 39).

The Rasch model involves model-based measurement in which trait level estimates depend on both the persons' responses and on the properties of the items that were administered (Embretson & Reise, 2000, p. 13). Furthermore, the test items should not act differently for any specific subgroup of the participants. If an item behaves differently for particular groups, then the validity of the measure for the construct decreases, and this is considered a threat to test fairness. The Rasch model approach permits investigation of items biased toward different subgroups and inspection of construct-irrelevant factors (e.g., gender, ethnicity, and academic background) by calculating Differential Item Functioning (DIF) measures.

Besides that, the Rasch model assumptions include unidimensionality and local independence. A unidimensional test consists of items that refer to only one dimension; as DeMars (2010) asserted, whenever only a single score is reported for a test, there is an implicit assumption that the items share a common primary construct (p. 38). Wale (2013) mentioned that the assumption of unidimensionality requires that the items function in unison and that all non-random variance in the data can be accounted for by person ability and item difficulty (p. 56). Generally, unidimensionality indicates whether the items measure a single latent trait (θ) (DeMars, 2010).

One aspect of unidimensionality deserves caution. Sometimes, responses to test items can be mathematically unidimensional while the items measure what educators and psychologists would conceptualize as two different constructs. For example, test items may gauge both test-taking speed and knowledge (DeMars, 2010).

Another assumption of the Rasch model is local independence. Local independence means that the probability of a test taker responding correctly to a certain item does not depend on previous responses or on the answers given by other individuals to the same item (Wale, 2013, p. 56). Both unidimensionality and local independence can be checked via model fit statistics; a person or an item is said to be misfitting to the extent that it does not act as the Rasch model would predict.

METHODS

Participants

The participants of the present study (N = 5000) were selected from a pool of 20,000 examinees who had taken a recent version of the NUEEFL test in 2015. The participants were randomly selected from the two gender groups (i.e., males and females). The female group comprised 3335 of the participants, and the remaining 1665 examinees were male. The academic background and the age of the participants were not considered in this study.

Instrumentation

The National University Entrance Examination for Foreign Languages (NUEEFL) has a total of 95 items, of which 25 are general questions and 70 come under six subtests: a) Grammar (10 items), b) Vocabulary (15 items), c) Sentence Structure (5 items), d) Language Functions (10 items), e) Cloze Test (15 items), and f) Reading Comprehension (15 items).

The NUEEFL test is administered annually to more than 100,000 university applicants seeking admission to B.A. programs at governmental universities, specifically in the field of foreign languages. All questions are in multiple-choice format and are dichotomously scored. The test is timed, with 105 minutes allotted. As a rule, guessing is discouraged on the NUEEFL: wrong responses receive a negative score, such that three wrong answers cancel out one correct answer.
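The negative-marking rule described above can be sketched as a one-line calculation. The function name formula_score and the example counts are hypothetical, and the sketch assumes that omitted items neither add to nor subtract from the score, which the text does not state explicitly.

```python
def formula_score(n_correct: int, n_wrong: int, penalty_ratio: int = 3) -> float:
    """Score after negative marking: every `penalty_ratio` wrong answers
    cancel one correct answer (omitted items are assumed to count as zero)."""
    return n_correct - n_wrong / penalty_ratio


# Hypothetical examinee on the 95-item test: 60 correct, 15 wrong, 20 omitted.
print(formula_score(60, 15))  # 55.0
```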

Winsteps version 3.92.1 (updated in February 2016), the latest version at the time, was employed for the data analysis (Linacre, 2016a, b). Winsteps constructs Rasch measures from simple data sets (usually of persons and items) and applies the dichotomous Rasch model. The person reliability and item reliability of the NUEEFL test were excellent (person reliability = 0.93; item reliability = 1).

Procedure

The NUEEFL is administered annually to a large number of test takers all across Iran. The present study focused on one aspect of test validity, which was assessed through the implementation of the Rasch model. To conduct the DIF analysis and apply the Rasch model, the statistical and mathematical assumptions of the model must be met.

Data Analysis

The psychometric properties of the items were estimated using the Winsteps software (Linacre, 2016b). Since the dataset was dichotomous, the data were analyzed using the Joint Maximum Likelihood Estimation (JMLE) method for estimating the Rasch parameters. In JMLE, the estimate of a Rasch parameter is reached when the observed raw score for the parameter matches the expected raw score.
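The matching condition described above (observed raw score equals expected raw score) can be illustrated with a simplified sketch. It is not the Winsteps JMLE routine: it estimates a single person's ability while holding the item difficulties fixed, whereas JMLE alternates such updates over persons and items. The function names, the damped Newton-Raphson step, and the item difficulties are assumptions made for illustration.

```python
import math


def expected_score(theta, difficulties):
    """Expected raw score: sum of Rasch success probabilities over the items."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)


def estimate_ability(raw_score, difficulties, tol=1e-8, max_iter=100):
    """Newton-Raphson search for the ability whose expected raw score
    equals the observed raw score (item difficulties held fixed)."""
    if not 0 < raw_score < len(difficulties):
        raise ValueError("extreme scores have no finite estimate")
    theta = 0.0
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        residual = raw_score - sum(probs)                # observed - expected
        information = sum(p * (1.0 - p) for p in probs)  # model variance
        step = max(-1.0, min(1.0, residual / information))  # damped update
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical item difficulties in logits (not NUEEFL calibrations).
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
theta_hat = estimate_ability(3, difficulties)
print(round(theta_hat, 3), round(expected_score(theta_hat, difficulties), 3))
```

At the returned estimate the expected raw score reproduces the observed score of 3, which is the stopping condition the paragraph describes.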

The data-model fit was estimated by employing the infit and outfit mean-square values to identify misfitting and well-fitting items. To say that a person or an item is misfitting denotes that the person or item does not act as the Rasch model would predict (Boone et al., 2014). The fit estimation checks for model mis-specifications that can be evaluated in the fit between the model and the data (DeMars, 2010). There are two different fit statistics for persons or items: the weighted statistic (infit), which weights the squared residual by the variance of the item, and the unweighted statistic (outfit), which gives every residual the same weight (i.e., 1) (Wale, 2013, p. 57). The acceptable range for both statistics is between 0.70 and 1.30 (Bond & Fox, 2007; Liu, 2010).
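The two mean-square statistics can be sketched for a single dichotomous item as follows. The sketch assumes that person abilities and the item difficulty are already known; the function names and the example data are hypothetical, and the 0.70 to 1.30 band is the range quoted above.

```python
import math


def item_fit(responses, thetas, difficulty):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    squared_residuals, variances = [], []
    for x, theta in zip(responses, thetas):
        p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
        variances.append(p * (1.0 - p))         # model variance of the response
        squared_residuals.append((x - p) ** 2)  # squared raw residual
    # Outfit: unweighted mean of the squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: squared residuals weighted by the model variance.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


def is_acceptable(mnsq, low=0.70, high=1.30):
    """Check a mean square against the acceptable range cited in the text."""
    return low <= mnsq <= high


# Hypothetical abilities (logits) and responses to one item of difficulty 0.
thetas = [-1.5, -0.5, 0.0, 0.5, 1.5]
responses = [0, 0, 1, 1, 1]
infit, outfit = item_fit(responses, thetas, difficulty=0.0)
print(f"infit = {infit:.2f}, outfit = {outfit:.2f}")
```

Because outfit divides each squared residual by its own variance, an unexpected response from a person far from the item's difficulty inflates outfit more than infit.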

Furthermore, like many IRT models, the Rasch model rests on two basic assumptions: unidimensionality and local independence. The unidimensionality assumption requires that there is only one underlying construct measured by the set of items included in the test. That is, the test measures only one factor. The local independence assumption requires that an examinee's response to an item does not influence his or her response to any other item. Hence, the items must not give a clue to the correct response for another item.

Unidimensionality was checked through Principal Component Analysis (PCA) in Winsteps. What is required is that there must be one dominant factor explaining the shared covariance among the items (Hambleton, Swaminathan, & Rogers, 1991). Hence, unidimensionality will hold if the first extracted factor explains a much higher amount of the total variance than that explained by the secondary dimensions. As mentioned before, multiple methods for assessing unidimensionality exist, including the data-model fit statistics. However, studies have indicated that these statistics lack the sensitivity required to detect multidimensionality. Hence, it is logical to use Principal Component Analysis (PCA) on the raw data and residuals, in addition to checking the data-model fit.

One approach for assessing unidimensionality within the Rasch model framework, first proposed by Smith (2002), is the Principal Component Analysis (PCA) based method. This approach aims at assessing whether the items are unidimensional enough to be treated as such in practice (Smith, 2002). Principal Component Analysis has the advantage of compressing the data once patterns have been found in them. It reduces the number of dimensions without losing too much information (Smith, 2002). PCA determines whether the set of items represents a single construct or not. The analysis of dimensionality, based on Smith's approach, involves a two-step process.

First, the measurement dimension of the scale was estimated using the Rasch model. The variance associated with this measurement dimension was extracted from the item-response data by computing standardized residuals: (observed - expected)/ (model standard error). Second, a principal component analysis of the standardized residuals was used to determine whether substantial subdimensions existed within the items. If the items measure a single latent dimension as estimated by the Rasch model, then the remaining residual variance should reflect random variation.

McCreary et al., 2013, p. 6
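The two-step process in the quotation can be sketched in plain Python. This is an illustration under stated assumptions rather than the Winsteps procedure: abilities and difficulties are taken as already estimated, the data are invented, the helper names are hypothetical, and the largest eigenvalue of the residual correlation matrix is approximated by power iteration instead of a full principal component decomposition.

```python
import math
import random


def standardized_residuals(responses, thetas, difficulties):
    """Step 1: z = (observed - expected) / model standard error."""
    z = []
    for person, theta in zip(responses, thetas):
        row = []
        for x, b in zip(person, difficulties):
            p = 1.0 / (1.0 + math.exp(-(theta - b)))
            row.append((x - p) / math.sqrt(p * (1.0 - p)))
        z.append(row)
    return z


def correlation_matrix(z):
    """Item-by-item Pearson correlations of the residual columns."""
    n, k = len(z), len(z[0])
    cols = [[row[j] for row in z] for j in range(k)]
    means = [sum(c) / n for c in cols]
    centered = [[v - m for v in c] for c, m in zip(cols, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def largest_eigenvalue(matrix, iterations=500, seed=0):
    """Step 2: dominant eigenvalue of a symmetric matrix by power iteration."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
    return sum(a * b for a, b in zip(v, w))  # Rayleigh quotient


# Hypothetical data: 6 persons by 4 items, with assumed Rasch estimates.
thetas = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
difficulties = [-0.5, 0.0, 0.5, 1.0]
responses = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 1],
             [1, 0, 1, 0], [1, 1, 1, 0], [1, 1, 0, 1]]

z = standardized_residuals(responses, thetas, difficulties)
first_contrast = largest_eigenvalue(correlation_matrix(z))
print(f"eigenvalue of the first residual component: {first_contrast:.2f}")
```

If the residuals were pure random noise, no residual component would stand far above an eigenvalue of 1; a large first eigenvalue points to a possible subdimension among the items.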

To determine the unidimensionality of the questions with the PCA method, conventional criteria were used (Linacre, 2006). In this regard, Reckase (1979) suggested the following criteria for unidimensionality: a) the amount of variance explained by the measures should be > 20%; b) the eigenvalue (size) of the unexplained variance in the first contrast should be < 3.0, and an unexplained variance in the first contrast of < 5% is good (Linacre, 2006, p. 272).
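The criteria above amount to three threshold checks, sketched here. The function name check_unidimensionality and the input values are hypothetical; in practice the three quantities would be read from the software's dimensionality output.

```python
def check_unidimensionality(variance_explained_pct,
                            first_contrast_eigenvalue,
                            first_contrast_pct):
    """Apply the three thresholds quoted in the text and report each result."""
    checks = {
        "variance explained by measures > 20%": variance_explained_pct > 20.0,
        "first-contrast eigenvalue < 3.0": first_contrast_eigenvalue < 3.0,
        "first-contrast unexplained variance < 5%": first_contrast_pct < 5.0,
    }
    return checks, all(checks.values())


# Hypothetical values, for illustration only (not the study's results).
checks, supported = check_unidimensionality(35.0, 2.1, 3.2)
for label, passed in checks.items():
    print(f"{label}: {'met' if passed else 'not met'}")
print("unidimensionality supported:", supported)
```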

According to Smith's approach, the parameters for all questions were estimated first; this is called Level A in the present research. In the PCA approach, item residuals with loadings of +0.3 or more and -0.3 or less are taken as potential representatives of subdimensions (Hagell, 2014). In the second round of data analysis, the questions with an outfit MNSQ value larger than 1.3 were detected. This phase is called Level B, in which the parameters of these questions were estimated separately. Then, the difference between the difficulty parameters of the questions obtained in Level B and those obtained in Level A was computed, and the mean of the differences was calculated. Technically, this value is called the constant correction.

In the next step, the participants' ability parameters were calculated twice: once on the basis of the entire test (Level A) and once on the basis of the separately calibrated items (Level B). Afterwards, the constant correction value, which had been calculated previously, was added to the Level B ability parameters. Finally, the difference between each individual's ability level on the entire test and on the separate items in Level B was examined through a series of independent t-tests, both to determine statistical significance and to compare the two estimates on a person-by-person basis in order to determine the proportion of instances in which the two item sets yield different person measures (Hagell, 2014, p. 460).

In order to determine the significance of the t-tests, the Type I error level was set at α = 0.05. Smith (2002) indicated that if the proportion of significant t-tests exceeds 5%, local independence and unidimensionality are violated.
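The person-by-person comparison can be sketched as follows. The sketch assumes that each person has two ability estimates with standard errors (one from each item set), that the t statistic is the difference divided by the pooled standard error, and that 1.96 serves as a large-sample critical value for α = 0.05; the function name and the data are hypothetical.

```python
import math


def proportion_significant(measures_a, errors_a, measures_b, errors_b,
                           critical_t=1.96):
    """Share of persons whose two ability estimates differ significantly."""
    flagged = 0
    for ta, sa, tb, sb in zip(measures_a, errors_a, measures_b, errors_b):
        t = (ta - tb) / math.sqrt(sa ** 2 + sb ** 2)
        if abs(t) > critical_t:
            flagged += 1
    return flagged / len(measures_a)


# Hypothetical ability estimates (logits) and standard errors for 4 persons.
level_a = [0.5, -0.3, 1.2, 0.0]
se_a = [0.3, 0.3, 0.3, 0.3]
level_b = [0.6, -0.2, 1.1, 1.5]
se_b = [0.4, 0.4, 0.4, 0.4]

share = proportion_significant(level_a, se_a, level_b, se_b)
print(f"{share:.0%} of persons show a significant difference")
```

With a realistic sample, the resulting share would be compared against the 5% benchmark mentioned above.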

Moreover, a DIF analysis was conducted to test the invariance of measurement. Test developers deploy several quality control or statistical procedures to ensure that the test items are proper and fair for all examinees (Camilli & Penfield, 1997; Holland & Wainer, 2012; Ramsey, 1993).

According to Angoff (1993), an item that shows DIF has different statistical properties in different groups after controlling for differences in the abilities of the groups. DIF is described as an unexpected difference between two groups on an item after the groups have been matched on the underlying ability that the item is intended to measure (Camilli, 2006; Wiberg, 2007). The essential component of DIF analysis in the Rasch model is to compare the item difficulties obtained from the two samples. If the difference in the difficulty estimates is large, measurement invariance fails and DIF has occurred. In the present study, the DIF analysis was carried out to test whether the items functioned differently across gender groups.
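To illustrate what a difference in group-specific difficulty means in practice, the sketch below evaluates the dichotomous Rasch model for two examinees of equal ability. The difficulties are the male and female DIF measures reported for item Q90 in Table 4; the ability value of 0.5 logits and the function name are hypothetical and chosen only for illustration.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

theta = 0.5                          # hypothetical ability shared by both examinees
male_difficulty = 0.29               # Q90 DIF measure for males (Table 4)
female_difficulty = 0.81             # Q90 DIF measure for females (Table 4)

p_male = rasch_probability(theta, male_difficulty)
p_female = rasch_probability(theta, female_difficulty)
dif_contrast = male_difficulty - female_difficulty

print(round(dif_contrast, 2), p_male > p_female)
```

With equal ability, the examinee from the group for which the item is easier has the higher expected probability of success, which is what a non-zero DIF contrast expresses.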

RESULTS

Data-model Fit Estimation

The Winsteps software assesses model fit through mean-square fit values (MNSQs) and standardized Z values (ZSTDs). MNSQ values range from zero to infinity (0 to ∞), and the expected value is 1. Values above 1 show a deviation from unidimensionality, and values less than 1 indicate overfit of the response patterns to the model. Overfit implies the existence of dependency among responses or items.

It is worth mentioning that the standardized index (ZSTD) is very sensitive to sample size. With samples of fewer than 90 people, the model appears to fit almost any data, whereas with samples of more than 900 people the model appears not to fit.

Therefore, due to the large sample size in this study and in keeping with the guidance provided by Linacre (2012), data-model fit was assessed through MNSQs. The Rasch model offers two indicators of misfit: the infit and outfit mean square indices. Infit is sensitive to unexpected responses to items near the person's ability level, and outfit considers differences between observed and expected responses regardless of how far away the item endorsability is from the person's ability (McCreary et al., 2013, p. 7). Both outfit and infit MNSQs were estimated and reported for analyzing the fit of the model.
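For readers unfamiliar with these indices, the following minimal sketch shows how infit and outfit mean squares are commonly computed for a single dichotomous item: outfit is the unweighted mean of the squared standardized residuals, and infit weights the squared residuals by the model variance. The function names and the response data are hypothetical; Winsteps' own implementation may differ in detail.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_fit(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        squared_residuals.append((x - p) ** 2)   # observed minus expected, squared
        variances.append(p * (1.0 - p))          # model variance of the response
    # Outfit: plain average of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted, so off-target persons count less.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

# Hypothetical data: the third person is far below the item but answers correctly.
infit, outfit = item_fit([1, 0, 1], [0.0, 0.0, -4.0], 0.0)
print(outfit > infit)
```

The single unexpected response from a person far from the item's difficulty inflates outfit much more than infit, which is the distinction drawn in the paragraph above.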

For both indicators, values between 0.70 and 1.3 are considered acceptable, or so-called good-fit, values.

Values less than 0.70 indicate overfit, whereas values above 1.3 are a sign of underfit. Furthermore, Linacre favors outfit-MNSQs over infit-MNSQs. Hence, in the present research, outfit-MNSQs were used as the criterion for assessing data-model fit and interpreting the outcomes. The acceptable values for this index are in the range of 0.70 to 1.3.

In estimating model fit, participants with a total score of zero must be eliminated. The data were also screened for outliers. In addition, the fit indices should be reported for the item calibration. The difficulty estimates for the items, the standard errors of the item difficulty estimates, and the infit-MNSQ and outfit-MNSQ indices are shown in Table 1.

Note that in Table 1 due to space restriction, only the estimation of difficulty parameter and model fit estimations of misfit items are illustrated in descending order (from the most difficult to the least difficult).

Table 1. Item Statistics for Fit Model Estimate and Difficulty Parameter in the Entire Test (Descending Order)

| Item | Entry Number | Total Score | Total Count | Measure | Model S.E. | Infit MNSQ | Infit ZSTD | Outfit MNSQ | Outfit ZSTD |
|---|---|---|---|---|---|---|---|---|---|
| Q155 | 80 | 158 | 5000 | 2.45 | 0.08 | 1.07 | 1 | 1.82 | 4.8 |
| Q126 | 51 | 191 | 5000 | 2.23 | 0.08 | 1.08 | 1.2 | 2.05 | 6.3 |
| Q137 | 62 | 265 | 5000 | 1.84 | 0.07 | 1.04 | 0.8 | 1.32 | 2.6 |
| Q105 | 30 | 285 | 5000 | 1.76 | 0.07 | 1.19 | 3.7 | 3.01 | 9.9 |
| Q118 | 43 | 314 | 5000 | 1.64 | 0.06 | 1.18 | 3.6 | 1.98 | 7.3 |
| Q166 | 91 | 330 | 5000 | 1.57 | 0.06 | 1.05 | 1.2 | 1.49 | 4.2 |
| Q101 | 26 | 335 | 5000 | 1.56 | 0.06 | 1.26 | 5.3 | 3.3 | 9.9 |
| Q103 | 28 | 363 | 5000 | 1.46 | 0.06 | 1.28 | 6.1 | 2.99 | 9.9 |
| Q158 | 83 | 367 | 5000 | 1.44 | 0.06 | 1.14 | 3.1 | 1.46 | 4.2 |
| Q111 | 36 | 392 | 5000 | 1.36 | 0.06 | 1.12 | 2.9 | 1.87 | 7.4 |
| Q115 | 40 | 411 | 5000 | 1.3 | 0.06 | 1.14 | 3.5 | 2.02 | 8.6 |
| Q133 | 58 | 411 | 5000 | 1.3 | 0.06 | 1.1 | 2.5 | 1.77 | 6.8 |
| Q121 | 46 | 459 | 5000 | 1.15 | 0.05 | 1.38 | 9.2 | 2.43 | 9.9 |
| Q167 | 92 | 462 | 5000 | 1.14 | 0.05 | 0.83 | -4.9 | 0.64 | -4.7 |
| Q109 | 34 | 463 | 5000 | 1.14 | 0.05 | 1.24 | 6 | 2.03 | 9.2 |
| Q128 | 53 | 496 | 5000 | 1.05 | 0.05 | 1.2 | 5.3 | 1.44 | 4.7 |
| Q122 | 47 | 600 | 5000 | 0.79 | 0.05 | 1.13 | 4.2 | 1.46 | 5.4 |
| Q156 | 81 | 732 | 5000 | 0.5 | 0.04 | 1.2 | 6.9 | 1.41 | 5.4 |
| Q99 | 24 | 821 | 5000 | 0.33 | 0.04 | 0.83 | -7.1 | 0.69 | -5.7 |
| Q84 | 9 | 860 | 5000 | 0.26 | 0.04 | 1.28 | 9.9 | 1.6 | 8.5 |
| Q153 | 78 | 874 | 5000 | 0.24 | 0.04 | 0.78 | -9.4 | 0.59 | -8.1 |
| Q149 | 74 | 1046 | 5000 | -0.05 | 0.04 | 0.75 | -9.9 | 0.57 | -9.9 |
| Q108 | 33 | 1079 | 5000 | -0.1 | 0.04 | 1.17 | 7.6 | 1.33 | 6.1 |
| Q160 | 85 | 1151 | 5000 | -0.21 | 0.04 | 0.8 | -9.9 | 0.67 | -8.1 |
| Q91 | 16 | 2076 | 5000 | -1.37 | 0.03 | 0.76 | -9.9 | 0.67 | -9.9 |
| Q79 | 4 | 2513 | 5000 | -1.85 | 0.03 | 1.2 | 9.9 | 1.36 | 9.9 |
| Mean | | 1170.9 | 5000 | 0 | 0.04 | 1 | -0.7 | 1.14 | 0.1 |
| P.SD | | 790.5 | 0 | 1.21 | 0.01 | 0.13 | 5.4 | 0.5 | 5.3 |

The first column of Table 1 is the Item Number. The second column shows the Entry Number, which gives the order in which the data were entered in each row. The next column provides the Total Score, which is the total number of correct answers, and Total Count is the total number of participants or responses. The fifth column, Measure, reports the difficulty estimates for the items. The difficulty parameter ranges from 2.45 to -2.99, with a mean of 0 and a standard deviation (SD) of 1.21. The descending order of the item statistics makes it easy to locate the most difficult item, Q155 (Measure = 2.45); the least difficult item is Q87 (Measure = -2.99). Column six provides the standard error (SE) of the item difficulty estimates.

In the last four columns the infit and outfit statistics are presented. Although the infit-MNSQ values are reported in the table, only the outfit-MNSQ values were used in estimating data-model fit. As explained before, the acceptable range for both infit and outfit MNSQs is between 0.70 and 1.3. In this study, most outfit-MNSQ indices are within the acceptable range; however, 26 items fall outside it. In Table 1 the outfit-MNSQ values range from 0.57 to 3.30, which means that the items presented in Table 1 do not fit the model. In short, the outfit-MNSQ statistics reveal that 26 items (27% of the items) do not fit.
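The count of misfitting items can be reproduced from the Outfit MNSQ column of Table 1. The sketch below is illustrative rather than the study's own analysis script; the function name and the labels "underfit" and "overfit" are assumptions, while the values and the 0.70 to 1.3 range come from the text.

```python
LOWER, UPPER = 0.70, 1.3   # acceptable outfit-MNSQ range used in the study
TOTAL_ITEMS = 95           # number of items in the NUEEFL test

# Outfit MNSQ values copied from Table 1 (misfitting items only).
outfit_mnsq = {
    "Q155": 1.82, "Q126": 2.05, "Q137": 1.32, "Q105": 3.01, "Q118": 1.98,
    "Q166": 1.49, "Q101": 3.3, "Q103": 2.99, "Q158": 1.46, "Q111": 1.87,
    "Q115": 2.02, "Q133": 1.77, "Q121": 2.43, "Q167": 0.64, "Q109": 2.03,
    "Q128": 1.44, "Q122": 1.46, "Q156": 1.41, "Q99": 0.69, "Q84": 1.6,
    "Q153": 0.59, "Q149": 0.57, "Q108": 1.33, "Q160": 0.67, "Q91": 0.67,
    "Q79": 1.36,
}

def classify(mnsq):
    """Label a mean-square value relative to the acceptable range."""
    if mnsq > UPPER:
        return "underfit"
    if mnsq < LOWER:
        return "overfit"
    return "acceptable"

labels = {item: classify(value) for item, value in outfit_mnsq.items()}
underfit = [item for item, label in labels.items() if label == "underfit"]
overfit = [item for item, label in labels.items() if label == "overfit"]
misfit_share = round(100 * (len(underfit) + len(overfit)) / TOTAL_ITEMS)

print(len(underfit), len(overfit), misfit_share)  # 20 6 27
```

Twenty items exceed 1.3 and six fall below 0.70, which together give the 26 misfitting items (27% of 95) reported above.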

Figure 1. The ICC Curve for Misfitting Item, Item 105.

Figure 1 shows the Item Characteristic Curve (ICC) for Item 105. The red curve is the expected ICC, which would be obtained if the data fitted the Rasch model. The blue curve is the observed, or empirical, ICC. The grey lines on either side of the red curve mark the confidence interval, which is constructed from an estimate and its standard error. Because the present study draws on a large data set, the standard errors are very small. The confidence interval becomes wider when the standard error is large.
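The two curves can be thought of as follows: the expected ICC is the Rasch probability as a function of ability, and the empirical ICC is the proportion of correct answers among persons grouped by ability. The sketch below assumes a simple fixed-width binning and hypothetical response data; it is not the procedure Winsteps uses to draw its plots.

```python
import math

def expected_icc(theta, difficulty):
    """Model-expected probability of success at ability theta."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def empirical_icc(abilities, responses, bin_width=1.0):
    """Group persons into ability bins; return {bin center: proportion correct}."""
    bins = {}
    for theta, x in zip(abilities, responses):
        index = math.floor(theta / bin_width)
        bins.setdefault(index, []).append(x)
    return {(index + 0.5) * bin_width: sum(xs) / len(xs)
            for index, xs in sorted(bins.items())}

# Hypothetical abilities and scored responses for one item.
abilities = [0.2, 0.8, 1.5, 1.7]
responses = [0, 1, 1, 1]
difficulty = 1.76  # measure of Item 105 in Table 1

for center, observed in empirical_icc(abilities, responses).items():
    print(center, observed, round(expected_icc(center, difficulty), 2))
```

A large gap between the observed proportion and the expected probability in several bins is what appears in the figure as a deviation of the empirical curve from the expected one.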

Figure 1 shows that the empirical ICC for the misfitting Item 105 deviates considerably from the expected ICC. This is reflected in the large outfit-MNSQ value of 3.01 for Item 105. In contrast, Figure 2 shows the empirical ICC for a good-fitting item (i.e., Item 142), whose outfit-MNSQ value (i.e., 0.94) is within the acceptable range.

Figure 2. The ICC Curve for Good-fit Item, Item 142.

The presence of a large number of misfitting items demonstrates that the NUEEFL data do not fit the model well. Therefore, the assumptions of the model may be violated, and the unidimensionality assumption of the Rasch model in particular may not hold. Thus, the next section reports the results for unidimensionality and local independence.

Unidimensionality

The unidimensionality assumption requires that only one underlying construct is measured by the set of items included in the test. That is, the test measures only one factor. There are multiple methods for assessing unidimensionality, including the data-model fit statistics. However, studies have indicated that these statistics are not sensitive enough to detect multidimensionality. Hence, it is logical to use a Principal Component Analysis (PCA) on the raw data and residuals in addition to checking the data-model fit. In the current research, the unidimensionality of the test and its items was checked through a PCA of residuals and a t-test.

In order to assess dimensionality, a PCA of the Rasch residuals was performed. The variance explained by the measurement dimension is 34.8%, with 12.7% of the raw variance explained by persons and 22.1% explained by items. This exceeds the 20% requirement, indicating that the data are unidimensional (Reckase, 1979).
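These percentages follow directly from the eigenvalues reported in Table 2: each share is the component's eigenvalue divided by the total raw variance in the observations. The short sketch below reproduces the arithmetic; the variable names are illustrative.

```python
# Eigenvalues copied from Table 2 (PCA of Rasch residuals).
eigenvalues = {
    "measures": 50.7068,
    "persons": 18.5454,
    "items": 32.1614,
    "unexplained": 95.0000,
}
total_raw_variance = 145.7068

# Percentage of total raw variance attributed to each component.
shares = {name: round(100 * value / total_raw_variance, 1)
          for name, value in eigenvalues.items()}

meets_20_percent_rule = shares["measures"] > 20
print(shares, meets_20_percent_rule)
```

The variance explained by the measures is the sum of the person and item components (18.5454 + 32.1614 = 50.7068), and its share of 34.8% is what is compared against the 20% criterion.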

The eigenvalues of the unexplained variance in the first five contrasts are 3.4, 2.5, 2.2, 1.9, and 1.7, and the first contrast accounts for 2.4% of the observed variance, which is below the 5% criterion. The results of the data analysis suggest that unidimensionality holds across the whole test (see Tables 2 and 3).

Table 2. PCA Analysis

| Variance in Eigenvalue units | Expected | Observed | Eigenvalue |
|---|---|---|---|
| Raw variance explained by measures | 33.7% | 34.8% | 50.7068 |
| Raw variance explained by persons | 12.3% | 12.7% | 18.5454 |
| Raw variance explained by items | 21.4% | 22.1% | 32.1614 |
| Raw unexplained variance | 66.3% | 65.2% | 95.0000 |
| Total raw variance in observations | 100.0% | 100.0% | 145.7068 |

Table 3. Contrast in Eigenvalue Units

| Contrast in Eigenvalue units | Expected | Observed | Eigenvalue |
|---|---|---|---|
| Unexplained variance in 1st contrast | 3.6% | 2.4% | 3.4662 |
| Unexplained variance in 2nd contrast | 2.7% | 1.7% | 2.5397 |
| Unexplained variance in 3rd contrast | 2.4% | 1.5% | 2.2560 |
| Unexplained variance in 4th contrast | 2.0% | 1.3% | 1.9438 |
| Unexplained variance in 5th contrast | 1.8% | 1.2% | 1.7076 |
| Raw unexplained variance (total) | 100.0% | 100.0% | 95.0000 |

Local independence was examined across all ability levels in order to identify whether the responses to items are independent of each other (Pae, 2011). For the t-test, the outfit-MNSQ statistic was examined first. The results showed that 20 items exceeded 1.3 on this statistic. These 20 items were taken as the subset (level B) against which the unidimensionality assumption was checked.

The t-tests were then computed following Smith's (2002) approach. The sum of the differences between the difficulty parameters of these 20 items obtained at levels A and B equals -1.12. Dividing -1.12 by the number of items (20) gives the constant correction value of -0.056. This value was added to the participants' ability parameters estimated from the level B questions. Then, the significance of the t-tests was examined and the significance level was adjusted.

The Student's t-statistics for the 20 items revealed no significant differences. Thus, it is concluded that the local independence assumption holds in the entire NUEEFL test. To sum up, both Rasch assumptions (i.e., unidimensionality and local independence) hold in the whole test.

Differential Item Functioning

The next step in data analysis was the DIF analysis. The ability and difficulty estimates in the Rasch model are assumed to be invariant. The statistical procedure aims at identifying items with different statistical features across certain groups of examinees.

In this study, DIF was investigated across gender groups for the NUEEFL items. For DIF analysis in a Rasch context, both the magnitude of the difference between the groups in logit units and the statistical significance of the difference should be considered (Linacre, 2016a). The magnitude of the DIF, that is, the difference between an item's difficulty for one group and its difficulty for the other group, should be at least 0.5 logits (Linacre, 2016a).

At this stage of the study, DIF was tested between males and females. To examine invariance, the difference between the item difficulties of the two groups was tested for statistical significance with a t-test. For statistically significant DIF, the probability of such a difference (0.5 logits or larger) occurring by chance should be ≤ 0.05. This is the probability of observing such a difference when there is no systematic item bias in the test items (Linacre, 2016a).

Besides, because statistical significance tests are affected by sample size and the groups in the present research are large, a DIF of at least 0.5 logits is needed to be noticeable. For instance, if the difficulty of an item differs by 0.5 logits between males and females, that item will be considered a DIF item. Note that, due to space limitations, only DIF-flagged items appear in Table 4.

Table 4. DIF-flagged items in the NUEEFL test

| Item Number | Person Class | DIF Measure | Person Class | DIF Measure | DIF Contrast | Rasch-Welch t | df | Prob |
|---|---|---|---|---|---|---|---|---|
| Q76 | M | -0.87 | F | -0.67 | -0.2 | -2.75 | INF | 0.0059 |
| Q77 | M | -1.95 | F | -2.11 | 0.15 | 2.19 | INF | 0.0289 |
| Q80 | M | -2.5 | F | -2.08 | -0.42 | -5.85 | INF | 0.0000 |
| Q85 | M | -0.16 | F | 0 | -0.16 | -1.96 | INF | 0.0498 |
| Q86 | M | -1.17 | F | -1.02 | -0.15 | -2.05 | INF | 0.0400 |
| Q90 | M | 0.29 | F | 0.81 | -0.52 | -5.6 | INF | 0.0000 |
| Q91 | M | -1.48 | F | -1.31 | -0.17 | -2.47 | INF | 0.0137 |
| Q94 | M | -0.99 | F | -0.81 | -0.18 | -2.5 | INF | 0.0123 |
| Q95 | M | -1.08 | F | -0.72 | -0.35 | -4.83 | INF | 0.0000 |
| Q97 | M | -0.41 | F | -0.12 | -0.29 | -3.59 | INF | 0.0003 |
| Q99 | M | 0.14 | F | 0.45 | -0.31 | -3.52 | INF | 0.0004 |
| Q103 | M | 1.62 | F | 1.36 | 0.26 | 2.1 | INF | 0.0362 |
| Q104 | M | 0.55 | F | 0.35 | 0.21 | 2.22 | INF | 0.0263 |
| Q105 | M | 2.01 | F | 1.61 | 0.4 | 2.89 | INF | 0.0039 |
| Q106 | M | -0.46 | F | -0.09 | -0.37 | -4.66 | INF | 0.0000 |
| Q107 | M | -0.4 | F | -0.63 | 0.23 | 2.93 | INF | 0.0034 |
| Q108 | M | 0.03 | F | -0.17 | 0.2 | 2.42 | INF | 0.0157 |
| Q111 | M | 1.62 | F | 1.21 | 0.41 | 3.35 | INF | 0.0008 |
| Q113 | M | -0.48 | F | -0.07 | -0.41 | -5.15 | INF | 0.0000 |
| Q117 | M | -1.19 | F | -0.98 | -0.21 | -2.93 | INF | 0.0034 |
| Q119 | M | 1.31 | F | 1.67 | -0.36 | -2.99 | INF | 0.0028 |
| Q121 | M | 1.52 | F | 0.96 | 0.55 | 4.77 | INF | 0.0000 |
| Q122 | M | 0.96 | F | 0.7 | 0.26 | 2.58 | INF | 0.0100 |
| Q125 | M | 0.36 | F | 0.81 | -0.44 | -4.71 | INF | 0.0000 |
| Q128 | M | 1.2 | F | 0.96 | 0.24 | 2.2 | INF | 0.0279 |
| Q130 | M | 0.39 | F | 0.18 | 0.21 | 2.37 | INF | 0.0179 |
| Q131 | M | -0.18 | F | -0.41 | 0.22 | 2.78 | INF | 0.0054 |
| Q132 | M | -1.38 | F | -1.7 | 0.32 | 4.55 | INF | 0.0000 |
| Q135 | M | -0.96 | F | -1.42 | 0.46 | 6.43 | INF | 0.0000 |
| Q138 | M | -0.86 | F | -1.07 | 0.21 | 2.84 | INF | 0.0046 |
| Q141 | M | -1.81 | F | -2.02 | 0.21 | 2.95 | INF | 0.0032 |
| Q142 | M | 0.08 | F | -0.12 | 0.2 | 2.35 | INF | 0.0187 |
| Q143 | M | -0.49 | F | -0.3 | -0.19 | -2.46 | INF | 0.0140 |
| Q145 | M | -0.9 | F | -1.14 | 0.24 | 3.3 | INF | 0.0010 |
| Q147 | M | 1.74 | F | 1.33 | 0.41 | 3.28 | INF | 0.0011 |
| Q157 | M | -0.04 | F | 0.24 | -0.28 | -3.3 | INF | 0.0010 |
| Q159 | M | 0.47 | F | 0.97 | -0.5 | -5.12 | INF | 0.0000 |
| Q163 | M | 0.05 | F | -0.18 | 0.22 | 2.69 | INF | 0.0072 |
| Q167 | M | 0.96 | F | 1.27 | -0.3 | -2.8 | INF | 0.0052 |
| Q168 | M | -0.42 | F | -0.89 | 0.47 | 6.22 | INF | 0.0000 |

Note. M = Male; F = Female; INF = Infinity
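The two considerations named for DIF in a Rasch context, the size of the contrast in logits and its statistical significance, can be checked separately for any row of Table 4. The sketch below does this for four rows copied from the table. The function name and the default thresholds (0.5 logits and 0.05) follow the text, but how the study combined the two checks when flagging items is not spelled out, so the code reports them side by side rather than merging them.

```python
def dif_flags(contrast, prob, min_logits=0.5, alpha=0.05):
    """Report the magnitude and significance checks for one item."""
    return {
        "large": abs(contrast) >= min_logits,   # contrast of at least 0.5 logits
        "significant": prob <= alpha,           # Rasch-Welch probability
    }

# (item, DIF contrast, Prob) copied from Table 4.
rows = [
    ("Q76", -0.20, 0.0059),
    ("Q90", -0.52, 0.0000),
    ("Q121", 0.55, 0.0000),
    ("Q135", 0.46, 0.0000),
]

for item, contrast, prob in rows:
    print(item, dif_flags(contrast, prob))
```

A negative contrast means the item was easier for males than for females, and a positive contrast means the reverse, since the contrast is the male measure minus the female measure.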

The DIF analysis between groups of male and female was investigated. The results of this analysis are shown in Table 4. The results show that among 95 items, 40 items exhibit DIF.

The difficulty of the items varied between males and females. Hence, the invariance of the questions across gender groups is not supported, and the null hypothesis that the participants' gender is not a source of DIF in the NUEEFL is rejected. Given the significant DIF across gender groups within the Rasch model, the NUEEFL does not appear to yield DIF-free person estimates. It is therefore concluded that males and females differ significantly in answering the NUEEFL test, and that the test is not fair to all male and female participants.

DISCUSSION AND CONCLUSIONS

The present study aimed at investigating the interaction of person abilities and item difficulties of the high-stakes test of university entrance exam in Iran (i.e., NUEEFL). In particular, results from the Rasch model and DIF analysis were compared to see whether evidence of differential functioning would be found in data analysis.

The descriptive statistics revealed a significant difference in the overall test. Hence, the hypothesis that the data fit the Rasch model was not supported. The data-model fit results revealed that 26 of the 95 items (i.e., 155, 126, 137, 105, 118, 166, 101, 103, 158, 111, 115, 133, 121, 167, 109, 128, 122, 156, 99, 84, 153, 149, 108, 160, 91, and 79) fell outside the acceptable range of 0.70 to 1.30. Thus, many items did not fit the model.

Concern about the dimensionality of the NUEEFL test arises because calibration with the Rasch model in the Winsteps software revealed many misfitting items. Dimensionality was examined through a Principal Component Analysis (PCA) of the raw data and residuals. The variance explained by the measures was 34.8% (eigenvalue 50.71), which is larger than the 20% taken to indicate unidimensionality. The results of the data analysis suggest that unidimensionality holds across the whole test.

In the case of local independence, a series of t-tests was performed. The outfit-MNSQs of 20 items exceeded the intended criterion (i.e., 1.3). It is concluded that the local independence assumption holds in the entire NUEEFL test.

DIF analysis confirmed that the probability of endorsing the test items differed across the gender groups. Based on the DIF results, 40 of the 95 items were flagged for DIF. This suggests that NUEEFL test scores are not free of construct-irrelevant variance. Hence, the results do not support the argument for construct validity.

Additionally, there is an ongoing interest in comparing cultural, ethnic, or gender groups. DIF studies are essential in high-stakes testing programs. Furthermore, possible gender and/or ethnicity bias could negatively impact one or more groups in a construct-irrelevant manner. Test administrators attempt to make a perfectly fair test battery; however, the dearth of research on the NUEEFL test raises questions regarding the fairness of this national test.

The NUEEFL, which is given to thousands of individuals annually, is used as a gate-keeping test for entering higher education in Iran. In line with the main purpose of this research, DIF analysis across gender groups was investigated. DIF analysis rejected a similar probability of endorsing the test items across the gender groups. The results of the study indicated that 40 items out of the 95 items of NUEEFL test exhibited DIF. This suggests that NUEEFL test scores are not free of construct-irrelevant variance.

It is worth mentioning that the results of the present study are not consistent with Karami's (2015) study with respect to the dimensionality of the NUEEFL, in which multidimensionality in the whole test and among sub-tests was reported.

According to Camilli (2006) DIF analysis mainly focuses on the performance of two or more different groups. Therefore, such analysis is unable to disclose the existence of bias against different individuals. Moreover, this study directs to this point that due to the asses
