CHAPTER I
INTRODUCTION

Q.1: Explain the concept of processing of data and analyse in detail the different stages in data processing.

Data Processing
Data reduction or processing mainly involves the various manipulations necessary for preparing the data for analysis. The process of manipulation could be manual or electronic. It involves editing, categorising the open-ended questions, coding, computerisation and the preparation of tables and diagrams.

Checking and Editing
Information gathered during the stage of
data collection varies in nature and quantity from study to study.
For example, when surveys are conducted and data obtained through questionnaires and schedules, the answers may not be ticked at the proper places, some questions may be left unanswered, or answers may be given in a form which needs reconstruction into a category designed for analysis, e.g., converting daily/monthly income into annual income, or identifying family structure (nuclear/joint) on the basis of kin living together and functioning under a common authority, and so on. Suppose, in a business research study, one question asks: is your industry one of the largest, about average in size, or small? The respondent ticks both 'largest' and 'average', writing that the firm is average in sales but one of the largest in the chain of chemical industries. The researcher has to decide how to edit this answer and whether to classify the firm as a 'largest' or an 'average' industry.
Checking also ensures that the data are relevant and appropriate and that errors are corrected. Occasionally, the investigator makes a mistake and records an impossible answer. 'How much red chilli do you use in a month?' The answer written is 4 kilos. Can a family of three use four kilos in a month? The correct answer would be 0.4 kilos. Similarly, an answer to the question 'How much money do you spend in a year on the education of your children?' says Rs. 30,000. This answer is confusing if the respondent says his monthly income is Rs. 5,000: a family which spends Rs. 2,500 a month on a costly private school would be left with only Rs. 2,500 a month to survive on. Such answers need editing.
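Where editing is done in-house on computerised data, checks of this kind can be mechanised. Below is a minimal Python sketch; the field names are hypothetical and the thresholds merely illustrate the two examples above, they are not rules from the text.

    # A minimal sketch of in-house editing checks; field names are
    # hypothetical and thresholds illustrative, not from the text.

    def edit_response(resp):
        """Flag impossible or inconsistent answers for manual editing."""
        problems = []
        # Range check: monthly red-chilli use above 2 kg is implausible
        # for a small family, probably a decimal-point error (4 vs 0.4).
        if resp.get("chilli_kg_per_month", 0) > 2:
            problems.append("chilli_kg_per_month looks implausible")
        # Consistency check: yearly education spend should not exceed
        # half of yearly income (Rs. 30,000 vs Rs. 5,000 x 12).
        income_year = resp.get("income_per_month", 0) * 12
        if resp.get("education_spend_per_year", 0) > 0.5 * income_year:
            problems.append("education spend inconsistent with income")
        return problems

    print(edit_response({"chilli_kg_per_month": 4,
                         "income_per_month": 5000,
                         "education_spend_per_year": 30000}))
    # -> both checks fire; the schedule goes back for editing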
[Figure: Stages in data analysis — Data Processing (editing, coding, computer feeding); Data Distribution; Tabulation (univariate, bivariate, multivariate); Data Analysis (categorisation, frequency distribution, measurement); Data Interpretation; Diagrammatic Representation.]
Editing is required for proper coding
and entering the data into the computer (when the decision is taken not to analyse the data manually). Editing thus means that the data are complete, error-free, readable and worthy of being assigned a code. The editing process begins in the field itself. Interviewers, soon after completing the interviews, should check the completed forms for errors and omissions. They can complete the incomplete responses and reduce the number of 'no responses' through rapid follow-up stimulated by field editing. In many cases, field editing may not be possible; in such cases, in-house editing may help. Editing also occurs simultaneously with forming categories, e.g., the ages given by respondents may be put into the categories of below 18 years (very young), 18-30 years (young), 30-40 years (early middle-aged), 40-50 years (late middle-aged) and above 50 years (old). Field supervisors can do editing in the field itself by re-contacting the respondents. Editing can be done along with coding too.
Editing also requires re-arranging answers to open-ended questions. Sometimes a 'don't know' answer is edited to 'no response'. This is wrong. 'Don't know' means that the respondent is not sure and is in two minds about his reaction, is not able to formulate a clear-cut opinion, or considers the question personal and does not want to answer it. 'No response' means that the respondent is not familiar with the situation/object/individual about which he is asked.

Coding of Data
Coding is translating answers into numerical values, or assigning numbers to the various categories of a variable, to be used in data analysis. Coding is generally done while preparing the questions and before finalizing the questionnaires and interview schedules. Fieldwork is thus done with pre-coded questions. However, sometimes, when questions are not pre-coded, coding is done after the fieldwork. Coding is done on the basis of the instructions given in the codebook. The codebook gives a numerical code for each variable.
Coding is done by using a codebook, code sheets and computer cards. The codebook explains how to assign numerical codes to the response categories received in the questionnaire/schedule. It also indicates the location of each variable on the computer cards. A code sheet is a sheet used to transfer data from the original source (i.e., questionnaire/schedule, etc.) to the cards. Code sheets are prepared by the researcher for assigning codes to the answers received, and are laid out like computer cards. These sheets are given to key-punchers, who then transfer the data to the cards. The computer card has 80 columns horizontally, with rows for the digits 0 to 9 running vertically from the top to the bottom of the card. It is used for storing data and feeding them to the computer. For example, in a question about the religion of the respondent, the answer categories, viz., Hindu, Muslim, Sikh, Christian, SC, ST, will be substituted by 1, 2, 3, 4, 5 and 6 respectively, and the counting of frequencies will refer not to 'Hindus' or 'Muslims' but to these codes. This is because computers handle numbers more easily than words.
Coding uses categories that are mutually exclusive and uni-dimensional. The first 3 or 4 columns on the card (depending on the total number of respondents) are left blank for the respondent's identification number. We can take the following example for understanding the preparation of the code book and the code sheet.
The data are then transferred from the questionnaires to computer cards by using a key-punch machine. The key-punch machine does this by punching a hole over a particular number in a specific column. The data are then considered machine-readable.
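The logic of a codebook translates directly into software. A minimal Python sketch, using the religion categories and the 1-6 codes given above:

    # A minimal sketch of coding with a codebook, using the religion
    # categories named in the text; codes follow the 1-6 scheme above.

    CODEBOOK = {"religion": {"Hindu": 1, "Muslim": 2, "Sikh": 3,
                             "Christian": 4, "SC": 5, "ST": 6}}

    def code_answer(variable, answer):
        """Translate a verbal answer into its numerical code."""
        return CODEBOOK[variable][answer]

    answers = ["Hindu", "Muslim", "Hindu", "Sikh"]
    coded = [code_answer("religion", a) for a in answers]
    print(coded)  # [1, 2, 1, 3] - frequencies are counted over these codes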
DATA DISTRIBUTION
The distribution of data is important in the presentation of data. A distribution is the form of classification of scores obtained for the various categories of a particular variable (Sarantakos, 1998: 343). There are three types of distributions: (i) frequency distributions, (ii) percentage distributions, and (iii) cumulative distributions. In social research, frequency distributions are the most common.
(i) Frequency distribution: It presents the frequency of occurrence of certain categories. This distribution appears in two forms: ungrouped and grouped. In the ungrouped form, the scores are not collapsed into categories; e.g., in a distribution of the ages of the students of an MBA class, each age value (e.g., 20, 22, 24, and so on) will be presented separately in the distribution. In a grouped distribution, the scores are collapsed into categories, so that two or three scores are presented together as a group.
(ii) Percentage distribution: It is also possible to give frequencies not in absolute numbers but in percentages. For instance, of 1,383 users, 15.1 per cent had a monthly family income of less than Rs. 500 (in 1976), i.e., they belonged to the low-income group; 24.6 per cent had a family income of Rs. 500-1,000 (i.e., they belonged to the middle-income group); and 60.3 per cent had a family income above Rs. 1,000 (i.e., they belonged to the upper-income group). It is also possible to convert these figures into proportions, e.g., the ratio of female users to male users was 126:1,257, or about 1:10. This distribution is useful in comparing cases. It appears in both grouped and ungrouped forms.
(iii) Cumulative distribution: It does not show, for each item, the observations that fall in the relevant category (as in the above two types of distribution) but consists of the number of cases up to and including a specified scale value. This distribution appears in grouped and ungrouped forms.
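All three distributions can be computed mechanically. A minimal Python sketch on an invented list of MBA students' ages (ungrouped scores):

    # Frequency, percentage and cumulative distributions for a
    # hypothetical list of MBA students' ages.
    from collections import Counter

    ages = [20, 22, 22, 24, 20, 22, 26, 24, 22, 20]

    freq = Counter(ages)                                 # frequency
    n = len(ages)
    pct = {age: 100 * f / n for age, f in freq.items()}  # percentage

    # cumulative: cases up to and including each scale value
    cum, running = {}, 0
    for age in sorted(freq):
        running += freq[age]
        cum[age] = running

    print(freq)  # Counter({22: 4, 20: 3, 24: 2, 26: 1})
    print(pct)   # e.g. 22 -> 40.0 per cent
    print(cum)   # 20 -> 3, 22 -> 7, 24 -> 9, 26 -> 10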
Tabulation of Data
After editing, which ensures that the information on the schedule is accurate and categorized in a suitable form, the data are put together in tables and may also undergo some other forms of statistical analysis. There is no statistical sophistication in tabulation: it amounts to no more than counting the number of cases falling into each of several categories. Thus, while a distribution adds all the schedules together (in frequencies, percentages and averages), tabulation is not merely adding totals but counting the frequencies in each category.
Tables can be prepared manually and/or by computers. For a small study of 100 to 200 persons, there may be little point in tabulating by computer, since this necessitates putting the data on punched cards. But for a survey analysis involving a large number of respondents and requiring cross-tabulation of more than two variables, hand tabulation will be inappropriate, time-consuming and unwieldy. When the data are put on punched cards, the construction of tables is easy and speedy. Machine tabulation has another advantage. Suppose the researcher is working on the sociology of earthquakes (disasters) in India in the last 20 years (between 1980 and 2000). He may have done a tabulation of years, numbers of earthquakes, magnitudes and death tolls, as given in Table 1. But the tabulation may not have been done in terms of the zones involved (i.e., a three-way tabulation).
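To give the machine-tabulation idea concrete form, here is a minimal Python sketch of a two-way cross-tabulation; the earthquake records and field names are hypothetical, not data from Table 1.

    # A minimal cross-tabulation sketch over hypothetical records.
    from collections import Counter

    records = [
        {"period": "1980-89", "zone": "V"},
        {"period": "1990-99", "zone": "IV"},
        {"period": "1990-99", "zone": "V"},
        {"period": "1980-89", "zone": "IV"},
        {"period": "1990-99", "zone": "V"},
    ]

    # count the cases falling into each (period, zone) cell
    table = Counter((r["period"], r["zone"]) for r in records)
    for (period, zone), f in sorted(table.items()):
        print(period, zone, f)
    # 1980-89 IV 1 / 1980-89 V 1 / 1990-99 IV 1 / 1990-99 V 2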
DIAGRAMMATIC REPRESENTATION
At one time, diagrams and graphs were given much importance in report writing. Today, however, these are not considered so important in research reports; in Ph.D. and D.Litt. theses they are even avoided. Nevertheless, we can examine some of the diagrams and graphs used in reports. These are: graphs, histograms, bar diagrams, pie charts, pyramids and pictographs.
Graphs
Graphs offer a visual presentation of the results. The horizontal line is the x-axis or abscissa, and the vertical line intersecting it is the y-axis or ordinate. The point of intersection is the origin. The values of the independent variable are scaled on the x-axis and those of the dependent variable on the y-axis. Graph 1 shows the number of cognizable crimes in India in the last 40 years.
Sometimes a multiple-line graph is also used for indicating comparisons between two or more elements, as shown in Graph 2.
Histograms
In a histogram, the values of variables are presented in vertical bars drawn adjacent to each other, as shown in Diagram 3. The difference between a graph and a histogram is that in a graph, points are plotted and joined by a line, whereas in a histogram the values appear as adjacent bars.
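For readers preparing such figures by computer, a minimal plotting sketch using the matplotlib library; the yearly crime figures are invented for illustration, not the data behind Graph 1.

    # A line graph (points joined by a line) beside a bar/histogram
    # style panel; figures are made up for demonstration.
    import matplotlib.pyplot as plt

    years = [1960, 1970, 1980, 1990, 2000]
    crimes = [0.6, 0.9, 1.3, 1.6, 1.8]   # cognizable crimes, millions

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
    ax1.plot(years, crimes, marker="o")  # graph: plotted points, joined
    ax1.set_xlabel("Year"); ax1.set_ylabel("Crimes (millions)")
    ax2.bar(years, crimes, width=8)      # histogram-style adjacent bars
    ax2.set_xlabel("Year")
    plt.tight_layout(); plt.show()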
Data Analysis and Interpretation
Analysis is the ordering of data into constituent parts in order to obtain answers to research questions. For example, a researcher formulates a hypothesis relating a high educational level to a positive attitude towards a certain phenomenon (and vice versa). He conducts a study and gathers data from the respondents in a college/university. He then breaks down the data and orders them so that he can obtain an answer to the question: does higher education change attitudes? However, analysis alone does not provide answers to research questions; interpretation of the data is also necessary. Interpretation takes the results of analysis, makes inferences and draws conclusions about the relationships. Thus, to interpret is to explain, to find meaning.
Stages in Analysis
The analysis of research data is done in four stages: (i) categorization, (ii) frequency distribution, (iii) measurement, and (iv) interpretation.
Categorization
Categories are set up according to the research problem and the purpose of the study. They must be mutually exclusive, independent and exhaustive.
Frequency distribution
Frequency distribution is the tabulation of quantitative data in classes. It indicates the number of cases, or the distribution of cases, falling into different categories. Frequency distribution is of two types: primary and secondary. Primary analysis (or distribution) is descriptive and only gives the number of cases in each class. Secondary analysis (or distribution) is the comparison of frequencies and percentages. Secondary analysis is thus concerned with relations, e.g., comparing the frequency of men with women, the educated with the illiterate, or the rural with the urban, and so on.
Measurement
Measurement could be in the form of measures of central tendency, i.e., calculating statistical averages: the mean, the median and the mode. The mean is the arithmetic average of a set of measures. The median is the midmost measure of any set of measures. The mode is the most frequently occurring measure of a set of measures. Measurement could also be in terms of coefficients of correlation.
The reliability and validity of the measures of the variables are important in all social research; the whole interpretation can collapse on this point alone. The statistical analysis may sometimes be of the univariate type (examining one variable at a time), sometimes of the bivariate type (assessing the relationship between two variables) and sometimes of the multivariate type (analyzing three or more variables simultaneously).
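A minimal Python sketch of these measures, using the standard statistics module (statistics.correlation requires Python 3.10 or later); the scores are invented:

    # Central tendency and a bivariate correlation on made-up data.
    import statistics

    scores = [4, 7, 7, 8, 10, 7, 6]
    print(statistics.mean(scores))    # arithmetic average: 7.0
    print(statistics.median(scores))  # midmost measure: 7
    print(statistics.mode(scores))    # most frequent measure: 7

    # bivariate measurement: Pearson coefficient of correlation
    education_years = [8, 10, 12, 14, 16, 18]
    attitude_score = [3, 4, 4, 6, 7, 8]
    print(statistics.correlation(education_years, attitude_score))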
There are four scales used for measurement: nominal, ordinal, interval and ratio. The nominal scale is merely a classificatory scale in which a number is assigned to each object for identification. The ordinal scale consists of a ranking of the objects. The interval scale is like an ordinal scale with the added property that the intervals, or distances, between numbers on the scale are equal. The ratio scale is used for determining ratios of the numbers assigned to categories.
Interpretation
Interpretation of data can be descriptive or analytical, or it can be made from a theoretical standpoint. Negative results are much harder to interpret than positive results (i.e., results where the data support the hypotheses).
Q.2: Enumerate in detail transcription and graphical representation.

VISUAL DATA PRESENTATION
Data are the set of characteristics associated with the AUs of interest. Data for public school teachers may consist of a single characteristic, such as salary, or a set of characteristics, such as salary, years of experience, highest academic degree, and opinion on a fiscal issue to appear on a ballot in an upcoming election. To this point, the researcher has been given an overview of the sampling process (with greater detail coming later) and an introduction to the scales of measurement associated with characteristics of interest. Numerical aspects of dealing with data are the subject of the remaining chapters of this handbook. In this section, possible approaches to presenting data visually, for purposes of a briefing or for a report, are suggested. These few procedures should make it easier for the reader to review the data. To be covered are tabular and graphic methods.
Tabular Presentation Methods
Clarity in a table summarizing a set of data is essential. The reader (or listener) should not have to probe too deeply to understand what is being presented. The variable of interest in the exhibit is the annual contract salary for public school teachers during the school year 1975-76. Because there are 1,311 teachers in the sample, the data have been grouped into eight classes or intervals, with the lowest salary interval running from $5,000 to $6,999. Clearly, from the exhibit, the interval size is $2,000.
Exhibit: Annual contract salary for a nationwide sample of public school teachers, school year 1975-76

    Annual Salary | Frequency | Midpoint | Limit    | Relative  | Cumulative Rel.
    (1)           | f (2)     | x (3)    | l (4)    | Freq. (5) | Freq. (6)
    --------------|-----------|----------|----------|-----------|----------------
                  |           |          | 20,999.5 |           | 1.001
    19,000-20,999 |    43     | 19,999.5 | 18,999.5 |   0.033   | 0.968
    17,000-18,999 |    98     | 17,999.5 | 16,999.5 |   0.075   | 0.893
    15,000-16,999 |   125     | 15,999.5 | 14,999.5 |   0.095   | 0.798
    13,000-14,999 |   179     | 13,999.5 | 12,999.5 |   0.137   | 0.661
    11,000-12,999 |   275     | 11,999.5 | 10,999.5 |   0.210   | 0.451
     9,000-10,999 |   363     |  9,999.5 |  8,999.5 |   0.277   | 0.174
     7,000- 8,999 |   224     |  7,999.5 |  6,999.5 |   0.171   | 0.003
     5,000- 6,999 |     4     |  5,999.5 |  4,999.5 |   0.003   | 0.000
    TOTAL         | 1,311     |          |          |   1.001   |

(In Columns 4 and 6, each row shows the limit at the lower boundary of its interval and the cumulative relative frequency of all salaries below that limit; the topmost line shows the uppermost limit, $20,999.50, below which all 1,311 salaries fall.)
Column 2 indicates the frequency, or number, of teachers with contract salaries in each of the eight intervals. For example, 98 teachers earned salaries between $17,000 and $18,999. Columns 1 and 2 taken together indicate the frequency with which individual teacher salaries fall within each of the intervals and are referred to as a frequency distribution.
Because these data have been grouped, it is impossible to identify where within an interval each of the 98 teachers is located. Two assumptions are usually made in dealing with grouped data: (a) the AUs within the interval have values of the variable that average out to the interval midpoint (Column 3); (b) the AUs within the interval have values of the variable that are evenly spread out between the lower and upper limits of the interval (Column 4).
Midpoints for the intervals are obtained by averaging the lower and upper values of the interval. (For example, the midpoint of the interval with 275 teachers' salaries is the average of $11,000 and $12,999 (Row 5), or $11,999.50.) Successive midpoints appear in Column 3.
Limits dividing the successive intervals are halfway between the upper value in one interval and the lower value in the next higher interval. For example, the limit dividing the intervals with 179 and 125 teachers is halfway between $14,999 (Row 4) and $15,000 (Row 3), clearly at $14,999.50. Successive interval limits are given in Column 4.
Frequently it is of interest to discuss data in terms of the percentage or proportion of AUs lying within a given interval, referred to as relative frequencies or proportions. For example, 0.095, or 9.5%, of the teachers had salaries between $15,000 and $16,999 (Row 3). The entries in Column 5 are obtained by dividing the entries in Column 2 by the total of Column 2. The total for Column 5 should equal 1.00 but, due to rounding, may be off slightly, as it is in this example.
Though not usually presented in a report, the entries in Column 6 are necessary to develop one of the graphic methods presented in the next section. Column 6, most frequently referred to as the cumulative relative frequency distribution, is obtained by cumulating successively the entries in Column 5, starting with the lowest value and moving toward the highest value of the variable of interest. These successive cumulations are then placed beside the successive limits dividing adjacent intervals. For example, the entry of 0.451 in the last column, opposite the entry of $10,999.50, was obtained by adding together 0.003 + 0.171 + 0.277. This quantity, 0.451, is the proportion of teachers in the sample with salaries less than $10,999.50; that is, 45.1% of the teachers in the sample fall in the first three class intervals.
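The arithmetic of Columns 5 and 6 can be verified with a short Python sketch, using the frequencies from the exhibit (lowest interval first):

    # Recompute the relative and cumulative relative frequencies.
    freqs = [4, 224, 363, 275, 179, 125, 98, 43]   # 5,000-6,999 upward
    n = sum(freqs)                                 # 1,311 teachers

    rel = [round(f / n, 3) for f in freqs]         # Column 5
    print(rel)  # [0.003, 0.171, 0.277, 0.21, 0.137, 0.095, 0.075, 0.033]
    print(round(sum(rel), 3))  # 1.001 - off from 1.00 only by rounding

    cum, running = [], 0.0                         # Column 6
    for r in rel:
        running = round(running + r, 3)
        cum.append(running)
    print(cum)  # 0.451 appears opposite the $10,999.50 limit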
Columns 1, 2, and 5 are those most frequently encountered in a tabular display of data of this type, namely, a variable. Characteristics that are attributes lend themselves to slightly different tables.
Graphic presentation methods
While clarity and lack of ambiguity are also important when using graphs to portray a set of data, they are not quite as critical. The reviewer still needs to understand the words that appear with the graphs, but impressions are perhaps as important as exactness in reviewing graphs. The variable, annual contract salary, presented in the exhibit can be graphically presented in a number of ways, the three most common of which are based on the relative frequency distribution.
Presenting results
Frequencies and numbers can often be presented more clearly in charts than in words. Consider, for example, a figure which presents the sample size in a longitudinal study at three times of measurement.
[Figure: bar chart of sample size at three times of measurement, by age group — under 29 years, 30-39 years, 40-49 years, 50-59 years, over 60 years.]
An alternative is to use pie charts: a figure of this kind can summarize, for example, the age distribution of homeless people in Germany in age groups. A third means of presentation is to use tables: a table can summarize, for example, the frequencies of responses to the question of how to prevent diseases according to the age of the children, who could give more than one answer. These examples demonstrate how findings can be presented visually so that they become apparent to readers at first sight. The three methods above are not, of course, the only methods available; they are included here merely to illustrate the usefulness of visual presentation.
Q.3: What is analysis of data? Discuss in detail the purpose, characteristics and the various types of analysis of data.

Introduction to Analysis of Data
Analysis may be defined as classifying, ordering, manipulating and summarizing data with a view to answering the research questions. The raw data must first be rendered in terms of the research variables in order to constitute research data. The research variable, in most behavioral research, is conceptual and has to be made operational in order to be amenable to the processes of observation and measurement; the operationalization of the research variable is determined by the nature of the problem. It is the research problem that determines the kind of data into which the raw data have to be rendered, and also the kind of analysis to which the data should be subjected.
The analysis of data is the most skilled task in the research process. It calls for the researcher's own judgment and skill, and should be done by the researcher himself. Analysis means a critical examination of the assembled and grouped data, for studying the characteristics of the subject under study and for determining the patterns of relationships among the variables relating to it. Correct analysis requires familiarity with the background of the survey and with all the stages of the research. The analysis need not necessarily be statistical: both quantitative and non-quantitative (qualitative) methods can be used, although social research most often requires quantitative analysis involving the application of various statistical techniques. The steps followed in the analysis of data will vary with the type of study. Part of the analysis is a matter of working out statistical distributions, constructing diagrams and calculating simple measures like averages, measures of dispersion, percentages, correlation, etc.; hence statistical analysis forms a part of survey analysis.
The problems raised by the analysis of data are directly related to the complexity of the hypothesis. Problems of data analysis involve the entire range of questions raised in research design, from secondary analysis to the designing and redesigning of substitutes for the controlled experiment.
After collecting the data from a
representative sample of the population, the next step is to analyze the data to test the research hypotheses. Before doing so, however, some preliminary steps need to be completed to ensure that the data are reasonably good and of assured quality for further analysis. There are four steps, namely:
1) Getting the data ready for analysis.
2) Getting a feel for the data.
3) Testing the goodness of the data.
4) Testing the hypotheses.
According to Johan Galtung, the two phases of research operations are:
a) Processing of data, which refers to concentrating, recasting and dealing with the data in such a way that they become as amenable to analysis as possible;
b) Analysis of data, which may be considered as having reference to the process of viewing the data in the light of hypotheses or research questions, as also of the prevailing theories, and drawing conclusions that will make some contribution to theory formulation or modification.
Selltiz and others do not make such a precise differentiation: for them, analysis is a comprehensive process which involves processing.
The dividing line between analysis of data and interpretation of data is difficult to draw. The two are symbiotic and merge imperceptibly. If analysis involves organizing the data in a particular manner, it is mostly interpretative ideas that govern this task. If the end product of analysis is the setting up of certain general conclusions, then what these conclusions really mean and reflect is the bare minimum that the researcher must feel obliged to know, and interpretation is the way to this knowledge. Thus the task of analysis can hardly be said to be complete without interpretation coming to illuminate the results.
The steps in the analysis of data
depend upon the type of study. Where there is a set of clearly formulated hypotheses, each hypothesis can be seen as prescribing a certain action to be taken vis-à-vis the data: the more specific the hypothesis, the more specific the action. In such a study, the analysis of data is almost completely a mechanical procedure and amounts to the verification of the hypotheses.
Purpose of Data Analysis
Statistical analysis of data serves several major purposes. It summarizes large masses of data into understandable and meaningful form; this is the role of statistics, and the reduction of data facilitates further analysis. Statistics also makes exact description possible. For example, when we say that the educational level of the people in X district is very high, the description is not specific; but when statistical measures like the percentages of literates among males and females and the like are available, the description becomes exact.
Statistical analysis facilitates the identification of the causal factors underlying complex phenomena. What are the factors which determine a variable like labour productivity or the academic performance of students? What are the relative contributions of the causative factors? Answers to such questions can be obtained from statistical multivariate analysis.
Statistical analysis aids the drawing of reliable inferences from observations. Data are collected and analyzed in order to predict or make inferences about situations that have not been measured in full. What will be the growth rate of industrial production during the coming year? What would be the probable demand for a particular product in the coming year? Questions of this kind require predictions of future states to be made on the basis of current knowledge. Such predictions are essential in any strategic decision relating to the management of an enterprise, the national economy or a social action forum. Statistical prediction is one of the functions of inferential statistics.
Statistical analysis also helps in making estimations or generalizations from the results of sample surveys; this is another function of inferential statistics. Sample statistics based on probability samples may give good estimates of particular population parameters. Any estimate will deviate from the true value due to sampling error, and the process of statistical inference enables us to evaluate the accuracy of the estimates. Inferential statistical analysis is also useful for assessing the significance of specific sample results under assumed population conditions; this type of analysis is called hypothesis testing.
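As one illustration of such inference, a minimal Python sketch of estimating a population proportion from a probability sample, with a 95% confidence interval; the sample figures are invented:

    # Estimating a population proportion and its sampling error.
    from statistics import NormalDist
    from math import sqrt

    n, successes = 400, 128              # e.g. buyers preferring a product
    p_hat = successes / n                # sample estimate: 0.32
    se = sqrt(p_hat * (1 - p_hat) / n)   # standard (sampling) error

    z = NormalDist().inv_cdf(0.975)      # two-sided 95% -> z ~ 1.96
    low, high = p_hat - z * se, p_hat + z * se
    print(f"estimate {p_hat:.3f}, 95% CI ({low:.3f}, {high:.3f})")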
Characteristics of Analysis of Data
Following are the main characteristics of the analysis of data.
Analysis of data is one of the most important aspects of research. Since it is a highly skilled and technical job, it should be carried out by the researcher himself or under his own supervision. It demands a deep and intense knowledge on the part of the researcher about the data to be analyzed. The researcher should also possess judgment skill, the ability to generalize, and familiarity with the background, objects and hypotheses of the study.
Data, facts and figures are silent; they never speak for themselves, yet they have complexities. It is through systematic analysis that the important characteristics hidden in the data are brought out and valid generalizations are drawn. Analysis demands a thorough knowledge of one's data; without deep knowledge, the analysis is likely to be aimless. It is only by organizing, analyzing and interpreting the research data that we come to know their important features, inter-relationships and cause-effect relationships. The trends and sequences inherent in the phenomena are elaborated by means of generalization. According to P. V. Young, the function of systematic analysis is to build an intellectual edifice in which properly sorted and sifted facts and figures are placed in their appropriate settings, so that broader generalizations beyond the immediate contents of the facts under study can be drawn, consistent relationships established and general inferences made from them: the aim of a mature science. The data to be analyzed and interpreted should: (i) be reproducible, (ii) be readily disposed to quantitative treatment, (iii) have significance for some systematic theory, and (iv) serve as a basis for broader generalizations.
We should remember that the steps envisaged in the analysis of data will vary depending on the type of study. A set of clearly formulated hypotheses at the start of the study presents a norm prescribing a certain action to be taken; the more specific the hypothesis, the more specific the action, and in such studies the analysis of data is almost completely a mechanical procedure. If the data are collected according to vague clues rather than according to specific hypotheses, the data are analysed inductively and investigated during the process, not by means of a prescribed set of rules.
The task of analysis is incomplete without interpretation. In fact, analysis of data and interpretation of data are complementary to each other. The end product of analysis is the setting up of certain general conclusions, while interpretation deals with what these conclusions really mean. Since analysis and interpretation of data are interwoven, interpretation should more properly be conceived of as a special aspect of analysis rather than as a distinct operation. Interpretation is the process of establishing the relationships between variables which are expressed in the findings, and of explaining why such relationships exist. For any successful study, the task of analysis and interpretation should be designed before the data are actually collected, with the exception of formulative studies, where the researcher has no idea as to what kind of answer he wants; otherwise, there is always a danger of being too late and of missing important relevant data.
The most difficult task in the analysis and interpretation of data is the establishment of cause-and-effect relationships, especially in the case of social and personal problems. Research problems do not necessarily arise from one factor or a set of factors; they arise due to a complex variety of factors and sequences. Karl Pearson has observed: 'No phenomena or stage in sequence has only one cause; all antecedent stages are successive causes; when we scientifically state causes we are really describing the successive stages of a routine of experience.' In fact, human behaviour cannot be reduced to or explained by cause-effect sequences alone: we face difficulties in detecting the factors and in establishing cause-and-effect relationships, because the nature of these factors differs from one individual to another and because cause and effect are inter-dependent, i.e., one stimulates the other.
Types of Analysis
The analysis of survey or experimental data involves estimating the values of unknown parameters of the population and testing hypotheses for drawing inferences. Analysis may be categorised as follows.
1. Descriptive analysis: It is largely the study of distributions of one or more variables. Such study provides profiles of a business group, work group, persons or other subjects on any of a multitude of characteristics, such as size, composition, efficiency, preferences, etc. Various measures that show the size and shape of a distribution, along with measures of the relationship between two or more variables, are available from this analysis.
2. Inferential analysis: It is concerned with the various tests of significance for testing hypotheses, in order to determine with what validity the data can indicate some conclusion or conclusions. It is also concerned with the estimation of population values. It is mainly on the basis of inferential analysis that the task of interpretation is performed.
3. Correlation analysis: It studies the joint variation of two or more variables for determining the amount of correlation between them.
4. Causal analysis: It is concerned with the study of how one or more variables affect changes in another variable. It is a study of the functional relationships existing between two or more variables.
5. Multivariate analysis: With the availability of computer facilities, there has been a rapid development of multivariate analysis, which means the use of statistical methods that analyse more than two variables on a sample of observations simultaneously. These include:
a) Multiple discriminant analysis: It is suitable when the researcher has a single dependent variable that cannot be measured directly but can be classified into two or more groups on the basis of some attribute. The objective of this analysis is to predict an organization's likelihood of belonging to a particular group based on several predictor variables.
b) Multiple regression analysis: It is suitable when the researcher has one dependent variable which is presumed to be a function of two or more independent variables. The objective of this analysis is to make a prediction about the dependent variable based on its covariance with all the concerned independent variables (a minimal sketch follows this list).
c) Multivariate analysis of variance (MANOVA): This analysis is an extension of two-way ANOVA, wherein the ratio of among-group variance to within-group variance is worked out on a set of variables.
d) Canonical analysis: This analysis can be used in the case of both measurable and non-measurable variables for the purpose of simultaneously predicting a set of dependent variables from their joint covariance with a set of independent variables.
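A minimal sketch of multiple regression using numpy's least-squares routine; the variables and data are invented for illustration:

    # One dependent variable y as a function of two independents.
    import numpy as np

    x1 = np.array([2, 4, 5, 7, 8, 10])       # e.g. years of experience
    x2 = np.array([1, 3, 2, 5, 4, 6])        # e.g. training courses
    y  = np.array([10, 17, 18, 28, 28, 37])  # e.g. productivity score

    # add an intercept column, then fit by least squares
    X = np.column_stack([np.ones_like(x1), x1, x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b0, b1, b2 = coef
    print(f"y = {b0:.2f} + {b1:.2f}*x1 + {b2:.2f}*x2")

    # prediction for a new case from its covariance with both predictors
    print(b0 + b1 * 6 + b2 * 3)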
Q.5: What are the various methods of tabulation? Explain the significance of processing of data, discuss the role of the computer in data processing and analysis, and explain the need for statistical techniques in research.

Tabulation
The process of placing classified data into tabular form is known as tabulation. A table is a systematic arrangement of statistical data in rows and columns; rows are horizontal arrangements, whereas columns are vertical arrangements. A table may be simple, double or complex, depending upon the type of classification.
Basic description
A table consists of an ordered arrangement of rows and columns. This is a simplified description of the most basic kind of table. Certain considerations follow from it: the term row has several common synonyms (e.g., record, k-tuple, n-tuple, vector); the term column has several common synonyms (e.g., field, parameter, property, attribute); a column is usually identified by a name; a column name can consist of a word, phrase or numerical index; and the intersection of a row and a column is a cell. The elements of a table may be grouped, segmented, or arranged in many different ways, and even nested recursively. Additionally, a table may include metadata, annotations, a header, a footer or other ancillary features.
Simple table
The following illustrates a simple table with three columns and six rows. The first row is not counted, because it is only used to display the column names; this is traditionally called a 'header row'. A table may also contain rows with summary information, where the summary consists of subtotals combined from previous rows within the same column.
The concept of dimension is also a part of the basic terminology. Any 'simple' table can be represented as a 'multi-dimensional' table by normalizing the data values into ordered hierarchies. A common example of such a table is a multiplication table.

Multiplication table
    x | 1 | 2 | 3
    1 | 1 | 2 | 3
    2 | 2 | 4 | 6
    3 | 3 | 6 | 9
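A two-dimensional table of this kind maps each (row header, column header) pair to a unique value, which can be sketched in Python as a nested mapping; the sketch is illustrative only:

    # The multiplication table as a nested mapping: each combination
    # of headers points to exactly one value.
    headers = [1, 2, 3]
    table = {r: {c: r * c for c in headers} for r in headers}

    print(table[2][3])   # 6 - the unique value at row 2, column 3
    for r in headers:
        print([table[r][c] for c in headers])
    # [1, 2, 3] / [2, 4, 6] / [3, 6, 9]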
NOTE: Multidimensional tables, two-dimensional as in the example, are created under the condition that the coordinates, i.e., the combinations of the basic headers (margins), each have a unique value attached. This is an injective relation: each combination of a value from the headers row (row 0, for lack of a better term) and a value from the headers column (column 0, for lack of a better term) is related to a unique value represented in the table: column 1 and row 1 correspond only to the value 1 (and no other); column 1 and row 2 correspond only to the value 2 (and no other); etc. If the said condition is not present, it is necessary to insert extra columns or rows, which increases the size of the table with plenty of empty cells. To illustrate how a simple table can be transformed into a multi-dimensional table, consider the following transformation of the Age table.

Modified Age Table (names only)
    +      | 1               | 2                 | 3
    Nancy  | Nancy Davolio   | Nancy Klondike    | Nancy Obesanjo
    Justin | Justin Saunders | Justin Timberland | Justin Daviolio
This is structurally identical to the multiplication table, except that it uses concatenation instead of multiplication as the operator, and first and last names instead of integers as the operands.
Wide and Narrow Tables
Tables can be described as wide or narrow in format. A wide format has a separate column for each data variable, whereas a narrow format has one column for all the variable values and another column for the context of that value. See Wide and Narrow Data.
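Where such reshaping is done by machine, a minimal sketch using the pandas library's melt function; the teacher data are hypothetical:

    # Wide-to-narrow reshaping: one row per (unit, variable) pair.
    import pandas as pd

    wide = pd.DataFrame({
        "teacher": ["A", "B"],
        "salary": [9500, 17200],
        "experience": [4, 15],
    })

    # narrow format: one column names the variable, one holds its value
    narrow = wide.melt(id_vars="teacher", var_name="variable",
                       value_name="value")
    print(narrow)
    #   teacher    variable  value
    # 0       A      salary   9500
    # 1       B      salary  17200
    # 2       A  experience      4
    # 3       B  experience     15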
Importance of Tabulation
There are no hard and fast rules for preparing a statistical table. Prof. Bowley has rightly pointed out that 'in collection and tabulation, common sense is the chief requisite and experience is the chief teacher'. However, the following points should be borne in mind while preparing a table.
(i) A good table must contain all the essential parts, such as the table number, title, head note, caption, stub, body, footnote and source note.
(ii) A good table should be simple to understand. It should also be compact, complete and self-explanatory.
(iii) A good table should be of proper size, with proper space for rows and columns. One table should not be overloaded with details; sometimes it is difficult to present the entire data in a single table, in which case the data should be divided across several tables.
(iv) A good table must have an attractive get-up. It should be prepared in such a manner that a scholar can understand the problem without any strain.
(v) The rows and columns of a table must be numbered.
(vi) In all tables the captions and stubs should be arranged in some systematic manner, e.g., alphabetically or chronologically, depending upon the requirement.
(vii) The unit of measurement should be mentioned in the head note.
(viii) Figures should be rounded off to the nearest hundred, thousand or lakh, which helps in avoiding unnecessary detail.
(ix) Percentages and ratios should be computed. The percentage of each item's value to the total should be given in parentheses just below the value.
(x) In case of non-availability of information, one should write 'N.A.' or indicate it by a dash (-).
(xi) Ditto marks should be avoided in a table. Similarly, the expression 'etc.' should not be used in a table.
Significance of Data Processing
Data processing is very important to businesses and companies nowadays, because the processing of data converts all the relevant information and data into a readable form. Companies also need a standardized format for all the information they use, so processing can really help them. With data processing, a company can face the challenges of and competition from other companies in its field, because it can concentrate on its productive activities while data processing services take care of non-core activities such as the conversion of data, data entry, and data processing itself. Data processing converts all information into a standard electronic format that can be used to support important decisions immediately, and high goals become achievable because the company can focus on remaining competitive.
Data processing services typically include form processing, check processing, insurance claims processing, and image processing. These may seem very minor to a company, but they can have a high impact in the market. Form processing helps in accessing all the necessary information faster and more easily, because the forms are made available in a way that is easy to understand; such forms include vouchers, invoices, HTML forms, resumes, tax forms, different kinds of surveys, and legal and email forms.
The check is the basic transaction unit in all businesses, making it very important to the company. Check processing helps ensure that checks are properly processed and accomplished, so that the company's reputation is not affected. Insurance also plays an important role: losses incurred by a company are insured through insurance companies, and these losses can be reimbursed by processing the insurance claims; getting help from professionals saves time and effort and allows one to concentrate on one's own job in the company. Image processing may seem a minor job, but it can greatly affect the marketing of a company: high-quality images placed in catalogs and brochures will surely get the attention of target clients and customers.
There are many benefits to be had from data processing. First, the important data of the company are converted into a standard format that is understandable to the management and the employees. Since all the sets of information are in a standard electronic format, a backup copy can be made for use in case of data loss. These sets of information are also ensured to be accurate, so that decisions can be made correctly. Lastly, data processing saves time, effort and money, and puts an end to lost opportunities.

Role of Computer in Analysis and Processing
In the
last 15 years there has been a proliferation of computer software packages designed to facilitate qualitative data analysis. The programs can be classified, according to function, into a number of broad categories, such as: text retrieval; textbase management; coding and retrieval; code-based theory building; and conceptual-network building. The programs vary enormously in the extent to which they can facilitate the diverse analytical processes involved. The decision to use computer software to aid analysis in a particular project may be influenced by a number of factors, such as the nature of the data and the researcher's preferred approach to data analysis, which will have as its basis certain epistemological and ontological assumptions. One study illustrates the way in which a package called NUD.IST facilitated analysis where grounded theory methods of data analysis were also extensively used; while highlighting the many benefits that ensued, it also illustrates the limitations of such programs. Researchers contemplating the use of computer software should therefore consider carefully the possible consequences of their decision and be aware that the use of such programs can alter the nature of the analytical process in unexpected and perhaps unwanted ways. The Computer Assisted Qualitative Data Analysis (CAQDAS) Networking Project plays a role here, providing up-to-date information and support for researchers contemplating the use of software.
Need for Statistical Techniques in Research
Statistics are helpful in analyzing most collections of data. This is equally true of hypothesis testing, which can justify conclusions even when no scientific theory exists. In the 'lady tasting tea' example, it was 'obvious' that no difference existed between (milk poured into tea) and (tea poured into milk); the data contradicted the 'obvious'.
Real-world applications of hypothesis testing include:
- testing whether more men than women suffer from nightmares;
- establishing the authorship of documents;
- evaluating the effect of the full moon on behavior;
- determining the range at which a bat can detect an insect by echo;
- deciding whether hospital carpeting results in more infections;
- selecting the best means to stop smoking;
- checking whether bumper stickers reflect car-owner behavior;
- testing the claims of handwriting analysts.
Statistical hypothesis testing plays an important role in the whole of statistics and in statistical inference. For example, Lehmann (1992), in a review of the fundamental paper by Neyman and Pearson (1933), says: 'Nevertheless, despite their shortcomings, the new paradigm formulated in the 1933 paper, and the many developments carried out within its framework, continue to play a central role in both the theory and practice of statistics and can be expected to do so in the foreseeable future.'
Significance testing has been the favored statistical tool in some experimental social sciences (over 90% of articles in the Journal of Applied Psychology during the early 1990s). Other fields have favored the estimation of parameters (e.g., effect size). Significance testing is used as a substitute for the traditional comparison of predicted value and experimental result at the core of the scientific method. When a theory is capable of predicting only the sign of a relationship, a directional (one-sided) hypothesis test can be configured so that only a statistically significant result supports the theory. This form of theory appraisal is the most heavily criticized application of hypothesis testing.