Introduction to Data Structure
Aug 26, 2014
Topics
• Data Structure
• Arrays
• Records & Pointers
• Multidimensional Arrays
• Pointer Arrays
• Record Structure
Data
• Data – simple values or sets of values
• Data Item – refers to a single unit of values
• Group items – data items that are divided into sub items
• Elementary items – data items that are not divided into sub items
• For example, an employee's name may be divided into first name, middle name and last name (a group item)
• A social security number is treated as a single item (an elementary item)
• Collection of data – organized into a hierarchy of fields, records and files.
Data
• Entity – something that has certain attributes or properties which may be assigned values.
• Values may be numeric or non-numeric.
• Example: the entity Employee
  – Attributes: Name, Age, Sex, SSN
  – Values: Arpit, 20, M, 134-24-5533
• Entity set – Entities with similar attributes
• The way that data are organized into the hierarchy of fields, records and files reflects the relationship between attributes, entities and entity sets.
Data
Data organized as:
• Fields
• Records
• File
Records may be:
• Fixed Length Records
• Variable Length Records
Data may be organized in many different ways. The logical or mathematical model of a particular organization of data is called a data structure.
The way information is organized in the memory of a computer is also called a data structure.
A data structure expresses the relationship of one data element to another and organizes the elements within memory.
Introduction to Data Structures
• A way of organizing all data items.
• Considers not only the elements stored but also their relationship to each other.
• Specifies the following:
  – Organization of data.
  – Accessing methods.
  – Degree of associativity.
  – Processing alternatives for information.
Classification of Data structure
Primitive Data Structure
• Basic structures
• Directly operated upon by the machine instructions
Non-primitive Data Structure
• Derived from primitive data structures.
• Emphasizes the structuring of a group of homogeneous or heterogeneous data items
• E.g. Arrays, Lists, Files
Data Structure Operations
• The data in data structures are processed by means of certain operations
• The four main operations:
  – Traversing – processing each element in the list.
  – Searching
  – Inserting
  – Deleting
• Two further operations, used in special situations:
  – Sorting
  – Merging
Arrays, Records and Pointers
Data Structure
• A data structure is classified as either linear or non-linear.
• A data structure is linear if its elements form a sequence, or linear list.
• There are two basic ways of representing such linear structures in memory:
  – Sequential memory locations represent the linear relationship between the elements, e.g. arrays.
  – Pointers or links represent the linear relationship between the elements, e.g. linked lists.
Linear Arrays
• A linear array is a list of a finite number n of homogeneous data elements (elements of the same type) stored in adjacent memory locations.
• The elements of the array are referenced by an index set consisting of n consecutive numbers.
• The elements of the array are stored in successive memory locations.
• n is the total number of elements in the array, called the size or length of the array.
• Length = total number of elements = UB - LB + 1, where UB is the upper bound (largest index) and LB is the lower bound (smallest index).
• Array elements are denoted as A1, A2, …, An or A(1), A(2), …, A(n) or A[1], A[2], …, A[n].
Linear Arrays
• An array declaration must give three items of information:
  – The name of the array
  – The data type of the array
  – The index set of the array
• Memory for an array can be allocated in two ways:
  – Statically: at compile time; the size of the array is fixed during program execution.
  – Dynamically: at run time; the value of n is read while the program runs and memory is then allocated.
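The two allocation styles can be sketched in C as follows. This is a minimal illustration, not part of the original slides: the function names make_sequence and sum_static are hypothetical, and the dynamic version assumes the caller releases the memory with free().

```c
#include <assert.h>
#include <stdlib.h>

/* Static allocation: the size 5 is fixed when the program is compiled. */
int sum_static(void)
{
    int age[5] = {8, 10, 5, 15, 20};
    int sum = 0;
    for (int i = 0; i < 5; i++)
        sum += age[i];
    return sum;
}

/* Dynamic allocation: n is known only at run time.
 * Returns an array holding 0, 1, ..., n-1, or NULL if allocation fails.
 * Precondition: n > 0. The caller must free() the result. */
int *make_sequence(int n)
{
    assert(n > 0);
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        a[i] = i;
    return a;
}
```

A caller would typically read n from input, call make_sequence(n), check the result against NULL, and call free() when the array is no longer needed.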
Representation of Linear Arrays in Memory
• Arr[] – linear array
• Loc(Arr[k]) = address of the element Arr[k] of the array Arr
• The elements occupy successive addresses, e.g. 1000, 1001, 1002, 1003, 1004, where Base(Arr) = 1000.
• There is no need to keep track of the address of every element of Arr.
• Track only the address of the first element of Arr, denoted by Base(Arr).
• Using the base address, the computer calculates the address of any element of Arr with the formula:
  Loc(Arr[k]) = Base(Arr) + w(k - Lower bound)
  where w is the number of memory units (words or bytes) occupied by one element.
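The address formula can be tried out with a small helper. This is only a sketch: the function name loc and the use of unsigned long for addresses are assumptions made for illustration, and the addresses are plain numbers rather than real pointers.

```c
#include <assert.h>

/* Loc(Arr[k]) = Base(Arr) + w * (k - LB)
 * base: address of the first element
 * w:    size of one element in memory units
 * lb:   lower bound of the index set
 * k:    index of the element whose address is wanted (k >= lb) */
unsigned long loc(unsigned long base, unsigned long w, long lb, long k)
{
    assert(k >= lb);
    return base + w * (unsigned long)(k - lb);
}
```

For example, loc(1000, 1, 0, 4) gives 1004, the last address in the figure, and loc(200, 4, 1, 3) gives 208 for a 1-based array of 4-byte elements.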
Arrays in C
• Declaration of an array in C:
  – The data type is followed by the array name.
  – The subscript in brackets indicates the number of elements the array will hold.
  – Declaring an array allocates the specified number of memory locations.
  – For example: int age[20]; float sal[10]; char grade[10];
• Example: int arr[5]; (the figure assumes 2-byte integers and a base address of 100)
  Element: arr[0]  arr[1]  arr[2]  arr[3]  arr[4]
  Address: 100     102     104     106     108
Arrays in C
• Array initialization:
  – Can be initialized at the time of declaration:
    int age[5] = {8, 10, 5, 15, 20};
    float sal[3] = {2000, 2000.50, 1000};
  – This gives sal[0] = 2000, sal[1] = 2000.50, sal[2] = 1000.
• An array of characters terminated by a null ('\0') character is called a string. The array needs one extra element for the terminator:
    char name[4] = "abc";
Traversing Linear Arrays
• Traversing a Linear Array
  LA = Linear Array, LB = Lower Bound, UB = Upper Bound
  1. [Initialize counter] Set K := LB
  2. Repeat Steps 3 and 4 while K <= UB
  3.   [Visit element] Apply Process to LA[K]
  4.   [Increase counter] Set K := K+1
     [End of Step 2 loop]
  5. Exit
• Alternative form using a repeat-for loop:
  1. Repeat for K = LB to UB: Apply Process to LA[K]
     [End of loop]
  2. Exit
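A possible C version of the traversal is sketched below. The names traverse and double_it are hypothetical, and the operation Process is passed as a function pointer, which is one way among several to express "apply Process to each element".

```c
#include <assert.h>
#include <stddef.h>

/* Apply process() to every element la[lb], la[lb+1], ..., la[ub]. */
void traverse(int la[], int lb, int ub, void (*process)(int *))
{
    assert(la != NULL && process != NULL);
    for (int k = lb; k <= ub; k++)
        process(&la[k]);
}

/* Example Process operation: double the visited element. */
void double_it(int *x)
{
    *x *= 2;
}
```

Calling traverse(a, 0, 3, double_it) on an array holding 1, 2, 3, 4 leaves it holding 2, 4, 6, 8.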
Inserting into a Linear Arrays
• Inserting into a Linear Array: INSERT(LA, N, K, Item)
  LA = linear array with N elements, LB = Lower Bound, UB = Upper Bound
  K = a positive integer such that K <= N
  This algorithm inserts an element Item into the Kth position in LA.
  1. [Initialize counter] Set J := N
  2. Repeat Steps 3 and 4 while J >= K
  3.   [Move Jth element downward] Set LA[J+1] := LA[J]
  4.   [Decrease counter] Set J := J-1
  5. [End of Step 2 loop]
  6. [Insert element] Set LA[K] := Item
  7. [Reset N] Set N := N+1
  8. Exit
Example: array NAME with N = 4, K = 3, Item = Ford (J starts at N = 4)
  Before: Brown, Davis, Johnson, Smith
  After:  Brown, Davis, Ford, Johnson, Smith (N = 5)
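The insertion steps can be sketched in C. Note two assumptions that differ from the slide: C arrays are 0-based, so the third position is index 2, and the array must already have room for one more element. The name insert_at is hypothetical.

```c
#include <assert.h>

/* Insert item at 0-based position k of la, which holds *n elements and
 * has capacity for at least *n + 1. Elements la[k..*n-1] are moved one
 * place toward the end, starting from the last one so nothing is lost. */
void insert_at(const char *la[], int *n, int k, const char *item)
{
    assert(k >= 0 && k <= *n);
    for (int j = *n - 1; j >= k; j--)
        la[j + 1] = la[j];
    la[k] = item;
    (*n)++;
}
```

With NAME holding Brown, Davis, Johnson, Smith and n = 4, the call insert_at(name, &n, 2, "Ford") reproduces the example: Ford lands between Davis and Johnson and n becomes 5.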
Multi Dimensional Arrays
• Linear arrays – each element is referenced by one subscript.
• Multidimensional arrays – each element is referenced by more than one subscript, typically two or three.
• A two-dimensional M*N array A is a collection of M*N elements such that each element is specified by a pair of integers (j, k), called subscripts, with 1 <= j <= M and 1 <= k <= N.
• The element of A with first subscript j and second subscript k is denoted by Aj,k or A[j][k].
• Two-dimensional arrays are called matrices, with M rows and N columns.
Representation of Two Dimensional Arrays in Memory
• A two-dimensional m*n array is represented in memory as m*n sequential memory locations.
• Array A can be stored in either of two ways:
  – Column by column – column-major order
  – Row by row – row-major order
• Example for a 3*4 array A, listing subscripts in storage order:
  Column Major Order
    Column 1: (1,1) (2,1) (3,1)
    Column 2: (1,2) (2,2) (3,2)
    Column 3: (1,3) (2,3) (3,3)
    Column 4: (1,4) (2,4) (3,4)
  Row Major Order
    Row 1: (1,1) (1,2) (1,3) (1,4)
    Row 2: (2,1) (2,2) (2,3) (2,4)
    Row 3: (3,1) (3,2) (3,3) (3,4)
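The position of an element in each storage order follows directly from the listings: in row-major order every earlier row contributes n elements, and in column-major order every earlier column contributes m elements. The sketch below computes the offset from the base address, counted in elements; the function names are hypothetical and the subscripts are 1-based as on the slides.

```c
#include <assert.h>

/* Offset of A[j][k] in an m x n array stored row by row. */
int row_major_offset(int m, int n, int j, int k)
{
    assert(1 <= j && j <= m && 1 <= k && k <= n);
    return n * (j - 1) + (k - 1);
}

/* Offset of A[j][k] in an m x n array stored column by column. */
int col_major_offset(int m, int n, int j, int k)
{
    assert(1 <= j && j <= m && 1 <= k && k <= n);
    return m * (k - 1) + (j - 1);
}
```

Multiplying the offset by the element size w and adding Base(A) gives the address. For the 3*4 example, element (2,3) is at offset 6 in row-major order and offset 7 in column-major order. C itself stores two-dimensional arrays row by row.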
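Search
• Linear Search: compare the search element with each element of the array, one by one.
• Binary Search:
  – The array must be in sorted order.
  – Divide the array into two halves and continue in the half that can contain the element.
• Example sorted array:
  Index: 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14
  Value: 6  13 14 25 33 43 51 53 64 72 84 93 95 96 97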
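Linear search needs no ordering of the data. A minimal sketch in C, with the hypothetical name linear_search and the convention (an assumption here) that -1 means "not found":

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first element of a[0..n-1] equal to value,
 * or -1 if no element matches. */
int linear_search(const int a[], int n, int value)
{
    assert(a != NULL && n >= 0);
    for (int i = 0; i < n; i++)
        if (a[i] == value)
            return i;
    return -1;
}
```

In the worst case every one of the n elements is compared, which is why binary search is preferred when the array is sorted.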
Binary Search
Binary search. Given value and sorted array a[], find index i such that a[i] = value, or report that no such index exists.
Invariant. Algorithm maintains a[lo] ≤ value ≤ a[hi].
lo = lower bound, hi = upper bound, mid = (lo + hi)/2 (integer division)
Ex. Binary search for 33 in the 15-element array (indices 0 to 14):
  lo = 0, hi = 14, mid = 7: a[7] = 53 > 33, so hi = 6
  lo = 0, hi = 6,  mid = 3: a[3] = 25 < 33, so lo = 4
  lo = 4, hi = 6,  mid = 5: a[5] = 43 > 33, so hi = 4
  lo = 4, hi = 4,  mid = 4: a[4] = 33, found at index 4
Binary Search Algorithm
(Binary Search) BINARY(DATA, LB, UB, ITEM, LOC)
Example used in the comments: DATA = [10, 20, 30, 40, 50, 60] with indices 1 to 6, ITEM = 20. Initially LOC = NULL.
1. [Initialize segment variables.]
   Set BEG := LB, END := UB and MID := INT((BEG + END)/2).   /* BEG = 1, END = 6, MID = INT(7/2) = 3 */
2. Repeat Steps 3 and 4 while BEG <= END and DATA[MID] ≠ ITEM.
3.   If ITEM < DATA[MID], then:          /* 20 < DATA[3] = 30 */
       Set END := MID-1.                 /* END = 3-1 = 2 */
     Else:
       Set BEG := MID+1.
     [End of If structure.]
4.   Set MID := INT((BEG + END)/2).      /* MID = INT((1+2)/2) = INT(1.5) = 1 */
   [End of Step 2 loop.]
   /* Next pass: 20 > DATA[1] = 10, so BEG = 2 and MID = INT((2+2)/2) = 2; DATA[2] = 20 = ITEM ends the loop */
5. If DATA[MID] = ITEM, then:            /* DATA[2] = 20 = ITEM */
     Set LOC := MID.                     /* LOC = 2 */
   Else:
     Set LOC := NULL.
   [End of If structure.]
6. Exit.
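The algorithm translates to C roughly as follows. This is a sketch under a few assumptions: indices are 0-based, the function name binary_search is hypothetical, -1 plays the role of LOC = NULL, and the midpoint is written as lo + (hi - lo)/2, which equals (lo + hi)/2 for these values but avoids integer overflow on very large arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Search the sorted array a[0..n-1] for value.
 * Returns the index of a matching element, or -1 if there is none. */
int binary_search(const int a[], int n, int value)
{
    assert(a != NULL && n >= 0);
    int lo = 0;
    int hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (value < a[mid])
            hi = mid - 1;        /* continue in the left half */
        else if (value > a[mid])
            lo = mid + 1;        /* continue in the right half */
        else
            return mid;          /* found */
    }
    return -1;                   /* segment is empty: value is absent */
}
```

Each pass halves the segment between lo and hi, so about log2(n) comparisons suffice. With 0-based indices the item 20 in [10, 20, 30, 40, 50, 60] is found at index 1, which corresponds to LOC = 2 in the 1-based trace.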
Pointers
• Pointers are special variables which contain the address of another memory location.
• Pointers are useful for accessing a memory location directly.
• The address of a memory location is an integer value, which is stored in a pointer-type variable.
• Subtracting two pointers that point into the same array gives the number of elements between them; adding two pointers is not allowed in C.
Pointers
• & – address operator – gives the address of a variable.
• %p – format specifier for printing an address: printf("address of X = %p", (void *)&X);
• Declare the base variable
  – int X;
• Declare a pointer-type variable related to the base variable
  – int *P;
• Establish the relation between the base variable and the pointer variable
  – P = &X;
• Example: X holds the value 25 and is stored at address 1011.
  Variable   Value   Meaning
  X          25      base variable, stored at address 1011
  P          1011    pointer-type variable holding the address of X
• With the help of pointer P one can access the variable X: *P = "value at pointer P" = value at memory location 1011 = value of variable X = 25
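The declarations above can be exercised with two small helpers. The function names set_through and show are hypothetical; the actual address printed depends on the machine, so 1011 from the figure should not be expected.

```c
#include <assert.h>
#include <stdio.h>

/* Store value in the int that p points to; *p means "value at address p". */
void set_through(int *p, int value)
{
    assert(p != NULL);
    *p = value;
}

/* Print the value a pointer refers to and the address it holds.
 * %p expects a void * argument, hence the cast. */
void show(const char *name, int *p)
{
    assert(name != NULL && p != NULL);
    printf("%s = %d, address = %p\n", name, *p, (void *)p);
}
```

After int X = 25; int *P = &X; the call set_through(P, 30) changes X itself to 30, because P holds the address of X.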
Pointers
• Another pointer variable can store the address of a pointer variable:
    int a = 2;
    int *b;
    int **c;   /* c is a pointer to a pointer: it will contain the address of pointer variable b */
    b = &a;
    c = &b;
• Example: a is stored at address 1011 and b is stored at address 1021.
  Variable   Value   Meaning
  a          2       base variable, stored at address 1011
  b          1011    pointer-type variable, stored at address 1021
  c          1021    pointer-to-pointer-type variable
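One common use of a pointer to a pointer is letting a function change where the caller's pointer points. The sketch below uses a hypothetical function name, retarget, to show this.

```c
#include <assert.h>
#include <stddef.h>

/* Make the caller's pointer (*pp) point at target.
 * pp is the address of a pointer variable, so writing through it
 * changes that pointer variable itself. */
void retarget(int **pp, int *target)
{
    assert(pp != NULL);
    *pp = target;
}
```

With int a = 2; int *b = &a; int **c = &b; the expression **c reads a through two levels of indirection, and retarget(c, &d) makes b point at another variable d.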
Pointer Arrays
• On array declaration, a sufficient amount of storage is allocated by the compiler.
• In most expressions the compiler treats the name of the array as a pointer to its first element.
• Example (the figure assumes 2-byte integers and a base address of 100):
  Element: arr[0]  arr[1]  arr[2]  arr[3]
  Value:   2       6       4       8
  Address: 100     102     104     106
• The name arr acts as a pointer to the first element: arr is equivalent to &arr[0], which is address 100 here.
    int *ip;
    ip = arr;      /* same as ip = &arr[0]; ip holds 100 */
    ip + 1         /* &arr[1] = 102 */
    ip + 2         /* &arr[2] = 104 */
    ip++;          /* advances ip by one element, that is by the number of bytes its data type holds */
• Address of arr[2] = base address + (2 * size of int) = 100 + (2*2) = 104
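Pointer arithmetic on an array can be sketched as a loop that walks a pointer from the first element to one past the last. The function name sum_with_pointer is hypothetical; on most current machines an int is 4 bytes rather than the 2 bytes assumed in the figure, so the code uses sizeof instead of a fixed size.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the n ints starting at arr by stepping a pointer through them.
 * ip++ moves ip forward by one element (sizeof(int) bytes). */
int sum_with_pointer(const int *arr, size_t n)
{
    assert(arr != NULL);
    int sum = 0;
    for (const int *ip = arr; ip < arr + n; ip++)
        sum += *ip;
    return sum;
}
```

For the array 2, 6, 4, 8 the result is 20. The expression *(arr + 2) is the same element as arr[2], matching the rule base address + (2 * size of int).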
Call By Value
• Passing the actual values of variables as arguments to a function is called call by value. The function works on copies, so the caller's variables do not change.

    #include <stdio.h>

    void value(int p, int q)    /* p and q are copies of a and b */
    {
        p++;
        q++;
    }

    int main(void)
    {
        int a = 2;
        int b = 1;
        printf("Before calling the function, a and b are %d %d\n", a, b);
        value(a, b);
        printf("After calling the function, a and b are %d %d\n", a, b);
        return 0;
    }

  Memory picture (addresses are illustrative):
  Before calling the function:   a (1000) = 2, b (2000) = 1
  In function, on entry:         p (5000) = 2, q (8000) = 1
  After p++ and q++ in function: p (5000) = 3, q (8000) = 2
  After calling the function:    a (1000) = 2, b (2000) = 1
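Call By Reference
• Pass the addresses of the variables as parameters to the function. The function can then change the caller's variables through the pointers.

    #include <stdio.h>

    void value(int *p, int *q)    /* p and q hold the addresses of a and b */
    {
        (*p)++;
        (*q)++;
    }

    int main(void)
    {
        int a = 2;
        int b = 1;
        printf("Before calling the function, a and b are %d %d\n", a, b);
        value(&a, &b);
        printf("After calling the function, a and b are %d %d\n", a, b);
        return 0;
    }

  Memory picture (addresses are illustrative):
  Before calling the function:         a (1000) = 2, b (2000) = 1
  In function:                         p (5000) = 1000, q (8000) = 2000
  After (*p)++ and (*q)++ in function: a (1000) = 3, b (2000) = 2
  After calling the function:          a (1000) = 3, b (2000) = 2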
Records
• Recall that elements of arrays must all be of the same type:
  scores:  85 79 92 57 68 80 . . .
  index:   0  1  2  3  4  5  . . . 98 99
• In some situations, we wish to group elements of different types:
  employee: R. Jones | 123 Elm | 6/12/55 | $14.75
Record Structures
• Collections of data are organized into a hierarchy of fields, records and files.
• Record – a collection of related (not necessarily similar) data items, each of which is called a field or attribute.
• File – a collection of similar records.
• A record is a collection of non-homogeneous data: the data items in a record may have different data types.
• The data items in a record are indexed by attribute names.
Record Structures
• Record for a newborn baby in a hospital (the numbers are level numbers):
  1 Newborn
    2 Name
    2 Sex
    2 Birthday
      3 Month
      3 Day
      3 Year
    2 Father
      3 Name
      3 Age
    2 Mother
      3 Name
      3 Age
• A record can also be an array of elements: Newborn[20] indicates a file with 20 records, Newborn[1], Newborn[2], …
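In C a record of this kind is usually written as a struct, with nested structs for the group items. The sketch below is one possible layout: the field sizes (20-character names), the type names and the helper set_name are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

struct date   { int month; int day; int year; };
struct parent { char name[20]; int age; };

struct newborn {
    char          name[20];
    char          sex;
    struct date   birthday;   /* group item: month, day, year */
    struct parent father;     /* group item: name, age */
    struct parent mother;     /* group item: name, age */
};

/* Copy src into a 20-character name field, always leaving room
 * for the terminating '\0'. */
void set_name(char dest[20], const char *src)
{
    assert(dest != NULL && src != NULL);
    strncpy(dest, src, 19);
    dest[19] = '\0';
}
```

An array such as struct newborn ward[20]; then plays the role of Newborn[20], and a field is reached by attribute names, for example ward[0].birthday.year or ward[0].father.age. Note that C indexes the records from 0 rather than 1.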
Record Structures
  1 Student(20)
    2 Name
      3 Last
      3 First
      3 MI
    2 Test(3)
    2 Final
    2 Grade
• To access the last name of the first student: Student.Name.Last[1]
• To access the marks of the third test of the first student: Student.Test[1][3]
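The Student record could be declared in C along the following lines. The field types, the constants NUM_STUDENTS and NUM_TESTS, and the helper test_average are assumptions added for illustration; the slides only give the field names and the sizes 20 and 3.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_STUDENTS 20
#define NUM_TESTS    3

struct student {
    struct {
        char last[20];
        char first[20];
        char mi;              /* middle initial */
    } name;
    int  test[NUM_TESTS];     /* Test(3) */
    int  final;
    char grade;
};

/* Average of the test scores of one student. */
double test_average(const struct student *s)
{
    assert(s != NULL);
    int sum = 0;
    for (int i = 0; i < NUM_TESTS; i++)
        sum += s->test[i];
    return (double)sum / NUM_TESTS;
}
```

With struct student students[NUM_STUDENTS]; the last name of the first student is students[0].name.last and the third test of the first student is students[0].test[2], since C places the index on the array and counts from 0.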