SET DATA STRUCTURE (Part 2)
54

Set data structure 2

Dec 17, 2014


Technology

Tech_MX

 
Transcript
Page 1: Set data structure 2

SET DATA STRUCTURE(Part 2)

Page 2: Set data structure 2

Representation of Sets

LIST
HASH TABLE
BIT VECTORS
TREE

Page 3: Set data structure 2

LIST REPRESENTATION OF SETS

Page 4: Set data structure 2

Simplest and most straightforward representation.
Best suited for dynamic storage facilities.
It allows multiplicity of elements, i.e., a bag structure.
All operations can be implemented easily, and their performance is comparable to that of the other representations.

Ex: the set S = {5, 6, 9, 3, 2, 7, 1} stored using a linked list structure:

[Figure: singly linked list whose nodes hold the elements 5, 6, 9, 3, 2, 7 and 1]

Page 5: Set data structure 2

Operations on List Representation of sets

UNION:

[Figure: linked lists Si and Sj and the resulting list Si U Sj]

Page 6: Set data structure 2

ALGORITHM: UNION_LIST_SETS(Si, Sj; S)

Input: Si and Sj are the headers of two singly linked lists representing two distinct sets.

Output: S is the union of Si and Sj.

Data structure: Linked list representation of sets.

Page 7: Set data structure 2

STEPS

/* Get a header node for S and initialize it */
1. S = GETNODE(NODE)
2. S.LINK = NULL, S.DATA = NULL

/* Copy the entire list of Si into S */
3. ptri = Si.LINK
4. While (ptri != NULL) do
   1. DATA = ptri.DATA
   2. INSERT_SL_FRONT(S, DATA)
   3. ptri = ptri.LINK
5. EndWhile

/* Add each element of Sj to S if it is not in Si */
6. ptrj = Sj.LINK

Page 8: Set data structure 2

7. While (ptrj != NULL) do
   1. ptri = Si.LINK
   2. While (ptri != NULL) and (ptri.DATA != ptrj.DATA) do
      1. ptri = ptri.LINK
   3. EndWhile
   4. If (ptri = NULL) then // the element is not in Si
      1. INSERT_SL_FRONT(S, ptrj.DATA)
   5. EndIf
   6. ptrj = ptrj.LINK
8. EndWhile
9. Return (S)
10. Stop
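The union steps can be sketched in Python roughly as follows. This is a minimal illustration, not the author's implementation: the `Node` class, the helper names and the sample values for Si and Sj are assumptions made for the example.

```python
class Node:
    """One node of a singly linked list; the header node carries no data."""
    def __init__(self, data=None, link=None):
        self.data = data
        self.link = link


def insert_front(header, data):
    """Assumed counterpart of INSERT_SL_FRONT: add a node right after the header."""
    header.link = Node(data, header.link)


def contains(header, data):
    """Linear search; checks for the end of the list before reading the data."""
    ptr = header.link
    while ptr is not None and ptr.data != data:
        ptr = ptr.link
    return ptr is not None


def union_list_sets(si, sj):
    s = Node()                       # header node for the result
    ptr = si.link
    while ptr is not None:           # copy the entire list Si into S
        insert_front(s, ptr.data)
        ptr = ptr.link
    ptr = sj.link
    while ptr is not None:           # add elements of Sj that are not in Si
        if not contains(si, ptr.data):
            insert_front(s, ptr.data)
        ptr = ptr.link
    return s


def from_values(values):
    """Build a list with a header node, keeping the given order."""
    header = Node()
    for v in reversed(values):
        insert_front(header, v)
    return header


def to_values(header):
    out, ptr = [], header.link
    while ptr is not None:
        out.append(ptr.data)
        ptr = ptr.link
    return out


# Hypothetical sample sets, chosen only for illustration.
si = from_values([5, 6, 9, 3, 2])
sj = from_values([7, 6, 5, 1])
union_values = to_values(union_list_sets(si, sj))
print(sorted(union_values))
```

Because every element of Sj is searched for in Si, the running time grows with the product of the two list lengths.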

Page 9: Set data structure 2

INTERSECTION:

[Figure: linked lists Si and Sj and the resulting list Si ∩ Sj]

Page 10: Set data structure 2

ALGORITHM: INTERSECTION_LIST_SETS(Si, Sj; S)

Input: Si and Sj are the headers of two singly linked lists representing two distinct sets.

Output: S is the intersection of Si and Sj.

Data structure: Linked list representation of sets.

Page 11: Set data structure 2

STEPS:

/* Get a header node for S and initialize it */
1. S = GETNODE(NODE)
2. S.LINK = NULL, S.DATA = NULL

/* Search the list Sj for each element in Si */
3. ptri = Si.LINK
4. While (ptri != NULL) do
   1. ptrj = Sj.LINK
   2. While (ptrj != NULL) and (ptrj.DATA != ptri.DATA) do
      1. ptrj = ptrj.LINK

Page 12: Set data structure 2

   3. EndWhile
   4. If (ptrj != NULL) then // the element is found in Sj
      1. INSERT_SL_FRONT(S, ptrj.DATA)
   5. EndIf
   6. ptri = ptri.LINK
5. EndWhile
6. Return (S)
7. Stop
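A possible Python rendering of the intersection steps is given below. It is only a sketch: the `Node` class, the `build` and `items` helpers and the sample values are assumptions introduced for the example. The inner loop tests for the end of the list before it reads the data field, and the outer pointer advances to the next node of Si on every pass.

```python
class Node:
    def __init__(self, data=None, link=None):
        self.data, self.link = data, link


def build(values):
    """Build a list with a header node, keeping the given order."""
    header = Node()
    for v in reversed(values):
        header.link = Node(v, header.link)
    return header


def items(header):
    out, ptr = [], header.link
    while ptr is not None:
        out.append(ptr.data)
        ptr = ptr.link
    return out


def intersection_list_sets(si, sj):
    s = Node()                                   # header node for the result
    ptri = si.link
    while ptri is not None:
        ptrj = sj.link
        # search Sj for the current element of Si
        while ptrj is not None and ptrj.data != ptri.data:
            ptrj = ptrj.link
        if ptrj is not None:                     # found in Sj
            s.link = Node(ptrj.data, s.link)     # insert at the front of S
        ptri = ptri.link                         # move on within Si
    return s


# Hypothetical sample sets, chosen only for illustration.
common = items(intersection_list_sets(build([5, 6, 9, 3, 2]), build([7, 6, 5, 1])))
print(common)
```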

Page 13: Set data structure 2

DIFFERENCE:

[Figure: linked lists Si and Sj and the resulting list Si - Sj]

Page 14: Set data structure 2

ALGORITHM: DIFFERENCE_LIST_SETS(Si, Sj; S)

Input: Si and Sj are the headers of two singly linked lists representing two distinct sets.

Output: S is the difference of Si and Sj.

Data structure: Linked list representation of sets.

Page 15: Set data structure 2

STEPS:

/* Get a header node for S and initialize it */
1. S = GETNODE(NODE)
2. S.LINK = NULL, S.DATA = NULL

/* Get S', the intersection of Si and Sj */
3. S' = INTERSECTION_LIST_SETS(Si, Sj)

/* Copy the entire list Si into S */
4. ptri = Si.LINK
5. While (ptri != NULL) do
   1. INSERT_SL_FRONT(S, ptri.DATA)
   2. ptri = ptri.LINK
6. EndWhile

Page 16: Set data structure 2

/* For each element in S', delete it from S if it is there */
7. ptr = S'.LINK
8. While (ptr != NULL) do
   1. DELETE_SL_ANY(S, ptr.DATA)
   2. ptr = ptr.LINK
9. EndWhile
10. Return (S)
11. Stop
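The difference can be sketched in Python along the same lines. Note one deliberate simplification: instead of first building the intersection S', this version deletes every element of Sj from the copy of Si, which yields the same set. The `Node` class, the helpers (including `delete_any`, standing in for DELETE_SL_ANY) and the sample values are assumptions for illustration.

```python
class Node:
    def __init__(self, data=None, link=None):
        self.data, self.link = data, link


def build(values):
    """Build a list with a header node, keeping the given order."""
    header = Node()
    for v in reversed(values):
        header.link = Node(v, header.link)
    return header


def items(header):
    out, ptr = [], header.link
    while ptr is not None:
        out.append(ptr.data)
        ptr = ptr.link
    return out


def delete_any(header, data):
    """Unlink the first node holding `data`; do nothing if it is absent."""
    prev = header
    while prev.link is not None and prev.link.data != data:
        prev = prev.link
    if prev.link is not None:
        prev.link = prev.link.link


def difference_list_sets(si, sj):
    s = Node()
    ptr = si.link
    while ptr is not None:          # copy the entire list Si into S
        s.link = Node(ptr.data, s.link)
        ptr = ptr.link
    ptr = sj.link
    while ptr is not None:          # remove whatever also occurs in Sj
        delete_any(s, ptr.data)
        ptr = ptr.link
    return s


# Hypothetical sample sets, chosen only for illustration.
diff = items(difference_list_sets(build([5, 6, 9, 3, 2]), build([7, 6, 5, 1])))
print(diff)
```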

Page 17: Set data structure 2

ALGORITHM: EQUALITY_LIST_SETS(Si, Sj)

Input: Si and Sj are the headers of two singly linked lists representing two distinct sets.

Output: Returns TRUE if the two sets Si and Sj are equal, else FALSE.

Data structure: Linked list representation of sets.

Page 18: Set data structure 2

STEPS

/* Count the elements of Si and Sj */
1. li = 0, lj = 0
2. ptr = Si.LINK // to count Si
3. While (ptr != NULL) do
   1. li = li + 1
   2. ptr = ptr.LINK
4. EndWhile
5. ptr = Sj.LINK // to count Sj
6. While (ptr != NULL) do
   1. lj = lj + 1
   2. ptr = ptr.LINK
7. EndWhile
8. If (li != lj) then // sets of different sizes cannot be equal
   1. Return (FALSE)
9. EndIf

/* Compare the elements in Si and Sj */

Page 19: Set data structure 2

10. ptri = Si.LINK, flag = TRUE
11. While (ptri != NULL) and (flag = TRUE) do
    1. ptrj = Sj.LINK
    2. While (ptrj != NULL) and (ptrj.DATA != ptri.DATA) do
       1. ptrj = ptrj.LINK
    3. EndWhile
    4. ptri = ptri.LINK
    5. If (ptrj = NULL) then // the element of Si is missing from Sj
       1. flag = FALSE
    6. EndIf
12. EndWhile
13. Return (flag)
14. Stop
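As a rough Python counterpart of the equality test, the sketch below first compares the sizes and then looks up every element of Si in Sj. It assumes that neither list contains duplicates (true sets rather than bags); the `Node` class, helper names and sample values are assumptions for illustration.

```python
class Node:
    def __init__(self, data=None, link=None):
        self.data, self.link = data, link


def build(values):
    """Build a list with a header node, keeping the given order."""
    header = Node()
    for v in reversed(values):
        header.link = Node(v, header.link)
    return header


def length(header):
    count, ptr = 0, header.link
    while ptr is not None:
        count += 1
        ptr = ptr.link
    return count


def equality_list_sets(si, sj):
    if length(si) != length(sj):     # different sizes: cannot be equal
        return False
    ptri = si.link
    while ptri is not None:
        ptrj = sj.link
        while ptrj is not None and ptrj.data != ptri.data:
            ptrj = ptrj.link
        if ptrj is None:             # element of Si not present in Sj
            return False
        ptri = ptri.link
    return True


same = equality_list_sets(build([1, 2, 3]), build([3, 1, 2]))
different = equality_list_sets(build([1, 2, 3]), build([1, 2, 4]))
print(same, different)
```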

Page 20: Set data structure 2

TREE REPRESENTATION OF SETS

Page 21: Set data structure 2

► Here a tree is used to represent one set, and all elements of the set share the same root.
► Each element in a set has a pointer to its parent.
► Let us consider the sets S1 = {1, 3, 5, 7, 9, 11, 13}, S2 = {2, 4, 8} and S3 = {6}.

[Figure: three trees, one per set. S1 is rooted at 1, S2 at 2, and S3 consists of the single node 6.]

Page 22: Set data structure 2

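Tree representation of set S1 = {1, 3, 5, 7, 9, 11, 13}

[Figure: tree S1 with root 1; 3, 5, 7 and 9 are children of 1; 11 and 13 are children of 7.]

Parent array (the index is the element, the value is its parent; 0 marks the root and -- an unused slot):

Index:    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
Parent:   0  --   1  --   1  --   1  --   1  --   7  --   7  --  --  --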
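With a parent array like the one for S1, finding the set an element belongs to amounts to following parent links until the root is reached. The sketch below assumes that 0 marks the root and uses `None` for the unused "--" slots; the function name `find` and this encoding are assumptions for illustration.

```python
# Parent array for S1, indexed by element. Index 0 is a dummy entry so that
# element i sits at position i; None stands for an unused "--" slot.
parent = [None,
          0, None, 1, None, 1, None, 1, None,
          1, None, 7, None, 7, None, None, None]


def find(parent, x):
    """Return the root of the tree containing x by walking up parent links."""
    if parent[x] is None:
        raise KeyError(f"{x} is not stored in any set")
    while parent[x] != 0:      # 0 marks the root
        x = parent[x]
    return x


root_of_13 = find(parent, 13)  # 13 -> 7 -> 1
root_of_9 = find(parent, 9)    # 9 -> 1
print(root_of_13, root_of_9)
```

Two elements belong to the same set exactly when `find` returns the same root for both.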

Page 23: Set data structure 2

Illustration of FIND method

[Figure: trees S1, S2 and S3 used to illustrate FIND]

Index:    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
Value:   -4  -3  -3   2   1   3   1   1   3  --   7  --  --  --  --  --

Page 24: Set data structure 2

HASH TABLE REPRESENTATION OF SETS

Page 25: Set data structure 2

Here the elements of the collection are separated into a number of buckets.

Each bucket can hold an arbitrary number of elements.

Consider the set S = {2, 5, 7, 16, 17, 23, 34, 42}. A hash table with 4 buckets and a hash function H(x) can place each element of S into one of the four buckets.

[Figure: hash table with Bucket 1 to Bucket 4 holding the elements 2, 34, 16, 42, 7, 23, 5 and 17]
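A small Python sketch of such a bucketed set follows. The slides do not state H(x); the choice H(x) = x mod 4 is an assumption made for this example, as are the function names.

```python
NUM_BUCKETS = 4


def h(x):
    """Assumed hash function: remainder modulo the number of buckets."""
    return x % NUM_BUCKETS


def build_hash_set(elements):
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for x in elements:
        bucket = buckets[h(x)]
        if x not in bucket:        # a set stores each element only once
            bucket.append(x)
    return buckets


def member(buckets, x):
    """Only the bucket selected by h(x) has to be searched."""
    return x in buckets[h(x)]


table = build_hash_set([2, 5, 7, 16, 17, 23, 34, 42])
for number, bucket in enumerate(table):
    print(number, bucket)
```

The membership test only scans one bucket, which is the main advantage over the plain list representation.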

Page 26: Set data structure 2

Operation on Hash table Representation of Sets

[Figure: hash tables of Set Si, holding the elements A to L, and of Set Sj]

Page 27: Set data structure 2

UNION: S = Si U Sj

[Figure: hash table of the union, holding A, B, C, D, E, F, G, H, I, J, K, L, N, Q, S, T, U, V, X and Z]

Page 28: Set data structure 2

INTERSECTION

[Figure: hash table of Si ∩ Sj, holding B, E, I and K]

DIFFERENCE

[Figure: hash table of Si - Sj, holding A, C, D, F, G, H, J and L]
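When two sets use the same hash function and the same number of buckets, the set operations can be carried out bucket by bucket. The sketch below illustrates this; the hash function (character code modulo 8), the bucket count and the exact element lists for Si and Sj are assumptions based on a reading of the figures, not details given in the slides.

```python
NUM_BUCKETS = 8


def h(key):
    """Assumed hash function for single-letter keys."""
    return ord(key) % NUM_BUCKETS


def build(keys):
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for k in keys:
        if k not in buckets[h(k)]:
            buckets[h(k)].append(k)
    return buckets


def union(a, b):
    # equal keys always land in the same bucket, so buckets combine pairwise
    return [ba + [k for k in bb if k not in ba] for ba, bb in zip(a, b)]


def intersection(a, b):
    return [[k for k in ba if k in bb] for ba, bb in zip(a, b)]


def difference(a, b):
    return [[k for k in ba if k not in bb] for ba, bb in zip(a, b)]


def elements(table):
    return sorted(k for bucket in table for k in bucket)


si = build("ABCDEFGHIJKL")
sj = build("BEIKNQSTUVXZ")
print(elements(union(si, sj)))
print(elements(intersection(si, sj)))
print(elements(difference(si, sj)))
```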

Page 29: Set data structure 2

BIT VECTOR REPRESENTATION OF SETS

Page 30: Set data structure 2

VARIATION OF SETS

MAINTAINING THE INDICATION OF PRESENCE OR ABSENCE OF DATA

MAINTAINING ACTUAL DATA VALUE

Page 31: Set data structure 2

A set recording which cricketers are aged 35 or less is given below:

{0,0,0,0,1,1,1,1,0,1,1}

Here 1 indicates the presence of a record with an age less than or equal to 35.

0 indicates the absence of such a record.

As only the presence or absence of an element has to be indicated, a 0 or a 1 suffices, which saves storage space.

A bit array data structure serves this purpose. A bit array is simply an array containing the values 0 or 1 (binary).

Page 32: Set data structure 2

Operations on bit vector representation

• It is very easy to implement set operations on the bit array data structure.

• The operations are well defined only if the bit arrays representing the two sets are of the same size.

Page 33: Set data structure 2

UNION

To obtain the union of sets Si and Sj, the bit-wise OR operation can be used.

Si and Sj are given below:

Si      = 1 0 0 1 0 1 1 0 0 1

Sj      = 0 0 1 1 1 0 0 1 0 0

Si U Sj = 1 0 1 1 1 1 1 1 0 1

Page 34: Set data structure 2

ALGORITHM: UNION_BIT_SETS(Si, Sj; S)

Input: Si and Sj are two bit arrays corresponding to two sets.

Output: A bit array S that is the result of Si U Sj.

Data structure: Bit vector representation of sets.

Page 35: Set data structure 2

1. li = LENGTH(Si) // Size of Si
2. lj = LENGTH(Sj) // Size of Sj
3. If (li != lj) then
   1. Print "Two sets are not compatible for union"
   2. Exit
4. EndIf

/* Loop over the underlying bit arrays and bit-wise OR their constituent data */
5. For i = 1 to li do
   1. S[i] = Si[i] OR Sj[i]
6. EndFor
7. Return (S)
8. Stop
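In Python, the union steps might look like the sketch below, using lists of 0/1 values as bit arrays. Raising `ValueError` in place of the print-and-exit step is a choice made for this example.

```python
def union_bit_sets(si, sj):
    if len(si) != len(sj):
        raise ValueError("Two sets are not compatible for union")
    # bit-wise OR of corresponding positions
    return [a | b for a, b in zip(si, sj)]


si = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
sj = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
union_bits = union_bit_sets(si, sj)
print(union_bits)
```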

Page 36: Set data structure 2

INTERSECTION

To obtain the intersection of sets Si and Sj, the bit-wise AND operation can be used.

Si and Sj are given below:

Si      = 1 0 0 1 0 1 1 0 0 1

Sj      = 0 0 1 1 1 0 0 1 0 0

Si ∩ Sj = 0 0 0 1 0 0 0 0 0 0

Page 37: Set data structure 2

ALGORITHM: INTERSECTION_BIT_SETS(Si, Sj; S)

Input: Si and Sj are two bit arrays corresponding to two sets.

Output: A bit array S that is the result of Si ∩ Sj.

Data structure: Bit vector representation of sets.

Page 38: Set data structure 2

STEPS:

1. li = LENGTH(Si) // Size of Si
2. lj = LENGTH(Sj) // Size of Sj
3. If (li != lj) then
   1. Print "Two sets are not compatible for intersection"
   2. Exit
4. EndIf

/* Loop over the underlying bit arrays and bit-wise AND their constituent data */
5. For i = 1 to li do
   1. S[i] = Si[i] AND Sj[i]
6. EndFor
7. Return (S)
8. Stop
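When a bit array is packed into a machine word or a Python integer, the whole loop collapses into a single AND. The sketch below shows this variant on bit strings; representing the sets as strings of "0" and "1" is an assumption made for the example.

```python
def intersection_bit_sets(si, sj):
    """si and sj are strings of '0'/'1' characters of equal length."""
    if len(si) != len(sj):
        raise ValueError("Two sets are not compatible for intersection")
    width = len(si)
    result = int(si, 2) & int(sj, 2)      # one AND over all positions at once
    return format(result, f"0{width}b")   # pad back to the original width


common_bits = intersection_bit_sets("1001011001", "0011100100")
print(common_bits)
```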

Page 39: Set data structure 2

DIFFERENCE

The difference of Si from Sj is the set of values that appear in Si but not in Sj. This can be obtained using a bit-wise AND of Si with the inverse of Sj.

Si and Sj are given below:

Si  = 1 0 0 1 0 1 1 0 0 1

Sj  = 0 0 1 1 1 0 0 1 0 0

Sj’ = 1 1 0 0 0 1 1 0 1 1

S = Si - Sj = Si ∩ Sj’ = 1 0 0 0 0 1 1 0 0 1

Page 40: Set data structure 2

ALGORITHM: DIFFERENCE_BIT_SETS(Si, Sj; S)

Input: Si and Sj are two bit arrays corresponding to two sets.

Output: A bit array S that is the result of Si - Sj.

Data structure: Bit vector representation of sets.

Page 41: Set data structure 2

STEPS:

1. li = LENGTH(Si) // Size of Si
2. lj = LENGTH(Sj) // Size of Sj
3. If (li != lj) then
   1. Print "Two sets are not compatible for difference"
   2. Exit
4. EndIf

/* Find the inverse (NOT) of Sj, kept in Sj' so that Sj itself is unchanged */
5. For i = 1 to li do
   1. Sj'[i] = NOT Sj[i]
6. EndFor

/* Loop over the underlying bit arrays and bit-wise AND */
7. For i = 1 to li do
   1. S[i] = Si[i] AND Sj'[i]
8. EndFor
9. Return (S)
10. Stop
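A compact Python sketch of the difference is shown below. It folds the inversion and the AND into one pass and leaves the input arrays untouched; the use of `1 - b` as the NOT of a single 0/1 value is an implementation choice for this example.

```python
def difference_bit_sets(si, sj):
    if len(si) != len(sj):
        raise ValueError("Two sets are not compatible for difference")
    # Si AND (NOT Sj), position by position; 1 - b inverts a 0/1 value
    return [a & (1 - b) for a, b in zip(si, sj)]


si = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
sj = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
diff_bits = difference_bit_sets(si, sj)
print(diff_bits)
```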

Page 42: Set data structure 2

EQUALITY

The equality operation is used to determine whether two sets Si and Sj are equal or not.

This can be achieved by a simple pair-wise comparison of the bit values in the two bit arrays.

Page 43: Set data structure 2

ALGORITHM: EQUALITY_BIT_SETS(Si, Sj)

Input: Si and Sj are two bit arrays corresponding to two sets.

Output: Returns TRUE if they are equal, else FALSE.

Data structure: Bit vector representation of sets.

Page 44: Set data structure 2

STEPS:

1. li = LENGTH(Si) // Size of Si
2. lj = LENGTH(Sj) // Size of Sj
3. If (li != lj) then
   1. Return (FALSE) // return with failure
4. EndIf

/* Loop over the underlying bit arrays and compare */
5. For i = 1 to li do
   1. If (Si[i] != Sj[i]) then
      1. Return (FALSE) // return with failure
   2. EndIf
6. EndFor

/* Otherwise the two sets are equal */
7. Return (TRUE)
8. Stop
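The equality test translates almost line for line into Python; the sketch below is one way to write it, with the function name taken from the algorithm title.

```python
def equality_bit_sets(si, sj):
    if len(si) != len(sj):
        return False               # different sizes: not equal
    for a, b in zip(si, sj):
        if a != b:                 # first mismatching position decides
            return False
    return True


equal = equality_bit_sets([1, 0, 1, 1], [1, 0, 1, 1])
unequal = equality_bit_sets([1, 0, 1, 1], [1, 0, 0, 1])
print(equal, unequal)
```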

Page 45: Set data structure 2

Application of Set Data Structure

Page 46: Set data structure 2

Information storing using bit string

Let us consider a technique for the storage and retrieval of information using bit strings. A bit string is a string of 0's and 1's; for example, 1000110011 is a bit string.

Let us now see how information can be stored and retrieved using bit strings. Let us assume a simple database that stores the information of 10 students, with the information structure stated below:

Page 47: Set data structure 2

NAME  REG NO  SEX  DISCIPLINE  MODULE  CATEGORY  ADDRESS
AAA   A1      M    CS          C       SC        ---
BBB   A2      M    CE          P       GN        ---
CCC   A3      F    ME          D       GN        ---
DDD   A4      F    EC          D       GN        ---
EEE   A5      M    EE          P       ST        ---
FFF   A6      M    AE          C       SC        ---
GGG   A7      F    ME          C       ST        ---
HHH   A8      M    CE          D       GN        ---
III   A9      F    CS          P       SC        ---
JJJ   A10     M    AE          P       ST        ---

A SAMPLE DATABASE WITH 10 RECORDS

Page 48: Set data structure 2

Name: String of characters of length 25.

Reg No: Alphanumeric string of length 15.

Sex: A single-character value coded as
   F = Female, M = Male

Discipline: Two-character value coded as
   AE = Agricultural Engineering
   CE = Civil Engineering
   CS = Computer Science and Engineering
   EC = Electronics and Communication Engineering
   EE = Electrical Engineering
   ME = Mechanical Engineering

Module: One-character value coded as
   C = Certificate, P = Diploma, D = Degree

Category: Two-character value coded as
   GN = General, SC = Scheduled Caste, ST = Scheduled Tribe, OC = Other Category

Address: Alphanumeric string of length 50.

Page 49: Set data structure 2

Length of each bit string = number of records (here 10).

To store a particular column, we need one bit array (bit string) for each value that the field can take.

The number of bit arrays is therefore determined by the number of different values of each field.

For example:

Sex: 2, for M or F

Discipline: 6, for the six different branches

Module: 3, for the three different streams

Category: 4, for the different categories

Altogether, 15 bit arrays, each of length 10, are required to store the information in this case.

In each bit array, a ‘1’ in the i-th position means that the i-th record has the corresponding attribute value and a ‘0’ means that it does not.

Page 50: Set data structure 2

ARRAY  BIT STRING
M      1100110101
F      0011001010
AE     0000010001
CE     0100000100
CS     1000000010
EC     0001000000
EE     0000100000
ME     0010001000
C      1000011000
P      0100100011
D      0011000100
GN     0111000100
SC     1000010010
ST     0000101001
OC     0000000000
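The bit strings can be derived mechanically from the sample records: for each coded value, position i holds 1 exactly when record i has that value. The Python sketch below illustrates this; the tuple layout, the `COLUMNS` table and the function name are assumptions introduced for the example, and the address field is left out because it is not encoded.

```python
# (name, reg no, sex, discipline, module, category) for the 10 sample records
records = [
    ("AAA", "A1", "M", "CS", "C", "SC"),
    ("BBB", "A2", "M", "CE", "P", "GN"),
    ("CCC", "A3", "F", "ME", "D", "GN"),
    ("DDD", "A4", "F", "EC", "D", "GN"),
    ("EEE", "A5", "M", "EE", "P", "ST"),
    ("FFF", "A6", "M", "AE", "C", "SC"),
    ("GGG", "A7", "F", "ME", "C", "ST"),
    ("HHH", "A8", "M", "CE", "D", "GN"),
    ("III", "A9", "F", "CS", "P", "SC"),
    ("JJJ", "A10", "M", "AE", "P", "ST"),
]

# (position of the field in a record, codes the field can take)
COLUMNS = [
    (2, ["M", "F"]),
    (3, ["AE", "CE", "CS", "EC", "EE", "ME"]),
    (4, ["C", "P", "D"]),
    (5, ["GN", "SC", "ST", "OC"]),
]


def build_bit_arrays(records):
    """Return one bit string per code; bit i is '1' if record i has the code."""
    arrays = {}
    for column, codes in COLUMNS:
        for code in codes:
            arrays[code] = "".join(
                "1" if rec[column] == code else "0" for rec in records
            )
    return arrays


arrays = build_bit_arrays(records)
for code, bits in arrays.items():
    print(f"{code:<3}{bits}")
```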

Page 51: Set data structure 2

Information retrieval using bit string

How many students are there in the Computer Science and Engineering discipline?

To retrieve this information, only the bit array CS needs to be searched for the number of 1’s in it.

Who are the female students in the CS discipline? For this information, compute F AND CS:

[0 0 1 1 0 0 1 0 1 0] AND [1 0 0 0 0 0 0 0 1 0] = [0 0 0 0 0 0 0 0 1 0]

Thus it gives the 9th record only.

How many students of General Category are there in the Diploma or Degree module?

GN AND (P OR D) = [0 1 1 1 0 0 0 1 0 0], which gives four records (2, 3, 4 and 8).
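The three queries can be reproduced with a few lines of Python. The sketch below works directly on the bit strings of the arrays involved; the helper names `AND`, `OR` and `records_of` are assumptions for illustration.

```python
bits = {
    "F":  "0011001010",
    "CS": "1000000010",
    "GN": "0111000100",
    "P":  "0100100011",
    "D":  "0011000100",
}


def AND(a, b):
    return "".join("1" if x == "1" and y == "1" else "0" for x, y in zip(a, b))


def OR(a, b):
    return "".join("1" if x == "1" or y == "1" else "0" for x, y in zip(a, b))


def records_of(bitstring):
    """1-based positions of the records whose bit is set."""
    return [i for i, bit in enumerate(bitstring, start=1) if bit == "1"]


cs_count = bits["CS"].count("1")                       # students in CS
female_cs = records_of(AND(bits["F"], bits["CS"]))     # female students in CS
gn_diploma_or_degree = records_of(
    AND(bits["GN"], OR(bits["P"], bits["D"]))          # GN AND (P OR D)
)
print(cs_count, female_cs, gn_diploma_or_degree)
```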

Page 52: Set data structure 2

Performance issue of the technique

The technique is efficient from the storage point of view. If v = number of bit arrays and r = number of records, the total number of bits needed is v*r; in our example, 15*10 = 150 bits. In contrast, a conventional method may need 10 bytes each for Sex and Module and 20 bytes each for Discipline and Category for the 10 records, a total of 60 bytes = 480 bits.

Page 53: Set data structure 2

From the computation point of view, this technique is efficient because no searching is involved.

A record can be located through logical operations such as AND, OR and NOT, which makes the computation fast.

One drawback of this technique is that it is not suitable for all kinds of information. For fields where all or nearly all the values are different, such as name, reg no and address, this technique is inefficient.
