Chapter 6 Structured Data Types Arrays Records
Dec 18, 2015
Copyright © 2007 Addison-Wesley. All rights reserved. 1–2
Definitions
• data type– collection of data objects – a set of predefined operations
• descriptor : collection of attributes for a variable
• object : instance of a user-defined (abstract data) type
Copyright © 2007 Addison-Wesley. All rights reserved. 1–3
Structured Data Types
• Built out of other types– usually composed of multiple elements.– homogeneous : all elements have the
same type – heterogeneous : elements have different
types
Copyright © 2007 Addison-Wesley. All rights reserved. 1–4
Structured Data Types
• Arrays– aggregate of homogeneous data elements
indexed by its position
• Associative arrays– unordered collection of key-value pairs
• Records– heterogeneous aggregate of data elements
indexed by element name
Copyright © 2007 Addison-Wesley. All rights reserved. 1–5
Array Operations
• Whole array operations: – assignment– catenation
• Elemental operations same as those of base type
• Indexing : mapping from indexes to elements
array_name (index_value_list) an element
Copyright © 2007 Addison-Wesley. All rights reserved. 1–6
Array Design Issues• What types are legal for subscripts?
• Are subscripting expressions in element references range checked?
• When are subscript ranges bound?
• When does allocation take place?
• What is the maximum number of subscripts?
• Can array objects be initialized?
• Are any kind of slices allowed?
Copyright © 2007 Addison-Wesley. All rights reserved. 1–7
Binding Time Choices
• Static: compile-time binding of subscript range and memory
• Fixed stack-dynamic: subscript ranges static, allocated at declaration time (C, C++)
• Stack-dynamic: run-time binding of subscript range and memory
• Fixed heap-dynamic: storage binding is dynamic but fixed after allocation (Java, C and C++)
• Heap-dynamic: binding of subscript ranges and storage allocation is dynamic (Perl and JavaScript)
Copyright © 2007 Addison-Wesley. All rights reserved. 1–8
Array Initialization• Some language allow initialization at the
time of storage allocation– C, C++, Java, C# example
int list [] = {4, 5, 7, 83} – Character strings in C and C++
char name [] = “freddie”;– Arrays of strings in C and C++
char *names [] = {“Bob”, “Jake”, “Joe”};
Copyright © 2007 Addison-Wesley. All rights reserved. 1–9
Memory for arrays• For 1D arrays, contiguous block of
memory with equal amount of space for each element
• Two approaches for multi-dimensional arrays– Single block of contiguous memory for all
elements• Arrays must be rectangular• Address of array is starting memory location
– Implement as arrays of arrays (Java)• Jagged arrays are possible• Array variable is a pointer (reference)
Copyright © 2007 Addison-Wesley. All rights reserved. 1–10
Element Access
• Access function maps subscript expressions to an address in the array
• Access function for single-dimensioned arrays:address(list[k]) = address (list[lower_bound])
+ ((k-lower_bound) * element_size)
• Two common ways to organize 2D arrays– Row major order (by rows) – used in most languages– Column major order (by columns) – used in Fortran
Copyright © 2007 Addison-Wesley. All rights reserved. 1–11
Memory Allocation for 2D Array
• Row major (by rows) or column major order (by columns) for 2D array
• Access function maps subscript expressions to an address in the array
Copyright © 2007 Addison-Wesley. All rights reserved. 1–12
Row-major access formula
Location (a[I,j]) = address of a [row_lb,col_lb] + (((I - row_lb) * n) + (j - col_lb)) *element_size
Copyright © 2007 Addison-Wesley. All rights reserved. 1–14
Rectangular and Jagged Arrays• A rectangular array is a multi-
dimensioned array in which all of the rows have the same number of elements and all columns have the same number of elements
• A jagged matrix has rows with varying number of elements– Possible when multi-dimensioned arrays
actually appear as arrays of arrays
Copyright © 2007 Addison-Wesley. All rights reserved. 1–15
Slices• A slice is some substructure of an array;
it is nothing more than a referencing mechanism
• Slices are only useful in languages that have array operations– Java allows row slices from 2D arrays– Fortran 95
Integer, Dimension (10) :: VectorInteger, Dimension (3, 3) :: MatInteger, Dimension (3, 3) :: Cube
Vector (3:6) is a four element array
Copyright © 2007 Addison-Wesley. All rights reserved. 1–17
Compile-Time Descriptors
Single-dimensioned array Multi-dimensional array
Copyright © 2007 Addison-Wesley. All rights reserved. 1–18
Associative Arrays
• An associative array is an unordered collection of data elements that are indexed by an equal number of values called keys– A hash table has the same behavior
• Design Issues:1. What is the form of references to
elements?
2. Is the size static or dynamic?
Copyright © 2007 Addison-Wesley. All rights reserved. 1–19
Associative Arrays in Perl• Names begin with %; literals are
delimited by parentheses%hi_temps = ("Mon" => 77, "Tue" => 79, “Wed” => 65, …);
• Subscripting is done using braces and keys$hi_temps{"Wed"} = 83;– Elements can be removed with delete
delete $hi_temps{"Tue"};
Copyright © 2007 Addison-Wesley. All rights reserved. 1–20
Other Languages
• Ruby has hashes– ht = {key1=> vlaue1, …}– use ht[key1] to access
• Python has dictionary type– ht = {key1 : value1, …}– use ht[key1] to access
• In C++, Java provide library classes• In C, must use user-defined type
Copyright © 2007 Addison-Wesley. All rights reserved. 1–21
Record Types• A possibly heterogeneous aggregate of
data elements
• Individual elements identified by field name
• Like a class with no methods and only public data.
• Design issues:– What is the syntactic form of references to
the field? – Are elliptical references allowed
Copyright © 2007 Addison-Wesley. All rights reserved. 1–22
Definition of Records in Ada
• Record structures are indicated in an orthogonal waytype Emp_Rec_Type is record
First: String (1..20);Mid: String (1..10);Last: String (1..20);Hourly_Rate: Float;
end record;Emp_Rec: Emp_Rec_Type;
Copyright © 2007 Addison-Wesley. All rights reserved. 1–23
structs in C• Define a record in C using the struct syntax
struct record {
int var1;
double var2;
}
• Structs can be copiedstruct record r1, r2 // mem for 2 records
r1.var1 = 1; r1.var2 = 2.3;
r2 = r1; // copy data from r1 into r2
Copyright © 2007 Addison-Wesley. All rights reserved. 1–24
References to Record Fields
• Most language use dot notationEmp_Rec.Name
• Fully qualified references must include all record names
• Elliptical references allow leaving out record names as long as the reference is unambiguous, for example in COBOLFIRST, FIRST OF EMP-NAME, and FIRST of EMP-REC are elliptical references to the employee’s first name
Copyright © 2007 Addison-Wesley. All rights reserved. 1–25
Operations on Records
• Assignment is very common if the types are identical
• Ada allows record comparison
• Ada records can be initialized with aggregate literals
• COBOL provides MOVE CORRESPONDING– Copies a field of the source record to the
corresponding field in the target record
Copyright © 2007 Addison-Wesley. All rights reserved. 1–26
Records vs. Arrays
• Straight forward and safe design
• Use records when collection of data values is heterogeneous
• Access to array elements is much slower than access to record fields– subscripts are dynamic – field names are static
Copyright © 2007 Addison-Wesley. All rights reserved. 1–27
Implementation of Record Type
Offset address relative to the beginning of the records is associated with each field
Copyright © 2007 Addison-Wesley. All rights reserved. 1–28
Union Types
• A type whose elements are allowed to store different types at different times during execution
• Fortran, C, and C++ provide free union – no language support for type checking
• Type checking requires extra element – Type indicator called a discriminant– Supported by Ada
Copyright © 2007 Addison-Wesley. All rights reserved. 1–29
Evaluation of Unions
• Potentially unsafe construct– Do not allow type checking
• Java and C# do not support unions– Reflective of growing concerns for safety in
programming language
Copyright © 2007 Addison-Wesley. All rights reserved. 1–30
Type Equivalence
• Consider the problem of two structured types:– Are two record types compatible if they are
structurally the same but use different field names?
– Are two array types compatible if they are the same except that the subscripts are different?(e.g. [1..10] and [0..9])
– Are two enumeration types compatible if their components are spelled differently?
Copyright © 2007 Addison-Wesley. All rights reserved. 1–31
Two approaches
• Name type compatibility : two variables have compatible types if they are in either the same declaration or in declarations that use the same type name– Easy to implement but highly restrictive:– Subranges of integer types are not compatible with integer types– Formal parameters must be the same type as their corresponding
actual parameters (Pascal)
• Structure type compatibility means that two variables have compatible types if their types have identical structures– More flexible, but harder to implement