Data Structures and Algorithms Course’s slides: Introduction, Basic data types www.mif.vu.lt/~algis

Dec 23, 2015

Page 1: Data Structures and Algorithms Course’s slides: Introduction, Basic data types algis.

Data Structures and Algorithms

Course’s slides: Introduction, Basic data types

www.mif.vu.lt/~algis

Page 2

Motto

Data Structures and Algorithms

The key to your professional reputation

A much more dramatic effect can be made on the performance of a program by changing to a better algorithm than by hacking or converting to assembler

Buy a text for long-term reference! Professional software engineers have algorithms text(s) on their shelves

Hackers have user manuals for the latest software package

Page 3

Content

1. Introduction, computing model, von Neumann principles, data, abstract data types, data structures, basic data types

2. Sorting, internal sorting, quicksort

3. Merge sort, von Neumann sorting, external sorting

4. Abstract data types, stack, queue, programming of stack and queue

5. Heap, priority queue, priority queue by heap structure, lists, list programming, dynamic sets ADT

6. Hierarchical structures, binary search trees, tree allocation in memory

7. AVL trees, 2-3-4 trees, red-black trees

Page 4

Content

8. B-trees and other similar trees, Huffman algorithm for data compression

9. Hashing idea, hashing functions and tables, hashing procedures and algorithms, extendable hashing

10. Radix search algorithms, radix trees, radix algorithms, radix search

11. Patricia trees, suffix tree

12. Text search

13. Analysis of algorithms

Page 5

Introduction

Informally, an algorithm is a well-defined computational procedure that takes a value (or set of values) as input and produces a value (or set of values) as output.

An algorithm is thus a sequence of computational steps that transform the input into the output.

An algorithm can also be viewed as a tool for solving a well-specified problem using computers.

There are many points of view on algorithms. A good example is the famous Euclid's algorithm: for two integers x, y, calculate the greatest common divisor gcd (x, y). A direct implementation of the algorithm looks like:

Page 6

Introduction

program euclid (input, output);
var x, y: integer;

{ greatest common divisor by repeated subtraction; assumes u > 0 and v > 0 }
function gcd (u, v: integer): integer;
var t: integer;
begin
  repeat
    if u < v then begin t := u; u := v; v := t end;  { keep u >= v }
    u := u - v;
  until u = 0;
  gcd := v
end;

begin
  while not eof do
  begin
    readln (x, y);
    if (x > 0) and (y > 0) then writeln (x, y, gcd (x, y))
  end;
end.
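
For readers who follow the later slides in C, the same subtraction-based procedure can be sketched as below. This is only a transliteration of the Pascal function gcd, not a tuned implementation; like the Pascal main program, it assumes both arguments are positive.

```c
/* Subtraction-based Euclid, mirroring the Pascal function gcd.
   Assumption: u > 0 and v > 0 (the Pascal main program checks this
   before calling gcd). */
int gcd(int u, int v)
{
    int t;
    do {
        if (u < v) {            /* keep u >= v by swapping */
            t = u;
            u = v;
            v = t;
        }
        u = u - v;              /* subtract the smaller from the larger */
    } while (u != 0);
    return v;
}
```

For example, gcd(12, 18) goes through the pairs (18, 12), (12, 6), (6, 6) and returns 6.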

Page 7

Introduction

This algorithm has some special features:

it is applicable only to numbers;

it has to be changed whenever something in the environment changes, for example when the numbers are very long and do not fit into the size of a variable (numbers like 1000!).

In the applications at the focus of this course, such as databases and information systems, algorithms are usually understood in a slightly different way. An algorithm:

is repeated many times;

runs in various circumstances;

works with different types of data.

Page 8

Introduction

404800479988610197196058631666872994808558901323↵ 829669944590997424504087073759918823627727188732↵

519779505950995276120874975462497043601418278094↵ 646496291056393887437886487337119181045825783647↵ 849977012476632889835955735432513185323958463075↵ 557409114262417474349347553428646576611667797396↵ 668820291207379143853719588249808126867838374559↵ 731746136085379534524221586593201928090878297308↵ 431392844403281231558611036976801357304216168747↵ 609675871348312025478589320767169132448426236131↵ 412508780208000261683151027341827977704784635868↵ 170164365024153691398281264810213092761244896359↵ 928705114964975419909342221566832572080821333186↵ 116811553615836546984046708975602900950537616475↵ 847728421889679646244945160765353408198901385442↵ 487984959953319101723355556602139450399736280750↵ 137837615307127761926849034352625200015888535147↵ 331611702103968175921510907788019393178114194545↵ 257223865541461062892187960223838971476088506276↵ 862967146674697562911234082439208160153780889893↵ 964518263243671616762179168909779911903754031274↵ 622289988005195444414282012187361745992642956581↵ 746628302955570299024324153181617210465832036786↵ 906117260158783520751516284225540265170483304226↵ 143974286933061690897968482590125458327168226458↵ 066526769958652682272807075781391858178889652208↵ 164348344825993266043367660176999612831860788386↵ 150279465955131156552036093988180612138558600301↵ 435694527224206344631797460594682573103790084024↵ 432438465657245014402821885252470935190620929023↵ 136493273497565513958720559654228749774011413346↵ 962715422845862377387538230483865688976461927383↵ 814900140767310446640259899490222221765904339901↵ 886018566526485061799702356193897017860040811889↵ 729918311021171229845901641921068884387121855646↵ 124960798722908519296819372388642614839657382291↵ 123125024186649353143970137428531926649875337218↵ 940694281434118520158014123344828015051399694290↵ 153483077644569099073152433278288269864602789864↵ 321139083506217095002597389863554277196742822248↵ 757586765752344220207573630569498825087968928162↵ 753848863396909959826280956121450994871701244516↵ 
461260379029309120889086942028510640182154399457↵ 156805941872748998094254742173582401063677404595↵ 741785160829230135358081840096996372524230560855↵ 903700624271243416909004153690105933983835777939↵ 410970027753472000000000000000000000000000000000↵ 000000000000000000000000000000000000000000000000↵ 000000000000000000000000000000000000000000000000↵ 000000000000000000000000000000000000000000000000↵ 000000000000000000000000000000000000000000000000↵ 000000000000000000000000
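
The digits above appear to be the tail of the decimal expansion of 1000!, the example of a number that fits no built-in variable. A minimal sketch of one way around the limit is to keep the number as an array of decimal digits and multiply it digit by digit. The names factorial_digits and MAX_DIGITS are illustrative and not part of the course material.

```c
#include <stddef.h>

#define MAX_DIGITS 3000   /* illustrative capacity; 1000! has 2568 digits */

/* Computes n! into digits[], least significant decimal digit first.
   Returns the number of digits, or 0 if the buffer is too small. */
size_t factorial_digits(unsigned n, unsigned char digits[], size_t capacity)
{
    size_t len = 1;
    if (capacity == 0)
        return 0;
    digits[0] = 1;                          /* start from 1 */
    for (unsigned k = 2; k <= n; k++) {
        unsigned carry = 0;
        for (size_t i = 0; i < len; i++) {  /* schoolbook multiply by k */
            unsigned v = digits[i] * k + carry;
            digits[i] = (unsigned char)(v % 10);
            carry = v / 10;
        }
        while (carry > 0) {                 /* append the remaining carry */
            if (len == capacity)
                return 0;
            digits[len++] = (unsigned char)(carry % 10);
            carry /= 10;
        }
    }
    return len;
}
```

Calling factorial_digits(1000, buf, MAX_DIGITS) fills buf with 2568 digits, the last 249 of which (the first 249 array entries) are zeros.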

Page 9

Von Neumann computing model

1943: ENIAC. Presper Eckert and John Mauchly -- first general electronic computer. Hard-wired program -- settings of dials and switches.

1944: Beginnings of EDVAC. Among other improvements, includes program stored in memory.

1945: John von Neumann wrote a report on the stored program concept, known as the First Draft of a Report on the EDVAC. It describes the "von Neumann machine" (or model):

a memory, containing instructions and data

a processing unit, for performing arithmetic and logical operations

a control unit, for interpreting instructions

Page 10

Von Neumann computing model

Page 11

Sub-Components

Clearly, requiring hardware changes with each new programming operation was time-consuming, error-prone, and costly.

Von Neumann's proposal was to store the program instructions right along with the data.

The stored program concept was proposed in 1945; to this day, it is the fundamental architecture that fuels computers.

Page 12

Von Neumann computing model

[Figure: von Neumann machine diagram with the labels control, computation, registers, RAM and PC, and the sample values 3, 5, 8]

Page 13

Memory Types: RAM

RAM is typically volatile memory (meaning it doesn't retain voltage settings once power is removed).

RAM is an array of cells, each with a unique address.

A cell is the minimum unit of access. Originally, this was 8 bits taken together as a byte. In today's computers, word-sized cells (16 bits, grouped in 4) are more typical.

RAM gets its name from its access performance. In RAM, theoretically, it takes the same amount of time to access any memory cell, regardless of its location within the memory bank ("random" access).

Page 14

The ALU

The third component in the von Neumann architecture is called the Arithmetic Logic Unit.

This is the subcomponent that performs the arithmetic and logic operations for which we have been building parts.

The ALU is the “brain” of the computer.

Page 15

The ALU

It houses the special memory locations, called registers, which we have already considered.

The ALU is important enough that we will come back to it later. For now, just realize that it contains the circuitry to perform addition, subtraction, multiplication and division, as well as logical comparisons (less than, equal to and greater than).

Page 16

Boolean data

Data values: {false, true}

In C/C++: false = 0, true = 1 (or nonzero)

Could store 1 value per bit, but usually use a byte (or word)

Operations: and &&, or ||, not !

&&  0  1
0   0  0
1   0  1

||  0  1
0   0  1
1   1  1

x  !x
0  1
1  0
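
The tables above can be reproduced directly with the C operators. The sketch below uses plain int values; the helper names truth_tables and as_bool are made up for illustration.

```c
/* Fills and_table[a][b] with (a && b) and or_table[a][b] with (a || b)
   for a, b in {0, 1}, reproducing the two operator tables. */
void truth_tables(int and_table[2][2], int or_table[2][2])
{
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            and_table[a][b] = a && b;
            or_table[a][b] = a || b;
        }
    }
}

/* Maps any C truth value to 0 or 1: zero is false, nonzero is true. */
int as_bool(int x)
{
    return !!x;
}
```

Note that as_bool(42) yields 1: any nonzero value counts as true, but the logical operators themselves always produce 0 or 1.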

Page 17

Character Data

Store numeric codes (ASCII, EBCDIC, Unicode): 1 byte for ASCII and EBCDIC, 2 bytes for Unicode as used in Java (see examples on p. 35).

Basic operation: comparison of chars to determine if ==, <, >, etc. uses their numeric codes (i.e. uses their ordinal values)
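
A small sketch of comparison by numeric code in C follows. The function name compare_chars is illustrative. The concrete ordering it produces depends on the character set in use; the examples in the note assume ASCII.

```c
/* Compares two characters by their numeric codes.
   Returns a negative value, zero, or a positive value when a is
   less than, equal to, or greater than b. */
int compare_chars(char a, char b)
{
    /* Convert to unsigned char so codes above 127 compare as positive. */
    return (int)(unsigned char)a - (int)(unsigned char)b;
}
```

Under ASCII, 'Z' has code 90 and 'a' has code 97, so compare_chars('Z', 'a') is negative even though "Z" comes after "a" alphabetically. Under EBCDIC the result would differ.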

Page 18

Integer Data

Nonnegative (unsigned) integer: type unsigned (and variations) in C++

Store its base-two representation in a fixed number w of bits (e.g., w = 16 or w = 32)

88 = 0000000001011000 (base 2)
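
The fixed-width base-two representation can be printed with a few shifts. This is a minimal sketch; to_binary is a hypothetical helper, and it assumes w does not exceed the width of unsigned long.

```c
/* Writes the w-bit base-two representation of value into out,
   most significant bit first. out must have room for w + 1 chars.
   Assumption: 1 <= w <= number of bits in unsigned long. */
void to_binary(unsigned long value, int w, char out[])
{
    for (int i = 0; i < w; i++) {
        unsigned long bit = (value >> (w - 1 - i)) & 1UL;
        out[i] = bit ? '1' : '0';
    }
    out[w] = '\0';
}
```

With w = 16, to_binary(88, 16, buf) leaves the string 0000000001011000 in buf, matching the example on the slide.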

Signed integer: type int (and variations) in C++

Store in a fixed number w of bits using one of the following representations:

Page 19

Sign-magnitude representation

Save one bit (usually most significant) for sign (0 = +, 1 = –)

Use base-two representation in the other bits.

88 → 0000000001011000 (sign bit 0)

–88 → 1000000001011000 (sign bit 1)

1. Cumbersome for arithmetic computations

2. Two 0's in this scheme: both 0 and -0

3. Incrementing the bit pattern of a negative number by one subtracts one from its value instead of adding one!
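
The drawbacks of sign-magnitude are easy to observe in code. The sketch below encodes and decodes 16-bit sign-magnitude patterns by hand; both function names are made up, and the encoder assumes the magnitude fits in 15 bits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Encodes n as a 16-bit sign-magnitude pattern.
   Assumption: -32767 <= n <= 32767. */
uint16_t sign_magnitude16(int n)
{
    uint16_t magnitude = (uint16_t)abs(n);
    uint16_t sign = (n < 0) ? 0x8000u : 0u;   /* most significant bit */
    return (uint16_t)(sign | magnitude);
}

/* Decodes a 16-bit sign-magnitude pattern back to an int. */
int from_sign_magnitude16(uint16_t bits)
{
    int magnitude = bits & 0x7FFF;
    return (bits & 0x8000u) ? -magnitude : magnitude;
}
```

Here 88 encodes as 0x0058 and –88 as 0x8058. The patterns 0x0000 and 0x8000 both decode to zero, and adding 1 to the pattern 0x8058 gives 0x8059, which decodes to –89 rather than –87.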

Page 20

Two's complement representation

For non-negative n: use ordinary base-two representation with leading (sign) bit 0. (Same as sign-magnitude.)

For negative n (–n): (1) Find the w-bit base-2 representation of n. (2) Complement each bit. (3) Add 1. (Same as subtracting the number from 0!)

Example: –88

1. 88 as a 16-bit base-two number: 0000000001011000

2. Complement this bit string: 1111111110100111

3. Add 1: 1111111110101000
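
The complement-and-add-one recipe can be checked on 16-bit unsigned values, where the arithmetic is well defined in C. The name twos_complement16 is illustrative only.

```c
#include <stdint.h>

/* Returns the 16-bit pattern that represents -n, given the pattern of n:
   complement each bit, then add 1 (wrapping around at 16 bits). */
uint16_t twos_complement16(uint16_t n)
{
    uint16_t complemented = (uint16_t)~n;     /* step 2: flip every bit */
    return (uint16_t)(complemented + 1u);     /* step 3: add 1 */
}
```

For n = 88 (0x0058) the complement is 0xFFA7 and the result is 0xFFA8, that is 1111111110101000. The same pattern results from computing 0 - 88 in 16-bit unsigned arithmetic, which is the "subtracting from 0" remark on the slide.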

Page 21

Good for arithmetic computation (see p. 38)

Bit addition and multiplication tables:

+  0  1
0  0  1
1  1  10

x  0  1
0  0  0
1  0  1

These work for both + and – integers.

5 + 7:

              111    <- carry bits
  0000000000000101
+ 0000000000000111
  0000000000001100

5 + –6:

  0000000000000101
+ 1111111111111010
  1111111111111111
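
To see that one bit-level addition rule serves positive and negative operands alike, the sketch below adds two 16-bit patterns one bit at a time with an explicit carry. It is a software illustration under the name ripple_add16 (hypothetical), not a description of any particular ALU.

```c
#include <stdint.h>

/* Adds two 16-bit patterns bit by bit, using only the one-bit
   addition table plus a carry. The carry out of bit 15 is discarded. */
uint16_t ripple_add16(uint16_t a, uint16_t b)
{
    uint16_t sum = 0;
    unsigned carry = 0;
    for (int i = 0; i < 16; i++) {
        unsigned x = (a >> i) & 1u;
        unsigned y = (b >> i) & 1u;
        unsigned s = x + y + carry;                 /* 0, 1, 2 or 3 */
        sum = (uint16_t)(sum | ((s & 1u) << i));    /* low bit is the sum bit */
        carry = s >> 1;                             /* high bit is the carry */
    }
    return sum;
}
```

ripple_add16(5, 7) gives 0x000C (12), and ripple_add16(5, 0xFFFA), where 0xFFFA is the two's complement pattern of –6, gives 0xFFFF, the pattern of –1.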

Page 22

Problems with Integer Representation

Limited capacity -- a finite number of bits

Overflow and underflow:

Overflow - addition or multiplication can exceed the largest value permitted by the storage scheme

Underflow - subtraction or multiplication can fall below the smallest value permitted by the storage scheme

Not a perfect representation of (mathematical) integers: can only store a finite (sub)range of them
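
Because signed overflow is undefined behaviour in C, a program that cares has to test before it adds. A minimal sketch of such a guard, using the limits from limits.h, is shown below; checked_add is a hypothetical helper name.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *result and returns true when the sum fits in an int.
   Returns false, leaving *result untouched, when the sum would exceed
   INT_MAX (overflow) or fall below INT_MIN (underflow). */
bool checked_add(int a, int b, int *result)
{
    if (b > 0 && a > INT_MAX - b)
        return false;               /* would exceed the largest value */
    if (b < 0 && a < INT_MIN - b)
        return false;               /* would fall below the smallest value */
    *result = a + b;
    return true;
}
```

For instance, checked_add(INT_MAX, 1, &r) returns false, while checked_add(INT_MAX, -1, &r) succeeds and stores INT_MAX - 1.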

Page 23

How is Real Data represented?

Types float and double (and variations) in C++

Single precision (IEEE Floating-Point Format)

1. Write binary representation in floating-point form: b1.b2b3... × 2^k, with each bi a bit and b1 = 1 (unless number is 0). Here b2b3... is the mantissa (or fractional part), 2 is the base and k is the exponent.

Example: 22.625 = 10110.101 (base 2) (see p. 41)

Floating point form: 1.0110101 (base 2) × 2^4

Stored exponent: 4 + 127

double: Exp: 11 bits, bias 1023; Mant: 52 bits

Round-off errors (p. 756)
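
The 22.625 example can be verified by pulling a float apart into its fields. The sketch assumes that float is the 32-bit IEEE 754 single-precision format (1 sign bit, 8 exponent bits with bias 127, 23 fraction bits), which holds on most current systems but is not guaranteed by the C standard. float_fields is an illustrative name.

```c
#include <stdint.h>
#include <string.h>

/* Splits a float into sign, biased exponent and fraction fields.
   Assumption: float is a 32-bit IEEE 754 single. */
void float_fields(float x, unsigned *sign, unsigned *exponent,
                  uint32_t *fraction)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);        /* copy the raw bit pattern */
    *sign = (unsigned)(bits >> 31);
    *exponent = (unsigned)((bits >> 23) & 0xFFu);
    *fraction = bits & 0x7FFFFFu;
}
```

For 22.625f the fields are sign 0, exponent 131 (that is 4 + 127) and fraction 0x350000, whose leading bits 0110101 are the digits after the binary point in 1.0110101.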

Page 24

Basic data types for C

C programming language:

char - smallest addressable unit of the machine that can contain the basic character set. It is an integer type; the actual type can be either signed or unsigned depending on the implementation.

signed char - same size as char, but guaranteed to be signed.

unsigned char - same size as char, but guaranteed to be unsigned.

short, short int, signed short, signed short int - short signed integer type, at least 16 bits in size.

Page 25

Basic data types for C

unsigned, unsigned int - same as int, but unsigned.

long, long int, signed long, signed long int - long signed integer type, at least 32 bits in size.

unsigned long, unsigned long int - same as long, but unsigned.

long long, long long int, signed long long, signed long long int - long long signed integer type, at least 64 bits in size.

Page 26

Basic data types for C

unsigned long long int - same as long long, but unsigned.

float - single precision floating-point type.

double - double precision floating-point type, actual properties unspecified (except minimum limits), however on most systems this is the IEEE 754 double-precision binary floating-point format

long double - extended precision floating-point type, actual properties unspecified.
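
The minimum sizes listed on these slides can be checked on a given compiler with sizeof and CHAR_BIT. The function below is only a sketch for experimenting; meets_minimum_widths is a made-up name, and it measures storage size, which may include padding bits on unusual platforms.

```c
#include <limits.h>

/* Returns 1 when this implementation satisfies the minimum sizes
   listed on the slides, 0 otherwise. */
int meets_minimum_widths(void)
{
    return sizeof(short) * CHAR_BIT >= 16
        && sizeof(long) * CHAR_BIT >= 32
        && sizeof(long long) * CHAR_BIT >= 64
        && sizeof(signed char) == sizeof(char)
        && sizeof(unsigned char) == sizeof(char);
}
```

On a conforming compiler the function returns 1; the actual sizes (for example 32 or 64 bits for long) still vary between platforms.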

Boolean type

Structures:

• struct birthday { char name[20]; int day; int month; int year; };
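
A short sketch of how such a structure might be filled in and used follows. The helper make_birthday and the sample values in the note are invented for illustration; only the struct definition comes from the slide.

```c
#include <string.h>

struct birthday {
    char name[20];
    int day;
    int month;
    int year;
};

/* Builds a birthday record. Names longer than 19 characters are
   truncated so that the terminating '\0' always fits. */
struct birthday make_birthday(const char *name, int day, int month, int year)
{
    struct birthday b;
    strncpy(b.name, name, sizeof b.name - 1);
    b.name[sizeof b.name - 1] = '\0';
    b.day = day;
    b.month = month;
    b.year = year;
    return b;
}
```

After struct birthday b = make_birthday("Jonas", 1, 2, 2000); the members are read with the dot operator, for example b.year.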

Page 27

Basic data types for C

Array - array of N elements of type T

• int cat[10]; // array of 10 elements, each of type int

• int bob[]; // array of an unspecified number of 'int' elements.

• int a[10][8]; // array of 10 elements, each of type 'array of 8 int elements'

• float f[][32]; // array of unspecified number of 'array of 32 float elements'

Pointer - char *square; long *circle;

Unions - union types are special structures which allow access to the same memory using different type descriptions
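
As an illustration of accessing the same memory through different type descriptions, the sketch below overlays a 32-bit integer with an array of four bytes. The union word_view and the function is_little_endian are hypothetical examples, and the code assumes 8-bit bytes.

```c
#include <stdint.h>

/* Hypothetical union: the same four bytes viewed as one 32-bit word
   or as four separate bytes. */
union word_view {
    uint32_t word;
    unsigned char bytes[4];
};

/* Returns 1 when the least significant byte of a word is stored at the
   lowest address (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    union word_view v;
    v.word = 1u;                 /* write through one member ... */
    return v.bytes[0] == 1;      /* ... read through the other */
}
```

The union occupies only as much memory as its largest member, here 4 bytes, and the order in which v.bytes shows the bytes of v.word depends on the machine.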