
OPERATING SYSTEMS AND COMPUTER NETWORKS

LECTURE NOTES

UNIVERSITY OF DUISBURG-ESSEN
FACULTY OF ENGINEERING
INSTITUTE OF COMPUTER ENGINEERING
PROF. DR.-ING. A. HUNGER


Table of Contents

1 INTRODUCTION TO OPERATING SYSTEMS
1.1 TASKS OF OPERATING SYSTEMS
1.2 TYPES OF OPERATING SYSTEMS

2 FILE MANAGEMENT
2.1 FILE SYSTEMS ON DISK (PHYSICAL STORAGE)
2.1.1 FILE MANAGEMENT ON DISKS AND FLOPPY DISKS
2.2 STRUCTURE OF A HARD DISK
2.3 STRUCTURE OF PORTABLE DATA DISCS
2.4 BLOCK SIZE AND MANAGEMENT OF FREE SPACES
2.4.1 OPTIMAL BLOCK SIZE
2.4.2 FREE SPACE MANAGEMENT
2.4.2.1 FAT Sizes: FAT12, FAT16 and FAT32
2.5 DIRECTORIES
2.5.1 FILE ORGANIZATION IN UNIX

3 MEMORY MANAGEMENT
3.1 MEMORY HIERARCHY
3.2 CACHE MEMORY
3.2.1 CACHE STRUCTURE
3.2.2 CACHE PERFORMANCE
3.2.3 AVERAGE ACCESS TIME
3.2.4 CACHE ORGANIZATION
3.3 MAIN MEMORY MANAGEMENT
3.3.1 MEMORY MANAGEMENT WITHOUT SWAPPING AND PAGING
3.3.1.1 Relocation
3.3.1.2 Protection
3.3.1.3 Partitions
3.3.1.3.1 Creating Partitions
3.3.1.3.2 Allocation Strategies
3.3.2 MEMORY MANAGEMENT WITH SWAPPING AND PAGING
3.3.3 REPLACEMENT STRATEGIES FOR PAGES
3.4 VIRTUAL MEMORY

4 PROCESS MANAGEMENT
4.1 PROCESS STATES AND PRINCIPLES OF HANDLING
4.1.1 PROCESS MANAGEMENT TOOLS
4.1.2 STRUCTURE OF MULTI-TASKING OPERATING SYSTEMS
4.1.2.1 The time-sharing concept (solution for problem 1: changing between processes)
4.1.2.2 Scheduling algorithms (solution for problem 2: increasing the efficiency of CPU use)
4.2 PROCESS CHANGE
4.3 SCHEDULING
4.3.1 SCHEDULER
4.3.2 SCHEDULING ALGORITHMS
4.3.2.1 Requirements for a Scheduling Algorithm
4.3.2.2 Classification of Scheduling Algorithms
4.3.3 ANALYSIS OF SCHEDULING ALGORITHMS
4.3.3.1 Gantt Diagram
4.3.3.2 Timing Diagram
4.3.3.3 Example of Planning Algorithms

5 INTRODUCTION TO COMPUTER NETWORKS
5.1 ISO/OSI REFERENCE MODEL
5.1.1 PHYSICAL LAYER
5.1.2 DATA LINK LAYER
5.1.3 NETWORK LAYER
5.1.4 TRANSPORT LAYER
5.1.5 SESSION LAYER
5.1.6 PRESENTATION LAYER
5.1.7 APPLICATION LAYER
5.2 INFORMATION CODING
5.2.1 FRAMES AND DATA TRANSMISSION
5.2.1.1 Bit-Oriented Transmission
5.2.1.2 Symbol (Byte)-Oriented Transmission
5.2.1.3 Packet-Oriented Transmission
5.3 CODE EFFICIENCY
5.3.1 STATIC CODE EFFICIENCY
5.3.2 DYNAMIC CODE EFFICIENCY
5.4 TRANSMISSION PROTOCOLS
5.4.1 CARRIER SENSE MULTIPLE ACCESS (CSMA) PROTOCOLS
5.4.2 CARRIER SENSE MULTIPLE ACCESS WITH COLLISION DETECTION
5.4.3 STOP-AND-WAIT TRANSMISSION PROTOCOL
5.4.4 GO-BACK-N TRANSMISSION PROTOCOL
5.4.5 SELECTIVE REPEAT TRANSMISSION PROTOCOL

6 NETWORKS
6.1 PHYSICAL NETWORKS
6.2 NETWORK TOPOLOGIES
6.2.1 LINE/BUS
6.2.2 RING
6.2.3 STAR
6.2.4 MESH
6.2.5 OTHER TOPOLOGIES
6.2.5.1 Multi-drop
6.2.5.2 Head-end
6.3 TECHNICAL REALIZATION OF NETWORK TOPOLOGIES
6.3.1 BUS TOPOLOGY
6.3.1.1 Thin Ethernet (10Base2)
6.3.1.2 10Base-T Ethernet (Twisted-Pair Ethernet)
6.3.2 RING
6.3.2.1 Token Ring
6.3.2.2 FDDI
6.3.3 STAR: ATM

7 COMPUTER NETWORKS

8 ROUTING PROTOCOL - DIJKSTRA'S ALGORITHM


Part I - OPERATING SYSTEMS

1 INTRODUCTION TO OPERATING SYSTEMS

A modern computer consists of one or more processors, main memory, data-storage devices and I/O peripherals. Compared to the computers of the last century, these devices are far more complex and challenging to understand. Users must be able to operate a computer without knowledge of the mechanisms within. To accomplish this, almost every computer carries a layer of software called the operating system, whose job is to provide user programs with a simpler interface and to manage all the resources of the underlying system. Basically, an operating system is a program that serves as the mediator between computer users and the computer hardware (ref. Fig. 1-1).

Fig. 1-1 Components of Computer System

According to figure 1-1 a computer system is composed of 4 components:

Hardware

Operating systems

Service programs

Application programs

The obvious advantage of an operating system for software developers is that they do not need to write their programs for a specific type of hardware. It is only necessary to be compatible with the OS, which takes care of the necessary hardware management.


To understand the distinct layers of a computer system and its software hierarchy, Tanenbaum's layered model illustrates the structure needed to execute programs on the hardware (ref. Fig. 1-2, 1-3).

Fig. 1-2 Computer as a Multilevel Machine


Fig. 1-3 Tanenbaum's layered principle

Short description of the particular levels in Tanenbaum's principle:

-Level 0: Combination of gates, arithmetic circuits, memory flip-flops and latches, microprocessors, chips and similar basic elements. The logic at this level is described with Boolean algebra.

-Level 1: True machine language level - numeric language. Programs at this level consist of simple arithmetic operations and logical combinations. Execution is usually carried out by an Arithmetic Logic Unit (ALU).

-Level 2: The Instruction Set Architecture (ISA), also known as the machine language, is the target of compilers for high-level languages. Commands are interpreted by microprograms or executed directly by hardware.

-Level 3: The Operating System Machine 'sits' on top of the ISA and provides additional sets of instructions. New mechanisms of memory organization can be realized, and the possibility of parallel program execution is usually implemented (in modern OS). This level is the lowest level within the hierarchy of an operating system.

-Level 4: The Assembler Level translates assembly language to machine language, e.g. symbolic instruction names to numerical instruction codes, register names to numbers and symbolic variable names to numerical memory locations.

-Level 5: The Problem-Oriented Language Level usually consists of application programming languages (C, C++, Java, LISP, to name a few). The compiler translates commands into L3 and L4 languages and serves as an interpreter for specialized application areas. Software at this level deals with e.g. arrays, complex arithmetic operations and Boolean expressions.

Main purposes of an operating system:

To provide an environment for a computer user to execute programs on computer hardware in a convenient and efficient manner.

To allocate the separate resources of the computer as needed to solve the problem given. The allocation process should be as fair and efficient as possible.

As a control program it serves two major functions: (1) supervision of user programs to prevent errors and improper use of the computer, and (2) management of the operation and control of I/O devices.

Figure 1-4 shows an abstract model of an entire computer system. The user usually operates the system through the Graphical User Interface (GUI), but direct control via a command console (e.g. the Windows cmd interface) is also possible in most cases. The gray fields describe the components of an operating system.


1.1 TASKS OF OPERATING SYSTEMS

Fig. 1-4 Abstract Model of a Computer System


Operating systems are responsible for four main tasks:

1. Process management: After switching on the computer, the operating system loads with its GUI, ready to accept user input. By this point a whole series of processes has already run: the ROM-resident boot program triggered loading the OS from a storage device. A program call from the user or the OS itself causes the operating system to load the called program from disk and run it. Possibly, during its execution, the program calls another one, which in turn has to be loaded and started by the operating system.

If parallel or concurrent program execution is supported, the operating system has to assign the programs to be executed to the CPU according to a specific strategy. Parallel means that two or more processors are actually working at the same time. Concurrent means that several programs are logically in progress at the same time, even if a single CPU only interleaves them.

2. Memory management: If there are several programs in memory, each must be assigned its own space; each program must receive a stack and a heap area. With the "virtual memory" concept, where not all parts of a program are located in memory, the operating system must organize the reloading of the missing parts from the hard disk. Additionally, access to sensitive data and programs by other processes must be prevented.

3. File/data management: This describes the management of mass storage in the form of files and directories (keyword: file tree). Usually the data are addressed by a name, an ASCII string. The task of the operating system is to map this logical name to a physical address.

4. Resource management: Devices should be addressable by names, the so-called logical addresses. These include disks, USB memory sticks, printers, etc. In the days of the old MS-DOS OS, devices had fixed names, so that commands like "copy Brief.TXT prn" resulted in output on the printer. If several programs are in progress and ask for devices, the operating system must assign the devices accordingly.


1.2 TYPES OF OPERATING SYSTEMS

Operating systems can be found in different types of computer systems such as:

Console operations:

o Console operation: the user performs the following steps:

Load the program (entry via switches, paper tape, voice command)

Input the starting address and the start command via switches

Track the program's course via indicator lights

o Further development:

Device drivers from library

Compiler, Assembler (compile, assemble, linking with library routines generates the object program)

Batch systems (batch processing): A batch process describes a sequence of jobs a system computes with a given set of data, in which the user has no option to intervene. In modern computer systems a batch is used for non-interactive data handling. Advantage: faster error handling: dump the job and proceed to the next one. Jobs with similar needs are packed together (for example, all jobs that require the FORTRAN compiler). Only at the end of a job, or in case of a fault, does the operator intervene.

Overlapping CPU and I/O operation: In order to accelerate input/output operations, data from a slower storage medium are copied to faster storage units. For example, a CD can be copied to an SSD in order to enable faster reading of the data by the CPU. While the data are being copied, the CPU can already access parts of them.

SPOOL (Simultaneous Peripheral Operation On-Line): A further improvement allowed storing jobs from an interface in a buffer before processing them. A print job, for example, can be stored on a hard disk if the printer is busy. The CPU does not need to wait for the printer to finish and can go on working on other jobs. A minimal sketch of this buffering idea follows Fig. 1-5.

Fig. 1-5 SPOOL-ing System
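The buffering idea can be sketched in a few lines. The following is an illustrative sketch, not taken from the lecture: a queue decouples the CPU, which only enqueues jobs, from the printer, which drains the buffer at its own pace; the job names are invented for illustration.

```python
from queue import Queue

# Toy spooler: the CPU hands a job to the buffer and continues immediately;
# the printer loop fetches jobs later, at its own pace.
spool = Queue()

def submit(job):
    spool.put(job)              # store the job in the buffer, don't wait

def printer_run():
    while not spool.empty():
        job = spool.get()       # the printer drains the buffer
        print("printing:", job)

submit("Brief.TXT")             # the CPU is free again right after this call
submit("report.txt")
printer_run()
```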


Multi-user systems: In contrast to a single-user system, a multi-user system allows processing time and other system resources to be shared between multiple users. Access can be established via a network or a shared workstation.

Multitasking / time-sharing systems: A multitasking system allows users to interact with running programs. The CPU works on several jobs at the same time, providing each job a specific amount of CPU time. While another job is processed, the user is able to interact with the program. The CPU can also switch to another job as soon as the current one waits for data input. This gives the user a shorter response time. The operating system now has two additional tasks (a small simulation of the round-robin variant follows Fig. 1-6):

o Selection of jobs from the pool for transfer into memory

o Selection of which job gets how much CPU time (dependent on the method)

Key words: job scheduling, CPU scheduling

Fig. 1-6 Multitasking with the 'Round Robin' method
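The round-robin method of Fig. 1-6 can be illustrated with a short simulation. This is a minimal sketch: the job names, their CPU demands and the quantum of 2 time units are invented for illustration.

```python
from collections import deque

def round_robin(jobs, quantum):
    """Simulate round-robin CPU sharing.

    jobs: dict mapping job name -> CPU time still needed.
    quantum: fixed time slice a job receives per turn.
    Returns the order in which (job, slice) pairs run.
    """
    queue = deque(jobs.items())          # ready queue of (name, remaining)
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)    # a job runs at most one quantum
        schedule.append((name, run))
        remaining -= run
        if remaining > 0:                # unfinished jobs rejoin the queue
            queue.append((name, remaining))
    return schedule

print(round_robin({"A": 5, "B": 2, "C": 4}, quantum=2))
# [('A', 2), ('B', 2), ('C', 2), ('A', 2), ('C', 2), ('A', 1)]
```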

Parallel systems / multi-processor: In principle, we distinguish between "tightly coupled" and "loosely coupled" systems. In the first case, two or more processors share the same bus, the same clock and the same memory and peripheral devices (e.g. the Core i-series CPUs from Intel). Loosely coupled indicates, e.g., computers in distributed locations connected by one or more communication systems => distributed systems. The goal of both kinds of systems is improved throughput, increased reliability and possibly a more economical system compared to one bigger processing unit.


Distributed systems represent loosely coupled systems: each processor is autonomous with its own memory. The processors communicate with each other over different links (buses, telephone lines, networks).

The execution has three forms: (1) All processes share one processor, (2) each process has its own processor and all of them share the memory, (3) each process has its own processor and the processors are distributed (computer networks, LAN, WAN).

Real-time: A real-time system can be distinguished between "hard" and "soft" real-time conditions. Hard real-time conditions demand an action at a given time; a soft real-time system has a better tolerance for missing a deadline. Example: a control unit for a power electronics system demands a control signal every 200 µs. If the signal does not arrive in time, the control sequence is faulty, which may lead to permanent damage to the system. The control unit must fulfil the criterion: reaction time + operation time < maximum delay (here: 200 µs), which makes it a hard real-time system. Some properties of non-real-time systems are usually missing, e.g. the virtual memory concept, since the unpredictable delay that may occur while reloading data from a HDD is a problem for programmers.

Embedded systems: An embedded system refers to a microprocessor or microcontroller for process control that is embedded in a technical environment, e.g. the engine management system in an automobile. Sensors and actuators connect to the microelectronic core, which may be a standard component configured by a special program for its task. The processor must, under some circumstances, accept information simultaneously from various sensors, do calculations and output information to the actuators. Here, concepts for programming concurrent processes have to be considered.


2 FILE MANAGEMENT

A 'file', as used in computer science, describes a combined set of data, defined by the user or the corresponding software, to be stored on a mass storage device, e.g. hard drives, floppy disks or CDs. Depending on the type of file system, these data sets are compiled and stored as a combination of binary expressions. These files must be managed and archived for as long as the content may be needed again by the OS or user. The file manager is the component of an operating system that manages the entire space on a mass storage device. The tasks the manager has to fulfill include:

1. Locating the files requested by users.

All operating systems aim to achieve device modularity, so that data access is independent of the type of device on which the files are stored. The file manager must be able not only to find a file, but to recognize files on any compatible storage system and make them available to the OS or user (e.g. USB sticks).

2. Space allocation for newly created files.

When data is written to a storage system, it is sometimes only necessary to capture the position of the last track (e.g. an unfinished CD). But on a hard drive, files can be deleted and the occupied space becomes available again. A file manager has to be able to recognize free areas and assign them to new files if needed. That may lead to a set of data being spread over several areas of the storage system.

3. Overview of the files and associated memory: Aside from 'knowing' where files are located and how much space they occupy, a file manager must also know, when a file is spread over several areas, where the components of that specific file are located.

A file system is different from a database management system. Database systems are program systems for structuring large amounts of data, with a set of commands for inputting, requesting and modifying the content.


2.1 FILE SYSTEMS ON DISK (PHYSICAL STORAGE)

2.1.1 FILE MANAGEMENT ON DISKS AND FLOPPY DISKS

A modern computer system unifies several types of data storage. The working memory (RAM) is a small, fast-access memory, which usually cannot save data permanently but allows access to specific words within a very short time. This makes the random access memory a perfect complement to a CPU but unusable for long-term data storage. The still most common system for saving large amounts of data is the magnetic-disk hard drive (HDD). A HDD consists of at least one magnetic disk and two heads to read and write data. Due to the physical design of disks and floppy disks, the data storage device is organized in:

concentric tracks, which are divided into sectors. The sector size varies between 32 and 4096 bytes. The sector is the smallest addressable unit; a combination of several sectors is called a cluster.

front and back sides, possibly with several platters. In the case of a stack of platters, two heads are assigned to each platter. The set of vertically aligned tracks across all platters is called a cylinder, see Fig. 2-1.

Fig. 2-1 Construction of a disk storage layout

To access a given set of data, the head has to move to the right track and the disk needs to spin to the right position, so the rotation speed of a disk is a major variable in determining the latency of data access. If a file is scattered across one or more disks, data access takes considerably more time than for a defragmented file, which has to be considered when writing a file to the disk. The process of assigning disk space to a file is referred to as allocation.

To achieve optimal utilization of a given storage medium, it is necessary to develop an effective strategy, and to determine the working parameters of such a strategy, all information about the storage device is needed. For example the magnetic tape, rarely used today, permits a purely sequential organization, because it is not possible to 'jump' to a certain position on the medium. Magnetic disks, on the other hand, allow direct and indexed-sequential organization on one or more platters. When stored data is modified, deleted, or new content is inserted, it is not desirable to rewrite the entire file on the storage device. To avoid this, a file is divided into blocks separated by gaps. Each block comprises one or more sectors. The blocks may be distributed over the disk and represent the logical layout of a file; the physical counterparts are sectors and clusters. Often there is no distinction between block and cluster. Since new data can (or at least should) only be inserted into free memory space, it is necessary to keep track of unused blocks. This is usually done with a free-memory list in which information about free clusters/blocks is stored. The block and/or cluster size can usually be defined by the operating system or user once the storage device is added to the system. This choice affects the data transfer rate as well as the utilization of disk space. Here, the common allocation strategies are presented:

1. Coherent (continuous) storage (also called sequential storage): Sequential organization means that the data are stored successively on the disk and are found by searching in this order. Inserting, modifying and deleting data can only be done by rewriting the entire file, if the file is not organized in blocks.

All blocks of a file are located one after another.

Advantages: small directory size, because only the address of the first block/cluster is saved with the file name. The file can be read without interruption. Disadvantages: the complete size of the file must be known from the beginning, and fragmentation of the disk space will occur when files are repeatedly deleted and created. This organization was typical for magnetic tape storage, which is outdated today. Modern mass storage devices are able to jump to a specific position on the medium, which allows more effective allocation strategies.

2. Allocation and access via linked blocks (sectors)

This allocation method organizes files in several blocks that can be placed anywhere on the disk. Each block of the file contains the information where the next part of the file is located. The memory directory contains the information where a file starts and, optionally, where it ends. The last block of a file contains a NIL pointer to mark the end of the file. Advantages: no external fragmentation of the disk space when files are repeatedly deleted and expanded. Disadvantages: slow access to a specific block; a file has to be searched sequentially, pointer by pointer, to find the correct block. The pointer has to be stored in every block of a file, which causes a loss of space: if a pointer needs 4 bytes at a block size of 512 bytes, the effective data volume is reduced by 0.78%. To counter this loss it is possible to create clusters of multiples of 512 bytes. This results in fewer pointers to the next block/cluster but increases the ever-present internal fragmentation.

Fig. 2-2 Allocation and Access via Linked Blocks

3. Allocation and access via a pointer list (ref. Fig. 2-3): This method keeps a list of pointers in main memory, which stores for each memory block of a file the information where the file continues. The list is written back to the HDD when the system shuts down.

Advantage: faster access via the list in main memory.

Disadvantage: possibly a huge table in main memory; an error in the table can cause serious problems.

Page 18: OPERATING SYSTEMS AND COMPUTER NETWORKS OSCN... · operating systems and computer networks lecture notes operating systems and computer networks lecture notes u n i v e r s i t y

18

File A resides in sectors 2, 7, 5, 1023 and 3.

Fig. 2-3 Allocation and Access via list of pointers

4. Indexed allocation (combination of 2 and 3): The indexed allocation method combines the attributes of both previously described methods. The first block of a file contains the pointers to all parts of the specific file; that is, all addresses of the physical locations of its blocks/clusters are stored in this index block (ref. Fig. 2-4).

Fig. 2-4 Indexed Allocation

The pointer table for File A of Fig. 2-3 (entry -> next block):

block 2    -> 7
block 7    -> 5
block 5    -> 1023
block 1023 -> 3
block 3    -> EOF
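Following such a pointer chain is straightforward. In the sketch below, a Python dict stands in for the in-memory pointer table of Fig. 2-3; the EOF marker is an invented stand-in for the real end-of-file value.

```python
EOF = -1
# Pointer table from Fig. 2-3: table[i] holds the block that follows block i.
table = {2: 7, 7: 5, 5: 1023, 1023: 3, 3: EOF}

def blocks_of(first_block, table):
    """Collect the block numbers of a file by following the pointer list."""
    chain = []
    block = first_block
    while block != EOF:
        chain.append(block)
        block = table[block]
    return chain

print(blocks_of(2, table))   # [2, 7, 5, 1023, 3] -- File A
```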


When a file is created, all pointers in the index block are assigned the value NIL (not in list). When a block is written for the first time, the file management system removes a free block from the free-memory list and writes its address into the index block. Advantages: no external fragmentation; faster access than with linked blocks. Disadvantages: space is wasted for the pointers in the index block, and the block size limits the size of the index block.

Usually one disk block is reserved for each index block. If this is not sufficient, the last entry of an index block can be assigned to point to another index block with more file blocks. If the first blocks of a file point to blocks containing further pointers to data blocks, the method is called 'multi-level indexing' (see below).

5. Multi-level indexing: The first entries of a file's index contain pointers to other pointer blocks. This method allows the storage of very large files distributed over different segments of a disk, but becomes slower the bigger the file gets. Apart from this, the advantages and disadvantages are similar to indexed allocation.

Fig. 2-5 Two-Stage Block Access (Multi-Level Indexing)


In UNIX a combined method is used to prevent internal fragmentation. Small files are organized with an index table for direct access: if the size of a file is equal to or smaller than the cluster size, it is linked to a direct addressing block. For bigger files the single, double, triple (and so on) indirect index blocks are used (ref. Fig. 2-6). This method scales with the file size, in contrast to a fixed-size method; a small capacity estimate follows Fig. 2-6.

Fig. 2-6 UNIX I-Node
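The reach of this scheme can be estimated with a short calculation. The sketch below assumes the classic layout of 12 direct pointers plus one single, one double and one triple indirect pointer, with 1 KiB blocks and 4-byte pointers; these parameters are illustrative assumptions, not values fixed by the lecture.

```python
def max_file_size(block_size=1024, ptr_size=4, direct=12):
    """Largest file addressable by a classic UNIX-style I-node.

    Assumed layout: `direct` direct pointers plus one single, one double
    and one triple indirect pointer.
    """
    ptrs = block_size // ptr_size           # pointers per indirect block
    blocks = direct + ptrs + ptrs**2 + ptrs**3
    return blocks * block_size              # bytes

print(max_file_size() / 2**30, "GiB")       # ~16 GiB with 1 KiB blocks
```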


2.2 STRUCTURE OF A HARD DISK

The effectiveness of a file management system depends to a certain degree on the quality and size of the data storage system. Since HDDs are still the most common file storage for computer systems, it is necessary to get a basic understanding of the medium. The following figures depict the components of a disk drive and the structure of a hard disk.

Fig. 2-7 Components of a disk

The platters of a hard disk are constructed either of metal or of plastic, coated with magnetizable material. Recording and retrieving data is done by a conducting coil fixed on a moving arm, called the 'head'. During a read/write operation the head stays stationary while the disk rotates beneath it.

Write mechanism: During a write operation an electric current flows through the coil within a head and generates a magnetic field. This field magnetizes the target area of the disk (usually 1 bit) depending on the polarity of the field, resulting in a logical '0' or '1' as written data. The current used for the magnetic field is a pulsed current, meaning that it can change its direction at a very high frequency, allowing a recording speed that is usually limited only by the rotational speed of the disks.

Read mechanism: When data is read from the disk, it spins beneath the head and induces a current in the coil depending on the polarization of the field of a particular bit. This current results in a corresponding voltage at the control unit of the head, which interprets it as either '1' or '0'.


Figure 2-8 shows a simplified layout of a magnetic disk. The disk is divided into several tracks, which contain a fixed number of sectors and are separated from other tracks by small gaps. This prevents, or at least minimizes, errors due to misalignment of the head or magnetic interference between tracks. The width of a track matches the width of the assigned head, to ensure that no undesired magnetization of adjacent tracks happens.

Fig. 2-8 Structure of a hard disk

The head reads/writes one track in a bit-serial manner, i.e. bit by bit. To read/write the next track, the arm has to re-position to that track. In reality it is unlikely that data packages will fill a whole track, so sectors or clusters are used to store data. One cluster can consist of one or more sectors, depending on the formatting of the HDD. A typical sector size is 512 bytes, the standard for most HDDs in use. CDs and DVDs use 2048 bytes as sector size, while newer HDDs with the Advanced Format attribute may even have a sector size of 4096 bytes (4 KiB).

To determine the capacity (volume) BV of a storage medium, the following term is used:

BV = NOH * NOT * NOS * NBS

NOH: number of heads
NOT: number of tracks
NOS: number of sectors (per track)
NBS: number of bytes per sector (by default 512 bytes/sector)
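As a quick check of the formula, the sketch below computes BV for a purely hypothetical geometry; all four parameter values are assumptions chosen to give round numbers.

```python
def disk_capacity(noh, not_, nos, nbs=512):
    """BV = NOH * NOT * NOS * NBS, in bytes."""
    return noh * not_ * nos * nbs

# Hypothetical geometry: 16 heads, 16,384 tracks, 512 sectors/track, 512 B/sector
bv = disk_capacity(16, 16384, 512)
print(bv / 2**30, "GiB")   # 64.0 GiB
```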


2.3 STRUCTURE OF PORTABLE DATA DISCS

The following chapter describes the structure and organization of a portable disc. While the floppy disk was the most common portable data storage system from the 80s to the early 90s, it has since been superseded by CDs and DVDs as data media. There are still some areas where floppy disks are used even today, mostly because the system still works and an update would be unnecessary and expensive. In order to understand the evolution of disc-shaped data devices, the 1.44 MB floppy disk will be explained and compared to optical data storage systems.

Fig. 2-9 Allocation of capacity on a disk

Figure 2-9 shows the schematic of a 1.44 MB disk and its similarities to a HDD. But while a HDD can use every sector on a platter, the floppy disk cannot use its innermost tracks due to the different data density. This means that parts of the physical space of the disk cannot be used for data storage. This problem was solved with optical data discs by a spiral alignment of sectors of fixed size (ref. Fig. 2-10), resulting in a much higher data density.


Fig. 2-10 Sector alignment on a CD-ROM

Unlike most CDs, floppy disks and hard disks can be written on both sides. The sector identification numbers were assigned track by track, meaning that the sectors of one side were numbered first and numbering then continued on the other side. When all tracks on the back side were numbered, the front side followed (ref. Fig. 2-11).

Fig. 2-11 Numbering of the sectors on a two-sided disk


2.4 BLOCK SIZE AND MANAGEMENT OF FREE SPACES

2.4.1 OPTIMAL BLOCK SIZE

To determine the optimal block size for a data storage system, it is necessary to understand the characteristics of each choice. If the disk is organized in small blocks, the disk space can be utilized better than with large blocks (-> fragmentation). The disadvantage is the increased cost of managing the huge number of blocks (ref. chap. 2.1.1). For example, a HDD with 16 GiB disk space and a block size of 512 bytes has about 33.55 million blocks to manage, and thus addresses of 25 bits length are needed. In addition, small blocks may result in long loading times if a big file is distributed over many different blocks, depending on the data rate of the HDD. Large blocks, on the other hand, may result in poor disk exploitation due to internal fragmentation. The read time of the HDD is the other variable that needs to be considered in order to determine the optimal block size. Figure 2-12 shows an exemplary platter with the associated rotation and positioning times.

Fig. 2-12 Disk layout, head positioning and rotation time

The following formula shows how to determine the required time to read a block:

Read time = (Block size / Track capacity) * Rotation time

with, in this example:

16.67 ms/rotation <-> 3600 rotations/min
32,768 bits/track <-> 64 sectors of 512 bits each

Figure 2-13 shows the relationship between data rate, block size and disk exploitation. The data rate and disk usage are plotted as functions of the block size, for block/cluster sizes of 128 bytes to 8 KiB and an assumed file size of 1 KiB.


Fig. 2-13 Disk Usage and Data Rate

The solid curve (left-hand scale) shows the data rate of the disk. The dashed curve (right-hand scale) shows the efficiency of disk usage.

To calculate the data rate, the following formula was used:

R = block size / (time to position the head + 1/2 rotation time + transfer time for the block)

For example, with a block size of k bits and a track length of 32,768 bits, the access time is

[30 + 8.3 + (k / 32768) * 16.67] ms

4 Kbit: 38.3 ms + (4K/32K) * 16.67 ms = 40.38 ms -> approx. 101 kbit/s
2 Kbit: 39.34 ms -> approx. 52 kbit/s
1 Kbit: 38.82 ms -> approx. 26 kbit/s

To determine the percentage of disk exploitation, the following formula was used:

E_disk = 1, if block size < file size
E_disk = file size / block size, if block size >= file size

Note that the first case is only exact for files whose size is a multiple of the block size. The terms external and internal fragmentation are related to the allocation and the selected block size. External fragmentation occurs when stored data is released and the related disk space is rewritten multiple times, leaving small unusable gaps. Internal fragmentation occurs when the typical file size is smaller than the chosen block size, so part of each block stays empty. Both effects are undesirable.
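Both formulas are easy to reproduce in code. The sketch below uses the example values from above (30 ms positioning time, 8.3 ms for half a rotation, 32,768-bit tracks, a 1 KiB file); it illustrates the two curves of Fig. 2-13, it is not a general disk model.

```python
def data_rate_kbit(k_bits, pos_ms=30.0, half_rot_ms=8.3,
                   rot_ms=16.67, track_bits=32768):
    """R(k) = block size / (positioning + half rotation + transfer time)."""
    access_ms = pos_ms + half_rot_ms + (k_bits / track_bits) * rot_ms
    return k_bits / access_ms            # bit/ms is numerically kbit/s

def exploitation(block_bytes, file_bytes=1024):
    """E_disk: the share of each block that actually holds file data."""
    return 1.0 if block_bytes < file_bytes else file_bytes / block_bytes

for k_bits in (1024, 2048, 4096):        # block sizes in bits
    print(k_bits, "bit:", round(data_rate_kbit(k_bits), 2), "kbit/s")
for b in (512, 1024, 4096, 8192):        # block sizes in bytes
    print(b, "B:", exploitation(b))      # drops to 1/8 at 8 KiB blocks
```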


2.4.2 FREE SPACE MANAGEMENT

Since a HDD is divided into a vast number of separate blocks, it is necessary to keep track of the used and unused disk space. The memory mapping needs to be efficient and accurate whenever a read/write operation is ordered by the CPU. The size of the blocks determines the complexity and space requirement of the management method and thus its overall effectiveness. Two methods of free-space management are presented below:

1. Bitmap.

Fig. 2-14 Bit-Map

A bitmap provides a simple way of tracking clusters in a fixed amount of memory. The size of the bitmap depends on the size of the mapped memory and how it is allocated. Every cluster is mapped according to its position and represented by one bit. The bit is set to 0 if the space is free and to 1 if it is occupied. The smaller the cluster size, the larger the bitmap gets, because more clusters need to be addressed and represented in the bitmap.

Advantage: a fast and effective method to find an unoccupied memory block or n consecutive blocks, because there is no need to access the HDD for this information.

Disadvantage: to be effective, the bitmap has to be stored in the main memory of the system. The space (in bits) it occupies corresponds to the number of clusters of the HDD, plus an offset of system-relevant bytes.

Example: A HDD of 128 GiB formatted with NTFS has a cluster size of 4 KiB. The size of the pure bitmap is calculated with the following formula, where k is the number of bits that represent one cluster in the bitmap (here: 1):

S_bitmap = (S_HDD / S_cluster) * k bit = (128 * 2^30 / (4 * 2^10)) * 1 bit = 2^25 bit = 2^22 byte = 4 MiB
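Searching such a bitmap for free space is simple. Below is a minimal sketch: a Python list of 0/1 flags stands in for the bitmap of Fig. 2-14, and the function returns the start of the first run of n free clusters.

```python
def find_free_run(bitmap, n):
    """Index of the first run of n free (0) bits, or -1 if none exists."""
    run_start, run_len = 0, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i        # a new run of free clusters begins
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0              # an occupied cluster breaks the run
    return -1

bitmap = [1, 1, 0, 1, 0, 0, 0, 1]
print(find_free_run(bitmap, 2))      # 4 -> clusters 4 and 5 are free
```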


2. Linked list of free blocks.

Fig. 2-15 Linked list

This method creates a linked list of disk blocks, with each block holding as many addresses of free disk blocks as possible. The space needed for the linked list can be calculated with the following formula:

S_table = (number of bytes per pointer * number of free blocks) * (1 + 1 / cluster size)

Example: the same HDD as in the bitmap example, with all blocks considered free:

Number of free blocks = S_HDD / S_cluster = 32 * 2^20 = 33,554,432

Number of bytes per pointer: 2^25 addresses -> 25 bits -> 4 bytes

S_table = (4 byte * 33,554,432) * (1 + 1/4096) ≈ 128 MiB

The linked list is much bigger than the bitmap but does not need to be stored in main memory, since it only lists free areas of the disk, making a search routine for free memory blocks unnecessary. A search operation for a bulk of free clusters is only carried out if the free clusters need to be adjacent to each other. Like the linked-block index system, this method has the same structure and the corresponding disadvantages: the last entry of a block contains the pointer to the next block with addresses of free clusters.

In the days of MS-DOS this method was developed into a more effective one, still used today, called the File Allocation Table (FAT). Files are stored via the linked-list method (ref. Fig. 2-16), and a table in main memory keeps track of where a file begins and which blocks of the HDD are unused.


Fig. 2-16 Organization of files in MS-DOS (FAT)

Figure 2-17 shows the interaction between the directory table and the table in memory. To find a free block, the FAT is searched from the beginning until a cluster appears that matches the requirements; a sketch of this search follows Fig. 2-17.

Fig. 2-17 Linked List on MS-DOS (FAT)
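That search can be sketched as follows. The FAT is modeled as a plain list; the markers FREE and EOF are invented stand-ins for the reserved values a real FAT uses.

```python
FREE, EOF = 0, -1

def allocate_cluster(fat):
    """Scan the FAT from the beginning and claim the first free cluster.

    The claimed cluster is marked EOF (new last cluster of a file); the
    caller would link the file's previous last cluster to it.
    """
    for cluster, entry in enumerate(fat):
        if entry == FREE:
            fat[cluster] = EOF
            return cluster
    return -1                        # disk full

fat = [EOF, 3, FREE, EOF, FREE]      # toy FAT with 5 clusters
print(allocate_cluster(fat))         # 2 (the first free entry)
```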


2.4.2.1 FAT Sizes: FAT12, FAT16 and FAT32

The file allocation table, or FAT, stores information about the clusters on the disk in a table. There are three varieties of this table, which differ in their maximum size. The partitioning tool of the system will normally choose the optimal type of FAT for the volume available; however, the FAT type can sometimes be chosen manually.

Since each cluster has one entry in the FAT, and these entries hold the cluster number of the next cluster used by a file, the size of the FAT entries is the limiting factor on how many clusters a disk volume can contain. The following are the three FAT versions in use:

• FAT12: The oldest type of FAT uses a 12-bit binary number to hold the cluster number. A volume formatted using FAT12 can hold a maximum of 4,086 clusters, which is 2^12 minus a few values (reserved values used in the FAT). FAT12 is therefore most suitable for very small volumes, and is used on floppy disks and hard disk partitions smaller than about 16 MB.

• FAT16: This FAT type is mostly outdated but may still be found on older systems; it uses a 16-bit binary number to hold cluster numbers. A volume using FAT16 can hold a maximum of 65,526 clusters, which is 2^16 minus a few reserved values. FAT16 is used for hard disk volumes ranging in size from 16 MB to 2,048 MB. VFAT is a variant of FAT16.

• FAT32: The newest FAT type, FAT32, is supported by all Windows systems since Windows 95. FAT32 uses a 28-bit binary cluster number, not 32, because 4 of the 32 bits are reserved. 28 bits is still enough to permit very large volumes: FAT32 can theoretically handle volumes with over 268 million clusters and (theoretically) support drives up to 2 TB in size. To do this, however, the FAT itself grows very large. Because of these limitations, most storage systems larger than 4 GiB today use NTFS as the default file system.

The following table shows a comparison of the three types of FAT and NTFS:

Attribute                  | FAT12                                     | FAT16                             | FAT32                                        | NTFS
Used for                   | Floppies and very small hard disk volumes | Small hard disk volumes (old)     | Small to medium hard disk volumes (pre-NTFS) | New standard for volumes >= 4 GiB
Size of each FAT entry     | 12 bits                                   | 16 bits                           | 28 bits                                      | -
Maximum number of clusters | 4,086                                     | 65,526                            | ~268,435,456                                 | ~4,295 billion
Cluster size used          | 0.5 KiB to 4 KiB (usually 512 B)          | 0.5 KiB to 64 KiB (usually 4 KiB) | 0.5 KiB to 32 KiB (usually 4 KiB)            | 4 KiB to 64 KiB
Maximum volume size        | 16,736,256                                | 2,147,123,200                     | Win NT 3.5: ~2^41; newer OS: ~2^35           | 256 * 10^40

Fig. 2-18 FAT size table
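The cluster limits of Fig. 2-18 determine which FAT width a given volume needs. The sketch below applies those limits; the 2 GiB / 32 KiB example is chosen because it just exceeds the FAT16 cluster limit.

```python
def clusters_and_fat(volume_bytes, cluster_bytes):
    """Number of clusters and the smallest FAT width that covers them,
    using the cluster limits from Fig. 2-18."""
    clusters = volume_bytes // cluster_bytes
    for bits, limit in ((12, 4086), (16, 65526), (32, 268_435_456)):
        if clusters <= limit:
            return clusters, bits
    return clusters, None            # too many clusters for any FAT

# 2 GiB with 32 KiB clusters -> 65,536 clusters, just over FAT16's 65,526
print(clusters_and_fat(2 * 2**30, 32 * 2**10))   # (65536, 32)
```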


2.5 DIRECTORIES

Directories are used in a computer system for several reasons. They are used to order files and to separate them from each other in order to avoid unintended data modification. Users can use them to create backups of existing files, and an OS does the same before an update is applied. Folders also group the data used by a common program, to shorten the search for certain software elements, and they sort files by certain criteria to simplify searches initiated by the OS. Summarized:

To order files

Backup and updates

To avoid unintended data modification

Allow joint use of data

Issue of access rights

Shorter search time for files by operating system

The directories are located in disk blocks on the HDD and contain the mapping of file names to their respective disk blocks. A directory entry in MS-DOS looks as shown in Fig. 2-19:

Bytes: 8 (file name) | 3 (name extension) | 1 (attribute) | 10 (reserved) | 2 (time) | 2 (date) | 2 (number of 1st block) | 4 (size) -- 32 bytes in total

Fig. 2-19 Directory entry in MS-DOS
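The 32-byte layout of Fig. 2-19 can be unpacked directly. The sketch below uses Python's struct module; the BRIEF.TXT entry and its values are invented for illustration, and the packed time/date fields are left as raw numbers.

```python
import struct

# Fig. 2-19 layout: 8 B name, 3 B extension, 1 B attribute, 10 B reserved,
# 2 B time, 2 B date, 2 B first cluster, 4 B file size (little-endian).
ENTRY = struct.Struct("<8s3sB10sHHHI")   # 32 bytes in total

def parse_entry(raw):
    name, ext, attr, _res, time, date, first, size = ENTRY.unpack(raw)
    return {"name": name.decode("ascii").rstrip(),
            "ext": ext.decode("ascii").rstrip(),
            "attr": attr, "first_cluster": first, "size": size}

# A hypothetical entry: BRIEF.TXT, archive bit set, 1000 bytes, cluster 5
raw = ENTRY.pack(b"BRIEF   ", b"TXT", 0x20, b"\x00" * 10, 0, 0, 5, 1000)
print(parse_entry(raw))
```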

The following table shows the attributes used in MS-DOS directory entries:

Attributes

Read only: not modifiable |R

Archive: not yet archived since last modification |A

System: cannot be deleted using del |S

Hidden: is not listed in dir |H

Directory |D

Disk label |V

Fig. 2-20 Attributes for entries in MS-DOS


The following table shows the entries of a directory in detail and gives information about file and cluster sizes (ref. Fig. 2-21). Here the size of a sector is fixed at 512 bytes and there is no distinction between block and sector.

Position    Length  Content
0x00-0x07   8       File name
0x08-0x0A   3       File name extension
0x0B        1       Attribute:
                    0x01 file is read-only
                    0x02 hidden file (not shown by DIR)
                    0x04 system file (not shown by DIR)
                    0x08 volume label (directory entry is the disk name)
                    0x10 directory entry refers to a subdirectory
                    0x20 file has not yet been archived
0x0C-0x15   10      Reserved for DOS
0x16-0x17   2       Time of last modification or creation of the file
0x18-0x19   2       Date of last modification or creation of the file
0x1A-0x1B   2       Start cluster number of the file
0x1C-0x1F   4       Length of the file in bytes

Cluster size in sectors | Max. file system size (FAT12) | Max. file system size (FAT16)
1  = 512 Byte           | 2 MiB                         | 32 MiB
4  = 2 KiB              | 8 MiB                         | 128 MiB
8  = 4 KiB              | 16 MiB                        | 256 MiB
16 = 8 KiB              | 32 MiB                        | 512 MiB
32 = 16 KiB             | 64 MiB                        | 1024 MiB

Cluster size in sectors | FAT size for a 32-MiB file system
1  = 512 Byte           | 64 KiB (16-bit FAT)
4  = 2 KiB              | 16 KiB (16-bit FAT)
8  = 4 KiB              | 6 KiB (12-bit FAT)
16 = 8 KiB              | 3 KiB (12-bit FAT)

Cluster size in sectors | Number of files | Lost storage / KiB
1                       | 100             | 25
4                       | 500             | 500
8                       | 1000            | 2000
16                      | 2000            | 8000

Fig. 2-21 Directory entries, file and cluster sizes
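The last table above follows from a simple rule of thumb: on average, half a cluster is wasted per file. The sketch below reproduces its numbers under exactly that assumption.

```python
def lost_storage_kib(cluster_sectors, n_files, sector_bytes=512):
    """Average internal fragmentation: half a cluster wasted per file."""
    cluster_kib = cluster_sectors * sector_bytes / 1024
    return n_files * cluster_kib / 2

for sectors, files in ((1, 100), (4, 500), (8, 1000), (16, 2000)):
    print(sectors, files, lost_storage_kib(sectors, files), "KiB")
# 25.0, 500.0, 2000.0 and 8000.0 KiB -- the values of the table above
```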

The following figures show several examples of directory structures in different systems. While the general structure of the different directory methods is quite similar, there are differences between the methods according to their type of space management. Older floppy disks and hard disks used the file directory structure shown in figure 2-22.


Boot sector (sector 0, read by the ROM BIOS) | FAT | FAT copy | Root directory | File blocks

Fig. 2-22 Arrangement of entries on disk

The boot sector is the first sector in this arrangement and starts at relative address 0. It is always a reserved sector, and usually the only one. It is followed by the file allocation table, whose size depends on the chosen type of FAT, then the root directory and the file blocks.

The UNIX file system uses the following directory entry:

Byte: 2 (I-node number) | 14 (name; 255 in BSD-UNIX)

Fig. 2-23 UNIX Directory Entries

Those entries are ordered as shown in figure 2-24 on a hard disk.

Boot block (512 Byte) | Superblock | I-Nodes (64 bytes each) | Data blocks

Fig. 2-24 Arrangement of entries on a disk

Every I-Node (of 64 bytes) corresponds to one file. Every file is stored in data blocks. The I-Node references which blocks belong to a file, and the super block contains relevant information about the file system.



2.5.1 FILE ORGANIZATION IN UNIX

Each file system has a table of contents in which all existing files are recorded. In UNIX this is the I-Node list; the I-Nodes are the elements that represent the file headers.

The structure has the following order:

Block 0 (boot block)

Superblock

List of Head of Files (I-node list)

Range of the data blocks

Fig. 2-25 UNIX file structure

The super block in main memory contains information about the following (see the sketch after this list):

the size of the file system in blocks of 512 bytes

Name of the logical disk

Name of the file system

Size of the I-node list

Pointer to the first element of the list of free data blocks

Pointer to the first element of the list of the free I-nodes

Date of last modification

Date of last backup

Identification whether 512 or 1K byte file system exists
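As an illustration, the superblock information above can be collected in a record like the following minimal Python sketch; the field names are chosen freely here and are not taken from any real UNIX header.

from dataclasses import dataclass

@dataclass
class SuperBlock:
    # Fields mirror the list above; all names are illustrative.
    size_in_blocks: int         # file system size in 512-byte blocks
    logical_disk_name: str
    file_system_name: str
    inode_list_size: int
    first_free_data_block: int  # pointer to first element of the free-block list
    first_free_inode: int       # pointer to first element of the free-I-node list
    last_modification: int      # date of last modification
    last_backup: int            # date of last backup
    block_size_1k: bool         # True for a 1-KiB, False for a 512-byte file system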


The structure of an I-Node is shown in Fig. 2-. I-Node 1 manages faulty blocks; I-Node 2 contains the root folder.

Fig. 2-26 Search for file /usr/gd/DV3
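The lookup of /usr/gd/DV3 in Fig. 2-26 walks the tree one component at a time: starting at the root I-node (number 2), each directory's data blocks are searched for the next name, which yields the next I-node number. A minimal sketch, assuming hypothetical helpers read_inode() and read_dir_entries() that are not defined here:

def resolve_path(path: str) -> int:
    """Return the I-node number of a file, e.g. '/usr/gd/DV3'."""
    inode_no = 2                          # I-node 2 holds the root directory
    for component in path.strip("/").split("/"):
        inode = read_inode(inode_no)      # hypothetical: load the 64-byte I-node
        # hypothetical: iterate the (I-node number, name) pairs of the directory
        for entry_no, entry_name in read_dir_entries(inode):
            if entry_name == component:
                inode_no = entry_no
                break
        else:
            raise FileNotFoundError(component)
    return inode_no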


3 MEMORY MANAGEMENT

Memory management is an umbrella term for methods to order, save and, if necessary, relocate data on storage systems. As is known so far, memory access speed has a large impact on the overall speed of the system. A typical computer has several levels of memory, depending on the specific purpose each level serves. The fast-access cache, usually implemented on the CPU, stores code segments needed for the current operation. The main memory (in a PC called RAM) stores data relevant for the executing application. Data with a low access priority is stored on a mass storage device (HDD). Those devices are designed for storage capacity and are organized by the operating system for optimal usage of disk space.

The amount of space of a disk that can be utilized to store files depends on the method how the files are stored and ordered and the actual size of the storage medium (See Chapter 2).

When dealing with wasted memory space there are 2 terms that are encountered:

Internal fragmentation: unused memory inside allocated units, wasted because the allocation granularity is larger than the data actually stored.

External fragmentation: very small free memory spaces spread all over the memory, which can neither be used for allocation nor be merged with one another.

The performance of a computer system depends partially on the effective usage of space in the cache and the main memory. If a large amount of memory is wasted, due to fragmentation, ineffective indexing and similar problems, a request for data will take more time than necessary. While this data request is pending the CPU cannot continue calculating the current process and the system will slow down.

A high amount of fragmentation can lead to memory shortage in the RAM, which will result in a higher amount of necessary reloads from the hard disk (Virtual Memory Concept or Storage Access), thus resulting in longer memory access times. An inefficient indexing affects the data access time directly because the process of copying data from the main memory to the cache gets interrupted by the tracing of the next memory block from the current memory segment.

Developers want to increase computer performance as much as possible. A user perceives this performance as the time between request and response; the operator of a router considers it as throughput, i.e. the number of routed packets per time unit.

The ratio of two execution times is calculated as:

n = Execution time X / Execution time Y

with n giving a relative number for the improvement (or worsening) between the two execution times. Performance is measured by programs that load (as in burden) different systems with an equal task and compare the results with one another (benchmarks).


Locality Principle

According to a rule of thumb, 90% of the execution time of a program is spent in 10% of its code. This is known as access locality. The same applies to data accesses.

We distinguish between temporal and spatial locality:

a) Temporal locality states that recently addressed commands are more likely to be addressed again next;

b) Spatial locality means that temporally consecutive commands are also spatially adjacent (for example, loops).

Observations show that recently used addresses are indeed the most likely to be addressed again next. It is further shown that accesses to spatially neighboring addresses also occur adjacent in time. From this experience results the concept of memory hierarchy as an organizational measure for performance improvement. We distinguish between CPU registers, cache, main memory/RAM, disks, floppy disks and CDs.


3.1 MEMORY HIERARCHY

Implementing a smart memory hierarchy is meant to speed up the execution of tasks. To address these improvements, some basic concepts should be clarified first:

Addressing is divided into two parts: a physical address (for CPU, registers, cache, RAM) and a logical address (for mass storage). Mapping is the translation from a logical address (virtual address) to a physical address.

Amdahl´s Law:

The performance improvement of a system gained from using a faster mode of execution is limited by the fraction of the system that cannot use the faster mode. This means that the slowest part of a process limits the optimal time in which the process can be completed:

P_overall = Original execution time / Improved execution time = t_old / t_new

P_overall = 1 / ((1 − fraction_improved) + fraction_improved / P_fraction)

where P_fraction = t_old / t_new of the improved part alone.


The performance improvement is also called speedup.

             CPU (Register)   L1 Cache (SRAM)   L2 Cache (SRAM)   Main Memory (DDR3 RAM)   I/O-Device (Drives)
Size         256 B            32 KiB            256 KiB           4 GiB                    >128 GiB
Access time  0.28 ns          ~1 ns             ~3 ns             ~40 ns                   ~5 ms

To take a look at memory access, Amdahl's law is used to compare a system with and without a cache.

If it is supposed that the cache is 10 times faster than main memory and the cache can be used 90% of the time (90% cache hits), the following performance improvement (speedup) can be achieved:

P_overall = 1 / ((1 − % cache access) + % cache access / P_cache)

P_overall = 1 / ((1 − 0.9) + 0.9 / 10) = 1 / 0.19 = 5.26
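The same calculation as a small Python helper (a minimal sketch; the function and variable names are ours):

def speedup(fraction_improved: float, p_fraction: float) -> float:
    # Amdahl's law: overall speedup from improving one fraction of a task.
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / p_fraction)

print(speedup(0.9, 10))   # cache usable 90% of the time, 10x faster -> ~5.26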

If the CPU execution is considered, it should be known that the execution time is made up of

• the time t_execution to handle cache hits (i.e. the time to execute the instructions including only cache accesses) and

• the time t_memory_stall the CPU is stalled waiting for memory accesses.

Resulting in:

𝑪𝑷𝑼𝒕 = 𝒕𝒆𝒙𝒆𝒄𝒖𝒕𝒊𝒐𝒏 + 𝒕𝒎𝒆𝒎𝒐𝒓𝒚_𝒔𝒕𝒂𝒍𝒍

𝑪𝑷𝑼𝒕 = (𝑪𝑷𝑼 𝒄𝒚𝒄𝒍𝒆𝒔 + 𝒘𝒂𝒊𝒕 𝒄𝒚𝒄𝒍𝒆𝒔 𝒐𝒏 𝒎𝒆𝒎𝒐𝒓𝒚) ∙ 𝑻

T = clock cycle time

Wait cycles on memory = number of misses * penalty cycles

To have the possibility to calculate the CPU execution time the different terms of the sum should be considered in a more detailed way:

Access to the cache

The CPU execution time without any memory access (including all cache hits) is the product of:

• CPI, cycles per instruction,

• IC, the number of instructions called instruction count and

• T, the CPU clock cycle time, e.g. at 1 GHz, T = 1 ns.


Resulting in:

𝒕𝒆𝒙𝒆𝒄𝒖𝒕𝒊𝒐𝒏 = 𝑪𝑷𝑰 × 𝑰𝑪 × 𝑻

Memory accesses

The time the CPU is stalled during the cache misses is the product of

• MPI, the memory accesses per instruction

• IC, the instruction counter,

• MR, the (cache) miss rate,

• penaltymiss , the number of cycles in case of a miss and

• T, the clock cycle time.

Resulting in:

tmemory_stall = MPI · MR · penaltymiss · IC · T

The miss rate MR is calculated by the number of the accesses that miss divided by number of accesses (both hits and misses).

MPI describes the number of memory accesses per instruction, counting only those instructions that actually demand memory accesses.

The product MPI · MR describes the number of the cache misses per instruction.

So the final execution time for CPU is:

CPUt = (CPI + MPI · MR · penaltymiss ) · IC · T

Example: Assume a system with CPI = 8.5, MPI = 3, MR = 0.11 (11 %), penalty_miss = 6. Calculate the execution time!

What will be the CPU execution time if no cache exists?

With cache:

CPUt = (8.5 + 3 · 0.11 · 6) · IC · T = 10.48 · IC · T

Without cache (every access misses, MR = 1):

CPUt = (8.5 + 3 · 1 · 6) · IC · T = 26.5 · IC · T
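A quick check of the example with a short Python sketch; since IC and T only scale the result, the function returns the factor in front of IC · T (names are ours):

def cpu_time_factor(cpi: float, mpi: float, miss_rate: float, penalty: float) -> float:
    # CPUt = (CPI + MPI * MR * penalty_miss) * IC * T; returns the bracket only.
    return cpi + mpi * miss_rate * penalty

print(cpu_time_factor(8.5, 3, 0.11, 6))  # with cache: 10.48
print(cpu_time_factor(8.5, 3, 1.00, 6))  # without cache (every access misses): 26.5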

Thus it is reasonable, in addition to a correspondingly large set of registers, to provide a cache of sufficient size, which holds reasonably large program parts and thereby shortens processing times. The term cache makes clear that it is not a question of addressable storage locations, such as the registers or the RAM, but of an area that is not accessible, i.e. hidden, to the programmer (assembler level). The operating system administers the (automatic) management of the memory hierarchy. In the case of the cache, this is done by special hardware (usually on the CPU).


3.2 CACHE MEMORY

The “hidden” cache memory is located between the CPU and RAM, initially outside, today on the CPU chip. Nowadays two cache levels, an internal one and an external one with different sizes and speeds, are common in order to further increase the speed. Since the physical size of memory elements has decreased over the past years, most modern CPUs have a third-level cache (L3 cache) that is usually shared between the separate cores and has a size of up to 8 MiB. The basic cache arrangement is shown here:

Fig. 3-1 Basic memory arrangement with cache

Figure 3-2 shows the attributes and relative location of the different types of memory:

Fig. 3-2 Basic relation between different components

Figure 3-2 shows the relative location of the memory to the CPU. The speed of a data access from the CPU to a memory system depends on the speed of the bus system and the relative “distance” to the specific system. A cache is usually connected directly to the core of a CPU and stores only small amounts of data, meaning a data transfer can be done in 1 CPU clock. Transferring data from RAM, e.g. a DDR3-1600 system, is done at a transfer rate of 1600 MT/s (mega transfers per second) on a 64-bit bus, requiring about 5 system clocks (not including delays and access times).


DRAM and SRAM:

The difference between both RAM types is given in the first letter: D stands for “dynamic” and S for “static”. The cycle time of SRAM is 10 to 20 times shorter than that of DRAM, but for the same technology the capacity of DRAM is 5 to 10 times larger than that of SRAM. Therefore it can be said:

• Main memory is DRAM

• On-chip caches are SRAM

• Off-chip caches: it depends on the design

3.2.1 CACHE STRUCTURE

A cache miss may increase the processing time of a given task. To improve a system it is necessary to understand what causes a cache miss and how to decrease the chance of a miss. Usual causes of a miss are:

First access (Compulsory)

Capacity is too low

Conflict by different block addresses

Parameters of cache:

Block (or row) size:                  4 - 128 bytes
Hit time:                             1 - 4 cycles (normally 1)
Miss penalty (failure access time):   8 - 32 cycles (time to replace a block)
Access time:                          6 - 10 cycles (access to the 1st word of the block)
Transfer time:                        2 - 22 cycles (time for the remaining words)
Miss rate (failure access rate):      1 % - 20 %
Cache size:                           1 KiB - 64 KiB (L1), 256 KiB - 2 MiB (L2), 2 - 20 MiB (L3)


3.2.2 CACHE PERFORMANCE

In order to calculate the time a CPU needs to acquire a set of data the following formula is used.

𝒕𝒂𝒄𝒄 = 𝜶 ∙ 𝒕𝒉𝒊𝒕 + (𝟏 − 𝜶) ∙ 𝒕𝒑𝒆𝒏𝒂𝒍𝒕𝒚

In this case t_hit stands for the time the access needs in case of a cache hit, t_penalty equals the time the system needs for an external memory access, and α is the probability of a cache hit.

Based on the experience that the hit rate for instructions is higher than the one for data, one can now make a decision between a separate and a combined instruction and data cache.

Example:

t_hit = 5 ns
t_penalty = 5 ns · 20 penalty cycles = 100 ns

From statistical studies of programs it can be seen that approximately 26% of the instructions are loads and 9% are stores. Per 100 instruction fetches there are thus 35 data accesses, meaning that about 74% of the memory accesses (100 of 135) go to the instruction cache and 26% to the data cache. Table 3-1 shows results from measurements of the hit rates for various cache sizes.

Memory Space   Instruction Cache   Data Cache   Shared Cache
1 KiB          3.06%               24.61%       13.34%
2 KiB          2.26%               20.57%       9.78%
4 KiB          1.78%               15.94%       7.24%
8 KiB          1.10%               10.19%       4.57%
16 KiB         0.64%               6.47%        2.87%
32 KiB         0.39%               4.82%        1.99%
64 KiB         0.15%               3.77%        1.35%
128 KiB        0.02%               2.88%        0.95%

Table 3-1 Measurement results on hit rates and cache sizes


Example:

Which arrangement gives the lower miss rate?

Shared cache with 32 KiB or

Separate instruction and data cache with 16 KiB each

About 74% of the memory accesses in a system relate to instruction accesses and 26% to data accesses. With the data from Table 3-1, the miss rate of the separate 16 KiB caches is calculated from:

(0.74 x 0.64%) + (0.26 x 6.47%) ≈ 2.16%

The table provides, on the other hand, 1.99% for the 32 KiB shared cache. Therefore, for the performance, the average access time has to be considered in addition to the miss rates!

For the average access time to memory:

t_acc = % instruction accesses · (hit time + miss rate × penalty time) + % data accesses · (hit time + miss rate × penalty time)

For the separate 16 KiB caches:

t_acc_split = 74% · (1 + 0.64% × 50) + 26% · (1 + 6.47% × 50) ≈ 2.08

For the combined 32 KiB cache:

t_acc_shared = 74% · (1 + 1.99% × 50) + 26% · (1 + 1* + 1.99% × 50) ≈ 2.26

* The additional clock cycle results from the collision between instruction and data accesses in the shared cache.
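The comparison can be reproduced with a few lines of Python (hit time 1 cycle and penalty 50 cycles as in the example; the extra cycle models the port conflict of the shared cache; names are ours):

HIT, PENALTY = 1, 50          # cycles; values taken from the example above
INSTR, DATA = 0.74, 0.26      # shares of memory accesses

t_split  = INSTR * (HIT + 0.0064 * PENALTY) + DATA * (HIT + 0.0647 * PENALTY)
t_shared = INSTR * (HIT + 0.0199 * PENALTY) + DATA * (HIT + 1 + 0.0199 * PENALTY)

print(t_split)   # ~2.08 cycles: the split cache wins on average access time
print(t_shared)  # ~2.26 cycles, despite its lower miss rate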

3.2.3 AVERAGE ACCESS TIME

The average access time for each memory access (to either cache or to the main memory) can be calculated as follows:

𝒕𝒂𝒄𝒄 = 𝜶 ∙ 𝒕𝒄𝒂𝒄𝒉𝒆 + (𝟏 − 𝜶) ∙ 𝒕𝑫𝑹𝑨𝑴

where:
α = cache hit rate
1 − α = cache miss rate (MR)
t_cache = time to access the cache
t_DRAM = time to access main memory

For example:
Main memory with 125 ns access time, 11 wait states*
Cache with 12 ns access time, “zero wait states”
Processor clock cycle 12 ns = 83 MHz

* Wait states: number of clock cycles, the CPU waits for memory


The average access time of a set of data with a hit rate of 90% results in:

t_acc = 0.9 · 12 ns + 0.1 · 125 ns = 23.3 ns

The average number of wait states can be calculated as: N_wait = 0.9 · 0 + 0.1 · 11 = 1.1

The average access time gives a rough overview of the access time of a given system; it cannot, however, cover all cases of a cache access. For example, if a given set of data is not simultaneously copied to the CPU registers and the cache, a new access with a cache hit would be required to obtain the newly copied data. This depends on the working parameters of the system that manages the cache memory and data access.

3.2.4 CACHE ORGANIZATION

Cache can be organized as follows:

• Fully Associative: a block can be placed anywhere in the cache

• Direct Mapped: each block has only one place it can appear in the cache: block number = block address MOD number of blocks in cache

• N-Way Set Associative: a block can be placed in a restricted set of n blocks in the cache: set number = block address MOD number of sets in cache

The range of caches from direct mapped to fully associative is in fact a subtype of levels of set associative cache:

• Direct mapped cache is one-way set associative

• Fully associative cache consisting of m blocks is m-way associative.

Thus direct mapped cache and fully associative cache are special cases of the n-way set associative cache. The majority of processor caches today are direct mapped, 2-way set associative or 4 way set associative, depending on their respective level.

If the cache is not direct mapped, there are many blocks to choose from on a miss. Which strategies are employed for selecting which block to replace?

Random – the candidate blocks are randomly selected (simple to build in hardware).

Least-recently used – the block replaced is the one that has been unused for the longest time.
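A minimal Python sketch of an n-way set-associative lookup with least-recently-used replacement (block-address level only, no data payload; class and method names are ours). Direct mapped corresponds to ways=1, fully associative to num_sets=1:

class SetAssociativeCache:
    def __init__(self, num_sets: int, ways: int):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [[] for _ in range(num_sets)]  # per set: tags, LRU order first

    def access(self, block_addr: int) -> bool:
        """Return True on hit, False on miss (the block is then loaded)."""
        s = self.sets[block_addr % self.num_sets]  # set = address MOD number of sets
        if block_addr in s:
            s.remove(block_addr)
            s.append(block_addr)                   # move to most-recently-used end
            return True
        if len(s) == self.ways:
            s.pop(0)                               # evict the least-recently-used block
        s.append(block_addr)
        return False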


The following figure depicts the miss rates vs. set associativity.

Fig. 3-2 Miss rates vs. set associativity

Figure 3-2 shows that a bigger cache and a higher level of set associativity result in lower miss rates. But higher levels of associativity may lead to longer cache searches, resulting in a smaller time improvement compared to lower associativity. The following is an example of a cache system with different methods of mapping:

A cache contains 8 blocks and the main memory consists of 32 blocks. Fig. 3-3 and Fig. 3-4 describe the mapping of the block 12 from the main memory into the cache.

Fig. 3-3 Block 12 of the memory

Fig. 3-4 Set Associative Cache Mappings

A real cache contains hundreds of blocks and has to map millions of blocks of a real RAM system.


3.3 MAIN MEMORY MANAGEMENT

Main memory management is an important task of the operating system. If the RAM is managed effectively, the overall speed of the computer system improves. Usually main memory management systems can be classified into:

1. those that transfer processes during execution between main memory and hard disk by means of swapping and paging, and

2. those that do not.

3.3.1 MEMORY MANAGEMENT WITHOUT SWAPPING AND PAGING

The easiest memory management is to keep only one process in memory. The process has full access to the whole memory, and after it terminates the next process is loaded.

As mentioned later in this lecture, having several processes in memory is better for many reasons. However, to run several processes, memory has to be assigned to each specific process.

All multi-process operating systems face the problems of relocation and protection.

3.3.1.1 Relocation

Since it is unknown where in memory a process will be loaded, absolute addressing of memory needs additional modification.

For example, if a process wants to read or write data to the address 100 and the process begins at the position 100KiB, the memory access must be modified to (100KiB + 100). If the process starts at 200 KiB it must be modified to (200KiB + 100) and so on.

One solution is to store additional information that references all positions in the program which use absolute addressing. Another solution is to use a hardware register whose content is automatically added to all memory accesses (segmentation).

3.3.1.2 Protection

Relocation does not solve the problem that a process may read or write in partitions of other processes (spying or destruction).

Segmentation can be a solution by adding a second register marking the end of a partition. Memory requests not lying between the start and the end of the partition are rejected by the hardware.
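Both mechanisms together amount to a simple hardware check, sketched here in Python (the function and register names are illustrative):

def translate(logical_addr: int, base: int, limit: int) -> int:
    # Relocation: add the base register to every access.
    # Protection: reject accesses outside [base, base + limit).
    if not (0 <= logical_addr < limit):
        raise MemoryError("segmentation violation")
    return base + logical_addr

print(translate(100, 100 * 1024, 64 * 1024))  # process loaded at 100 KiB -> 102500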

3.3.1.3 Partitions

3.3.1.3.1 Creating Partitions

There are two ways for creating partitions:

Fixed partitions: split the memory into n fixed partitions. This is by far the easiest way. The general problem of this method is that the number and sizes of the partitions can never be optimal. For example:


Consider the following scenario: Memory is divided into 4 partitions with the respective sizes as depicted in Fig. 3-5. The green boxes indicate the jobs assigned to the corresponding partitions. The distribution of jobs are carried out using:

o Several wait queues: each partition manages its own list of jobs. Jobs are added to the shortest partition list. The problem of this method is that the lists of small partitions may be full while the lists of bigger partitions are empty (waste of time).

o One wait queue: partitions share a common list, and jobs are distributed to the next idle partition. Here the problem appears that small jobs may waste space of a big partition while smaller partitions are idle, and a job that needs a big partition may have to wait.

Fig. 3-5 a) Several wait queue b) One wait queue

Variable partitions: this method splits the memory into partitions of varying size, depending on the request. Considering the size and number of partitions, this method is efficient. However, memory allocation and de-allocation become complicated.

3.3.1.3.2 Allocation Strategies

The following strategies are used to find the partitions to be allocated:

1. First fit: takes the first partition that is large enough to accommodate the request. Easy, quick and cheap to implement.

2. Next fit: remembers the last position of a free partition, and starts from there to find the next appropriate free space.

3. Best fit: searches for the partition whose size fits the request best. Although this strategy prevents large memory waste, it often leads to the creation of many very small partitions that cannot be used at all.

4. Worst fit: finds the largest possible partition to be allocated to the process.

The four strategies above often lead to the main problem of memory allocation: the creation of many very small partitions that cannot be allocated to large memory requests. The solution to this was memory compaction, where neighboring partitions are put together to form larger partitions. Unfortunately, the process of finding neighboring partitions results in large administration overhead. To reduce this overhead, the Buddy-System was introduced.

Buddy-System:

Memory is divided into buddies (neighboring partitions of exactly the same size) only when a request arrives. The size of a partition is always 2^k, determined according to the request. If two buddies are freed, they can be merged to form a larger partition.

Advantages: fast allocation and de-allocation, reduced overhead for memory compaction.

Disadvantages: some requests result in a large unused memory space. For example, if the request is 513 KiB and the complete memory size is 1 MiB, the whole memory has to be assigned to the request!
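A minimal buddy-system sketch in Python, allocation side only (freeing and merging of buddies is omitted; sizes in KiB; all names are ours). The 513 KiB example from above reproduces the worst case:

def round_up_pow2(n: int) -> int:
    k = 1
    while k < n:
        k *= 2          # partition sizes are always powers of two
    return k

class BuddyAllocator:
    def __init__(self, total: int):
        self.free = {total: [0]}          # size -> list of free block start addresses

    def alloc(self, request: int) -> int:
        size = round_up_pow2(request)
        s = size
        while s not in self.free or not self.free[s]:
            s *= 2                        # look for a larger free block to split
        addr = self.free[s].pop()
        while s > size:                   # split into two buddies until it fits
            s //= 2
            self.free.setdefault(s, []).append(addr + s)
        return addr

alloc = BuddyAllocator(1024)
print(alloc.alloc(513))   # rounded up to 1024 KiB: the whole memory is assigned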

3.3.2 MEMORY MANAGEMENT WITH SWAPPING AND PAGING

As long as there is enough memory to keep all processes, there is no need to use anything more complicated, as e.g. in embedded systems. In other systems there may not be enough memory for all processes. If currently idle, waiting or interrupted processes can be moved from memory to disk, their partitions can be freed, compacted and used by a process reloaded from the disk to its (new) position.

In a swapping system the whole process and its data are moved in or out. This can be several megabytes every time the process is moved. For swapping to be possible, the whole process and its data must fit into the available memory.

Example of process loading

Fig. 3-6 Example of a filled cache

Now suppose Process B is swapped out.


Example of process loading (cont.)

Fig. 3-7 Swapping of data in a cache

Simple Paging

• Main memory is partitioned into equal fixed-sized chunks (of relatively small size)

• Trick: each process is also divided into chunks of the same size called pages

• The process pages can thus be assigned to the available chunks in main memory called frames (or page frames)

• Consequence: a process does not need to occupy a contiguous portion of memory

Page tables

Fig. 3-8 Example of page tables

• The OS now needs to maintain (in main memory) a page table for each process

• When process A and C are blocked, the pager loads a new process D consisting of 5 pages

• Process D does not occupy a contiguous portion of memory

• There is no external fragmentation

• Internal fragmentation occurs only in the last page of each process


• Each entry of a page table consists of the frame number where the corresponding page is physically located

• The page table is indexed by the page number to obtain the frame number

• A free frame list, available for pages, is maintained

Logical address used in paging

• within each program, each logical address must consist of a page number and an offset within the page

• A CPU register always holds the starting physical address of the page table of the currently running process

• Presented with the logical address (page number, offset) the processor accesses the page table to obtain the physical address (frame number, offset)

Fig. 3-9 Example of paging

• By using a page size of a power of 2, the pages are invisible to the programmer, compiler/assembler, and the linker

• Address-translation at run-time is then easy to implement in hardware

• The logical address becomes a relative address when the page size is a power of 2

• Example: if 16-bit addresses are used with 10 bits for the offset, 6 bits remain available for the page number

• Then the 16-bit address, with the 10 least significant bits as offset and the 6 most significant bits as page number, is a location relative to the beginning of the process


Abstract addresses

A programmer and the CPU “think” in abstract addresses, because at the time of the design and execution of a program the physical addresses of the target system are unknown. Therefore a separation of physical and logical addresses is necessary. When a program is installed, its contents are stored on a disk and the physical addresses of the contents are mapped within the program / the operating system.

When the program is executed, its functions work with abstract addresses that are decoded to physical ones as soon as a data access is necessary. If parts of the program are stored in the main memory, the mapping of the data has to be dynamic, since the data blocks can be relocated within the RAM, and thus the abstract address the CPU is working with needs to be remapped to the new physical address.

Logical-to-Physical Address

The logical address (n, m) gets translated to physical address (k, m) by indexing the page table and appending the same offset m to the frame number k (ref. Fig. 3-10).

Fig. 3-10 Translation in paging

This figure shows how a logical address is translated into a physical one. The 6 bits referencing the page number are looked up in the page table, and the resulting frame number replaces the page-related bits of the logical address.
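With the 16-bit format from above (6-bit page number, 10-bit offset), the translation is plain bit manipulation; a minimal sketch with an illustrative page table:

OFFSET_BITS = 10                      # page size 2^10 = 1024 bytes

def logical_to_physical(logical: int, page_table: list[int]) -> int:
    page   = logical >> OFFSET_BITS               # 6 most significant bits
    offset = logical & ((1 << OFFSET_BITS) - 1)   # 10 least significant bits
    frame  = page_table[page]                     # page table indexed by page number
    return (frame << OFFSET_BITS) | offset        # same offset, new frame number

page_table = [7, 3, 0, 5]             # page n is stored in frame page_table[n]
print(hex(logical_to_physical(0x0C2A, page_table)))  # page 3, offset 0x2A -> 0x142a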


3.3.3 REPLACEMENT STRATEGIES FOR PAGES

When a process is still running on the CPU but the main memory / cache reaches a critical level of free space, old data must be replaced whenever a new data request appears. There are several strategies to determine which set of data will be deleted in order to make space for a new set. These replacement methods can have a huge impact on the system's speed, should they delete a set of data that is important for the process within the next cycles. In this case a new access to the main memory has to be made, causing a delay that depends on the data access speed of the HDD. The following are some of the most used strategies for swapping data pages, with a short description of how they work.

Not recently used: The NRU method deletes a set of data that has not been marked as ‘referenced’ recently. Recently means that the files are time stamped at each reference and the file with the oldest time stamp is usually the one marked for deletion. If more than one or no page qualifies for this criteria the NRU method checks if one of those pages has been modified lately. If there is a match, this page will be deleted. In case of more than one page fulfilling that criteria one will be chosen randomly. Should no page qualify for replacement, one that has been read recently will be chosen randomly and deleted. This method is easily implemented and presents good results.

Least recently used: The LRU algorithm checks the ‘referenced’ marking of all pages and chooses the one with the oldest ‘last access’ entry. Problem of this method is, that pages that may be used more often but with longer intervals between two accesses may be removed from the main memory. This would result in repeating loading times of those files from a HDD.

FIFO: FIFO stands for first in, first out, which basically describes how this method works. The data that was stored first in the memory will be replaced first. This method pays no regard to how often a page is modified or when it was last used, which makes it inefficient in modern RAM management. It is still used, however, to build queues that memorize a sequence of tasks.

Second Chance: The second chance algorithm works like the FIFO algorithm, but instead of replacing the page instantly it checks for the ‘reference’ bit. If the bit is set the page will not be swapped and the next page in line will be checked, thus preventing the removal of a heavily used page. Should all the pages in the main memory have their ‘reference’ bit set, second chance turns into a simple FIFO algorithm.

Not frequently used: The NFU method is more difficult to implement, since every page needs a counter byte (or several). Every time a particular page is referenced, its counter is incremented by one. If a page is to be replaced, the algorithm picks the one with the smallest number of accesses. This method is effective for larger main memory systems, since it needs time to figure out which data sets are used most. Should a replacement be necessary in the early stages of a process, all counters may still be at a very low level, which could result in the replacement of a page that is needed more often in the future.
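A sketch of the second-chance variant described above: a FIFO queue whose head gets a "second chance" as long as its reference bit is set (function and variable names are ours):

from collections import deque

def second_chance_evict(pages: deque, referenced: dict) -> int:
    """pages: FIFO of resident page numbers; referenced: page -> reference bit."""
    while True:
        page = pages.popleft()         # oldest page, FIFO order
        if referenced.get(page):
            referenced[page] = False   # clear the bit: the page gets a second chance
            pages.append(page)         # move it to the back of the queue
        else:
            return page                # not referenced: this page is evicted

pages = deque([4, 7, 1])
referenced = {4: True, 7: False, 1: True}
print(second_chance_evict(pages, referenced))  # evicts 7; page 4 was re-queued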


3.4 VIRTUAL MEMORY

Most computers today have several GiB of RAM available for the CPU to use. Unfortunately, that amount of RAM may not be enough to run all of the programs that most users expect to run at once; it may not even be enough to run a single large program. The concept of virtual memory was introduced to solve this problem.

Here are three types of memory organization:

• One-word-wide memory organization

Fig. 3-11 One-word-wide memory

• Wide memory organization

Fig. 3-12 Wide memory

• Interleave memory organization

Fig. 3-13 Interleave memory


Virtual memory uses the concept of paging presented in 3.3.2. A set of data that is swapped out of the working memory because it is not needed by the current process may be stored on a HDD for later use. In this case the physical address of the data is kept in the RAM, or even in the cache, in case the data is needed again. Since a physical HDD address can itself get very large, it is usually indexed through a paging table. This creates the illusion of a very large working memory, while actually parts of the HDD are used to store process-relevant data. Since a data access to a HDD takes considerably more time than an access to cache and/or RAM, only rarely used blocks of data are stored in the virtual memory. The translation is usually done by a hardware-implemented translation unit. The paging table usually serves not only as a reference to the physical address but also gives information about the stored data, such as the following:

Is the content of the address resident in the main memory?

Has a modification occurred, i.e. does the HDD version differ from the copy in main memory?

The drawback of virtual memory is the loss of memory capacity caused by the tables. The method also tends to create internal fragmentation, since pages can only be handled as a whole, whether or not they need all the allocated memory space. When choosing the page size, some trade-offs need to be considered. If the pages are too small, the result is huge tables that need to be reloaded often, but with less internal fragmentation. If they are too large, fewer reloads are needed, but the main memory may be used ineffectively. In order to allocate virtual memory to best use, the operating system applies the partition strategies from 3.3.1.3 to the virtual memory concept.

For a detailed analysis of virtual memory we recommend the paper “Virtual Memory: Issues of Implementation” by Bruce Jacob & Trevor Mudge.


4 PROCESS MANAGEMENT

A process in a computer system is an instance of a program that is being executed by the hardware. A program contains the data for several instructions; when it is started, the corresponding hardware executes these instructions within one or several processes. The operating system needs to manage and distribute those processes according to the abilities of the hardware in order to achieve the best possible performance. To understand which method works best for a system, it is necessary to examine the specifics of a process and the timeframe needed for completion. A program usually resides on a local memory device and is partially or entirely transferred to the main memory upon execution. A process can then be comprised of:

Programs, sub-routines

Data

Instruction pointers

Stack pointers

CPU states

Register contents

The hardware unit that executes programs or processes data is usually the CPU, some programs may use another piece of hardware like the graphics accelerator card, but in this lecture the CPU will be the main program processing unit.

Processes are described via their context and are registered with the process control block (PCB). The PCB often consists of two parts (see the sketch after this list):

[a] Hardware-PCB (internal by operating system)

a. description of current process

b. or (if stopped): status of all frozen variables

[b] Software-PCB (external by programmer)

a. Identification

b. State

c. Priority, etc.
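A minimal sketch of such a software PCB as a Python record, with fields following the lists above (all field names are ours, not from any real kernel):

from dataclasses import dataclass, field

@dataclass
class ProcessControlBlock:
    pid: int                  # identification
    state: str                # e.g. "ready", "running", "blocked"
    priority: int
    program_counter: int = 0  # part of the saved hardware context
    stack_pointer: int = 0
    registers: dict = field(default_factory=dict)  # frozen register contents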


4.1 PROCESS STATES AND PRINCIPLES OF HANDLING

Several possible process states are defined for regular process administration. This is depicted in the following figure, where conditions and triggers for the state transitions are also taken into account.

Fig. 4-1 Process states and its transitions

1. Process creation by running program or operating system

2. Integration into process system by running program or operating system

3. Done by scheduler according to execution time or dispatcher according to interrupt/priority

4. Blocked while waiting for input

5. Input is completed, ready to continue

6. Process termination


4.1.1 PROCESS MANAGEMENT TOOLS

To run a single process on a single processor is simple (e.g. the standard personal computer), but if several processes have to be executed on a single processor, or perhaps even distributed across several processors, the above-stated definitions are of critical importance. In this case the operating system needs to monitor free resources and open tasks, allocating CPU time to a task while another process is waiting for data and not actively using processor power. The tasks of the usual administrative tools which carry out these duties are illustrated in Fig. 4-2.

Fig. 4-2 Process management tools


4.1.2 STRUCTURE OF MULTI-TASKING OPERATING SYSTEMS

A multi-tasking operating system constantly allocates upcoming tasks to the CPU (or one core of the CPU) to create the image of parallel processing of several tasks. In reality, multitasking does not mean parallel execution, but depending on the strategy the overall efficiency of processing can be increased. Those strategies almost always operate on the same principle: processes are split into segments, and each segment is given a specific amount of time on the CPU (ref. Fig. 4-3). The amount of time, and when a process is given it, depends on the implemented algorithm. Multitasking can be described as time multiplexing, since one CPU (or core) is occupied with several tasks and allocated to each of them within a certain time frame.

Fig. 4-3 Structure of a multitasking system

In a multi-user system the OS has to monitor how many users are currently demanding the resources of the system in order to allocate them efficiently. As new users log into the operating system, access rights are checked (log in) and fixed contingents of computer time, storage space and further resources are assigned. Furthermore, an account is opened by the operating system to keep track of the computer time used and possibly of the executed functions. Accounting allows the control of all computer operations during use as well as afterwards. In addition, an actual bill for computer services used can be produced. For every user, his/her tasks are set up in a dynamic administrative chart. Apart from that, there are a number of general administrative tasks, e.g. log in and accounting, as mentioned above, as well as the usual terminal I/O functions, print, etc. The scheduler assigns the processor to the different tasks in a time-sharing operation mode.

Two questions can arise from that structure:

Problem 1: How to change between processes

Problem 2: How to optimize efficiency in the use of CPU


4.1.2.1 The time sharing concept (Solution for Problem 1: To change between processes)

Figure 4-4 shows the model of the time-sharing concept displayed as a time wheel. It is important to know that one turn of the time wheel consists not only of the combined time slices of all processes, but also of the time needed to change between tasks, multiplied by the number of processes.

Fig. 4-4 Time-wheel model

Assume: T = 100 ms are given for the pseudo-parallel execution of multiple tasks if only one processor is available. Tasks P1, P2, P3 occur only a few times over hours (log in) or seldom (print). Tasks P4-P8 are user tasks; they require as much time as possible.

The scheduler divides the CPU time and allocates it to a task, which runs with its context, e.g. registers, program counter, and stack. After the processing time of a task is over, the scheduler changes the context and, as a result, moves on to the next task (e.g. account). This sequential allotment of time slices and changes of context continues until the time slice of the last task has been completed at time T and a new turn of the time wheel starts.
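The time wheel amounts to a round-robin loop; a minimal sketch in Python (time slice, context-switch cost and remaining times in ms; all names and numbers are illustrative):

from collections import deque

def time_wheel(tasks: dict, slice_ms: int, switch_ms: int) -> int:
    """tasks: name -> remaining CPU time in ms. Returns the total elapsed time."""
    wheel, elapsed = deque(tasks.items()), 0
    while wheel:
        name, remaining = wheel.popleft()
        elapsed += switch_ms                       # the context change costs time, too
        run = min(slice_ms, remaining)             # run at most one time slice
        elapsed += run
        if remaining > run:
            wheel.append((name, remaining - run))  # unfinished: back onto the wheel
    return elapsed

print(time_wheel({"P4": 30, "P5": 25, "P6": 10}, slice_ms=10, switch_ms=1))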

To change from one task to another, the context has to be changed too. Context changes are used to reload the process-specific registers of the processor, i.e. to switch from one virtual processor to the next. At a context change, the status has to be saved and reloaded; a von Neumann machine does not differentiate between data and instructions, which makes context changes slow. A solution for fast context switching is the separation of data and instructions.

When a process/task changes, then the stack pointer, registers, etc. must be changed according to the next process. It is necessary to ensure that the context change is done in a very short amount of time. There are two methods to do so:


The complete set of user data or process data is stored in a private memory space.

Context change by pointers only: the PC holds the program state, and pointers point to the start of the memory slot. That is why a context change can occur within a few µs (another reason why increasing memory can increase performance).

4.1.2.2 Scheduling algorithms (Solution for Problem 2: To increase efficiency in the use of the CPU)

The second main task of a multi-tasking operating system is the synchronisation of running programmes with technical processes. The interlocking of data-technical processes with value entry or user input, for example, can be executed

a) by cyclic flow control (polling), i.e. processes call for data, meaning program-controlled.

Drawback: the CPU is always busy with polling and is thus often effectively idle, waiting on I/O operations.

Fig. 4-5 Flowchart of cyclic flow control

b) time-controlled in certain given intervals with a real-time clock.

Drawback: how to handle idle time? If the time intervals are too short, the amount of data swapping increases until a process is finished, which decreases overall efficiency. If the time intervals are too long, a finished process leaves the CPU idle.


Fig. 4-6 Flowchart of time controlled processing

c) by request (interrupt) via the technical process, i.e. event-controlled. Active waiting is avoided: while a program waits, other programs can run until the interrupt arrives.

Fig. 4-7 Flowchart of a request based model

Process requests (triggered by the technical process):

• can be announced at any time

• have high priority (importance) by

- priority in execution and/or

- blocking other requests

• also lead to context change

An example of this is shown in Fig. 4-8. A background process (this can also be the time-sharing operation) is interrupted by two interrupts.


Fig. 4-8 Interrupt Control

Here the priority of the background process is at its lowest at 1, i.e. every interrupt is able to interrupt this process, as can be seen with I1, for instance. The priority of IRS 1 is still set relatively low at 2, so that the process can easily be interrupted by I2 with its higher priority IRS 2. In this example IRS 2 can complete its task without any further interruptions before IRS 1 can complete its remaining tasks. Finally, the background process is resumed.

As mentioned above, time-sharing operations and interrupt control usually overlap.

A process computer

• allows the execution of several data manipulation processes (quasi-parallelism)

• allows real-time measuring, controlling or regulation of technical processes

From now on the terms process (in the sense of data processing) and task will be used synonymously; the former originating from data processing and computer science respectively, the latter stemming from the user's point of view. As far as process computers are concerned, the terms are identical. However, the terms task and user have to be clearly distinguished. At least one task (or the solution of which) will be assigned to every user. However, a user can request several tasks without any problem, and similarly, several tasks can be assigned to no user at all, since they simply co-operate with a technical process. Thus, a process computer is originally a multi-tasking computer which can be used as a single- user computer as well as a multi-user computer.

For the sake of simplicity the following explanation of administrative mechanisms will begin with the structure of a multi-tasking administration, then moving on to the co-operation between computer and technical processes. In practice both mechanisms are intertwined.


4.2 PROCESS CHANGE

If only one processor or core is available and the operating system is multitasking, the CPU has to alternate between processes. Figure 4-9 shows the problem of time consumption within a CPU for each activity in a time-sharing situation. The time taken to switch can be between ns and µs. The address range of the old process may have to be saved and updated by complex memory management.

Fig. 4-9 Switching between processes by the operating system (OS)

The operating system maintains several queues; one is for the jobs that are waiting for joining in the distribution processes and one for jobs which are waiting for next allocation of CPU. The others are for the devices: I/O, printer, storage etc. Fig. 4-10 shows an example:

Fig. 4-10: Queues in the operating system

If a process switch is triggered by, for example, a timer, the context switch is taken over by a program, the so called dispatcher, in the operating system. Fig. 4-11 shows a possible procedure. New context will be updated from the corresponding stack region.


Fig. 4-11: Context save for process change

The processing performed by the dispatcher command sequence is shown in Fig. 4-12. RET instruction causes the loading of CS: IP (80x86) with the return address from the stack.

Fig. 4-12: New context update


The situation in the queue is shown in Fig. 4-13; a) before, b) after the process change.

Fig. 4-13: The queue of “ready” processes a) before, b) after the process change

The position in the queue determines (in the example) the priority; this is recorded in the PCB. This means that there must be a (linked) list of PCBs in the operating system. The register contents are then saved by the dispatcher in each PCB. The address of the PCB is passed to the dispatcher when the scheduler is called. The SP can be used again for the rapid exchange.


4.3 SCHEDULING

When a computer is designed for multiprogramming, it frequently has multiple processes or threads that need the CPU resources at the same time. In the scheduling process, the operating system decides for the upcoming or already queued processes which one is next to be handed to the hardware. In terms of planning we distinguish between:

a) Long-term planning (in Batch Systems, job scheduling)

Long-term planning organizes the multi-program behavior; it starts when a process comes to an end.

b) Short-term planning

Short-term planning is active for about 100 µs and should take the shortest possible time.

c) Dispatcher

The dispatcher performs the context switch as an immediate action; this takes about 1 ns to 1 µs. Longer periods result from complex storage management systems.

A program alternates between CPU cycles that occur in bursts and individual I/O instructions; the I/O itself does not require any activity of the CPU.

4.3.1 SCHEDULER

Depending on certain events, the scheduler selects the next process from the queue of ready processes in the operating system. This is done by the short-term scheduler. The data in the queue generally consists of the PCBs of the processes.

Scheduling decisions can be made:

- In the transition of a process from the active to the waiting state (I/O),

- During the transition of a process from active to ready state (Interrupt),

- During the transition of a process from the waiting to the ready state (I/O end),

- Upon termination of a process.

If a process has to wait for an I/O input, which may take several cycles, overall efficiency improves if the process is temporarily cleared from the CPU and another process is assigned. If the scheduler already knows that a process will need another I/O input, it can plan ahead. Usually the scheduler cannot know which actions a process will take until it is processed, but the process itself can change variables at run time, such as its own priority. With this knowledge we look at the two kinds of scheduling algorithms, preemptive and non-preemptive. The distinction between them is:

With non-preemptive scheduling, a process is picked and started, and it will run until it blocks (either on I/O or in a waiting state [example: MS-Windows]) or voluntarily releases the CPU. The process is not forcibly suspended before a clock interrupt occurs, meaning that the time the process has been given has run out. During the clock interrupt no scheduling decisions are made, but if there is now a process in the queue with a higher priority, the old process will be replaced by the new one.

With preemptive scheduling, once a process is started it runs for a fixed amount of time. If the process is still running at the end of that time, it is suspended and the scheduler picks another one. Preemptive scheduling requires a clock interrupt at the end of every time interval to give control of the CPU back to the scheduler.

There are problems associated, requiring additional effort:

• Data consistency: two processes use the same data set

• Calls to the scheduler while the operating system is working on a system call: possibly the scheduler replaces the process that caused the invocation. Some operating systems, e.g. many versions of UNIX, first finish the system call or the I/O block before performing the context switch. Real-time processing is then not possible.

Planning Criteria

The scheduler needs to be planned with a respect to the following criteria:

• Use/efficiency of CPU-Utilization (40 % - 90 %)

• Throughput (completed processes / time unit)

• Cycle time (time from input to output of the results)

• Waiting time (time spent waiting in the CPU queue)

• Response time (In interactive systems: Time to reaction)

4.3.2 SCHEDULING ALGORITHMS

A scheduling algorithm is a set of rules that determine which process is to be run at a particular time or in a particular period of time. If there are several processors, the scheduling algorithm also determines the distribution of processes to the processors.

4.3.2.1 Requirements to a Scheduling Algorithm

In order to fulfill the planning criteria shown in 4.3.1 to a full, or at least partial, extent, some requirements must be met.

• Fairness: each process needs a fair share of the CPU. This is really important, because comparable processes should get comparable service. Of course different categories of processes may be treated differently, but in the end the importance and time consumption should always be respected.

• Good utilization of resources, keep all parts of the system busy.

• The algorithm must be executed efficiently.


4.3.2.2 Classification of scheduling algorithms

There exist several different algorithms to manage scheduling. In order to classify them, the following distinctions are made:

1. Static (a priori) Scheduling Algorithms

The scheduling decisions of static algorithms take place at fixed time intervals, and the scheduling is planned before the programs run. The input of the scheduling algorithm is a set of processes to be considered for scheduling; all processes that arrive at the scheduler during the runtime of a process are stored for the next cycle. The scheduled processes run until all of them are done and/or until the given time is over.

Conditions for static algorithms: no dynamic process creation during the program run. Event-driven processes can be incorporated only if their time conditions are schedulable.

2. Dynamic Scheduling Algorithms

The coordination process takes place while the processes are running. Time points of the update are:

o at fixed time intervals,

o as soon as a new process is created,

o as soon as a process ends.

Advantages and disadvantages of dynamic scheduling: event-driven processes can be coordinated, but the decision process during a scheduler run costs time. Due to the inefficiency of exact algorithms, heuristics are required.

3. Algorithms for more Processors

In contrast to the scheduling of a single processor, for multiple processors there are in general only NP-complete algorithms, so that one must work with heuristics.

4. Algorithms for complex Process models

Complex process models are understood as processes with one or more of the following properties:

o Interruptible or non-interruptible (preemptive, non-preemptive) processes, with the further distinction between processes interruptible anywhere or only at certain points

o Cyclic, non-cyclic processes

o Switching to another processor


4.3.3 ANALYSIS OF SCHEDULING ALGORITHMS

This chapter takes a look at several scheduling algorithms and describes how they work and what advantages and disadvantages they might have. For most algorithms it is difficult to perform optimally due to:

- Complexity

- The execution time of a task might be unknown in advance

To visualize the function of an algorithm several types of diagrams will be introduced to give a clear picture of the working order and time consumption of each process.

4.3.3.1 Gantt-Diagram

The Gantt-Diagram shows the task order of a CPU depending on the time a process arrives at the scheduler. It is assumed that each process needs a given amount of time that is either known or can be estimated; in reality such times are often not known to the algorithm in advance. If more than one processor is available, the y-axis specifies the processor that is used for the allocated tasks.

Fig. 4-14: Gantt-Diagram of a 2 processor system with 5 tasks

Example:

Two processors I, II are available and five processes have to be handled. The scheduling algorithm specifies the task sequence depending on the priority of a task. A running process will be interrupted if another process with higher priority arrives. Should a CPU be unoccupied when a higher-priority task needs to be scheduled, running tasks will not be interrupted; instead the free processor will be assigned to that task. Priorities for the example tasks P1-P5 are:

P3 > P2

P4, P5 > P1

P1 > P2

Process arrivals and executions

- P1 and P2 arrive at the same time (t=0), P3 at t=1, P4 at t=5, P5 at t=6



Priority of P1 is higher, but both processors are available:

=> Assign P1 -> Processor I

=> Assign P2 -> Processor II

- after that: P3, then P4 arrive

Priority P1 > P2

Priority P3 > P2

P2 will be interrupted by P3 (on Processor II)

P2 continues (on Processor I) after P1 finished (P4 not yet arrived)

By arrival of P4

Priority P4 > P1 > P2

Priority P3 > P2

P2 will again be interrupted, now by P4 (on Processor I)

- While P3, P4 are running; arrival of P5

P4, P5 > P1

P3 > P2

P1 > P2

P3 will be interrupted by P5 (on Processor II)

After P4 finished, P3 will be continued on Processor I

4.3.3.2 Timing Diagram

A timing diagram shows the sequence of processes for (mostly) a single processor. The y-axis is divided according to the number of tasks currently running on the CPU, while the x-axis shows the time frame. In the single-CPU case only one task can be executed at any given time (a special case of the Gantt chart). For each additional CPU the number of tasks running simultaneously increases by one.

Fig. 4-15: Timing-Diagram of a 1 processor system with 3 tasks


4.3.3.3 Example of Planning Algorithms

The following is a list of several commonly known and used scheduling algorithms.

First-come, first-served (FCFS); FIFO-WS

The FCFS algorithm does exactly what its name describes. The first process that arrives at the scheduler will be the next in line; there are no interrupts.

Process P1: τ1 = 300 ms Waiting period 0 ms

P2: τ2 = 40 ms Waiting period 300 ms

P3: τ3 = 32 ms Waiting period 340 ms

Gantt Chart:

Fig. 4-16: Gantt-Diagram of a 1 processor system with 3 tasks

The average wait time is: 640 / 3 ≈ 213 ms.

The average wait time is minimal with the order P3, P2, P1: (0 + 32 + 72) / 3 ≈ 35 ms.
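Both averages can be reproduced with a small sketch (Python; the helper name fcfs_waiting_times is our own illustrative choice, the burst times are taken from the example above):

    def fcfs_waiting_times(bursts):
        # Each process waits for the sum of all bursts scheduled before it.
        waits, elapsed = [], 0
        for burst in bursts:
            waits.append(elapsed)
            elapsed += burst
        return waits

    arrival_order = [300, 40, 32]           # P1, P2, P3 (ms)
    shortest_first = sorted(arrival_order)  # P3, P2, P1

    for label, order in [("arrival order", arrival_order), ("P3, P2, P1", shortest_first)]:
        waits = fcfs_waiting_times(order)
        print(label, waits, "average:", round(sum(waits) / len(waits)), "ms")
    # arrival order [0, 300, 340] average: 213 ms
    # P3, P2, P1    [0, 32, 72]   average: 35 ms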

Shortest-Job-first (SJF)

Another example of a self-describing name: when the scheduler has to decide which process runs next, it chooses the job that needs the least amount of time. This provides the optimum in terms of waiting time. For short-term planning the method is not applicable, since the computation time of the next CPU burst is not known. The method is also problematic should a more critical process with a long runtime arrive: it will be stalled until the probably less important but faster processes are completed.

Assuming that the time required for the next burst is known, it is possible to construct the example shown in figure 4-17 (times in ms).


Fig. 4-17: Gantt-Diagram of a 1 processor system with 4 tasks

Priority Scheduling

Priority scheduling chooses the process with the highest priority as the next in line. Should another process with a higher priority arrive, an interrupt is sent and the active task is replaced by the new one.

The problem is that low-priority processes will wait a long time should high-priority, long-running processes occupy the CPU. A common solution is to gradually increase the priority of the waiting tasks.
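A minimal sketch of preemptive priority selection (Python; here a smaller number means a higher priority, and all names and values are illustrative only):

    import heapq

    ready = []  # heap of (priority, name); the smallest priority value wins

    def arrive(priority, name):
        heapq.heappush(ready, (priority, name))

    def pick(running):
        # Preempt the running process if a higher-priority one is waiting.
        if ready and (running is None or ready[0][0] < running[0]):
            if running is not None:
                heapq.heappush(ready, running)  # preempted process goes back
            return heapq.heappop(ready)
        return running

    arrive(2, "P2"); arrive(1, "P1")
    running = pick(None)      # -> (1, 'P1')
    arrive(0, "P3")
    running = pick(running)   # P3 preempts P1 -> (0, 'P3')
    print(running)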

Round-robin Scheduler (RR)

A round-robin algorithm switches tasks after a fixed interval of time. New processes are added to this list and completed ones are removed. This guarantees that each process receives a fair amount of CPU time.

The disadvantages of this method are that with many active tasks a lot of time is wasted on context switches, and that the completion of a long-running process is delayed further as the number of active tasks increases.
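The core of a round-robin scheduler fits in a few lines. A minimal sketch (Python; the quantum and burst values are illustrative, and the context-switch overhead mentioned above is deliberately not modeled):

    from collections import deque

    def round_robin(bursts, quantum):
        # (name, remaining time) pairs cycle through a FIFO ready queue.
        queue = deque(bursts)
        t, finish_times = 0, {}
        while queue:
            name, remaining = queue.popleft()
            run = min(quantum, remaining)
            t += run
            if remaining > run:
                queue.append((name, remaining - run))  # back of the queue
            else:
                finish_times[name] = t
        return finish_times

    print(round_robin([("P1", 300), ("P2", 40), ("P3", 32)], quantum=20))
    # {'P2': 100, 'P3': 112, 'P1': 372} -- short jobs finish early,
    # the long job P1 finishes last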

Combination: Several allocation levels

A combination could be time sharing as a background job (RR) plus interrupts on demand based on priorities.

Processes differ according to priorities. Within a priority level, the allocation is made according to RR.

After an appropriate waiting time, processes can move up in priority. Example: aging.
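A sketch of aging (Python; here a lower number means a higher priority, and the boost value is an arbitrary assumption):

    def age(ready, boost=1):
        # Every scheduling round, each waiting process moves up in priority.
        return [(priority - boost, name) for priority, name in ready]

    ready = [(5, "P_low"), (1, "P_high")]
    for _ in range(4):
        ready = age(ready)
    print(ready)  # P_low has moved up from priority 5 to 1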


Part II - COMPUTER NETWORKS

5. INTRODUCTION TO COMPUTER NETWORKS

A computer network by definition is a group of computing devices linked together through communication channels and devices in order to communicate or share resources with each other. The processes behind a modern LAN or even the internet are quite complex. A lot of protocols, converter circuits and cables are involved to ensure that messages are delivered fast and preferably to the right receiver. This part of the lecture gives an overview of several models and schematics that will help to understand the principles and demands of modern network technology.

Nowadays there are several different types of networks. One of the classical types that is still heavily used is the LAN (Local Area Network), which comes in two variants, wired and wireless. The internet can be imagined as a worldwide version of a LAN with different types of routing systems and infrastructure (telephone cables, TV cables, power lines). Another growing field of technology in computer networks is the mobile network. An increasing demand for wireless communication apart from a centralized WLAN router has led to faster and more accessible generations of mobile internet technologies.

This chapter will introduce different models that served as a blueprint for the actual creation of network technology. Next, information coding in terms of frames and data transmission will be discussed, followed by the aspect of code efficiency. Finally, examples of transmission protocols will be introduced and analyzed.

Fig. 5-1: Model of a computer network


5.1 ISO/OSI REFERENCE MODEL

The architecture of a network can be cut into several layers of hardware and software (mostly protocols). Every layer fulfills a certain purpose, like translating and transmitting data. To illustrate the layers of a network, the analogy in figure 5-2 demonstrates a common problem. Manager X in Germany wants to send a letter to manager Y in China. Since neither of them knows the address and language of the other, the letter is given to a secretary who knows where to send it. Then the letter is translated into English and the official signatures are added. A postal service delivers the letter to its destination, where another interpreter translates the English letter into Chinese.

Fig. 5-2: Layered-communication analogy (figure labels: Manager X wants to communicate with Manager Y; the secretary knows address and department; English is taken as a commonly understood language; an official signature is added; the international post ZIP code is added)

The important point of the analogy is that each layer does not need to understand how the other layers work or what they do. The managers are not interested in the addresses, the post ZIP codes or how difficult the language translation is. The secretaries are not interested in how the letter finds its way to China and the correct building. The postal transport system should not and does not need to know the content of the letter. This system has the advantage that each layer can be changed, either because technical inventions or standards demand it or because another system needs other protocols to receive data, without changing the entire system. Figure 5-3 illustrates which parts of the real network model correspond to which specific part of the analogy. This should help to comprehend the tasks of every layer.


Fig. 5-3: Comparison between analogy and real model

OSI-reference model

The OSI reference model was published in 1984 by the International Organization for Standardization (ISO); OSI stands for Open Systems Interconnection. This was the first step to standardize the various communication protocols and structures. While the protocols themselves are rarely used today, the model is broadly phrased and still valid. The model is based on:

• Knowledge of technology

• Human communication

• The goal to link different systems in a transparent way

• The goal to cover all fields from application to technical media

Figure 5-4 shows a model that describes the service-protocol relationship. Processes on different systems at the same level of communication have to use the same protocols in order to translate data from a lower to a higher level within the hierarchy. This principle is also utilized in the OSI reference model (ref. Fig. 5-5), which names each layer according to its general function.


Fig. 5-4: Abstract model of communication levels

Fig. 5-5: OSI-reference model

From top to bottom the figure pictures a structure similar to the layered operating system from Tanenbaum (ref. chapter 1.1). The closer the message that needs to be delivered gets to the bottom of the model, the closer it gets to the hardware, i.e. to pure voltage signals. All layers will be discussed within the next sub-chapters.


5.1.1 PHYSICAL LAYER

Fig. 5-6 Examples of physical layer

Layer 1 deals primarily with signaling and wiring standards. For signaling, a standard typically specifies things such as the voltages used to signal a binary digit or special information such as the beginning or end of a data frame. For wiring, a standard typically specifies attributes like the shape of the connectors at the end of the wires and the electrical properties of the wire. In the case of the standard for 10BASE-5 Ethernet, the wiring standard not only specifies a one-half-inch diameter for the coaxial cable, but suggests that the outer insulation be orange. Furthermore, it suggests that it be marked every few meters where taps may be placed.

Repeaters are the most interesting category of networking devices that operate exclusively at layer 1. They are capable of receiving a somewhat distorted analogue signal for a bit and transmitting a cleaner analogue signal for the bit. Repeaters permit the wiring to span greater lengths than would otherwise be possible, but also contribute a small, non-negligible delay to the signal that may contribute to other problems. For example, according to the standard a coaxial 10BASE-2 Ethernet segment may not extend for more than about 200 meters (strictly, 185 m). Inserting a repeater in the middle of a 300 meter cable brings a non-conforming and possibly non-functional cable back into conformance with the standard. The standard also specifies that the signal must propagate from any device to any other device in less than 5 microseconds. A repeater adds a delay on the order of hundreds of nanoseconds and each 200 meter cable has a propagation time of about a microsecond or so. Hence, it is typically recommended that one never place more than 3 repeaters between any two devices in such an Ethernet.

Another category of networking devices that operate at layer 1 are media converters. Many actual networks are built using a variety of types of wiring and may include a mixture of optical signal carriers and electrical media (generically called "wiring" even when one may include glass fiber in the set of things one is talking about). In order to connect different types of media to each other, one uses a media converter which performs the same basic function as a repeater: it takes a signal from one medium and converts it, bit by bit, into a signal for another medium.


5.1.2 DATA LINK LAYER

Fig. 5-2 Examples of data link layer devices

Layer 2 deals primarily with frames and packets. Layer 1 exclusively deals with hardware, but layer 2 deals with a combination of hardware and software. Layer 1 is essentially for the media that connect networking devices whereas layer 2 deals with the circuitry inside the networking device as well as the software that may control it.

A frame is a group of bits travelling across the physical connection. A frame may be referred to as a cell in a system where the grouping of bits is of small, fixed size. A frame may contain information at its beginning, sometimes called a preamble that is not relevant to the software controlling the interface. A frame may also contain information at its end, sometimes called a trailer that is likewise not important to the software part of the interface. For example, an Ethernet frame trailer contains a checksum of the contents of the frame to verify that it was not garbled in transmission.

A packet is a frame with the preamble and trailer (if either or both exist in the particular system) removed. A packet is the part of the frame that is passed to the software part of an interface. Since a packet and a frame contain essentially the same information, the terms are often used interchangeably.

Layer 2 devices deal with packets as a whole and include things like bridges and switches which can filter and forward packets from one group of wires to another. This process is fairly simple and can be accomplished without using a sophisticated general-purpose CPU. Typically, such devices are controlled with special purpose CPUs and firmware or they may be constructed out of very large scale integrated (VLSI) application specific integrated circuits (ASICs). The VLSI/ASIC approach is generally much cheaper when mass produced. In the past for example, the Ball State Computer Science Department purchased an Ethernet switch for under $2,000 with a VLSI/ASIC design instead of a functionally similar software controlled switch that would have cost about $15,000. For very high speed devices such as an ATM switch, the VLSI/ASIC approach is the only one currently practical.


Stand-alone components like hubs and switches also include the functionality of layer 1. Thus these components can also act as a repeater or media converter, e.g. a 10/100Base-T Ethernet dual-speed hub/switch.

5.1.3 NETWORK LAYER

Fig. 5-3 Example Router

Layer 3 deals with the delivery of datagrams in a media-independent manner. Datagrams are a group of data that travels as a single package from a sending computer's operating system to a receiving computer's operating system. Standards must specify how datagrams are to be inserted into frames for transport across a communication link.

Datagrams also contain addressing information. A layer 3 protocol will assign an address to each layer 3 device. The receiver's address needs to be attached to the datagram and the sender's address is also typically required to be present. The layer 3 addresses may or may not be related to any layer 2 address in any direct way. The addressing information is typically contained in a datagram header separate from the message data being delivered from one computer to another.

Sending and receiving computers might not be directly connected by a communication link. In that case, they depend on the services of a layer 3 device called a router. A router is a device which connects to more than one network and offers its services to computers on those networks as a mechanism to forward datagrams from one network to another network. The joining of two or more networks in this way is called internetworking and the networks formed in this way are called internets.


5.1.4 TRANSPORT LAYER

Layer 4 deals with communication between programs on computers, as opposed to layer 3 which deals with communication between operating systems. Some layer 4 protocols merely add program identification information to the information sent by the layer 3 protocol so the message data can be delivered to the destination program by the operating system on the receiving computer. Other layer 4 protocols are much more elaborate and deal with pathological conditions such as layer 3 datagrams being lost or delivered out of order.

Typically the transport layer is the boundary between system services and the application programs that use them. Higher layers are not typically built in to the computer's operating system and depend on separate software libraries.

Another category of networking devices that operate at layer 4 are firewalls. These devices keep track of datagrams which belong to the same connection and/or service. They can mangle datagrams to hide other hosts and they can block datagrams to protect hosts from forbidden traffic. They also act as routers, but mostly just between an inner/private and an outer/public network.

5.1.5 SESSION LAYER

Layer 5 deals with establishing and maintaining a context for a sequence of messages delivered by layer 4. A network session is directly analogous to a terminal session where a user logs in, sends keystrokes and receives text characters, and then logs out. The session layer is responsible for maintaining the context in which layer 4 data is interpreted, just as a login session maintains a context for the incoming keystrokes. In establishing this context, the session layer may need to verify the identity of the party at the other end of the communication path. The process of establishing this identity is called authentication, since it is intended to demonstrate that the data is authentic. Authentication is separate from, but often related to authorization, which is the process of determining if some action is allowed.

Since the session layer is typically not part of the package delivered with an operating system, it is somewhat less standardized and well developed. In many cases, it is a null layer; that is, no attempt is made to provide the services this layer should provide. Only in the last few years have serious standards been proposed and adopted in a widespread manner.

5.1.6 PRESENTATION LAYER

Layer 6 is the one charged with interpreting the meaning of the bits sent from one program to another via encoding standards. Different machines may represent numbers and characters with different bit patterns. IBM mainframes often use the EBCDIC encoding of characters, whereas most other modern systems use the ASCII encoding. Many machines use the 2's complement representation of signed integers, but may differ in the number of bits used to represent them. Even machines that use a 32-bit IEEE representation of numbers may differ in the order in which the 8-bit bytes are stored internally.

More significant problems result when one considers how to transmit data types that are not built in to typical hardware, such as something as simple as a date or time. At one extreme, all data can be converted into a text string by the sender and parsed or interpreted by the receiver. This wastes bandwidth and CPU cycles. At the other extreme, all data can be transmitted in some standardized binary format that may or may not be equivalent to the internal format used by the parties to the communication. Either case leads to format conversions that may not be precise and may consume a fair number of CPU cycles.

During the early development of networking standards, processing the presentation protocol was often slower than the transmission of data across a local area network. Hence, presentation protocols were often ignored, since they were only required if the communicating systems were actually different. Since then, however, CPU speed has increased faster than network speed, so it is now quite practical to incorporate non-trivial presentation protocols.

5.1.7 APPLICATION LAYER

The top layer, layer 7, builds on the lower layers to bring their information to the user in an understandable way. One must be careful, however, to distinguish between application protocols and applications. Often, the application that uses a given protocol has the same name as the protocol. For example, the "File Transfer Protocol," FTP, is often employed by a program named "ftp." The confusion between application layer protocols and applications has lessened with the advent of the World Wide Web. A typical Web browser will employ several application layer protocols such as HTTP, FTP, POP, and SMTP.


5.2 INFORMATION CODING

Information on a computer system is always coded in some way. Complex files are coded as hex numbers, which in turn are coded as binary ones and zeros to be stored on a memory system. In order to communicate within a network, those binary numbers have to be transmitted wirelessly or via cable, and the receiver has to distinguish between a logical 1 and 0. In short, coding is needed for the representation of data.

• Physical level - communication (signals)

• Logical level (data link) - digits

At the physical level of data encoding it has to be ensured that both sender and receiver know what kind of modulation is used for the transport. Examples of electrical encoding are shown in figure 5-4.

Electrical coding

Fig. 5-4 Example encoding

1. Amplitude modulation: the best-known modulation technique. A voltage higher than a certain threshold is considered a logical '1', everything below a logical '0'. Instead of the raw value it is also possible to check the absolute value of the amplitude.

2. Frequency modulation: the frequency of the signal within a certain time frame is evaluated. A frequency below a certain threshold is considered a logical '0', everything above a logical '1'. While not as easy to decode as amplitude modulation, frequency modulation allows a better dynamic range within the transmission and a longer transmission range.

3. Phase modulation: this modulation changes the phase of the carrier signal depending on the encoding technique. For example, if both sender and receiver know that a signal is modulated with an 8-PSK (Phase Shift Keying) method, 3 bits can be transmitted with one change of the phase. This modulation achieves the highest bandwidth efficiency.


Frames and Blocks:

Frames and blocks group bits for transmission according to special procedures. They are used in general communication interfaces, parallel (e.g. printers) or serial (peripherals including modems). The transmission type must be identical on both systems. Known types are shown in the following figure.

Fig. 5-4 Transmission methods

1. Asynchronous transmission: the transmission between both systems starts with a specific code word (like 5 ones in a row) and ends with the same code word. Usually a decoding signal is sent as well, so that the receiver knows what signal frequency is used. This is a simple way of transmitting data, since both systems only need to know that there is an asynchronous connection. The downside is that the decoding needs more resources in order to understand the signal correctly.

2. Synchronous transmission: in a synchronous transmission a clock signal is sent between both systems. This conveys the clock frequency of the sender to the receiver and makes the decoding very simple. The downside here is the transmission of an additional signal.

3. Bit oriented: a bit-oriented transfer method sends bits structured in frames. A certain pattern of bits in a frame indicates its purpose, like 'Flag', 'Control' and so on. This is a more complex/structured version of the synchronous or asynchronous transmission.

4. Byte oriented: an upgraded version of the bit-oriented transmission. The bits within a byte are decoded into symbols and characters that indicate which purpose a byte serves. This enables more complex transmission operations.


5.2.1 FRAMES AND DATA TRANSMISSION

A closer look at the more complex message methods is necessary to understand their advantages and disadvantages. Since they are usually compatible with synchronous and asynchronous transmission systems, those two will not be investigated further. A message frame consists of:

• message

• protocol

As stated before, there are several different transmission types:

Bit oriented transmission

Byte oriented transmission

Packet oriented transmission

5.2.1.1 Bit Oriented Transmission

When a connection is physically established, a message protocol has to be implemented to interpret the message.

For example: HDLC: High-Level Data Link Control

Line protocol: special symbols -> flags, e.g. the bit pattern 01111110.

The flag indicates the start or stop of a specific message.

The benefits of a bit oriented transmission are:

- Code independence through fixed fields

- Handshaking by frames (acknowledge)

- Error detection via FCS (frame check sequence, 2 bytes)

- Can be encapsulated and transported over the layer

Fig. 5-6 Problem: distinguish between message and flag


A problem occurs when having a fixed bit sequence as a flag: what happens if part of the payload has the same pattern as the flag signal? In order to guarantee transparency, bit stuffing is implemented:

- To avoid long sequences of constant '1', a '0' is stuffed after each run of five '1' bits

- To avoid misinterpretation of flags

The receiver removes the added '0' because it knows by protocol that after each fifth '1' a '0' was inserted. The following diagram describes this in detail:

Fig. 5-6 Flowchart of the bit stuffing
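The flowchart translates almost directly into code. A minimal sketch of bit stuffing and destuffing (Python, operating on '0'/'1' strings for readability):

    def stuff(bits):
        out, run = [], 0
        for b in bits:
            out.append(b)
            run = run + 1 if b == "1" else 0
            if run == 5:
                out.append("0")  # stuffed '0', never part of the payload
                run = 0
        return "".join(out)

    def unstuff(bits):
        out, run, i = [], 0, 0
        while i < len(bits):
            out.append(bits[i])
            run = run + 1 if bits[i] == "1" else 0
            if run == 5:
                i += 1   # skip the stuffed '0' that must follow five '1's
                run = 0
            i += 1
        return "".join(out)

    data = "0111111"                    # contains a flag-like run of ones
    assert unstuff(stuff(data)) == data
    print(stuff(data))                  # '01111101'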


5.2.1.2 Symbol (Byte) oriented Transmission

In a byte oriented transmission the interpretation of each byte is very important. Usually there is a strict convention for how the header is structured, to make it easier for the receiver to decode the message. For example: BSC: Binary Synchronous Communication

STX TEXT ETX BCC BCC SYN

SOH: Start of Header

STX: Start of Text

ETX: End of Text

ETB: End of Block

EOT: End of Transmission

ENQ: Enquiry

ACK: Acknowledge

SYN: Synchronous idle

DLE: Data Link Escape (extension of the control character set)

ESC: Escape (extension of the control character set)

The communication protocol is a mix of message and control characters, so it is not transparent (control characters might be misunderstood/misinterpreted).

There are three ways to avoid misinterpretation:

1) Doubling

Fig. 5-7 Doubling: Inserting an ETX after each ETX

2) Using ESC symbol: next X bytes are controlled (by definition of protocol).

3) Creating a new start/end of transparent text: DLE STX ... DLE ETX

Fig. 5-8 Adding a DLE to the STX and EOT bytes
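A minimal sketch of transparent byte-oriented framing (Python), combining the doubling idea of method (1) with the DLE STX / DLE ETX markers of method (3); the byte values are the usual ASCII control codes:

    DLE, STX, ETX = b"\x10", b"\x02", b"\x03"

    def frame(payload: bytes) -> bytes:
        # Double every DLE inside the payload, then wrap in DLE STX ... DLE ETX.
        return DLE + STX + payload.replace(DLE, DLE + DLE) + DLE + ETX

    def unframe(framed: bytes) -> bytes:
        body = framed[2:-2]                 # strip DLE STX and DLE ETX
        return body.replace(DLE + DLE, DLE)

    message = b"data\x10more"               # payload containing a DLE byte
    assert unframe(frame(message)) == message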


Figure 5-9 shows the Protocol for multiple frame transmission:

Fig. 5-9 Multiple frame transmission

5.2.1.3 Packet Oriented Transmission

In wide area networks, sometimes many frames are transmitted in one packet. Data packets can be transported on the network level.

Flag Address Control Information FCS Flag

Fig. 5-10 Packet

A packet can be delivered in several frames ⇒ nodes are receivers and distributors. E.g. DATEX-P (protocol: X.25). Highest level (packet level): start/stop; node address (computer); control function; safety.

Lower level (Frame Level): Message for logical address; Function identifier.

Special feature: each node can consist of multiple logical units. Nodes are collectors and distributors of messages, so that an optimum utilization of bandwidth is achieved.

Packet header fields: logical channel number | type | message info


5.3 CODE EFFICIENCY

5.3.1 STATIC CODE EFFICIENCY

The efficiency of a data transmission depends on two variables: the length of the data frame and how much payload can be transmitted with it. Usually a data frame should not be too big, because if an error occurs during transmission the whole frame will be dropped. That means the code efficiency of a huge frame would be good, but the actual transmission rate would be bad. To get a picture of what static code efficiency means, the following examples are listed:

Asynchronous: 2-3 control bits per 5-8 message bits; loss depends on this ratio.

ASCII (BSC): 1×DLE,STX; 1×DLE,ETX; 1×SYN; 2×BCC → 7 control characters; at 100 characters per frame → 7 % loss.

Higher protocols (HDLC): 2×Flag; address field; control; 2×FCS → 6 control characters; at 100 characters per frame → 6 % loss.

This raises the question of just how big a block should be in order to be 'efficient'. The answer has to take into account that a frame may be transmitted with an error. Static code efficiency can only be applied in an error-free environment; for real applications the dynamic code efficiency has to be calculated.

5.3.2 DYNAMIC CODE EFFICIENCY

The dynamic code efficiency distinguishes two cases. The first case covers transmissions with the potential of an error. The second case adds a handshake protocol, which sends an 'all clear' signal back to the sender of the first message. This handshake can be lost in transmission as well, which increases the chance of a miss.

5.3.2.1 Due to errors on channel

Figure 5-11 shows a graphical example of how the optimal block size can be determined depending on the error rate. The variables are:

m => minimal block size

η_resend => probability of a resend of the frame

b_opt => optimal block size


Fig. 5-11 Example of determining bopt

d_s: digits sent (without frame/protocol information)

d_rc: digits received correctly

1) Ideal case scenario: only payload, no tare and no transmission loss. Payload transmission rate = bandwidth transmission rate:

T_T = T_R

2) Error-free transmission scenario: the message is sent in frames containing additional information (tare) for the transmission. The transmission is error-free, or the errors are easily correctable.

Fig. 5-12 Message with tare (t) and load (m)

The transmission rate can be calculated as:

T_T = (n / b) · V = ((b - t) / b) · V = (1 - t/b) · V

where b is the block length, t the tare (frame overhead), n = b - t the payload per block and V the raw transmission rate.

The optimal block length can be chosen freely, since no error occurs during transmission. The transmission rate indicates that larger blocks lead to a higher efficiency at transmitting payload.

3) Defective transmission: the message was transmitted and an error occurred that cannot be corrected easily. In this case the whole block is usually discarded and re-transmitted.

Fig. 5-13 Message with tare (t), load (m) and an error

To calculate the optimal block size, the probability of a faulty transmission must be known. With this information the following formula can be solved, with q = rate of faulty transmissions. We differentiate between T_T = 'payload transmission rate' and T_R = 'bandwidth transmission rate':

T_T = [ (1 - t/b) - q · b ] · T_R

To determine the optimal block length, the local maximum has to be found:

∂T_T / ∂b = 0

→ t · b⁻² - q = 0 (with b ≜ b_opt)

→ t · b_opt⁻² = q → b_opt = √(t / q)

Example:

t = 10

q = 10⁻⁵

⇒ b_opt = 10³
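A quick check of the result (Python):

    from math import sqrt

    def optimal_block_size(t, q):
        # b_opt = sqrt(t / q) from the derivation above
        return sqrt(t / q)

    print(optimal_block_size(10, 1e-5))   # 1000.0, i.e. b_opt = 10^3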

If the communication system uses handshakes in its protocols, the overall efficiency of the communication declines. Reasons include the following:

a) a long message has to be cut into pieces which are frames of optimal block length, and then transmitted separately

=>frame information (t) per block

b) Each block must be acknowledged by the receiver separately.

=> acknowledgement signal from receive (per transmission and per block)

Efficiency: η = Σ used characters / (Σ sent characters + Σ acknowledgement characters)


5.4 TRANSMISSION PROTOCOLS

5.4.1 CARRIER SENSE MULTIPLE ACCESS (CSMA) PROTOCOLS

To reduce frame collisions this protocol was introduced, in which sender stations scan the medium before transmitting data. If the medium is free, the station starts a transmission. If the medium is occupied, the station retries its attempt at a later time according to its protocol settings.

The following are the variations of CSMA:

1. 1-persistent CSMA: the station listens to the channel to determine whether it is free or not. If the channel is free, the frame is transmitted. If the channel is busy, the station keeps scanning the medium until the channel is idle, and transmits its frame as soon as this idle state occurs. The station does not back off on its first trial. If there is a collision, the sender waits a random period of time before trying to transmit again.

2. Non-persistent CSMA: the station senses the medium for signal transmission and transmits if the channel is free. When the channel is busy, the station waits a random amount of time before re-starting its sending procedure (beginning with listening to the channel).

3. P-persistent CSMA: the station senses the channel continually to see if it is idle or busy. If the channel is free, the sender transmits with a probability of p.

Problem: if two (or more) stations intend to transmit at the same time, it is possible that they both detect the line to be free and start transmitting at the same time. In this case, the frames will collide in the middle of transmission.

5.4.2 CARRIER SENSE MULTIPLE ACCESS WITH COLLISION DETECTION

The improvement of CSMA is to stop transmitting as soon as a collision is detected, which wastes less time. This protocol with collision detection is called CSMA/CD, which is the most popular MAC protocol to date. CSMA/CD is discussed later in this document when one of the implementations (Ethernet) is explained in detail.

A performance analysis of CSMA and CSMA/CD is rather involved, but the final formula is simple:

ρ = 1 / (1 + 6.44 · α)

where ρ is the system throughput and α is the ratio of propagation time to average packet transmission time. Assuming that τ is the propagation time of the medium (typically taken as 5 μs per kilometer) and T is the average packet transmission time, then:

α = τ / T
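A small numerical example (Python; the medium length and frame size are assumptions chosen for illustration):

    def csma_cd_throughput(tau, T):
        alpha = tau / T
        return 1.0 / (1.0 + 6.44 * alpha)

    tau = 2 * 5e-6       # 2 km medium at 5 microseconds per km
    T   = 1000 / 10e6    # 1000-bit frame on a 10 Mbit/s network -> 100 us
    print(csma_cd_throughput(tau, T))   # ~0.61; throughput falls as alpha grows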


The throughput of a CSMA/CD based network is inversely related to the value of α: the smaller the value of α, the more efficient the network. With today's high-speed network technology, especially with long-haul fiber optics networks and small frame sizes (such as ATM, which has a frame size of 53 bytes), CSMA/CD based technology is not suitable for long distance networks.

Intuitively,

• the shorter the packet is, the more difficult it is for a station to detect whether another station is in the middle of transmission;

• the longer the network medium is, the more difficult it is for a station to detect whether another station is transmitting;

• a station in the middle of the network can detect transmissions more easily than the stations at the ends of the network medium.

5.4.3 STOP-AND-WAIT TRANSMISSION PROTOCOL

The stop-and-wait transmission protocol is a simple method to secure the communication between two stations. Whenever a set of data has to be sent to the other station, the data is organized in packets (or frames) and sent one at a time. The sending station then waits for a reply from the receiving station, which comes as a simple byte 'ACK' for 'Acknowledged'. As soon as the 'ACK' is processed, the next frame in line is sent. If there is an error during the transmission and the receiving station recognizes the error, a 'NAK' (Not Acknowledged) is sent to the sending station, resulting in a retransmission. If there is no response from the receiving station after a certain amount of time (timeout), the frame will be resent. This method provides a simple and safe transmission but is rarely used because of its lack of speed.

Fig. 5-13 Stop-and-wait transmission protocol
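A toy simulation of the sender side (Python; the loss probability of the channel is an arbitrary assumption, and ACK/NAK handling is collapsed into one retry loop):

    import random

    def stop_and_wait(frames, loss=0.2, seed=1):
        rng = random.Random(seed)
        attempts = 0
        for frame in frames:
            while True:                  # resend until the frame gets through
                attempts += 1
                if rng.random() >= loss: # frame survived the channel -> ACK
                    break
                # lost or damaged -> NAK or timeout -> retransmit
        return attempts

    print(stop_and_wait(["F1", "F2", "F3"]))  # number of transmissions needed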


5.4.4 GO-BACK-N TRANSMISSION PROTOCOL

To improve the speed of a transmission it is common to send several packets in sequential order without waiting for the response from the recipient. To secure a reliable transmission with these attributes, the Go-back-n transmission protocol was devised. This principle is used in the TCP transport protocol, which is the basis for most network communication. With the Go-back-n protocol a set of data is buffered in a window of size n. At the beginning of the transmission the first frame of this window is sent, followed by frame number two and so on. The receiving station, however, is capable of storing only one frame. If a frame arrives at the receiving station without error, an 'ACK' is sent to the sending station and the frame is processed. When the sender receives the 'ACK' message for the first frame in the window, that frame is dropped out of the window and the next frame in queue is added to the end of it. Should there be an error during the transmission, so that the expected frame arrives with an error or doesn't arrive at all, the following things happen:

If a transmission error occurred, the receiving station sends a 'NAK' back to the sender, drops the buffered frame and will not accept further frames until the expected one arrives without error. As soon as the sending station receives the 'NAK' response, it stops transmitting further frames and restarts the transmission at the first frame in the window.

If the sending station doesn't get an 'ACK' after a certain amount of time (timeout error), it stops transmitting further frames and restarts at the first frame of the window.

Fig. 5-14 Go-back-n transmission protocol with timeout error

The Go-back-n protocol offers a fast, simple and reliable method for transmitting data. The workload on the receiving station is very low and it can process data relatively fast. The downside of this protocol is that over bad connections a lot of retransmissions are necessary, meaning that many packets are sent but never accepted until the expected frame arrives correctly. This results in a drop of speed and therefore in a less effective network.
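A sketch of the sender logic (Python; the one-time error positions are given explicitly instead of simulating a channel):

    def go_back_n(frames, n, error_at=frozenset()):
        base, transmissions, attempt = 0, [], {}
        while base < len(frames):
            # send every frame currently inside the window
            for i in range(base, min(base + n, len(frames))):
                transmissions.append(frames[i])
                attempt[i] = attempt.get(i, 0) + 1
            # the receiver accepts in order until the first (one-time) error
            i = base
            while i < min(base + n, len(frames)):
                if i in error_at and attempt[i] == 1:
                    break   # NAK: restart at this frame on the next round
                i += 1
            base = i
        return transmissions

    print(go_back_n(["F0", "F1", "F2", "F3"], n=2, error_at={1}))
    # ['F0', 'F1', 'F1', 'F2', 'F3'] -- F1 is resent after the NAK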


5.4.5 SELECTIVE REPEAT TRANSMISSION PROTOCOL

The Selective Repeat transmission protocol is an enhanced version of the Go-back-n protocol. Here the sending station has a window for frames of size n, and the receiving station also has a window for buffering frames of the same size. Whenever a frame is transmitted, it is buffered in the corresponding location within the receiver's window. When a transmission is started, the sending station sends all frames buffered in the window to the receiving station in sequential order. For each frame the receiving station accepts, an 'ACK' is sent back to the sender. For every 'ACK' the sending station receives, the corresponding frame within the window is marked as 'transmitted'. If the first frame in the window is marked, it is dropped and the next frame in queue is added to the end of the window. Should an error occur during the transmission, two things can happen:

If the frame arrived at the receiving station with an error, a 'NAK' is sent back to the sender. As soon as the sending station receives the 'NAK', the respective frame is retransmitted at the next possible timeslot.

If the frame or the response is lost during transmission (timeout), the respective frame is retransmitted at the next possible timeslot.

Fig. 5-15 Selective repeat transmission protocol with timeout error

This transmission protocol is much more reliable over channels with a high rate of transmission errors, but it is also more demanding on the hardware and the complexity of the software.
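The contrast to Go-back-n can be seen in a few lines (Python; again the error positions are fixed instead of simulated, and each error occurs only once):

    def selective_repeat(frames, error_at=frozenset()):
        transmissions, pending = [], set(range(len(frames)))
        first_round = True
        while pending:
            for i in sorted(pending):
                transmissions.append(frames[i])  # send all unacknowledged frames
            # frames in error_at fail once; everything else is ACKed and buffered
            pending, first_round = (error_at & pending if first_round else set()), False
        return transmissions

    print(selective_repeat(["F0", "F1", "F2", "F3"], error_at={1}))
    # ['F0', 'F1', 'F2', 'F3', 'F1'] -- only the damaged frame F1 is resent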


6 NETWORKS

6.1 PHYSICAL NETWORKS

The structural topology of a network is described by the designs presented in the following sections. The logical and electrical topology may differ from the structural topology. A network can structurally be a star, where all cables meet in a central point; electrically speaking it may nevertheless be a bus system, since the participants are switched in parallel and all signals are received simultaneously. Logically, it can still be a ring network, when the transmitting authority is passed from participant to participant (token ring, see access mechanisms).

A network consists of the connections (transmission media), the participants (computer) and the connection units.

6.2 NETWORK TOPOLOGIES

6.2.1 LINE/BUS

Within a computer there exists a series of networks that allow communication between the hardware components. Without those networks a computer system couldn't operate as efficiently as it does today. Those networks are usually described as 'bus systems'.

The name bus (also "data-highway" or "party-line") is derived from the familiar autobus, which stops at each station to let people get on or off. This means a continuous medium connects the individual stations, which are attached, for example by coaxial cable, without interrupting the medium. A message issued by a station spreads out in both directions and can be detected and read by all other stations. At both ends of the line the message is absorbed in order to prevent reflections.

Fig. 6-1 Bus


Like the ring, the bus has characteristic properties, but in contrast to the ring it has shorter delay times, because the message does not pass through the stations but is tapped from the line via couplers. Also, the message is automatically removed from the medium once it has reached the end. The bus topology is also less vulnerable than a ring, since the failure of a station does not bring down the network. The bus is suitable for real-time applications and hence particularly for operating and regulatory purposes. It is also possible to branch the bus; this branch is called a branch-highway.

One of the most commonly known bus systems within a computer is the data bus. It allows the communication between the working memory and the CPU and has recently received a major update with DDR4 RAM. The data transfer of a DDR4 system uses a 64-bit wide bus with an I/O bus clock of 1200 MHz, leading to a transfer rate of up to 2400 MT/s. This example shows the main parameters the transfer rate depends on. First, there is the width of the bus, which describes how many bits can be transferred between two systems at once. Increasing the width is a difficult task, since it can only be increased by a factor of 2: an improvement of a 64-bit bus would result in a 128-bit bus, meaning that 128 copper wires have to be physically implemented on the circuit board. The second factor that affects the transfer rate is the speed of the bus in clock cycles. This speed depends on the characteristics of the hardware at both ends of the bus and the insulation against influences from outside and between the wires of the bus.
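The peak rate of the DDR4 example can be verified directly (Python):

    bus_width_bytes = 64 // 8       # 64-bit bus = 8 bytes per transfer
    transfers_per_s = 1200e6 * 2    # double data rate: 2 transfers per clock
    peak = bus_width_bytes * transfers_per_s
    print(peak / 1e9, "GB/s")       # 19.2 GB/s for DDR4-2400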

Another form of bus system is the point-to-point connection. Such connections can be realized by synchronous or asynchronous communication and usually support line multiplexing. As shown in figure 6-2, only two participants can open a point-to-point communication, which usually means that no target address is necessary (provided that no routing takes place between the systems).

Fig. 6-2 Point-to-Point

6.2.2 RING

A ring (or loop, daisy chain) can be considered a regular mesh. The data always flow in the same direction and follow the same route, so it is possible to achieve a considerable reduction of the protocols and flow instructions. The stations are connected into the circuit of wires. This leaves the problem that once a station fails (and no special precautions are taken), the whole ring fails. Another problem might occur if a message is issued and the receiver does not accept it (due to an error): the message is then not taken out of the line. In this case control measures have to step in, or the message will endlessly circulate in the ring and thus block the remaining communication.

A token ring network differs from a daisy chain network, although both are considered part of the ring topology. In a token ring, computers are added to the ring but are not necessary for the transportation (ref. Fig 6-3).


Fig. 6-3 Ring-token network

The daisy chain is a system where one computer is connected to the next and previous computer in line and only to those. If a message needs to be transmitted from x to y, the computer sends it to the next in line, which will resend it to the next and so on, until the target destination is reached (ref. Fig 6-4).

Fig. 6-4 Daisy chain

The daisy chain topology can be realized in linear form, which means that each computer can send not only to the next in line, but also to the previous one. A connection from the last to the first system is unnecessary in the linear topology, but each system needs two receivers and two transmitters to provide full communication.

6.2.3 STAR

In a star topology, the central computer is connected to each user station by a separate line. Usually there is a hierarchical relationship between the central and the connected computers. The central node is usually the master computer, while the other computers are the slaves. If the master computer fails, the whole network fails at once. A fitting example of a star network is a modern local area network (10Base-T). The central unit is usually a router or switch that organizes the traffic between the computers. The following benefits have to be considered when planning a star topology network:

- Low installation cost

- Simultaneous transmission of data and voice

- Compatibility with public networks


Fig. 6-5 Star network topology

Fig. 6-6 Star network communication

As shown in figure 6-6, the communication between the central station and each connected system works both ways. This is achieved through hardware that can upload and download data over the same cable.


6.2.4 MESH

The necessity for a meshed topology arises when computer systems are widely scattered with long distances between them. For this reason an irregular topology is needed to keep the cost per data line low. This form of spatial expansion also allows a higher transfer rate if a signal is broken into pieces and transmitted via several lines. Additionally, a meshed structure is more fail-safe than a star network, since a damaged wire does not result in the breakdown of the network or the isolation of a computer system. The most prominent example of a meshed network is the internet.

Fig. 6-7 Mesh network topology

Fig. 6-8 Mesh network communication


The disadvantage of a mesh network is the necessity of a number of intermediate stations and a higher demand for cables/wires to ensure access at all places and more routing choices. That again is an advantage, since more choices for data transfer are a built-in failsafe and a possible speedup for the network.

Logical access

Fixed:

• Polling: cyclic query of all stations

• Selection (of a certain Si)

• Temporary definition of a point-to-point connection (quasi-static view)

Realization: switched; HP bus

Floating: allocation of the primary-station role

• Time slices, similar to multitasking; this leads to a temporarily fixed allocation

• Request by Sj

Realization forms:

• Allocation of control on demand

• Random bus access (Ethernet, LAN)

• Token ring (LAN)

• Arbitration (bus)

Commonalities:

• Higher reliability

• Simple switch-on and switch-off of participants


6.2.5 OTHER TOPOLOGIES

6.2.5.1 Multi-drop

Fig. 6-9 Multi-drop network communication

A multi-drop topology uses a bus structure; however, in the sense of the logical circuit it is considered a star configuration. A master determines who is next in line and turns all other receivers off by separating them physically or logically.

6.2.5.2 Head-end

Fig. 6-10 Head-end network communication

The head-end topology is built from two bus connections, the "up-link" and the "down-link". The up-link and the down-link are connected at the same end and form the so-called "head-end", which can be seen as a repeater. All stations are connected to both the up-link and the down-link. A message transmitted by a station travels on the up-link and, if the line is free, is received (after passing through the repeater) on the down-link by all stations. The physical realization of a head-end topology can be diverse: the up- and down-link can be different media, or even different types of transmission technology (modulation). In modern wide area networks the head-end topology is used in satellite networks; the satellite head-end is called a transponder. In this context the name "one-hop" topology is also used.


6.3 TECHNICAL REALIZATION OF NETWORK TOPOLOGIES

Following are some examples of different topologies and how they are used.

6.3.1 BUS TOPOLOGY

6.3.1.1 Thin Ethernet (10Base2)

Fig. 6-11 Thin Ethernet

6.3.1.2 10Base-T Ethernet (twisted pair ethernet)

Fig. 6-12 Head-end network communication


6.3.2 RING

6.3.2.1 Token Ring

6.3.2.2 FDDI

Fig.: FDDI dual ring: (a) the outer ring is used for data, the inner ring is unused except during a failure; (b) the stations adjacent to a failed station loop back.

6.3.3 STAR: ATM


7 COMPUTER NETWORKS

Computer networks connect computers of different manufacturers with different operating systems. Independent of their technical realization, computer networks can be classified by their geographical size as follows:

Local Area Network (LAN): local network within a building

Metropolitan Area Network (MAN): network that covers a whole city

Wide Area Network (WAN): network that stretches over the whole earth

Figure 7-1 shows a table that describes which technology can be used for what specific connection.

Fig. 7-1 Types of networks for LAN and WAN

The detailed description of each technology follows:

Ethernet: Ethernet is the basic network technology used for Local Area Networks, realized with a switch or router. Communication is connectionless, meaning that each data unit is individually addressed and routed based on the information it carries.

Token Ring: The Token Ring network is a predecessor of Ethernet technology but can be used in the same way. Since its setup is more difficult and the network is more crash-prone than Ethernet, it is rarely used anymore.

FDDI: The Fiber Distributed Data Interface is a standard for an optical fiber data network. The transfer speed is limited to 100 Mbit/s and its topology is structured as a single or double ring. The maximum distance between two stations is limited to 2 km, or 40 km with monomode fiber optic cables. Since FDDI is still structured as a token ring, it is impractical for long-distance communication.


Frame Relay: The Frame Relay network establishes a connection between two communication participants by creating a virtual, direct connection between them. Since the dawn of broadband networks the Frame Relay topology is rarely used to access the Internet, except in areas without a broadband connection. In Europe the GSM network stations are connected to the landline via Frame Relay.

SMDS: The Switched Multi-megabit Data Service is a connectionless service for connecting WANs, MANs and LANs through the telephone network.

ATM: ATM stands for Asynchronous Transfer Mode and is a communication protocol that was designed to handle both high-throughput data traffic and real-time, low-latency content. The protocol is used over the Integrated Services Digital Network and the Public Switched Telephone Network. The transmitted data is organized in cells that are transferred within a specific timeframe over a virtual path or channel.

LocalTalk: LocalTalk describes an outdated network system by Apple that was specified for a two-wire line (RS-422). It implemented a bit-transfer layer (Layer 1 in the OSI model) for the AppleTalk protocol family.


8 ROUTING PROTOCOL - DIJKSTRA’S ALGORITHM

In order to transport datagrams through a WAN, a packet switch can be used. This packet intermediary consists of a computer used solely for sending and receiving packets. The following illustration shows the basic construction of a packet switch.

Fig. 8-1 Packet switch overview

The inputs and outputs on one side lead to and from other packet switches; the receivers and senders of the data are attached on the other side. Mostly, local networks are connected there, which then contain the actual receivers and senders. With the help of packet switches, LANs can be connected to one another. An example of the use of packet switches is shown in the following illustration:

Fig. 8-2 Several connected LANs

The transmission links between the individual packet switches can be based on different technologies: optical fiber, satellite channels, Frame Relay, ATM and so on.


Each WAN technology defines its own frame format for sending and receiving data. Many WANs use a hierarchical addressing scheme in which one address is divided into several parts. The simplest scheme divides the address into two parts: the first part identifies a packet switch and the second a computer connected to that switch.

Fig. 8-3 Addressing in connected switches

In practice the address is represented as a binary value; some of the bits are used for the first part of the address and the remaining bits for the second part.
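
As a small illustration of this bit split (the 16-bit width and the 8/8 division are assumptions chosen for the example, not values from the notes), the two address parts can be packed into and recovered from one binary value:

# Hypothetical 16-bit hierarchical address: the upper 8 bits identify
# the packet switch, the lower 8 bits the computer attached to it.
HOST_BITS = 8

def make_address(switch_id: int, host_id: int) -> int:
    """Pack the two address parts into one binary value."""
    return (switch_id << HOST_BITS) | host_id

def split_address(addr: int) -> tuple[int, int]:
    """Recover (switch_id, host_id) from the packed value."""
    return addr >> HOST_BITS, addr & ((1 << HOST_BITS) - 1)

addr = make_address(5, 3)              # computer 3 on packet switch 5
print(bin(addr), split_address(addr))  # 0b10100000011 (5, 3)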

Each packet switch must select an output path for every packet it forwards. If the packet is destined for a computer that is connected to the switch directly or via an attached LAN, the switch forwards the packet there. If it is destined for a computer attached to some other packet switch, the packet must be sent over one of the connections leading to that switch. The appropriate choice is made by the packet switch with the help of the destination address stored in the packet.

If the packet switch does not hold complete information about how every possible destination can be reached, but only knows the next leg of the route, this is called partial route switching (next-hop forwarding).

Usually a routing table is maintained by the router; with its help the router can determine the output path for the data packets to be forwarded.

If the router forwards the data packets without consulting routing tables, this is called a simple routing process. Such processes can be divided into:

• Random Routing: each node sends a message not destined for itself at random to one of its neighbors. Excluded is the neighbor from which it received the message.

• Flooding: a node X sends a message not destined for itself, received from a node Y, to all its neighbors except the one on the link from which the data packet arrived. So that the resulting “flood” stops at some point, each message carries a counter that is incremented at every hop. A node only forwards a received message when its counter is below a threshold value; the node that initiates the message flood sets the counter to 1. (A small sketch follows below.)
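
A minimal sketch of this flooding rule in Python (the example topology, the threshold value and the breadth-first processing are assumptions made for illustration):

from collections import deque

# Adjacency lists of a small example network (illustrative).
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
THRESHOLD = 3  # counter value at which the flood stops spreading

def flood(origin: str, payload: str) -> None:
    # The initiating node sets the counter to 1, as described above.
    queue = deque([(origin, None, 1)])  # (node, came_from, counter)
    while queue:
        node, came_from, counter = queue.popleft()
        for neighbor in GRAPH[node]:
            if neighbor == came_from:
                continue  # never send back over the incoming link
            print(f"{node} -> {neighbor} (counter={counter}): {payload}")
            if counter < THRESHOLD:  # forward only below the threshold
                queue.append((neighbor, node, counter + 1))

flood("A", "hello")

Note how every packet is duplicated at each node; this is exactly the network burden the next paragraph refers to.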

Since these methods copy the data packets multiple times and send the duplicates into the network, they place a significantly higher load on it. Therefore, routing procedures are preferred that use routing algorithms to transfer the data packets in a targeted way.


Example:

Fig. 8-4 Example of finding a route

The information about the respective route legs is organized in a table, the routing table.

For each destination there is an entry in the table. When forwarding a packet, the switch takes the destination address contained in it, searches the table for a matching entry and then sends the packet along the partial route (hop) specified in the table.

The forwarding of a packet over a given leg does not depend on the source, nor on the packet switches responsible for the individual legs, but only on the destination of the packet. Since no source information is consulted during forwarding, only the destination address has to be extracted from the packet. All destination addresses with the same first part are located at the same packet switch.

Thus, when forwarding a packet, only the first part of the hierarchical address has to be checked by the packet switch. This reduces the computation time needed to forward a packet, and the routing table shrinks to one entry per target packet switch.

When a packet reaches the switch to which the target computer is connected, this switch checks the second part of the address and selects the corresponding computer.
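
The two-stage decision just described might be sketched as follows (the switch IDs, link names and table contents are invented for the example; addresses are (switch_id, host_id) pairs):

# Forwarding at one packet switch with a two-part destination address.
MY_SWITCH_ID = 2
ROUTING_TABLE = {1: "link-to-1", 3: "link-to-3"}  # switch id -> output link
LOCAL_HOSTS = {7: "port-7", 9: "port-9"}          # host id -> local port

def forward(dest: tuple[int, int]) -> str:
    switch_id, host_id = dest
    if switch_id == MY_SWITCH_ID:
        # Final switch: only here is the second address part examined.
        return LOCAL_HOSTS[host_id]
    # Everywhere else only the first address part selects the next hop.
    return ROUTING_TABLE[switch_id]

print(forward((2, 9)))  # delivered locally on port-9
print(forward((3, 4)))  # forwarded over link-to-3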

For the router to work properly, the values contained in the routing table have to ensure the following:

• the routing table contains a next hop for every possible destination (universal routing)

• the hop value in the routing table lies on the shortest path to the destination (optimal routes).


Routing in a WAN can best be thought of in terms of a graph that models the network. Each node in the graph corresponds to a packet switch in the network. When a direct connection exists in the network between two packet switches, the graph contains an edge (link) between the corresponding nodes.

Example:

This results in the following routing table:

Entries in the “next hop” field denote the two endpoints of an edge (u, v) in the graph, leading from node u to node v. Although hierarchical addressing reduces the size of the routing table, since repeated routes to individual computers can be removed, the shortened routing table still contains many entries with the same route. To avoid such duplicate entries, a mechanism called default routing is used: a single entry replaces a set of entries that share the same hop value. When the forwarding mechanism finds no explicit entry for a particular destination, the default route is taken. This results in the following table:

The default route is shown in the table with a “*”. A default entry is only used if more than one destination has the same hop value.
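
A lookup with such a default entry could look like this (the table contents are invented; "*" plays the role of the default entry described above):

# Routing table in which one "*" entry replaces all destinations
# that share the same next hop (contents are illustrative).
ROUTING_TABLE = {
    1: "(2,3)",    # explicit entry: forward over edge (2,3)
    2: "local",    # packets for this switch are delivered locally
    "*": "(2,5)",  # default: every other destination goes via (2,5)
}

def next_hop(dest_switch: int) -> str:
    # Fall back to the default route if no explicit entry exists.
    return ROUTING_TABLE.get(dest_switch, ROUTING_TABLE["*"])

print(next_hop(1))  # (2,3): explicit entry
print(next_hop(4))  # (2,5): default route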


For the calculation of the routing table, there are two approaches:

• static routing: the calculated routes do not change during operation

• dynamic routing: the routing table is updated during operation

Static or non-adaptive routing does not base its routing decisions on measurements or estimates of the current network traffic; the routes are calculated offline and loaded when the router boots. There is no updating during operation.

The advantage of static routing is the simplicity and low overhead of the procedure. The disadvantage is that it cannot adapt: problems occurring during the forwarding of packets (e.g. failure of individual transmission links) cannot be corrected. Therefore, dynamic routing is used in large networks instead. Adaptive methods differ in how they obtain their routing information:

- source of the information: local, from neighboring routers, from all routers

- update timing: every x seconds, on changes in traffic or topology

- calculation method, i.e. the metric used to optimize the route: distance, number of hops, estimated transmission time

Shortest Path Routing

- creation of a graph of the subnet: each node corresponds to a router and each arc to a transmission path

- determination of the shortest path

Criteria for the determination:

- path length: the number of transmission links lying between the routers

- geographical (physical) path length

- average queueing time

- transmission delay

- labeling the arcs of the graph as a function of:

- distance

- bandwidth

In dynamic or adaptive routing, changes in topology or network traffic are taken into account when selecting the transmission routes. The information about the routes is updated at regular intervals.


In dynamic or adaptive routing, the following points regarding changes in the network topology or traffic are considered when selecting the data transmission routes:

- Average traffic

- Transmission cost

- Average queue length

- Measured delay

- …

With the aid of such a weighting function, the shortest path is then determined by means of various algorithms (e.g. Dijkstra's).
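
A compact version of Dijkstra's algorithm over such a weighted graph (the example network and its weights are invented; the weights could stand for any of the metrics listed above):

import heapq

def dijkstra(graph: dict, source: str):
    """Shortest-path distances and predecessors from `source`;
    `graph` maps each node to {neighbor: edge_weight}."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, already improved
        for v, w in graph[u].items():
            nd = d + w  # relax edge (u, v)
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

# Example network; an edge weight could be delay, distance or cost.
net = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
dist, prev = dijkstra(net, "A")
print(dist)  # {'A': 0, 'B': 2, 'C': 3, 'D': 4}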

Flow-based Routing

With this method, the traffic volume on the transmission routes is recorded. The average data flow is relatively stable and predictable, so the average packet delay can be calculated using queueing theory (a small numerical sketch follows after the list below).

The following information about the network has to be available:

- Topology of the network

- Traffic matrix Fij

- Line capacity matrix Cij, the capacity of each line in bit/s

- (Possibly provisional) routing algorithm
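
Under the common M/M/1 queueing assumption, the mean packet delay on line i is Ti = 1/(μ·Ci − λi), where 1/μ is the mean packet length in bits, Ci the line capacity in bit/s and λi the flow in packets per second; the network-wide mean weights every line by its share of the total flow. A small numerical sketch with invented values:

# Flow-based routing: mean delay per line under an M/M/1 assumption,
# T_i = 1 / (mu * C_i - lambda_i). All numbers below are invented.
MEAN_PACKET_BITS = 800  # 1/mu: mean packet length in bits
LINES = [               # (capacity C_i in bit/s, flow lambda_i in packets/s)
    (64_000, 40),
    (64_000, 20),
    (128_000, 60),
]

mu = 1 / MEAN_PACKET_BITS
delays = [1 / (mu * c - lam) for c, lam in LINES]
total_flow = sum(lam for _, lam in LINES)
# Network-wide mean delay: each line weighted by its share of the flow.
mean_delay = sum(lam * t for (_, lam), t in zip(LINES, delays)) / total_flow
print([f"{t * 1000:.1f} ms" for t in delays], f"mean = {mean_delay * 1000:.1f} ms")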

Distance-Vector-Routing

This is the basic method for dynamic routing procedures; it was used in the ARPANET until 1979. It is a dynamic routing algorithm in which each router maintains a table in the form of a vector. The information in the table is updated by the regular exchange of information between adjacent routers: each router receives from its neighbors a table with information relevant to them, for example delay times. From this, new estimated delays are determined at the given node, and for each destination the required next node is stored in the table for outgoing packets (see the sketch below).
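
One update round at a single router X might be sketched like this (the neighbor set, the measured delays and the advertised vectors are all invented):

# One distance-vector update at router X (all numbers illustrative).
DELAY_TO_NEIGHBOR = {"A": 2, "B": 3}  # measured by X itself
VECTOR_FROM = {                       # vectors advertised by the neighbors
    "A": {"A": 0, "B": 4, "C": 7, "X": 2},
    "B": {"B": 0, "A": 4, "C": 1, "X": 3},
}

table = {}  # destination -> (estimated delay, next hop)
for neighbor, vector in VECTOR_FROM.items():
    base = DELAY_TO_NEIGHBOR[neighbor]
    for dest, d in vector.items():
        if dest == "X":
            continue                  # skip entries for ourselves
        candidate = base + d          # estimated delay via this neighbor
        if dest not in table or candidate < table[dest][0]:
            table[dest] = (candidate, neighbor)

print(table)  # {'A': (2, 'A'), 'B': (3, 'B'), 'C': (4, 'B')}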

Link-State-Routing

This method superseded Distance-Vector-Routing, because the latter did not take the different bandwidths of the individual transmission links into account and the algorithm converged too slowly.

Link-State-Routing consists of the following steps (a sketch follows after the list). Each router must

- determine its neighbors and their network addresses

- measure the delay or cost to each of its neighbors

- assemble a packet containing all of this information and send it to all routers

- calculate the shortest path to all other routers
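
The four steps could be tied together roughly as follows (the values and data structures are invented; the flooding of the packets, sequence numbers and aging are omitted):

# Link-state sketch: each router's measured neighbor costs, i.e. the
# contents of its link-state packet (steps 1-3; values illustrative).
LINK_STATE_PACKETS = {
    "A": {"B": 2, "C": 5},  # router -> {neighbor: measured cost}
    "B": {"A": 2, "C": 1},
    "C": {"A": 5, "B": 1},
}

# Once the packets have been flooded to all routers, every router can
# rebuild the complete network graph ...
graph = {router: dict(costs) for router, costs in LINK_STATE_PACKETS.items()}

# ... and compute the shortest paths on it (step 4), for example with
# the dijkstra() sketch shown in section 8 above.
print(graph["A"])  # {'B': 2, 'C': 5}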