Lecture – Performance

Lecture –Performance

Performance management on UNIX

21/04/23 2

Performance Analysis

Performance analysis involves identifying various system bottlenecks

This involves a number of steps We must ask a number of questions

Is there a performance Problem? Is the problem CPU or I/O related?

21/04/23 3

Performance Analysis CPU Related?

What is the current load on the CPU? What is the average load on the CPU?

I/O Related Is it normal disk I/O?

Would more/faster disks help? Is it paging I/O?

Would more physical memory help?

21/04/23 4

Related to a Particular User or Program?

Identify the user / program Identify what they are doing to cause the problem Revise their operating procedures Consider removing them from the system

21/04/23 5

Determining CPU Usage Determining the CPU usage is the first thing we should

do There are a number of tools to do this

vmstat gives several pieces of useful information including CPU usage vmstat [interval] [count] Interval is the number of seconds between reports and count is

the number of reports to generate

21/04/23 6

vmstat 2 10[rbradley@aisling]$ vmstat 2 10 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 0 0 5484 27240 136584 198840 0 1 5 8 8 8 4 7 4 0 0 0 5484 27240 136584 198840 0 0 0 96 155 100 0 0 100 0 0 0 5484 27232 136584 198844 0 0 0 0 159 112 2 0 98 0 0 0 5484 27216 136584 198844 0 0 0 0 130 51 0 2 98 0 0 0 5484 27216 136588 198848 0 0 0 86 157 63 0 0 100 0 0 0 5484 27216 136588 198848 0 0 0 0 139 46 0 0 100 0 0 0 5484 27224 136588 198836 0 0 0 30 153 47 0 0 100 0 0 0 5484 27712 136588 198824 0 0 0 8 166 107 1 0 99 0 0 0 5484 26876 136588 198828 0 0 0 0 139 92 6 2 91 0 0 0 5484 26876 136592 198824 0 0 0 144 137 69 0 0 100

7

vmstat

The first line gives the average values since the system was booted and should be ignored

To determine the CPU usage, we are interested in the last three columns, us, sy, id

us: % of CPU dedicated to User tasks sy: % of CPU dedicated to System tasks. Including I/O performing general O/S

functions etc. id: % of CPU idle

procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 0 0 5484 27240 136584 198840 0 1 5 8 8 8 4 7 4 0 0 0 5484 27240 136584 198840 0 0 0 96 155 100 0 0 100 0 0 0 5484 27232 136584 198844 0 0 0 0 159 112 2 0 98

21/04/23 8

Analysing vmstat output (CPU) Just because CPU time is high or idle time is low does not

indicate a system problem It may simply indicate that a number of batch jobs are

scheduled to run at the same time and might benefit from being rearranged

In order to establish if there is a genuine problem it is necessary to monitor the system over an extended period

If average CPU% remain high, there is a problem

21/04/23 9

Analysing vmstat output(Process States)

There are three states in which a process may be at any point in time

Runtime, uninterrupted sleep, swapped out Process Statistics:

r: Number of processes waiting for runtime b: Number of processes in uninterrupted sleep w: Number of processes swapped out, but otherwise able to run

A high r suggests there is a bottle neck.


21/04/23 10

Analysing vmstat output (Memory)

Memory Statistics swapd: Amount of virtual memory used (KB) free: Amount of idle memory (KB) buff: Ammount of memory used in buffers cache:amount of memory left in cache


21/04/23 11

Analysing vmstat output (Swap)

Swap Statistics si: Amount of memory swapped in from disk (KB/s) so: Amount of memory swapped out to disk (KB/s)

Swap statistics are arguably the most important statistic to monitor, and of these, the so field

This field indicates the pages that have been swapped out, even if done before vmstat was started


21/04/23 12

Analysing vmstat output (I/O)

I/O Statistics bi: Blocks received from a block device (blocks/sec) bo: Blocks sent to a block device (blocks/sec)

If there are a large number of block transfers, the problem with your system may lie here (i.e. device access is high)

A single reading, however is not indicative of the system as a whole, simply a snapshot

All Linux blocks are 1KB except for CDRom blocks (2KB)


21/04/23 13

Analysing vmstat output (System)

System Statistics in: The number of interrupts per second, including the

system clock cs: The number of context switches per second


21/04/23 14

Analysing vmstat output (CPU usage)

System Statistics us: % of CPU dedicated to user tasks sy: % of CPU dedicated to system tasks id: % of CPU idle


21/04/23 15

top top is another tool for identifying problems with a LINUX system Displays the top CPU processes Displays a listing of the most CPU intensive tasks on the system Can provide an interactive interface for manipulating the processes Default is to update every 5 seconds

top operates by examining files in the /proc pseudo file system This pseudo file system is used as an interface to kernel data

structures man proc

21/04/23 16

[rbradley@aisling rbradley]$ top 17:14:41 up 47 days, 2:27, 8 users, load average: 0.06, 0.03, 0.0761 processes: 59 sleeping, 2 running, 0 zombie, 0 stoppedCPU states: 0.0% user 0.2% system 0.0% nice 0.0% iowait 99.8% idleMem: 513316k av, 200052k used, 313264k free, 0k shrd, 44976k buff 57692k actv, 11208k in_d, 1024k in_cSwap: 1052248k av, 9096k used, 1043152k free 34656k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND 1 root 15 0 108 76 56 S 0.0 0.0 0:15 0 init 2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd 3 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 kapmd 4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd_CPU0 9 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush 226 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald 586 root 15 0 200 160 116 S 0.0 0.0 0:08 0 syslogd 590 root 15 0 180 168 120 S 0.0 0.0 0:03 0 klogd 666 root 15 0 480 348 232 S 0.0 0.0 1:09 0 sshd 719 root 15 0 52 4 0 S 0.0 0.0 0:00 0 gpm 728 root 15 0 176 148 88 S 0.0 0.0 0:05 0 crond 785 xfs 15 0 1836 60 32 S 0.0 0.0 0:00 0 xfs 803 daemon 15 0 180 164 116 S 0.0 0.0 0:00 0 atd 812 root 23 0 52 4 0 S 0.0 0.0 0:00 0 mingetty 813 root 23 0 52 4 0 S 0.0 0.0 0:00 0 mingetty

top

17

Analysing top output

Up: The time the system has been up and the three load averages Average number of processes ready to run in the last 1,5 and 15

minutes Same as the output of uptime

Processes: The total number of processes running at the time of the last update Broken down into running, sleeping, stopped and zombied (A zombie process is a finished process where the parent has not read it

exit state – which causes the process to be cleaned up)

17:14:41 up 47 days, 2:27, 8 users, load average: 0.06, 0.03, 0.0761 processes: 59 sleeping, 2 running, 0 zombie, 0 stoppedCPU states: 0.0% user 0.2% system 0.0% nice 0.0% iowait 99.8% idleMem: 513316k av, 200052k used, 313264k free, 0k shrd, 44976k buff 57692k actv, 11208k in_d, 1024k in_cSwap: 1052248k av, 9096k used, 1043152k free 34656k cached

21/04/23 18

Analysing top output CPU States: The percentage of CPU time in user mode, system

mode, niced tasks (negative nice tasks) and idle Time spent in niced tasks will also be counted system and user time, so

the total will be more than 100%

Mem: Statistics on memory usage, including total available memory, free memory, used memory, shared memory, memory used for buffers


21/04/23 19

Analysing top output Swap: Statistics on swap space including total swap space and

used swap space This and the Mem section together are the same as the output of free*

PID: The process ID of each task USER: The username pf the task’s owner PRI: The priority of the task NI: The nice value of the task. Negative values are lower priority


PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND 1 root 15 0 108 76 56 S 0.0 0.0 0:15 0 init

21/04/23 20

Analysing top output

SIZE: The size of the task’s code plus data stack space, in kilobytes RSS: The total amount of physical memory used by the task in

kilobytes SHARE: The amount of shared memory used by the task STATE: The state of the task, S: sleeping, D: uninterrupted sleep,

R: running, Z: zombies, T: stopped or traced %CPU: The task’s share of the CPU since the last screen update as

a a percentage of total CPU time %MEM: The task’s percentage of physical memory Time: Total CPU time used by process since it started COMMAND: The task’s command name

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND 1 root 15 0 108 76 56 S 0.0 0.0 0:15 0 init

21/04/23 21

Using top to control processes

In addition to command-line options for controlling the appearance of top (not covered here) there are a number of commands that can be issued to top while running Space: immediately updates the display ^L: Erases and redraws the screen k: kill a process You will be prompted for the pid and a signal to

send to the process (normally 15)

21/04/23 22

Using top to control processes

i: ignore zombie processes n: change the number of processes to view r: renice a process P: sort tasks by CPU usage M: sort tasks by Memory usage

21/04/23 23

Renice

The renice command is used to alter the priority of running processes

The default nice value is 0 The range in Linux is -20 to +20 The lower the value the faster the process runs Can examine the nice value of a process using ps –l

21/04/23 24

Renice

The owner of and root can change the nice value of aprocess using renice

Changes apply to all child processes renice priority [[-p] pid ...] [[-g] pgrp ...] [[-u] user ...]

[rbradley@aisling]$ ps -lF S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD0 S 1634 24496 24495 0 75 0 - 1091 wait4 pts/1 00:00:00 bash0 R 1634 26361 24496 0 75 0 - 778 - pts/1 00:00:00 ps

[rbradley@aisling]$ renice 5 2449624496: old priority 0, new priority 5

[rbradley@aisling]$ ps -l

F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD0 S 1634 24496 24495 0 80 5 - 1091 wait4 pts/1 00:00:00 bash0 R 1634 26363 24496 0 80 5 - 777 - pts/1 00:00:00 ps

21/04/23 25

Renice Once a nice value has been increased, only the root user can

reduce it again, not even to the default value

[rbradley@aisling]$ renice 19 2449624496: old priority 5, new priority 19

[rbradley@aisling]$ ps -l

F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD

0 S 1634 24496 24495 0 94 19 - 1091 wait4 pts/1 00:00:00 bash

0 R 1634 26390 24496 0 94 19 - 778 - pts/1 00:00:00 ps

[rbradley@aisling]$ renice 1 24496

renice: 24496: setpriority: Permission denied

21/04/23 26

How Much Swap Space? A quick rule of thumb often used is twice as much as you have

physical memory

This approach is a bit simplistic and does not scale well

1. Estimate total memory requirements

2. Add some megabytes as a spare

3. Subtract the amount of physical memory available

4. If the value from 3 is > 3 times the available physical memory, you need more memory

21/04/23 27

How Much Swap Space

Sometimes the above formula will show that you don’t need swap space at all

It is a good policy to create some anyway Linux uses the swap space so that as much physical

memory as possible is kept free It swaps out pages that have not been used for a while When memory is needed, it is available

21/04/23 28

How Much Swap Space?

If swap space is removed (using the swapoff command) the system will attempt to move any swapped pages into other swap space or physical memory

If there is not enough space elsewhere the system may become unavailable for a time, while it sorts itself out, but it will come back

Lecture – Performance

Documents

problem cpu

cpu time

cpu idle

cpu usagedetermining

number of processes

number of reports

extended periodif average

system problemit