Object Innovations Course 310 Dominic Duval Student Guide Revision 1.3 Linux Internals

Jul 12, 2020


Object Innovations Course 310

Dominic Duval

Student Guide Revision 1.3

Linux Internals

Rev. 1.3 Copyright ©2002 Object Innovations, Inc. ii All Rights Reserved

Linux Internals Student Guide Information in this document is subject to change without notice. Companies, names and data used in examples herein are fictitious unless otherwise noted. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Object Innovations. Product and company names mentioned herein are the trademarks or registered trademarks of their respective owners. Copyright ©2002 Object Innovations, Inc. All rights reserved. Object Innovations, Inc. 420 Boston Turnpike Shrewsbury, MA 01545 781-272-3860 www.ObjectInnovations.com Printed in the United States of America.

Table of Contents

Chapter 1 Introduction to Linux Internals
Chapter 2 Kernel Overview
Chapter 3 Memory Management
Chapter 4 Inter-Process Communication
Chapter 5 File System
Chapter 6 System Calls
Chapter 7 Kernel-Related Commands
Chapter 8 Device Drivers
Chapter 9 Module Management
Chapter 10 Networking
Chapter 11 SCSI Subsystem
Chapter 12 Boot Process
Chapter 13 Debugging Tools
Appendix A Learning Resources
Appendix B Data Structures
Appendix C Labs

Chapter 1

Introduction to Linux Internals


Objectives

After completing this unit you will be able to:

• Set up the environment that will be used for Kernel development.

• Understand the main characteristics of the Linux operating system.

• Identify the differences between the various Linux distributions.

• Configure, compile and install the Linux Kernel.

Environmental Setup

• The following describes what is needed for this course.

• PCs running Linux:

− Any computer on which you have access to a shell and a compiler could theoretically serve the purpose of Kernel development.

− Greater processor speed essentially means a shorter compile time as well as faster execution and debugging.

− Having access to more recent technologies (Pentium, for example) allows you to optimize the Kernel for a specific architecture. Kernels optimized for a specific architecture will not work on other (older) versions of the processor.

• GNU cc compiler:

− The compiler is called on the command line with the gcc command.

− The quality of the compiler depends on the architecture used. For PCs, gcc competes with commercial grade compilers. The results may vary depending on which architecture you are developing for.

− The Linux Kernel is gcc-dependent. Thus, no other compiler could be used to build a working version of the Linux Kernel.

• Bash command interpreter

− The Bash shell is the most commonly used and distributed shell on the Linux Operating System. It is very similar to the Bourne shell, on which it was based.

− We will use the Bash shell exclusively in the examples and exercises provided in this course. Feel free to use any shell that you like for Kernel development. However, keep in mind that most scripts included in the Kernel rely on Bash.

• X11R6 Window System

− The X Window System can be used for Kernel configuration.

− However, having a working X Window System is not an absolute requirement since most tools used for Kernel development are text-based.

− Development may be done with graphical editors based on the X Window System. The xemacs editor is a good example of an X Window System application that is often used by developers.

• GNU Debugger

− Called on the command line with the gdb command.

− This tool is often used for user-space applications.

− The debugger may also be used, to a very limited extent, for Kernel debugging. We will however focus on other, better suited, debugging tools in this course.

• Tcl/Tk Interpreter

− This interpreted language is used for the xconfig configuration application. It is not required if you are developing exclusively with text-based tools.

• Text Editors

− At least one of the commonly used text editors should be available on your system.

− vi and emacs are the two most popular editors, since they tend to integrate very well in the type of environment used for Kernel development. Other editors may be used as well.

• Kernel Sources

− The system should contain the sources of the Linux Kernel. We will put a particular emphasis on the 2.4 version in this course, and the precise versions of the Kernel will be specified during the course.

− As you will see, a fairly extensive number of features have been added in Linux Kernel 2.4. These include obvious additions like Netfilter, PCMCIA and USB. Many updates and bug fixes, some of which are very subtle, were also included in this release.

• Questions:

1. Will a Kernel for the Intel 486 CPU run on a Pentium? What about the opposite?

2. Can Kernel development possibly be done on a 25 MHz 386?
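The tools listed in this environment setup can be checked from a shell before starting. The following is only a minimal sketch, assuming a POSIX-compatible shell such as Bash. The have_tool helper and the exact list of command names are illustrative assumptions, not part of the course material (wish is the usual command name of the Tcl/Tk interpreter).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the tools mentioned for this course are available.
# The list is an assumption; adjust it to your own setup.
for tool in gcc bash gdb make vi emacs wish; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# Show the architecture the running system reports (for example i686).
uname -m
```

A missing X Window System tool or Tcl/Tk interpreter is not a blocker, since text-based configuration tools can be used instead.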

Linux Features

• Linux has been designed to support the following features:

− Multitasking: Several programs can be executed at the same time on the same machine.

− Multi-user: Many users can be working on the same computer at once. Their respective execution and storage space is clearly delimited.

− Multi-platform: The Kernel has been ported to non-x86 architectures such as Sparc, MIPS, Alpha, IA-64 and PowerPC.

− Multiprocessor: More than one CPU can be used on a machine. This feature is often referred to as SMP, which stands for Symmetric Multi-Processing.

− Memory protection: A process cannot access the memory allocated to another process. Otherwise, the system would be subject to memory corruption and crashes.

− Paging to disk: Pages (not whole processes) can be swapped to disk when necessary. This speeds up execution and avoids wasting memory on unused code.

− Dynamically linked shared libraries: Programs can share some pieces of code by dynamically linking a library at execution time. The libc library is the standard shared library on Linux systems. However, other shared libraries may also be used.

− Modules: Device drivers may be compiled into the Kernel or loaded dynamically by the user at runtime. This is usually a good way to free up memory.

− Core dumps: Dumping the state of the program to disk allows developers to debug programs after they have crashed.

− Multiple virtual consoles: The user of the system has access to several consoles from which the system can be controlled. For example, consoles can be operating simultaneously on the system’s physical console, in the X Window System and through the serial port.

− Filesystems: Several common filesystems are supported, including MSDOS, FAT, FAT32, Minix, HP-UX and all the common System V filesystems.

− Networking: Many protocols are available, including TCP, IPv4, IPv6, AX.25, X.25, IPX, DDP (AppleTalk), NetBEUI, Netrom, and others.

− Standards: Linux aims at being compliant with POSIX, System V, and BSD in the Kernel sources. It is however fully POSIX compliant for job control.
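Two of the features above, paging and modules, can be observed on a running system. The sketch below is an illustration only: the page_size and module_count helpers are hypothetical names, and the code assumes that getconf is installed and that /proc is mounted. It falls back to placeholder values otherwise.

```shell
#!/bin/sh
# Size in bytes of the pages that the Kernel swaps to disk.
# Prints "unknown" if getconf is not available.
page_size() {
    getconf PAGESIZE 2>/dev/null || echo unknown
}

# Number of currently loaded modules; /proc/modules has one line per module.
# Prints 0 when the file is not readable (for example on a non-modular Kernel).
module_count() {
    if [ -r /proc/modules ]; then
        wc -l < /proc/modules | tr -d ' '
    else
        echo 0
    fi
}

echo "page size:      $(page_size) bytes"
echo "loaded modules: $(module_count)"
```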

• Supported Hardware

− Linux runs on many hardware platforms including the Intel x86, DEC Alpha, IA-64, Sun Sparc, PowerPC, MIPS, ARM and Motorola 68K architectures.

− Linux works on new versions of these platforms as well as on the older ones. For instance, the Intel 386, 486 and Pentium machines are well supported under Linux. The OS also supports Intel clones (Cyrix, AMD).

− A wide range of hardware device drivers is available. Legacy devices also tend to be well supported, so the life of older computing systems can easily be extended.

− Any type of memory (EDO, DRAM, SDRAM) can be used with Linux. The Kernel can run on 8 MB of memory, but most distributions require 16 MB as a bare minimum. The maximum amount of memory that can be used depends on the architecture of the system. On PCs, older Kernels impose a limit of 2 GB of RAM; Kernel 2.4 can address more through its high-memory support.

− IDE and SCSI disk storage subsystems are supported. The RAID (Redundant Arrays of Inexpensive Disks) standard can also be used in order to maximize speed and/or reliability.

− Most devices commonly used on PCs (modems, sound cards, CD-ROM drives, network cards) are supported, but there are sometimes some limitations with software-controlled devices, particularly with modems.

− USB (Universal Serial Bus) is supported in Kernel version 2.4; older versions require patches.

• Here is a brief overview of the supported software under Linux (commercial and free):

− UNIX similarities: Since Linux is mostly POSIX compliant, almost all of the GNU utilities are available. Therefore, most standard UNIX commands (ls, sed, awk, vi, etc.) have been ported to Linux.

− Programming languages: C, C++, Objective C, Java, Modula-3, Modula-2, Oberon, Ada95, Pascal, Fortran, ML, Scheme, Tcl/tk, Perl, Python, Common Lisp, and many others.

− Development tools: gcc, gdb, make, bison, flex, perl, rcs, cvs.

− Editors: GNU Emacs, XEmacs, MicroEmacs, jove, ez, epoch, elvis (GNU vi), vim, vile, joe, pico, jed, and others.

− Shells: bash (POSIX sh-compatible), zsh (includes ksh compatibility mode), pdksh, tcsh, csh, rc, es, ash and others.

− Mail and news clients: C-news, innd, trn, nn, tin, smail, elm, mh, pine and others.

− Server-oriented: Apache, Sendmail, Postfix, Exim, Qmail, Samba, SSH, AOL Server, WU-ftpd, ProFTPd, Squid, etc.

− Commercial software is also becoming widely available on Linux. The most popular ones include: Netscape, Star Office, Opera, Real Player, Adobe Acrobat, Accelerated X, BRU, Metro X, Iplanet, Openmail, etc.

Linux Distributions

• Mainstream distributions (1)

− Caldera Systems Open Linux is a popular commercially distributed Linux distribution. It includes many proprietary enhancements that are designed to ease the administration of the system.

− Corel Linux is a Debian-based distribution that focuses on ease of use. It is designed to attract MS-Windows users by providing several graphical tools integrated into the KDE desktop environment. This distribution was sold to Xandros in 2001 and renamed Xandros Desktop OS.

− Debian GNU/Linux is a distribution developed entirely by unpaid volunteers. It is now one of the largest distributions and it is particularly aimed at system administrators and developers who have some experience with Linux. Debian is supported on processors such as Intel, PPC, Alpha and Sparc. Other architectures are also in development.

− LinuxPPC is the native port of Linux to the PowerPC processor. LinuxPPC is very similar to the Red Hat distribution, and is often used for server-oriented applications. It is known to work on G4, iBooks and iMacs.

(1) For a more complete and current listing of the various Linux distributions, take a look at http://www.distrowatch.com/

− Mandrake Linux is one of the most popular commercial distributions, largely due to its ease of use and installation. It was originally based on Red Hat and is still similar to it. It focuses on the KDE desktop environment, and all the packages provided with Mandrake are optimized for Intel Pentium-family chips.

− MkLinux is a project that was started by Apple Computers and the Open Group Research Institute. It consists of an implementation of the Linux operating system hosted on the Mach microkernel, and is designed for the PowerPC architecture.

− Red Hat is one of the most popular Linux distributions. It is not aimed at any particular task, so it can be used as a development workstation as well as a server, and is also popular among beginners due to its ease of installation. Red Hat Software distributes versions of its distribution for the Intel, s/390 and Itanium architectures.

− Slackware has been developed almost exclusively by Patrick Volkerding, and a few years ago it was the most popular Linux distribution. It is still considered a solid distribution for Intel machines.

− SuSE is currently the most popular distribution in Europe, and particularly in Germany. SuSE is often associated with the YaST configuration utility, which contributed to the ease of use of this distribution. SuSE is also a large contributor to the XFree86 project for the development of X servers for newer graphics cards and supports the Intel, PPC, Alpha and Sparc architectures.

− Turbo Linux offers various Linux distributions geared towards corporate needs as well as the home desktop. It is available in English, Japanese, Korean and Chinese, which makes this distribution a heavy player in the Asian market. Versions are also available for several of IBM’s big-iron series.

− Yellow Dog Linux is a distribution for the IBM RS/6000 and Apple Macintosh PowerPC. It is based on Red Hat and aimed at high-end, mission-critical server applications. It is also well suited as a desktop environment, since it ships with all major graphical tools available on Red Hat.

− Lycoris Desktop LX is a distribution that appeared in 2002 and, as its name implies, provides a Linux system designed for ease of use and installation.

− Gentoo Linux is another distribution that gained some momentum in 2002. Gentoo uses a package management system called Portage, which provides a way for packages to be automatically built and optimized for the hardware on which Gentoo is installed. The Intel, PPC, Sparc and Sparc64 architectures are supported.

• Mini distributions

There are many small distributions available that fit on one or two floppy disks and use a small set of applications. Some of these are aimed at very precise tasks (firewalls and X terminals for example), while others are just simple “rescue” tools. The following is a short list of these distributions:

− Coyote Linux is designed for those who wish to share with other computers an Internet connection that is provided via an Ethernet link. The primary focus of Coyote Linux is to make it as easy as possible to configure and share the connection. SmoothWall is also a distribution that aims at similar goals.

− Monkey Linux is a minimal Linux distribution that fits in a 7.5MB archive (5 floppy disks) designed to be used within MSDOS without any dedicated Linux partition.

− Bootable Business Card is a 46MB Linux distribution containing most common utilities used for troubleshooting computers and networks. Its size makes it possible to install the distribution on a business-card-sized CD-ROM. It has been developed by Linuxcare and is based on Debian.

− Trinux is a portable Linux distribution that boots from a single floppy disk and runs entirely in RAM. It contains network security tools for port scanning, packet sniffing, vulnerability scanning, sniffer detection, packet construction, active/passive OS fingerprinting, network monitoring, session hijacking and intrusion detection.

Embedded Linux Distributions

• Embedded systems distributions are used in small computers with a very small amount of memory (typically 2 or 4 MB) and no hard drive.

− BlueCat Linux runs on the Intel, MIPS, ARM and PPC platforms, and comes with many commercial tools and packages for developing and deploying BlueCat on embedded devices.

− ETLinux is a complete Linux distribution designed to run on small industrial computers. It can be run in about 1 MB of space (which includes a web server and a mail server).

− Hard Hat Linux is a compact Linux distribution running on Intel x86, Motorola PowerPC, StrongARM and other processors. It comes with the Hard Hat CDK (Cross Development Kit) and is developed by MontaVista Software.

− µLinux (muLinux) is a fully configurable, minimalist, application-centric distribution of Linux originally made in Italy. µLinux resides on a single 1722K floppy and runs on Motorola and Intel processors.

− RTLinux is a distribution aimed at applications that need a hard real-time Kernel for devices requiring deterministic, preemptive scheduling.

− Embedix is an embedded Linux distribution for the PowerPC, ARM, MIPS and Intel architectures. Lineo, a company specializing in Linux embedded devices, distributes Embedix.

Distribution-specific Questions

• The following questions are all related to the Linux distributions currently available. There are no definitive answers to these questions; just try to back up your arguments with facts.

3. You are assigned a project in which you need to migrate a network of 150 computers running on the Intel and Sparc architectures to Linux. Which distribution would you recommend for this particular task?

4. A company specializing in residential Internet access devices (also known as firewalls) contracts you to write the specifications for the physical device that will be used for this task. How could this be done with Linux?

5. The technical support company you are working for often has problems with customers accidentally erasing the Master Boot Record of their hard disk. You need to reboot in Linux to fix the problem, but LILO (the boot loader) is no longer available. What distribution could help you quickly solve this problem?

Linux Kernel Installation

• Installing or examining the Linux Kernel source tree can be done in a number of ways, which we will spell out in detail:

− Where to find the Linux Kernel

− Download the Kernel Tree

− CD-ROM Distributions

− CVS

− CVSweb

− Linux Cross Reference

− Installing the Sources

− Location of Components

− Updates and Patches

Downloading the Kernel Sources

• The Linux Kernel sources can be downloaded directly from:

ftp://ftp.kernel.org

− This is always the first place where new versions of the Linux Kernel appear. Nowadays, the sources are available in both .tar.bz2 and .tar.gz formats, whereas older versions were packaged exclusively as .tar.gz archives.

− You will need the appropriate tools (tar together with gzip or bzip2) in order to unpack the .tar.gz or .tar.bz2 Kernel source archives, respectively.

• The Kernel sources can also be found on most distribution CD-ROMs. The Red Hat Linux 6.2 distribution, for example, provides RPM packages containing both the Kernel binary packages and the Kernel sources:

− kernel-2.2.14-5.0.i386 – Linux Kernel compiled for 386 and later architectures.

− kernel-2.2.14-5.0.i586 – Linux Kernel Compiled for Pentium architecture.

− kernel-2.2.14-5.0.i686 – Linux Kernel Compiled for Pentium II architecture.

− kernel-source-2.2.14-5.0.i386 – Sources of the Linux Kernel

− Precompiled Linux Kernels with SMP (multiprocessor) capabilities are often available as well.
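Unpacking a downloaded source archive can be sketched as follows. This is a minimal illustration, not the course's own procedure: the unpack_sources helper is a hypothetical name, and it assumes that tar, gzip and bzip2 are installed. The decompressor is chosen from the file extension, matching the two archive formats described above.

```shell
#!/bin/sh
# Hypothetical helper: unpack a Kernel source archive into a target directory,
# choosing the decompressor from the file extension.
unpack_sources() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    case "$archive" in
        *.tar.gz)
            gzip -dc "$archive" | ( cd "$dest" && tar -xf - ) ;;
        *.tar.bz2)
            bzip2 -dc "$archive" | ( cd "$dest" && tar -xf - ) ;;
        *)
            echo "unsupported archive: $archive" >&2
            return 1 ;;
    esac
}

# Example call (the archive name and destination are placeholders):
#   unpack_sources linux-2.4.x.tar.bz2 /usr/src
```

Streaming the decompressed data into tar avoids writing an intermediate .tar file, which matters for an archive as large as the Kernel tree.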

CVS access

• The Linux Kernel CVS tree can be accessed by connecting to vger.samba.org.

− Kernel sources can also be accessed from a CVS (Concurrent Versions System) server, which contains the sources as worked on by several of the main Linux Kernel developers. Note, however, that official changes to the Kernel may not be included in this CVS server, since Linus Torvalds himself holds the official Kernel development repository.

− Likewise, the latest modifications are not necessarily contained in the CVS server, as they are often submitted directly to Linus Torvalds.

− The following are the basic commands that should be used for CVS access. First, log in so that CVS can access the remote server:

cvs -d :pserver:[email protected]:/vger login

Password: cvs

• Secondly, the Kernel tree should be downloaded.

− This process could take some time, depending on the speed of your connection:

cvs -z3 -d :pserver:[email protected]:/vger co linux

− You now have a complete copy of the Kernel sources in linux/ (relative to the directory from which the cvs commands were executed).

CVS and CVSweb

• Updating the Kernel tree is also an important step:

− Changes to the Kernel occur every day. In order to apply the latest modifications to your Kernel tree, you should use (from the linux/ directory where the sources were downloaded):

cvs -z3 update -d

Note: In order to download the 2.2 Kernel sources, use the following command:

cvs -z3 -d :pserver:[email protected]:/vger co -r linux_2_2 linux

− The CVS system also allows developers to modify the files on the CVS server. As this requires special permissions from the CVS server administrator and is beyond the scope of this course, please refer to the CVS man pages for more information.

• CVSweb access is also available from: http://vger.samba.org/

− CVSweb lets you look at the sources by using a browser without actually downloading the whole tree.

− You can therefore access the contents of all the individual source files, and look at the revision history and commit logs of individual files.

− It is also possible to ask for a “diff” listing between any two versions of the same file, which will show you the modifications made to that file.
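A “diff” listing of the kind CVSweb displays can also be produced locally with the standard diff utility. The sketch below is only an illustration under stated assumptions: show_changes is a hypothetical helper, and the two file revisions are made-up sample content created in a scratch directory rather than real Kernel files.

```shell
#!/bin/sh
# Hypothetical helper: print the unified diff between two revisions of a file.
# diff exits with status 1 when the files differ, which is not an error here,
# so any status up to 1 is reported as success.
show_changes() {
    diff -u "$1" "$2"
    [ $? -le 1 ]
}

# Demonstration on two made-up revisions stored in a scratch directory.
workdir=$(mktemp -d)
printf 'int limit = 10;\nint count = 0;\n' > "$workdir/file_v1.c"
printf 'int limit = 20;\nint count = 0;\n' > "$workdir/file_v2.c"

# Lines starting with "-" were removed; lines starting with "+" were added.
show_changes "$workdir/file_v1.c" "$workdir/file_v2.c"

rm -rf "$workdir"
```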

BitKeeper

• BitKeeper is a configuration management system that is architecture-independent, scalable and adapted to geographically distributed environments such as the one in which Kernel developers usually work.

− For various reasons, the use of BitKeeper increased in 2002 among Linux Kernel developers.

− BitKeeper is becoming the de facto standard for developers actively working on the Kernel and submitting patches to the Kernel maintainers.

− BitKeeper handles the creation of patches, which is a good way to save some time during development. The exchange of patches between developers is also greatly simplified and allows maintainers to be much more efficient.

− More information regarding BitKeeper can be found on http://www.bitkeeper.com.

Be aware that BitKeeper contains a license that may seem quite restrictive compared to other Open Source licenses. Basically, any work done with the free version of BitKeeper should be considered publicly available. On the other hand, any company that needs to keep sources secret must pay for the commercial version of BitKeeper. Refer to the BitKeeper web site for more information regarding licensing issues.

• BitKeeper supports commands that are similar to CVS. The most commonly used commands for Kernel development are explained here:

− Downloading the official Kernel tree is usually the first step that needs to be done:

bk clone http://linux.bkbits.net/linux-2.4 linux24

− Using the source tree that was just downloaded, we can create a parallel tree based on the one that is already stored on the local disk:

bk clone -l linux24 devel24

− We are now ready to update our development tree with recent changes made by other developers (in this case new drivers):

cd devel24
bk pull http://gkernel.bkbits.net/net-drivers-2.4

− A checkout needs to be done before the files appear at their standard place in the Kernel source tree:

bk -r co -q

− A file in our repository may then be modified and added to the original repository:

bk vi drivers/net/3c509.c
bk citool
bk push bk://[email protected]/net-drivers-2.4

Linux Cross Reference

• Understanding a code repository as complex as the Linux Kernel source tree is difficult at best. Tools such as the Linux Cross Reference have been developed to help browse through this large amount of code.

• The homepage of the Linux Cross Reference is located at http://lxr.linux.no.

− This web site allows users to examine all major Kernel releases, from 1.0.9 to 2.5.x.

− Users may also specify the architecture of the Kernel tree to be viewed.

• The lxr tool lets you search for patterns in the source:

− The identifier search allows users to specify a variable or function name. The results will show the definition of the variable or function, along with the files in which it is invoked.

− The free text search lets you search for a string anywhere in the code and in the comments.

− The file search lets the user specify a file name to search for in the Kernel tree.

• The lxr tool may also be installed on any computer running Linux. This could be particularly useful for projects in which users must spend a lot of time reading Kernel code.
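When neither the web site nor a local lxr installation is available, the three searches described above can be roughly approximated on a local source tree with find and grep. This is a sketch, not a replacement for lxr: the helper names are hypothetical, and a plain text match cannot tell a definition from a use the way the identifier search does.

```shell
#!/bin/sh
# Hypothetical helpers approximating the three lxr searches on a local tree.
# The first argument is the root of the source tree, e.g. /usr/src/linux.

# Identifier search: whole-word matches in C sources and headers.
find_identifier() {
    find "$1" -type f \( -name '*.c' -o -name '*.h' \) \
        -exec grep -Hnw -- "$2" {} +
}

# Free text search: fixed-string match anywhere, including comments.
find_text() {
    grep -rnF -- "$2" "$1"
}

# File search: locate files by name.
find_file() {
    find "$1" -type f -name "$2"
}

# Example calls (assuming the sources are installed under /usr/src/linux):
#   find_identifier /usr/src/linux strip_zone
#   find_file /usr/src/linux i386_ksyms.c
```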

Questions

• Launch a browser on the Linux Cross Reference web site and answer the following questions:

6. Locate the file named i386_ksyms.c in the 2.4.x Kernel source tree.

7. By using the identifier search tool, find what part of the Kernel the strip_zone data structure is associated with. In which file is this structure defined?

8. A programmer named Alan Cox sometimes hacks the Linux Kernel. Find the files in which he worked in the 2.4.x Linux Kernel. (Optional) Try to find in which particular area he is specializing.

• General Kernel source questions:

9. What is the difference between a Linux Kernel and a Linux distribution?

10. From the perspective of a Linux user who doesn’t know anything about programming, what would be the most appropriate way to update the Kernel? How do developers usually update it?

11. A C developer working for a well-known distribution company is currently working on a project that will enhance USB support in the Linux Kernel. Demand for this type of work is so strong that 15 other developers join the project. What would be the best method to produce a new version of the drivers developed by the team at the end of each week?

Installing the Sources

• By convention, the Kernel tree is placed in /usr/src/linux/.

− The linux/ directory is usually a symbolic link to the directory that actually contains the tree, which could be /usr/src/linux-2.4.0/ for example.

• Linux kernel versions are divided into two series, which can be downloaded separately:

− Experimental (odd series, e.g. 1.3.xx or 2.5.x). These fast-moving versions are used to test new features, updates, device drivers, etc. By their nature, experimental kernels may behave in unpredictable ways, so one may experience data loss, random machine lockups, corrupted file systems, etc., depending on which parts of the Kernel are being used.

− Production (even series, e.g. 1.2.xx, 2.0.xx, 2.2.xx or the new 2.4.xx). Production kernels are known for being very stable: they have few known bugs, and problems such as file system corruption and system crashes are unlikely to occur. These are the ones normally shipped with Linux distributions.
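
The odd/even numbering rule can be checked mechanically. The following shell sketch classifies a version string by the parity of its second number. The name kernel_series is a hypothetical helper, not a standard tool, and the rule applied is only the convention described above.

```shell
# kernel_series VERSION: print "production" or "experimental" according to
# the odd/even convention (hypothetical helper, for illustration only).
kernel_series() {
    # Take the second dot-separated field; -s drops input without any dot.
    minor=$(printf '%s\n' "$1" | cut -s -d. -f2)
    case "$minor" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if [ $((minor % 2)) -eq 0 ]; then
        echo "production"
    else
        echo "experimental"
    fi
}

kernel_series 2.4.18   # prints "production"
kernel_series 2.5.7    # prints "experimental"
```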

Location of Components

• The Kernel source is organized into specific subdirectories:

− arch/ - Contains architecture-specific code, like assembly files (.S).

− drivers/ - Contains the code for each device driver supported directly by the Kernel.

− fs/ - The actual implementations of the various supported file systems (ext2, fat, hpfs, etc) are placed in this directory.

− include/ - Header files associated with most .c files in the other directories.

− init/ - Contains the main.c file for the initialization of the Kernel.

− kernel/ - Contains core Kernel functions, which are used in other parts of the Kernel (for example: printk() and panic()).

− lib/ - Various library routines. Some are deprecated and not used anymore.

− mm/ - Memory management routines

− net/ - Implementation of the various network protocols available on Linux.

− scripts/ - Contains some Kernel configuration scripts (menuconfig, xconfig) and debugging programs.

Location of Components (Cont'd)

• The following is a list of the major components of the Linux Kernel along with their approximate weight in the source tree:

Size      Directory        Files

90 MB     /usr/src/linux   7645
4.5 MB    Documentation    380
16.5 MB   arch             1685
54 MB     drivers          2256
5.6 MB    fs               489
14.2 MB   include          2262
28 KB     init             2
120 KB    ipc              6
332 KB    kernel           25
80 KB     lib              8
356 KB    mm               19
5.8 MB    net              453
400 KB    scripts          42
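
Figures like the ones above can be gathered with standard tools. The sketch below is one possible way to do it, using du for the size and find for the file count. The name tree_stats is a hypothetical helper, and the numbers it reports depend on the kernel version and file system in use.

```shell
# tree_stats DIR: for each subdirectory of DIR, print its size in KB,
# its name and its number of regular files, separated by tabs.
# (Hypothetical helper, for illustration only.)
tree_stats() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue          # skip if the pattern matched nothing
        size=$(du -sk "$d" | cut -f1)
        files=$(find "$d" -type f | wc -l | tr -d ' ')
        printf '%s KB\t%s\t%s\n' "$size" "$(basename "$d")" "$files"
    done
}

# Example: report on the conventional location of the kernel tree.
tree_stats /usr/src/linux
```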

Updates and Patches

• To decrease bandwidth consumption, updates can be downloaded as patches. A patch contains only the changes made since the previous version, so developers update only the files that were actually modified.

− In the case of kernel 2.4.0 (with sources stored under /usr/src/linux) you would have to get the file named patch-2.4.1.gz or patch-2.4.1.bz2 (depending on the archive format). In order to update your current kernel tree to 2.4.1, execute the following commands:

cd /usr/src
gzip -cd patch-2.4.1.gz | patch -p0

(or bzip2 -dc patch-2.4.1.bz2 | patch -p0)

− Since you only download the changes between two kernel versions, the space occupied by a typical compressed patch is around 1 MB (depending on the amount of work done on a specific kernel update), whereas the full compressed kernel is about 20 MB.

− Patches need to be applied in order of increasing kernel version. E.g., you cannot update from 2.4.0 to 2.4.2 with a single patch. patch-2.4.1.gz has to be applied before patch-2.4.2.gz. Failing to do so will result in a corrupted kernel tree.

− Patches are also available for updating a single file, usually a driver for a specific device supported by the Linux kernel.
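
Because patches must be applied one version at a time, it can help to list the required commands before running anything. The sketch below only prints the sequence for the 2.4 series and executes nothing. The name patch_sequence is a hypothetical helper, and the commands assume that the gzip-compressed patch files are already in /usr/src.

```shell
# patch_sequence FROM TO: print, in order, the commands that would take a
# 2.4.FROM tree to 2.4.TO. This is a dry run; no patch is applied.
# (Hypothetical helper, for illustration only.)
patch_sequence() {
    from=$1
    to=$2
    n=$((from + 1))
    while [ "$n" -le "$to" ]; do
        echo "gzip -cd patch-2.4.$n.gz | patch -p0"
        n=$((n + 1))
    done
}

# Updating from 2.4.0 to 2.4.2 requires two patches, applied in this order.
patch_sequence 0 2
```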

Linux Kernel Installation

• In order to be usable, the Linux Kernel sources need to be properly configured and compiled. This could involve some of the following steps:

− make config

− make menuconfig

− make xconfig

− Kernel and modules compilation

Configuration – make config

• Description

− Before compiling the kernel, we need to configure the devices, file systems, network protocols, etc., that are needed on the system.

− These details are specified in linux/.config. This text file could be edited by hand, but some tools are available to make the configuration process both fast and easy and to ensure that .config has the right format.

• make config

− Typing make config in /usr/src/linux will start a text-based Kernel configuration tool.

Configuration – make menuconfig

• make menuconfig

− Typing make menuconfig in /usr/src/linux will start another text-based kernel configuration tool. Menuconfig has the advantage of being both easier and faster, since it is based on menus and dialogs.

− Note that menuconfig requires that you install the curses library on your system, if it was not added by default during the installation. On Red Hat, this is provided by the ncurses-devel package.

− Menuconfig is often considered the fastest way to configure the Kernel.

Configuration – make xconfig

• make xconfig

− Typing make xconfig in /usr/src/linux will start the graphical kernel configuration tool. You must be running the X Window System when you start xconfig. xconfig has many similarities with menuconfig since it operates with the same set of dialogs and menus.

− Note that xconfig requires a recent version of the Tcl/Tk scripting language. Most distributions install Tcl/Tk by default during the installation.

Kernel Compilation

• Compilation of the Kernel

− make mrproper cleans the Kernel source tree so that no remaining object or configuration file could affect any subsequent Kernel build.

− Prior to compilation and just after configuration (config, menuconfig or xconfig), make dep should be run in order to adjust the dependencies in the various Makefiles used throughout the kernel tree.

− Make sure that your software versions are at least as recent as those specified in Documentation/Changes. This is particularly important if the kernel fails to compile correctly.

− make bzImage (or make zImage if the kernel image is small enough) will compile the kernel. This process should take a few minutes, depending on the speed of the processor.

− The Makefile in the root of the kernel tree gives a good description of the steps carried out when building any part of the kernel. Each subdirectory also has its own Makefile; for example, the Makefile located in drivers/char determines which files in drivers/char need to be compiled.

− If the user types make bzImage or make zImage, the resulting binary kernel image is stored as arch/i386/boot/bzImage or arch/i386/boot/zImage, respectively.

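• Note that ‘make oldconfig’ can be used to reuse a .config file that was saved from another Kernel tree: copy the file into the new tree, and make oldconfig will ask only about the options that the saved file does not cover.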

Kernel Compilation (Cont’d)

− The following is a brief description of the steps executed in order to build the image (we will focus on bzImage here):

a. C and assembly files are compiled in object format (.o). Some of them are assembled into archive object files (.a) by using the ar utility.

b. ld is used to link the object files created in step a. This produces a statically linked (no calls to a library!), ELF 32-bit executable file named vmlinux. This file contains the binary code that implements the various tasks for which we need a kernel.

c. System.map is produced by running the nm vmlinux command, which creates a list of the symbols in vmlinux.

d. bbootsect is created from the bootsect.S assembly file. This will be the piece of code that will reside in the boot sector and start the kernel during the boot stage.

e. bsetup is created from the setup.S assembly file. This will be the piece of code that will be executed just after the boot sector.

f. vmlinux is linked with other files, then compressed and finally becomes bvmlinux.

g. The arch/i386/boot/tools/build tool is used to group together bbootsect, bsetup and bvmlinux into a single file named arch/i386/boot/bzImage, which is the final kernel image.

− If parts of the kernel have been configured as modules, you will need to type make modules. This will compile the Kernel components that were set as Kernel modules during the configuration step.

− Modules also need to be installed in /lib/modules. Otherwise, the initialization scripts will not be able to find them when the system boots. Installing modules in the appropriate directories can be done with make modules_install.
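
The commands introduced in this section are normally run in a fixed order. The sketch below simply prints that order as a checklist and runs nothing. The name kernel_build_steps is a hypothetical helper, and menuconfig is used here only as one of the three configuration front ends (config and xconfig would fit in the same place).

```shell
# kernel_build_steps [modules]: print the build sequence described above.
# Pass "modules" if parts of the kernel were configured as modules.
# Nothing is executed. (Hypothetical helper, for illustration only.)
kernel_build_steps() {
    echo "make mrproper"      # also removes any existing .config
    echo "make menuconfig"    # or: make config / make xconfig
    echo "make dep"
    echo "make bzImage"
    if [ "${1:-}" = "modules" ]; then
        echo "make modules"
        echo "make modules_install"
    fi
}

# Run from /usr/src/linux, one command after the other.
kernel_build_steps modules
```

Since make mrproper removes the existing configuration, a .config that should be kept has to be saved somewhere else before the first step.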

Questions

• The following questions are related to the Kernel source tree and the steps involved in its configuration and installation:

12. Briefly state the benefits and drawbacks that are generally associated with development versions of the Linux Kernel.

13. Would it be possible to do some Kernel development work on a system for which you don’t have super-user permissions?

14. Apart from decreasing bandwidth consumption, what other benefit results from the use of patches for Kernel updates? (Hint: take a look at the content of a patch file)

15. What is the difference between zImage and bzImage?

16. What are the effects of skipping the make modules and make modules_install steps? Will the system boot correctly if those two steps are not executed?

17. We often hear from people who are new to Kernel compilation that they compiled the Kernel properly but it does not show up in the boot prompt. What step was probably skipped in this case?

Lab 1

Linux Kernel Development Basics

This lab focuses on the steps involved in setting up the environment needed for Linux kernel development. We will begin by installing and building a working kernel for a Red Hat distribution. In the second part of the lab, we will experiment with adding a module to an already working kernel. Since this module is completely external to the standard kernel tree, we will also need to compile it from source, test it and debug it if needed. This practical problem will give us our first contact with the common kernel logging tools.

Detailed instructions are contained in the Lab 1 write-up in Appendix C.

Suggested time: 60 to 90 minutes

Summary

• An Operating System is the most widely used piece of software in any computer. On Linux, the Kernel performs the vast majority of the tasks that we expect from an Operating System.

• The features of the Linux Kernel are almost identical to those of most other UNIX operating systems, largely due to its POSIX compliance.

• Distributions are available for just about every possible task. Some are more specialized than others and may look quite different, but all of them run on the same basic code, the Linux Kernel.

• The Linux Kernel is publicly available, in development as well as production versions.

• The Kernel benefits from having a very modular design, where every task is clearly delimited in specific files and directories.

• The installation and compilation process has a simple, well-defined interface that makes it easy for users to build their own kernels. The commands make config, make dep and make bzImage are sufficient to build a new kernel image.

Chapter 7

Kernel-Related Commands

Kernel-Related Commands

Objectives

After completing this unit you will be able to:

• Understand the primary commands used to monitor the processes running on the system.

• Use and implement programs that monitor the memory usage of the system.

• Basically understand how some of the Kernel-related commands are implemented, and which system calls they use.

• Develop new programs for reporting Kernel-related information provided by the /proc filesystem.

Kernel Commands: ps

• As you now know, everything that operates under Linux, whether a user command, a task or a system daemon, is represented by the system as a process.

• A formal definition of a process is that it is a single program running in its own virtual address space, and represented in the Kernel as a task.

• Three types of processes are generally used under Linux:

− Interactive processes: A process initiated from (and controlled by) a shell. Interactive processes may be in the foreground or background.

− Batch processes: Processes that are not associated with a terminal but are submitted to a queue to be executed sequentially.

− Daemon processes: Processes usually initiated when Linux boots and that run in the background until they are no longer required.

• The ps command, which stands for “process status”, gives a view of the processes currently running on the system.

• The ps command is available to all system users.

Kernel Commands: ps (Cont’d)

• ps is probably the best example of a Kernel-related command that makes use of the /proc virtual file system. If the /proc filesystem is not compiled into the Kernel, the ps command will not be able to execute properly.

• The output of the ps command is always organized in columns (the -l option selects the long format, which adds columns such as PPID, PRI, NI and SZ):

− The TTY column shows you which terminal the process was started from.

− The S column in the ps command output shows the current status of the process.

− The TIME column shows the total amount of system (CPU) time used by the process so far. These numbers constitute the total CPU time, not the amount of real time the process has been alive.

− The CMD column contains the name of the command being run. This is usually a command that was started by the user, although some programs automatically create child processes (e.g., Apache).

− The USER column shows who started and owns the process.

− PID and PPID are respectively the process ID and parent process ID of each task.

− PRI and NI are respectively the priority and nice values.

− SZ is the size of the code, data and stack space, in Kb.
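
Several of these columns can be read directly from /proc. The sketch below extracts the PID, PPID, state and command name from a /proc/<pid>/stat file, whose first fields are the process ID, the command name in parentheses, the state and the parent process ID. The name stat_fields is a hypothetical helper. This is only an illustration of the idea, not the way ps itself is implemented.

```shell
# stat_fields FILE: print "PID PPID STATE COMM" from a /proc/<pid>/stat file.
# (Hypothetical helper, for illustration only.)
stat_fields() {
    line=$(cat "$1")
    pid=${line%% *}            # first field: the process ID
    rest=${line#*'('}          # drop everything up to the opening "("
    comm=${rest%')'*}          # command name, up to the last ")"
    after=${line##*') '}       # remaining fields: state, ppid, ...
    set -- $after              # split the remaining fields on whitespace
    echo "$pid $2 $1 $comm"
}

# Example: show the entry of the process that reads the file.
if [ -r /proc/self/stat ]; then
    stat_fields /proc/self/stat
fi
```

The command name is taken up to the last closing parenthesis because a process name may itself contain spaces or parentheses.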

Kernel Commands: ps (Cont’d)

• Since it reads its information from /proc instead of accessing Kernel memory directly, ps does not need (and should not have!) any special privilege (e.g. SUID) assigned to it.

• The ps command supports a wide variety of options, some of which are described in the following lines:

− The -A option selects every process currently running on the system.

− Some options let the user define which types of processes need to be listed. For example, they can be selected by command line, UID, GID, user name and terminal.

− Various output format switches may be specified in order to change the way ps presents the information.

• Note that various standards have set the way ps should work.

− Unix98, BSD and GNU all determine a different way to pass options to ps. The ps program used on Linux supports each one of these.

− This is the reason why both “ps -a” and “ps a” are accepted on Linux.

Kernel Commands: top

• Top is designed to provide the same type of information as ps, except that it does it in a dynamic and interactive way.

• Top provides the following information:

− Uptime

− Processes running on the system

− Memory and swap statistics

− The PID and PPID of each process

− User ID and username of each process' owner

− The priority and nice values associated with the process.

− The size of the process in memory

− Code size of the process

− Size of dirty pages

− Total amount of physical memory used by each process

− CPU time the process has used

− Percentage of CPU time and memory

− Command name of the process

Kernel Commands: top (Cont’d)

• top also relies on the /proc virtual filesystem, particularly on the stat, uptime, loadavg and meminfo files.

• top is part of the procps package. The program itself is implemented in top.c, and its main data-gathering functions are the following:

− The sprint_uptime() function fills the fields related to the time the system has been working without rebooting, as well as the load averages during this time.

− show_meminfo() is used to report memory and swap space usage.

− readproctab2() gets a list of the processes currently running on the system.

− show_task_info() gives information about a specific process.
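
The uptime and load average figures come from two small files in /proc. The sketch below shows how they could be read from a shell script. It is not the procps code: format_uptime and load_averages are hypothetical helpers, and the output format is an arbitrary choice. /proc/uptime starts with the uptime in seconds, and /proc/loadavg starts with the 1, 5 and 15 minute load averages.

```shell
# format_uptime FILE: print the uptime found in a /proc/uptime-style file
# as days, hours and minutes. (Hypothetical helper.)
format_uptime() {
    read -r up _idle < "$1"
    secs=${up%%.*}                       # drop the fractional part
    days=$((secs / 86400))
    hours=$(( (secs % 86400) / 3600 ))
    mins=$(( (secs % 3600) / 60 ))
    printf 'up %dd %02dh %02dm\n' "$days" "$hours" "$mins"
}

# load_averages FILE: print the 1, 5 and 15 minute load averages from a
# /proc/loadavg-style file. (Hypothetical helper.)
load_averages() {
    read -r one five fifteen _rest < "$1"
    echo "$one $five $fifteen"
}

if [ -r /proc/uptime ] && [ -r /proc/loadavg ]; then
    format_uptime /proc/uptime
    load_averages /proc/loadavg
fi
```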

• Top supports the following options:

− -d specifies the delay between screen updates (this can be changed when top is running by using the s command).

− -q forces top to refresh without a delay.

− -S uses cumulative mode (the CPU time listed for each process includes any children the process spawned).

− -s disables interactive commands (secure).

− -i ignores idle or zombie processes.

Kernel Commands: free

• free is used to report the amount of memory that is free and in use.

• It is part of the procps package.

• Just like the ps and top programs, free uses the /proc filesystem to output some information about the memory consumption of the system. /proc/meminfo is the file used to gather this information.

• The output includes the total amount of used and free memory in the system (both physical memory and swap), along with information about the shared memory and buffers used by the Kernel.

• The free command supports a few options:

− Memory units may be specified with the -b, -k or -m options, which force free to represent information in bytes, kilobytes or megabytes respectively.

− The -s switch, followed by a number of seconds, makes free update its output continuously at that interval.

Free Internals

• The implementation of free is located in the free.c file of the procps package.

• The code involved with the free command largely deals with text formatting, which we will not describe here.

• Since this program deals with the /proc filesystem, its most significant function is the one that fetches the data located in /proc/meminfo.

− The meminfo() function, which is also shared with the top command, is used to accomplish this task.

− Notice that the meminfo() function uses the FILE_TO_BUF macro to speed up access to the files in /proc.

− Once the /proc/meminfo file is opened, all that is needed is to scan the content of the file in order to retrieve the needed information.
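
The scanning step can be imitated in a few lines of shell. The sketch below looks up named fields such as MemTotal and MemFree in a meminfo-style file and derives the amount of used memory from them. It is not the procps implementation: meminfo_kb and mem_used_kb are hypothetical helpers, and the sketch assumes the "Name: value kB" line format.

```shell
# meminfo_kb FILE KEY: print the value (in kB) of KEY from a
# /proc/meminfo-style file. (Hypothetical helper.)
meminfo_kb() {
    awk -v key="$2" -F'[: ]+' '$1 == key { print $2 }' "$1"
}

# mem_used_kb FILE: used memory, computed as MemTotal minus MemFree.
# (Hypothetical helper.)
mem_used_kb() {
    total=$(meminfo_kb "$1" MemTotal)
    free=$(meminfo_kb "$1" MemFree)
    echo $((total - free))
}

if [ -r /proc/meminfo ]; then
    echo "used: $(mem_used_kb /proc/meminfo) kB"
fi
```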

Kernel Commands: init

• init acts as the parent of all processes. Its main purpose is to create processes during startup.

• It is the first process to be started when the system is initialized. This involves a few constraints:

− Clearly, the init process cannot be the result of a fork(), since no process has been executed before init.

− The Kernel must thus be able to start the init process. This is accomplished in the init() function in the init/main.c file (we will describe the boot process in more detail in Chapter 8).

− The Kernel uses the execve() function in order to start the init program.

− The System V (Sys V) initialization scheme is the usual standard in the Linux world. It specifies that the init program should be located in /sbin.

− However, some distributions such as Debian rely on the older BSD initialization standard. It assumes that the init program is located in /etc.

− The Kernel will look in /etc and /bin if it does not find the init program in /sbin.

Kernel Commands: init (Cont'd)

• The Kernel also supports other init replacements:

− The program to be used as the first process may be specified on the Kernel command line in the LILO boot loader. For example, the argument init=/usr/bin/mc will start the Midnight Commander file manager as the first and only process.

− If no init command line is provided in LILO and the Kernel does not find the init executable in one of the standard directories, a shell will be started (/bin/sh) upon startup.

− In this case, no startup scripts will be executed, but programs can still be run from the physical console.
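
The search order used for the first process can be summarized as a simple fallback chain. The sketch below imitates it from user space: the program given with init= is tried first, then /sbin/init, /etc/init and /bin/init, and finally /bin/sh. The name find_init and its ROOT argument are hypothetical conveniences for experimenting. The Kernel itself does this with successive execve() calls, not with a script.

```shell
# find_init ROOT [INIT_ARG]: print the program that would become the first
# process for a file system rooted at ROOT. INIT_ARG stands for the value
# of an init= boot argument. (Hypothetical helper, for illustration only.)
find_init() {
    root=$1
    init_arg=${2:-}
    for candidate in $init_arg /sbin/init /etc/init /bin/init /bin/sh; do
        if [ -x "$root$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Example: check the running system (ROOT is empty).
find_init "" || echo "no init found"
```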

• Processes that should be started by init are specified in /etc/inittab.

− This file contains some information about the programs that should be started by init during system initialization.

− Runlevels are defined in this file, so that processes may be grouped in different execution levels, depending on the kind of process to be executed.

Kernel Commands: init (Cont'd)

• The started processes are grouped in runlevels. The following is a brief description of the various runlevels used by init:

# 0 - halt (also called shutdown)
# 1 - Single user mode
# 2 - Multiuser, without NFS (Same as 3, if networking is not enabled)
# 3 - Default multiuser text mode
# 4 - Usually unused, depending on distributions.
# 5 - X-Windows for System V (default for graphical workstations).
# 6 - reboot

• The init program is used to start /etc/rc.d/rc.sysinit, which calls every necessary service that needs to be initialized at boot time.
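
Each entry in /etc/inittab has the form id:runlevels:action:process, and the default runlevel is given by the entry whose action is initdefault (for example id:3:initdefault:). The sketch below reads that entry and translates a runlevel number into the descriptions listed above. The names default_runlevel and runlevel_name are hypothetical helpers, and the wording of the descriptions is abbreviated.

```shell
# default_runlevel FILE: print the runlevel of the initdefault entry in an
# inittab-style file, ignoring comment lines. (Hypothetical helper.)
default_runlevel() {
    awk -F: '!/^#/ && $3 == "initdefault" { print $2; exit }' "$1"
}

# runlevel_name N: print a short description of runlevel N.
# (Hypothetical helper.)
runlevel_name() {
    case "$1" in
        0) echo "halt" ;;
        1) echo "single user mode" ;;
        2) echo "multiuser, without NFS" ;;
        3) echo "multiuser text mode" ;;
        4) echo "unused" ;;
        5) echo "graphical (X11)" ;;
        6) echo "reboot" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

runlevel_name 3   # prints "multiuser text mode"
```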

Kernel Commands: shutdown

• shutdown is used to bring the system down in a graceful and secure way. It also manages how processes are brought down.

− Due to the high number of critical processes running on the system and the nature of the filesystem, a computer running Linux should not be directly switched off.

− shutdown notifies users who are currently logged on that the system is going down and that they will be disconnected. It also blocks any incoming login request by disabling the login process (by creating the /etc/nologin file).

− Running processes are sent a signal (SIGTERM) that notifies them that they should stop.

− shutdown notifies the init process, which changes the runlevel of the system. By default, init switches the system to runlevel 1, which lets the user perform administrative tasks.

− shutdown also needs to synchronize the file system buffers with the physical file systems.

shutdown Internals

• shutdown is part of the sysvinit package, which also includes the sources of init, mesg, wall, last, reboot and halt.

• The reboot() system call is critical for the shutdown program. The sysvinit sources wrap this syscall in the init_reboot macro:

#if defined(__GLIBC__)
#define init_reboot(magic) reboot(magic)
#else
#define init_reboot(magic) reboot(0xfee1dead, 672274793, magic)
#endif

• Notice that the first two parameters of the reboot system call are fixed values (0xfee1dead and 672274793). These are called "magic numbers", and they act as a safeguard to make sure that the caller really intended to bring the system down.
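
The second magic number is written in decimal in the macro, which hides its structure. The small sketch below converts it to hexadecimal so that it can be compared with the first one. The name to_hex is a hypothetical helper built on the standard printf utility.

```shell
# to_hex N: print the decimal number N in hexadecimal notation.
# (Hypothetical helper, for illustration only.)
to_hex() {
    printf '0x%x\n' "$1"
}

to_hex 672274793   # prints "0x28121969"
```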

shutdown Internals (Cont’d)

• The third parameter is the important one, and varies depending on the arguments passed by the user executing the shutdown command:

− LINUX_REBOOT_CMD_RESTART makes the system reboot.

− LINUX_REBOOT_CMD_POWER_OFF powers down the system.

− LINUX_REBOOT_CMD_CAD_ON and LINUX_REBOOT_CMD_CAD_OFF are used to enable or disable the CTRL-ALT-DEL keys on the system.

Kernel Commands: strace

• strace is used for tracing the system calls invoked by a given process as well as monitoring the received signals.

• This program is a useful debugging and educational tool.

− By knowing which system calls are invoked, the user can gain a fairly deep understanding of the way a program works.

− The traced programs do not need to be compiled in debug mode in order to be traced.

− The strace program does not need the sources of the traced program.

• strace returns the name of the system call, its arguments and the return values. Each line corresponds to a system call invoked by the program during its execution.

strace Example

• A simple example could give you an idea of how powerful the strace command is:

− Create a new file containing a single character:

echo "1" > ~/testfile

− Use strace to examine the system calls used by the cat command when it is run on your newly created file:

strace cat ~/testfile

− Try to match each output line with an operation that you would expect from a command such as the one that you have just executed.

• As you have seen in the previous example, the number of system calls reported can grow fairly large. The -e option can be used to filter the syscalls so that only a subset of them is shown:

− The trace value can be the name of a system call or one of the following classes: file, process, network, ipc and signal. The resulting command would look like the following:

strace -e trace=network ping 192.168.1.1

Kernel Commands: traceroute

• traceroute prints the route that a packet takes in order to reach the specified host.

• This command has only one mandatory argument, which is the name or IP address of the host to reach.

• The traceroute program sends UDP probe packets with a small TTL (Time To Live) value, starting at 1.

− When a gateway receives a packet whose TTL has expired, the program expects to get an ICMP "time exceeded" reply from that gateway.

− By gradually incrementing the TTL, traceroute can find every gateway through which the packet goes.

• The traceroute program assumes that each gateway will send an ICMP reply when the TTL expires, which is not always the case.

− When no reply arrives, a star (*) is shown to let the user know that the gateway did not respond.
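The TTL mechanism can be illustrated without touching the network. The following sketch is a pure simulation under stated assumptions: the function trace_route, the path list and all addresses in it are hypothetical, and each hop is simply marked as replying or silent.

```python
def trace_route(path, max_ttl=30):
    """Simulate traceroute over a known path.

    path is a list of (address, replies) tuples ordered from the first
    gateway to the destination. A probe sent with TTL n expires at hop n.
    """
    report = []
    for ttl in range(1, min(max_ttl, len(path)) + 1):
        address, replies = path[ttl - 1]
        # A silent gateway is reported as "*", as traceroute does.
        report.append((ttl, address if replies else "*"))
    return report


if __name__ == "__main__":
    # Hypothetical three-hop path; the second gateway never answers.
    path = [("192.168.1.1", True), ("10.0.0.1", False), ("192.0.2.7", True)]
    for ttl, hop in trace_route(path):
        print(ttl, hop)
```

Raising the TTL by one per probe is what lets the program discover the gateways in order, one hop at a time.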

Kernel Commands: mount - umount

• mount is used to attach a filesystem to the directory tree that starts at the root filesystem. Similarly, umount detaches a previously mounted filesystem from that tree.

• The mount command is part of the util-linux package. The sources for mount are located in mount/mount.c

− The mount() system call is crucial for this program.

• Running mount without any arguments shows the filesystems currently mounted on the system.

• The mount program expects the user to specify the type of the filesystem to be mounted.

− The code in mount/mount_guess_fstype.c is used in cases where the filesystem type is not specified. It will try to find the filesystem type that the user tries to mount by looking in its superblock.

− Recall that the filesystems currently supported by the system are listed in /proc/filesystems.
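The list in /proc/filesystems is easy to process from a program. The sketch below assumes the usual layout of that file (an optional "nodev" marker, a tab, then the filesystem name); the function name parse_proc_filesystems and the sample text are hypothetical, and the sample is used instead of the real file so that the example behaves the same on any machine.

```python
def parse_proc_filesystems(text):
    """Return a list of (fstype, needs_device) tuples.

    Entries marked "nodev" are virtual filesystems that are not backed
    by a block device.
    """
    supported = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        needs_device = fields[0] != "nodev"
        supported.append((fields[-1], needs_device))
    return supported


if __name__ == "__main__":
    # Hypothetical excerpt in the format used by /proc/filesystems.
    sample = "nodev\tproc\n\text2\nnodev\ttmpfs\n\tiso9660\n"
    for fstype, needs_device in parse_proc_filesystems(sample):
        print(fstype, "block device" if needs_device else "nodev")
```

On a live system the same function could be fed the contents of /proc/filesystems read with open().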

Questions

1. Explain the memory values that you obtain by running the top command on your system.

2. Turn your swap partition off by using the swapoff command. Examine the output of top. What is different now? Does this have severe effects on the system responsiveness?

3. By looking at the ps man pages, find how you could list every process on a system except the ones associated with user “tux”.

4. Why is the init process considered the “father of all processes”?

5. Why is the strace tool often considered awkward for debugging large programs?

6. Name at least three other Kernel-related commands apart from the ones that we have just described.

7. Why do most of the programs we have looked at make use of the /proc filesystem, instead of using system calls?

8. You found that the web and FTP servers on your system were cracked, but you don’t want to switch it off and lose the 679 days of uptime that you reached. How could you quickly close every service on this server without losing this impressive uptime?

Summary

• The ps and top commands are used to provide information about processes running on the system.

• The free command presents information about the current memory state of the system.

• init is the most crucial process for a Linux system, since it is the first process executed. It also manages the execution of the various programs started at boot time through runlevels.

• The shutdown command is necessary on any UNIX system in order to bring the system down in a secure way.

• strace is a powerful tool that allows programmers to gain a clear understanding of the way a program is built, even if the source is not available.

• The /proc filesystem is an elegant way to avoid making use of system calls in order to look at the state of the system. The fact that it is almost entirely text-based also makes it easy to integrate into programs.

Chapter 11

SCSI Subsystem

SCSI Subsystem

Objectives

After completing this unit you will be able to:

• Understand the basic architecture under which the SCSI subsystem is built.

• Identify the layers through which SCSI commands are communicated to the hardware.

• Describe the mechanisms and functions associated with each of the layers used in the SCSI subsystem.

• Have a basic understanding of how a SCSI controller driver is implemented in Linux.

SCSI Architecture Overview

• The SCSI subsystem is integrated in the Kernel as a 3-level architecture.

− The upper level is the closest to User space.

− The lower level is the closest to the hardware.

− The mid level is the unifying layer between the upper and lower levels.

• Any operation using the SCSI subsystem will involve one driver at each of these three levels.

− For example, a User space application will access a device implemented by the upper level, which will call the mid level for requests to the physical device.

− The mid level will in turn communicate with the lower level, which implements the SCSI controller driver and allows access to the physical device.

• SCSI has one particular property: the only hardware-specific code is the code that drives the SCSI controller.

− The SCSI standard is strictly defined, so the upper layers can handle the details of access to disks, CDROMs or tapes in a generic way.

− Therefore, the only code that might change if the hardware is modified is the controller device driver.

− Generally, no device drivers are necessary for devices such as disks and CDROMs.

SCSI Architecture Overview (Cont’d)

• The SCSI architecture may be viewed as the following diagram:

[Diagram: User Space sits above Kernel Space. Inside Kernel Space, the Upper Level (SD Block Devices, SR Block Devices, ST Character Devices, SG Character Devices) sits on top of the SCSI Mid Level, which sits on top of the Low Level (SCSI Adapter drivers and a Pseudo Driver for non-SCSI Services).]

Names and Conventions

• A naming convention must be used in cases where the user and the Kernel need to refer to a specific device.

• SCSI device names are located in the /dev directory. Like every other device, they are identified by their device type (block or character) and their major and minor device numbers.

− Each major number can support 256 minor numbers.

− Major numbers associated with a specific device are listed in Documentation/devices.txt. You should always refer to this file before associating a device with a given major number.

• Eight major numbers are reserved for SCSI disk (sd) devices: 8, 65, 66, 67, 68, 69, 70 and 71.

− A single major number, for example 8, allows devices to span from /dev/sda (which is allocated major 8 and minor 0) to /dev/sdp15 (which is allocated major 8 and minor 255).

− The next major number, 65, will span from /dev/sdq to /dev/sdaf15.

− Note that a maximum of 15 partitions are possible on a single SCSI disk, so the ending digit of any SCSI device will not be larger than 15.

− Devices that do not end with a digit (e.g., /dev/sda) refer to a whole SCSI disk.

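− Each disk uses 16 minor numbers (one for the whole disk and 15 for its partitions), so each major number covers 16 disks. The maximum number of SCSI disks is therefore 8 x 16 = 128 (compared to 20 for the IDE subsystem).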
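The naming scheme above amounts to simple arithmetic, which the following sketch makes explicit. It is an illustration only: the function name sd_device_numbers is hypothetical, and it assumes the layout described above (8 majors, 16 disks per major, 16 minors per disk).

```python
import re

SD_MAJORS = [8, 65, 66, 67, 68, 69, 70, 71]
DISKS_PER_MAJOR = 16
MINORS_PER_DISK = 16  # whole disk plus 15 partitions


def sd_device_numbers(name):
    """Map a name such as /dev/sdq3 to its (major, minor) pair."""
    match = re.fullmatch(r"(?:/dev/)?sd([a-z]+)(\d*)", name)
    if match is None:
        raise ValueError("not an sd device name: %r" % name)
    letters, digits = match.groups()

    # Letters count like spreadsheet columns: a..z, aa..az, ba..
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("a") + 1)
    index -= 1  # sda is disk 0

    partition = int(digits) if digits else 0  # 0 means the whole disk
    if index >= len(SD_MAJORS) * DISKS_PER_MAJOR:
        raise ValueError("disk index out of range: %r" % name)
    if partition >= MINORS_PER_DISK:
        raise ValueError("partition number out of range: %r" % name)

    major = SD_MAJORS[index // DISKS_PER_MAJOR]
    minor = (index % DISKS_PER_MAJOR) * MINORS_PER_DISK + partition
    return major, minor


if __name__ == "__main__":
    for name in ("/dev/sda", "/dev/sdp15", "/dev/sdq", "/dev/sdaf15"):
        print(name, sd_device_numbers(name))
```

The four names printed by the example reproduce the boundaries quoted above: (8, 0), (8, 255), (65, 0) and (65, 255).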

Names and Conventions (Cont’d)

• SCSI CDROM (sr) devices have been allocated the block major number 11.

− 256 CDROM devices are therefore supported on a single system.

− Most recent Linux distributions refer to sr devices as “scd” in the /dev directory.

• SCSI Tape devices are allocated the char major number 9.

− 32 tapes are supported on a single system, and each one of them may be accessed in four different modes. This takes up 128 minor numbers.

− Tape devices may also be accessed in rewind or non-rewind mode. Minor numbers from 0 to 127 are used for rewind mode, whereas minor numbers from 128 to 255 are used for non-rewind mode.

− For example, /dev/st0 refers to tape number 0, mode 0 in rewind mode. /dev/nst0 refers to tape number 0, mode 0 in non-rewind mode. /dev/st0l, /dev/st0m and /dev/st0a refer to mode 1, 2 and 3 respectively.

• SCSI generic (sg) devices are allocated the char major number 21 and support 256 sg devices, which are identified from /dev/sg0 to /dev/sg255.
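The tape minor numbers described above pack three pieces of information into one byte. The sketch below decodes them, assuming the layout listed in Documentation/devices.txt (tape number in the low five bits, mode in the next two bits, non-rewind flag in the top bit); the function name decode_st_minor is hypothetical.

```python
MODE_SUFFIX = ["", "l", "m", "a"]  # modes 0, 1, 2 and 3


def decode_st_minor(minor):
    """Return (device name, tape number, mode, non-rewind flag)."""
    if not 0 <= minor <= 255:
        raise ValueError("minor number out of range: %d" % minor)
    tape = minor & 0x1F          # 32 tapes
    mode = (minor >> 5) & 0x03   # 4 modes
    no_rewind = bool(minor & 0x80)
    prefix = "n" if no_rewind else ""
    name = "/dev/%sst%d%s" % (prefix, tape, MODE_SUFFIX[mode])
    return name, tape, mode, no_rewind


if __name__ == "__main__":
    for minor in (0, 32, 64, 96, 128):
        print(minor, decode_st_minor(minor)[0])
```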

Upper Level SCSI Layer

• The upper level drivers are usually referred to with two-character device names:

− sd stands for SCSI disks, like hard drives. These are always block devices.

− sr devices correspond to CDROM and DVD drives, which are also block devices.

− st stands for SCSI tapes, which are character devices.

− sg stands for SCSI generic. These are pass-through character devices, which are commonly used by scanner drivers.

• The upper level implements the User-Kernel interfaces.

• The files in the Kernel source tree that implement the upper level are the following:

− drivers/scsi/sd.c and drivers/scsi/sr.c provide the SCSI disk and CDROM device drivers, respectively.

− drivers/scsi/st.c and drivers/scsi/sg.c provide the SCSI tape and generic device drivers, respectively.

• The modules corresponding to each device type are sd_mod.o, sr_mod.o, st.o and sg.o. They are needed for SCSI disks, SCSI CDROMs, SCSI tapes and generic devices, respectively.

Block Devices

• Block devices are represented by the “sd” and “sr” components of the upper layer.

• The most common operation on a block device is to mount a filesystem.

− For a sd (SCSI disk) device, this is typically done with the following command:

mount -t ext2 /dev/sda6 /home

− For a sr (CDROM for example) device, this would be achieved in the following way:

mount -t iso9660 /dev/sr0 /mnt/cdrom

− The dd (disk dump) command can also be used to read from or write to block devices. In this case, the block size parameter (bs) needs to be set to the block size of the device, which is usually 512 bytes.

− The fdisk command can also be used to partition sd devices.

− The hdparm utility, which is usually used to change settings related to IDE hard disks, can also be used to change some options for SCSI disks.

CDROM And Char Devices

• The “sr” component of the upper layer is part of the CD-ROM subsystem.

− Just as for SCSI disks, these kinds of devices may be mounted with the mount command.

− Audio CDs may also be read. This does not involve mounting the disc. Instead, the operation uses ioctls to gain access to the CD-ROM drive.

• Character devices include the st component of the upper layer for reading and writing to tape devices.

− General-purpose commands such as tar and dd may be used to read and write these devices, while the mt command has been specially designed to control tape operations such as rewinding and positioning.

Generic drivers

• The Linux sg driver is an upper level SCSI subsystem device driver that is used to handle devices that are not covered by the other upper level drivers: sd (disks), sr (CDROMs) and st (tapes).

• General-purpose commands such as dd and mount may not be used for sg devices.

• This device is usually used by applications such as SANE (for scanners) and cdrecord (for CD writers).

• This driver supports system calls that are expected to work on any character device, like open(), close(), write(), read() and ioctl().

• The sg_io_hdr_t data structure is used with these system calls to describe the SCSI command to issue and the data to be read or written by that command. The following fields are included, among others:

− dxfer_direction determines whether the current operation transfers data to the device (write) or from the device (read).

− cmd_len is the length of the command that cmdp points to.

− timeout is the time, in milliseconds, after which a command is aborted.

− status contains the status byte, defined in the SCSI standard.

− info contains auxiliary information about the completed operation.
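To make the fields above more concrete, the following sketch mirrors the structure with ctypes and fills it in for a SCSI INQUIRY command. It is a hedged illustration: the field order follows <scsi/sg.h> as the author of this sketch understands it, the class name SgIoHdr and the function build_inquiry_request are hypothetical, and nothing is sent to a device (a real program would pass the header to the sg driver, typically through ioctl()).

```python
import ctypes

# Transfer directions as defined in <scsi/sg.h>.
SG_DXFER_NONE = -1
SG_DXFER_TO_DEV = -2
SG_DXFER_FROM_DEV = -3


class SgIoHdr(ctypes.Structure):
    """Python mirror of sg_io_hdr_t (assumed layout, see lead-in)."""
    _fields_ = [
        ("interface_id", ctypes.c_int),      # 'S' for SCSI generic
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),       # size of the sense buffer
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),         # data buffer
        ("cmdp", ctypes.c_void_p),           # SCSI command bytes
        ("sbp", ctypes.c_void_p),            # sense buffer
        ("timeout", ctypes.c_uint),          # milliseconds
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),          # SCSI status byte (output)
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),             # auxiliary information (output)
    ]


def build_inquiry_request(alloc_len=96, timeout_ms=5000):
    """Prepare a header for a 6-byte INQUIRY command (opcode 0x12).

    The buffers are returned as well so that they stay alive for as
    long as the header points to them.
    """
    cdb = (ctypes.c_ubyte * 6)(0x12, 0, 0, 0, alloc_len, 0)
    data = ctypes.create_string_buffer(alloc_len)
    sense = ctypes.create_string_buffer(32)

    hdr = SgIoHdr()
    hdr.interface_id = ord("S")
    hdr.dxfer_direction = SG_DXFER_FROM_DEV  # the device sends data back
    hdr.cmd_len = len(cdb)
    hdr.mx_sb_len = len(sense)
    hdr.dxfer_len = alloc_len
    hdr.dxferp = ctypes.addressof(data)
    hdr.cmdp = ctypes.addressof(cdb)
    hdr.sbp = ctypes.addressof(sense)
    hdr.timeout = timeout_ms
    return hdr, cdb, data, sense


if __name__ == "__main__":
    hdr, cdb, data, sense = build_inquiry_request()
    print(hdr.cmd_len, hdr.dxfer_direction, hdr.timeout)
```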

Mid Level SCSI Layer

• This layer is mostly implemented in drivers/scsi/scsi.c.

• The SCSI mid level is common to all operations, independently of the upper level component that is used and the SCSI controller installed on the system.

• It provides internal interfaces and common services to the upper and lower level drivers.

• The mid level uses the Scsi_Host_Template object, which provides an interface to low-level drivers.

− This object offers the interface that is used by the mid level layer to communicate to the low level layer.

− This is the only way that the SCSI controller (and therefore the SCSI devices) may be accessed.

− As long as the interface of the Scsi_Host_Template object does not change, new SCSI controller drivers may be added to the Kernel without modifying any other code. This is one of the main benefits of the “object-oriented” design used in the Kernel.

• The mid layer also uses the scsi_cmnd data structure.

− This data structure is passed as an argument to most of the functions implemented by the low-level device driver.

− It is used by the high-level code to specify a SCSI command for execution by the low-level code.

− This data structure contains a private part and a public part. Low-level drivers use only the public part.

Low Level SCSI Layer

• A SCSI controller driver basically implements a series of functions declared in the Scsi_Host_Template (SHT) data structure. This provides the interface that the mid-level layer uses in order to access the SCSI device.

• The functions implemented by the device driver are defined in the Scsi_Host_Template object. The following are the fields that are most commonly implemented:

− The proc_info() function is used to export information about the device to the /proc filesystem.

− name and procname contain the name of the device driver.

− The detect() function is called by the SCSI layer when the driver is initialized. This function is called at boot time or when a SCSI device module is loaded.

− The info() function returns a description of the actual controller itself.

− The queuecommand() function issues a SCSI command and does not wait for it to finish.

− The command() function issues a SCSI command and then waits for it to complete. This usually uses queuecommand().

− The abort() function is used to handle error situations. For example, if a time out occurred in the mid level layer, it may want to abort the command.
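The relationship between the mid level and a controller driver can be sketched as an interface with one concrete implementation. This is an analogy in Python, not kernel code: the classes Command, HostTemplate and LoopbackHost, the function mid_level_submit and the method process_completions (a stand-in for the controller's interrupt handler) are all hypothetical.

```python
class Command:
    """Stand-in for the scsi_cmnd structure."""

    def __init__(self, cdb):
        self.cdb = bytes(cdb)
        self.result = None


class HostTemplate:
    """Interface the mid level relies on; a driver overrides the methods."""
    name = "unnamed"

    def detect(self):
        """Return the number of controllers found."""
        raise NotImplementedError

    def info(self):
        return self.name

    def queuecommand(self, cmd, done):
        """Start cmd and return at once; call done(cmd) on completion."""
        raise NotImplementedError

    def process_completions(self):
        raise NotImplementedError

    def abort(self, cmd):
        raise NotImplementedError

    def command(self, cmd):
        """Blocking variant, built on top of queuecommand()."""
        finished = []
        self.queuecommand(cmd, finished.append)
        while not finished:
            self.process_completions()
        return cmd.result


class LoopbackHost(HostTemplate):
    """Toy controller that completes every command successfully."""
    name = "loopback"

    def __init__(self):
        self.pending = []

    def detect(self):
        return 1

    def queuecommand(self, cmd, done):
        self.pending.append((cmd, done))

    def process_completions(self):
        while self.pending:
            cmd, done = self.pending.pop(0)
            cmd.result = 0  # 0 models a GOOD status
            done(cmd)

    def abort(self, cmd):
        before = len(self.pending)
        self.pending = [(c, d) for c, d in self.pending if c is not cmd]
        return len(self.pending) < before


def mid_level_submit(host, cdb):
    """The mid level only talks to the HostTemplate interface."""
    return host.command(Command(cdb))


if __name__ == "__main__":
    host = LoopbackHost()
    print(host.info(), host.detect(), mid_level_submit(host, [0x00] * 6))
```

Because mid_level_submit() depends only on the interface, another controller class could be substituted without changing it, which mirrors the benefit described for Scsi_Host_Template.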

Summary

• The Linux SCSI subsystem relies on three specific layers:

− The low level layer interfaces directly with the hardware.

− The mid level layer receives commands from upper levels and sends requests to the low level drivers.

− The high level layer provides the user with devices that can be accessed with common system calls.

• SCSI devices are divided into four different types:

− SCSI Disks (sd)

− SCSI CDROMs (sr)

− SCSI tape devices (st)

− SCSI generic devices (sg)

• The implementation of SCSI controllers is object oriented. A set of well-defined functions needs to be implemented in order to have a fully functioning driver.
