
BY A Mikati & M Shaito Awk Utility n Introduction n Some basics n Some samples n Patterns & Actions Regular Expressions n Boolean n start /end n.

Jan 05, 2016


By A Mikati & M Shaito


Awk Utility

Introduction
Some basics
Some samples
Patterns & Actions
Regular Expressions
Boolean
start / end
BEGIN / END


Awk Utility (continued)

Awk variables
Control of flow statements:
a: If_Else statement
b: While statement
c: For statement


Introduction:

History:

The name awk comes from the initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version of awk was written in 1977. In 1985 a new version made the programming language more powerful, introducing user-defined functions, multiple input streams, and computed regular expressions.


Introduction (cont’d):

If you are like many computer users, you would frequently like to make changes in various text files wherever certain patterns appear, or extract data from parts of certain lines while discarding the rest. To write a program to do this in a language such as C or Pascal is a time-consuming inconvenience that may take many lines of code. The job may be easier with awk.

The awk utility interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs easily with just a few lines of code.


Some Basics:

The basic function of awk is to search files for lines (or other units of text) that contain certain patterns.

Awk recognizes the concepts of "file", "record", and "field".

A file consists of records, which by default are the lines of the file. One line becomes one record.

Awk operates on one record at a time.

A record consists of fields, which by default are separated by any number of spaces or tabs.

Field number 1 is accessed with $1, field 2 with $2, and so forth. $0 refers to the whole record.
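The record and field rules above can be tried on a small input. This is only a sketch: the two-line sample data is made up, it is fed through a pipe instead of a file, and NF (awk's built-in count of fields in the current record) is not mentioned on the slide.

```shell
# Hypothetical sample input: two records, fields separated by spaces or a tab.
input=$(printf 'alice  30\tparis\nbob 25 rome\n')

# $1 and $3 pick individual fields; NF is the built-in field count.
fields_out=$(printf '%s\n' "$input" | awk '{ print $1, $3, NF }')

# $0 is the whole record, exactly as it was read.
record_out=$(printf '%s\n' "$input" | awk '{ print $0 }')

printf '%s\n' "$fields_out"
```

Runs of spaces and tabs count as a single separator, so both records have three fields.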


Some Samples:

>awk '{print $0}' filename

Perhaps the quickest way of learning awk is to look at some sample programs. The one above will print the file in its entirety, just like cat(1). Here are some others, along with a quick description of what they do.

>awk '{print $2,$1}' filename

will print the second field, then the first. All other fields are ignored.

What if you don't want to apply the program to each line of the file? Say, for example, that you only wanted to process lines that had the first field greater than the second. The following program will do that:

>awk '$1 > $2 {print $1,$2,$1-$2}' filename
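To see what the last two samples print, here is a sketch that runs them on three made-up records. The input values are assumptions chosen for illustration, and a pipe stands in for filename.

```shell
# Hypothetical input: two numeric fields per record.
input=$(printf '5 3\n2 7\n10 4\n')

# Second field first, then the first field.
swapped_out=$(printf '%s\n' "$input" | awk '{print $2,$1}')

# Only records whose first field is greater than the second,
# followed by the difference of the two fields.
diff_out=$(printf '%s\n' "$input" | awk '$1 > $2 {print $1,$2,$1-$2}')

printf '%s\n' "$swapped_out" "$diff_out"
```

The record "10 4" is selected because fields that look like numbers are compared numerically, not as strings.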


Patterns & Actions:

The part outside the curly braces is called the "pattern", and the part inside is the "action". The comparison operators are the ones from C:

== != < > <= >=

C's conditional operator ?: is available as well. If no pattern is given, then the action applies to all lines. This fact was used in the sample programs above. If no action is given, then the entire line is printed. If "print" is used all by itself, the entire line is printed. Thus, the following are equivalent:

awk '$1 > $2' filename
awk '$1 > $2{print}' filename
awk '$1 > $2{print $0}' filename
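A quick way to check that the three forms really are equivalent is to run them on the same input and compare the results. The sample records below are made up for this sketch.

```shell
# Hypothetical input: two numeric fields per record.
input=$(printf '5 3\n2 7\n10 4\n')

no_action=$(printf '%s\n' "$input" | awk '$1 > $2')
bare_print=$(printf '%s\n' "$input" | awk '$1 > $2{print}')
print_record=$(printf '%s\n' "$input" | awk '$1 > $2{print $0}')

# All three variables hold the same two records.
printf '%s\n' "$no_action"
```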


Patterns & Actions: (cont’d)

The various fields in a line can also be treated as strings instead of numbers. To compare a field to a string, use the following method:

>awk '$1=="foo"{print $2}' filename
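As a sketch of the string comparison above, the following runs it on three made-up records, two of which have "foo" as their first field.

```shell
# Hypothetical input: a word followed by a number.
str_out=$(printf 'foo 1\nbar 2\nfoo 3\n' | awk '$1=="foo"{print $2}')

# Prints the second field of the two "foo" records.
printf '%s\n' "$str_out"
```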

There are various types of patterns and actions, which are explained in detail below.


Kinds of patterns:

/regular expression/
A regular expression as a pattern. It matches when the text of the input record fits the regular expression.

expression
A single expression. It matches when its value, converted to a number, is nonzero (if a number) or non-null (if a string).

BEGIN
END
Special patterns to supply start-up or clean-up information to awk.

null
The empty pattern matches every input record.
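The "expression" kind of pattern is the least obvious one, so here is a small sketch of it. The input is made up, and NF (the built-in field count) is an addition that the slide does not name.

```shell
# Hypothetical input, including an empty line and a record whose $2 is 0.
input=$(printf 'a 1\n\nb 0\nc 2\n')

# $2 used as a pattern: matches when it is a nonzero number or a non-null string.
expr_out=$(printf '%s\n' "$input" | awk '$2')

# NF used as a pattern: matches every record that has at least one field.
nonempty_out=$(printf '%s\n' "$input" | awk 'NF')

printf '%s\n' "$expr_out"
```

The record "b 0" is dropped by the first command because its second field is the number zero, and the empty line is dropped by both.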


Regular Expressions:

A regular expression, or regexp, is a way of describing a class of strings. A regular expression enclosed in slashes (`/') is an awk pattern that matches every input record whose text belongs to that class.

The simplest regular expression is a sequence of letters, numbers, or both. Such a regexp matches any string that contains that sequence. Thus, the regexp `foo' matches any string containing `foo'. Therefore, the pattern /foo/ matches any input record containing `foo'. Other kinds of regexps let you specify more complicated classes of strings.

>awk '/foo.*bar/{print $1,$3}' filename
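The regexp above matches any record in which `foo` is followed, anywhere later in the record, by `bar`. A sketch with made-up input:

```shell
# Hypothetical input: only records 1 and 3 have "foo" somewhere before "bar".
regex_out=$(printf 'foo x bar\nbar x foo\nfood is barely ok\n' |
    awk '/foo.*bar/{print $1,$3}')

printf '%s\n' "$regex_out"
```

The third record matches because the regexp is tested against the text of the record, not against whole words: "food" contains `foo` and "barely" contains `bar`.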


Boolean:

A Boolean pattern is an expression which combines other patterns using the Boolean operators "or" (`||'), "and" (`&&'), and "not" (`!'). Whether the Boolean pattern matches an input record depends on whether its subpatterns match.

For example, the following command prints all records in the input file `filename' that contain both `2400' and `foo'.

awk '/2400/ && /foo/' filename
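The three Boolean operators can be compared side by side. This sketch uses four made-up records; the "or" and "not" patterns are extra illustrations beyond the slide's own example.

```shell
# Hypothetical input.
input=$(printf '2400 foo\n2400 bar\n1200 foo\nfoo at 2400\n')

# "and": both subpatterns must match the record.
and_out=$(printf '%s\n' "$input" | awk '/2400/ && /foo/')

# "or": at least one subpattern must match.
or_out=$(printf '%s\n' "$input" | awk '/bar/ || /1200/')

# "not": the subpattern must not match.
not_out=$(printf '%s\n' "$input" | awk '!/foo/')

printf '%s\n' "$and_out"
```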


Start & end:

There are three special forms of patterns that do not fit the above descriptions. One is the start-end pair of patterns, also known as a range pattern. It is made of two patterns separated by a comma, of the form startpat, endpat, and it matches ranges of consecutive input records. The first pattern, startpat, controls where the range begins, and the second one, endpat, controls where it ends. For example, the following prints every record from one whose first field is "on" through the next one whose first field is "off":

awk '$1 == "on", $1 == "off"' filename
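A sketch of the range pattern on made-up input, with one record before the range and one after it:

```shell
# Hypothetical input: the range runs from the "on" record to the "off" record.
range_out=$(printf 'skip 1\non 2\nmid 3\noff 4\nskip 5\n' |
    awk '$1 == "on", $1 == "off"')

printf '%s\n' "$range_out"
```

Both end points are included in the output, and the records outside the range are skipped.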


BEGIN /END:

Any action associated with the BEGIN pattern will happen before any line-by-line processing is done. Actions with the END pattern will happen after all lines are processed.

But how do you put more than one pattern-action pair into an awk program? There are several choices.

One is to just mash them together, like so:

>awk 'BEGIN{print"fee"}\
$1=="foo"{print"fi"}\
END{print"fo fum"}' filename


BEGIN /END: (cont’d)

Another choice is to put the program into a file, like so:

BEGIN{print"fee"}
$1=="foo"{print"fi"}
END{print"fo fum"}

Let's say that's in the file giant.awk. Now, run it using the "-f" flag to awk:

>awk -f giant.awk filename
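The program-file approach can be tried end to end as follows. This is a sketch under assumptions: it works in a scratch directory created with mktemp (commonly available, though not guaranteed on every system), and data.txt with its two records is a made-up stand-in for filename.

```shell
# Work in a throwaway directory so nothing is left behind.
tmpdir=$(mktemp -d)

# The program file: one pattern-action pair per line.
cat > "$tmpdir/giant.awk" <<'EOF'
BEGIN { print "fee" }
$1 == "foo" { print "fi" }
END { print "fo fum" }
EOF

# Hypothetical data file with one matching record.
printf 'foo 1\nbar 2\n' > "$tmpdir/data.txt"

file_out=$(awk -f "$tmpdir/giant.awk" "$tmpdir/data.txt")
rm -rf "$tmpdir"

printf '%s\n' "$file_out"
```

"fee" is printed before any record is read, "fi" once for the single matching record, and "fo fum" after the last record.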


BEGIN / END : (cont’d)

A third choice is to create a file that calls awk all by itself. The following form will do the trick:

#!/usr/bin/awk -f
BEGIN{print"fee"}
$1=="foo"{print"fi"}
END{print"fo fum"}

If we call this file giant2.awk, we can run it by first giving it execute permissions,

>chmod u+x giant2.awk

and then just call it like so:

>./giant2.awk filename


BEGIN /END: (cont’d)

awk has variables that can be either real numbers or strings. For example, the following code prints a running total of the fifth column:

>awk '{print x+=$5,$0 }' filename

This can be used when looking at file sizes from an "ls -l". It is also useful for balancing one's checkbook, if the amount of the check is kept in one column.
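A sketch of the running total on made-up five-column input. The second command, which prints only the grand total from an END action, is an extra illustration and not part of the slide.

```shell
# Hypothetical input: the amount is kept in the fifth column.
input=$(printf 'a b c d 10\na b c d 20\na b c d 5\n')

# Running total in front of every record; x starts out as 0.
running_out=$(printf '%s\n' "$input" | awk '{print x+=$5,$0 }')

# Variation: accumulate silently and print the total once at the end.
total_out=$(printf '%s\n' "$input" | awk '{ x += $5 } END { print x }')

printf '%s\n' "$running_out"
```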


Actions:

An awk program or script consists of a series of rules and function definitions, interspersed. A rule contains a pattern and an action, either of which may be omitted. The purpose of the action is to tell awk what to do once a match for the pattern is found. Thus, the entire program looks somewhat like this:

[pattern] [{ action }]
[pattern] [{ action }]
function name (args) { ... }

An action consists of one or more awk statements, enclosed in curly braces (`{' and `}'). Each statement specifies one thing to be done. The statements are separated by newlines or semicolons.


Actions: (cont’d)

Here are the kinds of statements supported in awk:

1)Expressions, which can call functions or assign values to variables. Executing this kind of statement simply computes the value of the expression and then ignores it. This is useful when the expression has side effects.

2)Control statements, which specify the control flow of awk programs. The awk language gives you C-like constructs (if, for, while, and so on) as well as a few special ones.

3)Compound statements, which consist of one or more statements enclosed in curly braces. A compound statement is used in order to put several statements together in the body of an if, while, do or for statement.


Actions:(cont’d)

4)Input control, using the getline command and the next statement.

5)Output statements, print and printf.

6)Deletion statements, for deleting array elements.
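Several of the statement kinds listed above can be combined in one short program. This is a sketch only: the input and the names total, seen, k and n are made up for illustration.

```shell
stmt_out=$(printf 'skip me\nkeep 3\nkeep 4\n' | awk '
$1 == "skip" { next }           # input control: go straight to the next record
{
    total += $2                 # expression statement with a side effect
    seen[$2] = 1                # remember the value as an array element
    printf "%s=%d\n", $1, $2    # output statement
}
END {
    delete seen[3]              # deletion statement: remove one array element
    n = 0
    for (k in seen) n++         # control statement: count what is left
    print "total", total, "left", n
}')

printf '%s\n' "$stmt_out"
```

The statements inside each action are separated by newlines here; semicolons would work equally well.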


Awk variables

Most awk variables are available for you to use for your own purposes; they never change except when your program assigns values to them, and never affect anything except when your program examines them.

A few variables have special built-in meanings. Some of them awk examines automatically, so that they enable you to tell awk how to do certain things. Others are set automatically by awk, so that they carry information from the internal workings of awk to your program.

User-modified: built-in variables that you change to control awk.
Auto-set: built-in variables through which awk gives you information.
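The slide does not name any built-in variables, so the four used in this sketch are additions: FS (input field separator) and OFS (output field separator) are of the user-modified kind, while NR (record number) and NF (number of fields) are auto-set. The colon-separated input is made up.

```shell
# FS and OFS are set by the program; NR and NF are filled in by awk.
vars_out=$(printf 'root:x:0\ndaemon:x:1\n' |
    awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1 }')

printf '%s\n' "$vars_out"
```

Setting FS in a BEGIN action makes sure it is in effect before the first record is split into fields.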


Control of flow statements:

Control statements such as if, while, and so on control the flow of execution in awk programs. Most of the control statements in awk are patterned on similar statements in C.

All the control statements start with special keywords such as if and while, to distinguish them from simple expressions.

Many control statements contain other statements; for example, the if statement contains another statement which may or may not be executed. The contained statement is called the body. If you want to include more than one statement in the body, group them into a single compound statement with curly braces, separating them with newlines or semicolons.


If-Else statement:

The if-else statement is awk's decision-making statement. It looks like this:

if (condition) then-body [else else-body]

condition is an expression that controls what the rest of the statement will do. If condition is true, then-body is executed; otherwise, else-body is executed (assuming that the else clause is present). The else part of the statement is optional. The condition is considered false if its value is zero or the null string, and true otherwise.

awk '{ if (x % 2 == 0) print "x is even"; else print "x is odd" }'
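The example above tests a variable x that is never assigned, so it is treated as zero for every record. As a sketch, this variant tests the first field of each record instead; the two input numbers are made up.

```shell
ifelse_out=$(printf '4\n7\n' |
    awk '{ if ($1 % 2 == 0) print $1, "is even"; else print $1, "is odd" }')

printf '%s\n' "$ifelse_out"
```

The semicolon before else is needed when the then-body and the else appear on the same line without curly braces.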


While Statement:

In programming, a loop means a part of a program that is (or at least can be) executed two or more times in succession.

The while statement is the simplest looping statement in awk. It repeatedly executes a statement as long as a condition is true. It looks like this:

while (condition) body

This example prints the first three fields of each record, one per line.

awk '{ i = 1
       while (i <= 3) {
           print $i
           i++
       }
}'
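A variation on the loop above, offered as a sketch: instead of stopping at three, it uses the built-in NF (the number of fields in the current record, an addition not on the slide) as the upper bound, so records of any length are handled. The input is made up.

```shell
while_out=$(printf 'a b c d\nx y\n' | awk '{
    i = 1
    while (i <= NF) {       # NF is the number of fields in this record
        print i ": " $i     # field number, then the field itself
        i++
    }
}')

printf '%s\n' "$while_out"
```

Because i is reset to 1 at the start of the action, the numbering starts over for every record.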


For Statement :

The for statement makes it more convenient to count iterations of a loop. The general form of the for statement looks like this:

for (initialization; condition; increment) body

This statement starts by executing initialization. Then, as long as condition is true, it repeatedly executes body and then increment.

Here is an example of a for statement:

awk '{ for (i = 1; i <= 3; i++) print $i }'

This prints the first three fields of each input record, one field at a time.
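A for loop is also a natural fit for adding up the fields of a record. In this sketch the variable name sum and the input numbers are made up, and NF (the built-in field count) bounds the loop.

```shell
# Sum all fields of each record and print one total per record.
for_out=$(printf '1 2 3\n10 20\n' |
    awk '{ sum = 0; for (i = 1; i <= NF; i++) sum += $i; print sum }')

printf '%s\n' "$for_out"
```

Resetting sum at the start of the action keeps the totals of different records separate.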


Thanks for listening

A Mikati

M Shaito


For more information about the Awk utility, visit:

http://mshaito.tripod.com/awk/awk.html

http://