awk - A Pattern Scanning and Processing Language

The name is not an abbreviation of "awkward"; it consists of the initials of its authors: Aho, Weinberger and Kernighan.

The general form of an awk command is similar to that of sed,

awk 'commands' filenames

where each command has the form

pattern { action }

As with grep and sed, awk reads the input files one line at a time, compares each line with the pattern, and performs the action on every line that matches. The input files are not changed.

Either the pattern or the action may be omitted. The default pattern is "match anything" and the default action is to print the input line.
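The two defaults can be tried out with a minimal sketch like the one below. The three lines of input are hypothetical and are supplied on standard input rather than from a file (awk reads standard input when no filenames are given).

```shell
# Pattern only: the action is omitted, so each matching line is printed.
# prints "apple pie" and "apple tart"
printf 'apple pie\nbanana split\napple tart\n' | awk '/apple/'

# Action only: the pattern is omitted, so the action runs on every line.
# prints all three lines unchanged
printf 'apple pie\nbanana split\napple tart\n' | awk '{ print }'
```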

awk considers each input line as a record, and splits it into fields, i.e. strings of characters separated by blanks or tabs.

Example
Suppose we had the following data in a file called contacts

Jane Gallagher 278819 59_George_St Oxford
Christiane Jarvis 273231 13_Banbury_Rd Oxford
Laura Green 273288 13_Banbury_Rd Oxford

awk would consider this to be a file of 3 records, each consisting of 5 fields. It calls the fields $1, $2, $3, ..., $NF, where NF is a variable holding the number of fields in the current record, in this case 5. The whole input line, i.e. one record, is called $0.
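As a quick check of these names, the sketch below recreates the contacts file in the current directory (an assumption made only so that the example is self-contained) and prints NF, $NF and $0 for each record.

```shell
# Recreate the sample file from the text.
cat > contacts <<'EOF'
Jane Gallagher 278819 59_George_St Oxford
Christiane Jarvis 273231 13_Banbury_Rd Oxford
Laura Green 273288 13_Banbury_Rd Oxford
EOF

# NF is the number of fields; $NF is therefore the last field.
# prints "5 Oxford" three times
awk '{ print NF, $NF }' contacts

# $0 is the whole record, so this reproduces the file.
awk '{ print $0 }' contacts
```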

To extract the forename, town and telephone number from each record, we could use the following command:

awk '{ print $1, $5, $3 }' contacts

Example using data files with separators other than blanks or tabs
If a different field separator has been used, awk can cope with this. Had the file contacts looked like:

Jane:Gallagher:278819:59 George St:Oxford
Christiane:Jarvis:273231:13 Banbury Rd:Oxford
Laura:Green:273288:13 Banbury Rd:Oxford

then the same command with the added -F: option,

awk -F: '{ print $1, $5, $3 }' contacts

would achieve the same result.
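To see what the -F: option changes, the following sketch writes the colon-separated version of contacts to the current directory (again only to keep the example self-contained) and compares the fields awk finds with and without the option.

```shell
# Recreate the colon-separated file from the text.
cat > contacts <<'EOF'
Jane:Gallagher:278819:59 George St:Oxford
Christiane:Jarvis:273231:13 Banbury Rd:Oxford
Laura:Green:273288:13 Banbury Rd:Oxford
EOF

# With -F: the address is a single field, spaces included.
# prints "59 George St", "13 Banbury Rd", "13 Banbury Rd"
awk -F: '{ print $4 }' contacts

# Without -F: awk still splits on blanks, so the first field
# runs up to the first space: "Jane:Gallagher:278819:59", etc.
awk '{ print $1 }' contacts
```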

Examples using awk programming structures
If the file were longer and only Christiane's number were required, then a pattern could be added to the command. This pattern tests whether field 1 of each record is equal to the string Christiane.

awk '$1 == "Christiane" { print $1, $5, $3 }' contacts

As with grep and sed, regular expressions can be used in patterns as well as literal strings. A single field is tested against a regular expression with the ~ operator.
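For instance, a regular expression can be matched against one field with the ~ operator, or against the whole line by writing it on its own. The sketch below assumes the blank-separated contacts file shown earlier and recreates it so that the commands can be run as they stand.

```shell
# Recreate the blank-separated sample file from the text.
cat > contacts <<'EOF'
Jane Gallagher 278819 59_George_St Oxford
Christiane Jarvis 273231 13_Banbury_Rd Oxford
Laura Green 273288 13_Banbury_Rd Oxford
EOF

# Field 1 must start with "Chr".
# prints "Christiane Oxford 273231"
awk '$1 ~ /^Chr/ { print $1, $5, $3 }' contacts

# A bare regular expression is matched against the whole line.
# prints "Christiane 273231" and "Laura 273288"
awk '/Banbury/ { print $1, $3 }' contacts
```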

To negate a comparison, use ! in front of the condition.

awk '!($5 == "Oxford") { print $1, $5, $3 }' contacts
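Since every record in the sample file has Oxford in field 5, the command above prints nothing for that data. A sketch that negates a test on field 1 instead shows the effect more visibly; the != operator used in the second command is an alternative spelling not mentioned in the text.

```shell
# Recreate the blank-separated sample file from the text.
cat > contacts <<'EOF'
Jane Gallagher 278819 59_George_St Oxford
Christiane Jarvis 273231 13_Banbury_Rd Oxford
Laura Green 273288 13_Banbury_Rd Oxford
EOF

# Everyone except Christiane.
# prints "Jane 278819" and "Laura 273288"
awk '!($1 == "Christiane") { print $1, $3 }' contacts

# The same selection written with the != operator.
awk '$1 != "Christiane" { print $1, $3 }' contacts

# A regular expression pattern can be negated in the same way.
# prints "Jane 278819"
awk '!/Banbury/ { print $1, $3 }' contacts
```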

awk has string functions such as length() and substr(). In fact awk is a complete programming language in its own right, able to perform arithmetic, process arrays and format data.
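A few of these features are sketched below against the blank-separated contacts file. The END pattern (an action run once after the last record), the printf statement and the pipe through sort are not covered in the text above and are introduced here only for illustration.

```shell
# Recreate the blank-separated sample file from the text.
cat > contacts <<'EOF'
Jane Gallagher 278819 59_George_St Oxford
Christiane Jarvis 273231 13_Banbury_Rd Oxford
Laura Green 273288 13_Banbury_Rd Oxford
EOF

# String functions: length of the forename, first 3 digits of the number.
# prints "Jane 4 278", "Christiane 10 273", "Laura 5 273"
awk '{ print $1, length($1), substr($3, 1, 3) }' contacts

# Arithmetic: count the records whose town is Oxford.
# prints "3"
awk '$5 == "Oxford" { n++ } END { print n }' contacts

# Arrays: count the records at each address. The order in which awk
# walks an array is unspecified, so the output is passed through sort.
awk '{ count[$4]++ } END { for (a in count) print a, count[a] }' contacts | sort

# Formatting: left-justify the forename in a 12-character column.
awk '{ printf "%-12s %s\n", $1, $3 }' contacts
```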

Help: for more information see awk(1).