Text Processing With Filters


 

Overview

When large sets of data need editing, an interactive, screen-based editor is not always the best way to make repeated changes. Many editors can search for matching strings but cannot handle more complex patterns. For example: on each line, find only the first ":", replace it with "[", and write the next 7 characters to a file. Unix has several programs that process text files and perform user-specified, repeated actions on their contents.
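As a rough sketch of the kind of task described above, the sed command below handles one possible reading of it: for each line that contains a colon, it writes "[" followed by up to 7 of the characters after the first ":" to a file. The file names input.txt and out.txt and the sample data are made up for illustration, and the files are created in the current directory, so try it in a scratch directory.

```shell
# Hypothetical sample input (not from the original text).
printf '%s\n' 'root:x:0:0:root:/root:/bin/sh' 'no colon here' 'a:b' > input.txt

# -n suppresses the default printing; the trailing p prints only the
# lines where the substitution succeeded (lines that contain a colon).
#   ^[^:]*:        everything up to and including the first ":"
#   \(.\{0,7\}\)   capture up to 7 characters that follow it
#   .*             discard the rest of the line
sed -n 's/^[^:]*:\(.\{0,7\}\).*/[\1/p' input.txt > out.txt

cat out.txt
```

With this sample input, out.txt holds two lines, "[x:0:0:r" and "[b"; the line without a colon is skipped.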

The three filters covered in this document are grep, sed and awk:

grep
Named after the ed command g/re/p (Globally search for a Regular Expression and Print). It is useful for extracting lines that contain certain patterns from a file (or from the output of a command).
sed
Stream EDitor. It allows 'search and replace' type actions on files.
awk
Can do the same things as sed and grep, but it is a programming language in its own right and therefore allows more sophisticated operations.
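To make the differences between the three filters concrete, here is a minimal sketch that runs each one over the same small file. The file name users.txt and its name:role:score layout are assumptions chosen for illustration only.

```shell
# Hypothetical sample data, one record per line: name:role:score
printf '%s\n' 'alice:admin:42' 'bob:user:17' 'carol:admin:8' > users.txt

# grep: print only the lines that match the pattern.
grep ':admin:' users.txt

# sed: print every line, replacing the first match on each line.
sed 's/:admin:/:root:/' users.txt

# awk: split each line into fields on ":" and add up the third field
# of the lines whose second field is "admin".
awk -F: '$2 == "admin" { total += $3 } END { print total }' users.txt
```

Here grep prints the two admin lines, sed prints all three lines with "admin" changed to "root", and awk prints the total 50. The awk line shows the kind of calculation that grep and sed are not designed for.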

All of these utilities act on a given search pattern. These patterns are referred to as regular expressions.

Important!

All of these filters, by default, send their output to the standard output stream. If you wish to save the output, never redirect it to the file you are filtering; redirect it to a new filename instead.
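The reason is that the shell opens and empties the target of ">" before the filter starts, so a filter that reads and writes the same file finds nothing left to read. The sketch below shows the safe pattern; the file names words.txt and words.new are hypothetical, and the final rename is an optional extra step rather than something the filters require.

```shell
# Hypothetical sample file.
printf '%s\n' 'colour' 'flavour' > words.txt

# Unsafe (left commented out): the shell truncates words.txt before sed
# gets to read it, so the file would end up empty.
# sed 's/our$/or/' words.txt > words.txt

# Safe: send the output to a new filename.
sed 's/our$/or/' words.txt > words.new

# Optional: replace the original once the new file looks right.
mv words.new words.txt
cat words.txt
```

After these commands, words.txt contains "color" and "flavor".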