AWK on Linux: A Text Processing Language with Examples.

09 Jun 2026, 20:08:04
Most operations on Linux and Unix systems involve managing text streams. Commands, configuration files, logs, and data input and output - all of these are text that we need to manage.
AWK, a programming language specialized for processing text, was created in 1977. Its name comes from the first letters of its authors’ last names: Alfred Aho, Peter Weinberger, and Brian Kernighan.
Decades later, AWK remains one of the core tools on Linux and Unix systems and is part of the default installation of many distributions.
In this article, we will look at the syntax, basic parameters, and specific examples of how this popular language is used.

How AWK Works

AWK handles large inputs efficiently because it reads a stream or file line by line, regardless of the total size of the data. Its processing model is built on this:
  1. Read a line;
  2. Split the line into fields;
  3. Check the pattern (condition);
  4. Perform the actions if the pattern matches;
  5. Go to the next line.
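The five steps above can be sketched on a tiny input. This is only an illustration: the three name/age lines and the variable name model_out are made up for the example.

```shell
# Hypothetical input: a name and an age on each line.
# For every line, awk splits it into fields, checks the pattern ($2 >= 18)
# and, if it holds, runs the action (print the line number and the name).
model_out=$(printf 'alice 30\nbob 17\ncarol 45\n' |
  awk '$2 >= 18 { print NR ": " $1 }')
printf '%s\n' "$model_out"
```

The second line does not satisfy the pattern, so no action is performed for it and AWK simply moves on to the next line.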

Structure of an AWK script

The complete structure of an AWK script consists of three blocks:
  • BEGIN { } - performed once, before any data is read;
  • /search_pattern/ { print $1 } - the main block: the pattern selects lines, and the instructions in braces are applied to each line that matches;
  • END { print "Done" } - performed once, after all data has been processed.
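A minimal sketch of a script that uses all three blocks might look as follows. The sample lines, the /error/ pattern and the variable names are assumptions chosen for the example, not part of any real log.

```shell
# Hypothetical three-line input; only lines containing "error" match the pattern.
structure_out=$(printf 'error disk full\ninfo started\nerror timeout\n' |
  awk '
    BEGIN   { print "Start" }                  # runs once, before any input
    /error/ { count++; print NR ": " $0 }      # runs for each matching line
    END     { print "Done, matches:", count }  # runs once, after all input
  ')
printf '%s\n' "$structure_out"
```

The counter is updated in the main block and only reported in END, which is the usual division of labor between the two.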

However, in everyday work AWK is mostly used for quick one-liners, where the BEGIN and END blocks are often omitted and a simplified form of the command is used. For example:
awk -F ':' '{ print $1 }' /etc/passwd
  • -F - sets the field separator (by default, any run of spaces and tabs).
The command prints the login name of every account listed in /etc/passwd.
In this example, we used the variable $1, into which AWK places the first field of the line. We specified the separator “:” because that is the one used in the input file.
AWK creates as many variables as there are fields found in the line. Numbering starts with one ($1). The variable $0 contains the entire line.
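To make the field variables concrete, here is a small sketch that runs AWK on a single made-up line in /etc/passwd format (the demo account is hypothetical):

```shell
# One hypothetical passwd-style line with seven colon-separated fields.
fields_out=$(printf 'demo:x:1000:1000:Demo User:/home/demo:/bin/sh\n' |
  awk -F ':' '{ print NF; print $1; print $6; print $0 }')
printf '%s\n' "$fields_out"
```

It prints the number of fields, the first field, the sixth field, and finally the whole line held in $0.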
In addition to standard variables, there are special AWK variables that store information about the input data.

AWK Special Variables

Variable   Purpose
NR         Number of the current line, counted across all input
NF         Number of fields in the current line
FNR        Line number within the current file
FS         Input field separator
RS         Input record (line) separator
OFS        Output field separator
ORS        Output record (line) separator
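The difference between NR and FNR only shows up when AWK reads more than one file. A short sketch, assuming two throwaway files created in a temporary directory (the file names are made up):

```shell
# Create two small hypothetical files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmpdir/one.txt"
printf 'f g\n' > "$tmpdir/two.txt"

# For every line print: overall line number, line number in the file, field count.
vars_out=$(awk '{ print NR, FNR, NF }' "$tmpdir/one.txt" "$tmpdir/two.txt")
printf '%s\n' "$vars_out"

rm -r "$tmpdir"
```

On the first line of the second file NR keeps counting (3) while FNR starts again from 1.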
As an example, let’s display only the usernames and their home directories with line numbers, and set the output field separator to "|":
awk -F ':' -v OFS='|' '{ print NR,$1,$6 }' /etc/passwd
  • -F - sets the separator;
  • -v - assigns a value to a variable;
  • NR - current line number;
  • $1 and $6 - the first and sixth fields.
Each output line has the form number|username|home directory, for example 1|root|/root on a typical system.
We can also construct the string ourselves:
awk -F ':' -v OFS='' '{ print NR,". ",$1,": ",$6 }' /etc/passwd
Note that we set OFS to an empty string: print then joins its arguments with nothing in between, and the separators come from the string literals ". " and ": ".
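Both formatting approaches can be tried on a small sample instead of the real /etc/passwd. The two accounts below are made up, and the second command uses AWK's printf statement as an alternative to setting OFS to an empty string; it is shown as one possible variant, not as the method used above.

```shell
# Two hypothetical passwd-style lines.
sample='root:x:0:0:root:/root:/bin/bash
demo:x:1000:1000:Demo User:/home/demo:/bin/sh'

# Fields joined by the output field separator "|".
ofs_out=$(printf '%s\n' "$sample" |
  awk -F ':' -v OFS='|' '{ print NR, $1, $6 }')

# The same data laid out with a printf format string.
fmt_out=$(printf '%s\n' "$sample" |
  awk -F ':' '{ printf "%d. %s: %s\n", NR, $1, $6 }')

printf '%s\n' "$ofs_out" "$fmt_out"
```

With printf the layout lives in a single format string, which tends to be easier to read once the line contains more than two or three literal separators.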

Examples of Using AWK

Like many other programming languages, AWK can work with arrays. In practice, this can be used, for example, to list the top IP addresses by number of requests from an Nginx log:
awk '{ ip[$1]++ } END { for(k in ip) print ip[k], k }' access.log | sort -nr | head
In this command, we create an associative array ip and add elements to it where the key is an IP address and the value is the number of lines in which that IP address appeared in the first field. Next, we use a loop to print all the array elements and their values.
AWK can count the number of unique users (unique IP addresses in the site's Nginx log):
awk '{ ip[$1]=1 } END { print length(ip) }' access.log
In this command, we used the built-in AWK function length(), which returns the number of elements in the array ip. Of course, this is a rough example that takes into account all requests to the site, including those from various bots.
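The two array commands can be tried on a tiny log. The four lines below are invented and only imitate the layout of an Nginx access log; the addresses come from documentation ranges. The unique count here uses a counter with the common !seen[key]++ idiom rather than length(), as a variant that should also work in AWK implementations that do not accept an array as the argument of length().

```shell
# Four hypothetical log lines: the client address is the first field.
log='203.0.113.5 - - [09/Jun/2026:10:00:00 +0000] "GET /api/users HTTP/1.1" 200 512
203.0.113.5 - - [09/Jun/2026:10:00:01 +0000] "GET /missing HTTP/1.1" 404 153
198.51.100.7 - - [09/Jun/2026:10:00:02 +0000] "GET /api/users HTTP/1.1" 200 512
203.0.113.5 - - [09/Jun/2026:10:00:03 +0000] "POST /api/login HTTP/1.1" 500 87'

# Requests per address, highest count first.
top_ips=$(printf '%s\n' "$log" |
  awk '{ ip[$1]++ } END { for (k in ip) print ip[k], k }' | sort -nr | head)

# Number of distinct addresses: n grows only the first time a key is seen.
unique_ips=$(printf '%s\n' "$log" |
  awk '!seen[$1]++ { n++ } END { print n + 0 }')

printf '%s\n' "$top_ips" "$unique_ips"
```

The order in which for (k in ip) visits the keys is not defined, which is why the output is always passed through sort.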
AWK can also filter lines by a condition. Let’s extract all requests that returned an error code from the Nginx log:
awk '$9 >= 400 { print $0 }' access.log
We compare the value of the ninth field ($9), the response code, with the number 400 (codes of 400 or higher indicate an error); if the value is greater than or equal to 400, we print the entire line.
Or we can find all successful requests (code 200) to the URI /api:
awk '$7 ~ /^\/api/ && $9 == 200 {print $0}' access.log
We check whether the seventh field matches a regular expression (it must start with /api) and whether the ninth field equals 200. This also shows that AWK supports logical operators: || (logical OR), && (logical AND), and ! (negation).
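Both conditions can be checked against a small invented log. The lines below are hypothetical and assume the request path is the 7th field and the response code the 9th, as in the commands above; to keep the output short, the sketch prints selected fields rather than $0.

```shell
# Four hypothetical log lines: $7 is the request path, $9 the response code.
log='203.0.113.5 - - [09/Jun/2026:10:00:00 +0000] "GET /api/users HTTP/1.1" 200 512
203.0.113.5 - - [09/Jun/2026:10:00:01 +0000] "GET /missing HTTP/1.1" 404 153
198.51.100.7 - - [09/Jun/2026:10:00:02 +0000] "GET /api/users HTTP/1.1" 200 512
203.0.113.5 - - [09/Jun/2026:10:00:03 +0000] "POST /api/login HTTP/1.1" 500 87'

# Requests that ended with an error code.
errors=$(printf '%s\n' "$log" | awk '$9 >= 400 { print $7, $9 }')

# Successful requests whose path starts with /api.
api_ok=$(printf '%s\n' "$log" | awk '$7 ~ /^\/api/ && $9 == 200 { print $1, $7 }')

printf '%s\n' "$errors" "$api_ok"
```

If a log uses a different format, the field numbers have to be adjusted accordingly.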
Here's a useful example: let's find the most visited URLs from the Nginx log:
awk '{ url[$7]++ } END { for(u in url) print url[u], u }' access.log | sort -nr | head
In the main block of the script, AWK creates and populates the url array with the value of the seventh field ($7). After the main block executes, the utility outputs the number of hits and the URL itself using a for loop. The sort -nr command sorts the input data as numbers and reverses the sort order. head prints the first 10 lines.
There are also more interesting, yet simple, examples. Let’s find the processes that are using more than 50% of the CPU:
ps aux | awk '$3 > 50'
Or we can find processes that are using more than 500 MB of RAM:
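ps aux | awk '$6 > 500000 {print $2, $11, $6}'
We build the output ourselves, printing only the 2nd (PID), 11th (command), and 6th fields, and filter on the 6th field, RSS, which ps reports in kilobytes. Depending on how the AWK implementation compares strings, the header line may be printed as well, because its non-numeric RSS label is compared as text.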
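Because real ps output changes from moment to moment, the sketch below runs the same filters on a made-up snapshot in ps aux layout. The processes and numbers are invented, and the extra NR > 1 condition is an addition that skips the header line explicitly.

```shell
# Hypothetical snapshot in "ps aux" layout: $3 is %CPU, $6 is RSS in kilobytes.
ps_sample='USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 168000 12000 ? Ss 10:00 0:01 /sbin/init
demo 4242 75.5 6.2 2400000 650000 ? Sl 10:05 3:10 /usr/bin/java'

# Processes above 50% CPU, header skipped.
cpu_out=$(printf '%s\n' "$ps_sample" | awk 'NR > 1 && $3 > 50 { print $2, $3 }')

# Processes with an RSS above 500000 KB, header skipped.
mem_out=$(printf '%s\n' "$ps_sample" | awk 'NR > 1 && $6 > 500000 { print $2, $11, $6 }')

printf '%s\n' "$cpu_out" "$mem_out"
```

Skipping the first line avoids comparing the column labels with a number at all.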
AWK can also do arithmetic, for example calculating the average value of a column. Suppose we have a file named users.txt containing user data. The command below assumes colon-separated fields, a header on the first line, and the salary in the fourth field.
We can calculate the average salary with AWK:
awk -F ':' '{ sum+=$4 } END { print "Avg. salary:", sum/(NR-1) }' users.txt
  • -F - sets the separator;
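  • sum+=$4 - adds the value of the 4th field to the variable sum;
  • print "Avg. salary:", sum/(NR-1) - calculates the average salary and displays it; NR-1 is the total number of lines minus one, which excludes a single header line from the count.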
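A slightly more defensive variant of the same calculation is sketched below. The column layout and the two users are hypothetical, since the contents of users.txt are not reproduced here; the sketch skips the header explicitly, counts the data lines itself, and avoids dividing by zero on an empty file.

```shell
# Hypothetical data: a header line followed by name:age:city:salary records.
users='name:age:city:salary
alice:30:Berlin:5000
bob:41:Paris:7000'

# Skip the header (NR > 1), add up the 4th field and count the data lines.
avg_out=$(printf '%s\n' "$users" |
  awk -F ':' 'NR > 1 { sum += $4; n++ }
              END    { if (n > 0) print "Avg. salary:", sum / n }')
printf '%s\n' "$avg_out"
```

Counting the lines in n keeps the divisor correct even if the NR > 1 condition is later replaced by a stricter filter.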
Using AWK you can replace some basic utilities, such as head:
awk 'NR <= 5' users.txt
The command displays the first 5 lines of the users.txt file.
Let's make the command a little more complex and print lines 3 through 5:
awk 'NR >= 3 && NR <= 5' users.txt
  • NR - current line number.
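Line selection by NR can be tried on a short made-up list. The first command below is a variant of the head replacement that stops reading as soon as it has printed enough lines, which may matter on very large files; the second is the same range condition as above.

```shell
# Six hypothetical lines of input.
lines=$(printf '%s\n' one two three four five six)

# First two lines: exit as soon as line 3 is reached, before printing it.
first_two=$(printf '%s\n' "$lines" | awk 'NR > 2 { exit } { print }')

# Lines 3 through 5.
middle=$(printf '%s\n' "$lines" | awk 'NR >= 3 && NR <= 5')

printf '%s\n' "$first_two" "$middle"
```

A pattern without an action, as in the second command, prints every line for which the pattern is true.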
AWK is useful for reshaping a command’s output into the format we need, either for readability or for further processing. Let’s list the addresses and ports the server is listening on:
ss -tuln | awk 'NR > 1 {print $5}'
  • NR > 1 - starts the output from the second line (the first line contains the column headers).
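If only the port numbers are needed, the fifth field can be split further. The sketch below works on an invented sample that imitates the column layout of ss -tuln; the sockets are hypothetical and the variable names are made up for the example.

```shell
# Hypothetical sample in the layout of "ss -tuln": $5 is the local address:port.
ss_sample='Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port
tcp   LISTEN 0      128    0.0.0.0:22         0.0.0.0:*
tcp   LISTEN 0      511    [::]:80            [::]:*
udp   UNCONN 0      0      127.0.0.53%lo:53   0.0.0.0:*'

# Split the 5th field on ":" and keep the last piece, which is the port.
ports=$(printf '%s\n' "$ss_sample" |
  awk 'NR > 1 { n = split($5, parts, ":"); print parts[n] }' | sort -n -u)
printf '%s\n' "$ports"
```

Taking the last piece rather than the second one keeps the command working for IPv6 addresses, which contain colons themselves.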

Conclusion

Although AWK is not a replacement for full-fledged scripting languages like Python, it is a powerful tool for quickly processing text in a pipeline. It is very useful in complex Bash scripts and for analyzing large log files in the terminal. Its strength lies not in writing extensive programs but in processing large amounts of text interactively and extracting the necessary information on the fly. This is why the tool remains a staple in the arsenal of Linux administrators decades after its creation.
