
What Does awk Do? A Beginner’s Guide with Examples

by The Linux Digest Guy

If you’ve spent any time on the Linux command line, you’ve probably seen awk show up in tutorials and Stack Overflow answers. It looks a bit cryptic at first. But once you get the hang of it, it becomes one of those commands you reach for all the time.

awk is a text processing tool. It reads input line by line, splits each line into fields, and lets you do something with those fields. That’s the whole idea. The power comes from how flexible that “do something” part turns out to be.


A Simple awk Example

Let’s start with a file. Say you have a file called servers.txt that looks like this:

web01 192.168.1.10 running
web02 192.168.1.11 stopped
db01  192.168.1.20 running

To print just the first column (the server names), you’d run:

awk '{ print $1 }' servers.txt

Output:

web01
web02
db01

That $1 means “the first field”. $2 would give you the IP addresses, $3 the status. Simple.


How awk Splits Fields

By default, awk treats any run of whitespace (spaces or tabs) as a single delimiter between fields, and it ignores leading whitespace. That's why the extra space after db01 in the sample file doesn't create an empty field: awk collapses the run and numbers the pieces starting from $1.

There’s also a special variable $0 which means the entire line. So print $0 prints the whole line. Not very exciting on its own, but useful when you’re combining it with other things.
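To see the relationship between a field and $0 directly, here's a self-contained variant that pipes the sample data in with printf (so you can paste and run it without creating servers.txt first):

```shell
# Print the first field, a colon, then the whole line ($0).
printf 'web01 192.168.1.10 running\nweb02 192.168.1.11 stopped\n' |
awk '{ print $1 ": " $0 }'
# web01: web01 192.168.1.10 running
# web02: web02 192.168.1.11 stopped
```

Placing strings next to each other in awk concatenates them, which is why there's no comma between $1, the colon, and $0 here.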


Changing the Field Separator

Whitespace isn’t always your delimiter. A very common use case is parsing CSV files or colon-separated files like /etc/passwd.

You can set a custom field separator with the -F option:

awk -F: '{ print $1 }' /etc/passwd

This splits each line on : and prints the first field, which in /etc/passwd is the username. Try it on your own system. It’s a satisfying little command.
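Field 7 of /etc/passwd is the login shell, so you can pull the username and shell out together. Here's a self-contained sketch that pipes in a sample passwd line so it behaves the same on any system:

```shell
# /etc/passwd fields: name:passwd:uid:gid:gecos:home:shell
printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F: '{ print $1, $7 }'
# root /bin/bash
```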

You can use any character as the separator. For a simple CSV file:

awk -F, '{ print $2 }' data.csv

Be aware that this naive split breaks on quoted CSV fields that contain commas; for messy real-world CSV, a dedicated parser is the safer choice.

Printing Multiple Fields

You’re not limited to printing one field at a time. Just list them separated by a comma:

awk '{ print $1, $3 }' servers.txt

Output:

web01 running
web02 stopped
db01 running

The comma puts a space between the fields in the output. If you want a different separator, you can use the output field separator variable OFS:

awk 'BEGIN { OFS="," } { print $1, $3 }' servers.txt

Output:

web01,running
web02,stopped
db01,running

That BEGIN block runs once before awk processes any lines. It’s handy for setting variables upfront.
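You can also set the input field separator in a BEGIN block instead of on the command line; setting FS there is equivalent to -F. A small self-contained sketch combining a custom FS and OFS (sample data piped in with printf):

```shell
# FS in BEGIN does the same job as -F on the command line.
printf 'alice:x:1001\nbob:x:1002\n' |
awk 'BEGIN { FS=":"; OFS=" -> " } { print $1, $3 }'
# alice -> 1001
# bob -> 1002
```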


Filtering Lines with a Pattern

One of the things that makes awk really useful is that you can filter which lines get processed. You put a pattern before the action block, and awk only runs the action on lines that match.

Going back to our servers.txt, to print only the lines where the status is “running”:

awk '$3 == "running" { print $1 }' servers.txt

Output:

web01
db01

You can also use regular expressions as the pattern. To find any line containing “web”:

awk '/web/ { print $0 }' servers.txt

This works a lot like grep, but you can combine the filtering with field extraction in one go.
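You can also combine a regex pattern with a field comparison using &&. A self-contained sketch over the same sample data:

```shell
# Lines starting with "web" AND whose third field is "running".
printf 'web01 192.168.1.10 running\nweb02 192.168.1.11 stopped\ndb01  192.168.1.20 running\n' |
awk '/^web/ && $3 == "running" { print $1, $2 }'
# web01 192.168.1.10
```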


Using awk to Do Math

awk can handle arithmetic, which makes it great for quick calculations on tabular data. Say you have a file with some numbers:

item1 10 2.50
item2 5  4.00
item3 8  1.75

To calculate the total cost per item (quantity × price):

awk '{ print $1, $2 * $3 }' items.txt

Output:

item1 25
item2 20
item3 14

You can also accumulate a running total using a variable:

awk '{ total += $2 * $3 } END { print "Total:", total }' items.txt

Output:

Total: 59

Notice the END block here. It’s the opposite of BEGIN: it runs once after all lines have been processed.
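The same accumulate-then-report pattern works for averages: add up in the main block, divide in END. A self-contained sketch over the items data (the n check avoids dividing by zero on empty input):

```shell
# Average of the price column (field 3).
printf 'item1 10 2.50\nitem2 5 4.00\nitem3 8 1.75\n' |
awk '{ sum += $3; n++ } END { if (n) print "Average price:", sum / n }'
# Average price: 2.75
```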


Counting Lines That Match

Want to count how many servers are running?

awk '$3 == "running" { count++ } END { print count }' servers.txt

Output:

2

The count++ just increments a counter each time a matching line is found.
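If you want a count for every status value rather than just one, awk's associative arrays make it a one-liner. A self-contained sketch (note that the for-in loop visits keys in an unspecified order):

```shell
# Tally how many lines carry each distinct status value.
printf 'web01 192.168.1.10 running\nweb02 192.168.1.11 stopped\ndb01  192.168.1.20 running\n' |
awk '{ count[$3]++ } END { for (s in count) print s, count[s] }'
# running 2
# stopped 1
# (order may vary)
```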


Piping Output into awk

You don’t have to work with files. awk is perfectly happy reading from a pipe, which is where it really shines in day-to-day shell use.

For example, to list all running processes and show just the owning user and PID (the first two columns of ps aux output):

ps aux | awk '{ print $1, $2 }'

Or to check disk usage and flag any filesystem over 80% full. In df -h output, $5 is the Use% column and $6 is the mount point; adding 0 to $5 forces awk to compare it numerically instead of as a string (a string comparison like $5 > "80%" would sort "9%" above "100%"):

df -h | awk '$5+0 > 80 { print $6, "is getting full:", $5 }'

Combining awk with other commands like ps, df, and netstat is where it starts to feel like a proper sysadmin tool.
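One common wrinkle when piping command output into awk is the header row. The built-in NR variable holds the current line number, so NR > 1 skips it. Here's a self-contained sketch using fake ps-style data (with real ps aux you'd pipe directly; on Linux, field 4 is %MEM):

```shell
# Skip the header line (NR == 1) and sum the fourth column.
printf 'USER PID %%CPU %%MEM\nroot 1 0.0 0.1\nalice 42 1.2 2.4\n' |
awk 'NR > 1 { mem += $4 } END { print "Total %MEM:", mem }'
# Total %MEM: 2.5
```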


A Quick Reference

Here are the most useful awk building blocks in one place:

What you want              How to do it
Print field N              { print $N }
Print the whole line       { print $0 }
Custom delimiter           -F"," or -F:
Filter by field value      $N == "value" { ... }
Filter by pattern          /pattern/ { ... }
Run before processing      BEGIN { ... }
Run after processing       END { ... }
Arithmetic                 $2 * $3, total += $2

awk has a lot more depth to it. You can write multi-line programs, define functions, and handle some surprisingly complex data transformations. But the examples above cover the 90% you’ll actually use day to day. Start with those and build from there.

If you found this useful, check out our guides on grep and sed. The three of them together cover most of what you’ll ever need for text processing on the command line.
