20 Practical awk One-Liners for Developers

2026-06-25 — 9 min read

awk is one of those tools that feels cryptic until the moment it clicks. Then you reach for it constantly: extracting a column from logs, summing a CSV field, filtering rows that match a condition, counting unique values. It lives on every Unix system and needs no install.

This guide covers the patterns you'll actually use — not awk programming, just the 20 one-liners that solve 90% of real problems.

Contents

Extracting columns & fields
Filtering & pattern matching
Math & aggregation
Transforming & reformatting
FAQ

1. Extracting Columns & Fields

awk splits each line on whitespace (or your chosen delimiter) into fields: $1 is the first, $2 the second, $NF the last. $0 is the whole line.

Print a specific column

awk '{print $2}' file.txt

No flag needed — whitespace delimited by default. Use $1 for the first column, $NF for the last. Works on ls -l, log files, anything space-separated.

Print the first and last field on each line

awk '{print $1, $NF}' file.txt

NF is the count of fields on the current line, so $NF is always the last field regardless of how many columns there are.
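As a quick check of how $NF follows the last field on ragged input, here is a minimal sketch. The three sample lines are made up for illustration.

```shell
# Hypothetical input: lines with 2, 3, and 4 whitespace-separated fields.
sample='a b
c d e
f g h i'

# First and last field of every line, however many fields it has.
first_last=$(printf '%s\n' "$sample" | awk '{print $1, $NF}')
printf '%s\n' "$first_last"

# NF itself (no dollar sign) is the field count for the current line.
counts=$(printf '%s\n' "$sample" | awk '{print NF}')
printf '%s\n' "$counts"
```

Note the difference between NF (a number) and $NF (the field at that position).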

Extract columns from a CSV

awk -F',' '{print $1, $3}' data.csv

-F',' sets the field separator to comma. Use -F'\t' for TSV, -F':' for /etc/passwd-style files.

Print columns with a custom output separator

awk -F',' 'BEGIN{OFS="\t"} {print $1, $2, $3}' data.csv

OFS (output field separator) controls what awk puts between fields in a print statement. Set it in BEGIN so it applies to every row. Here: read CSV, write TSV.
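A small sketch of what OFS does and does not affect, using an invented two-row CSV. The commas inside print are what trigger OFS; fields written side by side without a comma are simply concatenated.

```shell
# Hypothetical CSV on stdin: a header row and one data row.
csv='id,name,team
1,ana,core'

# Commas between print arguments are replaced by OFS (a tab here).
tsv=$(printf '%s\n' "$csv" | awk -F',' 'BEGIN{OFS="\t"} {print $1, $2, $3}')
printf '%s\n' "$tsv"

# Without commas, awk concatenates the fields and OFS is never used.
glued=$(printf 'a,b\n' | awk -F',' 'BEGIN{OFS="\t"} {print $1 $2}')
printf '%s\n' "$glued"
```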

Print columns in a different order

awk -F',' '{print $3, $1, $2}' data.csv

Handy when you need to reorder columns for a different tool — no need to load a spreadsheet or write a Python script.

2. Filtering & Pattern Matching

Print lines matching a pattern

awk '/error/' file.log

The pattern between the slashes is a regex applied to the whole line. Like grep 'error', but you can follow the pattern with an action that works on fields, for example /error/ {print $1}.

Print lines NOT matching a pattern

awk '!/error/' file.log

Prefix with ! to negate. Equivalent to grep -v 'error', but again composable with field operations.

Filter by an exact field value

awk '$3 == "prod" {print}' servers.txt

Tests only the third field. Use ~ for a regex match: $3 ~ /prod/ matches any value containing "prod", such as "production" (and also "nonprod"; anchor with /^prod/ to require it at the start). Use !~ to negate.
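The difference between an exact comparison, an unanchored regex, and an anchored regex is easiest to see side by side. This is a sketch over an invented server list (name, region, environment).

```shell
# Hypothetical inventory; the third field is the environment.
servers='web1 us prod
web2 us production
web3 eu nonprod
web4 eu staging'

# Exact string comparison: only "prod" itself.
exact=$(printf '%s\n' "$servers" | awk '$3 == "prod" {print $1}')

# Unanchored regex: anything containing "prod", including "nonprod".
regex=$(printf '%s\n' "$servers" | awk '$3 ~ /prod/ {print $1}')

# Anchored regex: the field must start with "prod".
anchored=$(printf '%s\n' "$servers" | awk '$3 ~ /^prod/ {print $1}')

printf 'exact: %s\n' "$exact"
printf 'regex: %s\n' "$(printf '%s' "$regex" | tr '\n' ' ')"
printf 'anchored: %s\n' "$(printf '%s' "$anchored" | tr '\n' ' ')"
```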

Filter rows where a numeric field exceeds a threshold

awk '$5 > 1000 {print $1, $5}' data.txt

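awk does arithmetic natively. This prints the first and fifth fields of any row where the fifth field is over 1000. Useful for log analysis ("show me requests slower than 1000ms"). A non-numeric fifth field, such as a header label or a value with a unit suffix, is compared as a string and may slip through; $5+0 > 1000 forces a numeric test.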
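One thing to watch with numeric filters is a header row. The sketch below uses an invented two-column file and tests field 2 instead of field 5 to keep the sample short. Adding 0 to the field forces a numeric comparison, so a text label counts as 0 and is filtered out.

```shell
# Hypothetical data: a header line followed by host/latency pairs.
rows='host latency_ms
web1 1500
web2 200'

# Plain comparison: a non-numeric field is compared as a string,
# so the header line may be printed as well.
printf '%s\n' "$rows" | awk '$2 > 1000 {print $1, $2}'

# Forcing a number with +0: the header becomes 0 and is dropped.
numeric=$(printf '%s\n' "$rows" | awk '$2+0 > 1000 {print $1, $2}')
printf '%s\n' "$numeric"
```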

Print lines between two patterns

awk '/START/,/END/' file.txt

The comma creates a range: awk prints from a line matching START through the next line matching END, inclusive. The range re-opens at any later START, and if no END follows, it runs to the end of the file. Useful for extracting a section from a log or config file.
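A short sketch of range behavior on invented input, showing two details that are easy to miss: a range can open more than once, and an unclosed range runs to the end of the input. The second command is one common way to drop the marker lines themselves; the flag name f is arbitrary.

```shell
# Hypothetical input with one closed section and one unclosed section.
log='a
START
b
END
c
START
d'

# Inclusive range: marker lines are printed too.
inclusive=$(printf '%s\n' "$log" | awk '/START/,/END/')
printf '%s\n' "$inclusive"

# Flag-based variant: print only the lines between the markers.
inner=$(printf '%s\n' "$log" | awk '/START/ {f=1; next} /END/ {f=0} f')
printf '%s\n' "$inner"
```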

3. Math & Aggregation

awk's END block runs once after all lines are processed — perfect for totals, averages, and counts.

Sum a column of numbers

awk '{sum += $1} END {print sum}' numbers.txt

The most common awk one-liner. Use printf "%.2f\n", sum instead of print sum if you want two decimal places.

Average a column

awk '{sum += $1} END {print sum/NR}' numbers.txt

In the END block, NR holds the total number of records (lines) read, so sum/NR is the mean. Blank lines count as records, and an empty input makes this divide by zero.
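If the input might contain blank lines or be empty, one option is to count only the lines that have fields. This is a sketch on made-up numbers, and the counter name n is arbitrary.

```shell
# Hypothetical input: three numbers and one blank line.
nums='10
20

30'

# Dividing by NR counts the blank line as a record.
naive=$(printf '%s\n' "$nums" | awk '{sum += $1} END {print sum/NR}')

# NF is 0 on blank lines, so the rule skips them; the END guard
# avoids dividing by zero when no line qualified.
guarded=$(printf '%s\n' "$nums" | awk 'NF {sum += $1; n++} END {if (n) print sum/n}')

printf 'naive: %s\nguarded: %s\n' "$naive" "$guarded"
```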

Count lines matching a pattern

awk '/error/ {count++} END {print count+0}' app.log

Equivalent to grep -c 'error' app.log. The +0 makes awk print 0 rather than an empty line when nothing matches. Unlike grep, you can add conditions on fields: $4 == "500" && /timeout/ {count++}.
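A sketch of a combined field-and-pattern count, run on invented log lines where the fourth field is an HTTP status. It also shows what happens when nothing matches: an unset counter prints as an empty string unless it is forced to a number with +0.

```shell
# Hypothetical log lines: method, path, duration, status, message.
log='GET /a 12ms 200 ok
GET /b 950ms 500 upstream timeout
GET /c 30ms 500 bad gateway
GET /d 990ms 504 upstream timeout'

# Count lines whose status is 500 AND that mention "timeout".
hits=$(printf '%s\n' "$log" | awk '$4 == "500" && /timeout/ {count++} END {print count+0}')

# No line contains "error": with +0 the result is 0 ...
zero=$(printf '%s\n' "$log" | awk '/error/ {count++} END {print count+0}')

# ... without it, awk prints an empty line.
blank=$(printf '%s\n' "$log" | awk '/error/ {count++} END {print count}')

printf 'hits=%s zero=%s blank=[%s]\n' "$hits" "$zero" "$blank"
```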

Find the maximum value in a column

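awk 'NR==1 || $1+0 > max {max = $1+0} END {print max}' data.txt

Seeding max from the first line avoids guessing a floor value such as -999999. $1+0 forces a numeric comparison and stores a number. A non-numeric line such as a header coerces to 0, which is only harmless when the real maximum is at least 0; otherwise drop the header before this rule runs.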
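When the file has a header and the values may all be negative, one option is to skip line 1 and seed the maximum from the first data row. This is a sketch on invented data, not the only way to do it.

```shell
# Hypothetical column: a text header followed by negative numbers.
data='value
-7
-3
-12'

# Skip the header, seed max from line 2, then keep the larger value.
# $1+0 makes both the comparison and the stored value numeric.
max=$(printf '%s\n' "$data" |
  awk 'NR == 1 {next} NR == 2 || $1+0 > max {max = $1+0} END {print max}')

printf '%s\n' "$max"
```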

Count occurrences of each unique value

awk '{count[$1]++} END {for (k in count) print count[k], k}' file.txt | sort -rn

awk's associative arrays make frequency tables trivial. Pipe to sort -rn to rank by count. Replace $1 with any field — HTTP status codes, usernames, error types.
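For illustration, here is the frequency-table pattern run on a made-up list of status codes. The order of for (k in count) is unspecified, which is why the sort step matters.

```shell
# Hypothetical input: one HTTP status code per line.
codes='200
404
200
500
200
404'

# Tally each distinct value, then rank by count, highest first.
table=$(printf '%s\n' "$codes" |
  awk '{count[$1]++} END {for (k in count) print count[k], k}' |
  sort -rn)

printf '%s\n' "$table"
```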

4. Transforming & Reformatting

Skip the header row

awk 'NR>1' file.txt

NR is the current line number (record number), so NR>1 prints everything after line 1. In a longer program, put NR==1 {next} first so that no later rule sees the header.

Print line numbers alongside content

awk '{print NR, $0}' file.txt

$0 is the entire line. The output is similar to cat -n, but you can add a condition, for example NR>=10 && NR<=20 to number and print only lines 10-20.

Print every Nth line

awk 'NR%3==0' file.txt

Prints every 3rd line (lines 3, 6, 9 ...). Change the modulus to sample at different rates. Useful for downsampling high-frequency logs.

Replace a field value conditionally

awk '$2 == "staging" {$2 = "prod"} {print}' servers.txt

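Assigning to a field ($2 = "prod") changes that field, and awk then rebuilds the line with OFS between fields. With the default separators, runs of spaces or tabs on a modified line collapse to single spaces, while unmodified lines print unchanged. For CSV input, pass -F',' and set OFS="," in BEGIN so the output stays comma-separated.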
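A side effect worth seeing once: only lines whose field was assigned get rebuilt. In this sketch on invented, column-aligned input, the modified line loses its padding while the untouched line keeps it.

```shell
# Hypothetical aligned input with runs of spaces between columns.
out=$(printf 'web1   staging   us\nweb2   prod      eu\n' |
  awk '$2 == "staging" {$2 = "prod"} {print}')

# Line 1 was rebuilt with single spaces; line 2 is printed as read.
printf '%s\n' "$out"
```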

Convert CSV to TSV

awk -F',' 'BEGIN{OFS="\t"} {$1=$1; print}' data.csv

The $1=$1 trick forces awk to rebuild $0 from the current fields using OFS — without it, awk prints the original line unchanged. A no-op assignment that has a real side effect.
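The effect of $1=$1 is easy to confirm by running the same one-line input with and without it. The sample row is made up.

```shell
# Without the assignment, $0 is untouched and OFS is never applied.
without=$(printf 'a,b,c\n' | awk -F',' 'BEGIN{OFS="\t"} {print}')

# With it, awk rebuilds $0 from the fields, joined by OFS.
with=$(printf 'a,b,c\n' | awk -F',' 'BEGIN{OFS="\t"} {$1=$1; print}')

printf 'without: %s\nwith:    %s\n' "$without" "$with"
```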

FAQ

What is awk and when should I use it?

awk is a text-processing tool built into every Unix system. Use it when you need to work with structured text — extracting columns, filtering rows by field values, doing math on fields, or reformatting files. For anything involving fields and columns, awk is cleaner than sed.

What's the difference between awk and sed?

sed is best for line-level substitutions (find-and-replace across a file). awk is best for column-based work: it automatically splits each line into fields and lets you do arithmetic, conditional logic, and aggregation. Use sed to swap text; use awk to process structured data.

How do I set a custom field separator in awk?

Use the -F flag: awk -F',' for CSV, awk -F'\t' for TSV, awk -F':' for colon-delimited (like /etc/passwd). You can also set it with FS="," inside a BEGIN block.

Can awk handle CSV files with quoted fields?

Basic awk with -F',' breaks on quoted fields that contain commas (e.g. "Smith, John"). For properly quoted CSVs, reach for Python's csv module or the miller tool (mlr). For simple CSVs without embedded commas, awk -F',' works perfectly.
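To make the failure concrete, here is a sketch that feeds one invented quoted row to awk -F','. The row has two logical columns, but awk sees three fields because it splits on every comma, quoted or not.

```shell
# Hypothetical CSV row: a quoted name containing a comma, then a number.
row='"Smith, John",42'

# awk counts three fields, not two.
fields=$(printf '%s\n' "$row" | awk -F',' '{print NF}')

# The second field is the tail of the quoted name, not the number.
second=$(printf '%s\n' "$row" | awk -F',' '{print $2}')

printf 'fields=%s second=[%s]\n' "$fields" "$second"
```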

These 20 patterns cover most of what you'll reach for awk to do. If you want the broader shell context these fit into, see the 25 bash one-liners guide — awk appears there too, integrated with find, sort, and df. For shortening repetitive commands, the bash aliases guide shows how to wrap your most-used awk patterns into two-keystroke shortcuts.