AWK

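| Variable | Full name | Description | Example |
| --- | --- | --- | --- |
| NR | Number of Records | Current record number, starting at 1; it keeps counting across all input files. | `awk '{ print NR, $0 }' file.txt` prints each line prefixed with its line number. |
| FNR | File Number of Records | Record number within the current file; it restarts at 1 for each file. | `awk '{ print FNR, NR, $0 }' file1.txt file2.txt` |
| NF | Number of Fields | Number of fields in the current record. | `awk '{ print NF, $NF }' file.txt`: NF is the field count and $NF is the last field (very commonly used). |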

$1, $2, ... $NF access the individual fields of a record directly.
$0 is the entire record (the whole line by default).
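The difference between NR and FNR only shows up with more than one input file. A minimal sketch, assuming two made-up files created in a temporary directory (the file names and the variable `nr_demo` are purely illustrative):

```shell
# Two hypothetical sample files in a throwaway directory.
tmpdir=$(mktemp -d)
printf 'a b\nc d e\n' > "$tmpdir/one.txt"
printf 'f\n' > "$tmpdir/two.txt"

# NR keeps counting across files, FNR restarts at 1, NF is per record.
nr_demo=$(awk '{ print NR, FNR, NF, $NF }' "$tmpdir/one.txt" "$tmpdir/two.txt")
printf '%s\n' "$nr_demo"

rm -r "$tmpdir"
```

The third record comes from the second file, so its NR is 3 while its FNR is back to 1.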

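| Variable | Full name | Description | Example |
| --- | --- | --- | --- |
| OFS | Output Field Separator | Separator placed between output fields; the default is a single space. | `awk 'BEGIN { OFS=","; print "a", "b", "c" }'` prints `a,b,c`. |
| ORS | Output Record Separator | Separator written after each output record; the default is a newline. | `awk 'BEGIN { ORS="---\n"; print "a"; print "b"; print "c" }'` prints `a---\nb---\nc---\n`. |
| FS | Field Separator | Input field separator; the default is a single space, which awk treats as "split on runs of spaces and tabs". | `awk -F, '{ print $1, $2 }'` uses a comma as the field separator. |
| RS | Record Separator | Input record separator; the default is a newline. | `awk 'BEGIN { RS="---" } { print $0 }'` uses `---` as the record separator. A multi-character RS is an extension in implementations such as gawk; POSIX leaves it unspecified. |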
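One detail that is easy to miss: OFS is only used when awk builds output itself, either by joining the comma-separated arguments of print or by rebuilding $0 after a field assignment. A small sketch on a made-up colon-separated line (the variable names are illustrative):

```shell
# print with commas joins its arguments with OFS.
with_commas=$(printf 'a:b:c\n' | awk -F: 'BEGIN { OFS = "," } { print $1, $2, $3 }')

# print $0 keeps the original text as long as the record has not been rebuilt.
untouched=$(printf 'a:b:c\n' | awk -F: 'BEGIN { OFS = "," } { print $0 }')

# Assigning to any field (here $1 = $1) forces awk to rebuild $0 using OFS.
rebuilt=$(printf 'a:b:c\n' | awk -F: 'BEGIN { OFS = "," } { $1 = $1; print $0 }')

printf '%s\n%s\n%s\n' "$with_commas" "$untouched" "$rebuilt"
```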

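| Variable | Full name | Description | Example |
| --- | --- | --- | --- |
| FILENAME | - | Name of the file currently being processed. | `awk '{ print FILENAME, $0 }' file.txt` prints each line prefixed with the file name. |
| ARGC | - | Number of command-line arguments, counting the program name in ARGV[0]. | `awk 'BEGIN { print ARGC }' file.txt arg1 arg2` |
| ARGV | - | Array of command-line arguments, ARGV[0] through ARGV[ARGC-1]. | `awk 'BEGIN { for (i=0; i<ARGC; i++) print ARGV[i] }' file.txt arg1 arg2` |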
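A short sketch of what ARGC and ARGV contain. Because the program has only a BEGIN rule, awk never opens the operands, so `file.txt`, `arg1`, and `arg2` are placeholder names that do not need to exist. ARGV[0] is skipped here since its exact value depends on how awk was invoked:

```shell
# ARGC counts ARGV[0] (the program name) plus the three operands.
argv_demo=$(awk 'BEGIN { print ARGC; for (i = 1; i < ARGC; i++) print i, ARGV[i] }' file.txt arg1 arg2)
printf '%s\n' "$argv_demo"
```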

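The design philosophy of awk (this is the core)

1. Worldview: text = a stream of records

awk does not see itself as "processing a file"; it processes a sequence of structured records.
By default:
- a record == one line
- a field == one column, as delimited by FS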
```
record 1 -> $1 $2 $3
record 2 -> $1 $2 $3
```
So the first principle of awk is:
- the same logic runs against every record
- which is why you never write the loop yourself.
```
{ sum += $3 }
```
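To see the implicit loop produce a result, the rule above can be paired with an END rule that prints the accumulated value. A minimal sketch on made-up three-column input (the variable `total` is illustrative):

```shell
# The body runs once per record; END runs after the last record.
total=$(printf 'x 1 10\ny 2 20\nz 3 30\n' | awk '{ sum += $3 } END { print sum }')
printf '%s\n' "$total"
```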

2. Program structure: an implicit for-each

An awk program is equivalent to:

while (getline) {
    if (pattern) action;
}

Design philosophy: the programmer describes only "condition + action"; awk supplies the loop.

3. Pattern → Action, not control flow

awk is not organized around if/for, but around:
```
pattern { action }
```
The pattern can even be omitted:
```
{ print }
```
In essence this is a rule system (rule engine).
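Either half of a rule can be left out. A small sketch on made-up two-column input, with illustrative variable names: a rule with only a pattern falls back to the default action of printing the record, and a rule with only an action runs for every record.

```shell
input='apple 5
banana 12
cherry 7'

# Pattern only: the default action is { print }.
pattern_only=$(printf '%s\n' "$input" | awk '$2 > 6')

# Action only: the action runs for every record.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

printf '%s\n--\n%s\n' "$pattern_only" "$action_only"
```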

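4. Built-in state, not an object model

State in awk comes from:
- associative arrays
- NR / NF / FNR
- user variables
Philosophy: state should be explicit, flat, and observable
- which is also why awk is so well suited to counting and aggregation.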

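5. Streaming first

awk's core constraints:
- it does not buffer the whole file
- it makes a single pass
- it works record by record
So:
- if an array can solve it, use an array
- if it can be computed in one pass, never go back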
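As an illustration of the single-pass idea, a running maximum needs only one variable and never revisits earlier records. A minimal sketch on made-up numeric input (the variable `max_and_count` is illustrative):

```shell
# Keep the largest value seen so far; NR in END is the total number of records.
max_and_count=$(printf '3\n9\n4\n' | awk 'NR == 1 || $1 > max { max = $1 } END { print max, NR }')
printf '%s\n' "$max_and_count"
```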

Going further

Stage 2: master 4 core concepts (80% of the capability)

FS / RS / OFS / ORS
```
BEGIN { FS=","; OFS="\t" }
```
NR / FNR / NF
```
NR==1 { print "header" }
NF<3 { next }
```
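The two fragments above combine naturally. A sketch that converts made-up CSV input to tab-separated output, replaces the header line, and drops rows with fewer than three fields (the column names and the variable `tsv` are assumptions for illustration):

```shell
tsv=$(printf 'name,dept,pay\nann,eng,100\nbad,row\nbob,ops,80\n' |
  awk 'BEGIN { FS = ","; OFS = "\t" }
       NR == 1 { print "NAME", "PAY"; next }
       NF < 3  { next }
       { print $1, $3 }')
printf '%s\n' "$tsv"
```

The `next` statement stops processing the current record, so the later rules never see the header or the short row.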

Associative arrays (the soul of awk)

{ count[$1]++ }
{ sum[$1] += $3 }
END { for (key in count) print key, count[key] }
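The lines above can be combined into one runnable command. A minimal sketch on made-up input, piped through sort because the iteration order of `for (key in count)` is unspecified (the variable `by_key` is illustrative):

```shell
# Column 1 is the grouping key, column 3 the value to add up.
by_key=$(printf 'eng x 100\nops y 80\neng z 50\n' |
  awk '{ count[$1]++; sum[$1] += $3 }
       END { for (key in count) print key, count[key], sum[key] }' |
  sort)
printf '%s\n' "$by_key"
```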

END / BEGIN
```
BEGIN { init }
{ process }
END { report }
```
- This is a Map → Reduce model

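Stage 3: treat awk as a "database engine"

Write awk with a SQL mindset

| SQL | awk |
| --- | --- |
| SELECT | `{print}` |
| WHERE | `pattern` |
| GROUP BY | `array[key]` |
| HAVING | a condition inside `END` |
| ORDER BY | `sort` |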

Example: count records per value of column 2 and keep only the groups with more than 10 records

awk '{cnt[$2]++} END{for(k in cnt) if(cnt[k]>10) print k,cnt[k]}'
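The same query shape can be tried on a small made-up data set. This sketch lowers the threshold to 1 so the sample stays short, and uses the external sort command for the ORDER BY step (the data and the variable `report` are assumptions):

```shell
# GROUP BY column 2, HAVING count > 1, ORDER BY count descending.
report=$(printf 'u1 ok\nu2 err\nu3 ok\nu4 ok\nu5 err\nu6 warn\n' |
  awk '{ cnt[$2]++ } END { for (k in cnt) if (cnt[k] > 1) print k, cnt[k] }' |
  sort -k2,2nr)
printf '%s\n' "$report"
```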

The "inner discipline" of awk (what experienced users agree on)

1. awk is declarative, not imperative

$3 > 100 { print }

Read it not as "if ... then ...", but as "for every record that satisfies the condition, do this".

2. awk variables are "column-driven"

Don't write:

for(i=1;i<=NF;i++) ...
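One possible reading of this advice, offered as an assumption since the note stops at the loop: when the column you need is known, address it directly with $N or $NF instead of iterating over the fields. A small sketch on a made-up line:

```shell
line='10 20 30'

# Loop version: walks every field just to reach the last one.
looped=$(printf '%s\n' "$line" | awk '{ for (i = 1; i <= NF; i++) last = $i; print last }')

# Column version: address the field directly.
direct=$(printf '%s\n' "$line" | awk '{ print $NF }')

printf '%s %s\n' "$looped" "$direct"
```

Loops over fields remain the right tool when the operation really applies to every column, for example summing all fields of a record.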

3. Good awk programs look like "formulas"

This is a data formula, not a script:

{ sum[$1]+=$3; cnt[$1]++ }
END { for(k in sum) print k, sum[k]/cnt[k] }

Syntax

Think of awk as a simple program that iterates over its input one line at a time.

For every line, awk checks whether the line matches a pattern; if it does, the associated action is performed.

NOTE: awk does not stop at the first pattern that matches; it goes on to test the line against the remaining patterns.

# PSEUDO
awk 'if(PATTERN1){ACTION1} if(PATTERN2){ACTION2} ...'

Because this way of doing things is so common, the if is never written and the parentheses () are dropped. Either the PATTERN or the {ACTION} can also be omitted.

# syntax
awk 'PATTERN1{ACTION1} PATTERN2{ACTION2} ...'
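The fact that every rule is tested against every line is easiest to see when two patterns overlap. A minimal sketch on made-up numeric input (the variable `fired` is illustrative):

```shell
# 25 satisfies both patterns, so both actions run for that line.
fired=$(printf '5\n15\n25\n' |
  awk '$1 > 10 { print $1, "is over 10" }
       $1 > 20 { print $1, "is over 20" }')
printf '%s\n' "$fired"
```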

Example

Print the first 2 lines

NR stands for Number of Records (by default, the current line number)

$ ls -l | awk 'NR==1{print $0} NR==2{print}'
total 0
drwx------@  3 root  staff    96 Dec  4  2021 Applications

Print the first 3 lines in full, and only the last field of the remaining lines

NF stands for Number of Fields, so $NF is the last field

$ ls -l | awk 'NR<=3{print} NR>3{print $NF}'
total 0
drwx------@  3 root  staff    96 Dec  4  2021 Applications
drwx------@ 14 root  staff   448 Jan 31 18:43 Desktop
Documents
Downloads
Library
Movies
Music

Print the second-to-last field
```
awk '....{print $(NF-1)}'
```
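A runnable version of the idea, on made-up input. The `NF >= 2` guard is an addition for safety: on a one-field line `$(NF-1)` would be `$0`, which is rarely what is wanted.

```shell
# $(NF-1) is an expression: the field whose index is NF minus 1.
second_to_last=$(printf 'a b c d\nx y\nsolo\n' | awk 'NF >= 2 { print $(NF-1) }')
printf '%s\n' "$second_to_last"
```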

Specify field separator

By default awk splits fields on runs of whitespace (spaces and tabs).
Use -F to specify a different delimiter.
```
$ awk -F ':' '{...}'
```
The delimiter can be a regular expression.
```
$ awk -F '[.:]' '{...}'
```
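A concrete case for the bracket expression above: splitting a made-up `host:port` string on both dots and colons (the address and the variable `parts` are illustrative).

```shell
# Fields are 10, 0, 0, 1, 8080, so NF is 5 and $NF is the port.
parts=$(printf '10.0.0.1:8080\n' | awk -F '[.:]' '{ print $1, $NF, NF }')
printf '%s\n' "$parts"
```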

Use Regular Expression

Print the names of entries that end with s

ls -l | awk '/s$/{print $NF}'
Applications
Documents
Downloads
Movies
Pictures

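You can state explicitly what the regular expression is matched against with the ~ operator. A bare /s$/ is shorthand for $0 ~ /s$/, and any field can be used in place of $0.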
```
ls -l | awk '$0 ~ /s$/{print $NF}'
```
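Matching against a specific field matters when the interesting column is not at the end of the line. A sketch on made-up two-column input, where the whole-line match finds nothing because every line ends in a digit (variable names are illustrative):

```shell
data='logs 3
docs 12
data 4'

# Whole-line match: no line ends in s, so nothing is printed.
whole_line=$(printf '%s\n' "$data" | awk '/s$/ { print $1 }')

# Field match: test only the first column.
first_field=$(printf '%s\n' "$data" | awk '$1 ~ /s$/ { print $1 }')

# !~ negates the match.
negated=$(printf '%s\n' "$data" | awk '$1 !~ /s$/ { print $1 }')

printf '%s\n%s\n' "$first_field" "$negated"
```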

BEGIN / END

BEGIN and END are special patterns in awk. They do not match any input line; instead, the BEGIN action runs before the first record is read and the END action runs after the last record has been processed.

ls -l | awk '
  BEGIN{temp_sum=0; total_records=0; print "Begin calculating average file size"}
  NR>1{temp_sum += $5; total_records +=1;}
  END{print "Average file size:" temp_sum/total_records }
'

Begin calculating average file size
Average file size:506.273
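The example above divides by total_records, which is zero when the listing contains only the "total" line. A hedged variant that guards against that case, wrapped in a hypothetical shell function `average` and fed made-up `ls -l` style lines:

```shell
average() {
  awk '
    NR > 1 { sum += $5; n++ }   # skip the first line, add up the size column
    END {
      if (n > 0) {
        print "Average file size: " sum / n
      } else {
        print "No files to average"
      }
    }
  '
}

with_files=$(printf 'total 8\n-rw-r--r-- 1 u g 100 Jan 1 00:00 a\n-rw-r--r-- 1 u g 300 Jan 1 00:00 b\n' | average)
empty_dir=$(printf 'total 0\n' | average)
printf '%s\n%s\n' "$with_files" "$empty_dir"
```

Uninitialized awk variables start out as zero or the empty string, so the BEGIN initialization from the original example is optional.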
