Commit e26586c

nettrace: add man page that generated from markdown
Signed-off-by: Menglong Dong <imagedong@tencent.com>
1 parent 0d580bc commit e26586c

2 files changed

Lines changed: 234 additions & 1 deletion

File tree

script/.gitignore

Lines changed: 0 additions & 1 deletion
@@ -1 +0,0 @@
1-
zh_CN/nettrace.8

script/zh_CN/nettrace.8

Lines changed: 234 additions & 0 deletions
@@ -0,0 +1,234 @@
1+
.TH NETTRACE 8 "20 JULY 2022" Linux "User Manuals"
2+
.SH NAME
3+
.PP
4+
nettrace \- a network packet tracing and network problem diagnosis tool for Linux
5+
.SH SYNOPSIS
6+
.PP
7+
\fB\fCnettrace\fR [options]
8+
.SH DESCRIPTION
9+
.PP
10+
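\fB\fCnettrace\fR is an eBPF-based network toolset that combines packet tracing (fault localization), network fault diagnosis, and network anomaly monitoring.
11+
It aims to provide a more efficient and easier-to-use way to solve network problems in complex scenarios.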
12+
.SH OPTIONS
13+
.TP
14+
\fB\fC\-s,\-\-saddr\fR \fIsource_address\fP
15+
Filter packets by source IP address.
16+
.TP
17+
\fB\fC\-d,\-\-daddr\fR \fIdest_address\fP
18+
Filter packets by destination IP address.
19+
.TP
20+
\fB\fC\-\-addr\fR \fIaddress\fP
21+
Filter packets by source or destination IP address.
22+
.TP
23+
\fB\fC\-S,\-\-sport\fR \fIsource_port\fP
24+
Filter packets by UDP/TCP source port.
25+
.TP
26+
\fB\fC\-D,\-\-dport\fR \fIdest_port\fP
27+
Filter packets by UDP/TCP destination port.
28+
.TP
29+
\fB\fC\-\-port\fR \fIport\fP
30+
Filter packets by UDP/TCP source or destination port.
31+
.TP
32+
\fB\fC\-p,\-\-proto\fR \fIprotocol\fP
33+
Filter packets by protocol (layer 3 or layer 4), for example \fI\-p udp\fP.
34+
.TP
35+
\fB\fC\-t,\-\-trace\fR \fItraces\fP
36+
The kernel functions and tracepoints to enable (trace).
37+
.IP
38+
The traced objects (kernel functions, tracepoints, and so on) are referred to here as tracers.
39+
All tracers are organized as a tree. The command
40+
\fInettrace \-t ?\fP
41+
lists all available tracers.
42+
.IP
43+
By default, most tracers are enabled, but some device-specific tracers (such as ipvlan and bridge) are not.
44+
Use \fI\-t all\fP to enable all tracers.
45+
.IP
46+
Multiple tracers can be specified at once, separated by \fI,\fP, for example \fInettrace \-t ip,link,kfree_skb\fP.
47+
Both tracer directories and individual tracers can be specified.
48+
.TP
49+
\fB\fC\-\-ret\fR
50+
Show the return values of the traced kernel functions.
51+
.TP
52+
\fB\fC\-\-detail\fR
53+
Show detailed trace information, including the current process, network interface, and CPU.
54+
.TP
55+
\fB\fC\-\-basic\fR
56+
Enable \fB\fCbasic\fR trace mode. By default, lifecycle trace mode is used. In basic mode,
57+
the kernel functions/tracepoints that a packet passes through are printed directly.
58+
.TP
59+
\fB\fC\-\-intel\fR
60+
Enable diagnosis mode.
61+
.TP
62+
\fB\fC\-\-intel\-quiet\fR
63+
Show only packets that have problems; normal packets are not shown.
64+
.TP
65+
\fB\fC\-\-intel\-keep\fR
66+
Keep tracing. In \fB\fCintel\fR mode, tracing stops by default once an abnormal packet is found; with this option, tracing continues.
67+
.TP
68+
\fB\fC\-\-hooks\fR
69+
Print the hook functions registered on netfilter.
70+
.TP
71+
\fB\fC\-v\fR
72+
Show the program's startup log messages.
73+
.TP
74+
\fB\fC\-\-debug\fR
75+
Show debug information.
76+
.SH EXAMPLES
77+
.SS Lifecycle tracing
78+
.TP
79+
Trace ping packets whose source address is \fB\fC192.168.1.8\fR:
80+
\fInettrace \-p icmp \-s 192.168.1.8\fP
81+
.TP
82+
Trace the path of ping packets whose source address is \fB\fC192.168.1.8\fR through the IP and ICMP protocol layers:
83+
\fInettrace \-p icmp \-s 192.168.1.8 \-t ip,icmp\fP
84+
.TP
85+
Show detailed information:
86+
\fInettrace \-p icmp \-s 192.168.1.8 \-\-detail\fP
87+
.SS Diagnosis mode
88+
.PP
89+
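Usage is the same as above; adding the \fB\fC\-\-intel\fR option enables diagnosis mode. Lifecycle mode places high
90+
demands on the user: it requires knowing how the functions in the kernel network stack work and what their return values mean, so it is hard to use. Diagnosis mode
91+
builds on lifecycle mode and provides richer information, so that people without network development experience can also locate
92+
and analyze complex network problems.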
93+
.PP
94+
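Compared with the normal (lifecycle) mode, diagnosis mode provides more reference information, such as which iptables tables and
95+
chains the packet passed through, whether it was NATed, and whether it was cloned. Diagnosis mode defines three message levels: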
96+
.RS
97+
.IP \(bu 2
98+
\fB\fCINFO\fR: normal informational message
99+
.IP \(bu 2
100+
\fB\fCWARN\fR: warning; the packet may have a problem and deserves attention
101+
.IP \(bu 2
102+
\fB\fCERROR\fR: error; something went wrong with the packet (for example, it was dropped).
103+
.RE
104+
.PP
105+
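If the current packet has an \fB\fCERROR\fR, the tool gives diagnosis and fix advice and stops the current diagnosis. Adding
106+
\fB\fC\-\-intel\-keep\fR keeps the tool running when an \fB\fCERROR\fR event occurs, so tracing and analysis continue. Below is the log produced when an error occurs: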
107+
.PP
108+
.RS
109+
.nf
110+
\&./nettrace \-p icmp \-\-intel \-\-saddr 192.168.122.8
111+
begin trace...
112+
***************** ffff889fb3c64f00 ***************
113+
[4049.295546] [__netif_receive_skb_core] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
114+
[4049.295566] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0 *ipv4 in chain: PRE_ROUTING*
115+
[4049.295578] [nft_do_chain ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0 *iptables table:nat, chain:PREROUT* *packet is accepted*
116+
[4049.295594] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0 *bridge in chain: PRE_ROUTING*
117+
[4049.295612] [__netif_receive_skb_core] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
118+
[4049.295624] [ip_rcv ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
119+
[4049.295629] [ip_rcv_core ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
120+
[4049.295640] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0 *ipv4 in chain: PRE_ROUTING*
121+
[4049.295644] [ip_rcv_finish ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
122+
[4049.295655] [ip_route_input_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
123+
[4049.295664] [fib_validate_source ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
124+
[4049.295683] [ip_forward ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0
125+
[4049.295687] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0 *ipv4 in chain: FORWARD* *packet is dropped by netfilter (NF_DROP)*
126+
[4049.295695] [nft_do_chain ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0 *iptables table:filter, chain:FORWARD* *packet is dropped by iptables/iptables\-nft*
127+
[4049.295711] [kfree_skb ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 0 *packet is dropped by kernel*
128+
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- ANALYSIS RESULT \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
129+
[1] ERROR happens in nf_hook_slow(netfilter):
130+
packet is dropped by netfilter (NF_DROP)
131+
fix advice:
132+
check your netfilter rule
133+
134+
[2] ERROR happens in nft_do_chain(netfilter):
135+
packet is dropped by iptables/iptables\-nft
136+
fix advice:
137+
check your iptables rule
138+
139+
[3] ERROR happens in kfree_skb(life):
140+
packet is dropped by kernel
141+
location:
142+
nf_hook_slow+0x96
143+
drop reason:
144+
NETFILTER_DROP
145+
146+
analysis finished!
147+
148+
end trace...
149+
.fi
150+
.RE
151+
.PP
152+
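The log shows that the packet was dropped while passing through the FORWARD chain of the iptables filter table. The
153+
analysis result lists every abnormal event, and a single packet trace may match several diagnosis results. The advice here is for
154+
the user to check whether the iptables rules are at fault.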
155+
.PP
156+
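The \fB\fCkfree_skb\fR tracepoint has been adapted to the kernel's \fB\fCdrop reason\fR feature (see droptrace for details),
157+
so the functionality of droptrace is effectively integrated into the diagnosis result here. In this example the reported drop
158+
reason is \fB\fCNETFILTER_DROP\fR. The following command can therefore be used to monitor all packet drop events in the kernel and their reasons: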
159+
.PP
160+
\fInettrace \-t kfree_skb \-\-intel \-\-intel\-keep\fP
161+
.SS netfilter support
162+
.PP
163+
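Firewalls are a major source of network faults and connectivity failures, so \fB\fCnettrace\fR provides
164+
comprehensive support for \fB\fCnetfilter\fR, including the older \fB\fCiptables\-legacy\fR and the newer \fB\fCiptables\-nft\fR. In diagnosis mode,
165+
\fB\fCnettrace\fR can trace the \fB\fCiptables\fR tables and chains that a packet passes through, and reports when
166+
iptables causes a packet drop, as the example above shows. Besides iptables,
167+
\fB\fCnettrace\fR also supports netfilter as a whole and can show the protocol family
168+
and chain name at each HOOK point. In addition, to help with drops caused by third-party kernel modules registered with netfilter,
169+
\fB\fCnettrace\fR can print all hook functions on the current \fB\fCHOOK\fR when the \fB\fC\-\-hooks\fR option is added, allowing deeper
170+
analysis: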
171+
.PP
172+
.RS
173+
.nf
174+
\&./nettrace \-p icmp \-\-intel \-\-saddr 192.168.122.8 \-\-hooks
175+
begin trace...
176+
***************** ffff889faa054500 ***************
177+
[5810.702473] [__netif_receive_skb_core] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
178+
[5810.702491] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943 *ipv4 in chain: PRE_ROUTING*
179+
[5810.702504] [nft_do_chain ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943 *iptables table:nat, chain:PREROUT* *packet is accepted*
180+
[5810.702519] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943 *bridge in chain: PRE_ROUTING*
181+
[5810.702527] [__netif_receive_skb_core] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
182+
[5810.702535] [ip_rcv ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
183+
[5810.702540] [ip_rcv_core ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
184+
[5810.702546] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943 *ipv4 in chain: PRE_ROUTING*
185+
[5810.702551] [ip_rcv_finish ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
186+
[5810.702556] [ip_route_input_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
187+
[5810.702565] [fib_validate_source ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
188+
[5810.702579] [ip_forward ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943
189+
[5810.702583] [nf_hook_slow ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943 *ipv4 in chain: FORWARD* *packet is dropped by netfilter (NF_DROP)*
190+
[5810.702586] [nft_do_chain ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943 *iptables table:filter, chain:FORWARD* *packet is dropped by iptables/iptables\-nft*
191+
[5810.702599] [kfree_skb ] ICMP: 192.168.122.8 \-> 10.123.119.98 ping request, seq: 943 *packet is dropped by kernel*
192+
\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- ANALYSIS RESULT \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-
193+
[1] ERROR happens in nf_hook_slow(netfilter):
194+
packet is dropped by netfilter (NF_DROP)
195+
196+
following hook functions are blamed:
197+
nft_do_chain_ipv4
198+
199+
fix advice:
200+
check your netfilter rule
201+
202+
[2] ERROR happens in nft_do_chain(netfilter):
203+
packet is dropped by iptables/iptables\-nft
204+
fix advice:
205+
check your iptables rule
206+
207+
[3] ERROR happens in kfree_skb(life):
208+
packet is dropped by kernel
209+
location:
210+
nf_hook_slow+0x96
211+
drop reason:
212+
NETFILTER_DROP
213+
214+
analysis finished!
215+
216+
end trace...
217+
.fi
218+
.RE
219+
.PP
220+
As shown above, \fB\fCfollowing hook functions are blamed\fR lists all hook functions responsible for the current \fB\fCnetfilter\fR
221+
packet drop; here there is only one, the \fB\fCiptables\fR hook.
222+
.SH REQUIREMENTS
223+
.PP
224+
The kernel must support CONFIG_BPF and CONFIG_KPROBE.
225+
.SH OS
226+
.PP
227+
Linux
228+
.SH AUTHOR
229+
.PP
230+
Menglong Dong
231+
.SH SEE ALSO
232+
.PP
233+
.BR nettrace-legacy (8),
234+
.BR droptrace (8)
