Linux performance tuning: common perf and flame graph commands

This post is copied straight from my own notes, so please bear with the rough formatting.

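perf performance analysis

Generating a flame graph (run steps 1 to 4):

1. perf record -e cpu-clock -g -p pid
   (or, to sample at 99 Hz: perf record -F 99 -g -p pid)
   -g            also record the call chain (stack trace) of each sample
   -e cpu-clock  sample on the cpu-clock software event, a timer based on the CPU clock (not the hardware cycles counter)
   -p            PID of the process to record

   To inspect the recording directly: perf report -i perf.data
   -i            the data file to read

2. perf script -i perf.data > perf.unfold
   Use perf script to dump the samples in perf.data as text.

3. ./stackcollapse-perf.pl perf.unfold > perf.folded
   Fold each stack in perf.unfold into a single line.

4. ./flamegraph.pl perf.folded > perf.svg
   Finally, render the SVG.

FlameGraph project: git clone https://github.com/brendangregg/FlameGraph.git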
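To make the folding in step 3 concrete, here is a minimal sketch of the idea in awk. It is not the real stackcollapse-perf.pl, which handles many more cases. The function name fold_stacks is made up for illustration, and the sketch assumes the default perf script text layout (one unindented header line per sample, then indented "address symbol+offset (dso)" frame lines, leaf first, ending with a blank line) and that command names and symbols contain no spaces.

```shell
# fold_stacks: read perf-script-like text on stdin and print one line per
# distinct stack, "comm;root;...;leaf count" (hypothetical helper).
fold_stacks() {
  awk '
    # Sample header: not indented; the first field is the command name.
    /^[^ \t]/ { comm = $1; depth = 0; next }
    # Frame line: indented "address symbol+offset (dso)", leaf frame first.
    /^[ \t]+[0-9a-f]+ / {
      sym = $2
      sub(/\+0x[0-9a-f]+$/, "", sym)   # drop the +0x... offset
      frames[depth++] = sym
      next
    }
    # Blank line: the current sample is complete.
    /^[ \t]*$/ { flush(); next }
    END { flush(); for (s in counts) print s, counts[s] }
    function flush(   i, stack) {
      if (comm == "") return
      stack = comm
      # Frames were read leaf first, so walk them backwards (root first).
      for (i = depth - 1; i >= 0; i--) stack = stack ";" frames[i]
      counts[stack]++
      comm = ""; depth = 0
    }
  ' | LC_ALL=C sort
}

# Usage (assuming perf.unfold was produced by step 2):
#   fold_stacks < perf.unfold > perf.folded
```

Each output line is one unique call path plus the number of samples that hit it, which is the input flamegraph.pl expects.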

1. Counting events (perf stat; "stat" is short for statistics)

# CPU counter statistics for the specified command:
perf stat command

# Detailed CPU counter statistics (includes extras) for the specified command:
perf stat -d command

# CPU counter statistics for the specified PID, until Ctrl-C:
perf stat -p PID

# CPU counter statistics for the entire system, for 5 seconds:
perf stat -a sleep 5

# Various basic CPU statistics, system wide, for 10 seconds:
perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles -a sleep 10
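The raw counters are often easier to read as a ratio such as instructions per cycle (IPC). The sketch below computes IPC from the CSV output of perf stat -x, (comma as field separator). The function name ipc_from_csv is made up for illustration, and the sketch assumes the plain CSV layout without -I or per-CPU columns, where field 1 is the counter value and field 3 is the event name.

```shell
# ipc_from_csv: read "perf stat -x," CSV lines on stdin and print
# instructions / cycles with two decimals (hypothetical helper).
ipc_from_csv() {
  awk -F, '
    $3 ~ /^cycles/       { cycles = $1 }
    $3 ~ /^instructions/ { instructions = $1 }
    END {
      # A missing or unsupported counter is not numeric and evaluates to 0.
      if (cycles + 0 > 0) printf "%.2f\n", instructions / cycles
      else print "n/a"
    }
  '
}

# Usage: perf stat writes its report to stderr, so route stderr into the pipe:
#   perf stat -x, -e cycles,instructions -a sleep 5 2>&1 >/dev/null | ipc_from_csv
```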

2. Profiling

# Sample on-CPU functions for the specified command, at 99 Hertz:
perf record -F 99 command

# Sample on-CPU functions for the specified PID, at 99 Hertz, until Ctrl-C:
perf record -F 99 -p PID

# Sample on-CPU functions for the specified PID, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID sleep 10

# Sample CPU stack traces (via frame pointers) for the specified PID, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID -g -- sleep 10

Common options:
-e: Select the PMU event.
-a: System-wide collection from all CPUs.
-p: Record events on existing process ID (comma separated list).
-A: Append to the output file to do incremental profiling (older perf releases; recent ones no longer accept it).
-f: Overwrite existing data file (older perf releases; recent ones no longer accept it).
-o: Output file name.
-g: Do call-graph (stack chain/backtrace) recording.
-C: Collect samples only on the list of CPUs provided.

3. Static Tracing

# Trace new processes, until Ctrl-C:
perf record -e sched:sched_process_exec -a

# Trace all context-switches, until Ctrl-C:
perf record -e context-switches -a

# Trace context-switches via sched tracepoint, until Ctrl-C:
perf record -e sched:sched_switch -a

# Trace all context-switches with stack traces, until Ctrl-C:
perf record -e context-switches -ag

# Trace all context-switches with stack traces, for 10 seconds:
perf record -e context-switches -ag -- sleep 10
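As a sanity check for the timed recordings above, it can help to estimate how many samples a run should yield at most. The sketch below is plain arithmetic under a stated assumption: frequency mode delivers roughly F samples per second on each CPU where the target is running, so the product is only a rough ceiling (an idle or blocked target produces far fewer). The function name max_samples is made up for illustration.

```shell
# max_samples HZ SECONDS CPUS
# Rough ceiling on the sample count: frequency x duration x busy CPUs.
max_samples() {
  echo $(( $1 * $2 * $3 ))
}

# Usage: 99 Hz for 10 seconds with the target busy on 4 CPUs:
#   max_samples 99 10 4
```

If perf report -n shows a total far below this ceiling, the process was mostly off-CPU during the recording.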

4. Dynamic Tracing

# Add a tracepoint for the kernel tcp_sendmsg() function entry ("--add" is optional):
perf probe --add tcp_sendmsg

# Remove the tcp_sendmsg() tracepoint (or use "--del"):
perf probe -d tcp_sendmsg

# Add a tracepoint for the kernel tcp_sendmsg() function return:
perf probe 'tcp_sendmsg%return'

# Show available variables for the kernel tcp_sendmsg() function (needs debuginfo):
perf probe -V tcp_sendmsg

# Show available variables for the kernel tcp_sendmsg() function, plus external vars (needs debuginfo):
perf probe -V tcp_sendmsg --externs
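A dynamic probe is usually used in three steps: add it, record against it, then remove it. The sketch below strings these together. The function names run_step and probe_session and the DRY_RUN variable are made up for illustration. It assumes the probe is created in perf's default "probe" group, so the event is named probe:FUNCTION. By default it only prints the commands; set DRY_RUN=0 to execute them, which needs root and a kernel with kprobes enabled.

```shell
# run_step: print the command in dry-run mode (the default), otherwise run it.
run_step() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# probe_session FUNCTION SECONDS
# Add a kprobe, record it system-wide with stacks, then remove it again.
probe_session() {
  func=$1
  secs=$2
  run_step perf probe --add "$func"
  run_step perf record -e "probe:$func" -a -g -- sleep "$secs"
  # Note: this sketch does not clean up the probe if the record step is interrupted.
  run_step perf probe --del "$func"
}

# Usage (dry run): probe_session tcp_sendmsg 5
```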

5. Mixed

# Sample stacks at 99 Hertz, and, context switches:
perf record -F99 -e cpu-clock -e cs -a -g

# Sample stacks to 2 levels deep, and, context switch stacks to 5 levels (needs Linux 4.8):
perf record -F99 -e cpu-clock/max-stack=2/ -e cs/max-stack=5/ -a -g

6. Reporting

# Show perf.data in an ncurses browser (TUI) if possible:
perf report

# Show perf.data with a column for sample count:
perf report -n

# Show perf.data as a text report, with data coalesced and percentages:
perf report --stdio

# Report, with stacks in folded format: one line per stack (needs Linux 4.4):
perf report --stdio -n -g folded

# List all events from perf.data:
perf script

# List all perf.data events, with data header (newer kernels; was previously default):
perf script --header
