A simple workflow for profiling multiple threads with perf (BanFS's blog)

Profiling multiple threads with perf

Preparing a multithreaded program

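This program creates two threads. Thread 1 calls func1() directly in an endless loop, while thread 2 reaches func1() through func2(), so the two threads spend their time in func1() along different call paths. Compile it without optimization, so that the empty loops are not removed: gcc main.c -lpthread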
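#include <pthread.h>
#include <stdio.h>
#include <string.h>

pthread_t thread[2];

/* Busy loop: the leaf hot spot that both threads end up in. */
void func1() {
    int i = 0;
    while (i < 10000)
        ++i;
}

/* A little work of its own, then a call to func1(). */
void func2() {
    int i = 1;  /* start at 1: with i = 0, i * 2 stays 0 and the loop never ends */
    while (i < 10000)
        i = i * 2;
    func1();
}

/* Thread 1: calls func1() forever. */
void *thread1(void *arg)
{
        for (;;)
            func1();
        pthread_exit(NULL);
}

/* Thread 2: calls func2() forever, which in turn calls func1(). */
void *thread2(void *arg)
{
        for (;;)
            func2();
        pthread_exit(NULL);
}

void thread_create(void)
{
        int temp;
        memset(&thread, 0, sizeof(thread));
        if ((temp = pthread_create(&thread[0], NULL, thread1, NULL)) != 0)
                printf("Failed to create thread 1!\n");
        else
                printf("Thread 1 created\n");
        if ((temp = pthread_create(&thread[1], NULL, thread2, NULL)) != 0)
                printf("Failed to create thread 2\n");
        else
                printf("Thread 2 created\n");
}

int main()
{
        thread_create();
        pthread_join(thread[0], NULL);
        printf("Thread 1 joined\n");
        pthread_join(thread[1], NULL);
        printf("Thread 2 joined\n");
        return 0;
}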
Sampling the program with perf

For a program that has not been started yet
 
root@ma100:/home/banfushen/perf_cpu/multi_thread# perf record -h
Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]
    -F, --freq <n>        profile at this frequency
    -g                    enables call-graph recording
    -p, --pid <pid>       record events on existing process id
    -t, --tid <tid>       record events on existing thread id
perf record -g -F 99 ./a.out samples the multithreaded program at a frequency of 99 Hz. (-F 99: sample at 99 Hertz (samples per second). I’ll sometimes sample faster than this (up to 999 Hertz), but that also costs overhead. 99 Hertz should be negligible. Also, the value ‘99’ and not ‘100’ is to avoid lockstep sampling, which can produce skewed results.) The sample program loops forever, so stop it with Ctrl-C. When the run ends you get a perf.data file; turning it into a flame graph requires an additional tool.
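For a program that is already running

For a program that is already running, look up its pid first, then run perf record -g -F 99 -p <pid>
Downloading the FlameGraph tools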
 
git clone https://github.com/brendangregg/FlameGraph.git 
banfushen@ma100:~/perf_cpu/FlameGraph$ pwd
/home/banfushen/perf_cpu/FlameGraph
Generating flame graphs

Generating a flame graph from perf.data (with the steps above, this covers the whole process)
 
perf script |/home/banfushen/perf_cpu/FlameGraph/stackcollapse-perf.pl|/home/banfushen/perf_cpu/FlameGraph/flamegraph.pl > output.svg
  
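Generating a flame graph for a single thread

You need to know the thread id (tid); for example, ps -T -p <pid> lists the threads of a process.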
root@ma100:/home/banfushen/perf_cpu/multi_thread# perf script -h
 Usage: perf script [<options>]
    or: perf script [<options>] record <script> [<record-options>] <command>
    or: perf script [<options>] report <script> [script-args]
    or: perf script [<options>] <script> [<record-options>] <command>
    or: perf script [<options>] <top-script> [script-args]
    -v, --verbose         be more verbose (show symbol address, etc)
        --pid <pid[,pid...]>
        --tid <tid[,tid...]>
                          only consider symbols in these tids
perf script -v --tid <tid> selects a single thread:
 perf script -v --tid 2283471|/home/banfushen/perf_cpu/FlameGraph/stackcollapse-perf.pl|/home/banfushen/perf_cpu/FlameGraph/flamegraph.pl > output1.svg
  
Generating a flame graph for several threads

perf script -v --tid <tid[,tid...]> selects several threads:
 perf script -v --tid 2283472,2283471|/home/banfushen/perf_cpu/FlameGraph/stackcollapse-perf.pl|/home/banfushen/perf_cpu/FlameGraph/flamegraph.pl > output3.svg
 
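 References:
 perf Examples
 perf performance analysis
 A brief analysis of perf, a powerful performance analysis tool
 Profiling Linux applications with perf
 Introduction to Perf, a Linux performance analysis tool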