1 Introduction
Improving the performance of an application is time-consuming and labor-intensive, and it is usually not obvious which functions consume most of the execution time. The GNU toolchain provides a profiling tool, the GNU profiler (gprof), which helps locate performance bottlenecks in programs on the Linux platform. gprof reports the time spent in each function and the number of times each function is called, and it shows the call relationships between functions.
gprof user manual: http://sourceware.org/binutils/docs-2.17/gprof/index.html
2 Functions
gprof is one of the GNU Binutils tools and is included in most Linux distributions by default. It can produce three kinds of output:
1. Flat profile: the number of calls to each function and the processor time consumed by each function.
2. Call graph: the call relationships between functions and how much time is spent in each function and in the functions it calls.
3. Annotated source code: a copy of the program source code in which each line is marked with the number of times it was executed.
3 Principles
When a program is compiled and linked with the -pg option, gcc inserts a call to a function named mcount (or _mcount, or __mcount, depending on the compiler and operating system) into every function of the application. Each time a function runs, mcount uses the call stack to find the address of the calling (parent) function and of the called (child) function, and records that pair in a call graph kept in memory. The call counts come from mcount itself, while the timing figures are estimated by periodically sampling the program counter, driven by the ITIMER_PROF timer mentioned under Precautions. Together, this data holds the call relationships, call counts, and timing information for each function.
4 Use process
1. Add the -pg option when compiling and linking. It is usually easiest to add it to the makefile.
2. Run the compiled binary. The arguments and the way it is started are the same as before.
3. Let the program run until it exits normally.
4. When the process ends, gmon.out is written to the directory the program was run from. If a gmon.out file already exists there, it is overwritten.
5. Use the gprof tool to analyze the gmon.out file.
5 Parameter description
• -b suppresses the detailed explanation of each field in the report.
• -p prints only the flat profile (the per-function time consumption list).
• -q prints only the call graph of the functions (the Call graph part of the report).
• -e Name suppresses the call graph of function Name and its child functions (unless they have other parent functions that are not suppressed). Multiple -e flags can be given, but each -e flag names only one function.
• -E Name suppresses the call graph of function Name and its child functions. It is similar to -e, but it also excludes the time used by Name and its child functions from the total and percentage time calculations.
• -f Name prints the call graph of function Name and its child functions. Multiple -f flags can be given, but each -f flag names only one function.
• -F Name prints the call graph of function Name and its child functions. It is similar to -f, but only the time of the printed routines is used when calculating total time and percentage time. Multiple -F flags can be given, but each -F flag names only one function. The -F flag overrides the -E flag.
• -z displays routines with zero usage (according to call count and cumulative time).
General usage: gprof -b binary_program gmon.out > report.txt
6 Report description
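Explanation of the information generated by gprof:
Flat profile field meaning:
• % time: percentage of the program's total measured run time spent in this function itself.
• cumulative seconds: running total of the self seconds of this function and all functions listed above it.
• self seconds: time spent in this function alone, excluding the functions it calls.
• calls: number of times the function was called.
• self ms/call: average time spent in the function itself per call.
• total ms/call: average time spent in the function and its descendants per call.
• name: the function name.
Call graph field meaning:
• index: a number identifying the function's entry in the call graph.
• % time: percentage of the total time spent in this function and its descendants.
• self: time spent in the function itself.
• children: time spent in the functions it calls.
• called: number of times the function was called.
• name: the function name.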
Note:
The cumulative execution time of the program includes only the functions that gprof can monitor. Code that runs in kernel mode and third-party library functions that were not compiled with -pg (such as sleep()) cannot be monitored by gprof.
The full list of gprof options is available through man gprof.
7 Support for shared libraries
Profiling support is added by the compiler, so to get profiling information from shared libraries, those libraries must also be compiled with -pg. A version of the C library that has been compiled with profiling support enabled (libc_p.a) may be provided for this purpose.
If you need to analyze system functions (such as those in libc), replace -lc with -lc_p. The program will then link against libc_p.so or libc_p.a. This matters because it is the only way to monitor the execution time of the underlying C library functions (such as memcpy(), memset(), sprintf(), etc.).
gcc example1.c -pg -lc_p -o example1
Use ldd ./example1 | grep libc to check whether the program is linked against libc.so or libc_p.so.
8 User Time and Kernel Time
The biggest flaw of gprof is that it can only analyze the user time consumed by the application while it runs; it cannot measure the time the program spends in kernel space. In general, a running application spends some of its time executing user code and some of its time in "system code", for example inside the kernel or blocked in a system call such as sleep().
One way to see how an application's running time is divided is to execute the program under the time command. It displays the application's real (wall-clock) running time, user-space running time, and kernel-space running time.
For example: time ./program
Output:
real 2m30.295s
user 0m0.000s
sys 0m0.004s
9 Precautions
1. With g++, the -pg option must be used in both the compile step and the link step.
2. Only a statically linked libc can be used; otherwise the profiling code is called before the *.so is initialized, which causes a "segmentation fault". The solution is to add -static-libgcc or -static when building.
3. If you link the program directly with ld instead of g++, add the startup file /lib/gcrt0.o, for example ld -o myprog /lib/gcrt0.o myprog.o utils.o -lc_p. On some systems the file may be gcrt1.o.
4. To monitor the execution time of third-party library functions, the third-party library must also be compiled with the -pg option.
5. gprof can only analyze the user time consumed by the application.
6. The program must not be run as a daemon; otherwise the timing data is not collected (the call counts still are).
7. Running the program under time first is a good way to judge whether gprof can produce useful information.
8. If gprof does not suit your analysis needs, other tools, including OProfile and Sysprof, overcome some of its shortcomings.
9. gprof is most useful for CPU-intensive programs whose code runs mostly in user space. It is of little help in optimizing programs that spend most of their time in kernel space or that run slowly because of external factors (such as an overloaded I/O subsystem).
10. gprof does not support multi-threaded applications. In a multi-threaded program, only the main thread's performance data is collected. The reason is that gprof relies on the ITIMER_PROF signal, and only the main thread responds to this signal. A simple workaround is described at http://sam.zoy.org/writings/programming/gprof.html
11. gprof can only generate its data file (gmon.out) after the program exits normally.
a) Reason: gprof writes its results from a function registered with atexit(). An abnormal exit does not run the functions registered with atexit(), so no gmon.out file is generated.
b) The program can exit normally by returning from main or by calling the exit() function.
10 Multi-threaded applications
gprof does not support multi-threaded applications. In a multi-threaded program, only the main thread's performance data is collected. The reason is that gprof relies on the ITIMER_PROF signal, and only the main thread responds to this signal.
How can all threads be analyzed? The key is to make every thread respond to the ITIMER_PROF signal. This can be achieved with a stub function that wraps pthread_create.
11 Data Visualization
1) gprof ./main > profile.txt writes the profile data to the profile.txt file.
2) gprof2dot.py profile.txt > profile.dot generates the dot file.
3) dot -Tsvg profile.dot -o gprof.svg generates the svg file. The svg can be opened directly in a browser to see which functions are hot spots.
The gprof2dot.py script can be obtained from GitHub, and the dot tool can be installed directly on Linux. On CentOS the command is yum install graphviz; for other distributions, just change the installation command.
Reference links:
https://blog.csdn.net/stanjiang2010/article/details/5655143
https://fooyou.github.io/document/2015/07/22/performance-tools-for-linux-cplusplus.html