{{Short description|Measuring the time or resources used by a section of a computer program}}
{{
{{Software development process|Tools}}
In [[software engineering]], '''profiling''' (
Profiling is achieved by [[Instrumentation (computer programming)|instrumenting]] either the program [[source code]] or its binary executable form using a tool called a ''profiler'' (or ''code profiler''). Profilers may use a number of different techniques, such as event-based, statistical, instrumented, and simulation methods.
== Gathering program events ==
Profilers use a wide variety of techniques to collect data, including [[hardware interrupt]]s, [[Instrumentation (computer programming)|code instrumentation]], [[instruction set simulator|instruction set simulation]], operating system [[hooking|hooks]], and [[Hardware performance counter|performance counter]]s.
== Use of profilers ==
[[File:CodeAnalyst3.png|thumb|Graphical output of the [[CodeAnalyst]] profiler]]
{{quotation|text=
Program analysis tools are extremely important for understanding program behavior. Computer architects need such tools to evaluate how well programs will perform on new [[computer architecture|architectures]]. Software writers need tools to analyze their programs and identify critical sections of code. [[Compiler]] writers often use such tools to find out how well their [[instruction scheduling]] or [[branch prediction]] algorithm is performing...|author=ATOM|source=[[Conference on Programming Language Design and Implementation|PLDI]]|'94}}
The output of a profiler may be:
/* ------------ source------------------------- count */
0001
0002 THEN DO
0003 ADD 1 to XCOUNT
0004 ELSE
0005
* A stream of recorded events (a '''trace''')
: This provides the opportunity to switch a trace on or off at any desired point during execution, in addition to viewing ongoing metrics about the (still executing) program. It also provides the opportunity to suspend asynchronous processes at critical points to examine interactions with other parallel processes in more detail.
A profiler can be applied to an individual method or at the scale of a module or program, to identify performance bottlenecks by making long-running code obvious.<ref>{{cite web| title=How to find the performance bottleneck in C# desktop application?| publisher=[[Stack Overflow]]| year=2012| url=https://stackoverflow.com/questions/13698674/how-to-find-the-performance-bottleneck-in-c-sharp-desktop-application}}</ref> A profiler can be used to understand code from a timing point of view, with the objective of optimizing it to handle various runtime conditions<ref>{{cite web| last=Krauss| first=Kirk J| title=Performance Profiling with a Focus| publisher=Develop for Performance| year=2017| url=http://www.developforperformance.com/PerformanceProfilingWithAFocus.html}}</ref> or various loads.<ref>{{cite web| work=Stackify Developer Tips, Tricks and Resources| title=What is code profiling? Learn the 3 Types of Code Profilers| publisher=Disqus| year=2016| url=https://stackify.com/what-is-code-profiling/}}</ref>
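As a minimal sketch of applying a profiler to an individual function, the following Python example uses the standard library's <code>cProfile</code> and <code>pstats</code> modules. The functions <code>slow_sum</code>, <code>fast_sum</code>, <code>workload</code> and <code>profile_call</code> are hypothetical names chosen for illustration and do not come from the cited sources.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately slow: sums squares in a Python-level loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum(n):
    """Closed form for the same sum: 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


def workload():
    """Hypothetical program fragment containing one hot spot."""
    return slow_sum(200_000), fast_sum(200_000)


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so long-running code appears first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(workload)
    print(report)
```

In the printed report, the loop-based function would be expected to account for most of the cumulative time, which is the kind of signal used to locate a bottleneck.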
==History==
Performance-analysis tools existed on [[IBM/360]] and [[IBM/370]] platforms from the early 1970s, usually based on timer interrupts which recorded the [[
Profiler-driven program analysis on Unix dates back to 1973
S.L. Graham, P.B. Kessler, and M.K. McKusick, [http://docs.freebsd.org/44doc/psd/18.gprof/paper.pdf ''gprof: a Call Graph Execution Profiler''], Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, ''[[SIGPLAN]] Notices'', Vol. 17, No 6, pp. 120-126; [[doi:10.1145/800230.806987]]</ref>
In 1994, Amitabh Srivastava and [[Alan Eustace]] of [[Digital Equipment Corporation]] published a paper describing ATOM<ref>
A. Srivastava and A. Eustace, [http://www.ece.cmu.edu/~ece548/tools/atom/man/wrl_94_2.pdf ''ATOM: A system for building customized program analysis tools''], Proceedings of the ACM SIGPLAN Conference on Programming language design and implementation (PLDI '94), pp. 196-205, 1994; ACM ''SIGPLAN Notices'' - Best of PLDI 1979-1999 Homepage archive, Vol. 39, No. 4, pp. 528-539; [[doi:10.1145/989393.989446]]
</ref> (Analysis Tools with OM). The ATOM platform converts a program into its own profiler: at [[compile time]], it inserts code into the program to be analyzed. That inserted code outputs analysis data. This technique, modifying a program to analyze itself, is known as "[[Instrumentation (computer programming)|instrumentation]]".
In 2004 both the <code>gprof</code> and ATOM papers appeared on the list of the 50 most influential [[Conference on Programming Language Design and Implementation|PLDI]] papers for the 20-year period ending in 1999.<ref>
===Input-sensitive profiler===
Input-sensitive profilers<ref name="aprof">E. Coppa, C. Demetrescu, and I. Finocchi, [
==Data granularity in profiler types==
Profilers, which are also programs themselves, analyze target programs by collecting information on
===Event-based profilers===
Event-based profilers are available for the following programming languages:
* [[Java (programming language)|Java]]: the [[Java Virtual Machine Tools Interface|JVMTI]] (JVM Tools Interface) API, formerly JVMPI (JVM Profiling Interface), provides hooks to profilers for trapping events such as calls, class load and unload, and thread enter and leave.
* [[.NET Framework|.NET]]: a profiling agent can be attached to the ''CLR'' as a ''COM'' server using the Profiling ''API''. As with Java, the runtime then provides various callbacks into the agent for trapping events such as method [[Interpreter|JIT]] / enter / leave and object creation. This is particularly powerful in that the profiling agent can rewrite the target application's bytecode in arbitrary ways.
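The callback pattern described in the entries above can be sketched in Python, whose interpreter exposes a comparable hook through <code>sys.setprofile</code>. This is only an illustration of event trapping, not the JVMTI or CLR API; <code>count_call_events</code> and <code>fib</code> are hypothetical names.

```python
import sys
from collections import Counter


def count_call_events(func, *args):
    """Run func(*args) and count 'call' events per Python function name."""
    calls = Counter()

    def on_event(frame, event, arg):
        # The interpreter invokes this hook on call and return events
        # (and on c_call / c_return for built-in functions).
        if event == "call":
            calls[frame.f_code.co_name] += 1

    previous = sys.getprofile()
    sys.setprofile(on_event)
    try:
        result = func(*args)
    finally:
        sys.setprofile(previous)  # always restore the earlier hook
    return result, calls


def fib(n):
    """Naive recursion, chosen because it generates many call events."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    value, calls = count_call_events(fib, 10)
    print(value, calls["fib"])
```

Because the hook sees every call, the counts are exact rather than estimated, at the cost of running extra code on each event.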
===Statistical profilers===
In practice, sampling profilers can often provide a more accurate picture of the target program's execution than other approaches, as they are not as intrusive to the target program and thus do not have as many side effects (such as on memory caches or instruction decoding pipelines). Also, since they do not affect the execution speed as much, they can detect issues that would otherwise be hidden. They are also relatively immune to over-evaluating the cost of small, frequently called routines or 'tight' loops. They can show the relative amount of time spent in user mode versus interruptible kernel mode such as [[system call]] processing.
Unfortunately, running kernel code to handle the interrupts costs the target program a small number of CPU cycles and diverts cache usage, and the profiler cannot distinguish the various tasks occurring in uninterruptible kernel code (microsecond-range activity) from user code. Dedicated hardware can go beyond this: the JTAG interfaces of ARM Cortex-M3 and some recent MIPS processors have a PCSAMPLE register, which samples the [[program counter]] in a truly undetectable manner, allowing non-intrusive collection of a flat profile.
Some commonly used<ref>{{cite web| title=Popular C# Profilers| publisher=Gingtage| year=2014| url=http://www.ginktage.com/2014/10/popular-c-profilers/}}</ref> statistical profilers for Java/managed code are [[SmartBear Software]]'s [[AQtime]]<ref>{{cite web| work=AQTime 8 Reference| title=Sampling Profiler - Overview| publisher=SmartBear Software| year=2018| url=https://support.smartbear.com/viewarticle/54581/}}</ref> and [[Microsoft]]'s [[CLR Profiler]].<ref>{{cite web| work=Microsoft .NET Framework Unmanaged API Reference| last=Wenzal| first=Maira, et al.| title=Profiling Overview| publisher=Microsoft| year=2017| url=https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/profiling-overview#supported-features}}</ref> Those profilers also support native code profiling, along with [[Apple Inc.]]'s [[Apple Developer Tools#Shark|Shark]] (OSX),<ref>{{cite web| work=[[Apple Developer Tools]]| title=Performance Tools| publisher=Apple, Inc.| year=2013| url=https://developer.apple.com/library/content/documentation/Performance/Conceptual/PerformanceOverview/PerformanceTools/PerformanceTools.html}}</ref> [[OProfile]] (Linux),<ref>{{cite web| work=[[IBM DeveloperWorks]]| last1=Netto| first1=Zanella| last2=Arnold| first2=Ryan S.| title=Evaluate performance for Linux on Power| year=2012| url=https://www.ibm.com/developerworks/linux/library/l-evaluatelinuxonpower/}}</ref> [[Intel]] [[VTune]] and Parallel Amplifier (part of [[Intel Parallel Studio]]), and [[Oracle Corporation|Oracle]] [[Performance Analyzer]].<ref>{{cite conference |last1=Schmidl |first1=Dirk |first2=Christian |last2=Terboven |first3=Dieter |last3=an Mey |first4=Matthias S. |last4=Müller |title=Suitability of Performance Tools for OpenMP Task-Parallel Programs |conference=Proc. 7th Int'l Workshop on Parallel Tools for High Performance Computing |year=2013 |pages=25–37 |url=https://books.google.com/books?id=-I64BAAAQBAJ&pg=PA27&lpg=PA27}}</ref>
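The sampling idea can be approximated in a few lines of Python. The sketch below is an assumption-laden illustration: instead of operating system interrupts it uses a helper thread that wakes at a fixed interval and inspects the target thread through the CPython-specific <code>sys._current_frames()</code>. The names <code>SamplingProfiler</code>, <code>hot</code> and <code>cold</code> are hypothetical.

```python
import collections
import sys
import threading
import time


class SamplingProfiler:
    """Samples the innermost Python function of one thread at intervals."""

    def __init__(self, interval=0.001):
        self.interval = interval
        self.samples = collections.Counter()
        self._target_id = None
        self._stop = threading.Event()
        self._thread = None

    def _run(self):
        while not self._stop.is_set():
            # CPython-specific: maps thread ids to their current frames.
            frame = sys._current_frames().get(self._target_id)
            if frame is not None:
                self.samples[frame.f_code.co_name] += 1
            time.sleep(self.interval)

    def __enter__(self):
        self._target_id = threading.get_ident()  # profile the caller
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()
        return False


def hot(seconds=0.3):
    """Spin for roughly the given wall-clock time."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


def cold(seconds=0.03):
    """Same loop, but about ten times shorter."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


if __name__ == "__main__":
    with SamplingProfiler() as profiler:
        hot()
        cold()
    print(profiler.samples.most_common())
```

The sample counts are a statistical estimate of where time was spent, not exact call counts, and the target code runs unmodified.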
===Instrumentation===
This technique effectively adds instructions to the target program to collect the required information. Note that [[instrumenting]] a program can cause performance changes, and may in some cases lead to inaccurate results and/or [[heisenbug]]s. The effect will depend on what information is being collected, on the level of timing details reported, and on whether basic block profiling is used in conjunction with instrumentation.<ref>{{cite
Instrumentation is key to determining the level of control and amount of time resolution available to the profilers.
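A minimal sketch of instrumentation, assuming a Python target program: a decorator adds entry and exit code around each function to record call counts and elapsed time. The names <code>instrument</code>, <code>PROFILE</code>, <code>parse</code> and <code>total</code> are hypothetical and chosen only for illustration.

```python
import functools
import time
from collections import defaultdict

# Per-function totals gathered by the inserted code: [call count, seconds].
PROFILE = defaultdict(lambda: [0, 0.0])


def instrument(func):
    """Wrap func with code that records call count and elapsed time."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            record = PROFILE[func.__name__]
            record[0] += 1
            record[1] += time.perf_counter() - start

    return wrapper


@instrument
def parse(line):
    return [int(field) for field in line.split(",")]


@instrument
def total(lines):
    return sum(sum(parse(line)) for line in lines)


if __name__ == "__main__":
    print(total(["1,2,3", "4,5,6"]))
    for name, (count, seconds) in PROFILE.items():
        print(f"{name}: {count} calls, {seconds:.6f} s")
```

The wrapper itself consumes time on every call, so the measurements for small, frequently called functions include overhead added by the instrumentation.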
* '''Interpreter debug''' options can enable the collection of performance metrics as the interpreter encounters each target statement. [[Bytecode]], [[control table]] and [[Just-in-time compilation|JIT]] interpreters are three examples that usually have complete control over execution of the target code, thus enabling extremely comprehensive data collection.
===Hypervisor/
* '''Hypervisor''': Data are collected by running the (usually) unmodified program under a [[hypervisor]]. Example: [[SIMMON]]
* '''Simulator''' and '''Hypervisor''': Data collected interactively and selectively by running the unmodified program under an [[