Profiling (computer programming)

{{Short description|Measuring the time or resources used by a section of a computer program}}
:''"Profiling" redirects here. For the science of criminal psychological analysis, see [[Offender profiling|offender profiling]].''
{{more citations needed|date=January 2009}}
{{Software development process|Tools}}
In [[software engineering]], '''profiling''' ('''program profiling''', '''software profiling''') is a form of [[dynamic program analysis]] that measures, for example, the space (memory) or time [[Computational complexity theory|complexity of a program]], the [[instruction set simulator|usage of particular instructions]], or the frequency and duration of function calls. Most commonly, profiling information serves to aid [[program optimization]], and more specifically, [[performance engineering]].
 
Profiling is achieved by [[Instrumentation (computer programming)|instrumenting]] either the program [[source code]] or its binary executable form using a tool called a ''profiler'' (or ''code profiler''). Profilers may use a number of different techniques, such as event-based, statistical, instrumented, and simulation methods.
 
== Gathering program events ==
A '''profiler''' is a performance analysis tool that measures the behavior of a program as it runs, particularly the frequency and duration of function calls. The output is a stream of recorded events (a '''trace''') or a statistical summary of the events observed (a '''profile'''). Profilers use a wide variety of techniques to collect data, including [[hardware interrupt]]s, [[Instrumentation (computer programming)|code instrumentation]], [[instruction set simulator|instruction set simulation]], operating system [[hooking|hooks]], and [[Hardware performance counter|performance counter]]s.
 
== Use of profilers ==
[[File:CodeAnalyst3.png|thumb|Graphical output of the [[CodeAnalyst]] profiler]]
{{quotation|text=
Program analysis tools are extremely important for understanding program behavior. Computer architects need such tools to evaluate how well programs will perform on new [[computer architecture|architectures]]. Software writers need tools to analyze their programs and identify critical sections of code. [[Compiler]] writers often use such tools to find out how well their [[instruction scheduling]] or [[branch prediction]] algorithm is performing...|author=ATOM|source=[[Conference on Programming Language Design and Implementation|PLDI]] '94}}
 
The output of a profiler may be:
 
* A statistical ''summary'' of the events observed (a '''profile''')
:Summary profile information is often shown annotated against the source code statements where the events occur, so the size of measurement data is linear to the code size of the program.
 
 /* ------------ source ------------------------- count */
 0001             IF X = "A"                        0055
 0002                THEN DO
 0003                  ADD 1 to XCOUNT              0032
 0004                ELSE
 0005             IF X = "B"                        0055
 
* A stream of recorded events (a '''trace''')
:For sequential programs, a summary profile is usually sufficient, but performance problems in parallel programs (waiting for messages or synchronization issues) often depend on the time relationship of events, thus requiring a full trace to get an understanding of what is happening.
: The size of a (full) trace is linear to the program's [[instruction path length]], making it somewhat impractical. A trace may therefore be initiated at one point in a program and terminated at another point to limit the output.
* An ongoing interaction with the [[hypervisor]] (continuous or periodic monitoring via on-screen display for instance)
: This provides the opportunity to switch a trace on or off at any desired point during execution in addition to viewing on-going metrics about the (still executing) program. It also provides the opportunity to suspend asynchronous processes at critical points to examine interactions with other parallel processes in more detail.
 
A profiler can be applied to an individual method or at the scale of a module or program, to identify performance bottlenecks by making long-running code obvious.<ref>{{cite web| title=How to find the performance bottleneck in C# desktop application?| publisher=[[Stack Overflow]]| year=2012| url=https://stackoverflow.com/questions/13698674/how-to-find-the-performance-bottleneck-in-c-sharp-desktop-application}}</ref> A profiler can be used to understand code from a timing point of view, with the objective of optimizing it to handle various runtime conditions<ref>{{cite web| last=Krauss| first=Kirk J| title=Performance Profiling with a Focus| publisher=Develop for Performance| year=2017| url=http://www.developforperformance.com/PerformanceProfilingWithAFocus.html}}</ref> or various loads.<ref>{{cite web| work=Stackify Developer Tips, Tricks and Resources| title=What is code profiling? Learn the 3 Types of Code Profilers| publisher=Disqus| year=2016| url=https://stackify.com/what-is-code-profiling/}}</ref> Profiling results can be ingested by a compiler that provides [[profile-guided optimization]].<ref>{{cite web| last=Lawrence| first=Eric| work=testslashplain| title=Getting Started with Profile Guided Optimization| publisher=WordPress| year=2016| url=https://textslashplain.com/2016/01/10/getting-started-with-profile-guided-optimization/}}</ref> Profiling results can be used to guide the design and optimization of an individual algorithm; the [[Krauss matching wildcards algorithm]] is an example.<ref>{{cite web| last=Krauss| first=Kirk| title=Matching Wildcards: An Improved Algorithm for Big Data| publisher=Develop for Performance| year=2018| url=http://www.developforperformance.com/MatchingWildcards_AnImprovedAlgorithmForBigData.html}}</ref> Profilers are built into some [[application performance management]] systems that aggregate profiling data to provide insight into [[transaction processing|transaction]] workloads in [[distributed computing|distributed]] applications.<ref>{{cite web| work=Stackify Developer Tips, Tricks and Resources| title=List of .Net Profilers: 3 Different Types and Why You Need All of Them| publisher=Disqus| year=2016| url=https://stackify.com/three-types-of-net-profilers/}}</ref>
 
==History==
Performance-analysis tools existed on [[IBM/360]] and [[IBM/370]] platforms from the early 1970s, usually based on timer interrupts which recorded the [[program status word]] (PSW) at set timer-intervals to detect "hot spots" in executing code.{{citation needed|date=February 2014}} This was an early example of [[Sampling (statistics)|sampling]] (see below). In early 1974 [[Instruction Set Simulator|instruction-set simulator]]s permitted full trace and other performance-monitoring features.{{citation needed|date=February 2014}}
 
Profiler-driven program analysis on Unix dates back to 1973,<ref name="prof">[http://www.tuhs.org/Archive/Distributions/Research/Dennis_v4/v4man.tar.gz Unix Programmer's Manual, 4th Edition]</ref> when Unix systems included a basic tool, <code>prof</code>, which listed each function and how much of program execution time it used. In 1982 <code>gprof</code> extended the concept to a complete [[call graph]] analysis.<ref name="gprof">
S.L. Graham, P.B. Kessler, and M.K. McKusick, [http://docs.freebsd.org/44doc/psd/18.gprof/paper.pdf ''gprof: a Call Graph Execution Profiler''], Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, ''[[SIGPLAN]] Notices'', Vol. 17, No 6, pp. 120-126; [[doi:10.1145/800230.806987]]</ref>
 
In 1994, Amitabh Srivastava and [[Alan Eustace]] of [[Digital Equipment Corporation]] published a paper describing ATOM<ref>
A. Srivastava and A. Eustace, [http://www.ece.cmu.edu/~ece548/tools/atom/man/wrl_94_2.pdf ''ATOM: A system for building customized program analysis tools''], Proceedings of the ACM SIGPLAN Conference on Programming language design and implementation (PLDI '94), pp. 196-205, 1994; ACM ''SIGPLAN Notices'' - Best of PLDI 1979-1999 Homepage archive, Vol. 39, No. 4, pp. 528-539; [[doi:10.1145/989393.989446]]
</ref> (Analysis Tools with OM). The ATOM platform converts a program into its own profiler: at [[compile time]], it inserts code into the program to be analyzed. That inserted code outputs analysis data. This technique - modifying a program to analyze itself - is known as "[[Instrumentation (computer programming)|instrumentation]]".
 
In 2004 both the <code>gprof</code> and ATOM papers appeared on the list of the 50 most influential [[Conference on Programming Language Design and Implementation|PLDI]] papers for the 20-year period ending in 1999.<ref>
[http://www.cs.utexas.edu/users/mckinley/20-years.html 20 Years of PLDI (1979–1999): A Selection], [[Kathryn S. McKinley]], Editor</ref>
 
==Profiler types based on output ==
 
===Flat profiler ===
Flat profilers report the average time spent in each function, aggregated over all of its calls, without breaking the times down by callee or calling context.
 
===Call-graph profiler===
[[Call graph]] profilers<ref name="gprof" /> show the call times and frequencies of the functions, as well as the call chains involved, broken down by callee. In some tools the full calling context is not preserved.
 
===Input-sensitive profiler===
Input-sensitive profilers<ref name="aprof">E. Coppa, C. Demetrescu, and I. Finocchi, [https://web.archive.org/web/20180611201601/https://ieeexplore.ieee.org/document/6858059/ ''Input-Sensitive Profiling''], IEEE Trans. Software Eng. 40(12): 1185-1205 (2014); [[doi:10.1109/TSE.2014.2339825]]</ref><ref>D. Zaparanuks and M. Hauswirth, ''Algorithmic Profiling'', Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2012), ACM SIGPLAN Notices, Vol. 47, No. 6, pp. 67-76, 2012; [[doi:10.1145/2254064.2254074]]</ref><ref>T. Kustner, J. Weidendorfer, and T. Weinzierl, ''Argument Controlled Profiling'', Proceedings of Euro-Par 2009 – Parallel Processing Workshops, Lecture Notes in Computer Science, Vol. 6043, pp. 177-184, 2010; [[doi:10.1007/978-3-642-14122-5 22]]</ref> add a further dimension to flat or call-graph profilers by relating performance measures to features of the input workloads, such as input size or input values. They generate charts that characterize how an application's performance scales as a function of its input.
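
A minimal sketch of the idea (not the method of any particular input-sensitive profiler): the following Python fragment times a hypothetical routine <code>work</code> over several input sizes, producing the kind of size-versus-cost data that such profilers collect automatically for every routine.
<syntaxhighlight lang="python">
import timeit

def work(n):
    """Hypothetical routine whose running time depends on the input size n."""
    return sum(i * i for i in range(n))

# Record the average cost per call for growing input sizes;
# an input-sensitive profiler would gather such data per routine.
for n in (1_000, 10_000, 100_000, 1_000_000):
    seconds = timeit.timeit(lambda: work(n), number=10) / 10
    print(f"n = {n:>9}: {seconds * 1e6:10.1f} microseconds per call")
</syntaxhighlight>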
 
==Data granularity in profiler types==
Profilers, which are also programs themselves, analyze target programs by collecting information on the target program's execution. Based on their data granularity, which depends upon how profilers collect information, they are classified as ''event-based'' or ''statistical'' profilers. Profilers interrupt program execution to collect information. Those interrupts can limit time measurement resolution, which implies that timing results should be taken with a grain of salt. [[Basic block]] profilers report a number of machine [[cycles per instruction|clock cycles]] devoted to executing each line of code, or timing based on adding those together; the timings reported per basic block may not reflect a difference between [[CPU cache|cache]] hits and misses.<ref>{{cite web| work=OpenStax CNX Archive| title=Timing and Profiling - Basic Block Profilers| url=https://archive.cnx.org/contents/d29c016a-2960-4fc9-b431-9eda881a28f5@3/timing-and-profiling-basic-block-profilers#id6897344}}</ref><ref>{{cite journal| last1=Ball| first1=Thomas| last2=Larus| first2=James R.| journal=ACM Transactions on Programming Languages and Systems| volume=16| issue=4| pages=1319–1360| title=Optimally profiling and tracing programs| publisher=ACM Digital Library| year=1994| url=https://www.classes.cs.uchicago.edu/current/32001-1/papers/ball-larus-profiling.pdf| doi=10.1145/183432.183527| s2cid=6897138| access-date=2018-05-18| archive-url=https://web.archive.org/web/20180518195918/https://www.classes.cs.uchicago.edu/current/32001-1/papers/ball-larus-profiling.pdf| archive-date=2018-05-18| url-status=dead}}</ref>
 
===Event-based profilers===
Event-based profilers are available for the following programming languages:
* [[Java (programming language)|Java]]: the [[Java Virtual Machine Tools Interface|JVMTI]] (JVM Tools Interface) API, formerly JVMPI (JVM Profiling Interface), provides hooks to profilers for trapping events like calls, class load/unload, and thread enter/leave.
* [[.NET Framework|.NET]]: A profiling agent can be attached as a ''COM'' server to the ''CLR'' using the Profiling ''API''. Like Java, the runtime then provides various callbacks into the agent, for trapping events like method [[Just-in-time compilation|JIT]]-compilation, enter, leave, object creation, etc. Particularly powerful in that the profiling agent can rewrite the target application's bytecode in arbitrary ways.
* [[Python (programming language)|Python]]: Python profiling includes the profile module, hotshot (which is call-graph based), and the <code>sys.setprofile</code> function for trapping events like <code>c_{call,return,exception}</code> and <code>python_{call,return,exception}</code> (see the sketch after this list).
* [[Ruby (programming language)|Ruby]]: Ruby uses a similar interface to Python for profiling. A flat profiler is implemented in the profile.rb module, and ruby-prof, a C extension, is also available.
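
As a rough illustration of such an event hook (a minimal sketch, not a production profiler), CPython's <code>sys.setprofile</code> lets a callback observe call events and count how often each function is entered:
<syntaxhighlight lang="python">
import sys
from collections import Counter

call_counts = Counter()

def profile_hook(frame, event, arg):
    # The interpreter invokes this hook for 'call', 'return',
    # 'c_call', 'c_return' and 'c_exception' events.
    if event == "call":
        code = frame.f_code
        call_counts[f"{code.co_filename}:{code.co_name}"] += 1

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

sys.setprofile(profile_hook)   # install the event hook
fib(10)
sys.setprofile(None)           # remove the hook

for name, count in call_counts.most_common():
    print(count, name)
</syntaxhighlight>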
 
===Statistical profilers===
Some profilers operate by [[Sampling (statistics)|sampling]]. A sampling profiler probes the target program's [[call stack]] at regular intervals using [[operating system]] [[interrupt]]s. Sampling profiles are typically less numerically accurate and specific, providing only a statistical approximation, but allow the target program to run at near full speed. "The actual amount of error is usually more than one sampling period. In fact, if a value is n times the sampling period, the expected error in it is the square-root of n sampling periods."<ref>[http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html#SEC12 Statistical Inaccuracy of <code>gprof</code> Output] {{webarchive|url=https://web.archive.org/web/20120529075000/http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html |date=2012-05-29 }}</ref>
 
In practice, sampling profilers can often provide a more accurate picture of the target program's execution than other approaches, as they are not as intrusive to the target program and thus don't have as many side effects (such as on memory caches or instruction decoding pipelines). Also since they don't affect the execution speed as much, they can detect issues that would otherwise be hidden. They are also relatively immune to over-evaluating the cost of small, frequently called routines or 'tight' loops. They can show the relative amount of time spent in user mode versus interruptible kernel mode such as [[system call]] processing.
 
Unfortunately, running kernel code to handle the interrupts incurs a minor loss of CPU cycles from the target program, diverts cache usage, and cannot distinguish the various tasks occurring in uninterruptible kernel code (microsecond-range activity) from user code. Dedicated hardware can do better: ARM Cortex-M3 and some recent MIPS processors' JTAG interfaces have a PCSAMPLE register, which samples the [[program counter]] in a truly undetectable manner, allowing non-intrusive collection of a flat profile.
 
Some commonly used<ref>{{cite web| title=Popular C# Profilers| publisher=Gingtage| year=2014| url=http://www.ginktage.com/2014/10/popular-c-profilers/}}</ref> statistical profilers for Java/managed code are [[SmartBear Software]]'s [[AQtime]]<ref>{{cite web| work=AQTime 8 Reference| title=Sampling Profiler - Overview| publisher=SmartBear Software| year=2018| url=https://support.smartbear.com/viewarticle/54581/}}</ref> and [[Microsoft]]'s [[CLR Profiler]].<ref>{{cite web| work=Microsoft .NET Framework Unmanaged API Reference| last=Wenzal| first=Maira|display-authors=etal| title=Profiling Overview| publisher=Microsoft| year=2017| url=https://docs.microsoft.com/en-us/dotnet/framework/unmanaged-api/profiling/profiling-overview#supported-features}}</ref> Those profilers also support native code profiling, along with [[Apple Inc.]]'s [[Apple Developer Tools#Shark|Shark]] (OSX),<ref>{{cite web| work=[[Apple Developer Tools]]| title=Performance Tools| publisher=Apple, Inc.| year=2013| url=https://developer.apple.com/library/content/documentation/Performance/Conceptual/PerformanceOverview/PerformanceTools/PerformanceTools.html}}</ref> [[OProfile]] (Linux),<ref>{{cite web| work=IBM DeveloperWorks| last1=Netto| first1=Zanella| last2=Arnold| first2=Ryan S.| title=Evaluate performance for Linux on Power| year=2012| url=https://www.ibm.com/developerworks/linux/library/l-evaluatelinuxonpower/}}</ref> [[Intel]] [[VTune]] and Parallel Amplifier (part of [[Intel Parallel Studio]]), and [[Oracle Corporation|Oracle]] [[Performance Analyzer]],<ref>{{cite conference |last1=Schmidl |first1=Dirk |first2=Christian |last2=Terboven |first3=Dieter |last3=an Mey |first4=Matthias S. |last4=Müller |title=Suitability of Performance Tools for OpenMP Task-Parallel Programs |conference=Proc. 7th Int'l Workshop on Parallel Tools for High Performance Computing |year=2013 |pages=25–37 |isbn=9783319081441 |url=https://books.google.com/books?id=-I64BAAAQBAJ&pg=PA27}}</ref> among others.
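
To make the sampling approach concrete, the following sketch (Unix-only, and far simpler than any real statistical profiler) uses a recurring SIGPROF timer to record which Python function is executing each time the timer fires; the sample counts approximate where CPU time is spent.
<syntaxhighlight lang="python">
import signal
from collections import Counter

samples = Counter()

def on_sample(signum, frame):
    # 'frame' is the stack frame that was executing when the timer fired;
    # tallying its function name at each tick builds a flat profile.
    samples[frame.f_code.co_name] += 1

signal.signal(signal.SIGPROF, on_sample)
signal.setitimer(signal.ITIMER_PROF, 0.001, 0.001)  # sample roughly every 1 ms of CPU time

def busy():
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

busy()

signal.setitimer(signal.ITIMER_PROF, 0)  # stop sampling
for name, count in samples.most_common():
    print(count, name)
</syntaxhighlight>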
 
===Instrumentation===
This technique effectively adds instructions to the target program to collect the required information. Note that [[Instrumentation (computer programming)|instrumenting]] a program can cause performance changes, and may in some cases lead to inaccurate results and/or [[heisenbug]]s. The effect will depend on what information is being collected, on the level of timing details reported, and on whether basic block profiling is used in conjunction with instrumentation.<ref>{{cite magazine| last1=Carleton| first1=Gary| last2=Kirkegaard| first2=Knud| last3=Sehr| first3=David| title=Profile-Guided Optimizations| magazine=[[Dr. Dobb's Journal]]| year=1998| url=http://www.drdobbs.com/profile-guided-optimizations/184410561}}</ref> For example, adding code to count every procedure/routine call will probably have less effect than counting how many times each statement is obeyed. A few computers have special hardware to collect information; in this case the impact on the program is minimal.

Instrumentation is key to determining the level of control and amount of time resolution available to the profilers.
* '''Manual''': Performed by the programmer, e.g. by adding instructions to explicitly calculate runtimes, simply count events, or make calls to measurement [[API]]s such as the [[Application Response Measurement]] standard (see the sketch after this list).
* '''Automatic source level''': Instrumentation added to the source code by an automatic tool according to an instrumentation policy.
* '''Intermediate language''': Instrumentation added to [[Assembly language|assembly]] or decompiled [[bytecode]]s, giving support for multiple higher-level source languages and avoiding (non-symbolic) binary offset re-writing issues.
* '''Compiler assisted''': The compiler emits the instrumentation, e.g. <code>gcc -pg ...</code> for <code>gprof</code>.
* '''Binary translation''': The tool adds instrumentation to a compiled [[executable]]. Example: ATOM.
* '''Runtime instrumentation''': The code is instrumented directly before execution. The program run is fully supervised and controlled by the tool. Examples: Pin, [[Valgrind]].
* '''Runtime injection''': More lightweight than runtime instrumentation. Code is modified at runtime to have jumps to helper functions. Example: DynInst.
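
A minimal sketch of the manual approach above, using hand-inserted timing code rather than a measurement API such as Application Response Measurement; the wrapped routine and its workload are illustrative only.
<syntaxhighlight lang="python">
import time

call_count = 0
total_seconds = 0.0

def instrumented_work(n):
    # Manually inserted measurement code surrounding the real work.
    global call_count, total_seconds
    start = time.perf_counter()
    result = sum(range(n))                    # the routine being measured
    total_seconds += time.perf_counter() - start
    call_count += 1
    return result

for n in (10_000, 100_000, 1_000_000):
    instrumented_work(n)

print(f"{call_count} calls, {total_seconds * 1000:.2f} ms total")
</syntaxhighlight>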
 
===Interpreter instrumentation===
* '''Interpreter debug''' options can enable the collection of performance metrics as the interpreter encounters each target statement. [[Bytecode]], [[control table]] and [[Just-in-time compilation|JIT]] interpreters are three examples that usually have complete control over execution of the target code, enabling extremely comprehensive data collection opportunities (as illustrated in the sketch below).
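
A rough sketch of statement-level collection (here using CPython's <code>sys.settrace</code> hook rather than a dedicated interpreter debug build) counts how often each source line is executed:
<syntaxhighlight lang="python">
import sys
from collections import Counter

line_hits = Counter()

def trace_lines(frame, event, arg):
    # The interpreter reports a 'line' event before executing each statement.
    if event == "line":
        line_hits[(frame.f_code.co_filename, frame.f_lineno)] += 1
    return trace_lines  # keep tracing inside the current scope

def collatz_length(n):
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

sys.settrace(trace_lines)
collatz_length(27)
sys.settrace(None)

for (filename, lineno), hits in sorted(line_hits.items()):
    print(f"{filename}:{lineno} executed {hits} times")
</syntaxhighlight>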
 
===Hypervisor/simulator===
* '''Hypervisor''': Data are collected by running the (usually) unmodified program under a [[hypervisor]]. Example: [[SIMMON]]
* '''Simulator''': Data are collected interactively and selectively by running the unmodified program under an [[instruction set simulator]].
 
==See also==
 
<!-- Please keep entries in alphabetical order & add a short description {{annotated link|WP:SEEALSO}} -->
{{div col|small=yes|colwidth=20em}}
* {{annotated link|Algorithmic efficiency}}
* {{annotated link|Benchmark (computing)|Benchmark}}
* {{annotated link|Java performance}}
* {{annotated link|List of performance analysis tools}}
* {{annotated link|Performance Application Programming Interface|PAPI}}
* {{annotated link|Performance engineering}}
* {{annotated link|Performance prediction}}
* {{annotated link|Performance tuning}}
* {{annotated link|Runtime verification}}
* {{annotated link|Profile-guided optimization}}
* {{annotated link|Static code analysis}}
* {{annotated link|Software archaeology}}
* {{annotated link|Worst-case execution time}} (WCET)
{{div col end}}
<!-- please keep entries in alphabetical order -->
 
== References==
{{reflist|30em}}
 
==External links==
* Article "[http://www.ibm.com/developerworks/rational/library/05/1004_gupta/ Need for speed &mdash; Eliminating performance bottlenecks]" on doing execution time analysis of Java applications using [[IBM Rational Application Developer]].
* [http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html gprof], the GNU Profiler, part of GNU Binutils (which are part of the GNU project); it can be combined with visualisation tools such as the [http://rw4.cs.uni-sb.de/~sander/html/gsvcg1.html VCG tools] via the [http://www.ida.liu.se/~vaden/cgdi Call Graph Drawing Interface] (CGDI); another front-end is [http://kprof.sourceforge.net/ kprof]. Primarily for C/C++, but it works well for other languages.
*[http://software.intel.com/sites/products/documentation/hpc/vtune/windows/jit_profiling.pdf Profiling Runtime Generated and Interpreted Code using the VTune Performance Analyzer]
* [http://www710.univ-lyon1.fr/~yperret/fnccheck/profiler.html FunctionCheck], [http://sourceforge.net/projects/fnccheck/ @ sourceforge.net], is a profiler created to address some limitations of gprof; it uses the GCC <code>-finstrument-functions</code> option. [http://kprof.sourceforge.net/ kprof] is a front-end. For C/C++.
* [http://valgrind.kde.org/ Valgrind] is a GPL'd system for debugging and profiling x86-Linux programs. It can automatically detect many memory management and threading bugs. [http://alleyoop.sourceforge.net/ alleyoop] is a front-end for Valgrind. It works for any language, including assembly.
* [http://www.intel.com/cd/software/products/asmo-na/eng/vtune/index.htm/ VTune] is Intel's family of commercial performance analyzers for Windows and Linux executables on Intel CPUs. It has command-line tools, a standalone environment and plugins for Microsoft Visual Studio and Eclipse.
* [http://developer.apple.com/tools/sharkoptimize.html/ Shark] is Apple's free performance analyzer for Macintosh executables.
* [http://www-306.ibm.com/software/awdtools/purifyplus/ PurifyPlus] is a commercial family of performance analysis tools from IBM's Rational unit. For Linux, UNIX and Windows.
* [http://developer.amd.com/cawin.aspx CodeAnalyst] is AMD's free performance analyzer for Windows programs on AMD hardware. AMD also has a [http://developer.amd.com/calinux.aspx Linux version] of CodeAnalyst.
* [http://oprofile.sourceforge.net/ OProfile] statistical, kernel based [[GPL]] profiler for Linux
* [http://www.digitalmars.com/ctg/trace.html Profiler] for use with [[Digital Mars]] [[C programming language|C]], C++ and [[D programming language|D]] compilers.
* [http://jrat.sf.net JRat] [[Java programming language|Java]] Runtime Analysis Toolkit a [[LGPL]] profiler
* [http://www.microsoft.com/downloads/details.aspx?FamilyId=86CE6052-D7F4-4AEB-9B7A-94635BEEBDDA&displaylang=en CLR Profiler] is a free [[Common Language Runtime|CLR]] profiler provided by Microsoft for CLR applications.
* [http://developers.sun.com/prodtech/cc/analyzer_index.html Performance Analyzer] included with Sun Studio (now free!)
* [http://www.yourkit.com YourKit] a profiler for Java and .NET framework.
* [http://www.quest.com/jprobe/ JProbe], a profiler by [[Quest Software]] that is now part of the JProbe suite which also includes tools such as a [[memory debugger]].
* Article "[http://www.ibm.com/developerworks/rational/library/05/1004_gupta/ Need for speed -- Eliminating performance bottlenecks]" on doing execution time analysis of Java applications using IBM Rational Application Developer.
* [http://softtecharticles.com/mambo/index.php?option=com_content&task=view&id=42&Itemid=36 Tutorial on the use of Oprofile]
{{DEFAULTSORT:Software Performance Analysis}}
[[Category:Computer programming]]
[[Category:Software optimization]]
[[Category:Profilers|*]]

[[de:Profiler (Programmierung)]]
[[nl:Profiler]]
[[ru:Профилирование]]
[[zh:客户轮廓分析]]