J. Stan Cox’s research while affiliated with North Carolina State University and other places

Publications (4)


Hardware-Based Profiling: An Effective Technique for Profile-Driven Optimization
  • Article

April 1996 · 69 Reads · 43 Citations

International Journal of Parallel Programming

Thomas M. Conte · Kishore N. Menezes · Burzin A. Patel · J. Stan Cox

Profile-based optimization can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run significantly slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper introduces hardware-based profiling that uses traditional branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with small slowdown in execution (0.4%–4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance systems.

Thomas M. Conte

December 1995 · 11 Reads

Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run significantly slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper introduces hardware-based profiling that uses traditional branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with small slowdown in execution (0.4%–4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance systems. Keywords: Bran...


Using Branch Handling Hardware to Support Profile-Driven Optimization

November 1995 · 16 Reads · 32 Citations

Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run 2–30 times slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper proposes using existing branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with small slowdown in execution (0.4%–4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. This practically removes the inconvenience of profiling. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance sy...
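
To put the two overhead figures quoted in the abstract above side by side, here is a minimal arithmetic sketch in Python. Only the 2x to 30x instrumentation slowdown and the 0.4% to 4.6% hardware-profiling slowdown come from the abstract; the 600-second baseline run time and the function names are assumptions chosen purely for illustration.

```python
def instrumented_time(base_seconds: float, slowdown_factor: float) -> float:
    """Run time when software instrumentation multiplies execution time."""
    return base_seconds * slowdown_factor


def hardware_profiled_time(base_seconds: float, overhead_percent: float) -> float:
    """Run time when profiling adds a percentage overhead to execution time."""
    return base_seconds * (1.0 + overhead_percent / 100.0)


if __name__ == "__main__":
    base = 600.0  # hypothetical unprofiled run time in seconds (assumption)

    # Ranges quoted in the abstract.
    for factor in (2, 30):
        print(f"instrumented, {factor}x slowdown: "
              f"{instrumented_time(base, factor):.1f} s")
    for pct in (0.4, 4.6):
        print(f"hardware-based, {pct}% slowdown: "
              f"{hardware_profiled_time(base, pct):.1f} s")
```

Under these assumptions, the instrumented run takes between 20 minutes and 5 hours, while the hardware-profiled run stays within about half a minute of the 10-minute baseline.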


Commercializing Profile-Driven Optimization

October 1994 · 22 Reads · 4 Citations

There is a broad selection of code-improving optimizations and scheduling techniques based on profile information. Industry has been slow to productize these because traditional ways of profiling are cumbersome. Profiling slows down the execution of a program by a factor of 2 to 30. Software vendors must compile, profile, and then re-compile their products. In addition, profiling requires a representative set of inputs and is hard to validate. Finally, profiling has had little success for system code such as the kernel and I/O drivers. This paper discusses experiences AT&T Global Information Solutions has had with commercializing profile-driven optimizations. Three approaches to profiling are discussed, along with results and comments concerning their advantages and drawbacks. The validity of profiling is discussed. One innovation, hardware-based profiling, removes many of the problems vendors have with profiling. The paper also discusses methods to profile system code and suppo...

Citations (3)


... Profiling can generally be divided into static (offline) and dynamic (during the execution). Dynamic profiling can use instrumentation, hardware counters [2], electromagnetic emanations [3], and program pre-runs, possibly in an emulator. Dynamic profiling has the following disadvantages: it by definition requires code to run, negatively affects program execution performance, requires representative data, and is time consuming. ...

Reference:

Method for Profile-Guided Optimization of Android Applications Using Random Forest
Hardware-Based Profiling: An Effective Technique for Profile-Driven Optimization

International Journal of Parallel Programming

... For hash tables, it can come up with a near-perfect hash function that is suited to the actual data that the program will be using. Static compiler writers have realized the benefit of using actual performance data, and profile-directed optimization has already been incorporated into some modern commercial compilers [88,131,89]. ...

Commercializing Profile-Driven Optimization
  • Citing Article
  • October 1994

... To enable run-time analysis with low overhead, many researchers have proposed the development of specialized on-chip hardware modules that can assist software developers in building more secure, more bug-free, and more efficient applications. For example, a processor may be extended to dynamically insert instructions into the execution stream to profile a program for performance or buffer exploits [15] [16], analysis modules may be added to uncover the performance bottlenecks, hardware performance monitors may track the activities of the cache or branch unit [6] [13] [41] [52] [14] [21], replay boxes may be inserted for tracking down difficult-to-reproduce bugs [49], and a host of other mechanisms have been proposed in the research literature. While the amount of information available at the hardware level makes this a natural place to add new runtime analysis functionality, the inclusion of specialized on-chip hardware is at odds with the cost and marketing constraints of those who build consumer microprocessor systems. ...

Using Branch Handling Hardware to Support Profile-Driven Optimization
  • Citing Article
  • November 1995