Difference between revisions of "Performance/WhatToTest"

From Adobe Open Source Wiki
Jump to: navigation, search
(More Complex Stuff)
(More Complex Stuff)
 
(38 intermediate revisions by one user not shown)
Line 1: Line 1:
What things need to be tested as part of this benchmark?
+
What things need to be tested as part of this benchmark suite?
 +
 
 +
If you have additions or suggestions, please add them.
 +
 
 +
 
 +
__TOC__
 +
 
  
 
== Fundamental Concepts ==
 
== Fundamental Concepts ==
Line 6: Line 12:
 
* basic math
 
* basic math
 
* basic flow control
 
* basic flow control
* dereference (pointer/iterator is a regular type that can be dereferenced)
+
* dereference
 +
** a pointer/iterator is a regular type that can be dereferenced
 
* type conversion
 
* type conversion
 
** performance of type conversions (known to be a problem on certain compilers)
 
** performance of type conversions (known to be a problem on certain compilers)
 +
 +
  
 
== Language Concepts ==
 
== Language Concepts ==
 
* abstraction
 
* abstraction
 +
**what happens to performance when I put { } around a value?
 +
 
* inheritance
 
* inheritance
* function calls
+
**what happens if my function is in a child class, virtual, or both?
* exceptions
+
**what happens if my values are in a child class?
* rtti
+
 
* function objects
+
* function calls - how expensive are different styles of function call?
 +
** inline, function pointer, defined in another module, etc.
 +
 
 +
* exceptions - what is the cost of using exceptions?
 +
** throwing versus not
 +
** compared to error return codes
 +
** exceptions enabled or not in the compiler
 +
** compared to setjmp/longjmp?
 +
 
 +
* RTTI - what are the costs?
 +
** compared to per-object ids or strings
 +
 
 +
* function objects - what are the costs/benefits compared to inline functions and function pointers?
 +
 
 
* template instantiation
 
* template instantiation
 +
** are there any problems with duplicate template bodies and code size?
 +
** are there any problems with recursive templates or deep usage of templates?
 +
 
* constructors
 
* constructors
 
** Are empty constructors correctly optimized away?
 
** Are empty constructors correctly optimized away?
 +
** Are copy constructors correctly optimized away?
 +
** Are default copy constructors efficient?
  
 
== Simple Idioms ==
 
== Simple Idioms ==
 
These are building blocks, with more than one way to express them.  These are commonly reused things, usually found in headers like "inlines.h", "math_utils.h", etc.
 
These are building blocks, with more than one way to express them.  These are commonly reused things, usually found in headers like "inlines.h", "math_utils.h", etc.
* swap
+
 
 +
For all of these:  What is the optimal implementation, and does the compiler recognize other implementations and substitute the optimal one?
 +
 
 +
 
 +
* swapping 2 values
 +
** does anything need to be tested for swapping 2 different types, or is that handled by assignment/conversion testing?
 
* absolute value
 
* absolute value
* min/max
+
* min/max of 2 values
 
* switch versus if/else trees
 
* switch versus if/else trees
 
* pin values to range (max, min combo)
 
* pin values to range (max, min combo)
* rotate bits
+
* rotate bits within a byte/word
 
* byte order reversal
 
* byte order reversal
 
* round up/down
 
* round up/down
Line 34: Line 68:
 
* thresholding
 
* thresholding
 
* pointer alignment tests and comparisons
 
* pointer alignment tests and comparisons
 +
* fixed point math
 +
* extended precision math (ie: 64 bit math on a 32 bit CPU)
  
 
== Complex Idioms ==
 
== Complex Idioms ==
 +
These may require multiple optimizations to perform well, or some pattern/idiom recognition in the compiler.
 +
These are simple algorithms important to many different tasks/fields, but are frequently the inner loops and most performance sensitive code.
 +
<BR>
 +
 +
 +
* safe bool cast from a smart pointer
 
* hand coded memcpy, memmove, memset, memcmp
 
* hand coded memcpy, memmove, memset, memcmp
 
** simple types and user defined types
 
** simple types and user defined types
 
** arrays and containers
 
** arrays and containers
 
* safe bool cast from a smart pointer
 
 
* convolution
 
* convolution
** 1D and 2D
+
** 1D, 2D, and more?
 
* matrix multiplication
 
* matrix multiplication
 +
** 2D is common
 +
** higher dimensions get a bit esoteric
 
* min/max of a sequence
 
* min/max of a sequence
 
* summation of a sequence
 
* summation of a sequence
 +
* product of a sequence
 +
* dot product of sequences
 
* sorting a sequence
 
* sorting a sequence
 +
** many possible algorithms and implementations
 +
** given the permutations, is this testable?
 +
** maybe just test STL implementations?
 
* reversing a sequence
 
* reversing a sequence
 
* rotating members of a sequence
 
* rotating members of a sequence
 
* find/search sequence
 
* find/search sequence
 +
** again, many possible algorithms and implementations
 +
** given the permutations, is this testable?
 +
** maybe just test STL implementations?
 
* lookup tables
 
* lookup tables
 
* interleave/deinterleave buffers
 
* interleave/deinterleave buffers
 
* transpose block
 
* transpose block
* rotate a block +- 90 and 180 degrees
+
* rotate a 2D block
 +
** plus or minus 90 degrees
 +
** 180 degrees
 +
** horizontal flip
 +
** vertical flip
 +
** transpose
 
* histogram building
 
* histogram building
 
* stack like operations
 
* stack like operations
Line 60: Line 115:
  
 
== Runtime and Library Support ==
 
== Runtime and Library Support ==
 +
These are library functions or header defined functions/macros supplied by the compiler or OS. These can sometimes be the performance limiting factor in a variety of applications.
 +
 +
 
* allocation / deletion
 
* allocation / deletion
 
** new, delete
 
** new, delete
 
** malloc, free
 
** malloc, free
 +
** performance for various sizes (testing suballocation)
  
 
* mathlib
 
* mathlib
 +
** all functions
 +
** compared to inline versions of some functions
  
 
* string manipulation routines
 
* string manipulation routines
** strcpy, strchr, strcat, etc.
+
** strcpy, strcmp, strchr, strcat, etc.
 +
** strncpy, strncmp, strncat, etc.
  
 
* memory routines
 
* memory routines
 
** memcpy, memmove, memset, memcmp, etc.
 
** memcpy, memmove, memset, memcmp, etc.
 +
 +
* ctype routines
 +
** isdigit, isalpha, etc.
 +
 +
* stdio
 +
** printf, getc, fread, fwrite, etc.
 +
 +
* C++ iostreams
 +
** cout, ostream, etc.
 +
** compare cout versus printf
 +
 +
* stdlib
 +
** strtod, strtof
 +
 +
* threads
 +
** creation, destruction, sleep overhead, semaphores, mutexes
  
 
== More Complex Stuff ==
 
== More Complex Stuff ==
 
* STL containers
 
* STL containers
 +
** insertion
 +
*** push_front
 +
*** push_back
 +
*** in-order, reverse-order and random insert for ordered containers
 +
** removal
 +
** deletion of container
 +
** iterating items
 +
** reverse iterating items
 +
** copy containers
 +
** erase all entries
 +
** search/find/lookup
 +
*** in-order, reverse-order and random
 +
** vector
 +
*** is iteration of vector the same as iterating a pointer
 +
*** is operator[] the same speed as indexing a pointer
 +
*** what is the penalty for using operator at() range checking
 +
 
* STL algorithms
 
* STL algorithms
 
* iostreams
 
* iostreams
 +
 +
 +
* strstr / text searching
 +
** no "best" algorithm
 +
** large variance in optimum based on alphabet (roman text, gene sequencing, bit pattern)
 +
 +
* Discrete Wavelet/Fourier/Cosine/Sine Transform
 +
** 1D (Audio, PDE spectral methods)
 +
** 2D (video, image compression)
 +
** simple and optimized algorithms
 +
** wavelet and FFT have many near optimal implementations
 +
** JPEG, MPEG
  
 
== Optimizations ==
 
== Optimizations ==
Need to test as many as possible.
+
We need to test as many optimizations as possible. Priority should be given to those that are used most often and offer the highest benefit to the most code (which isn't always easy to judge).
And this list deserves it's own page.
+
 
 +
 
 +
 
 +
* algebraic simplification
 +
* constant folding
 +
* constant propagation
 +
* copy propagation
 +
* common subexpression elimination
 +
* loop invariant code motion
 +
* dead code elimination
 +
* strength reduction
 +
* value range propagation (aka Predicate Simplification?)
 +
* loop fusion
 +
* loop interchange
 +
* loop normalization
 +
* loop removal
 +
* loop unrolling
 +
* loop unswitching
 +
* normalization of references
 +
* scalar replacement of arrays
 +
* scalar replacement of structs
 +
* instruction scheduling
 +
* removal of unused objects/allocations
 +
** this may not be legal in some circumstances
 +
* inlining
 +
** keyword inline
 +
** auto inline of small functions
 +
** functions defined in classes
 +
** functions defined in templates
 +
** depth of inlining (could be difficult to test)
 +
** leaf first or trunk first?
 +
** what factors might disable inlining on some compilers?
  
 
== Conformance to Specification ==
 
== Conformance to Specification ==
Line 85: Line 223:
 
* short circuit evaluation of boolean expressions
 
* short circuit evaluation of boolean expressions
 
* growth policies and complexities of STL containers
 
* growth policies and complexities of STL containers
 +
* concept checks compile away to nothing
 +
 +
== Binary Size ==
 +
* Are duplicate templates correctly collapsed?
 +
* Are unused template member functions correctly removed?
 +
* Are unused (or duplicate) static, const, and static const values correctly removed?
 +
 +
  
 
== Language Extensions ==
 
== Language Extensions ==
* OpenMP
+
* [http://www.boost.org/ Boost]
 +
* [http://openmp.org/wp/ OpenMP]
 +
 
 +
 
  
 
== Other ? ==
 
== Other ? ==
 +
* ineffiency returning or working with std::pair ?

Latest revision as of 00:35, 26 March 2009

What things need to be tested as part of this benchmark suite?

If you have additions or suggestions, please add them.


Contents


Fundamental Concepts

Most of these can't easily be tested.

  • regular types
  • basic math
  • basic flow control
  • dereference
    • a pointer/iterator is a regular type that can be dereferenced
  • type conversion
    • performance of type conversions (known to be a problem on certain compilers)


Language Concepts

  • abstraction
    • what happens to performance when I put { } around a value?
  • inheritance
    • what happens if my function is in a child class, virtual, or both?
    • what happens if my values are in a child class?
  • function calls - how expensive are different styles of function call?
    • inline, function pointer, defined in another module, etc.
  • exceptions - what is the cost of using exceptions?
    • throwing versus not
    • compared to error return codes
    • exceptions enabled or not in the compiler
    • compared to setjmp/longjmp?
  • RTTI - what are the costs?
    • compared to per-object ids or strings
  • function objects - what are the costs/benefits compared to inline functions and function pointers?
  • template instantiation
    • are there any problems with duplicate template bodies and code size?
    • are there any problems with recursive templates or deep usage of templates?
  • constructors
    • Are empty constructors correctly optimized away?
    • Are copy constructors correctly optimized away?
    • Are default copy constructors efficient?

Simple Idioms

These are building blocks, with more than one way to express them. These are commonly reused things, usually found in headers like "inlines.h", "math_utils.h", etc.

For all of these: What is the optimal implementation, and does the compiler recognize other implementations and substitute the optimal one?


  • swapping 2 values
    • does anything need to be tested for swapping 2 different types, or is that handled by assignment/conversion testing?
  • absolute value
  • min/max of 2 values
  • switch versus if/else trees
  • pin values to range (max, min combo)
  • rotate bits within a byte/word
  • byte order reversal
  • round up/down
  • round float to int
  • thresholding
  • pointer alignment tests and comparisons
  • fixed point math
  • extended precision math (ie: 64 bit math on a 32 bit CPU)

Complex Idioms

These may require multiple optimizations to perform well, or some pattern/idiom recognition in the compiler. These are simple algorithms important to many different tasks/fields, but are frequently the inner loops and most performance sensitive code.


  • safe bool cast from a smart pointer
  • hand coded memcpy, memmove, memset, memcmp
    • simple types and user defined types
    • arrays and containers
  • convolution
    • 1D, 2D, and more?
  • matrix multiplication
    • 2D is common
    • higher dimensions get a bit esoteric
  • min/max of a sequence
  • summation of a sequence
  • product of a sequence
  • dot product of sequences
  • sorting a sequence
    • many possible algorithms and implementations
    • given the permutations, is this testable?
    • maybe just test STL implementations?
  • reversing a sequence
  • rotating members of a sequence
  • find/search sequence
    • again, many possible algorithms and implementations
    • given the permutations, is this testable?
    • maybe just test STL implementations?
  • lookup tables
  • interleave/deinterleave buffers
  • transpose block
  • rotate a 2D block
    • plus or minus 90 degrees
    • 180 degrees
    • horizontal flip
    • vertical flip
    • transpose
  • histogram building
  • stack like operations
  • bitarrays and manipulations of them
  • complex type and template

Runtime and Library Support

These are library functions or header defined functions/macros supplied by the compiler or OS. These can sometimes be the performance limiting factor in a variety of applications.


  • allocation / deletion
    • new, delete
    • malloc, free
    • performance for various sizes (testing suballocation)
  • mathlib
    • all functions
    • compared to inline versions of some functions
  • string manipulation routines
    • strcpy, strcmp, strchr, strcat, etc.
    • strncpy, strncmp, strncat, etc.
  • memory routines
    • memcpy, memmove, memset, memcmp, etc.
  • ctype routines
    • isdigit, isalpha, etc.
  • stdio
    • printf, getc, fread, fwrite, etc.
  • C++ iostreams
    • cout, ostream, etc.
    • compare cout versus printf
  • stdlib
    • strtod, strtof
  • threads
    • creation, destruction, sleep overhead, semaphores, mutexes

More Complex Stuff

  • STL containers
    • insertion
      • push_front
      • push_back
      • in-order, reverse-order and random insert for ordered containers
    • removal
    • deletion of container
    • iterating items
    • reverse iterating items
    • copy containers
    • erase all entries
    • search/find/lookup
      • in-order, reverse-order and random
    • vector
      • is iteration of vector the same as iterating a pointer
      • is operator[] the same speed as indexing a pointer
      • what is the penalty for using operator at() range checking
  • STL algorithms
  • iostreams


  • strstr / text searching
    • no "best" algorithm
    • large variance in optimum based on alphabet (roman text, gene sequencing, bit pattern)
  • Discrete Wavelet/Fourier/Cosine/Sine Transform
    • 1D (Audio, PDE spectral methods)
    • 2D (video, image compression)
    • simple and optimized algorithms
    • wavelet and FFT have many near optimal implementations
    • JPEG, MPEG

Optimizations

We need to test as many optimizations as possible. Priority should be given to those that are used most often and offer the highest benefit to the most code (which isn't always easy to judge).


  • algebraic simplification
  • constant folding
  • constant propagation
  • copy propagation
  • common subexpression elimination
  • loop invariant code motion
  • dead code elimination
  • strength reduction
  • value range propagation (aka Predicate Simplification?)
  • loop fusion
  • loop interchange
  • loop normalization
  • loop removal
  • loop unrolling
  • loop unswitching
  • normalization of references
  • scalar replacement of arrays
  • scalar replacement of structs
  • instruction scheduling
  • removal of unused objects/allocations
    • this may not be legal in some circumstances
  • inlining
    • keyword inline
    • auto inline of small functions
    • functions defined in classes
    • functions defined in templates
    • depth of inlining (could be difficult to test)
    • leaf first or trunk first?
    • what factors might disable inlining on some compilers?

Conformance to Specification

  • user specified swap in STL algorithms
  • short circuit evaluation of boolean expressions
  • growth policies and complexities of STL containers
  • concept checks compile away to nothing

Binary Size

  • Are duplicate templates correctly collapsed?
  • Are unused template member functions correctly removed?
  • Are unused (or duplicate) static, const, and static const values correctly removed?


Language Extensions


Other ?

  • ineffiency returning or working with std::pair ?