Ada 95 Quality and Style Guide                      Chapter 10

Chapter 10 Improving Performance

CHAPTER 10 Improving Performance
10.1 PERFORMANCE ISSUES
10.2 PERFORMANCE MEASUREMENT
10.3 PROGRAM STRUCTURE
10.4 DATA STRUCTURES
10.5 ALGORITHMS
10.6 TYPES
10.7 PRAGMAS
10.8 SUMMARY

In many ways, performance is at odds with maintainability and portability. To achieve improved speed or memory usage, the most clear algorithm sometimes gives way to confusing code. To exploit special purpose hardware or operating system services, nonportable implementation dependencies are introduced. When concerned about performance, you must decide how well each algorithm meets its performance and maintainability goals. Use the guidelines in this chapter with care; they may be hazardous to your software.

The best way to build a system that satisfies its performance requirements is through good design. You should not assume that speeding up your code will result in a visible increase in system execution. In most applications, the overall throughput of the system is not defined by the execution speed of the code but by the interaction between concurrent processes and the response time of the system peripherals.

Most of the guidelines in this chapter read ". . . when measured performance indicates." "Indicates" means that you have determined that the benefit in increased performance to your application in your environment outweighs the negative side effects on understandability, maintainability, and portability of the resulting code. Many of the guideline examples show the alternatives that you will need to measure in order to determine if the guideline is indicated.

10.1 PERFORMANCE ISSUES

Performance has at least four aspects: execution speed, code size, compilation speed, and linking speed. Although all four are important, most people think of execution speed when performance is mentioned, and most of the guidelines in this chapter focus on execution speed.

Performance is influenced by many factors, including the compilation software, hardware, system load, and coding style. While only coding style is typically under the control of the programmer, the other factors have so much influence that it is impossible to make flat statements such as "case statements are more efficient than
if-then-else structures." When performance is critical, you cannot assume that a coding style that proves more efficient on one system will also be more efficient on another. Decisions made for the sake of performance must be made on the basis of testing the alternatives on the actual system on which the application will be fielded.

10.2 PERFORMANCE MEASUREMENT

While most well-known tools for measuring performance are stand-alone programs that concentrate on execution speed, there is a comprehensive tool that covers all four aspects of performance. The Ada Compiler Evaluation System (ACES) is the result of merging two earlier products: the United States Department of Defense's Ada Compiler Evaluation Capability and the United Kingdom Ministry of Defence's Ada Evaluation System. It offers a comprehensive set of nearly 2,000 performance tests along with automated setup, test management, and analysis software. This system reports (and statistically analyzes) compilation time, linking time, execution time, and code size. The analysis tools make comparisons among multiple compilation-execution systems and also provide comparisons of the run-time performance of tests using different coding styles to achieve similar purposes.

Version 2.0 of the ACES, released in March of 1995, includes a Quick-Look facility that is meant to replace the Performance Issues Working Group (PIWG) suite. The Quick-Look facility is advertised as being easy to download, install, and execute in less than a day, while providing information that is as useful as that generated by the PIWG suite. In addition, Version 2.0 contains a limited number of Ada 95 tests (all of which are also included in the Quick-Look subset). Version 2.1, including broad coverage of the "core" Ada 95 language, is scheduled for release in March 1996.

At the time of this writing, the ACES software and documentation can be obtained via anonymous FTP from the host sw-eng.falls-church.va.us, directory /public/AdaIC/testing/aces. For World Wide Web access, use the following uniform resource locator (URL): http://sw-eng.falls-church.va.us/AdaIC/testing/aces/.

While measuring performance may seem to be a relatively straightforward matter, there are significant issues that must be addressed by any person or toolset planning to do such measurement. For detailed information, see the following sources: ACES (1995a, 1995b, 1995c); Clapp, Mudge, and Roy (1990); Goforth, Collard, and Marquardt (1990); Knight (1990); Newport (1995); and Weidermann (1990).

10.3 PROGRAM STRUCTURE

10.3.1 Blocks

guideline

example
   ...
   Initial : Matrix;

begin  -- Find_Solution

   Initialize_Solution_Matrix:
      for Row in Initial'Range(1) loop
         for Col in Initial'Range(2) loop
            Initial (Row, Col) := Get_Value (Row, Col);
         end loop;
      end loop Initialize_Solution_Matrix;

   Converge_To_The_Solution:
      declare

         Solution       : Matrix           := Identity;
         Min_Iterations : constant Natural := ...;

      begin  -- Converge_To_The_Solution
         for Iterations in 1 .. Min_Iterations loop
            Converge (Solution, Initial);
         end loop;

      end Converge_To_The_Solution;

   ...
end Find_Solution;

rationale

Late initialization allows a compiler more choices in register usage optimization. Depending on the circumstance, this may introduce a significant performance improvement.

Some compilers incur a performance penalty when declarative blocks are introduced. Careful analysis and timing tests by the programmer may identify those declarative blocks that should be removed.

notes

It is difficult to accurately predict through code inspections which declarative blocks improve performance and which degrade performance. However, with these general guidelines and a familiarity with the particular implementation, performance can be improved.

10.4 DATA STRUCTURES

10.4.1 Dynamic Arrays

guideline

rationale

If array bounds are not known until run-time, then calculations of these bounds may affect run-time performance. Using named constants or static expressions as array bounds may provide better performance than using variables or nonstatic expressions. Thus, if the values of Lower and Upper are not determined until run-time, then:
... is array (Lower .. Upper) of ...

may cause address and offset calculations to be delayed until run-time, introducing a performance penalty. See NASA (1992) for a detailed discussion of the tradeoffs and alternatives.

10.4.2 Zero-Based Arrays

guideline

rationale

For some compilers, offset calculations for an array whose lower bound is 0 (either the integer zero or the first value of an enumeration type) are simplified. For other compilers, optimization is more likely if the lower bound is 1.

10.4.3 Unconstrained Records

guideline

example
subtype Line_Range   is Integer range 0 .. Max_Lines;
subtype Length_Range is Integer range 0 .. Max_Length;

-- Note that Max_Lines and Max_Length need to be static
type Paragraph_Body is array (Line_Range range <>, Length_Range range <>) of Character;

type Paragraph (Lines : Line_Range := 0; Line_Length : Length_Range := 0) is
   record
      Text : Paragraph_Body (1 .. Lines, 1 .. Line_Length);
   end record;

rationale

Determine the size and speed impact of unconstrained records having components depending on discriminants. Some compilers will allocate the maximum possible size to each object of the type; others will use pointers to the dependent components, incurring a possible heap performance penalty. Consider the possibility of using fixed-size components.

10.4.4 Records and Arrays

guideline

example
    -- Array of records
    Process (Student (Index).Name, Student (Index).Grade);
    -- Record of arrays
    Process (Student.Name (Index), Student.Grade (Index));
    -- Parallel arrays
    Process (Name (Index), Grade (Index));

rationale

Determine the impact of structuring data as arrays of records, records containing arrays, or parallel arrays. Some implementations of Ada will show significant performance differences among these examples.

10.4.5 Record and Array Aggregates

guideline

rationale

Determine the impact of using an aggregate versus a sequence of assignments. Using an aggregate generally requires the use of a temporary variable. If the aggregate is "static" (i.e., its size and components are known at compile- or link-time, allowing link-time allocation and initialization), then it will generally be more efficient than a sequence of assignments. If the aggregate is "dynamic," then a series of assignments may be more efficient because no temporary variable is needed.

See Guideline 5.6.10 for a discussion of aggregates from the point of view of readability and maintainability.

See Guideline 10.6.1 for a discussion of extension aggregates.

10.5 ALGORITHMS

10.5.1 Mod and rem Operators

guideline

example
   -- Using mod
   for I in 0 .. N loop
      Update (Arr (I mod Modulus));
   end loop;

   -- Avoiding mod
   J := 0;
   for I in 0 .. N loop
      Update (Arr (J));
      J := J + 1;
      if J = Modulus then
         J := 0;
      end if;
   end loop;

rationale

Determine the impact of using the mod and rem operators. One of the above styles may be significantly more efficient than the other.

10.5.2 Short-Circuit Operators

guideline

example
   -- Nested "if"
   if Last >= Target_Length then
      if Buffer (1 .. Target_Length) = Target then
         ...
      end if;
   end if;

   -- "and then"
   if Last >= Target_Length and then Buffer (1 .. Target_Length) = Target then
      ...
   end if;

rationale

Determine the impact of using nested if statements versus using the and then or or else operator. One of the above may be significantly more efficient than the other.

10.5.3 Case Statement Versus elsif

guideline

example
   subtype Small_Int is Integer range 1 .. 5;
   Switch : Small_Int;
   ...
   -- Case statement
   case Switch is
      when 1 => ...
      when 2 => ...
      when 3 => ...
      when 4 => ...
      when 5 => ...
   end case;

   -- "elsif construct"
   if Switch = 1 then
      ...
   elsif Switch = 2 then
      ...
   elsif Switch = 3 then
      ...
   elsif Switch = 4 then
      ...
   elsif Switch = 5 then
      ...
   end if;

rationale

Determine the impact of using case statements versus the elsif construct. If the case statement is implemented using a small jump table, then it may be significantly more efficient than the if .. then .. elsif construct.

See also Guideline 8.4.6 for a discussion of the table-driven programming alternative.

10.5.4 Checking for Constraint Errors

guideline

example
   subtype Small_Int is Positive range Lower .. Upper;
   Var : Small_Int;
   ...

   -- Using exception handler
   Double:
      begin
         Var := 2 * Var;
      exception
         when Constraint_Error =>
            ...
      end Double;

      -- Using hard-coded check
      if Var > Upper / 2 then
         ...
      else
         Var := 2 * Var;
      end if;

rationale

Determine the impact of using exception handlers to detect constraint errors. If the exception handling mechanism is slow, then hard-coded checking may be more efficient.

10.5.5 Order of Array Processing

guideline

example
    type Table_Type is array (Row_Min .. Row_Max, Col_Min .. Col_Max) of ...
    Table : Table_Type;
    ...
    -- Row-order processing
    for Row in Row_Min .. Row_Max loop
       for Col in Col_Min .. Col_Max loop
          -- Process Table (Row, Col)
       end loop;
    end loop;
    -- Column-order processing
    for Col in Col_Min .. Col_Max loop
       for Row in Row_Min .. Row_Max loop
          -- Process Table (Row, Col)
       end loop;
    end loop;

rationale

Determine the impact of processing two-dimensional arrays in row-major order versus column-major order. While most Ada compilers are likely to use row-major order, it is not a requirement. In the presence of good optimization, there may be no significant difference in the above examples. Using static array bounds is also likely to be significant here. See Guidelines 10.4.1 and 10.4.2.

10.5.6 Assigning Alternatives

guideline

example
   -- Using "if .. else"
   if Condition then
      Var := One_Value;
   else
      Var := Other_Value;
   end if;
   -- Using overwriting
   Var := Other_Value;
   if Condition then
      Var := One_Value;
   end if;

rationale

Determine the impact of styles of assigning alternative values. The examples illustrate two common methods of doing this; for many systems, the performance difference is significant.

10.5.7 Packed Boolean Array Shifts

guideline

example
   subtype Word_Range is Integer range 0 .. 15;
   type Flag_Word is array (Word_Range) of Boolean;
   pragma Pack (Flag_Word);
   Word : Flag_Word;
   ...

   -- Loop to shift by one bit
   for Index in 0 .. 14 loop
      Word (Index) := Word (Index + 1);
   end loop;
   Word (15) := False;

   -- Use slice assignment to shift by one bit
   Word (0 .. 14) := Word (1 .. 15);
   Word (15) := False;

rationale

Determine the impact of slice manipulation when shifting packed Boolean arrays. For Ada 83 implementations using packed Boolean arrays, shift operations may be much faster when slice assignments are used as opposed to for loop moving one component at a time. For Ada 95 implementations, consider using modular types instead (see Guideline 10.6.3).

10.5.8 Subprogram Dispatching

guideline