Chapter 18:
Designing a spreadsheet

Write the vision, and make it plain upon tables.
— Habakkuk, 2:2

18.1 Spreadsheets
18.2 Defining the program
18.3 The spreadsheet class
18.4 Cell implementation
18.5 Formula cells
18.6 Deriving a new expression type
Exercises

18.1 Spreadsheets

Now that the principal features of tagged types have been covered, it’s time to take another look at the design process and see how all this fits in. In the first part of the book I described how top-down design could be used to break problems down into a number of smaller problems; in the second part I described how the design process should revolve around the types of data that the program is intended to model using a combination of top-down and bottom-up design. The object-oriented design process also involves identifying type classes. As I mentioned in connection with abstract data types in chapter 10, nouns in a specification often correspond to data types and verbs to the operations on those types. Adjectives also provide a useful clue for identifying type classes by their inheritance relationships; for example, an urgent appointment is clearly a form of appointment so there will presumably be an inheritance relationship between Appointment_Type and Urgent_Appointment_Type (or between these and some common parent type). Also, as I showed you in the last chapter, not all type classes contain data; sometimes classes which just encapsulate a set of operations are a useful abstraction.

The design process is therefore similar to the approach I described for use with abstract data types except that there are a few more questions you need to ask. If two types are similar, is there an inheritance relationship between them or should there be a common parent type which encapsulates the features they share? In either case, you want to use tagged types. Then you need to consider what operations of the types are primitive, which should be class-wide, which should not be primitive. Might any of the types need extending at some point in the future? If so, you need to use tagged types again. This case requires some foresight and a feeling for possible maintenance scenarios in order to decide what sort of provision to make for future extensions.

In this chapter I’m going to illustrate this by developing a spreadsheet. Spreadsheets are among the most widely used applications for computers, and I’ve chosen it as an example because it’s one I expect most people to be familiar with. A spreadsheet consists of a grid of cells; the rows are numbered from 1 upwards and the columns are named A, B, C and so on. To allow for more than 26 columns, columns after Z are named AA, AB, AC and so on up to AZ, then BA, BB, BC and so on. Individual cells are referred to by their grid coordinate, e.g. A1 or BC100.

There are several possible different types of cell. Initially all cells are empty; the user can enter an expression to be stored in a cell (a formula cell) or a literal string that is displayed as it is (a string cell) for use as table headings and so on. Other types of cell are also possible. Naturally, the expression evaluator developed in the previous chapter will be a useful component for implementing formula cells.

Whenever the spreadsheet changes it is recalculated and redisplayed. Recalculation involves evaluating the expression in each formula cell. Expressions can refer to the values in other formula cells, so a change to a single cell might affect the values of several other cells. Empty cells and string cells have no value (they are undefined); if a formula refers to an empty cell or string cell its value is also undefined. Cells can’t refer to their own value, directly or indirectly; for example, if cell A1 was defined to be A2+1 and A2 was defined as A1–1 it would be impossible to work out the values of A1 or A2. If this happens it’s an error; the value of any cell whose definition is circular is undefined.

Looking at this specification, we can start work on it by identifying the classes that will be needed. We will obviously need types for spreadsheets and cells. Should these be tagged or untagged? Making a spreadsheet a tagged type will allow it to be extended in the future, so this seems to be a good idea. The specification above mentions several types of cell, so a cell should definitely be a tagged type and specific types of cell can then be derived from it (string cells and formula cells, for now). The spreadsheet can deal with a grid of class-wide pointers to cells to allow different cell types to be used within a single spreadsheet. As an implementation detail, empty cells can be dealt with by the spreadsheet itself; any cell that isn’t in use is empty. This will avoid having to store empty cells in memory.

What operations are needed on these types? The specification tells us that spreadsheets can be recalculated and displayed, and cell values can be changed. Changing an existing cell might involve changing its type, so this will have to be done by deleting the existing cell (if any) and creating a new one. Since expressions in formula cells can refer to other cells, we need some way of locating a particular cell as well. Recalculation is needed if a cell is changed, so it would probably be a good idea to have a procedure that a cell can call to notify the spreadsheet that a change has taken place. Doing it this way rather than recalculating every time a cell changes allows the spreadsheet to decide when recalculation is necessary (e.g. just before redisplaying the spreadsheet) to minimise the number of times that it gets recalculated. These considerations also imply that a cell must be able to identify the spreadsheet it belongs to.

We’ll also need a procedure to be called to cancel the change notification after the spreadsheet has been recalculated and a function to test if the spreadsheet has changed; these should only be called from within Recalculate, so they can go in the private part of the package. Although the only thing they’ll do will be to access a Boolean variable, it’s still a good idea to provide primitive operations to do this rather than just accessing the variable directly so that any future derived types can override them to provide different behaviour if necessary. This is an application of the sort of foresight I mentioned earlier. Finally, the problem of circular definitions for cells can be reported with an exception; syntax errors in expressions will also be reported by an exception. Here’s a first stab at a spreadsheet type:

    type Spreadsheet_Type is abstract tagged limited private;

    type Cell_Type (Sheet : access Spreadsheet_Type'Class) is
                                    abstract tagged limited private;
    type Cell_Access is access Cell_Type'Class;

    procedure Recalculate (Sheet : in out Spreadsheet_Type);
    procedure Display     (Sheet : in out Spreadsheet_Type) is abstract;
    procedure Change      (Sheet : in out Spreadsheet_Type);
    procedure Updated     (Sheet : in out Spreadsheet_Type);
    function  Changed     (Sheet : Spreadsheet_Type) return Boolean;
    function  Cell        (Sheet : Spreadsheet_Type;
                           Where : String) return Cell_Access;

    procedure Insert      (Sheet : in out Spreadsheet_Type;
                           Where : in String;
                           What  : in Cell_Access);
    procedure Delete      (Sheet : in out Spreadsheet_Type;
                           Where : in String);

    Circularity_Error : exception;

The Display procedure is abstract since this is view dependent; a program using a spreadsheet will need to derive a concrete type which displays the spreadsheet in an appropriate way. The spreadsheet type is limited to prevent assignment of one spreadsheet to another (since the effect would be to copy the pointers to the cells rather than the cells themselves). Each cell needs to know which spreadsheet it belongs to (so it can notify the spreadsheet whenever it changes) so I’ve used an access discriminant as described in chapter 11 to act as a pointer to the spreadsheet it’s part of. As explained in chapter 11, access discriminants are only allowed for limited types, so Cell_Type has to be limited; they can’t be null, so it isn’t possible to create cells without reference to a specific spreadsheet, and the accessibility checks used on named access types don’t apply. Cell_Type will be derived from Limited_Controlled since destroying a cell might well involve some clean-up action; for example, the cell could notify the spreadsheet that it had changed as the result of a cell being destroyed. Again, I’m applying some foresight to the design.

Cells have a number of common properties. They have a value that can be displayed on the screen; the value can also be accessed as an integer if it isn’t undefined. If the value is undefined, we can raise an exception to report it. Cells need to be evaluated as part of the spreadsheet recalculation, so a procedure to re-evaluate a cell will be needed. It might also be useful to be able to inspect the actual cell contents rather than just the evaluated result of the expression. Here’s a list of some plausible primitive operations for Cell_Type:

    procedure Evaluate   (Cell : in out Cell_Type) is abstract;
    function  Text_Value (Cell : Cell_Type) return String  is abstract;
    function  Num_Value  (Cell : Cell_Type) return Integer is abstract;
    function  Contents   (Cell : Cell_Type) return String  is abstract;

    Undefined_Cell_Error : exception;

The operations declared here are all abstract, so derived cell types will need to override them in an appropriate way. Formula cells can be derived from Cell_Type by adding an extra discriminant and overriding the abstract operations:

    type Formula_Cell_Type (Sheet : access Spreadsheet_Type'Class;
                            Size  : Natural) is new Cell_Type with private;

    procedure Evaluate   (Cell : in out Formula_Cell_Type);
    function  Text_Value (Cell : Formula_Cell_Type) return String;
    function  Num_Value  (Cell : Formula_Cell_Type) return Integer;
    function  Contents   (Cell : Formula_Cell_Type) return String;

You can’t inherit from a discriminated type without providing the necessary discriminants, so Formula_Cell_Type has an access discriminant called Sheet which is then used as the constraint for Cell_Type in the type declaration. Formula_Cell_Type also has a Natural as a discriminant which will be used for the length of the expression associated with the cell. String cells will also need a similar set of discriminants:

    type String_Cell_Type (Sheet : access Spreadsheet_Type'Class;
                           Size  : Natural) is new Cell_Type with private;

    procedure Evaluate   (Cell : in out String_Cell_Type);
    function  Text_Value (Cell : String_Cell_Type) return String;
    function  Num_Value  (Cell : String_Cell_Type) return Integer;
    function  Contents   (Cell : String_Cell_Type) return String;

The extra discriminant in this case is the size of the string in the cell. Constructor functions can be declared to construct a cell from a spreadsheet pointer and a string:

    function String_Cell  (Sheet : access Spreadsheet_Type'Class;
                           Value : String) return Cell_Access;
    function Formula_Cell (Sheet : access Spreadsheet_Type'Class;
                           Value : String) return Cell_Access;

Notice that these are not primitive operations of the cell types, so the inheritance problems related to constructors that I described earlier will be avoided.

18.2 Defining the program

Now that we’ve got this much of the design in place, we can start writing a program to use a spreadsheet before coming back to looking at the implementation of the spreadsheet abstraction. As usual I’ll define a view package:

    with JE.Spreadsheets;
    package JE.Views.Spreadsheet is
        type Command_Type is (Modify, Display, Quit);
        function Next_Command return Command_Type;

        type Sheet_Type is limited private;
        procedure Display (Sheet : in out Sheet_Type);
        procedure Modify  (Sheet : in out Sheet_Type);

    private
        type Sheet_Extension is new JE.Spreadsheets.Spreadsheet_Type
                                                       with null record;

        procedure Display (Sheet : in out Sheet_Extension);

        type Sheet_Type is
            limited record
                Innards : aliased Sheet_Extension;
            end record;
    end JE.Views.Spreadsheet;

Sheet_Type is a limited private type; it’s actually a record containing a single Sheet_Extension component, where Sheet_Extension is derived from Spreadsheet_Type. The reason for this is that we need to be able to supply cells with a discriminant value which points to the spreadsheet they’re part of, so the spreadsheet is made into an aliased component of Sheet_Type. Since the component is aliased, the 'Access attribute can be used to get a pointer to it which can then be used when creating new cells.

The commands are Modify, Display and Quit. The intention is that Modify will ask the user to select a cell, display its current contents and then ask for a new value. The main program will use the view package to process commands from the user:

    with JE.Views.Spreadsheet;
    procedure Spreadsheet is
        Sheet : JE.Views.Spreadsheet.Sheet_Type;
    begin
        JE.Views.Spreadsheet.Display (Sheet);
        loop
            case JE.Views.Spreadsheet.Next_Command is
                when JE.Views.Spreadsheet.Modify =>
                    JE.Views.Spreadsheet.Modify (Sheet);
                when JE.Views.Spreadsheet.Display =>
                    JE.Views.Spreadsheet.Display (Sheet);
                when JE.Views.Spreadsheet.Quit =>
                    exit;
            end case;
        end loop;
    end Spreadsheet;

Now we’ll need to implement the package body. This version will just use Ada.Text_IO; it’s a lowest-common-denominator interface that could easily be improved:

    with Ada.Text_IO; use Ada.Text_IO;
    package body JE.Views.Spreadsheet is
        function Next_Command return Command_Type      is ... end Next_Command;
        procedure Display (Sheet : in Sheet_Extension) is ... end Display;
        procedure Display (Sheet : in Sheet_Type)      is ... end Display;
        procedure Modify  (Sheet : in out Sheet_Type)  is ... end Modify;
    end JE.Views.Spreadsheet;

Next_Command just needs to display a short menu and command prompt:

    function Next_Command return Command_Type is
        Command : Character;
    begin
        loop
            New_Line;
            Put ("(M)odify, (D)isplay or (Q)uit: ");
            Get (Command);
            Skip_Line;
            case Command is
                when 'M' | 'm' =>
                    return Modify;
                when 'D' | ‘d' =>
                    return Display;
                when 'Q' | 'q' =>
                    return Quit;
                when others =>
                    Put_Line ("Invalid command -- " &
                     "please enter M, D or Q.");
            end case;
        end loop;
    exception
        when End_Error =>
            return Quit;
    end Next_Command;

The version of Display for Sheet_Type will just call the version of Display for Sheet_Extensions to display its Innards component:

    procedure Display (Sheet : in out Sheet_Type) is
    begin
        Display (Sheet.Innards);
    end Display;

I’ll assume that this program will be run on a standard 80-column text screen with 25 lines of text. I’ll use a couple of constants for this which you can change if you need to:

    Screen_Width  : constant := 80;
    Screen_Length : constant := 25;

I’ll display the cells in columns which are 12 columns wide; this will allow for six columns (A to F) with a seven-character left margin for the row number and a one-character right margin (to prevent the cursor ‘wrapping’ to a new line if text is displayed in the last column of the screen). I’ll leave one row for the column headings and another four for the menu and user responses, which leaves 20 rows. The spreadsheet itself might be bigger than 20 rows of six columns each, but if so the extra cells won’t get displayed. I’ll need some more constants for the rows and columns:

    Right_Margin  : constant := 1;
    Column_Width  : constant := 12;
    Column_Count  : constant := (Screen_Width - Right_Margin) / Column_Width;
    Left_Margin   : constant := Screen_Width - Right_Margin -
                                               (Column_Count * Column_Width);
    Top_Margin    : constant := 1;
    Bottom_Margin : constant := 4;
    Row_Count     : constant := Screen_Length - Top_Margin - Bottom_Margin;

Here’s how the version of Display for a Sheet_Extension can be implemented:

    procedure Display (Sheet : in out Sheet_Extension) is
        Column   : Character;
        Cell_Ptr : Cell_Access;
        Width    : Integer;
    begin
        Recalculate (Sheet);
        New_Line (Screen_Length);        -- clear screen by scrolling up
        Set_Col (Left_Margin);
        Column := 'A';
        for I in 0 .. Column_Count-1 loop
            Set_Col (Positive_Count(Left_Margin + I*Column_Width + 1));
            Put (Column);
            Column := Character'Succ(Column);
        end loop;
        for R in 1 .. Row_Count loop
            Put (R, Width => Left_Margin-2);
            Put (":");
            Column := 'A';
            for C in 0 .. Column_Count-1 loop
                declare
                    Row : String := Integer'Image(R);
                begin
                    Set_Col (Positive_Count(Left_Margin + C*Column_Width + 1));
                    Cell_Ptr := Cell (Sheet, Column & Row(2..Row'Last));
                    if Cell_Ptr /= null then
                        Width := Integer'Min (Column_Width - 1,
                                              Text_Value(Cell_Ptr.all)'Length);
                        Put (Text_Value(Cell_Ptr.all)(1..Width));
                    end if;
                    Column := Character'Succ(Column);
                end;
            end loop;
            New_Line;
        end loop;
    end Display;

The spreadsheet is recalculated before it’s displayed in case anything’s changed recently. Notice that the code above assumes that we’ll never have more than 26 columns (A to Z); multicharacter column names aren’t catered for. It uses a procedure called Set_Col from Ada.Text_IO; Set_Col moves the cursor to the specified column (character position) of the current screen line. The screen layout constants are used to calculate where each item being displayed should go. Note that the innermost loop which displays the actual values uses the character in Col together with the row number to construct a string which is the coordinate of the required cell; the row number is converted to a string using Integer'Image and the second character onwards is used in the name of the cell coordinate (the first character is the sign, either a space or a minus sign). If the text to be displayed is wider than one character less than the column width it will be truncated. This is the purpose of the variable Width; it is set to the minimum of Column_Width–1 and the width of the cell’s value, and the result is used to slice out the appropriate number of characters from the cell’s value.

Modifying the current cell will involve getting the cell coordinates, displaying the current cell contents and inviting the user to type in a new value. I’ll use the convention that string cells are created by typing a value beginning with a quote, empty cells are created by typing in a full stop, and formula cells are created for any other input. Entering a blank line will leave the current value unchanged. Here’s how it’s done:

    procedure Modify (Sheet : in out Sheet_Type) is
        Name      : String(1..10);
        Name_Size : Natural
        Line      : String(1..50);
        Line_Size : Natural;
        Which     : Cell_Access;
    begin
        Put ("Cell coordinate: ");
        Get_Line (Name, Name_Size);
        Which := Cell (Sheet.Innards, Name(1..Name_Size));

        Put ("Current value of " & Name(1..Name_Size) & ": ");
        if Which = null then           -- empty cell
            Put ("<empty>");
        else
            if Which.all in String_Cell_Type'Class then
                Put ('"');             -- string cell
            end if;
            Put (Contents(Which.all));
        end if;
        New_Line;

        Put ("Enter new value: ");
        Get_Line (Line, Line_Size);
        if Line_Size > 0 then       -- new value entered
            case Line(1) is
                when '.' =>         -- empty cell
                    Delete (Sheet.Innards, Name(1..Name_Size));
                when '"' =>         -- string cell
                    Insert (Sheet.Innards, Name(1..Name_Size),
                            String_Cell (Sheet.Innards'Access,
                            Line(2..Line_Size)) );
                when others =>      -- formula cell
                    Insert (Sheet.Innards, Name(1..Name_Size),
                            Formula_Cell (Sheet.Innards'Access,
                            Line(1..Line_Size)) );
            end case;
            Display (Sheet);
        end if;
    end Modify;

18.3 The spreadsheet class

Now that the main program is written (and of course tested using stubs for the missing spreadsheet and cell operations as described in chapter 8) we can move on to the spreadsheet class itself. The first thing to do is to consider what the full declaration of Spreadsheet_Type will look like. Here’s one approach:

    with JE.Lists;
    package Spreadsheets is
        type Spreadsheet_Type is abstract tagged limited private;
        type Cell_Type (Sheet : access Spreadsheet_Type'Class) is
                                            abstract tagged limited private;
        type Cell_Access is access Cell_Type'Class;
        ...        -- etc.

    private
        Cell_Name_Length : constant := 6;
        subtype Cell_Size is Integer range 0..Cell_Name_Length;

        package Cell_Pointers is new JE.Pointers (Cell_Type'Class, Cell_Access);

        type Cell_Record is
            record
                Where : String(1..Cell_Name_Length);
                Size  : Cell_Size;
                Cell  : Cell_Pointers.Pointer_Type;
            end record;
        package Cell_Lists is new JE.Lists (Cell_Record);

        type Spreadsheet_Type is
            abstract tagged limited record
                Cells : Cell_Lists.List_Type;
                Dirty : Boolean := False;
            end record;
    end Spreadsheets;

I’ve used a constant so that cell names are limited to a maximum of six characters. See if you can devise a method which will allow names to be unlimited in length, or a method which will allow the maximum length to be changed without recompiling the package and all its clients.

The spreadsheet consists of a list of (non-empty) cells and a ‘dirty’ flag. Despite the fact that the spreadsheet is theoretically a grid of cells, there’s no reason why a linked list can’t be used to implement it. After all, a spreadsheet is really just a collection of cells which happens to be presented as a rectangular grid; the external representation needn’t have anything to do with the internal representation. Each Cell_Record in the list contains the cell’s coordinate and a ‘smart pointer’ to the cell itself (as described in chapter 16). Using a smart pointer ensures that the cell will be destroyed automatically when it’s removed from the list.

Finding a cell with a given coordinate is just a matter of searching the list for a cell with that coordinate. If you can’t find it, it’s an empty cell. The Dirty flag is set whenever any cell has changed; this indicates when a recalculation is necessary. The primitive procedure Change sets the dirty flag, Updated clears it and the primitive function Changed tests it:

    procedure Change (Sheet : in out Spreadsheet_Type) is
    begin
        Sheet.Dirty := True;
    end Change;

    procedure Updated (Sheet : in out Spreadsheet_Type) is
    begin
        Sheet.Dirty := False;
    end Updated;

    function Changed (Sheet : Spreadsheet_Type) return Boolean is
    begin
        return Sheet.Dirty;
    end Changed;

Cell needs to scan through the linked list looking for a Cell_Record whose coordinate (the Where component) matches the coordinate given as its parameter. As I mentioned earlier, if the cell isn’t found Cell will just return a null pointer to indicate that it’s an empty cell:

    function Cell (Sheet : Spreadsheet_Type;
                   Where : String) return Cell_Access is
        Iter : List_Iterator := First(Sheet.Cells);
        Cell : Cell_Record;
    begin
        while Iter /= Last(Sheet.Cells) loop
            Cell := Value(Iter);
            exit when To_Upper(Cell.Where(1..Cell.Size)) = To_Upper(Where);
            Iter := Succ(Iter);
        end loop;

        if Iter /= Last(Sheet.Cells) then
            return Value(Value(Iter).Cell);
        else
            return null;
        end if;
    end Cell;

This uses the function To_Upper from Ada.Characters.Handling to ignore case differences when comparing cell coordinates so that ‘a1’ will be recognised as referring to the same cell as ‘A1’.

Delete searches for the named cell and deletes it from the list if it’s there, or does nothing if it isn’t :

    procedure Delete (Sheet : in out Spreadsheet_Type;
                      Where : in String) is
        Iter : List_Iterator;
        Cell : Cell_Record;
    begin
        Iter := First (Sheet.Cells);
        while Iter /= Last (Sheet.Cells) loop
            Cell := Value(Iter);
            if To_Upper(Cell.Where(1..Cell.Size)) = To_Upper(Where) then
                Delete (Iter);
                Change (Spreadsheet_Type'Class(Sheet));
                exit;
            end if;
            Iter := Succ(Iter);
        end loop;
    end Delete;

Deleting a cell will cause the smart pointer inside it to finalise itself, so the Cell_Type it points to will be deallocated properly.

Insert deletes the cell with the given coordinates if it already exists and then creates a new Cell_Record using the coordinate and cell pointer supplied as parameters (but if the cell pointer is null there’s nothing to insert, so this needs checking for) and then adds it to the end of the list:

    procedure Insert (Sheet : in out Spreadsheet_Type;
                      Where : in String;
                      What  : in Cell_Access) is
        New_Cell : Cell_Record;
    begin
        Delete (Sheet, Where);
        if What /= null then
            New_Cell.Size := Integer'Min (Cell_Name_Length, Where'Length);
            New_Cell.Where (1..New_Cell.Size) :=
                    Where (Where'First .. Where'First+New_Cell.Size-1);
            New_Cell.Cell := Pointer(What);
            Insert (Last(Sheet.Cells), New_Cell);
        end if;
        Change (Spreadsheet_Type'Class(Sheet));
    end Insert;

Recalculate checks if the spreadsheet has changed by calling Changed, and then goes through the list asking each cell to evaluate itself if it has:

    procedure Recalculate (Sheet : in out Spreadsheet_Type) is
        Iter : List_Iterator;
        Cell : Cell_Pointers.Pointer_Type;
    begin
        if Changed(Spreadsheet_Type'Class(Sheet)) then
            Iter := First(Sheet.Cells);
            while Iter /= Last(Sheet.Cells) loop
                Cell := Value(Iter).Cell;
                Evaluate (Value(Cell).all);
                Iter := Succ(Iter);
            end loop;
            Updated (Spreadsheet_Type'Class(Sheet));
        end if;
    end Recalculate;

This is not going to be terribly efficient; the expression in a formula cell can refer to the names of other formula cells, so evaluating a formula cell will involve evaluating any other formula cells that it refers to. This means that individual cells can end up being evaluated several times. One way to overcome this is to get cells to remember when they were last evaluated. If we keep an ‘evaluation number’ in the spreadsheet which is updated on each call to Recalculate, each cell can copy the evaluation number when it’s evaluated and then just return the current value without re-evaluating it if the spreadsheet’s current evaluation number is the same as the copy in the cell. Only two evaluation numbers are needed to distinguish between two successive calls to Recalculate. Here’s how the declaration of Spreadsheet_Type will need to change:

    type Evaluation_Number is mod 2;

    type Spreadsheet_Type is
        tagged limited record
            Cells : Cell_Lists.List_Type;
            Dirty : Boolean := False;
            Eval  : Evaluation_Number := Evaluation_Number'First;
        end record;

There’ll also need to be another primitive function to enable cells to access the current evaluation number:

    function Evaluation (Sheet : Spreadsheet_Type) return Evaluation_Number;

All this will need to do is to return a copy of the Eval component:

    function Evaluation (Sheet : Spreadsheet_Type) return Evaluation_Number is
    begin
        return Sheet.Eval;
    end Evaluation;

Recalculate will need to increment the evaluation number at the very beginning:

    procedure Recalculate (Sheet : in out Spreadsheet_Type) is
        -- as before
    begin
        if Changed(Spreadsheet_Type'Class(Sheet)) then
            Sheet.Eval := Sheet.Eval + 1;
            -- as before
        end if;
    end Recalculate;

Since the evaluation number is modular, it will go 0, 1, 0, 1 and so on on successive calls to Recalculate.

18.4 Cell implementation

The next piece of the jigsaw is how cells are implemented. Looking through what we’ve already got, we can immediately identify the data components that Cell_Type will need to contain:

A pointer to the enclosing spreadsheet. This is already there in the form of an access discriminant.
The current evaluation number, as explained above.
A flag to record if it is currently being evaluated. If this flag is set and the cell is asked to evaluate itself it indicates that there’s a circular reference, and we can use this to detect circular definitions and raise a Circularity_Error.
A flag to record if an error occurred during evaluation.
A flag to record if the cell’s value is undefined or not.

The last three can be combined into a state variable based on an enumerated type since they’re mutually exclusive; a cell is either being evaluated, or it’s erroneous, or it’s either defined or undefined. I’ll also have another value for when its state is unknown:

    type Cell_State_Type is (Unknown, Evaluating, Defined, Undefined, Error);

The Unknown state can be used as an initial value before the cell has been evaluated for the first time:

    type Cell_Type (Sheet : access Spreadsheet_Type'Class) is
                                    abstract new Limited_Controlled with
        record
            State : Cell_State_Type := Unknown;
            Eval  : Evaluation_Number;
        end record;

During evaluation the cell will be in the Evaluating state; at the end of evaluation it will end up as Defined, Undefined or Error.

Now we need to consider the derived types String_Cell_Type and Formula_Cell_Type. Here’s the full declaration for these types:

    type String_Cell_Type (Sheet : access Spreadsheet_Type'Class;
                           Size  : Natural) is new Cell_Type(Sheet) with
        record
            Text : String(1..Size);
        end record;

    type Formula_Cell_Type (Sheet : access Spreadsheet_Type'Class;
                            Size  : Natural) is new Cell_Type(Sheet) with
        record
            Text  : String(1..Size);
            Value : Integer;
        end record;

Notice how the full declarations have to provide the necessary discriminant (Sheet) for Cell_Type, unlike the declarations in the visible part of the package:

    type String_Cell_Type  (Sheet : access Spreadsheet_Type'Class;
                            Size  : Natural) is new Cell_Type with private;
    type Formula_Cell_Type (Sheet : access Spreadsheet_Type'Class;
                            Size  : Natural) is new Cell_Type with private;

Both types have a string to hold the cell contents, the size of which is given by their Size discriminants; Formula_Cell_Type also has an Integer component to record the result of evaluating the cell. The constructor functions for the two types can be defined like this:

    function String_Cell (Sheet : access Spreadsheet_Type'Class;
                          Value : String) return Cell_Access is
        Cell : Cell_Access := new String_Cell_Type (Sheet, Value'Length);
    begin
        String_Cell_Type(Cell.all).Text := Value;
        return Cell;
    end String_Cell;

    function Formula_Cell (Sheet : access Spreadsheet_Type'Class;
                           Value : String) return Cell_Access is
        Cell : Cell_Access := new Formula_Cell_Type (Sheet, Value'Length);
    begin
        Formula_Cell_Type(Cell.all).Text := Value;
        return Cell;
    end String_Cell;

Now we need to override the abstract operations inherited from Cell_Type. I’ll deal with String_Cell_Type first of all since it’s going to be simpler than Formula_Cell_Type.

The Text_Value and Contents operations just need to return the value of the string contained in the cell:

    function Text_Value (Cell : String_Cell_Type) return String is
    begin
        return Cell.Value;
    end Text_Value;

    function Contents (Cell : String_Cell_Type) return String is
    begin
        return Cell.Value;
    end Contents;

The value of a string cell is always undefined, so Evaluate just needs to set the state to Undefined and Num_Value just needs to raise Undefined_Cell_Error:

    procedure Evaluate (Cell : in out String_Cell_Type) is
    begin
        Cell.State := Undefined;
    end Evaluate;

    function Num_Value (Cell : String_Cell_Type) return Integer is
    begin
        raise Undefined_Cell_Error;
    end Num_Value;

18.5 Formula cells

Formula cells will need to use a derivation of Expression_Type as defined in the previous chapter to evaluate the expressions they contain; I’ll call it Formula_Type but I won’t consider how it’s going to be implemented just yet.

As with String_Cell_Type, Contents just needs to return the string discriminant:

    function Contents (Cell : Formula_Cell_Type) return String is
    begin
        return Cell.Expr;
    end Contents;

Text_Value needs to return the current value as a String if the state of the cell is Defined. If the state of the cell is Error then it should return an error message. Otherwise, the cell value is unknown so it should just return a null string:

    function Text_Value (Cell : Formula_Cell_Type) return String is
    begin
        if Cell.State = Defined then
            return Integer'Image(Cell.Value);
        elsif Cell.State = Error then
            return "*ERROR*";
        else
            return "";
        end if;
    end Text_Value;

Num_Value needs to return the current value as an Integer. If the value isn’t defined it can just raise an Undefined_Cell_Error exception:

    function Num_Value (Cell : Formula_Cell_Type) return Integer is
    begin
        if Cell.State = Defined then
            return Cell.Value;
        else
            raise Undefined_Cell_Error;
        end if;
    end Num_Value;

This only leaves us with Evaluate to be defined. If the cell is already being evaluated, it needs to raise a Circularity_Error. If the state is unknown or the evaluation number is out of date, the cell needs to be evaluated; otherwise, it’s already been evaluated and nothing needs to be done. If it does need evaluating, the evaluation number needs to be updated and the formula needs to be evaluated. The cell state must be set to Evaluating while the formula is being evaluated; afterwards the state can be set to Defined if all is well, or Undefined if a reference to an undefined cell occurs (which will be reported by Value as an Undefined_Cell_Error), or Error if an error occurs. The error can be a Syntax_Error from the expression evaluation, a Constraint_Error because the result is out of range, or a Circularity_Error as described above:

    procedure Evaluate (Cell : in out Formula_Cell_Type) is
        Expr : Formula_Type (Cell.Sheet);
    begin
        if Cell.State = Evaluating then
            raise Circularity_Error;
        elsif Cell.State = Unknown or Cell.Eval /= Evaluation(Cell.Sheet.all) then
            Cell.Eval  := Evaluation(Cell.Sheet.all);
            Cell.State := Evaluating;
            Cell.Value := Evaluate (Expr, Cell.Text);
            Cell.State := Defined;
        end if;

    exception
        when Undefined_Cell_Error =>
            if Cell.State /= Error then        -- don't change state if
                Cell.State := Undefined;       -- there's already been an
            end if;                            -- error reported
        when Syntax_Error | Constraint_Error | Circularity_Error =>
            Cell.State := Error;
    end Evaluate;

The formula will need to be supplied with a pointer to the spreadsheet it’s associated with so that cells referenced in the expression can be looked up. This is done by providing Formula_Type with an access discriminant, which means that Formula_Type will need to be a limited type.

As I’ve mentioned before, handling errors inside a package is generally a bad idea. It leads to a lack of flexibility, and providing flexibility is what object-oriented programming is all about. Another type of spreadsheet derived in the future might want to report errors to the user as they arise and give the option of carrying on or aborting the recalculation, but this isn’t possible with the current design. A better idea would be to provide a primitive operation of Spreadsheet called Handle_Error and call this from the exception handler above, like this:

    when Fault : Syntax_Error | Constraint_Error | Circularity_Error =>
        Cell.State := Error;
        Handle_Error (Cell.Sheet, Fault);

Since Sheet is a class-wide access discriminant, the call to Handle_Error will be a dispatching call. This means that Handle_Error will need to be declared like this:

    procedure Handle_Error (Sheet : access Spreadsheet_Type;
                            Error : in Ada.Exceptions.Exception_Occurrence);

The access parameter Sheet allows any access-to-Spreadsheet value to be used to call Handle_Error; also, as mentioned in chapter 14, an access parameter is treated as a controlling parameter so that Handle_Error will be a primitive operation of Spreadsheet_Type. The Error parameter allows Handle_Error to use the operations in Ada.Exceptions to get more information about the exception.

The default action can just be to do nothing, but making it a primitive operation of Spreadsheet means that derived spreadsheets can override it. Parent types in class hierarchies quite often end up with primitive operations which do nothing to act as ‘hooks’ to allow extra processing to be added in later by derived classes if it’s needed. You should always look carefully at ‘do-nothing’ bits of your code (null clauses in case statements, missing else parts in if statements and so on) and consider whether there will ever be a need to change it to do something. If so, add a primitive operation to do nothing for you. However, it takes some experience to be able to spot these things, because there’s nothing there to make you notice them!

Returning from Handle_Error will effectively mean that the error has been ignored, but an overridden version of Handle_Error could raise another exception (or the same one) in which case it will be raised at the point where Evaluate was called from (remember that if an exception is raised inside an exception handler, you immediately exit from the block containing the handler and you then look for an exception handler in the block you’ve returned to).

18.6 Deriving a new expression type

The final step to complete the program is to derive Formula_Type from the type Expression_Type defined in the previous chapter. This can go in a child package of JE.Expressions:

    with JE.Spreadsheets;
    package JE.Expressions.Spreadsheet is
        type Formula_Type (Sheet : access Spreadsheet_Type'Class) is
                                        new Expression_Type with private;
    private
        type Formula_Type (Sheet : access Spreadsheet_Type'Class) is
                                        new Expression_Type with null record;
        ...        -- other declarations
    end JE.Expressions.Spreadsheet;

The only difference between Formula_Type and Expression_Type is that Formula_Type needs to recognise cell coordinates as a new type of operand. This means we need a new type derived from Operand_Type together with an overriding declaration for the primitive operation Value:

    type Cell_Operand_Type (Cell : Cell_Access) is
                            new Operand_Type with null record;
    function Value (Operand : Cell_Operand_Type) return Integer;

These declarations will need to go in the private part of the package. This type has a Cell_Access value as a discriminant which points to the cell being referenced as an operand. The Value operation will involve evaluating the cell (in case it hasn’t been evaluated yet) and then returning its value:

    function Value (Operand : Cell_Operand_Type) return Integer is
    begin
        if Operand.Cell = null then
            raise Undefined_Cell_Error;
        else
            Evaluate (Operand.Cell.all);
        end if;
        return Value (Operand.Cell.all);
    end Value;

The only other thing that’s necessary is to recognise cell coordinates as a new type of token within the expression. This can be done by overriding the Fetch_Token primitive inherited from Expression_Type. Here’s the declaration for an overridden Fetch_Token procedure for Formula_Type:

    procedure Fetch_Token (Syntax : in Formula_Type;
                           Expr   : in String;
                           From   : in out Positive;
                           Token  : in out Token_Pointer);

Cell coordinates always begin with a letter and consist entirely of letters and digits. Fetch_Token will need to check that the current character is a letter and then search for the end of the token. The characters making up the token can then be used to obtain a pointer to the corresponding cell, and this can be used to create a Cell_Operand_Type to be returned as the value of the function. To make things easier I’ll use some more functions from Ada.Characters.Handling: Is_Letter, which tests if its parameter is a letter, and: Is_Letter Is_Alphanumeric, which tests if its parameter is a letter or a digit:

    procedure Fetch_Token (Syntax : in Formula_Type;
                           Expr   : in String;
                           From   : in out Positive;
                           Token  : in out Token_Pointer) is
    begin
        if Is_Letter(Expr(From)) then
            declare
                First : Integer := From;
            begin
                while From <= Expr'Last and then Is_Alphanumeric(Expr(From)) loop
                    From := From + 1;
                end loop;
                Cell_Ptr := Cell (Syntax.Sheet.all, Expr(First..From-1));
                Token    := Pointer(new Cell_Operand_Type(Cell_Ptr));
            end;
        else
            Fetch_Token (Expression_Type(Syntax), Expr, From, Token);
        end if;
    end Fetch_Token;

If the token doesn’t start with a letter, the Formula_Type parameter is converted to an Expression_Type value. The original Expression_Type version of Fetch_Token will then be used to extract the token, so this will deal with numbers, operators and so on.

Exercises

18.1	Modify the existing spreadsheet design so that the cell coordinates are stored in the cells themselves, rather than in a separate Cell_Record structure.
18.2	At the moment you can’t save a spreadsheet to a file or load a spreadsheet from a file. Provide primitive operations to load and save the spreadsheet. These should each take a String parameter representing the name of the file to load from or save to.
18.3	Modify the spreadsheet to support a new cell type which is like a formula cell except that its value represents a student’s percentage grade (which should be between 0 and 100) and displays it as a letter grade: A for 80 or above, B for 65 to 79, C for 55 to 64, D for 40 to 54, E for 20 to 39 or F for below 20.
18.4	Modify the spreadsheet to support a new cell type which is like a formula cell except that it has a second discriminant called Base, which is a value between 2 and 16 specifying the number base to be used for the text value of the cell; for example, when Base = 2 the cell’s value will be displayed in binary.

Contents

This file is part of Ada 95: The Craft of Object-Oriented Programming by John English.
Copyright © John English 2000. All rights reserved.
Permission is given to redistribute this work for non-profit educational use only, provided that all the constituent files are distributed without change.
$Revision: 1.2 $
$Date: 2001/11/17 12:00:00 $

Chapter 18: Designing a spreadsheet