Book notes: The elements of FORTRAN style

After reading The Psychology of Computer Programming some time ago, I went looking for books from the same period that also cover the psychological aspects of programming. The most relevant one I found was The Elements of FORTRAN style: Techniques for Effective Programming, published in 1972 and co-authored by Charles B. Kreitzberg and Ben Shneiderman.

Book cover of The Elements of FORTRAN style

I discovered that this was the first book that Ben Shneiderman wrote, although he is best known for Designing the User Interface, an essential book in the field of human-computer interaction. I became familiar with his work when I took HCI courses at university.

The title of the book is a direct reference to The Elements of Style by William Strunk Jr., a classic and must-read for anyone wanting to become more effective at writing English. Similarly, The Elements of FORTRAN style teaches the art of writing a good program.

While I have never written any FORTRAN code, nor do I intend to, as a Python programmer I still found this book an interesting read, especially considering when it was written and that FORTRAN programs ran on computers like the IBM System/360.

The book’s well-explained principles of good programming are valuable to any programmer, regardless of whether you use FORTRAN or another language.

Notes

Chapter 1. Introduction

Programmers rarely share a background due to the variation in operating systems, hardware, programming environments etc.
This makes it impossible to have a complete and universal set of programming rules.
It means that every programmer must decide which rules are appropriate to their situation.
Sometimes the best effect is achieved by deliberate transgression of a rule.
A program must produce the desired results to be correct. There can be many correct programs that solve a certain problem.
Programs can vary in speed, accuracy, storage requirements, ease of modification. The ideal program would have all these qualities, but in the real world such programs are rare.
Improving one aspect of a program (e.g. speed) can negatively impact another (e.g. ease of modification).
The goal should be to write the best possible program in the available time using the available resources.
A program should be modular and its organization logical and meaningful so that it’s easy to modify and debug.
Kludge is used to describe a program that is not well-structured.
Choose meaningful names for variables and functions. You are not the only one who will read your code, and after six months you will have forgotten what you did anyway.
A program is not complete without good documentation.
After writing a program, go through it. Does it run correctly? Is it understandable? Can it be more efficient? Is it modular? If there is room for improvement, revise the code.
Shortcuts often lead to problems. If you absolutely must take them, carefully document your intention.
Avoid writing clever code; the computer can’t appreciate it anyway.
As a drawing should have no unnecessary lines, a program should have no unnecessary statements. Make every statement count.
Use standardized programming language implementations to ensure portability on other systems.
There is a limit to how much effort should be spent on a program. There is a point of diminishing returns after which additional effort is wasted.
Only by understanding the hardware, software, and job requirements is it possible to evaluate the quality of a program.
There is rarely an absolute standard of “goodness” which can be applied to programs.
The best programmers are those who understand both the problem that needs to be solved and the intricacies of using the computer.
Generally there is more than one way to solve a problem and it may be difficult to identify the best way.
The features of a programming language, e.g. availability of certain data structures or functions, shape how you implement an algorithm.
Novice programmers generally implement the first technique that comes to mind.
Professional programmers resist the urge to start coding immediately. Instead, they carefully consider multiple approaches and design the entire program before writing any code.
A few minutes of extra thought at an early stage can save significant effort later.
A good programmer not only considers the technical aspect of a program, but also the effects when it’s used.

Chapter 2. Reducing Program Compute Time

The run time of a computer program is the sum of: (1) compute time, the time needed to execute the instructions; (2) voluntary wait time, the time that a program waits for the completion of an event (e.g. I/O); and (3) involuntary wait time, the time that a program must wait for other programs running on the computer.
A programmer can reduce the compute time and voluntary wait time, but the involuntary wait time is a function of the operating environment.
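As a rough illustration of this breakdown (not something taken from the book), Python's standard `time` module can show the difference between wall-clock time and compute time. The helper names `measure`, `wait_bound` and `compute_bound` are hypothetical, and the sleep is only a stand-in for I/O. Note that the difference between the two timers lumps voluntary and involuntary wait time together; this sketch cannot tell them apart.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) for one call of func."""
    wall_start = time.perf_counter()   # wall-clock timer, includes waiting
    cpu_start = time.process_time()    # CPU time used by this process only
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def wait_bound():
    # Stand-in for I/O: sleeping uses almost no CPU time.
    time.sleep(0.05)


def compute_bound():
    # Pure computation: wall time and CPU time are close on an idle machine.
    return sum(i * i for i in range(200_000))


for func in (wait_bound, compute_bound):
    wall, cpu = measure(func)
    wait = max(wall - cpu, 0.0)  # voluntary + involuntary wait, combined
    print(f"{func.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s wait~{wait:.3f}s")
```

The exact numbers depend on the machine and on whatever else is running, which is the involuntary part the programmer cannot control.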
Before optimizing performance, consider these factors:
- The number of times that the program will be run
- The savings in compute time which can be realized
- The development time available
- The complexity of the program
- The cost of computer time
- The cost of programmer time
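Some of these factors can be weighed against each other with simple arithmetic. The sketch below is my own back-of-the-envelope calculation, not a formula from the book: the function name `optimization_pays_off`, its parameters and all the numbers are hypothetical. It also ignores the available development time and the added complexity, which still have to be judged separately.

```python
def optimization_pays_off(
    runs,               # expected number of future runs
    seconds_saved,      # compute time saved per run, in seconds
    cpu_cost_per_hour,  # cost of computer time
    dev_hours,          # programmer time needed for the optimization
    dev_cost_per_hour,  # cost of programmer time
):
    """Return (savings, cost, worthwhile) for a proposed optimization."""
    savings = runs * seconds_saved / 3600 * cpu_cost_per_hour
    cost = dev_hours * dev_cost_per_hour
    return savings, cost, savings > cost


# Hypothetical numbers, for illustration only.
savings, cost, worthwhile = optimization_pays_off(
    runs=10_000,
    seconds_saved=30,
    cpu_cost_per_hour=2.0,
    dev_hours=8,
    dev_cost_per_hour=75.0,
)
print(f"savings={savings:.2f} cost={cost:.2f} worthwhile={worthwhile}")
```

With cheap computer time and expensive programmer time, the balance often tips against optimizing, which is the opposite of the situation in 1972.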

(The rest of this chapter contains many techniques that are specific to FORTRAN and mattered mainly because even the fastest computers of that time had only a tiny fraction of the compute power of computers today. However, some of them are still relevant as good coding practices, not necessarily for their performance benefits.)

Expressions that are redundant should be assigned to a variable.

# Before
area = width * height
volume = width * height * length  # Redundant width * height calculation

# After
area = width * height
volume = area * length

Cache expensive function results in a variable if they are used multiple times.

# Before
num_orders = len(get_orders())
total = sum([
    order.total for order in get_orders()  # Redundant get_orders() call
])

# After
orders = get_orders()
num_orders = len(orders)
total = sum([
    order.total for order in orders
])

Check especially for redundant operations in loops, as a large proportion of execution time is spent there, and evaluate them outside the loop.

# Before
for student in students:
    avg_grade = sum(
        student.grade for student in students
    ) / len(students)  # Redundant expression independent of loop
    if student.grade < avg_grade:
        print(student)

# After
avg_grade = sum(student.grade for student in students) / len(students)
for student in students:
    if student.grade < avg_grade:
        print(student)

Evaluate conditional checks that are independent of the loop outside the loop.

# Before
for video in videos:
    if has_video_codec():  # Conditional check independent of loop
        export_video(video)

# After
if has_video_codec():
    for video in videos:
        export_video(video)

Combine two adjacent loops when iterating over the same sequence.

# Before
emails = []
birth_dates = []

for user in users:
    emails.append(user.email)

for user in users:  # Redundant loop over same sequence
    birth_dates.append(user.birth_date)

# After
emails = []
birth_dates = []

for user in users:
    emails.append(user.email)
    birth_dates.append(user.birth_date)

Chapter 3. Reducing Program Storage Requirements

(This chapter contains only techniques that are specific to FORTRAN and relevant to computer systems of that time when storage was very expensive.)

Chapter 4. Increasing Computational Accuracy

Computers make mistakes because of how numbers are stored and manipulated. For example, `1/3` is the infinitely repeating decimal `.333333333…`, which cannot be represented exactly in the hardware of a finite computer.
Compounding these minor faults by millions of operations can produce major errors.
When dividing integers, for example `5 ÷ 2`, carefully consider rounding behavior (floor, ceil or nearest) as it can significantly impact calculations in your application.
If exact decimal representation is crucial to your application, use data types or libraries that can properly perform decimal calculations.
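To make the points above concrete in Python (my own illustration, not an example from the book), the following sketch uses only the standard library: `math.fsum` for more accurate summation, and `decimal.Decimal` and `fractions.Fraction` for exact decimal and rational arithmetic.

```python
import math
from decimal import Decimal
from fractions import Fraction

# Binary floating point cannot store 0.1 exactly, so small errors appear.
print(0.1 + 0.2 == 0.3)              # False
print(sum([0.1] * 10) == 1.0)        # False: the errors accumulate

# math.fsum keeps track of the low-order bits that plain sum() loses.
print(math.fsum([0.1] * 10) == 1.0)  # True

# Decimal and Fraction are exact for decimal and rational values.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
print(Fraction(1, 3) * 3 == 1)                            # True

# Integer division: choose the rounding behavior deliberately.
print(5 // 2, -5 // 2)   # 2 -3  (floor)
print(math.ceil(5 / 2))  # 3     (ceiling)
print(round(5 / 2))      # 2     (ties go to the nearest even integer)
print(int(-5 / 2))       # -2    (truncation toward zero)
```

Note that `Decimal` is constructed from strings here; `Decimal(0.1)` would inherit the floating-point error of the `0.1` literal.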

Chapter 5. Documentation

Always provide more documentation than you think you need. Chances are that your code will live longer or be more widely used than you think.
Document as much of the program within the code as possible, where it will do the most good. It also allows code and comments to be modified at the same time.
Document unto others as you would be documented to.
A neatly written program is easier to read and understand than a sloppy one.
Meaningful comments can remove all doubt about the programmer’s intentions with a particular statement.
Carefully written comments enhance a program’s usefulness and its ability to be modified.
Variable names should be chosen with care: a poor choice of variable name will significantly decrease the program’s intelligibility.
Arithmetic and logical operators should be surrounded by blanks to make statements easier to read.
The output of a program should be adequately documented so it can be correctly interpreted.
When a program encounters an error, the error message should provide the reader with as much information as possible.
Large, complex programs should be accompanied by external documentation describing the formats of the inputs and outputs.
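The advice on informative error messages could look like this in Python. This is a minimal sketch of my own: the function `parse_temperature`, the `station,celsius` record format and the file name `readings.csv` are all hypothetical. The idea is that each message says where the problem occurred, what was expected and what was actually found.

```python
def parse_temperature(line, line_number, source="<input>"):
    """Parse one 'station,celsius' record (a hypothetical format)."""
    parts = line.strip().split(",")
    if len(parts) != 2:
        raise ValueError(
            f"{source}, line {line_number}: expected 2 comma-separated "
            f"fields 'station,celsius' but found {len(parts)} in {line!r}"
        )
    station, raw_value = parts
    try:
        celsius = float(raw_value)
    except ValueError:
        # Replace the generic float() error with one that names the
        # location, the offending value and the station it belongs to.
        raise ValueError(
            f"{source}, line {line_number}: temperature {raw_value!r} for "
            f"station {station!r} is not a number"
        ) from None
    return station, celsius


try:
    parse_temperature("AMS,warm", line_number=7, source="readings.csv")
except ValueError as error:
    print(error)
```

Compare this with the bare `could not convert string to float` message that `float()` raises on its own, which does not say which record or file caused the problem.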

Chapter 6. Program Design

Designing a program is a very personal experience; each programmer has their own way of going about the development process.
Programming is an art, but even artists have rules of composition and style.
Design goals which programmers should consider:
1. Modularity: dividing a program into logical functions makes it easier to debug and modify.
2. Independence: a program that avoids platform-specific features is easier to deploy across different operating systems and architectures.
3. Generality: a program is more useful if it can handle variations in input.
4. Integrity: a program is more resilient if it anticipates unexpected input and handles edge cases.
These four design goals are not independent. For instance, by making a program more portable you may reduce its usefulness on certain platforms. The programmer must decide which goals are important.
There are two reasons to write a code segment as a function:
1. When a code block is repeated several times in the program.
2. When a code block needs to be isolated from other parts of the program.
Before you write a line of code, consider the sequence of operations by creating a flowchart (in your mind).
Large sophisticated programs are not written in a day - they evolve through many iterations. A well-designed program can easily grow and be improved.
The general considerations on whether a program should have a high degree of independence:
1. Is it expected to be used in other environments? If so, how different are these environments?
2. Is the additional coding and debugging effort worth the independence gained?
3. How much additional run time and storage will be used for this?
4. How will the clarity of the program be impaired?
A program with a high degree of integrity should check input values to determine if they are within realistic bounds.
Ensure no variable is referenced before it is set.
In debugging a program you should try to create combinations of input so that every branch of the program is tested.
A large, complex program can never be fully tested, and it is common to find bugs years after it has been released.
The rules presented are not firm rules which must never be violated: they are guidelines. The programmer must weigh the tradeoffs in each case.
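Two of the points in this chapter, checking that input values are within realistic bounds and testing every branch, can be sketched together in Python. This is my own example rather than the book's: `validate_age`, its bounds of 0 and 130, and `check_every_branch` are hypothetical names and values chosen for illustration.

```python
def validate_age(value, low=0, high=130):
    """Return value if it is an integer within [low, high]."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"age must be an integer, got {type(value).__name__}")
    if value < low:
        raise ValueError(f"age {value} is below the minimum of {low}")
    if value > high:
        raise ValueError(f"age {value} is above the maximum of {high}")
    return value


def check_every_branch():
    """Exercise each branch of validate_age at least once."""
    assert validate_age(0) == 0      # lower boundary is accepted
    assert validate_age(130) == 130  # upper boundary is accepted
    cases = [(-1, ValueError), (131, ValueError), ("42", TypeError)]
    for bad_value, expected in cases:
        try:
            validate_age(bad_value)
        except expected:
            continue  # the expected branch was taken
        raise AssertionError(f"{bad_value!r} should raise {expected.__name__}")
    return True


print(check_every_branch())
```

Covering every branch of one small function is feasible; as noted above, the same cannot be said of every combination of branches in a large program.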