Friday, September 28, 2007

assert(Useful);

I've gotten into arguments with other developers on more than one occasion about the value of program assertions.

(By the way, I'm talking about the use of "assert" macros/functions/pragmas in the Ada, C++, and Java programming languages. These languages pay my salary, so while I know about Haskell, Erlang, Eiffel, OCaml, Ruby, etc., I have no call to use any of them. So...YMMV on some of the details.)

Let me try to summarize the anti-assertion position of the last heated discussion I had about this, which was with a very talented C++ programmer:
  1. Assertions are almost always compiled out of the released code, so they're USELESS!
  2. If you're asserting something because you think it might break, there should be an explicit check for it in the code, so assertions are USELESS!
Too many programmers can't think beyond the code, so assertions are thought of as just a coding practice of questionable utility. After all, they're normally not going to be in the released version (see 1), and if you do decide to compile them into the release version, there's nothing to be done if they happen to trigger (2), so what's the point?


You need to step back from thinking of program assertions as just code.

The purpose of effective program assertion practice is embedding encoded information about requirements, design, and implementation assumptions into the code, as code.

Assertions are not to be used to error-check code execution; they're meant to be used to help verify that the software is implemented in accordance with its requirements.

What is being asserted in the following code?

assert(hnd != NULL);

That a pointer is not null, right?

Wrong!

I'm asserting a part of the design specification, the part that specifies that my function will only ever be called with a valid handle. I encode it as a null pointer check, but the purpose is not to check the pointer; it's to verify my code (and my caller's code) as it pertains to the design spec.
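Here is a minimal sketch of that idea in C++. The Device type and the device_id function are hypothetical names made up for illustration, and the "callers always pass a valid handle" rule is an assumed design spec, not one taken from a real system.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical handle target, for illustration only.
struct Device {
    int id;
};

// Assumed design spec: this function is only ever called with a valid handle.
// The assertion records that part of the spec in the code. It is not runtime
// error handling, and it disappears when NDEBUG is defined.
int device_id(const Device* hnd) {
    assert(hnd != NULL && "design spec: caller guarantees a valid handle");
    return hnd->id;
}
```

The string literal after && is always true, so it doesn't change the condition; it just makes the failure message say which part of the spec was violated.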

Conscientiously crafting meaningful assertions embeds a portion of my understanding of the functionality of that code...in the code.

The assertions themselves can then be reviewed by inspectors or peer reviewers, or the system engineers or architects or whatever. This gives them a means to verify that I have a correct understanding of what this piece of code is supposed to do. The assertions encode requirements and design aspects in the medium of code, and those notations of my understanding can be checked for accuracy. Do the asserted ranges for guaranteed inputs match the spec? Do my assertions reveal that I'm not allowing for the full range of allowable inputs? Are constraints being asserted that I should in fact be explicitly checking?
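To make those review questions concrete, here's a rough C++ sketch that puts an asserted constraint next to an explicitly checked one. Everything in it is an assumption for illustration: the function names scale_to_byte and parse_percent, and a made-up spec that says internal callers guarantee a percentage in 0..100 while text from outside the program guarantees nothing.

```cpp
#include <cassert>
#include <string>

// Assumed spec: callers guarantee 0 <= percent <= 100.
// A reviewer can compare this asserted range directly against the spec.
int scale_to_byte(int percent) {
    assert(percent >= 0 && percent <= 100);
    return percent * 255 / 100;
}

// Text from outside the program carries no guarantee, so the same range is
// checked explicitly and a failure is reported to the caller.
// Returns true and writes `out` only when `text` is a whole number in 0..100.
bool parse_percent(const std::string& text, int& out) {
    if (text.empty() || text.size() > 3) {
        return false;
    }
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') {
            return false;
        }
        value = value * 10 + (c - '0');
    }
    if (value > 100) {
        return false;
    }
    out = value;
    return true;
}
```

If a reviewer found an assert guarding parse_percent's input, that would be the "should in fact be explicitly checking" case.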

Embedding design-oriented assertions into the code aids the inspectors or peer reviewers in verifying that the implementation matches the design.

Embedding implementation assumptions as assertions helps inspectors verify that the code corresponds to the requirements, design, and the implementation assumptions.

Assertions are intended to capture the developer's understanding of what a given piece of code is supposed to do--they're not a coding thing, they're a correctness thing.

3 comments:

Jon C said...

I'm glad to hear someone else who uses the same techniques I do. This is a great one when you can't work out why your code is behaving in a particular way and, whilst debugging, you spot that an assertion indicates that a function is wrong. Then it's time to start backtracking.

Another nice little technique I've developed is to write all your code as just a shell. So define your architecture, write your classes and then write comments in a Doxygen style. If you can't define what each input and output is, and what the effect of the function is, you've clearly not thought it through enough.

Like your assertion(s), it's another tool that makes you double check your thinking. One more tool for your Terrible Programmer Toolbelt ;)

Lucretia9 said...

What would be the point of assertions in Ada? Ada has strong type checking so these bounds check assertions are done during:

1) Compilation time
2) Run time

So, in Ada, I just can't see the point of them.

Luke.

Marc said...

Luke,

Not every constraint can be implicitly verified in Ada by type and bounds checking.

For example, suppose a data supplier is contractually obligated to supply my application with two floating point numbers that represent altitudes. The provider guarantees that the Maximum_Altitude value will always exceed the Minimum_Altitude value. An Altitude_Type of a suitable range has been declared and it is via variables of that type that the data is passed.

As you can see, there's nothing implicit in Ada's typing model that requires the Maximum exceed the Minimum.

So now I have three choices:

1) Explicitly check that the Maximum exceeds the Minimum. This of course is the safest. However, it's also a waste of CPU because the provider will never supply a pair of values that triggers this error (in fact, on their end, they've used the SPARK Examiner to prove it). And if the error were to occur--which it never will--what's the correct response? What's the correct response to something that will never happen? How much time do you spend on this, and how much code do you write?

2) Trust that the provider will provide the data they've promised and that it meets all constraints. Yep, you can go with this, but it'd be nice to have some kind of non-interfering verification.

3) pragma Assert(Maximum_Altitude > Minimum_Altitude);
While this will not make it into the release version of the code, it should appear in the developmental versions, when modifications are being made to the system, and therefore does provide some degree of verification, and does it where the code is more volatile (in development) and a contract failure is mostly harmless (in the lab).

Another reason to include an Assert in this scenario is that it indicates to the code reviewer your, the developer's, understanding of the constraints on the incoming data. Suppose you misread the spec, and the Maximum must be greater than or equal to the Minimum? There's a chance now that that misunderstanding will be picked up by a reviewer, or someone down the line who's tracking down a bug that's occurring when the Max and Min altitudes are equal.
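For readers who don't use Ada, the same contract could be sketched in C++ roughly like this. The AltitudeBand struct and band_height function are hypothetical stand-ins for the Altitude_Type data described above, and the strict "greater than" is the assumed reading of the spec.

```cpp
#include <cassert>

// Hypothetical stand-in for a pair of Altitude_Type values from the provider.
struct AltitudeBand {
    float minimum_altitude;
    float maximum_altitude;
};

// Assumed contract: the provider guarantees maximum > minimum.
// The assertion states that reading of the spec; a reviewer who knows the
// spec actually says ">=" can catch the misreading right here.
float band_height(const AltitudeBand& band) {
    assert(band.maximum_altitude > band.minimum_altitude);
    return band.maximum_altitude - band.minimum_altitude;
}
```

As with pragma Assert, building with NDEBUG defined removes the check from the release version.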