Wednesday, September 5, 2007

The Fundamental Theory of C

Whenever I describe to someone what I believe to be the Fundamental Theory of C, I first have to add the caveat that I am not denigrating the language.

So: saying that the Fundamental Theory of C is "Portable Assembly Language" is not a knock on C. Now let's move on.

There are some tasks where a very low-level programming language really does make a lot of sense, e.g. operating system kernels, device drivers, and graphics buffer drawing primitives. You're working at a very low level, right on top of the hardware, where performance is critical, and you're trying to layer just a facade of software across the hardware to provide that first level of software glue that higher-level abstractions can then build on. If this bottom layer employs too much abstraction, it becomes difficult to verify correctness, because the boundary between software and hardware functionality begins to blur and it becomes harder to map software functions to hardware operations. (Of course this blurring is a good thing as you move up through the layers of abstraction, out to distributed systems, Web services, and SOA.)

So in a way the ideal choice of language for this kind of low-level programming of hardware is assembly language, which provides complete, detailed, total control of the hardware. The problem with this, of course, is that assembly language is hardware specific and therefore inherently non-portable.

So this is where the C programming language comes in. Written well, C is highly portable as far as the language itself is concerned, yet it gives you full access to the underlying hardware's interfaces. The compiler has relatively little translating to do between the source code and the machine code, giving you straightforward traceability from C to assembly code to hardware. At the same time, you're not constrained (much) by the specifics of the hardware's architecture--you don't have to work with registers and offsets and addressing modes, so in a way you have the ability to define your own architectural approach for interacting with the hardware (the Linux kernel obviously being the biggest and best example of this).
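To make the "access to the hardware's interfaces" point concrete, here is a minimal sketch of how C code typically talks to a memory-mapped device. Everything named here is hypothetical and invented for illustration: the `demo_device_regs` layout, the `DEMO_CTRL_ENABLE` and `DEMO_STATUS_READY` bit positions, and the two helper functions do not describe any real device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block: one control word, one status word.
 * "volatile" tells the compiler every access must really happen,
 * because the hardware may change or observe these words. */
typedef struct {
    volatile uint32_t control;
    volatile uint32_t status;
} demo_device_regs;

/* Hypothetical bit assignments within those registers. */
#define DEMO_CTRL_ENABLE  (UINT32_C(1) << 0)
#define DEMO_STATUS_READY (UINT32_C(1) << 3)

/* Turn the device on by setting the ENABLE bit, leaving other bits alone. */
static void demo_enable(demo_device_regs *dev)
{
    assert(dev != NULL);
    dev->control |= DEMO_CTRL_ENABLE;
}

/* Report whether the READY bit is currently set in the status register. */
static int demo_is_ready(const demo_device_regs *dev)
{
    assert(dev != NULL);
    return (dev->status & DEMO_STATUS_READY) != 0;
}
```

On real hardware the `dev` pointer would usually be a fixed address taken from the device's documentation and cast to `demo_device_regs *`; in a test it can just as well point at an ordinary struct in memory. Each statement corresponds to a load, a mask, and a store, which is the traceability described above.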

C is a computer programming language, meaning that it is optimized for programming computers in terms of their computational components. C has built-in primitives for direct memory access, bit shifting (from which rotation is easily built), increment, decrement, indirection, multiple indirection, unsigned arithmetic, etc. When you program in C you're programming a computer, telling the computer what to do. Again, see Linux.
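A few of these primitives can be sketched in a handful of lines. The function names below are made up for illustration. Rotation is written with two shifts, since C has shift operators but no dedicated rotate operator.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits, built from two shifts.
 * Masking with 31 keeps both shift counts in the range 0..31. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Set a single bit (0..31) in a word using shift and OR. */
static uint32_t set_bit(uint32_t word, unsigned bit)
{
    assert(bit < 32u);
    return word | (UINT32_C(1) << bit);
}

/* Multiple indirection: store a value through a pointer to a pointer. */
static void store_via(uint32_t **pp, uint32_t value)
{
    **pp = value;
}

/* Unsigned arithmetic wraps around: UINT32_MAX + 1 becomes 0. */
static uint32_t wrapping_inc(uint32_t x)
{
    return x + 1u;
}
```

For example, `rotl32(0x80000001u, 1)` moves the top bit around to the bottom and yields `0x00000003`, and `wrapping_inc(UINT32_MAX)` yields `0`.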

Understanding that the Fundamental Theory of C is that it is a portable assembly language, and what that means, directs the developer to the kinds of tasks for which the language is best employed and the developer mindset to have in place when writing the code. C is appropriate for tasks where hardware interaction and low-level "bit-twiddling" are called for, but the very characteristics that make it ideal for those kinds of tasks seriously detract from it when dealing with abstracted entities that have little "computerness" about them.

Higher-level languages involve writing programs that manipulate classes, records, data structures, components, services, and suchlike abstract entities. Such programs run on computers, but in writing them you're not programming the computer itself.

Using C to program a computer, having consciously selected it as a portable assembly language suited to a specific and appropriate set of computer-oriented tasks, maximizes the benefits C provides to the developer. It also improves the correctness and success of the computational foundation upon which higher-level components and systems, written in higher-level languages, can then be built.


Bob Warfield said...

C is a portable assembly language. You couldn't be more right!

But the better portable assembly language is Java.

More in my blog:



jsnx said...

C has another important use case -- manipulating low level system objects, like file descriptors and process tables and so on. No point in upping the level of abstraction when it will involve more overhead than your actual data!

Misfit medico said...

a good online assembly language guide

Anonymous said...

Has anyone heard of/used Terse?