CS3343/3341
 Analysis of Algorithms 
  Pseudocode Conventions   
Intro. to Algorithms
Cormen, et al.


Overview

Your textbook ( Introduction to Algorithms, 3rd Ed. by Cormen, Leiserson, and Rivest) uses a special pseudocode to describe algorithms. It is moderately successful, since it avoids details of an actual implementation. The resulting descriptions are often short and fairly easy to understand, once you get used to the quirks of the notation. It is not oriented toward any particular programming language, and, in fact seems almost hostile to languages like Java, C, and C++.


Changes from the 2nd to the 3rd Edition

There were five relatively minor changes in this pseudocode between the 2nd and the 3rd editions, but these changes had a big visual impact. The changes were "based on many requests" to make it more familiar to those used to C, C++, and Java (as well as Python!). It's still not remotely like those other languages (except for Python, which it still resembles). I thought the old form was more successful.

Item changed 3rd Edition 2nd Edition
Start of Comment //
Keywords then and do not present present
Assignment Operator =
Equality Operator ===
Attributes or fields A . length length [A]

Their particular version of C's infamous "==" symbol is not simply two equal signs in a row (as it is almost anywhere else), but it's a special symbol similar to their equals character, but with two gaps, one above and one below. Three of the other comparison symbols are still completely different from C/C++/Java: using  ≠,  ⩽,  ⩾  in place of  !=,  <=,  >=. (I'm not suggesting that they should switch to these other operators also, but rather that they shouldn't have caved in on ==.)


Indentation for Block Structure

Most noticeable with this pseudo-code is the use of indentation to denote block structure, a convention followed only by Python: "Python's use of indentation for block delimiters is unique among popular programming languages." [Wikipedia] I think this works pretty well, and the new form (without then and do keywords) is especially easy to read. (The authors note that "in real programming languages" this would make "levels of indentation hard to determine when code is split across pages." With their short descriptions they don't have this problem.)


Array Bounds

Another significant feature is that the lower bound of an array is not usually 0 and must be explicitly given. In fact, the value 1 is the most common, as with:

Description Text's pseudo-code C / Java / C++
Array bounds
declaration
A[1..n]
(bounds explicitly declared)
A[n] (lower bound must be 0;
upper bound is n-1)
for loop for i = 1 to n for (int i = 0; i < n; i++)

So your text often uses 1 to n for array bounds, rather than the 0 to n-1 that you are used to. When you convert an algorithm in the text to C or Java, this difference can be more annoying and confusing than you might expect. (Beware! In particular, the mod n or % n operation works in many ways in conjunction with a range from 0 to n-1 with no comparable simple alternative for a range from 1 to n.)

The C conventions for arrays, introduced over 40 years ago, were just a quirky and confusing feature back then. I find it amazing that all the major programming languages have now gone this way. (Try explaining this to your non-computer-geek friends: "Yeah, see the first item is numbered 0 and the nth item is numbered n-1." However ... this is the way floors of a building are numbered in the UK or in Europe.)


Objects have attributes or fields

The second edition used square brackets for attibutes of an object or for a field of an object. Thus as noted above, the second edition used length[A] for the length of an array A. This is the same as array notation. The authors probably made the rigth move by switching to dot notation for the third edition, so the length of an array A is A.length, the same as with Java.

In a normal language, these attributes can be a built-in field of an object, as with A.length in Java, or a built-in function, as with s.length() in Java, where s is a String. They can also be a user-defined function, or more commonly an extra variable -- separate or a field in a class or struct. Thus if S is a stack, we'll have a field S.top which might be an integer variable stored somewhere that gives the integer index of the top stack element in an array. As part of a push, the text writes S.top = S.top + 1, whereas in the second edition they wrote top[S] = top[S] + 1.


Other Issues

The text says that parameters are passed by value, and that logical operators ("and" and "or") use short-circuit evaluation.


Revision date: 2011-12-09. (Please use ISO 8601, the International Standard Date and Time Notation.)