Programming Projects
 University of Texas at San Antonio 
Neal R. Wagner

For courses, see Courses Taught.

Here are a number of programming projects that I used in various computer science courses, mainly undergraduate. A long time ago I particularly liked to have non-trivial programming projects as part of my CS courses. I wanted them to be maximally interesting but without too much busywork -- sort a sweet spot. Over time I got increasingly tired of the problems caused by a single project worth so much of the grade, so I drifted toward more and easier assignments.

List of Projects
Regular Expression Recognizer
Database Management System
Interpreter for Pure Lisp
Simple compiler
Other compilers
Self-replicating C Programs
MIPS single-cycle simulator
Conventional Cryptosystem


Regular Expression Recognizer.
The program should take an input regular expression and an input string. The program then recognizes the first occurrence of a substring described by the regular expression.

The regular expressions can be arbitrarily complicated, involving the operators concatenation, alternation, and star. The "syntactic sugar" of REs in programming languages, such as Perl, is not included as a requirement. This project is good for an undergraduate algorithms or automata course.

Concepts needed: a parser for REs based on a formal grammar, representing an RE with an NFA with epsilon moves, computer representation and construction of an NFA, simulating the actions of an NFA on input (includes epsilon closure).


Database management system based on the relational algebra.
This project builds a true relational DBMS. With it one could create and use arbitrarily complicated relational databases. Of course the ease of use and extra features are entirely missing.


Interpreter for pure Lisp.
This project asks students to write an interpreter for "pure" Lisp. The resulting interpreter included functions with parameters, so it wasn't a completely toy system. This assignment was conceived by William F. Dowling at Drexel University, although it's a standard idea, and he may have gotten it elsewhere. I used the assignment myself at that time, and we worked on it together, but the details were mostly his. The link above gives a solution (in C), but this solution seems to have some problem with "defun". Since it once worked, there must be some simple problem.


Simple compiler (CS3723 Recitations 567).
I was trying to develop a compiler project that illustrated compiler concepts, but was easy enough for Junior-level CS majors to finish in three weekly recitations. (Well, plus a recitation on formal grammars.) Recitations 56, and 7 at the above link are the result, along with Recitation 4 on formal grammars. I taught scanners for compilers separately and earlier (Recitations 23), and I wanted students to be able to do the compiler part even if they hadn't finished the scanner part, so the language (Tiny®) has the property that all tokens are single characters, making the scanner trivial. While limiting, the write-up includes an example source program that generates and prints successive Fibonacci numbers, followed by their prime factorization. The project uses MIPS assembler language as the target. In spite of the simplicity, students still learn how syntax-directed translation works. See Tiny® extensions for extensions to this project: changing to floating point numbers, or adding functions with parameters.


Other compilers.
I started teaching compiler-type courses around 1977. It was usually to undergraduates, emphasizing the front end rather than the back end (code generation, optimization).

  • Parsers: I used many different kinds of parsers for courses.
    • Simple precedence. A bottom-up, shift-reduce parser, but with significant limitations on allowable grammars. These use a precedence function based on table entries for pairs of adjacent tokens.
    • Weak precedence. Based the book: Data Structures and Algorithms, by Aho, Ullman, Hopcroft, 1983, 400 pages. This text is old and fairly elementary. It includes a parsing method similar to simple precedence that is easy to understand, but again with restricted grammars.
    • Recursive descent. The workhorse for student compilers. A top-down approach (LL(1)). (I never used table-driven LL(1) parsers.) I've used it more recently because of its simplicity, relative power, and ease of implementation "by hand". Students using fancier parsers in courses often are left with tools they haven't mastered or can't transfer to a new environment. The technique scales up to large compilers and lets one "cheat", but also promotes sloppy programming that may make compiler changes harder.
    • YACC. I used YACC at least once, but now one might use Bison, or some other more modern tool. This is a bottom-up LALR(1) parser generator. Oriented toward C, and fairly hard to get used to. Once I used a different LALR parser generator that just provided the tables. Students wrote the parser itself (fairly easy, since producing the tables is the hard part). Pluses with this approach were that students had complete control over the parser. They also had to learn more about LALR in order to write the parser.

  • Source Language: I used several simple source languages. First was a small subset of Pascal. Compiler teachers often used Pascal subsets because access to variables at run-time is so complicated (due to arbitrary nesting of function and procedure definitions) and instructive. Much later I switched to the Tiny® language mentioned above.

  • Symbol Table: At first I used either a hash-based or binary-tree based table. Much later I felt that the symbol table was messing up weaker students, and making it harder for students to understand the most important principles, so I switched to a simple linear loopup, which of course works fine for a student compiler. (For nested declarations and such, you still have to add complexity.)

  • Target Language: At first I used some kind of quadruples. Often I had them write a quadruple interpreter. Other times I used at least two types of abstract stack machines, sometimes having them write an interpreter. Only once had I taught assembly and computer organization, using the LSI-11, so I wasn't much of a hardware person. Later I taught MIPS assembly, and sometimes used it as a target. (I never used MIPS machine language.)

  • The Back End: As I mentioned before, I never did much with this -- didn't know much about it. One of my friends went to work for the AT&T compiler group (back when it was the compiler group). He came to regard all but the back end as pretty easy stuff, and he said that he would emphasize these topics (code generation and optimization) if he taught compilers again.


Self-replicating C Programs.
This starts half-way down in the recitation (CS2213 Recitation 7). I was fond of this as a way to teach subtleties of C and especially to teach arrays of strings in C.


MIPS single-cycle simulator.
CS 2733: Fall 1998 (labs 8-12), Spring 1999 (labs 8-11), Fall 1999 (labs 9-11).
I used this as a programming project because I was initially basing my course on the course taught the previous semester by Kay Robbins, and she had used such a project. Her project was also in C, and more "object-oriented" than the version I was forcing my students to do. Eventually I got tired of all the copying that was going on with this project, so I switched to easier recitations.


Designing a conventional cryptosystem.
Laws of Cryptography (pages 313-320). Projects to design a conventional stream or block cipher are presented in Appendix C of this (incomplete) online book.