|
Programming Projects |
University of Texas at San Antonio |
Neal R. Wagner |
For courses, see
Courses Taught.
Here are a number of programming projects that I used in various
computer science courses, mainly undergraduate.
A long time ago I particularly liked to have non-trivial
programming projects as part of my CS courses. I wanted them
to be maximally interesting but without too much busywork -- sort a sweet spot.
Over time I got increasingly tired of the problems caused by a single
project worth so much of the grade, so I drifted toward more
and easier assignments.
Regular Expression Recognizer.
The program should take an input regular expression and an input
string. The program then recognizes the first occurrence
of a substring described by the regular expression.
The regular expressions can be arbitrarily complicated,
involving the operators concatenation, alternation, and star.
The "syntactic sugar" of REs in programming languages, such
as Perl, is not included as a requirement. This project
is good for an undergraduate algorithms or automata course.
Concepts needed: a parser for REs based on a formal grammar,
representing an RE with an NFA with epsilon moves,
computer representation and construction of an NFA,
simulating the actions of an NFA on input (includes epsilon closure).
Database management system based on the relational algebra.
This project builds a true relational DBMS. With it one could
create and use arbitrarily complicated relational databases.
Of course the ease of use and extra features are entirely missing.
Interpreter for pure Lisp.
This project asks students to write an interpreter for "pure" Lisp.
The resulting interpreter included functions with parameters, so it
wasn't a completely toy system. This assignment was conceived by
William F. Dowling at Drexel University, although it's a standard
idea, and he may have gotten it elsewhere. I used the assignment myself at
that time, and we worked on it together, but the details were mostly his.
The link above gives a solution (in C), but this solution seems to have some
problem with "defun". Since it once worked, there must be some simple problem.
Simple compiler (CS3723 Recitations
5,
6,
7).
I was trying to develop a compiler project that illustrated compiler
concepts, but was easy enough for Junior-level CS majors to finish in
three weekly recitations. (Well, plus a recitation
on formal grammars.) Recitations
5,
6, and
7 at the above
link are the result, along with
Recitation
4
on formal grammars.
I taught scanners for compilers separately and earlier
(Recitations
2,
3),
and I wanted students to be able to do the compiler part even
if they hadn't finished the scanner part, so the
language (Tiny®) has the property that all tokens are single
characters, making the scanner trivial.
While limiting, the write-up includes an example
source program that generates and prints successive Fibonacci
numbers, followed by their prime factorization. The project
uses MIPS assembler language as the target. In spite of the
simplicity, students still learn how syntax-directed translation works.
See
Tiny® extensions for extensions to this project:
changing to floating point numbers, or adding functions with parameters.
Other compilers.
I started teaching compiler-type
courses around 1977. It was usually to undergraduates,
emphasizing the front end rather than the back end (code generation,
optimization).
- Parsers: I used many different kinds of parsers for courses.
- Simple precedence. A bottom-up, shift-reduce parser,
but with significant limitations on allowable grammars.
These use a precedence function
based on table entries for pairs of adjacent tokens.
- Weak precedence. Based the book:
Data Structures and Algorithms, by Aho, Ullman, Hopcroft,
1983, 400 pages. This text is old and fairly elementary.
It includes a parsing method similar to simple precedence that
is easy to understand, but again with restricted grammars.
- Recursive descent. The workhorse for student compilers.
A top-down approach (LL(1)). (I never used table-driven LL(1) parsers.)
I've used it more recently because of its simplicity, relative power,
and ease of implementation "by hand". Students using fancier parsers
in courses often are left with tools they haven't mastered or can't
transfer to a new environment. The technique scales up to large
compilers and lets one "cheat", but also promotes sloppy programming
that may make compiler changes harder.
- YACC. I used YACC at least once, but
now one might use Bison, or some other more modern
tool. This is a bottom-up LALR(1) parser generator. Oriented toward C,
and fairly hard to get used to. Once I used a different LALR parser
generator that just provided the tables. Students wrote the parser itself
(fairly easy, since producing the tables is the hard part).
Pluses with this approach were that students had complete control over
the parser. They also had to learn more about LALR in order to write
the parser.
- Source Language: I used several simple source languages.
First was a small subset of Pascal. Compiler teachers often used
Pascal subsets because access
to variables at run-time is so complicated (due to arbitrary nesting
of function and procedure definitions) and instructive. Much later I switched to
the Tiny® language mentioned above.
- Symbol Table: At first I used either a hash-based or
binary-tree based table. Much later I felt that the symbol table was
messing up weaker students, and making it harder for students
to understand the most important principles, so I switched to
a simple linear loopup, which of course works fine for a
student compiler. (For nested declarations and such, you still
have to add complexity.)
- Target Language: At first I used some kind of
quadruples. Often I had them write a quadruple interpreter.
Other times I used at least two types of abstract stack machines,
sometimes having them write an interpreter.
Only once had I taught assembly and computer organization,
using the LSI-11, so I wasn't much of a hardware person.
Later I taught MIPS assembly, and sometimes used it as a target.
(I never used MIPS machine language.)
- The Back End: As I mentioned before, I never did much
with this -- didn't know much about it. One of my friends went to
work for the AT&T compiler group (back when it was the
compiler group). He came to regard all but the back end as pretty
easy stuff, and he said that he would emphasize these topics
(code generation and optimization) if
he taught compilers again.
Self-replicating C Programs.
This starts half-way
down in the recitation (CS2213 Recitation 7). I was fond of this as a way to teach
subtleties of C and especially to teach arrays of strings in C.
MIPS single-cycle simulator.
CS 2733:
Fall 1998 (labs 8-12),
Spring 1999 (labs 8-11),
Fall 1999 (labs 9-11).
I used this as a programming project because I was initially basing
my course on the course taught the previous semester by Kay Robbins,
and she had used such a project. Her project was also in C, and
more "object-oriented" than the version I was forcing my students to do.
Eventually I got tired of all the copying that was going on with
this project, so I switched to easier recitations.
Designing a conventional cryptosystem.
Laws of Cryptography (pages 313-320).
Projects to design a conventional stream or block cipher are
presented in Appendix C of this (incomplete) online book.
|