CS 3721 Recitation 7 Answers

CS 3721
Programming Languages
Spring 2014

Recitation 7
Possible Answers in Red

Initial Python Program: Write, debug, and run your program. You must use regular expressions in your program:

Suppose you have input data giving student names and student numbers in the following form:

Input Data to Process
17 Dennis A. Andrade @00002222
18 Isabel Carrera @00555555
19 Carlton W. Creech @00444444
20 Philip R. Hayden @00044444
21 Jason R. Luna @00077777
22 Manuel Neri @00222222

We want to rewrite each record in a different form. Assume that the last name is always followed by one or more blanks and then an '@' character, and that there may or may not be a middle initial. Drop the initial number, write the student number without leading 0's, and write the last name first, followed by a comma, a blank, and the rest of the name, as shown below:

Proper Output Data
2222 Andrade, Dennis A.
555555 Carrera, Isabel
444444 Creech, Carlton W.
44444 Hayden, Philip R.
77777 Luna, Jason R.
222222 Neri, Manuel

Here is one answer. It solves the two problems as follows (there are other ways to solve them):

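Extra Middle Initial and Dot: The RE subpart "([ .a-zA-Z]+)" matches the first name together with the middle initial and its dot, or just the first name if there is no middle initial. The last name is matched separately by the following "([a-zA-Z]+)".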

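Drop extra initial zeros: In the RE subpart "@0+([0-9]+)", the "@" and the leading zeros are matched outside the parentheses, so the captured group holds only the remaining digits. (Note that "0+" requires at least one leading zero, which every record in the sample data has.)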
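To make the role of each parenthesized group concrete, here is a minimal sketch that prints what the groups capture for one record with a middle initial and one without. The helper name capture is hypothetical, and the letter range is written as a-zA-Z.

```python
import re

# Pattern from the answer; group 1 is the leading number, group 2 the first
# name (plus optional initial), group 3 the last name, group 4 the student
# number without its leading zeros.
pattern = re.compile(r"([0-9]{2})\s+([ .a-zA-Z]+)\s+([a-zA-Z]+)\s+@0+([0-9]+)")

def capture(record):
    """Return the tuple of captured groups, or None if nothing matches."""
    m = pattern.search(record)
    return m.groups() if m is not None else None

print(capture("17 Dennis A. Andrade  @00002222"))
print(capture("18 Isabel Carrera     @00555555"))
```

Group 2 is greedy, so it first swallows the whole name and the blanks after it; the engine then backtracks until the last name and the "@" can still be matched, which is what leaves only the first name and initial in group 2.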

Transform Student Records

#!/usr/bin/python
import re
import sys
st = r"([0-9]{2})\s+([ .a-zA-Z]+)\s+([a-zA-Z]+)\s+@0+([0-9]+)"
stest = ["17 Dennis A. Andrade  @00002222",
         "18 Isabel Carrera     @00555555",
         "19 Carlton W. Creech  @00444444",
         "20 Philip R. Hayden   @00044444",
         "21 Jason R. Luna      @00077777",
         "22 Manuel Neri        @00222222"]

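def trans(reg, dat):
    # echo the input record, then write its rewritten form
    sys.stdout.write("Input: \"" + dat + "\" ---> ")
    stc = re.compile(reg)
    sts = stc.search(dat)
    if sts is not None:
        sys.stdout.write(sts.group(4) + " ")   # student number, no leading zeros
        sys.stdout.write(sts.group(3) + ", ")  # last name
        sys.stdout.write(sts.group(2) + "\n")  # first name and optional initial
    else:
        sys.stdout.write("None\n")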

for stud in stest:
    trans(st, stud)
Output:
Input: "17 Dennis A. Andrade  @00002222" ---> 2222 Andrade, Dennis A.
Input: "18 Isabel Carrera     @00555555" ---> 555555 Carrera, Isabel
Input: "19 Carlton W. Creech  @00444444" ---> 444444 Creech, Carlton W.
Input: "20 Philip R. Hayden   @00044444" ---> 44444 Hayden, Philip R.
Input: "21 Jason R. Luna      @00077777" ---> 77777 Luna, Jason R.
Input: "22 Manuel Neri        @00222222" ---> 222222 Neri, Manuel

Second Way: Using a File for Data

#!/usr/bin/python
import re
import sys
st = r"([0-9]{2})\s+([ .a-zA-Z]+)\s+([a-zA-Z]+)\s+@0+([0-9]+)"
stc = re.compile(st)
# read in students' records; the with statement closes the file afterward
with open("students.txt", 'r') as f:
    for line in f:
        sts = stc.search(line)
        if sts is not None:
            sys.stdout.write(sts.group(4) + " ")
            sys.stdout.write(sts.group(3) + ", ")
            sys.stdout.write(sts.group(2) + "\n")
        else:
            sys.stdout.write("None\n")
Input data:
% cat students.txt
17 Dennis A. Andrade  @00002222
18 Isabel Carrera     @00555555
19 Carlton W. Creech  @00444444
20 Philip R. Hayden   @00044444
21 Jason R. Luna      @00077777
22 Manuel Neri        @00222222

% python students.py
2222 Andrade, Dennis A.
555555 Carrera, Isabel
444444 Creech, Carlton W.
44444 Hayden, Philip R.
77777 Luna, Jason R.
222222 Neri, Manuel
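
As one of the other possible ways to solve the problem, the rewrite can be expressed as a single substitution with backreferences instead of separate group() calls. This is only a sketch: the function name rewrite is hypothetical, and the leading number is matched without a capturing group because it is dropped anyway.

```python
import re

# \1 = first name (and optional initial), \2 = last name, \3 = student number
pattern = re.compile(r"[0-9]+\s+([ .a-zA-Z]+)\s+([a-zA-Z]+)\s+@0+([0-9]+)")

def rewrite(record):
    """Return the record in 'number Last, First' form."""
    # sub() leaves the text unchanged when the pattern does not match
    return pattern.sub(r"\3 \2, \1", record.strip())

for record in ["19 Carlton W. Creech  @00444444",
               "22 Manuel Neri        @00222222"]:
    print(rewrite(record))
```

Unlike the programs above, this version does not print "None" for a line that fails to match; it returns the stripped line as is.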

Two More Python Programs:

Consider the following two representations of date and time:

"American" style	International style	Comments
10:03 pm, April 20, 2004	2004-04-20 22:03:00	(random)
8:04 am, January 4, 1998	1998-01-04 08:04:00	(random)
11:59 am, July 4, 2012	2012-07-04 11:59:00	1 min. < next entry
12:00 pm, July 4, 2012	2012-07-04 12:00:00	This is noon
11:59 pm, December 31, 2003	2003-12-31 23:59:00	1 min. < next entry
12:00 am, January 1, 2004	2004-01-01 00:00:00	This is midnight
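
The noon and midnight rows are the cases that are easy to get wrong. As a small sketch (the function names are hypothetical), the hour conversion in each direction reduces to arithmetic modulo 12:

```python
def to_24_hour(hour12, suffix):
    """Convert a 12-hour clock hour (1-12) plus 'am' or 'pm' to 0-23."""
    # 12 % 12 is 0, so 12 am becomes 0 and 12 pm becomes 12
    return hour12 % 12 + (12 if suffix == "pm" else 0)

def to_12_hour(hour24):
    """Convert an hour in 0-23 to a pair (hour in 1-12, 'am' or 'pm')."""
    # 0 % 12 and 12 % 12 are both 0, which "or 12" turns into 12
    return (hour24 % 12 or 12, "am" if hour24 < 12 else "pm")

print(to_24_hour(12, "am"), to_24_hour(12, "pm"))
print(to_12_hour(0), to_12_hour(12))
```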

Here is a single program showing the translation in both directions:

Translate Both Directions

#!/usr/bin/python
import re
import sys
mon = {'01': 'January', '02': 'February', '03': 'March',
       '04': 'April',   '05': 'May',      '06': 'June', 
       '07': 'July',    '08': 'August',   '09': 'September', 
       '10': 'October', '11': 'November', '12': 'December'}
mon_inv = {} # create empty dictionary
for (k, v) in mon.items():
    mon_inv[v] = k # add (v, k) to mon_inv

atest = ["10:03 pm, April 20, 2004",
         " 8:04 am, January 4, 1998",
         "11:59 am, July 4, 2012",
         "12:00 pm, July 4, 2012",
         "11:59 pm, December 31, 2003",
         "12:00 am, January 1, 2004"]
itest = ["2004-04-20 22:03:00",
         "1998-01-04 08:04:00",
         "2012-07-04 11:59:00",
         "2012-07-04 12:00:00",
         "2003-12-31 23:59:00",
         "2004-01-01 00:00:00"]
def amtoin(reg, dat):
    sys.stdout.write("Input American: \"" + dat + "\" ---> ")
    r = re.compile(reg)
    # now try search
    s = r.search( dat )
    if s is not None:
        # output international form
        res = s.group(6) + "-"
        res += mon_inv[s.group(4)] + "-"
        days = s.group(5)
        if len(days) == 1:
            days = "0" + days
        res += days + " "
        if s.group(3) == "am":
            hours = s.group(1)
            if hours == "12":
                hours = "00"
            if len(hours) == 1:
                hours = "0" + hours
            res += hours + ":"
        else: # s.group(3) == "pm"
            ihours = int(s.group(1))
            # sys.stdout.write(str(ihours) + "***")
            if ihours < 12:
                ihours = ihours + 12
            res += str(ihours) + ":"
        res += s.group(2) + ":00"
        res += "\n"
    else:
        res = "None\n"
    return res
    
def intoam(reg, dat):
    sys.stdout.write("Input Internat: \"" + dat + "\" ---> ")
    r = re.compile(reg)
    # now try search
    s = r.search( dat )
    if s is not None:
        # output American form
        hours =  s.group(4)
        ihours = int(hours)
        if ihours < 12:
            am = 1
        else:
            am = 0
        if ihours > 12:
            ihours = ihours - 12
        if ihours == 0:
            ihours = 12
        res = str(ihours)+":"
        #if ihours < 10:
        #    res = "0"+res
        res += s.group(5)+" "
        if am == 1:
            res += "am, "
        else:
            res += "pm, "
        res += mon[s.group(2)] + " "
        days = s.group(3)
        if int(days) < 10:
            days = days[1]
        res += days + ", "
        res += s.group(1)
        res += "\n"
    else:
        res = "None\n"
    return res

# data hardwired in as function calls
regexpa = r'\s*([0-9]{1,2}):([0-9]{2})\s(am|pm),\s([A-Z][a-z]+)\s([0-9]{1,2}),\s([0-9]{4})'
regexpi = r'\s*([0-9]{4})-([0-9]{2})-([0-9]{2}) ([0-9]{2}):([0-9]{2}):([0-9]{2})'
for amer in atest:
    sys.stdout.write(amtoin(regexpa, amer))
for inter in itest:
    sys.stdout.write(intoam(regexpi, inter))
Output:
Input American: "10:03 pm, April 20, 2004" ---> 2004-04-20 22:03:00
Input American: " 8:04 am, January 4, 1998" ---> 1998-01-04 08:04:00
Input American: "11:59 am, July 4, 2012" ---> 2012-07-04 11:59:00
Input American: "12:00 pm, July 4, 2012" ---> 2012-07-04 12:00:00
Input American: "11:59 pm, December 31, 2003" ---> 2003-12-31 23:59:00
Input American: "12:00 am, January 1, 2004" ---> 2004-01-01 00:00:00

Input Internat: "2004-04-20 22:03:00" ---> 10:03 pm, April 20, 2004
Input Internat: "1998-01-04 08:04:00" ---> 8:04 am, January 4, 1998
Input Internat: "2012-07-04 11:59:00" ---> 11:59 am, July 4, 2012
Input Internat: "2012-07-04 12:00:00" ---> 12:00 pm, July 4, 2012
Input Internat: "2003-12-31 23:59:00" ---> 11:59 pm, December 31, 2003
Input Internat: "2004-01-01 00:00:00" ---> 12:00 am, January 1, 2004
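
The recitation asks for regular expressions, so the following is not a substitute answer. It is a sketch of how the results above could be cross-checked with the standard datetime module instead. The function names are hypothetical, and parsing with %p and %B is assumed to run under an English (or C) locale.

```python
from datetime import datetime

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def american_to_iso(text):
    """'8:04 am, January 4, 1998' -> '1998-01-04 08:04:00'"""
    # %I and %d accept one- or two-digit values when parsing
    dt = datetime.strptime(text.strip(), "%I:%M %p, %B %d, %Y")
    return dt.strftime("%Y-%m-%d %H:%M:%S")

def iso_to_american(text):
    """'1998-01-04 08:04:00' -> '8:04 am, January 4, 1998'"""
    dt = datetime.strptime(text.strip(), "%Y-%m-%d %H:%M:%S")
    hour12 = dt.hour % 12 or 12            # 0 and 12 both display as 12
    suffix = "am" if dt.hour < 12 else "pm"
    # built by hand because strftime would zero-pad the hour and the day
    return "%d:%02d %s, %s %d, %d" % (hour12, dt.minute, suffix,
                                      MONTHS[dt.month - 1], dt.day, dt.year)

print(american_to_iso("12:00 am, January 1, 2004"))
print(iso_to_american("2004-01-01 00:00:00"))
```

Note that the American form has no seconds field, so converting an international time with nonzero seconds to American style and back loses the seconds.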

Revision date: 2014-03-25. (Please use ISO 8601, the International Standard Date and Time Notation.)