CS 3721
Programming Languages  
Spring 2014
 Recitation 7
Possible Answers in Red 


Initial Python Program: Write, debug, and run your program. You must use regular expressions in your program:

  1. Suppose you have input data giving student names and student numbers in the following form:

    Input Data to Process
    17 Dennis A. Andrade  @00002222
    18 Isabel Carrera     @00555555
    19 Carlton W. Creech  @00444444
    20 Philip R. Hayden   @00044444
    21 Jason R. Luna      @00077777
    22 Manuel Neri        @00222222
    

    We want to rewrite these records in a different form. Assume that the last name is always followed by one or more blanks and then an '@' character, and that there may or may not be a middle initial. Drop the initial number, write the student number without leading 0's, and write the last name first, followed by a comma, a blank, and the rest of the name, as shown below:

    Proper Output Data
    2222 Andrade, Dennis A.
    555555 Carrera, Isabel
    444444 Creech, Carlton W.
    44444 Hayden, Philip R.
    77777 Luna, Jason R.
    222222 Neri, Manuel
    

    Here is one answer. It solves the two problems as follows (there are other ways to solve them):

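    • Extra middle initial and dot: The RE subpart "([ .a-zA-Z]+)" matches the first name together with the middle initial and its dot, or just the first name if there is no middle initial.
    • Drop extra initial zeros: In the RE subpart "@0+([0-9]+)", the "@" and the leading zeros lie outside the parentheses, so the group captures only the rest of the student number. (The "0+" assumes at least one leading zero, as in all of the sample data.)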
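
As a quick check of the two points above, the following sketch prints what each group captures for one record with a middle initial and one without. The pattern follows the one used in this answer (with the letter range written as A-Z); the variable names are illustrative only.

```python
import re

# Same shape as the answer's pattern: two-digit number, first name
# (plus optional initial), last name, then '@', leading zeros, remaining digits.
pattern = re.compile(r"([0-9]{2})\s+([ .a-zA-Z]+)\s+([a-zA-Z]+)\s+@0+([0-9]+)")

with_initial = pattern.search("17 Dennis A. Andrade  @00002222")
without_initial = pattern.search("18 Isabel Carrera     @00555555")

print(with_initial.groups())     # ('17', 'Dennis A.', 'Andrade', '2222')
print(without_initial.groups())  # ('18', 'Isabel', 'Carrera', '555555')
```

Group 2 is greedy, so it first swallows the whole name; backtracking then gives back just enough for group 3 to match the last name.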

    Transform Student Records
    #!/usr/bin/python
    import re
    import sys
    st = r"([0-9]{2})\s+([ .a-zA-Z]+)\s+([a-zA-Z]+)\s+@0+([0-9]+)"
    stest = ["17 Dennis A. Andrade  @00002222",
             "18 Isabel Carrera     @00555555",
             "19 Carlton W. Creech  @00444444",
             "20 Philip R. Hayden   @00044444",
             "21 Jason R. Luna      @00077777",
             "22 Manuel Neri        @00222222"]
    
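    def trans(reg, dat):
        # Echo the input line, then write the reordered record (or None if no match)
        sys.stdout.write("Input: \"" + dat + "\" ---> ")
        stc = re.compile(reg)
        sts = stc.search(dat)
        if sts is not None:
            sys.stdout.write(sts.group(4) + " ")   # student number, leading zeros dropped
            sys.stdout.write(sts.group(3) + ", ")  # last name
            sys.stdout.write(sts.group(2) + "\n")  # first name and optional middle initial
        else:
            sys.stdout.write("None\n")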
    
    for stud in stest:
        trans(st, stud)
    
    Output:
    Input: "17 Dennis A. Andrade  @00002222" ---> 2222 Andrade, Dennis A.
    Input: "18 Isabel Carrera     @00555555" ---> 555555 Carrera, Isabel
    Input: "19 Carlton W. Creech  @00444444" ---> 444444 Creech, Carlton W.
    Input: "20 Philip R. Hayden   @00044444" ---> 44444 Hayden, Philip R.
    Input: "21 Jason R. Luna      @00077777" ---> 77777 Luna, Jason R.
    Input: "22 Manuel Neri        @00222222" ---> 222222 Neri, Manuel
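
Since there are other ways to solve this, here is one alternative sketch that does the whole rewrite with a single substitution and named groups instead of writing the groups out one by one. The function name `reformat` and the group names are made up for this example, and the pattern uses "0*" so that a number without leading zeros would also be accepted; this is not the handout's answer, only a variation on it.

```python
import re

# Lazy first-name group, then last name, then the student number without
# its leading zeros. Group names are illustrative, not from the handout.
record = re.compile(
    r"^\d+\s+(?P<first>.+?)\s+(?P<last>[A-Za-z]+)\s+@0*(?P<num>\d+)\s*$"
)

def reformat(line):
    # sub() returns the line unchanged if the pattern does not match
    return record.sub(r"\g<num> \g<last>, \g<first>", line)

print(reformat("17 Dennis A. Andrade  @00002222"))  # 2222 Andrade, Dennis A.
print(reformat("18 Isabel Carrera     @00555555"))  # 555555 Carrera, Isabel
```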
    

    Second Way: Using a File for Data
    #!/usr/bin/python
    import re
    import sys
    st = r"([0-9]{2})\s+([ .a-zA-Z]+)\s+([a-zA-Z]+)\s+@0+([0-9]+)"
    stc = re.compile(st)
    # read in students' records, one per line; the file is closed automatically
    with open("students.txt", 'r') as f:
        for line in f:
            sts = stc.search(line)
            if sts is not None:
                sys.stdout.write(sts.group(4) + " ")
                sys.stdout.write(sts.group(3) + ", ")
                sys.stdout.write(sts.group(2) + "\n")
            else:
                sys.stdout.write("None\n")
    
    Input data:
    % cat students.txt
    17 Dennis A. Andrade  @00002222
    18 Isabel Carrera     @00555555
    19 Carlton W. Creech  @00444444
    20 Philip R. Hayden   @00044444
    21 Jason R. Luna      @00077777
    22 Manuel Neri        @00222222
    % python students.py
    2222 Andrade, Dennis A.
    555555 Carrera, Isabel
    444444 Creech, Carlton W.
    44444 Hayden, Philip R.
    77777 Luna, Jason R.
    222222 Neri, Manuel


Two More Python Programs:

Consider the following two representations of date and time:

"American" style              International style    Comments
10:03 pm, April 20, 2004      2004-04-20 22:03:00    (random)
 8:04 am, January 4, 1998     1998-01-04 08:04:00    (random)
11:59 am, July 4, 2012        2012-07-04 11:59:00    1 min. < next entry
12:00 pm, July 4, 2012        2012-07-04 12:00:00    This is noon
11:59 pm, December 31, 2003   2003-12-31 23:59:00    1 min. < next entry
12:00 am, January 1, 2004     2004-01-01 00:00:00    This is midnight
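
The noon and midnight rows are the only tricky cases, and they come down to one rule on the hour field. A minimal sketch of that rule, using two hypothetical helper names (`to_24h` and `to_12h`) that do not appear in the handout:

```python
def to_24h(hour, ampm):
    """Convert a 12-hour clock hour (1-12) plus 'am'/'pm' to 0-23."""
    # 12 am -> 0 and 12 pm -> 12, because 12 % 12 == 0
    return hour % 12 + (12 if ampm == "pm" else 0)

def to_12h(hour):
    """Convert a 24-hour clock hour (0-23) to (hour, 'am' or 'pm')."""
    # hours 0 and 12 both display as 12
    return (hour % 12 or 12, "am" if hour < 12 else "pm")

print(to_24h(12, "am"), to_24h(12, "pm"), to_24h(11, "pm"))  # 0 12 23
print(to_12h(0), to_12h(12), to_12h(22))  # (12, 'am') (12, 'pm') (10, 'pm')
```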

Here is a single program showing the translation in both directions:

Translate Both Directions
#!/usr/bin/python
import re
import sys
mon = {'01': 'January', '02': 'February', '03': 'March',
       '04': 'April',   '05': 'May',      '06': 'June', 
       '07': 'July',    '08': 'August',   '09': 'September', 
       '10': 'October', '11': 'November', '12': 'December'}
mon_inv = {} # create empty dictionary
for (k, v) in mon.items():
    mon_inv[v] = k # add (v, k) to mon_inv

atest = ["10:03 pm, April 20, 2004",
         " 8:04 am, January 4, 1998",
         "11:59 am, July 4, 2012",
         "12:00 pm, July 4, 2012",
         "11:59 pm, December 31, 2003",
         "12:00 am, January 1, 2004"]
itest = ["2004-04-20 22:03:00",
         "1998-01-04 08:04:00",
         "2012-07-04 11:59:00",
         "2012-07-04 12:00:00",
         "2003-12-31 23:59:00",
         "2004-01-01 00:00:00"]
def amtoin(reg, dat):
    sys.stdout.write("Input American: \"" + dat + "\" ---> ")
    r = re.compile(reg)
    # now try search
    s = r.search(dat)
    if s is not None:
        # output international form: year-month-day first
        res = s.group(6) + "-"
        res += mon_inv[s.group(4)] + "-"
        days = s.group(5)
        if len(days) == 1:
            days = "0" + days       # pad the day to two digits
        res += days + " "
        if s.group(3) == "am":
            hours = s.group(1)
            if hours == "12":
                hours = "00"        # 12 am is midnight
            if len(hours) == 1:
                hours = "0" + hours
            res += hours + ":"
        else: # s.group(3) == "pm"
            ihours = int(s.group(1))
            if ihours < 12:
                ihours = ihours + 12    # 12 pm (noon) stays 12
            res += str(ihours) + ":"
        res += s.group(2) + ":00"
        res += "\n"
    else:
        res = "None\n"
    return res
    
def intoam(reg, dat):
    sys.stdout.write("Input Internat: \"" + dat + "\" ---> ")
    r = re.compile(reg)
    # now try search
    s = r.search(dat)
    if s is not None:
        # output American form
        ihours = int(s.group(4))
        am = ihours < 12            # hours 00-11 are am, 12-23 are pm
        if ihours > 12:
            ihours = ihours - 12
        if ihours == 0:
            ihours = 12             # hour 00 is 12 am (midnight)
        res = str(ihours) + ":"
        res += s.group(5) + " "
        if am:
            res += "am, "
        else:
            res += "pm, "
        res += mon[s.group(2)] + " "
        days = s.group(3)
        if int(days) < 10:
            days = days[1]          # drop the leading zero
        res += days + ", "
        res += s.group(1)
        res += "\n"
    else:
        res = "None\n"
    return res

# data hardwired in as function calls
regexpa = r'\s*([0-9]{1,2}):([0-9]{2})\s(am|pm),\s([A-Z][a-z]+)\s([0-9]{1,2}),\s([0-9]{4})'
regexpi = r'\s*([0-9]{4})-([0-9]{2})-([0-9]{2}) ([0-9]{2}):([0-9]{2}):([0-9]{2})'
for amer in atest:
    sys.stdout.write(amtoin(regexpa, amer))
for inter in itest:
    sys.stdout.write(intoam(regexpi, inter))

Output:
Input American: "10:03 pm, April 20, 2004" ---> 2004-04-20 22:03:00
Input American: " 8:04 am, January 4, 1998" ---> 1998-01-04 08:04:00
Input American: "11:59 am, July 4, 2012" ---> 2012-07-04 11:59:00
Input American: "12:00 pm, July 4, 2012" ---> 2012-07-04 12:00:00
Input American: "11:59 pm, December 31, 2003" ---> 2003-12-31 23:59:00
Input American: "12:00 am, January 1, 2004" ---> 2004-01-01 00:00:00

Input Internat: "2004-04-20 22:03:00" ---> 10:03 pm, April 20, 2004
Input Internat: "1998-01-04 08:04:00" ---> 8:04 am, January 4, 1998
Input Internat: "2012-07-04 11:59:00" ---> 11:59 am, July 4, 2012
Input Internat: "2012-07-04 12:00:00" ---> 12:00 pm, July 4, 2012
Input Internat: "2003-12-31 23:59:00" ---> 11:59 pm, December 31, 2003
Input Internat: "2004-01-01 00:00:00" ---> 12:00 am, January 1, 2004
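
One way to cross-check the American-to-international results is the standard library's `datetime` module. This is only a sanity check, not a substitute for the exercise, which asks for regular expressions. The function name `american_to_iso` is made up for this sketch, and it assumes the default (English) locale, since `%B` and `%p` are locale-dependent.

```python
from datetime import datetime

def american_to_iso(text):
    # %I = 12-hour clock hour, %p = am/pm marker, %B = full month name.
    # strip() removes the leading blank in inputs such as " 8:04 am, ...".
    parsed = datetime.strptime(text.strip(), "%I:%M %p, %B %d, %Y")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")

print(american_to_iso("12:00 am, January 1, 2004"))  # 2004-01-01 00:00:00
print(american_to_iso(" 8:04 am, January 4, 1998"))  # 1998-01-04 08:04:00
```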


Revision date: 2014-03-25. (Please use ISO 8601, the International Standard Date and Time Notation.)