CS 3723 Programming Languages
0. Getting Started
(with hidden files)
How to Get Started:
As a start you should type in the first Python program given
below and run it. This work has several objectives:
- Start working on the Python language. In particular, the Python
program you are to copy uses three specific programming
methods:
- Reading from a text file.
- Using regular expressions to extract fields from a string.
- Using Python's "list" data type. (A Python list has all the
features of an array and of a linked list, plus many more.)
- Learn to write and run Python programs.
Note: You can program in Python on almost any
platform: Python is usually already present on
Apple and Linux machines, and it is available for download on
Windows.
Note: A "copy" program may seem weird and useless,
but you really should type it in from scratch.
(There is significant benefit from typing it yourself.
If you make typing errors, so much the better.)
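Running a Python Program: Assume your program is in the
file "h0.py" on a Unix/Linux system. Execute the command:
% python h0.py
where "%" stands for the prompt.
It is better to use the command:
% python -tt h0.py
(Options must come before the file name; anything after "h0.py"
is passed to the program as an argument.)
With the "-t" option, Python 2 issues a warning if a file mixes tabs
and spaces inconsistently for indentation, and "-tt" makes this an error.
(Python 3 always treats such mixing as an error.)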
Alternatively, you can add the following as the first line
of the file h0.py:
#!/usr/bin/python -tt
And then type:
% chmod +x h0.py # make h0.py executable
% ./h0.py # execute, using first line to find python
I mostly won't use this method (more trouble than it's
worth for writing sample programs), but it is very useful for
systems programming and larger production work.
Python Copy Program
 1  # copy.py: H0 program to copy
 2  import sys # for sys.stdout.write
 3  import re # for regular expressions
 4  f = open("students.txt",'r') # open for reading
 5  # reg expr: matches student name plus 3 int scores
 6  r = re.compile(r"(\w+ \w+)\s+(\d+)\s+(\d+)\s+(\d+)")
 7  ave = [] # empty list
 8  sys.stdout.write("Name " +
 9      " E1" + " E2" + " Final" + " Grade\n\n")
10  for line in f: # iterate through each line in file
11      m = r.search(line) # m is match data
12      if m is not None:
13          e1 = float(m.group(2)) # exam 1
14          e2 = float(m.group(3)) # exam 2
15          final = float(m.group(4)) # final (not "f", which is the file)
16          sum = e1+e2+final # total raw score
17          aver = sum/3.5 # percent score
18          ave.append(aver) # add entry to list
19          sys.stdout.write(line.strip() + " ")
20          sys.stdout.write("%6.2f" % aver)
21          sys.stdout.write("\n")
22      else:
23          sys.stdout.write("No match!\n")
24          break
25  tot = 0
26  for av in ave: # iterate through array
27      tot += av # add student scores
28  sys.stdout.write("\nCourse Ave: %6.2f\n"
29      % (tot/len(ave)) ) # system demanded extra ()

Input File: "students.txt"
% cat students.txt
Bruce Wayne 85 67 134
Peter Parker 72 71 129
Bruce Banner 55 65 114
Clark Kent 91 88 143
Princess Diana 70 62 131
Barbara Gordon 96 89 147
Selina Kyle 77 74 105
Output
% python copy.py
Name E1 E2 Final Grade
Bruce Wayne 85 67 134 81.71
Peter Parker 72 71 129 77.71
Bruce Banner 55 65 114 66.86
Clark Kent 91 88 143 92.00
Princess Diana 70 62 131 75.14
Barbara Gordon 96 89 147 94.86
Selina Kyle 77 74 105 73.14
Course Ave: 80.20
Notes:
- In order to produce just this output, the list "ave"
is not needed, since the program could compute a running sum.
I wanted to illustrate lists.
- Python mostly has no declarations of variables, and a given
variable can mostly be used for any type you like, or even
for several types in the same program.
- When you are inside parens (as in lines 8-9 and 28-29
above), you may indent on the second line in any way you like.
Otherwise, the indenting should be exactly as shown above,
with exactly 4 spaces for each level of indenting, and with
no tab characters. (I was able to set my editor so that the
Tab key always produces 4 spaces and no tabs.)
- My favorite mistake is to leave off the ":" character just
before a new level of indenting, or to use ";" instead.
- The "strip( )" in line 19 strips off any whitespace characters
from the start or end of the character string "line". In this
case there was only a newline at the end. The "strip( )" method
makes a new copy of the string for use in line 19, but the variable
"line" retains the newline. You would need to write
"line = line.strip( )" to get rid of the newline
from the variable "line".
- Python regular expressions are mostly the same as in any other
scripting language, but there are significant notational differences.
Python has nothing like the "$" variables that can be used in Perl to
refer to the different matching groups. Instead you must use the
"group( )" method as shown on lines 13-15. Thus
"m.group(3)" takes on the role that
"$3" has in Perl. The result is a string, and
lines 13-15 use the function "float( )" to
convert this string to floating point (a double).
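The last two notes can be checked directly. The following is only a small sketch (the sample record and the variable names are made up for illustration): it shows that "strip( )" returns a new string and leaves the original alone, and that "group( )" returns a string that has to be converted before any arithmetic.

```python
import re
import sys

line = "Bruce Wayne 85 67 134\n"   # sample record, newline included

# strip() returns a new string; "line" itself still ends with "\n"
stripped = line.strip()
still_has_newline = line.endswith("\n")

# group(n) plays the role of Perl's $n and always returns a string
m = re.search(r"(\w+ \w+)\s+(\d+)\s+(\d+)\s+(\d+)", line)
name = m.group(1)
exam1_text = m.group(2)            # the string "85", not a number
exam1 = float(exam1_text)          # convert before doing arithmetic

sys.stdout.write(repr(stripped) + "\n")
sys.stdout.write(str(still_has_newline) + "\n")
sys.stdout.write("%s scored %.1f\n" % (name, exam1))
```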
Parts of Python Illustrated by this Program:
- Reading from a text file:
"students.txt" is a file of records in Linux/Unix where each
record is terminated with a newline ("\n").
(In Windows, records are often terminated with a carriage return
and a linefeed.)
In the listing below, line 5 (in file.py) is a for loop that uses the
file as an iterator, which provides each record of the file
in sequence. The record retains its newline at the end (in Unix).
Lines 12-17 (in file2.py) form a while loop that explicitly reads each
record and explicitly stores the record in the variable "line".
(The name "line" is an arbitrary choice.)
Read and Output a File, Three Programs
 1  # file.py: file: open, read, write
 2  import sys
 3
 4  f = open("students.txt",'r')
 5  for line in f:
 6      sys.stdout.write(line) # "line" has "\n" at end
 7
 8  # file2.py: explicitly read
 9  import sys
10
11  f = open("students.txt",'r')
12  while True:
13      line = f.readline() # at EOF return empty string
14      if not line:
15          break
16      else:
17          sys.stdout.write(line)
18
19  # file_stdin.py: read from stdin
20  import sys
21
22  for line in sys.stdin:
23      sys.stdout.write(line) # "line" has "\n" at end

Output
% python file.py
Bruce Wayne 85 67 134
Peter Parker 72 71 129
Bruce Banner 55 65 114
Clark Kent 91 88 143
Princess Diana 70 62 131
Barbara Gordon 96 89 147
Selina Kyle 77 74 105
% python file2.py
Bruce Wayne 85 67 134
Peter Parker 72 71 129
. . .
Selina Kyle 77 74 105
% python file_stdin.py < students.txt
Bruce Wayne 85 67 134
Peter Parker 72 71 129
. . .
Selina Kyle 77 74 105
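None of the three programs closes the file explicitly. A common alternative, not used in the course listings, is the "with" statement, which closes the file when the block ends. The sketch below assumes nothing about your directory: it first writes a two-line stand-in for "students.txt" to a temporary location (that file and the names "path", "lines_read" and "file_closed" are inventions of this example).

```python
import os
import sys
import tempfile

# Create a small stand-in for "students.txt" so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "students.txt")
with open(path, "w") as out:
    out.write("Bruce Wayne 85 67 134\n")
    out.write("Peter Parker 72 71 129\n")

lines_read = []
with open(path, "r") as f:         # f is closed automatically after the block
    for line in f:                 # each "line" still ends with "\n"
        lines_read.append(line)
        sys.stdout.write(line)

file_closed = f.closed             # True once the with-block has finished
```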
Below is a program that fakes having a file, "opening" it, and
"reading" from it. This can be convenient if you interact with
a terminal window that has Python but doesn't support input.
Notice that the only thing changed below is the definition
of the variable f, which is now a list of strings (one per record,
each ending in "\n") instead of an open file.
Fake a file
# fakefile.py: fake file input
import sys

f = [ "Bruce Wayne 85 67 134\n",
      "Peter Parker 72 71 129\n",
      "Bruce Banner 55 65 114\n",
      "Clark Kent 91 88 143\n",
      "Princess Diana 70 62 131\n",
      "Barbara Gordon 96 89 147\n",
      "Selina Kyle 77 74 105\n" ]
for line in f:
    sys.stdout.write(line)

Output
% python fakefile.py
Bruce Wayne 85 67 134
Peter Parker 72 71 129
Bruce Banner 55 65 114
Clark Kent 91 88 143
Princess Diana 70 62 131
Barbara Gordon 96 89 147
Selina Kyle 77 74 105
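A list of strings works with "for line in f", but it has no "readline( )" method, so it could not stand in for the file in file2.py. One possible alternative, assuming Python 3, is the standard library's "io.StringIO", which wraps a string in a file-like object. This is only a sketch; the names "first", "rest" and "at_eof" are chosen for illustration.

```python
import io
import sys

# io.StringIO wraps a string in a file-like object (Python 3 assumed)
f = io.StringIO("Bruce Wayne 85 67 134\n"
                "Peter Parker 72 71 129\n")

first = f.readline()               # works as with a real file
rest = [line for line in f]        # iteration continues after the first line
at_eof = f.readline()              # empty string at the end of the "file"

sys.stdout.write(first)
for line in rest:
    sys.stdout.write(line)
```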
- Formatted input using a regular
expression: Here we don't show the input, but show
how the fields of one string are extracted.
The regular expression (RE) is what comes between the quote marks
in r"xxx"; in that case the RE is "xxx".
Each RE describes a collection of character strings. ("xxx"
describes only a triple of "x"s.) In the example here, we want to
describe the contents of lines in the file. Short-hand notation gives
us "\s" for a whitespace character, "\d" for a digit,
and "\w" for a character in a word (an upper- or lower-case letter,
a digit, or an underscore). Adding a "+" to the right
means "one or more occurrences of". Anything matched inside parentheses
is captured as a group and is available for use afterwards. (A literal
parenthesis is represented by "\(" or "\)".) The "compile" function
makes the RE ready for use, and the "search" method searches a string
for the first match.
There's a lot more to this, and we'll go over it later.
Use REs to Extract Fields From a String
# regexp.py: use RE to extract fields
import sys
import re

def printstr(s):
    sys.stdout.write("\"" + s + "\"\n")

r = re.compile(r"(\w+ \w+)\s+(\d+)\s+(\d+)\s+(\d+)")
m = r.search("Bruce Wayne 85 67 134")
if m is not None:
    for i in range(0,5):
        printstr(m.group(i))
t = r.split("Bruce Wayne 85 67 134")
sys.stdout.write(str(t) + "\n")

Common Output
% python regexp.py
"Bruce Wayne 85 67 134"
"Bruce Wayne"
"85"
"67"
"134"
['', 'Bruce Wayne', '85', '67', '134', '']
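A few more properties of the same RE can be tried out in isolation. The sketch below uses made-up test strings; it shows "groups( )" returning all captured fields at once, "search" returning None when a line has too few scores, and "\w" matching digits and the underscore as well as letters.

```python
import re

r = re.compile(r"(\w+ \w+)\s+(\d+)\s+(\d+)\s+(\d+)")

# \s+ accepts any run of whitespace between the name and the scores
m = r.search("Clark Kent   91 88 143")
fields = m.groups()                # all captured groups, as a tuple of strings

# Only two scores here, so the RE cannot match and search returns None
no_match = r.search("Clark Kent 91 88")

# \w matches letters, digits and the underscore
word_chars = re.findall(r"\w+", "exam_1 = 85")
```

Here "fields" is ('Clark Kent', '91', '88', '143'), "no_match" is None, and "word_chars" is ['exam_1', '85'].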
- Making use of a Python list:
In Python, the common and versatile data structure "list" is a
combination of all the features of a linked list and of an array,
along with much more besides.
List in Python

List Notation
# ave.py: calculate average of list
import sys

ave = [81.71, 77.71, 66.86, 92.00,
       75.14, 94.86, 73.14]
tot = 0
for av in ave:
    tot += av
sys.stdout.write("Average: %6.2f\n"
    % (tot/len(ave)) )

Array Notation
# ave.py: calculate average of list
import sys

ave = [81.71, 77.71, 66.86, 92.00,
       75.14, 94.86, 73.14]
tot = 0
for i in range(0,len(ave)):
    tot += ave[i]
sys.stdout.write("Average: %6.2f\n"
    % (tot/len(ave)) )

Common Output
% python ave.py
Average: 80.20
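To give a feel for the "array plus linked list plus more" description, here is a short sketch of a few list operations on a smaller, made-up list of averages. The variable names are for illustration only.

```python
ave = [81.71, 77.71, 66.86]

ave.append(92.00)                  # grow at the end, as in the copy program
ave.insert(0, 75.14)               # insert at the front, as with a linked list
third = ave[2]                     # index directly, as with an array
first_two = ave[:2]                # slice out a sub-list
removed = ave.pop()                # remove and return the last item
mixed = [1, "two", 3.0]            # items need not share a type
mean = sum(ave) / len(ave)         # built-in sum() replaces the explicit loop
```

After these steps "ave" holds four items, [75.14, 81.71, 77.71, 66.86].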
Suppose you need to use Python from a browser:
The best option is to use the
ideone Python 2 and 3 simulator.
The link is set for Java by default. You have to change a box from
Java to Python or to Python 3.
You can copy the whole students.txt file into their spot for
stdin. You then have to read from sys.stdin instead of from a file.
One way is to use the line "for line in sys.stdin:",
as shown above in file_stdin.py, in place of line 10 in the original
program. You also need to delete or comment out line 4, the line that
opens the file.
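One way to arrange this is sketched below. It is not the course's own listing: the function name "grade_report" and its parameters are hypothetical, and the logic is simply that of copy.py wrapped in a function that accepts any sequence of lines. The demonstration call uses a short list so that the sketch runs without input.

```python
import re
import sys

def grade_report(lines, out=sys.stdout):
    """Hypothetical helper: copy.py's logic for any iterable of lines."""
    r = re.compile(r"(\w+ \w+)\s+(\d+)\s+(\d+)\s+(\d+)")
    ave = []
    for line in lines:
        m = r.search(line)
        if m is None:
            out.write("No match!\n")
            break
        total = float(m.group(2)) + float(m.group(3)) + float(m.group(4))
        aver = total / 3.5         # percent score, as in copy.py
        ave.append(aver)
        out.write("%s %6.2f\n" % (line.strip(), aver))
    if ave:                        # avoid dividing by zero on empty input
        out.write("\nCourse Ave: %6.2f\n" % (sum(ave) / len(ave)))
    return ave

# Demonstration with a list standing in for the input
demo = grade_report(["Bruce Wayne 85 67 134\n", "Clark Kent 91 88 143\n"])
```

In the browser, with students.txt pasted into the stdin box, the call would be "grade_report(sys.stdin)".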
(Revision date: 2015-01-02.
Please use ISO 8601, the International Standard.)