CS 3723, Fall 2014: Homework 4: NFAs with ε-moves

CS 3723
Programming Languages
Fall 2014

Homework 4. NFAs with ε-moves
ANSWERS

Week 4: Sep 15 - 19

Submit following directions at: submissions and rules at: rules. Deadlines are:

2014-09-29 23:59:59 (that's Mon, 29 Sep 2014, 11:59:59 pm) for full credit.
2014-10-03 23:59:59 (that's Fri, 3 Oct 2014, 11:59:59 pm) for 75% credit.

Subset Algorithm With ε-moves: See NFAs with ε-moves. Notice that the only difference between this and the former subset algorithm (the one that didn't allow ε-moves) is: Here we compulsively take the ε-closure of each subset generated.

Use the following NFA to represent the RE: /(aa)*(bb)*/. Carry out the extended subset algorithm on this NFA to get a DFA that accepts the same language. This isn't the only NFA that will work, but this is the one you must work with.

State a b

cl({0})={0,2,4} ST cl({1})={1} cl({3})={3}

{1} cl({0})={0,2,4} { }

{3} { } cl({2})={2,4}

{2,4} T { } cl({3})={3}

{ } { } { }
Suppose you want to handle the regular expression: /ab(ab)*/. The figure below gives this converted to an NFA with ε-moves. (Here "@" is what I use for ε in programs (ε = the empty string). This isn't the only NFA that will work, but this is the one you must work with. (Below, the transition from state 7 to state 9 should be labeled with @.)
1. Carry out the extended subset algorithm on this NFA to get a DFA that accepts the same language. [Your answer should have 6 states including the empty set as an error state.]
2. This DFA is not optimal: it doesn't have the minimal number of states. But it can be converted to that optimal DFA by identifying two pairs of its states: the two terminal states (there are only 2), and two of the non-terminals. See if you can figure out what the minimal DFA is. (You haven't been given an algorithm for this.) Table and diagrams are below. The table may be a little confusing because each state is represented in two forms: as a subset of the original states, and as a number gotten from renumbering the states. The diagrams use the new numbering. Note that state 2 in the renumbering is the error state corresponding to the empty set.
State a b

S 0 {0} 1 {1,2} 2 {}

1 {1,2} 2 {} 3 {3,4,8,9}

2 {} 2 {} 2 {}

T 3 {3,4,8,9} 4 {5,6} 2 {}

4 {5,6} 2 {} 5 {4,7,9}

T 5 {4,7,9} 4 {5,6} 2 {}

Continued Fraction Evaluation (Recursive Solution): Please refer to Problem 4 of Homework 2, where you wrote a function to evaluate a weird fraction.

For this problem, you are to write another function p with one parameter n. This function must evaluate the same fraction, but must do it top-down, rather than bottom-up as before. You must use recursion and it must work for general n. The function cannot have a loop in it. Give results for p(10), p(15) and p(20). Below is a Python program that does this. There are two forms, the second more accurate than the first. The results are compared against the exact value of pi.
Below the first program are two programs by students, the first elegantly simple and the second high-tech.

# pi_cf.py: pi from cont. fraction
import sys
import math

if len(sys.argv) != 2:
    sys.stdout.write("Need 2 args\n")
else:
    L = int(sys.argv[1])

def p1(i): # simple form
    if i == L:
        return 1
    return (2.0*i-1)+(i*i)/p1(i+1)

def p2(i): # better approximation
    if i == L:
        return 2*L
    return (2.0*i-1)+(i*i)/p2(i+1)
    
def p(n):
    return 4.0/p2(n)

sys.stdout.write("%3i, Pi:   %19.16f, %19.16f\n" %
  (L, math.pi, math.pi))
sys.stdout.write("%3i, Appr: %19.16f, %19.16f\n" %
  (L, 4/p1(1), p(1)))
sys.stdout.write("%3i, Diff: %19.16f, %19.16f\n" %
  (L, math.pi-4/p1(1), math.pi-p(1)))
Output:
Depth        Return 1             Return 2*L

 10, Pi:    3.1415926535897931,  3.1415926535897931
 10, Appr:  3.1415897324224455,  3.1415925803014688
 10, Diff:  0.0000029211673476,  0.0000000732883243

 15, Pi:    3.1415926535897931,  3.1415926535897931
 15, Appr:  3.1415926540675274,  3.1415926535957652
 15, Diff: -0.0000000004777343, -0.0000000000059721

 20, Pi:    3.1415926535897931,  3.1415926535897931
 20, Appr:  3.1415926535897185,  3.1415926535897913
 20, Diff:  0.0000000000000746,  0.0000000000000018

# Recursive Continued fraction for pi
    
    # Starter function to start evaluation of pi
    def EvalFrac(n):
        if(n == 0):
            return 4
        else:
            return 4 / BottomFrac(1, n)
    
    # Recursive function to calculate bottom part of the fraction
    def BottomFrac(n, end):
        if end == n:
            return 2*n+1
        else:
            return (2*(n-1)+1) + ((n * n) / BottomFrac(n+1,end))
    
    print(EvalFrac(10))
    print(EvalFrac(15))
    print(EvalFrac(20))

#!/usr/bin/env python

from decimal import Decimal, ROUND_FLOOR, getcontext

def _pi(n, divisor):
    '''Helper function. Carries around current n and divisor.'''
    if n == 0:
        return 4 / divisor
    else:
        return _pi(n - 1, (2 * (n - 1) + 1) + (n**2 / divisor))

def pi(n):
    '''Calculates PI using continued fractions with n terms.'''
    # Set the decimal precision and rounding.
    getcontext().prec = int((n * 3) / 4.0)
    getcontext().rounding = ROUND_FLOOR
    return _pi(n, Decimal(2 * n + 1))

def main():
    for x in range(5, 51, 5,):
        print 'pi(%2s): %s' % (x, pi(x))

if __name__ == '__main__':
    main()

Processing REs Using NFAs: Look at the page ε-moves. At the end there is an NFA with ε-moves that supposedly came from the RE: (a|b|c)*(ab|aac). In the diagram I put a|b|c on top of the arrow, but the methods presented don't allow for such a shortcut. Instead, the two "or" operators would have to be handled separately. In this question, I'm just considering how these methods would handle the RE: (a|b|c)*, which is just an initial portion of the whole example. The steps I'm showing here have the same numbers as those in: RE_sim.

Step 1: Parser: Here (a|b|c)* is translated to RPN form: ab|c|*

Step 2. Translate to NFA with ε-moves: Here the following NFA with ε-moves would be constructed. Refer to the page: RE−−>NFA.

Here is the internal representation of the above graph. This information is not needed for this problem.

Graph (internal form)

State Adjacency List 0 --> [a, 1] 1 --> [@, 5] 2 --> [b, 3] 3 --> [@, 5] 4 --> [@, 2] --> [@, 0] 5 --> [@, 9] 6 --> [c, 7] 7 --> [@, 9] 8 --> [@, 6] --> [@, 4] 9 --> [@, 8] --> [@,11] s 10 --> [@,11] --> [@, 8] t 11 -->

Step 4b. NFA Simulator (skip Step 3): The table below shows the output of the simulator. Here there are 12 states, numbered from 0 to 11. The start state is 10 and the only terminal state is 11. The subsets of the set of states are represented by "bit"-strings of length 12, where a 0 means the state is not in the subset and a 1 means the state is in the subset. (I cheated; they are actually char strings.) The position numbers (from 0 to 11) are given first across the top. Next comes a row representing the start state alone. After that is a row giving the ε-closure of the start state. Then comes the state resulting from processing an input "a".

Graph (internal form)

Auto: start: 10, term: 11
Total states: 12
abcab$
execNFA: execStr: abcab$

 |                     1 1 
 | 0 1 2 3 4 5 6 7 8 9 0 1 
 | - - - - - - - - - - - - 
 : 0 0 0 0 0 0 0 0 0 0 1 0  s
 : 1 0 1 0 1 0 1 0 1 0 1 1  st
a: 1 1 1 0 1 1 1 0 1 1 0 1  t

Verify that the Start State of the Simulated NFA is correct. (This is the next-to-the-last line, with "st" at the end.) [As an answer you can just do "mumble-mumble go from here to there", or better (but not required) show roughly how the depth-first search would proceed.]
Verify that the last line is correct. This is the subset resulting from processing an input "a". (You must "show your work", meaning that you must give the set of states you find as being at the head of arrows labeled "a", and then show the result of the ε-closure.)

Calculate the next two lines by hand, assuming the further inputs (after the initial "a") are "b" and then "c". Give the resulting lines. (Again, in these two cases you must "show your work" as above.)

Partial Answer to 4


RE: /(a|b|c)*/
RE (RPN form): ab|c|*$
Graph (internal form):
   0 --> [a, 1]
   1 --> [@, 5]
   2 --> [b, 3]
   3 --> [@, 5]
   4 --> [@, 2] --> [@, 0]
   5 --> [@, 9]
   6 --> [c, 7]
   7 --> [@, 9]
   8 --> [@, 6] --> [@, 4]
   9 --> [@, 8] --> [@,11]
s 10 --> [@,11] --> [@, 8]
t 11 -->


Auto: start: 10, term: 11
Total states: 12
Input string: abcab$
execNFA: execStr: abcab$, states: 100
 |                     1 1 
 | 0 1 2 3 4 5 6 7 8 9 0 1 
 | - - - - - - - - - - - - 
 : 0 0 0 0 0 0 0 0 0 0 1 0  s
 : 1 0 1 0 1 0 1 0 1 0 1 1  st
a: 1 1 1 0 1 1 1 0 1 1 0 1  t
b: 1 0 1 1 1 1 1 0 1 1 0 1  t
c: 1 0 1 0 1 0 1 1 1 1 0 1  t
------------------------------
(below not asked for)
a: 1 1 1 0 1 1 1 0 1 1 0 1  t
b: 1 0 1 1 1 1 1 0 1 1 0 1  t

( Revision date: 2014-10-06. Please use ISO 8601, the International Standard Date and Time Notation.)