Syntax Part II

Last Class

While working on some problems in class, several good questions came up:

  • Is there only ever one parse tree for a string?
  • How do we make sure our parse tree follows mathematical rules?

Ambiguity

  • If a grammar contains any sentence for which there are two or more legal parse trees, then that grammar is ambiguous.
  • English example: I saw the man with the binoculars (who has the binoculars?)
  • English example: One morning I shot an elephant in my pajamas. How he got in my pajamas, I don't know. (Animal Crackers, 1930)
  • In programming languages, ambiguous grammars can usually be rewritten to be unambiguous

Ambiguity in Parse Trees

Given the grammar:

  • $< assign > \to < id > = < expr > $
  • $< id > \to A \, | \, B \, | \, C $
  • $< expr > \to < expr > + < expr> $
  • $\qquad \qquad | < expr > * < expr> $
  • $\qquad \qquad | \, ( < expr> ) $
  • $\qquad \qquad | < id > $

The sentence A = B + C * A has more than one parse tree under this grammar, so the grammar is ambiguous

Example of ambiguous parse trees
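
The ambiguity can also be checked mechanically. The following is a minimal brute-force sketch in Python, not part of the course material: the function name parse_trees and the nested-tuple encoding of trees are assumptions made for illustration. It lists every parse tree of a token sequence under the < expr > rules of the grammar above.

```python
from functools import lru_cache


def parse_trees(tokens):
    """List every parse tree for tokens under the ambiguous <expr> rules:
    <expr> -> <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def expr(i, j):
        # Every tree that derives exactly tokens[i:j].
        trees = []
        if j - i == 1 and tokens[i] in ("A", "B", "C"):
            trees.append(tokens[i])  # <expr> -> <id>
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            # <expr> -> ( <expr> )
            trees.extend(("()", inner) for inner in expr(i + 1, j - 1))
        for k in range(i + 1, j - 1):
            # Try each + or * as the root operator of this span.
            if tokens[k] in ("+", "*"):
                for left in expr(i, k):
                    for right in expr(k + 1, j):
                        trees.append((tokens[k], left, right))
        return tuple(trees)

    return list(expr(0, len(tokens)))


for tree in parse_trees("B + C * A".split()):
    print(tree)
```

For the right-hand side B + C * A this prints two trees: one with + at the root and one with * at the root. Two legal trees for one sentence is exactly the definition of an ambiguous grammar.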

Operator Precedence

  • One way to remove ambiguity is to design a language so that certain operators have higher precedence than others
    • An operator with higher precedence is grouped with its operands first, so its subtree is evaluated before the operators above it in the parse tree
    • The lower in the parse tree an operator is, the higher precedence it has
  • Using a separate non-terminal for each precedence level achieves this

Ambiguous to Unambiguous

Ambiguous Grammar

  • $< assign > \to < id > = < expr > $
  • $< id > \to A \, | \, B \, | \, C $
  • $< expr > \to < expr > + < expr> $
  • $\qquad \qquad | < expr > * < expr> $
  • $\qquad \qquad | \, ( < expr> ) $
  • $\qquad \qquad | < id > $

Unambiguous Grammar

  • $< assign > \to < id > = < expr > $
  • $< id > \to A \, | \, B \, | \, C $
  • $< expr > \to < expr > + < term > $
  • $\qquad \qquad | < term > $
  • $< term > \to < term > * < factor > $
  • $ \qquad \qquad | < factor > $
  • $< factor > \to ( < expr> ) $
  • $\qquad \qquad | < id > $
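
To see that the unambiguous grammar forces a single grouping, here is a small recursive-descent sketch in Python. It is an illustration under assumptions, not the course's implementation: the name parse_expr and the tuple encoding of trees are made up, and each left-recursive rule is written as a loop, which is a common way to hand-code such rules.

```python
def parse_expr(tokens):
    """Parse tokens with the unambiguous <expr>/<term>/<factor> rules."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():
        # <expr> -> <expr> + <term> | <term>, with the left recursion as a loop
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    def term():
        # <term> -> <term> * <factor> | <factor>
        nonlocal pos
        node = factor()
        while peek() == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def factor():
        # <factor> -> ( <expr> ) | <id>
        nonlocal pos
        tok = peek()
        if tok == "(":
            pos += 1
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        if tok in ("A", "B", "C"):
            pos += 1
            return tok
        raise SyntaxError(f"unexpected token: {tok!r}")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token: {tokens[pos]!r}")
    return tree


print(parse_expr("B + C * A".split()))
```

The only tree produced for B + C * A has + at the root and * below it, because * can only be introduced through < term >, one level further down than +.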

Compare the derivations

Derivations for A = B + C * A

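Ambiguous grammar, expanding + first

  • $< assign > \Rightarrow < id > = < expr >$
  • $\qquad \qquad \Rightarrow A = < expr >$
  • $\qquad \qquad \Rightarrow A = < expr > + < expr >$
  • $\qquad \qquad \Rightarrow A = < id > + < expr >$
  • $\qquad \qquad \Rightarrow A = B + < expr >$
  • $\qquad \qquad \Rightarrow A = B + < expr > * < expr >$

Ambiguous grammar, expanding * first

  • $< assign > \Rightarrow < id > = < expr >$
  • $\qquad \qquad \Rightarrow A = < expr >$
  • $\qquad \qquad \Rightarrow A = < expr > * < expr >$
  • $\qquad \qquad \Rightarrow A = < expr > + < expr > * < expr >$
  • $\qquad \qquad \Rightarrow A = < id > + < expr > * < expr >$
  • $\qquad \qquad \Rightarrow A = B + < expr > * < expr >$

Both derivations reach the same sentential form by applying different rules first, so they correspond to different parse trees

Unambiguous grammar

  • $< assign > \Rightarrow < id > = < expr >$
  • $\qquad \qquad \Rightarrow A = < expr >$
  • $\qquad \qquad \Rightarrow A = < expr > + < term >$
  • $\qquad \qquad \Rightarrow A = < term > + < term >$
  • $\qquad \qquad \Rightarrow A = < factor > + < term >$
  • $\qquad \qquad \Rightarrow A = < id > + < term >$
  • $\qquad \qquad \Rightarrow A = B + < term >$
  • $\qquad \qquad \Rightarrow A = B + < term > * < factor >$
  • $\qquad \qquad \Rightarrow A = B + < factor > * < factor >$
  • $\qquad \qquad \Rightarrow A = B + < id > * < factor >$
  • $\qquad \qquad \Rightarrow A = B + C * < factor >$
  • $\qquad \qquad \Rightarrow A = B + C * < id >$
  • $\qquad \qquad \Rightarrow A = B + C * A$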

Precedence Practice

Make this grammar unambiguous

  • $C \to C \, \textrm{or} \, C$
  • $C \to C \, \textrm{and} \, C$
  • $C \to ( \, C \, )$
  • $C \to id$

One possible solution (here $\textrm{and}$ has higher precedence than $\textrm{or}$)

  • $C \to C \, \textrm{or} \, D \, | \, D$
  • $D \to D \, \textrm{and} \, E \, | \, E $
  • $E \to ( \, C \, ) \, | \, id$

Operator Associativity

  • Determines how operators of the same precedence are grouped when a string contains several of them
    • A + B - C
    • A / B * C
    • A + B + C
  • The side of the operator the recursion occurs on determines associativity
    • $ < expr > \to < expr > + < term > $ is left associative
    • $ < factor > \to < exp > ** < factor > $ is right associative, where $< exp >$ is a lower-level non-terminal such as $( < expr > )$ or $< id >$
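
The two groupings can be sketched with a pair of small Python helpers. The names group_left and group_right are hypothetical and exist only for this illustration: left recursion nests the tree toward the left, right recursion nests it toward the right.

```python
from functools import reduce


def group_left(operator, operands):
    # Left associative: ((A op B) op C), the shape produced by left recursion.
    return reduce(lambda acc, x: (operator, acc, x), operands)


def group_right(operator, operands):
    # Right associative: (A op (B op C)), the shape produced by right recursion.
    return reduce(lambda acc, x: (operator, x, acc), reversed(operands))


print(group_left("-", ["A", "B", "C"]))    # ('-', ('-', 'A', 'B'), 'C')
print(group_right("**", ["A", "B", "C"]))  # ('**', 'A', ('**', 'B', 'C'))
```

The grouping changes the result whenever the operator is not associative in the mathematical sense: (10 - 4) - 3 is 3 while 10 - (4 - 3) is 9. Python itself treats - as left associative and ** as right associative, so 10 - 4 - 3 evaluates to 3 and 2 ** 3 ** 2 evaluates to 512.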

Extended BNF (EBNF)

  • Does not change the type of languages we can describe
  • Square brackets are used to denote optionality
    • $< if\_stmt > \to \textrm{if} \, ( < expr > ) < stmt > [ \, \textrm{else} < stmt > ] $
  • Curly braces are used to denote repetition (zero or more times)
    • $< ident\_list > \to < identifier > \{ , < identifier > \} $
  • Parentheses and the OR operator (|) are used to denote a choice among alternatives
    • $< term > \to < term > ( \, * \, | \, / \, | \, \% \, ) < factor > $
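
EBNF notation lines up closely with the control flow of a hand-written parser: curly braces become a loop and square brackets become an if. The sketch below, in Python, handles the < ident_list > rule above. The function name parse_ident_list is hypothetical, and Python's str.isidentifier is used as a stand-in for < identifier >, which the slides do not define.

```python
def parse_ident_list(tokens):
    """<ident_list> -> <identifier> { , <identifier> }"""

    def identifier(pos):
        # Stand-in for <identifier>: any token Python accepts as an identifier.
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError(f"expected an identifier at position {pos}")
        return tokens[pos]

    names = [identifier(0)]
    pos = 1
    # The braces { , <identifier> } become a loop that runs zero or more times.
    while pos < len(tokens) and tokens[pos] == ",":
        names.append(identifier(pos + 1))
        pos += 2
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token: {tokens[pos]!r}")
    return names


print(parse_ident_list(["x", ",", "y", ",", "z"]))  # ['x', 'y', 'z']
```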

EBNF Practice

Convert to EBNF

  • $< program > \to \textrm{begin} < stmt\_list > \textrm{end}$
  • $< stmt\_list > \to < stmt > | < stmt > ; < stmt\_list>$
  • $< stmt > \to < var > = < expression> $
  • $< var > \to \textrm{A} \, | \, \textrm{B} \, | \, \textrm{C} $
  • $< expression > \to < var > + < var > $
  • $\qquad \qquad | < var > - < var >$
  • $\qquad \qquad | < var > $

One possible EBNF version

  • $< program > \to \textrm{begin} < stmt\_list > \textrm{end}$
  • $< stmt\_list > \to < stmt > \{ ; < stmt > \} $
  • $< stmt > \to < var > = < expression> $
  • $< var > \to \textrm{A} \, | \, \textrm{B} \, | \, \textrm{C} $
  • $< expression > \to < var > [( + | - ) < var > ]$

EBNF Practice

Convert to EBNF

  • $< assign > \to < id > = < expr >$
  • $< id > \to \textrm{A} \, | \, \textrm{B} \, | \, \textrm{C} $
  • $< expr > \to < expr > + < expr > $
  • $\qquad \qquad | < expr > * < expr > $
  • $\qquad \qquad | \, ( \, < expr > \, ) \, $
  • $\qquad \qquad | < id > $

One possible EBNF version

  • $< assign > \to < id > = < expr >$
  • $< id > \to \textrm{A} \, | \, \textrm{B} \, | \, \textrm{C} $
  • $< expr > \to < expr > (+|*) < expr > $
  • $\qquad \qquad | \, ( \, < expr > \, ) \, $
  • $\qquad \qquad | < id > $

Static Semantics

  • Covers aspects of meaning that can be checked at compile time
    • Often used for type checking in strongly typed languages
    • Knowing that both operands in < num > / < num > should be float requires knowledge of what / means

Attribute Grammars

  • Used to specify static semantics
  • Consists of
    • Attributes that act as variables in the grammar
    • Attribute computation functions that describe how the values of the attributes are computed
    • Predicate functions that give the semantic rules that must be followed

Attribute Grammars cont'd

  • The attributes of a grammar symbol X are denoted X.attr
  • Attribute computation functions use the attributes of either the parent or the children as inputs to compute the current node's attributes
  • Predicate functions are boolean expressions that restrict the possible derivations

Attribute Grammar Examples

  • In Ada, a procedure definition must give the name of the procedure at both the beginning and the end of the definition
  • Syntax: $ < proc\_def > \to \textrm{procedure} < proc\_name >[1] < proc\_body > \textrm{end} < proc\_name >[2] $
  • Predicate: $ < proc\_name >[1].string == < proc\_name >[2].string $
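
A predicate like this is simply a boolean check applied after the syntax rule has matched. The following is a minimal sketch in Python, assuming the definition has already been split into tokens and treating everything between the first name and the final end as an opaque < proc_body >. The function name check_proc_def and this token layout are assumptions for illustration only.

```python
def check_proc_def(tokens):
    """Apply <proc_name>[1].string == <proc_name>[2].string to
    tokens of the form: procedure NAME ...body... end NAME
    """
    # Syntax: the keywords must be in place before the predicate applies.
    if len(tokens) < 4 or tokens[0] != "procedure" or tokens[-2] != "end":
        raise SyntaxError("not of the form: procedure <name> <body> end <name>")
    first_name, second_name = tokens[1], tokens[-1]
    # Predicate: both occurrences of the name must be the same string.
    return first_name == second_name


print(check_proc_def(["procedure", "Swap", "body", "end", "Swap"]))  # True
print(check_proc_def(["procedure", "Swap", "body", "end", "Sort"]))  # False
```

The second call is syntactically well formed but violates the predicate, which is the kind of error a context-free grammar alone cannot rule out.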

Attribute Grammar Practice

Modify the grammar below into an attribute grammar that obeys the following rules

  • Data types cannot be mixed in expressions
  • The assignment statement does not need to have the same type on both sides

Grammar

  • $< assign > \to < var > = < expr >$
  • $< expr > \to < var > + < var > | < var >$
  • $< var > \to \textrm{A} \, | \, \textrm{B} \, | \, \textrm{C} $

One possible solution

  • $< assign > \to < var > = < expr >$
  • $< expr > \to < var >[1] + < var >[2] | < var >$
  • Predicate: $< var >[1].type == < var >[2].type$
  • $< var > \to \textrm{A} \, | \, \textrm{B} \, | \, \textrm{C} $
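
The practice grammar can be checked the same way. The sketch below is in Python; the table VAR_TYPES and the types in it are invented for illustration, since the slides do not say what types A, B, and C have, and the function name check_assign is also hypothetical. It applies the predicate only to the two operands of +, and deliberately ignores the type of the left-hand side, as the second rule allows.

```python
# Hypothetical declared types; the slides do not specify them.
VAR_TYPES = {"A": "int", "B": "int", "C": "real"}


def check_assign(tokens):
    """<assign> -> <var> = <expr>
    <expr>   -> <var>[1] + <var>[2] | <var>
    Predicate: <var>[1].type == <var>[2].type
    """
    # Syntax: either  <var> = <var>  or  <var> = <var> + <var>
    if len(tokens) not in (3, 5) or tokens[1] != "=":
        raise SyntaxError("expected <var> = <expr>")
    if len(tokens) == 5 and tokens[3] != "+":
        raise SyntaxError("expected '+' between operands")
    for name in tokens[0::2]:  # tokens at even positions are the <var>s
        if name not in VAR_TYPES:
            raise SyntaxError(f"unknown variable: {name!r}")
    if len(tokens) == 3:
        return True  # a single <var> cannot mix types
    # Predicate on the operands only; the left-hand side type is not compared.
    return VAR_TYPES[tokens[2]] == VAR_TYPES[tokens[4]]


print(check_assign("C = A + B".split()))  # True: int + int, assigned to a real
print(check_assign("A = B + C".split()))  # False: int + real mixes types
```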