Csr2011 june17 15_15_kaminski

Post on 21-Oct-2014

LR(0) CONJUNCTIVE GRAMMARS AND DETERMINISTIC SYNCHRONIZED ALTERNATING PUSHDOWN AUTOMATA

Tamar Aizikowitz and Michael Kaminski

Technion – Israel Institute of Technology

June 17 CSR 2011, St. Petersburg, Russia

Context-Free Languages

Combine expressiveness with polynomial parsing, which is appealing for practical applications.

One of the most widely used language classes in Computer Science.

At the theoretical basis of Programming Languages, Computational Linguistics, Formal Verification, Computational Biology, and more.

Extended Models

Goal: Models of computation that generate a slightly stronger language class without sacrificing polynomial parsing.

Why? Such models seem to have great potential for practical applications.

In fact, several fields (e.g., Computational Linguistics) have already voiced their need for a stronger language class.

Conjunctive Grammars

Conjunctive Grammars (CG) [Okhotin, 2001] are an extension of context-free grammars.

Add explicit intersection rules: S ⇒ (A & B) ⇒ ⋯ ⇒ (w & w) ⇒ w

Semantics: L(A & B) = L(A) ⋂ L(B)

Since context-free languages are not closed under intersection, CG generate a wider class of languages.

CG retain polynomial parsing and can therefore be useful for practical applications.

Synchronized Alternating PDA

Synchronized Alternating Pushdown Automata (SAPDA) [Aizikowitz & Kaminski, 2008] extend PDA.

The stack is modeled as a tree in which all branches must accept.

A limited form of synchronization creates localized parallel computations.

SAPDA are the first automaton counterpart shown for Conjunctive Grammars.

Outline

Model Definitions

Equivalence Results

Linear-time LR(0) Parsing

Summary

Conjunctive Grammars

Synchronized Alternating Pushdown Automata

Model Definitions

Conjunctive Grammars

A CG is a quadruple G = (V, Σ, P, S), where V is the set of non-terminals, Σ the set of terminals, P the set of derivation rules, and S the start symbol.

Rules: A → (α1 & ⋯ & αn), where A ∊ V, αi ∊ (V ⋃ Σ)*

Examples: A → (aAB & Bc & aD) ; A → abC

Derivation steps:

s1 A s2 ⇒ s1 (α1 & ⋯ & αn) s2, where A → (α1 & ⋯ & αn) ∊ P

s1 (w & ⋯ & w) s2 ⇒ s1 w s2, where w ∊ Σ*

If n = 1 in every rule, the grammar is an ordinary CFG.

Grammar Language

Language: L(G) = {w ∊ Σ* | S ⇒* w}, i.e., L(G) consists of all terminal words w derivable from the start symbol S.

Note: ( , ) , and & are not terminal symbols. Therefore, all conjunctions must be collapsed in order to derive a terminal word.

Semantics: (A & B) ⇒* w if and only if A ⇒* w and B ⇒* w. Therefore, L(A & B) = L(A) ⋂ L(B).

Example: Multiple Agreement

A CG for the multiple-agreement language {a^n b^n c^n | n ∊ ℕ}:

C → Cc | D ; D → aDb | ε : L(C) = {a^n b^n c^m | n, m ∊ ℕ}

A → aA | E ; E → bEc | ε : L(A) = {a^m b^n c^n | n, m ∊ ℕ}

S → (C & A) : L(S) = L(C) ⋂ L(A)

S ⇒ (C & A) ⇒ (Cc & A) ⇒ (Dc & A) ⇒ (aDbc & A) ⇒ (abc & A) ⇒ ⋯ ⇒ (abc & abc) ⇒ abc

Synchronized Alternating Pushdown Automata (SAPDA)

An extension of classical PDA.

Transitions are made to conjunctions of (state, stack-word) pairs, e.g., δ(q, σ, X) = { (p1, XX) ∧ (p2, Y) , (p3, Z) }

Note: if all conjunctions are of one pair only, the automaton is an “ordinary” PDA.

The model is non-deterministic, i.e., several transitions may be possible.
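
One possible way to write such a transition function down as data is sketched below in Python. The encoding (a dictionary from (state, input symbol, top stack symbol) to a set of conjunctions) and the function name `is_ordinary_pda` are assumptions for illustration only; the example entry mirrors the transition shown above.

```python
# Hypothetical encoding of an SAPDA transition function:
# (state, input symbol, top stack symbol) -> set of conjunctions, where each
# conjunction is a tuple of (state, stack word) pairs.
delta = {
    ("q", "σ", "X"): {
        (("p1", "XX"), ("p2", "Y")),  # (p1, XX) ∧ (p2, Y)
        (("p3", "Z"),),               # (p3, Z), a conjunction of one pair
    },
}


def is_ordinary_pda(delta):
    """True if every conjunction in the table consists of exactly one pair."""
    return all(len(conjunction) == 1
               for targets in delta.values()
               for conjunction in targets)
```

For the table above the check fails, since the first conjunction has two pairs; removing it leaves a table that an ordinary PDA could use.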

SAPDA Stack Tree

The stack of an SAPDA is a tree. A transition to n pairs splits the current branch into n branches.

Branches are processed independently.

Empty sibling branches can be collapsed if they are synchronized, i.e., are in the same state and have read the same portion of the input.

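[Figure: stack-tree diagrams. A branch with stack symbols B, A in state q is split by a transition δ(q, σ, A) containing (q, A) ∧ (p, DC) into two branches, one in state q and one in state p. Two empty sibling branches (ε), both in state q, are collapsed into a single branch in state q.]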
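
The split and collapse operations on the stack tree can be sketched as follows. This is an illustrative Python encoding, not the paper's formal definition: the class `Branch`, its fields, and the functions `split` and `collapse` are hypothetical names, and "has read the same portion of the input" is modeled by an integer input position.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Branch:
    """One node of a stack tree (illustrative encoding)."""
    stack: str                     # stack word of this segment, top symbol first
    state: Optional[str] = None    # set on leaves only
    pos: Optional[int] = None      # amount of input read so far (leaves only)
    children: List["Branch"] = field(default_factory=list)


def split(leaf, conjunction, new_pos):
    """Apply a transition with several pairs to a leaf; its top symbol is popped."""
    rest = leaf.stack[1:]          # what remains below the popped top symbol
    leaf.children = [Branch(stack=word, state=q, pos=new_pos)
                     for q, word in conjunction]
    leaf.stack, leaf.state, leaf.pos = rest, None, None


def collapse(node):
    """Merge empty, synchronized sibling leaves back into their parent."""
    kids = node.children
    if not kids or any(k.children or k.stack for k in kids):
        return False               # some sibling is not an empty leaf
    if len({(k.state, k.pos) for k in kids}) != 1:
        return False               # siblings are not synchronized
    node.state, node.pos, node.children = kids[0].state, kids[0].pos, []
    return True
```

Starting from `Branch(stack="AB", state="q", pos=0)`, calling `split` with the conjunction `[("q", "A"), ("p", "DC")]` leaves B on the parent and creates two child branches; `collapse` succeeds only once both children are empty and agree on state and input position.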

SAPDA Definition

An SAPDA is a sextuple A = (Q, Σ, Γ, q0, δ, ⊥), where Q is the set of states, Σ the set of terminals, Γ the set of stack symbols, q0 the initial state, δ the transition function, and ⊥ the initial stack symbol.

Transition function: δ(q, σ, X) ⊆ {(q1, α1) ∧ ⋯ ∧ (qn, αn) | qi ∊ Q, αi ∊ Γ*, n ∊ ℕ}

SAPDA Computation and Language

Computation:

At each step, a transition is applied to one stack branch.

An empty stack branch cannot be selected.

Synchronous empty sibling branches, i.e., branches that have the same state and remaining input, are collapsed.

Initial configuration: a single branch holding ⊥ in state q0. Accepting configuration: a single empty branch (ε) in some state q.

Language: L(A) = {w ∊ Σ* | ∃ q ∊ Q, (q0, w, ⊥) ⊢* (q, ε, ε)}

Note: all branches must be emptied, i.e., must “agree”.

SAPDA and Conjunctive Grammars

Equivalence Results

Equivalence of SAPDA and CG

Theorem 1. A language is generated by a CG if and only if it is accepted by an SAPDA.

The equivalence is analogous to the classical equivalence between CFG and PDA.

The proofs of the equivalence are extended versions of the classical proofs.

Corollary: Single-state SAPDA and multi-state SAPDA are equivalent (like ordinary PDA).

Deterministic SAPDA and LR(0) CG

Linear-time Implementation of DSAPDA and LR(0) Parsing

Linear-time LR(0) Parsing

Deterministic Context-free Languages

DCFL are a sub-family of context-free languages, accepted by Deterministic PDA (DPDA).

LR(k) languages were introduced by Knuth and shown to be equivalent to DPDA [Knuth, 1965].

Knuth developed a linear-time parsing algorithm for DCFL, which is the basis of compilation theory.

Knuth also proved that LR(0) grammars are equivalent to DPDA that accept by empty stack.

Deterministic SAPDA

Deterministic SAPDA are defined according to the classical notion of a deterministic model.

That is, an SAPDA is deterministic if it has only one possible step from any given configuration.

Remark: A deterministic SAPDA has exactly one computation on any given input word w.
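
A transition table can be tested for the classical DPDA-style condition as sketched below in Python. Whether this syntactic test coincides with the formal definition of a deterministic SAPDA is an assumption; the table encoding and the name `is_deterministic` are hypothetical.

```python
def is_deterministic(delta):
    """Classical DPDA-style test on a transition table (an assumed criterion).

    `delta` maps (state, input symbol or "" for an ε-move, top stack symbol)
    to a set of conjunctions of (state, stack word) pairs.
    """
    for (q, sym, top), targets in delta.items():
        if len(targets) > 1:
            return False  # two different moves for the same triple
        if sym != "" and targets and delta.get((q, "", top)):
            return False  # an ε-move competes with a reading move
    return True
```

A table with a single conjunction per entry and no competing ε-moves passes the test; adding a second conjunction to any entry, or an ε-move next to a reading move for the same state and stack symbol, makes it fail.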

LR(0) Conjunctive Grammars

To define LR(0) conjunctive grammars, we extend the notions of viable prefixes, valid items ([A → ]), etc.

A conjunctive grammar is LR(0) if a conflict-free item automaton can be constructed for it.

Main issue: How to build the item automaton so that it supports conjunctive rules...

Solution: Read the paper

Equivalence Results

Theorem 2. A language is generated by an LR(0) CG if and only if it is accepted by a DSAPDA.

The equivalence is analogous to the classical equivalence between LR(0) CFG and DPDA.

The proofs of the equivalence are extended versions of the classical proofs.

The DSAPDA Membership Problem

By using a naive implementation of a DSAPDA, we have an exponential-time solution for the membership problem.

Each branch is linear, but there can be an exponential number of branches.

Do we need so many branches?

DSAPDA Efficient Implementation

There are at most M = |Q| · |Γ| different combinations for branch head configurations.

Thus, at any given time, we need at most M heads.
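
The counting argument behind the bound M can be illustrated with a short Python sketch. It only shows the pigeonhole step and says nothing about how the actual implementation shares heads; the sets `Q` and `GAMMA`, the reading of a head configuration as a (state, top stack symbol) pair, and the function `distinct_heads` are assumptions for illustration.

```python
from itertools import product

Q = {"q0", "q1", "q2"}       # hypothetical state set
GAMMA = {"X", "Y", "⊥"}      # hypothetical stack alphabet

# Every head configuration is assumed to be a (state, top stack symbol) pair,
# so there are |Q| * |Γ| of them in total.
ALL_HEADS = set(product(Q, GAMMA))
M = len(Q) * len(GAMMA)


def distinct_heads(branches):
    """Reduce a list of (state, top stack symbol) heads to the distinct ones."""
    return set(branches)
```

However many branches the stack tree has, `distinct_heads` can never return more than M elements, because every head it returns is a member of `ALL_HEADS`.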

Linear-time Parsing

Based on the efficient implementation of a DSAPDA, we can build an LR(0) parser.

The parser runs in linear time for the boolean closure of LR(0) Conjunctive Languages.

This class strictly includes the boolean closure of classical LR(0) languages.

{ a^{i_1} b a^{i_2} b^2 ⋯ a^{i_n} b^n $ b a^{i_1} b a^{i_2} ⋯ b a^{i_n} $ | n ≥ 1, i_j ≥ 1 }

That is, we have linear-time parsing for a wider class of languages.

Summary

Conjunctive Languages are interesting because:

They are a strong, rich class of languages.

They are polynomially parsable.

Their models of computation are intuitive and easy to understand; they highly resemble classical CFG and PDA.

SAPDA are the first automaton model presented for Conjunctive Languages.

They are a natural extension of PDA.

They lend new intuition on Conjunctive Languages.

They are the basis of a linear parsing algorithm for a wide class of languages.

Thank you.
