Context-free Grammars - Natural & Programming Languages
Linguistic Roots ALGOL Parsing Other Syntax Models
Context-free Grammars: Natural & Programming Languages
Laureates’ Visit
July 19, 2013
Example of a Programming Language: Go
- designed at Google (version 1.0 released in 2012)
- documentation: specifies the syntax
- uses a context-free grammar
- tool shipped with Go: YACC
- generates a parser from a grammar
- allows creating, editing, and adapting the syntax of programming languages
Pāṇini (∼350 BC): Aṣṭādhyāyī
- Sanskrit grammar
- about 4000 rules
- formal rules:
  A → B / C _ D
  “rewrite A to B in the context C _ D”
- auxiliary symbols
Chomsky (1956): Three Models for the Description of Language
1. finite-state automata
2. phrase-structure grammars
3. transformational grammars
N. Chomsky
Modeling
- a language is a set of sentences
- syntax vs. semantics:
  The child eats a tomato.
  A tomato eats the child.
  *A tomato the child eats.
- competence vs. performance:
  The child eats a nice tomato.
  The child eats a nice round tomato.
  The child eats a nice red round tomato.
  ...
Constituents Analysis
[[The child] [eats [a tomato]]].
[[The child] [eats [a [nice tomato]]]].
Constituents Analysis (ctd.)
[P [NP [det The] [AP [n child]]]
   [VP [v eats]
       [NP [det a] [AP [adj nice] [AP [n tomato]]]]]]
Context-free Grammars
Special case of phrase-structure grammars: empty contexts
P → NP VP
NP → det AP
VP → v NP
AP → adj AP | n
det → The | a
n → child | tomato
v → eats
adj → nice | red | round
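To make the rewriting reading of these rules concrete, here is a minimal Python sketch of a leftmost derivation. The dictionary encoding and the helper name `leftmost_derivation` are illustrative assumptions, not part of the talk; only the rules themselves come from the slide.

```python
# Grammar from the slide: each nonterminal maps to its list of alternatives.
GRAMMAR = {
    "P": [["NP", "VP"]],
    "NP": [["det", "AP"]],
    "VP": [["v", "NP"]],
    "AP": [["adj", "AP"], ["n"]],
    "det": [["The"], ["a"]],
    "n": [["child"], ["tomato"]],
    "v": [["eats"]],
    "adj": [["nice"], ["red"], ["round"]],
}


def leftmost_derivation(choices, start="P"):
    """Rewrite the leftmost nonterminal once per entry of `choices`.

    Each entry is the index of the alternative to apply. Returns the list
    of sentential forms, from the start symbol to the final form.
    """
    form = [start]
    steps = [list(form)]
    for choice in choices:
        # Position of the leftmost symbol that still has rules.
        i = next(k for k, symbol in enumerate(form) if symbol in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(list(form))
    return steps


# One derivation of "The child eats a tomato" (hypothetical choice sequence).
steps = leftmost_derivation([0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1])
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form; the context is empty in every step, which is exactly what makes the grammar context-free.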
Backus (1959); Naur (1960): Algol 60
- ALGOrithmic Language
- standard syntax
〈statement〉 → 〈unconditional statement〉 | 〈conditional statement〉
〈unconditional statement〉 → 〈for statement〉
〈conditional statement〉 → 〈if statement〉 | 〈if statement〉 else 〈statement〉
〈if statement〉 → 〈if clause〉 〈unconditional statement〉
〈if clause〉 → if 〈boolean expression〉 then
J. Backus
Ginsburg and Rice (1962): Two families of languages related to ALGOL
- connection between Algol and Chomsky’s work
- multidisciplinary research:
  - linguistics
  - programming languages
  - theoretical computer science (Chomsky, 1959; Bar-Hillel et al., 1961; Chomsky and Schützenberger, 1963, ...)
Y. Bar-Hillel
M.P. Schützenberger
Pushdown Automata
Yngve (1960); Oettinger (1961); Chomsky (1962)
- operational model, easy implementation
- expressivity equivalent to that of context-free grammars
- idea of parsing: generate a pushdown automaton from a grammar
Pushdown Automata (ctd.)
(q, ε, ⊥, ε, qf)
(q, ε, P, NP VP, q)
(q, ε, NP, det AP, q)
...
(q, ε, det, The, q)
(q, ε, det, a, q)
(q, The, The, ε, q)
(q, a, a, ε, q)
...
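The construction suggested by these transitions can be sketched in Python as follows. Each tuple reads (state, input read, symbol popped, symbols pushed, new state). The slide does not state the initial configuration, so the sketch assumes the automaton starts in state q with P on top of ⊥; the breadth-first search over configurations and the names `transitions` and `accepts` are likewise illustrative choices.

```python
from collections import deque

RULES = {
    "P": [["NP", "VP"]],
    "NP": [["det", "AP"]],
    "VP": [["v", "NP"]],
    "AP": [["adj", "AP"], ["n"]],
    "det": [["The"], ["a"]],
    "n": [["child"], ["tomato"]],
    "v": [["eats"]],
    "adj": [["nice"], ["red"], ["round"]],
}
BOTTOM = "⊥"


def transitions():
    """Build (state, read, pop, push, new_state) tuples; read=None means ε."""
    delta = [("q", None, BOTTOM, (), "qf")]  # pop the bottom marker, then stop
    terminals = set()
    for lhs, alternatives in RULES.items():
        for rhs in alternatives:
            delta.append(("q", None, lhs, tuple(rhs), "q"))  # expand a rule
            terminals.update(s for s in rhs if s not in RULES)
    for word in sorted(terminals):
        delta.append(("q", word, word, (), "q"))  # match an input word
    return delta


def accepts(words):
    """Explore all runs; accept in state qf once the input is consumed."""
    delta = transitions()
    start = ("q", 0, ("P", BOTTOM))  # assumed initial configuration
    seen, todo = {start}, deque([start])
    while todo:
        state, pos, stack = todo.popleft()
        if state == "qf" and pos == len(words):
            return True
        if not stack:
            continue
        for source, read, pop, push, target in delta:
            if source != state or pop != stack[0]:
                continue
            if read is None:
                nxt = (target, pos, push + stack[1:])
            elif pos < len(words) and words[pos] == read:
                nxt = (target, pos + 1, push + stack[1:])
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return False


print(accepts("The child eats a nice round tomato".split()))
print(accepts("a tomato The child eats".split()))
```

The search terminates on this grammar because no rule is left-recursive, so the stack can only grow after an input word has been matched. The automaton is nondeterministic, which is why the sketch explores every run rather than following a single one.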
Issues
- Floyd (1962b): Algol 60 is not context-free:
    begin
      real x;
      y := 3
    end
  is only correct if the two identifiers x and y are the same.
- separation into lexical analysis, parsing, and semantic analysis
R.W. Floyd
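The three-phase separation can be illustrated on Floyd's example with a toy Python sketch. The token pattern and the tiny block grammar (`begin`, zero or more `real` declarations, assignments of integer literals, `end`) are assumptions made for this illustration and cover only a fragment, not Algol 60.

```python
import re

TOKEN = re.compile(r"\s*(:=|;|[A-Za-z]\w*|\d+)")
KEYWORDS = {"begin", "end", "real"}


def lex(text):
    """Lexical analysis: cut the text into tokens with a regular expression."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parsing: check the context-free shape of the toy block.

    Returns (declared names, assigned names); no name checking happens here.
    """
    stream = iter(tokens + ["<eof>"])
    tok = next(stream)

    def advance():
        nonlocal tok
        previous, tok = tok, next(stream)
        return previous

    def expect(value):
        if tok != value:
            raise SyntaxError(f"expected {value!r}, found {tok!r}")
        advance()

    def identifier():
        if tok in KEYWORDS or not tok[0].isalpha():
            raise SyntaxError(f"expected an identifier, found {tok!r}")
        return advance()

    declared, assigned = [], []
    expect("begin")
    while tok == "real":
        advance()
        declared.append(identifier())
        expect(";")
    while True:
        assigned.append(identifier())
        expect(":=")
        if not tok.isdigit():
            raise SyntaxError(f"expected a number, found {tok!r}")
        advance()
        if tok != ";":
            break
        advance()
    expect("end")
    if tok != "<eof>":
        raise SyntaxError("trailing input after 'end'")
    return declared, assigned


def undeclared(declared, assigned):
    """Semantic analysis: names that are assigned but never declared."""
    return [name for name in assigned if name not in declared]


source = "begin real x; y := 3 end"
print(undeclared(*parse(lex(source))))
```

The parser accepts Floyd's program because its shape is context-free; only the last phase reports that `y` was never declared.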
- Cantor (1962); Floyd (1962a): Algol 60 is ambiguous: several possible analyses for some programs
- inherently ambiguous languages (Parikh, 1966; Ginsburg and Ullian, 1966)
- undecidable properties
R. Parikh
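What "several possible analyses" means can be shown by counting parse trees. The Python sketch below uses the textbook dangling-else grammar S → if b then S | if b then S else S | s, chosen purely for illustration; it is neither the Algol 60 grammar nor the specific ambiguity studied by Cantor and Floyd, and the function name `count_parses` is hypothetical.

```python
from functools import lru_cache


def count_parses(tokens):
    """Count the parse trees of `tokens` for the illustrative grammar
    S -> if b then S | if b then S else S | s
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def stmt(i, j):
        # Number of distinct trees deriving tokens[i:j] from S.
        total = 0
        if j - i == 1 and tokens[i] == "s":
            total += 1
        if j - i >= 4 and tokens[i:i + 3] == ("if", "b", "then"):
            total += stmt(i + 3, j)  # S -> if b then S
            for k in range(i + 4, j - 1):  # S -> if b then S else S
                if tokens[k] == "else":
                    total += stmt(i + 3, k) * stmt(k + 1, j)
        return total

    return stmt(0, len(tokens))


print(count_parses("if b then s".split()))
print(count_parses("if b then if b then s else s".split()))
```

The second sentence has two trees, depending on which `if` the `else` attaches to. A grammar is ambiguous as soon as one sentence has more than one tree.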
- the first parsers impose very stringent restrictions on grammars (Irons, 1961)
- ideally: deterministic pushdown automata (Ginsburg and Greibach, 1966), which cannot be derived from every grammar
- undecidable properties
S. Greibach
... and Answers
- parser generators for larger and larger classes of grammars
- Knuth (1965): LR parsing for all the deterministic languages
- DeRemer (1969): simplifications (SLR & LALR)
- YACC (Johnson, 1975): LALR(1) parser generator
D.E. Knuth
Today
All the mainstream programming languages are shipped with
- a context-free grammar that specifies their syntax
- a parser generator (most likely a YACC variant) for writing parsers for new languages
Syntax Models
- context-free grammars (rewriting systems)
- pushdown automata (transition systems)
- algebraic equations (equation systems)
- categorial grammars (proof systems)
- dynamic logic on trees (model theory)
Algebraic Equations
(Ginsburg and Rice, 1962; Chomsky and Schützenberger, 1963)
Minimal solutions of a system
P = NP · VP
NP = det · AP
VP = v · NP
AP = adj · AP ∪ n
det = {The} ∪ {a}
n = {child} ∪ {tomato}
v = {eats}
adj = {nice} ∪ {round} ∪ {red}
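The minimal solution of this system can be approximated by fixed-point iteration: start from empty languages and re-apply the equations until nothing changes. Since AP is infinite, the Python sketch below keeps only sentences of at most `max_len` words; that bound, the tuple-of-words encoding, and the names `concat` and `least_solution` are assumptions of this illustration.

```python
def concat(xs, ys, max_len):
    """Concatenation of two languages, truncated to bounded-length results."""
    return {x + y for x in xs for y in ys if len(x + y) <= max_len}


def least_solution(max_len):
    """Iterate the equations from empty sets until a fixed point is reached."""
    names = ["P", "NP", "VP", "AP", "det", "n", "v", "adj"]
    sol = {name: set() for name in names}
    while True:
        new = {
            "P": concat(sol["NP"], sol["VP"], max_len),
            "NP": concat(sol["det"], sol["AP"], max_len),
            "VP": concat(sol["v"], sol["NP"], max_len),
            "AP": concat(sol["adj"], sol["AP"], max_len) | sol["n"],
            "det": {("The",), ("a",)},
            "n": {("child",), ("tomato",)},
            "v": {("eats",)},
            "adj": {("nice",), ("round",), ("red",)},
        }
        if new == sol:
            return sol
        sol = new


solution = least_solution(5)
for sentence in sorted(solution["P"]):
    print(" ".join(sentence))
```

With a bound of five words the component for P contains exactly the sixteen sentences of the form det n eats det n. Raising the bound adds the sentences with adjectives.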
Categorial Grammars
(Bar-Hillel, 1953; Lambek, 1958)
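Categories built using left and right quotients over a finite set of symbols $A$:
$$\gamma ::= A \mid \gamma_1\backslash\gamma_2 \mid \gamma_1/\gamma_2 \quad \text{(categories)}$$
Deduction rules:
$$
\frac{}{w \vdash \gamma}\ \text{Lexicon}
\qquad
\frac{w_1 \vdash \gamma_1 \quad w_2 \vdash \gamma_1\backslash\gamma_2}{w_1 \cdot w_2 \vdash \gamma_2}\ \backslash
\qquad
\frac{w_1 \vdash \gamma_2/\gamma_1 \quad w_2 \vdash \gamma_1}{w_1 \cdot w_2 \vdash \gamma_2}\ /
$$
J. Lambek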
Proofs
Example
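$$
\frac{
  \dfrac{\text{The} \vdash NP/n \quad \text{child} \vdash n}{\text{The child} \vdash NP}\ /
  \qquad
  \dfrac{\text{eats} \vdash (NP\backslash P)/NP \quad \dfrac{\text{a} \vdash NP/n \quad \text{tomato} \vdash n}{\text{a tomato} \vdash NP}\ /}{\text{eats a tomato} \vdash NP\backslash P}\ /
}{\text{The child eats a tomato} \vdash P}\ \backslash
$$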
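The two application rules are enough to build a small chart-based recognizer, sketched here in Python. Under the rule convention of the previous slide, where γ1\γ2 looks for a γ1 on its left and yields γ2, the verb receives the category (NP\P)/NP. The tuple encoding of categories and the names `over`, `under`, `combine`, and `categories` are assumptions of this sketch.

```python
def over(result, arg):
    """Category result/arg: expects an `arg` on its right."""
    return ("/", result, arg)


def under(arg, result):
    """Category arg\\result: expects an `arg` on its left."""
    return ("\\", arg, result)


LEXICON = {
    "The": {over("NP", "n")},
    "a": {over("NP", "n")},
    "child": {"n"},
    "tomato": {"n"},
    "eats": {over(under("NP", "P"), "NP")},
}


def combine(left, right):
    """Categories of w1·w2 given one category for w1 and one for w2."""
    out = set()
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        out.add(left[1])  # rule (/)
    if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
        out.add(right[2])  # rule (\)
    return out


def categories(words):
    """All categories derivable for the whole sequence (CYK-style chart)."""
    n = len(words)
    if n == 0:
        return set()
    table = {(i, i + 1): set(LEXICON.get(w, ())) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                for left in table[i, k]:
                    for right in table[k, j]:
                        cell |= combine(left, right)
            table[i, j] = cell
    return table[0, n]


print(categories("The child eats a tomato".split()))
print(categories("The child eats".split()))
```

A sentence is accepted when P is among the categories derived for the full sequence. The chart tries every split point, so it finds the proof regardless of bracketing.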
Logics on Trees
(Blackburn et al., 1993; Afanasiev et al., 2005)
Modal logic on a set of atomic propositions p
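$$\varphi ::= \top \mid p \mid \neg\varphi \mid \varphi_1 \wedge \varphi_2 \mid \langle\pi\rangle\varphi \quad \text{(formulæ)}$$
$$\pi ::= {\rightarrow} \mid {\leftarrow} \mid {\downarrow} \mid {\uparrow} \mid \pi^* \quad \text{(relations)}$$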
P. Blackburn
Models
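Satisfaction in an ordered finite labeled tree $t$ at a node $n$:
$$
\begin{aligned}
t,n &\models \top \\
t,n &\models p && \text{if the label of } n \text{ is } p \\
t,n &\models \neg\varphi && \text{if } t,n \not\models \varphi \\
t,n &\models \varphi_1 \wedge \varphi_2 && \text{if } t,n \models \varphi_1 \text{ and } t,n \models \varphi_2 \\
t,n &\models \langle\pi\rangle\varphi && \text{if } \exists n',\ n \mathrel{\pi} n' \text{ and } t,n' \models \varphi
\end{aligned}
$$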
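These satisfaction clauses translate almost line by line into a recursive evaluator. The Python sketch below assumes that → and ← relate a node to its immediate right and left sibling, that ↓ and ↑ relate it to its children and parent, and that π* is the reflexive-transitive closure of π; the slide does not spell these out. The tuple encoding of formulæ and the names `Node`, `successors`, and `holds` are also illustrative.

```python
TOP = ("top",)


class Node:
    """Node of an ordered finite labeled tree."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def successors(node, rel):
    """Nodes n' with n rel n'; rel is 'down', 'up', 'left', 'right',
    or ('star', rel) for the reflexive-transitive closure."""
    if isinstance(rel, tuple):
        seen, todo = {node}, [node]
        while todo:
            for nxt in successors(todo.pop(), rel[1]):
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        return seen
    if rel == "down":
        return node.children
    if rel == "up":
        return [node.parent] if node.parent is not None else []
    siblings = node.parent.children if node.parent is not None else [node]
    i = siblings.index(node) + (1 if rel == "right" else -1)
    return [siblings[i]] if 0 <= i < len(siblings) else []


def holds(node, phi):
    """Decide t,n |= phi. Formulæ: TOP, a label string,
    ('not', f), ('and', f, g), or ('dia', rel, f)."""
    if phi == TOP:
        return True
    if isinstance(phi, str):
        return node.label == phi
    if phi[0] == "not":
        return not holds(node, phi[1])
    if phi[0] == "and":
        return holds(node, phi[1]) and holds(node, phi[2])
    if phi[0] == "dia":
        return any(holds(m, phi[2]) for m in successors(node, phi[1]))
    raise ValueError(f"unknown formula: {phi!r}")


tree = Node(
    "P",
    Node("NP", Node("det", Node("The")), Node("AP", Node("n", Node("child")))),
    Node("VP", Node("v", Node("eats")),
         Node("NP", Node("det", Node("a")),
              Node("AP", Node("n", Node("tomato"))))),
)
# <down>(NP and <right>VP): some child is an NP immediately followed by a VP.
print(holds(tree, ("dia", "down", ("and", "NP", ("dia", "right", "VP")))))
```

Evaluation takes a tree and one node, mirroring the pair t,n of the definition. Implication and the box modality are not primitive, so they would be written with `not` and `and`.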
Formulæ
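Example
$$
\begin{aligned}
P \wedge [\downarrow^*][\rightarrow^*]\Big(
  & \bigvee_{X \in \Sigma \uplus N} \big(X \wedge \bigwedge_{Y \neq X} \neg Y\big) \\
  \wedge\ & (\neg\langle\downarrow\rangle\top) \equiv \big(\bigvee_{a \in \Sigma} a\big)
  \ \wedge\ (\langle\downarrow\rangle\top) \equiv \big(\bigvee_{A \in N} A\big) \\
  \wedge\ & P \supset \langle\downarrow\rangle(NP \wedge \langle\rightarrow\rangle VP \wedge \neg\langle\leftarrow\rangle\top \wedge \langle\rightarrow\rangle\neg\langle\rightarrow\rangle\top) \\
  \wedge\ & AP \supset \langle\downarrow\rangle(adj \wedge \langle\rightarrow\rangle AP \wedge \neg\langle\leftarrow\rangle\top \wedge \langle\rightarrow\rangle\neg\langle\rightarrow\rangle\top)
    \vee \langle\downarrow\rangle(n \wedge \neg\langle\leftarrow\rangle\top \wedge \neg\langle\rightarrow\rangle\top) \\
  \wedge\ & det \supset \langle\downarrow\rangle(\text{The} \wedge \neg\langle\leftarrow\rangle\top \wedge \neg\langle\rightarrow\rangle\top)
    \vee \langle\downarrow\rangle(\text{a} \wedge \neg\langle\leftarrow\rangle\top \wedge \neg\langle\rightarrow\rangle\top) \\
  \wedge\ & \dots \Big)
\end{aligned}
$$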
References
Afanasiev, L., Blackburn, P., Dimitriou, I., Gaiffe, B., Goris, E., Marx, M., and de Rijke, M., 2005. PDL for ordered trees. Journal of Applied Non-Classical Logics, 15(2):115–135. doi:10.3166/jancl.15.115-135.
Aho, A.V., Johnson, S.C., and Ullman, J.D., 1975. Deterministic parsing of ambiguous grammars. Communications of the ACM, 18(8):441–452. doi:10.1145/360933.360969.
Backus, J.W., 1959. The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM Conference. In IFIP Congress, pages 125–131.
Bar-Hillel, Y., Perles, M., and Shamir, E., 1961. On formal properties of simple phrase-structure grammars. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung, 14:143–172.
Bar-Hillel, Y., 1953. A quasi-arithmetical notation for syntactic description. Language, 29(1):47–58. doi:10.2307/410452.
Blackburn, P., Gardent, C., and Meyer-Viol, W., 1993. Talking about trees. In EACL ’93, pages 21–29. ACL Press. doi:10.3115/976744.976748.
Cantor, D.G., 1962. On the ambiguity problem of Backus systems. Journal of the ACM, 9(4):477–479. doi:10.1145/321138.321145.
Chomsky, N., 1956. Three models for the description of language. IRE Transactions on Information Theory, 2(3):113–124. doi:10.1109/TIT.1956.1056813.
Chomsky, N., 1959. On certain formal properties of grammars. Information and Control, 2(2):137–167. doi:10.1016/S0019-9958(59)90362-6.
Chomsky, N., 1962. Context-free grammars and pushdown storage. Quarterly Progress Report 65, Research Laboratory of Electronics, M.I.T.
Chomsky, N. and Schützenberger, M.P., 1963. The algebraic theory of context-free languages. In Braffort, P. and Hirschberg, D., editors, Computer Programming and Formal Systems, volume 35 of Studies in Logic, pages 118–161. North-Holland Publishing. doi:10.1016/S0049-237X(08)72023-8.
DeRemer, F.L., 1969. Practical Translators for LR(k) Languages. PhD thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts. http://www.lcs.mit.edu/publications/pubs/pdf/MIT-LCS-TR-065.pdf.
Earley, J., 1975. Ambiguity and precedence in syntax description. Acta Informatica, 4(2):183–192. doi:10.1007/BF00288747.
Floyd, R.W., 1962a. On ambiguity in phrase structure languages. Communications of the ACM, 5(10):526. doi:10.1145/368959.368993.
Floyd, R.W., 1962b. On the nonexistence of a phrase structure grammar for ALGOL 60. Communications of the ACM, 5(9):483–484. doi:10.1145/368834.368898.
Ginsburg, S. and Rice, H.G., 1962. Two families of languages related to ALGOL. Journal of the ACM, 9(3):350–371. doi:10.1145/321127.321132.
Ginsburg, S. and Greibach, S., 1966. Deterministic context-free languages. Information and Control, 9(6):620–648. doi:10.1016/S0019-9958(66)80019-0.
Ginsburg, S. and Ullian, J., 1966. Ambiguity in context free languages. Journal of the ACM, 13(1):62–89. doi:10.1145/321312.321318.
Irons, E.T., 1961. A syntax directed compiler for ALGOL 60. Communications of the ACM, 4(1):51–55. doi:10.1145/366062.366083.
Johnson, S.C., 1975. YACC: yet another compiler compiler. Computing science technical report 32, AT&T Bell Laboratories, Murray Hill, New Jersey.
Knuth, D.E., 1965. On the translation of languages from left to right. Information and Control, 8(6):607–639. doi:10.1016/S0019-9958(65)90426-2.
Lambek, J., 1958. The mathematics of sentence structure. American Mathematical Monthly, 65(3):154–170. doi:10.2307/2310058.
Naur, P., editor, 1960. Report on the algorithmic language ALGOL 60. Communications of the ACM, 3(5):299–314. doi:10.1145/367236.367262.
Oettinger, A.G., 1961. Automatic syntactic analysis and the pushdown store. In Structure of Language and its Mathematical Aspects, volume 12 of Proc. of Symposia in Applied Math., pages 104–129. AMS.
Parikh, R.J., 1966. On context-free languages. Journal of the ACM, 13(4):570–581. doi:10.1145/321356.321364.
Yngve, V.H., 1960. A model and an hypothesis for language structure. Proceedings of the American Philosophical Society, 104(5):444–466.