Descriptive Complexity: An overview of the field, key results, techniques, and applications.

Ryan Flannery

http://cs.uc.edu/~flannert

22 May 2009

Abstract

In 1974 Ronald Fagin proved that the problems in the complexity class NP can be characterized by the form of logic necessary to state them. That is, given a problem stated formally in logic, one can determine whether the problem is in NP based solely on the form of the logic used to state it. Since this result, many other key complexity classes, P among them, have been shown to be “equivalent” to various forms of logic: a problem X is solvable within a given resource bound if and only if X can be stated in a specific logic. This field of logic and computer science is called Descriptive Complexity and has applications in database theory, model checking and logic programming. This presentation covers the field of descriptive complexity, its key results, techniques used within the field, and various areas of application.

Contents

1 Introduction: What is Descriptive Complexity?
1.1 Some Key Results
1.2 Why These are Important: Easy Complexity Analysis
1.3 An Important Note: The Finite Assumption

2 Origins of Descriptive Complexity
2.1 Origins of Finite Model Theory

3 Applications of Descriptive Complexity
3.1 Complexity Theory
3.2 Database Theory
3.3 Model Checking & Formal Verification
3.4 Logic Programming

4 Current Research into Descriptive Complexity
4.1 Relevant Research Groups
4.2 Relevant Journals & Conferences

5 Descriptive Complexity: In (some) Detail

6 Descriptive Complexity: Some Key Notions
6.1 First-Order Vocabularies & Models
6.2 Definability
6.3 Undefinability

7 Descriptive Complexity: Some Key Techniques
7.1 Weird Inductions
7.2 Ehrenfeucht-Fraïssé (EF) Games

8 Descriptive Complexity: Key Results

References


1 Introduction: What is Descriptive Complexity?

• Descriptive Complexity (DC) is a field of mathematical logic, finite model theory, and computer science

• DC did not arise in response to a specific problem, but rather in response to a strange theorem by Ronald Fagin in 1974 [2].

• DC provides another method of classifying the time/space complexity of problems. . .

• . . . based on the logical “form” of the problem statement itself

• It is an active area of research

1.1 Some Key Results

• Given a problem P . . .

– P can be stated in pure First-Order Logic iff P lies in the logarithmic-time hierarchy (LH)

– P can be stated in First-Order Logic extended with a Least-Fixed-Point operator iff P can be solved in polynomial time

– P can be stated in Existential Second-Order Logic iff P can be solved in nondeterministic polynomial time

• Symbolically,

FO ⇔ LH        FO+LFP ⇔ P        ∃-SO ⇔ NP

• The key results of DC are that the time/space requirements to solve a problem can be understood by the “richness” of the logic required to state the problem

• This makes intuitive sense. . .

– If a problem can be stated using a very simple language, it’s probably a very simple problem to solve

– If a problem requires a very complex language to even state, then it’s probably a very complex problem to solve

• The results of DC show that this intuition is correct

• In fact, the relationship between the logic required to state a problem and the complexity of the problem is tight

• Most major complexity classes can be characterized using the tools of DC


1.2 Why These are Important: Easy Complexity Analysis

• Remarkably, they characterize complexity classes with no mention of any model of computation! (Turing Machines, etc.)

• Normal Complexity Analysis: State problem, analyze it to try and deduce the complexity

– If you’re lucky, you’ll find an easy reduction from some known problem to yours

– Otherwise, life may become. . . difficult

• DC Analysis: State the problem formally, et voilà

1.3 An Important Note: The Finite Assumption

• Anyone familiar with logic would look at the above results and say. . .

• IF FO+LFP ⇔ P and ∃-SO ⇔ NP, THEN clearly P ≠ NP!

• The reason why this is not immediately known is the following. . .

• In DC we always interpret logics over finite structures only

• We’ll cover this more precisely later

• Just know that DC and the study of finite structures, called Finite Model Theory (FMT), are closely related

• It’s important to note that with the finite assumption, we always assume the structures we are talking about are (simply) finite

• That is, when we talk about graphs, we only talk about graphs with finitely many nodes

• NOTE: The finite assumption does not say that we only reason about structures with “exactly n elements” or “at most n elements” for some fixed n! It simply states that the structures we reason about are “not infinite”.

2 Origins of Descriptive Complexity

• DC was “created” in 1974 by Ronald Fagin [2], who proved the result

∃-SO ⇔ NP (1)

• It was considered remarkable since it characterizes the NP class of problems without using a model of computation (Turing machines, etc.)

• It was not until 1980 that further results appeared, characterizing other complexity classes as fragments of various logics

• Throughout the 1980s, Neil Immerman, Moshe Vardi, Erich Grädel, and Phokion Kolaitis established most of the major results characterizing complexity classes (for sequential machines) as fragments of various logics (see [3] and [5] for surveys of all the results)


2.1 Origins of Finite Model Theory

• As mentioned earlier, FMT is central to DC

• The first result in FMT was a 1950 theorem by Trakhtenbrot, who proved that validity over finite models is not recursively enumerable [9]

• That is, FO is not complete when interpreted over finite structures!

• Until Fagin’s result in 1974, little further work was done with finite structures

• When something as simple as FO is incomplete, why study this further!?

• Since Fagin’s 1974 result, FMT became, and still is, an active area of research

• It was quickly realized that the vast majority of techniques from regular model theory fail in finite model theory

– Diagonalization: How do you diagonalize over a finite set of unknown size?

– Completeness: A consequence of Trakhtenbrot’s result

– Compactness: Ouch!

– and many more

• This is why “Finite Model Theory” is not just a single chapter in model theory texts!

3 Applications of Descriptive Complexity

• Since its inception, DC has become increasingly used in

– Complexity Theory

– Database Theory

– Model Checking & Formal Verification

– Logic Programming

3.1 Complexity Theory

• The applicability here is obvious

• For sequential machines, all major complexity classes can be easily characterized using DC

• For parallel and distributed settings, some complexity classes have been characterized in DC.

• . . . this is a very active area of research in the field

• Survey of uses in this area in [5]


3.2 Database Theory

• DC is used to identify query languages where

– it’s easier to classify/identify the complexity of answering queries

– it’s easier to state queries in an optimal fashion

• Survey of uses in this area in [7] and [5]

3.3 Model Checking & Formal Verification

• DC is now a standard tool used in most active areas of model checking

• DC results can be easily translated to transition logics where, once you state the property you want to check/verify, the complexity of that operation is immediately evident

• Knowledge of the complexity of the components of a specification aids in model search/construction (now part of standard techniques)

• Survey of uses in this area in [7] and [5]

3.4 Logic Programming

• Well, DC hasn’t been used here much yet. . .

• But I’m hoping to change this

4 Current Research into Descriptive Complexity

• Prominent People include. . .

– Neil Immerman (University of Massachusetts at Amherst)

– Moshe Vardi (Rice University)

– Phokion Kolaitis (IBM Almaden Research Center)

– Erich Grädel (RWTH Aachen University, Germany)

– Leonid Libkin (University of Edinburgh, School of Informatics)

4.1 Relevant Research Groups

• University of Massachusetts, Database and Information Management Lab, and Theory of Computation Lab

• Rice University, Ken Kennedy Institute for Information Technology

• University of Edinburgh, Lab. for Foundations of Computer Science

• Durham University, Department of Computer Science

• IBM Almaden Research Center, Computer Science Principles and Methodologies


4.2 Relevant Journals & Conferences

• Association for Symbolic Logic (Journal, Bulletin, and a Review)

• Logical Methods in Computer Science (Journal)

• Logic In Computer Science (LICS) annual symposium

• Federated Logic Conference (FLoC)

• Society for Industrial and Applied Mathematics (SIAM) Journal on Computing (SICOMP)

5 Descriptive Complexity: In (some) Detail

The talk will proceed as follows:

1. A discussion of the Key Notions in DC

2. An introduction to some of the Key Techniques used in DC

3. An overview of Key Results in DC

6 Descriptive Complexity: Some Key Notions

6.1 First-Order Vocabularies & Models

• A vocabulary σ is simply a set of relations, functions, and constants used in a first-order logic. A simple first-order number theory would have a vocabulary such as σ = (0, ≤, S)

• A model M of a vocabulary σ is an extension of σ which includes a domain (or universe), and each symbol in σ is given an explicit representation as a set

• A model of the σ given above for number theory might look like

M = (ℕ; 0^M, ≤^M, S^M) (2)

where

0^M = 0 (3)

≤^M = { (0, 0), (0, 1), (0, 2), . . . ,
        (1, 1), (1, 2), (1, 3), . . . ,
        . . . } (4)–(7)

S^M = { (0, 0, 0), (0, 1, 1), (0, 2, 2), . . . ,
        (1, 0, 1), (1, 1, 2), (1, 2, 3), . . . ,
        . . . } (8)–(11)

• Graphs are a common kind of structure we’ll work with


• For a graph G = (V,E)

– The vocabulary is simply E, the edge relation

– V is the domain

– A graph representing a simple 3-node triangle would be simply

G = ({1, 2, 3}; {(1, 2), (2, 3), (3, 1)}) (12)

• We say a formula φ is true in a given model M (denoted M ⊨ φ) iff when we go “look into M” we find that φ is, in fact, true. (There’s a standard recursive definition for this, which you’ve probably seen, and I’m skipping)

• In the above graph example

G = ({1, 2, 3}; {(1, 2), (2, 3), (3, 1)}) (13)

• The following formula (asserting that every node has a neighbor) is true

φ ≡ (∀x)(∃y)(E(x, y)) (14)
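To make “looking into M” concrete, here is a minimal Python sketch that evaluates formula (14) on the triangle graph by brute force over the finite domain. The function name every_node_has_neighbor and the second example graph are my own illustrative choices, not part of the original talk.

```python
# A finite graph structure: a domain V and an edge relation E (a set of pairs).
V = {1, 2, 3}
E = {(1, 2), (2, 3), (3, 1)}


def every_node_has_neighbor(domain, edges):
    """Evaluate phi = (forall x)(exists y) E(x, y) over a finite structure.

    "forall" becomes all(...) and "exists" becomes any(...), each ranging
    over the (finite) domain.
    """
    return all(any((x, y) in edges for y in domain) for x in domain)


# The triangle from equation (13) satisfies phi.
triangle_ok = every_node_has_neighbor(V, E)

# A hypothetical counterexample: node 3 has no outgoing edge, so phi fails.
path_ok = every_node_has_neighbor({1, 2, 3}, {(1, 2), (2, 3)})

print(triangle_ok, path_ok)
```

Because the domain is finite, each quantifier is just a loop, which is one informal way to see why checking a fixed first-order formula is cheap.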

6.2 Definability

• We say a property P is definable in a logic L if there exists a formula φ in L such that for any model M, P is true in M if and only if M ⊨ φ.

• What does this say?

– Take the property of graphs being 3-colorable

– Look at the collection of all finite graphs

– Separate them into two sets, those which are 3-colorable (set A) and those which are not 3-colorable (set B)

– A formula φ in logic L defines the property of being 3-colorable iff for every G ∈ A, G ⊨ φ, and for every G ∈ B, G ⊭ φ.

• Example: Graph 3-colorability is definable in ∃-SO, by the following formula φ (from [4])

(∃R)(∃Y)(∃B)(∀x) [ (R(x) ∨ Y(x) ∨ B(x)) ∧ (∀y) ( E(x, y) → ¬(R(x) ∧ R(y)) ∧ ¬(Y(x) ∧ Y(y)) ∧ ¬(B(x) ∧ B(y)) ) ] (15)

• For any graph G = (V,E), G is 3-colorable iff G ⊨ φ (requires proof, obviously)
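The shape of the ∃-SO sentence (guess three relations R, Y, B, then check a first-order condition) can be sketched in Python as an exhaustive search. This is only an illustration under my own assumptions: the helper names subsets, first_order_part and three_colorable are hypothetical, and the exponential loop over all subsets stands in for the nondeterministic “guess” that an NP machine would make.

```python
from itertools import combinations, product


def subsets(items):
    """Yield every subset of items (candidate values for R, Y or B)."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def first_order_part(domain, edges, R, Y, B):
    """Check the first-order body of the sentence for fixed R, Y, B."""
    for x in domain:
        # Every node receives at least one colour.
        if not (x in R or x in Y or x in B):
            return False
        for y in domain:
            # No edge joins two nodes that share a colour.
            if (x, y) in edges and (
                (x in R and y in R) or (x in Y and y in Y) or (x in B and y in B)
            ):
                return False
    return True


def three_colorable(domain, edges):
    """Existential second-order quantifiers become a search over all (R, Y, B)."""
    candidates = list(subsets(domain))
    return any(
        first_order_part(domain, edges, R, Y, B)
        for R, Y, B in product(candidates, repeat=3)
    )


triangle_result = three_colorable({1, 2, 3}, {(1, 2), (2, 3), (3, 1)})

# The complete graph on four nodes needs four colours.
k4_nodes = {1, 2, 3, 4}
k4_edges = {(a, b) for a in k4_nodes for b in k4_nodes if a != b}
k4_result = three_colorable(k4_nodes, k4_edges)

print(triangle_result, k4_result)
```

Note that verifying one guess (first_order_part) takes polynomial time, while the search over guesses is exponential; that split mirrors the “guess and verify” view of NP.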

6.3 Undefinability

• Recall that the complexity hierarchy is cumulative

• That is, if a problem P has a linear time solution, then it also has a polynomial time solution, an NP solution, etc.

• What we’re usually interested in is the lower-bound complexity of P


• In DC this corresponds directly to the increasing expressibility of logics

• I.e. if a problem P can be stated in FO logic, then it can be stated in FO plus a LFP operator, and it can be stated in SO logic, etc.

• Thus one of the central notions in DC is figuring out “where” in the hierarchy a property P goes from being definable to undefinable

• This is what most of the proof-techniques in DC revolve around

• This is also where the field of Finite Model Theory plays a heavy role

7 Descriptive Complexity: Some Key Techniques

• We will now look at two of the key techniques for proving undefinability results

– Weird Inductions / Pebble Games

– Ehrenfeucht-Fraïssé (EF) Games (of a limited form)

• . . . as well as some of their extensions

• In general, to prove that a given property P is undefinable in a logic L, you have to show that no formula in L defines that property

• Recall the framework above for talking about definability, and let’s use the example of Graph Connectivity. . .

– Look at the set of all finite graphs

– Divide them into two sets: set A is all the graphs that are connected, and set B is the set of all graphs that are not connected

– To prove that graph connectivity does not lie in the logarithmic-time hierarchy (the class captured by FO), we must show that the property of graph connectivity is not expressible in FO

– That is, there is no formula φ in FO such that for every graph G, G ∈ A iff G ⊨ φ

7.1 Weird Inductions

• For the above example (proving that graph connectivity is not FO definable), we can use the following technique

• Prove by induction, on the number of quantifiers in an FO formula, that no formula can distinguish between every connected and disconnected graph.

• Outline

– Base Case: Let φ be any FO formula with no quantifiers. Construct two graphs, one connected and one disconnected, and prove that φ is either true for both of them, or false for both of them (i.e. it can’t distinguish).

– Inductive Step: Suppose no FO formula with n quantifiers can distinguish between all connected and disconnected graphs. Now try to prove that for any FO formula φ with n + 1 quantifiers, you can still construct two graphs (one connected, one not), such that φ cannot distinguish between them.


7.2 Ehrenfeucht-Fraïssé (EF) Games

• A Game-Theoretic approach to proving undefinability results

• Most significant results in DC/FMT are proved using EF games

• The basic structure:

– There are two players: the spoiler and the duplicator (often called Samson and Delilah)

– The “board” of the game consists of two structures A and B

– The goal of the spoiler is to show these two models are different

– The goal of the duplicator is to show these two models are the same

• An EF game proceeds for a specified number of rounds (say n), each consisting of the following steps:

– The spoiler picks a model (A or B)

– The spoiler picks a single element of that model’s domain (so, either some a ∈ A or some b ∈ B)

– Then the duplicator moves by picking an element from the other model

• After the number of rounds is over, we compare the sequences (a1, . . . , an) and (b1, . . . , bn) to see who won

• I know what you’re thinking: Why hasn’t Toys-R-Us picked this up, right????

• We say the duplicator wins if (ā, b̄) is a partial isomorphism between A and B, where ā = (a1, . . . , an) and b̄ = (b1, . . . , bn)

• A partial isomorphism between two models A and B (defined over the same vocabulary σ) is defined as the mapping (ā, b̄) (where ā ⊆ A and b̄ ⊆ B), such that

1. |ā| = |b̄| = n

2. For every i, j ≤ n, ai = aj iff bi = bj

3. For every k-ary relation symbol P in σ and every sequence (i1, . . . , ik) of numbers from [1, n],

(a_i1, . . . , a_ik) ∈ P^A iff (b_i1, . . . , b_ik) ∈ P^B (16)

• NOTE: I’m only considering vocabularies without constants

• What does that mean?

• It means that the duplicator wins (remember: their goal is to show that the two models are the same) if after n rounds of an EF game, the “parts” of the two structures that were picked out are isomorphic (a.k.a. indistinguishable!)

• How does one use EF games?

• To prove some property P is undefinable in FO logic, you show that for every number of rounds n there are two structures, one with property P and one without, on which the duplicator has a winning strategy in the n-round EF game (typically proved by induction on the number of rounds)

• What about for undefinability in other logics, aside from FO?

• There are simple extensions to the EF game above that account for other logics

• Additionally, an immense amount of work has been done identifying winning conditions for the duplicator (e.g. Hanf’s Condition, see [7] for a nice explanation of this and others)

• This work makes the task of playing these EF games much easier
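For very small structures the game described above can simply be played out exhaustively. The following Python sketch decides whether the duplicator has a winning strategy in the n-round game on two graphs, using the partial isomorphism conditions for a vocabulary with a single binary relation E and no constants. All names (is_partial_isomorphism, duplicator_wins, undirected) and the two example graphs are my own assumptions for illustration; real undefinability proofs argue about all n at once rather than by search.

```python
def is_partial_isomorphism(a, b, edges_a, edges_b):
    """Conditions 2 and 3 of the definition, for one binary relation E."""
    for i in range(len(a)):
        for j in range(len(a)):
            if (a[i] == a[j]) != (b[i] == b[j]):
                return False
            if ((a[i], a[j]) in edges_a) != ((b[i], b[j]) in edges_b):
                return False
    return True


def duplicator_wins(dom_a, edges_a, dom_b, edges_b, rounds, a=(), b=()):
    """True iff the duplicator can survive `rounds` more rounds from (a, b)."""
    if not is_partial_isomorphism(a, b, edges_a, edges_b):
        return False
    if rounds == 0:
        return True
    # Spoiler picks x in A; duplicator needs some answer y in B.
    for x in dom_a:
        if not any(
            duplicator_wins(dom_a, edges_a, dom_b, edges_b, rounds - 1, a + (x,), b + (y,))
            for y in dom_b
        ):
            return False
    # Spoiler picks y in B; duplicator needs some answer x in A.
    for y in dom_b:
        if not any(
            duplicator_wins(dom_a, edges_a, dom_b, edges_b, rounds - 1, a + (x,), b + (y,))
            for x in dom_a
        ):
            return False
    return True


def undirected(pairs):
    """Close a list of edges under symmetry."""
    pairs = list(pairs)
    return set(pairs) | {(v, u) for u, v in pairs}


nodes = list(range(6))
# A connected graph (a 6-cycle) and a disconnected one (two triangles).
hexagon = undirected((i, (i + 1) % 6) for i in range(6))
two_triangles = undirected([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])

two_rounds = duplicator_wins(nodes, hexagon, nodes, two_triangles, 2)
three_rounds = duplicator_wins(nodes, hexagon, nodes, two_triangles, 3)

print(two_rounds, three_rounds)
```

On this pair the duplicator survives two rounds but not three: with three moves the spoiler can pick out a triangle, which the 6-cycle does not contain. Larger cycles push the number of rounds the duplicator can survive higher, which is the intuition behind the connectivity argument.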


8 Descriptive Complexity: Key Results

• The Logarithmic-Time Hierarchy = FO: The log-time hierarchy is equivalent to the set of queries expressible in FO

• DSPACE[log n] = FO(DTC): Deterministic log-space is equivalent to the set of queries expressible in FO with a deterministic transitive closure operator

• NSPACE[log n] = FO(TC): Nondeterministic log-space is equivalent to the set of queries expressible in FO with a transitive closure operator

• P = FO(LFP): Polynomial-time is equivalent to the set of queries expressible in FO with a least fixed point operator

• NP = ∃-SO: Nondeterministic polynomial-time is equivalent to the set of queries expressible in existential second-order logic

• Co-NP = ∀-SO: Co-nondeterministic polynomial-time is equivalent to the set of queries expressible in universal second-order logic

• PH = SO: The polynomial time hierarchy is equivalent to the set of queries expressible in second-order logic

• PSPACE = FO(PFP) = SO(TC): Polynomial space is the set of queries expressible by FO with a partial fixed point operator, which is also equivalent to SO formulas with a transitive closure operator.

• EXPTIME = SO(LFP) = SO[2^(n^O(1))]: Exponential time is the set of queries expressible in SO extended with the capacity to define new relations by induction. (This is the same as second-order quantifier blocks that may be iterated exponentially)
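To give a feel for what a least fixed point operator adds to FO, here is a small Python sketch that computes reachability as the least fixed point of the operator φ(R, x, y) ≡ x = y ∨ (∃z)(E(x, z) ∧ R(z, y)), starting from the empty relation and iterating until nothing changes. The function names least_fixed_point, reachability and is_connected are hypothetical, and “connected” here means that every node reaches every other node along directed edges.

```python
def least_fixed_point(step, start=frozenset()):
    """Iterate a monotone operator from `start` until it stabilises."""
    current = start
    while True:
        following = frozenset(step(current))
        if following == current:
            return current
        current = following


def reachability(domain, edges):
    """Least fixed point of: R(x, y) <- x = y or exists z (E(x, z) and R(z, y))."""

    def step(R):
        return {
            (x, y)
            for x in domain
            for y in domain
            if x == y or any((x, z) in edges and (z, y) in R for z in domain)
        }

    return least_fixed_point(step)


def is_connected(domain, edges):
    """Every node reaches every node (checked with one more pair of loops)."""
    reach = reachability(domain, edges)
    return all((x, y) in reach for x in domain for y in domain)


triangle_connected = is_connected({1, 2, 3}, {(1, 2), (2, 3), (3, 1)})
split_connected = is_connected({1, 2, 3, 4}, {(1, 2), (2, 1), (3, 4), (4, 3)})

print(triangle_connected, split_connected)
```

On a domain with n elements the relation R can grow at most n² times, so the iteration stops after polynomially many steps, which fits the polynomial-time side of P = FO(LFP).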


References

[1] Herbert B. Enderton. A Mathematical Introduction to Logic. Harcourt Academic Press, San Diego, California, second edition, 2002.

[2] Ronald Fagin. Generalized first-order spectra and polynomial-time recognizable sets. Complexity of Computation, 7:27–41, 1975.

[3] Erich Grädel, Phokion G. Kolaitis, Leonid Libkin, Maarten Marx, Joel Spencer, Moshe Y. Vardi, Yde Venema, and Scott Weinstein. Finite Model Theory and Its Applications. Texts in Theoretical Computer Science. Springer-Verlag, New York, 2007.

[4] Neil Immerman. Descriptive complexity: a logician’s approach to computation. Notices of the American Mathematical Society, 42(10):1127–1133, 1995.

[5] Neil Immerman. Descriptive Complexity. Graduate Texts in Computer Science. Springer-Verlag, New York, 1999.

[6] Neil Immerman. Progress in descriptive complexity. SIGACT News, 36(4):24–35, 2005.

[7] Leonid Libkin. Elements of Finite Model Theory. Texts in Theoretical Computer Science. Springer-Verlag, New York, 2004.

[8] Joseph R. Shoenfield. Mathematical Logic. A K Peters, Ltd., Natick, Massachusetts, 2000.

[9] B. A. Trakhtenbrot. Impossibility of an algorithm for the decision problem for finite classes. D.A.N., 70:569–572, 1950.

[10] Jouko Väänänen. A short course on finite model theory. Based on lectures given from 1993 to 1994.


Figure 1: Immerman’s Descriptive World
