On a conjecture about finite fixed points of morphisms


Theoretical Computer Science 339 (2005) 103–128
www.elsevier.com/locate/tcs

F. Levé, G. Richomme∗
Université de Picardie Jules Verne, LaRIA, 33, Rue Saint Leu, 80039 Amiens cedex 1, France

∗ Corresponding author. E-mail addresses: [email protected] (F. Levé), [email protected] (G. Richomme).
0304-3975/\$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.tcs.2005.01.011

Abstract

A conjecture of M. Billaud is: given a word $w$, if, for each letter $x$ occurring in $w$, the word obtained by erasing all the occurrences of $x$ in $w$ is a fixed point of a non-trivial morphism $f_x$, then $w$ is also a fixed point of a non-trivial morphism. We prove that this conjecture is equivalent to a similar one on sets of words. Using this equivalence, we solve these conjectures in the particular case where each morphism $f_x$ has only one expansive letter.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Combinatorics on words; Morphisms; Fixed points

1. Introduction

One kind of combinatorial problem is to determine conditions under which an object is characterized by partial information. For instance, in trace theory [8,9], a trace is entirely determined by its projections on sets $\{a, b\}$ of dependent letters [7,10]. In combinatorics on words, the exact length of factors needed to reconstruct a word has recently been determined [3] (see also [4,5,17]) using the maximal length of special factors and the maximal length of repeated suffixes.

The conjecture we examine here belongs to this kind of problem. It concerns fixed points of morphisms. Many studies deal with infinite fixed points of morphisms (see, e.g., [1,6,18,19]). There also exist studies of biinfinite words (see, e.g., [20]). Here, we consider only finite words that are fixed points of a morphism. Head [13] (see also [12]) characterized the language of finite fixed points of a given morphism. This set is regular. More precisely, it is a finitely generated [15] free [21] monoid. In Head's result, constant and expansive letters (for the considered morphism) play a central role.

Our work deals with the set FW of all words that are fixed points of a non-trivial morphism. In contrast to the set of finite fixed points of a given morphism, the set FW is not context-free. In 1993, Billaud [2] conjectured the following inductive property of the set FW:

Given a word $w$, if for each letter $x$ occurring in $w$ there exists a (non-trivial) morphism $f_x$ such that the word obtained by erasing all the occurrences of $x$ in $w$ is a fixed point of $f_x$, then there exists a (non-trivial) morphism $f$ such that $w$ is a fixed point of $f$.

Little is known about the validity of this conjecture: it has only been proved for three-letter alphabets, by Zimmermann [22]. Here, we prove it in the more general case where each $f_x$ has one expansive letter.

To solve this case, we need to consider fixed sets of words, that is, sets of finite words which are fixed points of the same morphism. We show that Billaud's conjecture is equivalent to the following one:

Given a set $S$ of words, if for each letter $x$ occurring in words of $S$ there exists a (non-trivial) morphism $f_x$ such that, for each word $w$ in $S$, the word obtained by erasing all the occurrences of $x$ in $w$ is a fixed point of $f_x$, then there exists a (non-trivial) morphism $f$ such that each word of $S$ is a fixed point of $f$.

We now describe the outline of this paper. After some generalities in Section 2, we present Head's result and the set FW in Section 3. In Section 4, we explain Billaud's conjecture more precisely and prove the equivalence between the two conjectures. In Section 5, we generalize P. Zimmermann's result to the case where each morphism $f_x$ has no constant letter and only one expansive letter. The results of Sections 4 and 5 are used in Section 6 to prove the conjectures by induction when each morphism $f_x$ has only one expansive letter (the morphisms $f_x$ can have constant letters).

Let us note that an extended abstract of this paper was presented at the conference WORDS'2003 [16].

2. Generalities

We assume the reader is familiar with combinatorics on words and morphisms (see, e.g., [6,18,19]). We now fix our notation.

Given a finite set $X$, we denote by $\mathrm{Card}(X)$ its cardinality.

Given an alphabet $A$ (a non-empty set of letters), $A^*$ is the set of words over $A$, including the empty word $\varepsilon$. We denote the concatenation of words by juxtaposition: $uv$ is the word obtained by concatenating $u$ and $v$. If $X$ is a set of words, we denote by $X^*$ (resp., by $X^+$) the set of all words that are finite (resp., non-empty finite) concatenations of words of $X$: $\varepsilon \in X^*$, $X^* = X^+ \cup \{\varepsilon\}$. For a word $w$, we denote by $w^*$ (resp., by $w^+$) the set $\{w\}^*$ (resp., $\{w\}^+$).

Let $w$ be a word. We denote by $\mathrm{alph}(w)$ the set of letters occurring in $w$: $\mathrm{alph}(\varepsilon) = \emptyset$. We extend this notation to any set of words $S$: $\mathrm{alph}(S) = \bigcup_{w \in S} \mathrm{alph}(w)$. The number of occurrences of a letter $x$ in $w$ is denoted by $|w|_x$ and the total length of $w$ is denoted by $|w|$. More generally, given a set $B$ of letters, let $|w|_B = \sum_{x \in B} |w|_x$. We denote by $\mathrm{minLetters}(w)$ the set $\{a \in \mathrm{alph}(w) \mid \forall x \in \mathrm{alph}(w),\ |w|_a \le |w|_x\}$, that is, the set of letters with the minimal number of occurrences in $w$.

A word $u$ is a factor of $w$ if there exist words $p$ and $s$ such that $w = pus$. If $p = \varepsilon$ (resp., $s = \varepsilon$), $u$ is a prefix (resp., suffix) of $w$. Powers of a word $u$ are the words defined by $u^0 = \varepsilon$, $u^n = u u^{n-1}$ for any integer $n \ge 1$. A word is primitive if it cannot be written as $u^n$ with $n \ge 2$ an integer.

We now recall a classic result of combinatorics on words (see, e.g., [18]).

Lemma 2.1. For two words $x$ and $y$, $xy = yx$ if and only if there exists a word $z$ such that $x, y \in z^*$.

It follows from this lemma that for any non-empty word $w$ there exists a unique primitive word $z$, called the primitive root of $w$, such that $w \in z^*$.

Given an alphabet $A$, a(n endo)morphism $f$ on $A$ is a map from $A^*$ to $A^*$ such that $f(uv) = f(u)f(v)$ for all words $u, v$ over $A$. A morphism on $A$ is entirely defined by the images of the elements of $A$. A particular morphism on $A$ is the identity morphism (also called the trivial morphism), denoted by $\mathrm{Id}_A$ or simply by $\mathrm{Id}$. For $X \subseteq A^*$ and $f$ a morphism on $A$, $f(X)$ is the set $\{f(x) \mid x \in X\}$.

A morphism $f$ is erasing if there exists a letter $a$ such that $f(a) = \varepsilon$. Examples of erasing morphisms are the projections. Let $B$ be a subset of an alphabet $A$. The projection on $B$ is the morphism $\pi_B$ from $A^*$ to $B^*$ defined by $\pi_B(a) = a$ for $a \in B$ and $\pi_B(a) = \varepsilon$ for $a \notin B$. When $B = A \setminus \{x\}$ for a letter $x$, we denote by $\pi_x$ the projection $\pi_B$. When $B = \{x, y\}$ for two letters $x, y$, we denote by $\pi_{xy}$ the morphism $\pi_B$.

Given a morphism $f$, powers of $f$ are defined inductively by $f^0 = \mathrm{Id}$, $f^i = f f^{i-1}$ for integers $i \ge 1$ (composition of maps is denoted simply by juxtaposition).
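The notions of this section translate directly into a few lines of code. The following is a minimal sketch, assuming that words are Python strings over single-character letters and that a morphism is a dictionary mapping each letter to its image; the function names are illustrative and do not come from the paper.

```python
from collections import Counter


def alph(w):
    """Set of letters occurring in the word w."""
    return set(w)


def min_letters(w):
    """Letters of w having the minimal number of occurrences in w."""
    counts = Counter(w)
    least = min(counts.values(), default=0)
    return {a for a, n in counts.items() if n == least}


def primitive_root(w):
    """Shortest word z such that w is a power of z (w must be non-empty)."""
    n = len(w)
    for d in range(1, n + 1):
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]


def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def power(f, i):
    """i-th power of f; power(f, 0) is the identity on the letters of f."""
    g = {a: a for a in f}
    for _ in range(i):
        # f^i = f f^(i-1): apply f to the images computed so far
        g = {a: apply(f, g[a]) for a in f}
    return g


def projection(B, A):
    """Projection pi_B on the alphabet A: keeps letters of B, erases the rest."""
    return {a: (a if a in B else "") for a in A}
```

For instance, `apply(projection({"a", "b"}, "abc"), "acbca")` erases the letter `c`, and two commuting words such as `"abab"` and `"ab"` share the same `primitive_root`, as Lemma 2.1 predicts.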

3. Finite fixed points of morphisms

A word $w$ is a fixed point of a morphism $f$ on $A$ if $f(w) = w$. Given a morphism $f$, Head [13] (see also [12]) has characterized the language of finite fixed points of $f$. In order to recall this result (Theorem 3.1), we introduce notions from [14,20] and some notation.

Let $f$ be a morphism on the alphabet $A$. A letter $a$ is mortal for the morphism $f$ if there exists an integer $i \ge 1$ such that $f^i(a) = \varepsilon$. We denote by $M_f$ the set of mortal letters of $f$. The mortality exponent of $f$, denoted by $\exp(f)$, is the least integer $i$ such that $f^i(a) = \varepsilon$ for all $a \in M_f$. The existence of this exponent can be proved by showing that $\exp(f) \le \mathrm{Card}(M_f)$.

A letter $a$ is said to be monorecursive (for the morphism $f$) if there exist two words $x, y$ in $M_f^*$ such that $f(a) = xay$. For any monorecursive letter $a$, we have $|f(a)| \ge 1$. To ease future developments, we split the set of monorecursive letters into two subsets:

$$C_f = \{a \in A \mid f(a) = a\}, \qquad E_f = \{a \in A \mid a \text{ monorecursive},\ |f(a)| \ge 2\}.$$


Letters in $C_f$ are constant by $f$. Letters in $E_f$ are called expansive letters. When $\mathrm{Card}(E_f) = 1$, we denote by $e_f$ the unique element of $E_f$.

By definition, $C_f \cap E_f = \emptyset$. Moreover, for any letter $a$ in $C_f \cup E_f$ and for any integer $n \ge 1$, $|f^n(a)|_a = 1$. Consequently $(C_f \cup E_f) \cap M_f = \emptyset$. We can also observe that $f(C_f) = C_f$ and $f(E_f) \subseteq (E_f \cup M_f)^*$.

Theorem 3.1 (Hamm and Shallit [12] and Head [13]). Let $f$ be a morphism on $A$. The set of finite fixed points of $f$ is the set $(f^{\exp(f)}(C_f \cup E_f))^*$.

Consider, for instance, the morphism $f$ defined on $\{1,2,3,4,5,6,7,8,9\}$ by $f(1) = 2$, $f(2) = 1$, $f(3) = 34$, $f(4) = 435$, $f(5) = 6$, $f(6) = 7$, $f(7) = \varepsilon$, $f(8) = 6787$, $f(9) = 966$. For this morphism, we have $M_f = \{5, 6, 7\}$, $C_f = \emptyset$, $E_f = \{8, 9\}$, $\exp(f) = 3$. The fixed points of $f$ are the words in $\{76787, 96677\}^*$.

Most of the time (as in Theorem 3.1), given a morphism $f$, one studies the words that are fixed points of $f$. Here, we study intrinsic properties of (finite) words that are fixed points of morphisms (independently of these morphisms), that is, words in the set

$$\mathrm{FW} = \{w \mid \exists f : \mathrm{alph}(w)^* \to \mathrm{alph}(w)^* \text{ morphism},\ f \ne \mathrm{Id},\ f(w) = w\}.$$
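Returning to Theorem 3.1, the sets $M_f$, $C_f$, $E_f$ and the exponent $\exp(f)$ of the example morphism on $\{1, \ldots, 9\}$ can be recomputed mechanically. The sketch below assumes that a morphism is stored as a dictionary from letters to strings; the helper names are illustrative, not taken from the paper.

```python
def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def mortal_letters(f):
    """Letters a with f^i(a) empty for some i >= 1, computed as a least fixed point."""
    mortal = set()
    changed = True
    while changed:
        changed = False
        for a, image in f.items():
            # a is mortal as soon as every letter of f(a) is already known mortal
            if a not in mortal and all(c in mortal for c in image):
                mortal.add(a)
                changed = True
    return mortal


def mortality_exponent(f, mortal):
    """Least i with f^i(a) empty for every mortal letter a (0 if there is none)."""
    i, images = 0, {a: a for a in mortal}
    while any(images.values()):
        images = {a: apply(f, u) for a, u in images.items()}
        i += 1
    return i


def constant_and_expansive(f, mortal):
    """The sets C_f and E_f of Section 3."""
    constant = {a for a in f if f[a] == a}
    expansive = {
        a for a in f
        if len(f[a]) >= 2 and f[a].count(a) == 1
        and all(c in mortal for c in f[a] if c != a)
    }
    return constant, expansive


def fixed_point_generators(f):
    """The set f^exp(f)(C_f | E_f), whose star is the set of fixed points of f."""
    mortal = mortal_letters(f)
    constant, expansive = constant_and_expansive(f, mortal)
    n = mortality_exponent(f, mortal)
    gens = set()
    for a in constant | expansive:
        u = a
        for _ in range(n):
            u = apply(f, u)
        gens.add(u)
    return gens


# The example morphism following Theorem 3.1.
f = {"1": "2", "2": "1", "3": "34", "4": "435", "5": "6",
     "6": "7", "7": "", "8": "6787", "9": "966"}
print(mortal_letters(f), fixed_point_generators(f))
```

On this input the sketch recovers $M_f = \{5,6,7\}$, $E_f = \{8,9\}$, $\exp(f) = 3$ and the two generators $76787$ and $96677$ stated in the example.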

For $w \in \mathrm{FW}$, as suggested by Geser [11], we call witness of $w$ any morphism $f$ on $\mathrm{alph}(w)$ such that $f(w) = w$ and $f \ne \mathrm{Id}_{\mathrm{alph}(w)}$. A witness is necessarily an erasing morphism. An element of FW can have several witnesses (see for instance the word $ab$).

Any word $w$ in FW necessarily contains at least two different letters in order to be a fixed point of a non-trivial morphism on $\mathrm{alph}(w)$. For different letters $a, b$, observe that any word $w$ with $|w|_a = 1$ and $|w|_b \ge 1$ belongs to FW: it has the witness $f$ defined by $f(a) = w$ and $f(c) = \varepsilon$ for each $c$ in $\mathrm{alph}(w) \setminus \{a\}$. So FW is an infinite set (even when restricted to two-letter alphabets). Examples of words that do not belong to FW are the words $abba$ and $aabb$, where $a, b$ are two different letters.

Theorem 3.1 implies that the set of fixed points of $f$ is regular. By contrast, FW is not even context-free, since its intersection with the regular set $(abb^+)^3$ is the non-context-free set $\{ab^n ab^n ab^n \mid n \ge 2\}$.

Another consequence of Theorem 3.1 is
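Membership in FW can be tested by exhaustive search on short words. The sketch below is a brute-force illustration, not an algorithm from the paper: it relies on the observation that if $f(w) = w$ then every image $f(a)$, $a \in \mathrm{alph}(w)$, is a (possibly empty) factor of $w$, so only finitely many candidate morphisms need to be tried. The function names are hypothetical, and the running time is exponential in the number of letters.

```python
from itertools import product


def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def factors(w):
    """All factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def witnesses(w):
    """Yield every non-identity morphism f on alph(w) such that f(w) = w."""
    letters = sorted(set(w))
    candidates = sorted(factors(w))
    for images in product(candidates, repeat=len(letters)):
        f = dict(zip(letters, images))
        if any(f[a] != a for a in letters) and apply(f, w) == w:
            yield f


def in_FW(w):
    """True if w is a fixed point of some non-trivial morphism on alph(w)."""
    return next(witnesses(w), None) is not None


print(list(witnesses("ab")))          # the word ab has several witnesses
print(in_FW("abba"), in_FW("aabb"))   # neither word belongs to FW
```

The same function separates `"abbabbabb"` (of the form $ab^n ab^n ab^n$) from `"abbabbabbb"`, which lies in $(abb^+)^3$ but not in FW.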

Corollary 3.2. For $w \in \mathrm{FW}$ and $f$ a witness of $w$,

$$C_f \cup E_f \cup M_f = \mathrm{alph}(w).$$

Proof. Since $f$ is a witness of $w$, $f(w) = w$. By Theorem 3.1, $w \in (f^{\exp(f)}(C_f \cup E_f))^*$. We have already noticed that $f(C_f) = C_f$ and $f(E_f) \subseteq (E_f \cup M_f)^*$. So $w \in (C_f \cup E_f \cup M_f)^*$, that is, $\mathrm{alph}(w) \subseteq C_f \cup E_f \cup M_f$. But $f$ is a morphism on $\mathrm{alph}(w)$. So $C_f \cup E_f \cup M_f \subseteq \mathrm{alph}(w)$. Finally $C_f \cup E_f \cup M_f = \mathrm{alph}(w)$. □

The first part of the next proposition (which is a consequence of Corollary 3.2) was proved by Geser [11]. It shows that it is possible to focus only on idempotent witnesses of words in FW. (A morphism $f$ is idempotent if $f^2 = f$.)


Proposition 3.3. Any word $w$ in FW has a witness which is idempotent. More precisely, given a witness $f$ of $w$, there exists an idempotent witness $g$ of $w$ with $C_f = C_g$, $E_f = E_g$ and $M_f = M_g$.

Proof. Let $w \in \mathrm{FW}$. By definition $w$ has at least one witness $f$. Let us prove that the morphism $g = f^{\exp(f)}$ fulfills the conditions of the proposition. First note (by induction) that for all integers $n \ge 0$, $f^n(w) = w$. So $g(w) = w$.

By the definitions of constant, mortal and expansive letters, it is easily seen that $C_f \subseteq C_g$, $E_f \subseteq E_g$ and $M_f \subseteq M_g$. Since $f$ is a witness, $M_f \ne \emptyset$. This implies $M_g \ne \emptyset$. So $g \ne \mathrm{Id}_{\mathrm{alph}(w)}$, whence $g$ is a witness of $w$. By Corollary 3.2, $\mathrm{alph}(w) = C_f \cup E_f \cup M_f = C_g \cup E_g \cup M_g$. Since $C_f$, $E_f$, $M_f$ (and $C_g$, $E_g$, $M_g$) are pairwise disjoint, $C_f = C_g$, $E_f = E_g$ and $M_f = M_g$.

Let $a \in M_g$. Since $a \in M_f$, by definition of $\exp(f)$, $g(a) = f^{\exp(f)}(a) = \varepsilon$. So for $a$ in $M_g$, $g^2(a) = g(a)$. This is also true for $a \in C_g$. Let $a \in E_g$. There exist words $x, y$ in $M_g^*$ such that $g(a) = xay$. From what precedes, $g(x) = g(y) = \varepsilon$. So $g^2(a) = g(a)$. The morphism $g$ is idempotent. □
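The construction $g = f^{\exp(f)}$ used in this proof can be tried on a small case. The following sketch uses a hypothetical three-letter witness of the word $abc$ chosen for illustration (it does not appear in the paper), and it assumes morphisms are dictionaries from letters to strings.

```python
def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def power(f, n):
    """n-th power of the morphism f; power(f, 0) is the identity."""
    g = {a: a for a in f}
    for _ in range(n):
        g = {a: apply(f, g[a]) for a in f}
    return g


def mortality_exponent(f):
    """Least i such that f^i erases every mortal letter of f."""
    # exp(f) <= Card(M_f) <= number of letters, so a letter that survives
    # len(f) applications of f is not mortal.
    mortal = {a for a, u in power(f, len(f)).items() if u == ""}
    i, g = 0, {a: a for a in f}
    while any(g[a] for a in mortal):
        g = {a: apply(f, g[a]) for a in f}
        i += 1
    return i


def idempotent_witness(f):
    """The morphism g = f^exp(f) from the proof of Proposition 3.3."""
    return power(f, mortality_exponent(f))


# Illustrative witness of w = abc: it fixes w but is not idempotent.
w = "abc"
f = {"a": "ab", "b": "c", "c": ""}
g = idempotent_witness(f)
print(g)
```

Here $f(f(a)) = abc \ne f(a)$, so $f$ itself is not idempotent, while $g = f^2$ fixes $w$ and satisfies $g^2 = g$.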

Since $\exp(f) \le 1$ for an idempotent morphism $f$, and since $f(C_f) = C_f$, Theorem 3.1 can be simplified into:

Corollary 3.4. Let $f$ be an idempotent morphism on $A$. The set of finite fixed points of $f$ is the set $(f(C_f \cup E_f))^* = (C_f \cup f(E_f))^*$.

We end this section with two useful lemmas.

Lemma 3.5. Let $w \in \mathrm{FW}$, $f$ be a witness of $w$ and $a \in \mathrm{minLetters}(w)$. If $a \in E_f \cup M_f$, then $w$ has an idempotent witness $g$ such that $C_g = C_f$, $\mathrm{Card}(E_f) = \mathrm{Card}(E_g)$ and $a \in E_g$.

Proof. By Proposition 3.3, we can assume that $f$ is an idempotent witness of $w$. Let $F = \{x \in E_f \mid |f(x)|_a \ne 0\}$. We have

$$|w|_a = \sum_{x \in F} |w|_x \times |f(x)|_a.$$

But $a \in \mathrm{minLetters}(w)$. This implies $F = \{e\}$ for a letter $e$. Moreover $|f(e)|_a = 1$ and $|w|_a = |w|_e$.

If $a = e$, the lemma holds with $g = f$. Assume $a \ne e$. Since $e \in E_f$ and $a \in \mathrm{alph}(f(e))$, $a \in M_f$, that is, $f(a) = \varepsilon$. Let $g$ be the morphism defined by $g(a) = f(e)$, $g(e) = \varepsilon$ (so $g \ne \mathrm{Id}$), and $g(x) = f(x)$ for $x \notin \{a, e\}$. We have $C_f = C_g$, $M_g = \{e\} \cup M_f \setminus \{a\}$ and $E_g = \{a\} \cup E_f \setminus \{e\}$. So for $x \ne a$, $g^2(x) = g(x)$. Moreover $g^2(a) = g(f(e)) = g(a)$. Thus the morphism $g$ is idempotent.

By Corollary 3.4,

$$w = w_0 \prod_{i=1}^{p} f(e) w_i = w_0 \prod_{i=1}^{p} g(a) w_i$$

for words $w_i$ ($0 \le i \le p$) in $(f(E_f \setminus \{e\}) \cup C_f)^* = (g(E_g \setminus \{a\}) \cup C_g)^*$. It follows that $g(w) = w$. The morphism $g$ is a witness of $w$. □


Lemma 3.6. A word $w$ belongs to FW with a witness $f$ such that $\mathrm{Card}(E_f) = 1$ if and only if there exist a set $C$ of letters, a letter $b$, and a word $u$ such that $|u| \ge 2$, $|u|_b = 1$, $\mathrm{alph}(u) \cap C = \emptyset$ and $w \in (C \cup \{u\})^*$.

Proof. For the "only if" part, Proposition 3.3 allows us to assume that $f$ is idempotent; by Corollary 3.4 we can then take $C = C_f$, $b$ such that $E_f = \{b\}$ and $u = f(b)$.

Conversely, let $f$ be the morphism defined by $f(b) = u$, $f(x) = \varepsilon$ for $x$ in $\mathrm{alph}(u) \setminus \{b\}$ and $f(x) = x$ for $x \in C$. The morphism $f$ is not the identity since $|u| \ge 2$. It satisfies $\mathrm{Card}(E_f) = 1$, and since $w \in (C \cup \{u\})^*$, $f(w) = w$. □
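The "if" direction of Lemma 3.6 is constructive, and the construction is easy to write down. The sketch below builds the morphism of the proof from a triple $(C, b, u)$ and checks that it fixes a word of $(C \cup \{u\})^*$; the sample word and the function names are illustrative assumptions.

```python
def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def witness_from_decomposition(C, b, u):
    """Morphism of Lemma 3.6: b -> u, other letters of u -> empty, letters of C fixed."""
    assert len(u) >= 2 and u.count(b) == 1 and not (set(u) & set(C))
    f = {x: "" for x in u}
    f[b] = u
    f.update({c: c for c in C})
    return f


def in_star(w, C, u):
    """Test whether w belongs to (C | {u})* by left-to-right parsing.

    Since alph(u) and C are disjoint, the next letter decides unambiguously
    whether a letter of C or a copy of u must follow.
    """
    i = 0
    while i < len(w):
        if w.startswith(u, i):
            i += len(u)
        elif w[i] in C:
            i += 1
        else:
            return False
    return True


C, b, u = {"d"}, "b", "abc"
w = "dabcabcd"
f = witness_from_decomposition(C, b, u)
print(in_star(w, C, u), apply(f, w) == w)
```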

4. The conjecture

In 1993, in the newsgroup comp.theory, Billaud [2] proposed

Conjecture 4.1. Let $A$ be an alphabet with at least three letters. Let $w$ be a word with $\mathrm{alph}(w) = A$. If for all $a \in A$, $\pi_a(w) \in \mathrm{FW}$, then $w \in \mathrm{FW}$.

Conjecture 4.1 assumes $\mathrm{Card}(A) \ge 3$. This is due to the fact that if $\mathrm{Card}(A) \le 2$, for $w \in A^*$ and $a \in A$, $\pi_a(w)$ belongs to $b^*$ for a letter $b$ and thus does not belong to FW.

The converse of the previous conjecture does not hold. There exist words $w$ in FW such that $\pi_a(w) \notin \mathrm{FW}$ for at least one letter $a$ in $\mathrm{alph}(w)$. For instance, one can consider the word $w = acbba$. This word belongs to FW since it has a witness $f$ defined on $\{a, b, c\}$ by $f(a) = \varepsilon$, $f(b) = \varepsilon$, $f(c) = w$. Nevertheless $\pi_c(w) = abba \notin \mathrm{FW}$.

Note also that given an alphabet with at least three letters, we can find words $w \in \mathrm{FW}$ such that for each letter $x$ in $\mathrm{alph}(w)$, $\pi_x(w) \notin \mathrm{FW}$. For instance, if $A = \{a_1, \ldots, a_n\}$ with $n \ge 3$, the word

$$w = \prod_{i=1}^{n-1} (a_i a_n) \prod_{i=1}^{n-1} (a_{n-i} a_n) = a_1 a_n a_2 a_n \cdots a_{n-1} a_n a_{n-1} a_n \cdots a_2 a_n a_1 a_n$$

belongs to FW. It has as witness the morphism $f$ defined by $f(a_i) = a_i a_n$ for $1 \le i < n$ and $f(a_n) = \varepsilon$. But (we let the reader verify that) for all $x$ in $\mathrm{alph}(w)$, $\pi_x(w)$ does not belong to FW.

Perhaps one reason why Conjecture 4.1 is difficult to solve is that it is equivalent to another one that seems more general. We state this new conjecture and show the equivalence. For this, let us define a family of sets

$$\mathrm{FS} = \{ S \mid \exists f : \mathrm{alph}(S)^* \to \mathrm{alph}(S)^* \text{ morphism},\ f \ne \mathrm{Id},\ \text{for all } w \text{ in } S,\ f(w) = w \}.$$

For instance,

$$\{ab, ba\} \notin \mathrm{FS}, \qquad \{\varepsilon, abcb, aaa, bcb, bcbabcb\} \in \mathrm{FS}.$$

Of course for each word $w$ in FW, the singleton $\{w\}$ belongs to FS.
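The examples of this section can be checked by exhaustive search. The sketch below is a brute-force illustration under the assumption that words are strings of single-character letters; it uses the fact that a morphism fixing every word of $S$ must send each letter of $\mathrm{alph}(S)$ to a factor of some word of $S$. The names `in_FS`, `in_FW` and `example_word` are hypothetical.

```python
from itertools import product


def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def factors(w):
    """All factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def in_FS(S):
    """True if some non-identity morphism on alph(S) fixes every word of S."""
    letters = sorted(set("".join(S)))
    candidates = sorted(set().union(*(factors(w) for w in S)))
    for images in product(candidates, repeat=len(letters)):
        f = dict(zip(letters, images))
        if any(f[a] != a for a in letters) and all(apply(f, w) == w for w in S):
            return True
    return False


def in_FW(w):
    """A word is in FW exactly when its singleton is in FS."""
    return in_FS([w])


def example_word(letters):
    """The word (a_1 a_n)...(a_{n-1} a_n)(a_{n-1} a_n)...(a_1 a_n) of Section 4."""
    *first, last = letters
    return "".join(a + last for a in first) + "".join(a + last for a in reversed(first))


w = example_word("abc")  # n = 3, with a_1 = a, a_2 = b, a_3 = c
print(w, in_FW(w), [in_FW(w.replace(x, "")) for x in "abc"])
```

For $n = 3$ the search confirms that $w$ is in FW while none of its three projections is, and it reproduces the two FS examples given above.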


Conjecture 4.2. Let $A$ be an alphabet with at least three letters. Let $S \subseteq A^*$ with $\mathrm{alph}(S) = A$. If for all $a \in A$, $\pi_a(S) \in \mathrm{FS}$, then $S \in \mathrm{FS}$.

Theorem 4.3. Conjecture 4.1 is true if and only if Conjecture 4.2 is. More precisely, given an alphabet $A$, Conjecture 4.1 is true for words $w$ with $\mathrm{alph}(w) = A$ if and only if Conjecture 4.2 is true for sets $S$ of words with $\mathrm{alph}(S) = A$.

The "if" part of this theorem is immediate, since Conjecture 4.1 is the restriction of Conjecture 4.2 to singletons. Conversely, Proposition 4.5 states that we can reduce Conjecture 4.2 to Conjecture 4.1 by using the next lemma inductively. Note that Proposition 4.5 will be of central importance in proving Conjecture 4.1 when, for each letter $x$, $\pi_x(w)$ has a witness with only one expansive letter (Theorem 6.1).

Lemma 4.4. Let $f$ be a morphism on an alphabet $A$. Given two words $u$ and $v$ over $A$, the following three assertions are equivalent:
(1) $f(u) = u$ and $f(v) = v$,
(2) $f(uv) = uv$ and $f(vu) = vu$,
(3) $f(uvvu) = uvvu$.

Proof. $1 \Rightarrow 3$ is immediate.

$3 \Rightarrow 2$. If $f(uvvu) = uvvu$, then since $|f(uv)| = |f(vu)|$ and $|uv| = |vu|$, we have $|f(uv)| = |uv|$. Thus $f(uv) = uv$ and $f(vu) = vu$.

$2 \Rightarrow 1$. Assume $f(uv) = uv$ and $f(vu) = vu$. If $|f(u)| = |u|$ then we directly have the result. So we can assume without loss of generality that $|f(u)| > |u|$ (the case $|f(u)| < |u|$ is similar). Then there exists a word $x \ne \varepsilon$ such that $f(u) = ux$ and $v = x f(v)$. Since $f(vu) = vu$, we have $f(v) u x = x f(v) u$. From a classical result about the equation $xy = yx$ (see [18] for instance), there exist words $r, s$ and integers $i, j, k$ such that

$$x = (rs)^k, \qquad f(v) = (rs)^i r, \qquad u = s (rs)^j$$

with $k \ne 0$ and $rs \ne \varepsilon$.

Since $v = x f(v)$, $v = \prod_{i \ge 0} f^i(x)$. But $v$ is a finite word. So there exists a smallest integer $n$ such that $f^n(x) = \varepsilon$ (observe that $n \ge 1$ because $x \ne \varepsilon$). Consequently $f^n((rs)^k) = \varepsilon$ and so $f^n(r) = \varepsilon$ and $f^n(s) = \varepsilon$. It follows that $\varepsilon = f^n(u) = f^{n-1}(u) f^{n-1}(x)$, hence $f^{n-1}(x) = \varepsilon$. This contradicts the minimality of $n$. □

Proposition 4.5. Let $A$ be an alphabet with at least three letters. Let $S$ be a set of words such that $\mathrm{alph}(S) = A$ and for all $a \in A$, $\pi_a(S) \in \mathrm{FS}$. For $a \in A$, let $f_a$ be a morphism on $A \setminus \{a\}$ such that $f_a \ne \mathrm{Id}$ and for each $x \in \pi_a(S)$, $f_a(x) = x$. There exists a word $w$ such that $\mathrm{alph}(w) = \mathrm{alph}(S)$ and
(1) for all $a \in A$, $\pi_a(w)$ belongs to FW and has $f_a$ as witness ($f_a(\pi_a(w)) = \pi_a(w)$).
(2) $w \in \mathrm{FW}$ if and only if $S \in \mathrm{FS}$. More precisely, for any morphism $f \ne \mathrm{Id}$ on $A$, $f(w) = w$ if and only if for all $x$ in $S$, $f(x) = x$.


Proof. The claim holds by induction on $\mathrm{Card}(S)$. The result is immediate if $\mathrm{Card}(S) = 1$.

Assume $\mathrm{Card}(S) \ge 2$. Let $u, v$ be two different elements of $S$. Let $S' = (S \setminus \{u, v\}) \cup \{uvvu\}$. For $a \in A$ and $x \in S'$, $f_a(\pi_a(x)) = \pi_a(x)$, and so $\pi_a(S') \in \mathrm{FS}$. Moreover $\mathrm{Card}(S') < \mathrm{Card}(S)$. By the induction hypothesis, there exists a word $w$ such that $\mathrm{alph}(w) = \mathrm{alph}(S')$ and
(1) for all $a \in A$, $f_a(\pi_a(w)) = \pi_a(w)$.
(2) $w \in \mathrm{FW}$ if and only if $S' \in \mathrm{FS}$. More precisely, for any morphism $f \ne \mathrm{Id}$ on $A$, $f(w) = w$ if and only if for all $x$ in $S'$, $f(x) = x$.

By construction, $\mathrm{alph}(S') = \mathrm{alph}(S) = \mathrm{alph}(w)$. To end the proof, we have to show, for any morphism $f \ne \mathrm{Id}$, the equivalence between
• for all $x$ in $S$, $f(x) = x$, and
• for all $x$ in $S'$, $f(x) = x$.
This is a direct consequence of Lemma 4.4. □
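The induction in this proof amounts to folding a finite set $S$ into a single word by repeatedly replacing two distinct words $u, v$ with $uvvu$. The sketch below implements that folding for a non-empty finite set and checks, on a finite family of candidate morphisms only, that a morphism fixes the resulting word exactly when it fixes every word of $S$. The order in which pairs are merged and the function names are illustrative choices.

```python
from itertools import product


def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def factors(w):
    """All factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def merge(S):
    """Fold a non-empty finite set of words into one word via u, v -> uvvu."""
    words = sorted(set(S))
    while len(words) > 1:
        u, v = words[0], words[1]
        words = sorted(set(words[2:]) | {u + v + v + u})
    return words[0]


def same_fixing_morphisms(S):
    """Check that f fixes merge(S) iff f fixes every word of S.

    Only morphisms whose images are factors of words of S are tried, so this
    is a sanity check on a sample of morphisms, not a proof.
    """
    w = merge(S)
    letters = sorted(set(w))
    candidates = sorted(set().union(*(factors(x) for x in S)))
    for images in product(candidates, repeat=len(letters)):
        f = dict(zip(letters, images))
        if (apply(f, w) == w) != all(apply(f, x) == x for x in S):
            return False
    return True


print(merge(["ab", "ba"]), same_fixing_morphisms(["abcb", "aaa", "bcb"]))
```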

Proof of Theorem 4.3. The "if" part is immediate. Assume now that Conjecture 4.1 is true for all words $w$ with $\mathrm{alph}(w) = A$. Let $S \subseteq A^*$ with $\mathrm{alph}(S) = A$ such that for all $a$ in $A$, $\pi_a(S) \in \mathrm{FS}$. By Proposition 4.5, there exists a word $w$ such that
(1) for all $a \in A$, $\pi_a(w) \in \mathrm{FW}$.
(2) $w \in \mathrm{FW}$ if and only if $S \in \mathrm{FS}$.
From Assertion 1 and the fact that Conjecture 4.1 is true, we know $w \in \mathrm{FW}$. So $S \in \mathrm{FS}$. □

5. A particular case

As already said, Conjecture 4.1 was solved by Zimmermann [22] in the case of words over 3-letter alphabets. In this section, we extend this result.

Proposition 5.1. Let $w$ be a word such that $\mathrm{Card}(\mathrm{alph}(w)) \ge 3$. Assume there exist three pairwise distinct letters $a, b, c$ in $\mathrm{alph}(w)$ such that, for each letter $x \in \{a, b, c\}$, there exists a witness $f_x$ of $\pi_x(w)$ such that $\mathrm{Card}(E_{f_x}) = 1$ and $\mathrm{Card}(C_{f_x}) = 0$. Then $w \in \mathrm{FW}$. Moreover $w$ has a witness $f$ with $\mathrm{Card}(E_f) = 1$ and $\mathrm{Card}(C_f) = 0$.

Before proving this proposition, let us mention two consequences (one can note the similarity between Corollary 5.3 and Conjecture 4.1).

Corollary 5.2 (Zimmermann [22]). Conjecture 4.1 is true for words over a 3-letter alphabet.

Proof. Let $w$ be a word over a 3-letter alphabet such that for each $x \in \mathrm{alph}(w)$, $\pi_x(w) \in \mathrm{FW}$. Let $x \in \mathrm{alph}(w)$ and let $f_x$ be a witness of $\pi_x(w)$. Since $f_x \ne \mathrm{Id}$, $f_x(\pi_x(w)) = \pi_x(w)$ and $\mathrm{Card}(\mathrm{alph}(\pi_x(w))) = 2$, necessarily $\mathrm{Card}(E_{f_x}) = 1 = \mathrm{Card}(M_{f_x})$ and $\mathrm{Card}(C_{f_x}) = 0$. So by Proposition 5.1, $w \in \mathrm{FW}$. □
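Corollary 5.2 can be sanity-checked on short words. The sketch below enumerates all words over a three-letter alphabet up to a small length and looks for a word whose three projections are in FW while the word itself is not; by the corollary the search should come back empty. It is a brute-force experiment with hypothetical function names, not part of the proof, and its cost grows quickly with the length bound.

```python
from functools import lru_cache
from itertools import product


def apply(f, w):
    """Image of the word w under the morphism f (dict: letter -> word)."""
    return "".join(f[a] for a in w)


def factors(w):
    """All factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


@lru_cache(maxsize=None)
def in_FW(w):
    """Brute force: some non-identity morphism on alph(w) fixes w."""
    letters = sorted(set(w))
    candidates = sorted(factors(w))
    for images in product(candidates, repeat=len(letters)):
        f = dict(zip(letters, images))
        if any(f[a] != a for a in letters) and apply(f, w) == w:
            return True
    return False


def hypothesis_holds(w):
    """Every projection pi_x(w), x in alph(w), belongs to FW."""
    return all(in_FW(w.replace(x, "")) for x in set(w))


def counterexamples(max_len, alphabet="abc"):
    """Words using all letters that satisfy the hypothesis but are not in FW."""
    found = []
    for n in range(len(alphabet), max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if set(w) == set(alphabet) and hypothesis_holds(w) and not in_FW(w):
                found.append(w)
    return found


print(counterexamples(6))
```

The word `"abcabc"` is one instance where the hypothesis holds; its witness erases `b` and `c` and maps `a` to `abc`.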


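Corollary 5.3. Let $w$ be a word such that $\mathrm{Card}(\mathrm{alph}(w)) \ge 3$, and let $A = \mathrm{alph}(w)$. If for all $a, b \in A$, $\pi_{ab}(w) \in \mathrm{FW}$, then $w \in \mathrm{FW}$. More precisely, $w$ has a witness $f$ such that $\mathrm{Card}(E_f) = 1$ and $\mathrm{Card}(C_f) = 0$.

Proof. When $\mathrm{Card}(\mathrm{alph}(w)) = 3$, the result follows as in the proof of Corollary 5.2.

Assume $\mathrm{Card}(\mathrm{alph}(w)) \ge 4$. For $x \in A$ and for $a, b \in A \setminus \{x\}$, $\pi_{ab}(\pi_x(w)) = \pi_{ab}(w) \in \mathrm{FW}$. Thus by induction, for each $x \in A$, there exists a witness $f_x$ of $\pi_x(w)$ such that $\mathrm{Card}(E_{f_x}) = 1$ and $\mathrm{Card}(C_{f_x}) = 0$. The result then follows from Proposition 5.1. □

The rest of this section is devoted to the proof of Proposition 5.1.

Proof of Proposition 5.1. We assume the hypotheses of the proposition. Let $\alpha$ be a letter in $\mathrm{minLetters}(w)$. We successively prove three facts before completing the proof of the proposition.

Fact 5.4. $|w|_x$ is a multiple of $|w|_\alpha$ for each letter $x$ in $\mathrm{alph}(w)$.

Proof. Without loss of generality, assume $\alpha \notin \{b, c\}$, that is, $\alpha = a$ or $\alpha \notin \{a, b, c\}$. Since $\mathrm{Card}(C_{f_b}) = 0$, $\alpha \in M_{f_b} \cup E_{f_b}$. By Lemma 3.5, we can assume $\alpha \in E_{f_b}$. Since $\mathrm{Card}(E_{f_b}) = 1$, $e_{f_b} = \alpha$. Since $\mathrm{Card}(C_{f_b}) = 0$, for $x$ in $\mathrm{alph}(w) \setminus \{b\}$, $|w|_x = |f_b(\alpha)|_x \times |w|_\alpha$. Similarly $e_{f_c} = \alpha$ and $|w|_b = |f_c(\alpha)|_b \times |w|_\alpha$. □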

Let

$$u_a = f_a(e_a)^{|w|_{e_a}/|w|_\alpha}, \qquad u_b = f_b(e_b)^{|w|_{e_b}/|w|_\alpha}, \qquad u_c = f_c(e_c)^{|w|_{e_c}/|w|_\alpha},$$

where $e_x$ denotes $e_{f_x}$. Since $C_{f_a} = C_{f_b} = C_{f_c} = \emptyset$, we have $\pi_x(w) = u_x^{|w|_\alpha}$ for each $x$ in $\{a, b, c\}$.

Now let $w_1, \ldots, w_{|w|_\alpha}$ be the words of length $|w|/|w|_\alpha$ such that $w = \prod_{i=1}^{|w|_\alpha} w_i$. We have

Fact 5.5. $|w_i|_x = |u_\beta|_x$ for all $\beta$ in $\{a, b, c\}$, $x$ in $\mathrm{alph}(w) \setminus \{\beta\}$, and $i$ in $\{1, \ldots, |w|_\alpha\}$.

Proof. We prove this fact when $\beta = a$. The cases $\beta = b$ and $\beta = c$ are similar and are left to the reader.

Let $x \in \mathrm{alph}(w) \setminus \{a\}$. We have $x \ne b$ or $x \ne c$. Once again, we treat only the case $x \ne b$. We have $\pi_b(u_a^{|w|_\alpha}) = \pi_b(\pi_a(w)) = \pi_a(\pi_b(w)) = \pi_a(u_b^{|w|_\alpha})$. So $\pi_b(u_a) = \pi_a(u_b)$ and $|u_a|_x = |u_b|_x$.

Let $j$ be an integer such that $1 \le j \le |u_a|_x$. Let $v_1, v_2, v_3, v_4$ be the words such that $u_a = v_1 x v_2$, $u_b = v_3 x v_4$ and $|v_1|_x = |v_3|_x = j - 1$. From $\pi_b(u_a) = \pi_a(u_b)$, we get $\pi_b(v_1) = \pi_a(v_3)$ and so for all $y \in \mathrm{alph}(w) \setminus \{a, b\}$, $|v_1|_y = |v_3|_y$.

Let

$$p = |v_1| + |v_3|_a = |v_1|_b + |v_3|_a + \sum_{y \in \mathrm{alph}(w) \setminus \{a, b\}} |v_1|_y = |v_1|_b + |v_3|.$$

Let $i$ with $1 \le i \le |w|_\alpha$. We observe that the $(j + (i-1)|u_a|_x)$th occurrence of the letter $x$ in $w$ (and so the same occurrence in $\pi_a(w)$ and $\pi_b(w)$) is preceded by $|v_1| + (i-1)|u_a|$ letters in $\pi_a(w) = (u_a)^{i-1} v_1 x v_2 (u_a)^{|w|_\alpha - i}$ and is preceded by $|v_3| + (i-1)|u_b|$ letters in $\pi_b(w) = (u_b)^{i-1} v_3 x v_4 (u_b)^{|w|_\alpha - i}$. So this occurrence of the letter $x$ is preceded by $q$ letters in $w$, where

$$q = (|v_1|_b + (i-1)|u_a|_b) + (|v_3|_a + (i-1)|u_b|_a) + \sum_{y \in \mathrm{alph}(w) \setminus \{a, b\}} (|v_1|_y + (i-1)|u_a|_y).$$

But, since $\pi_a(w) = u_a^{|w|_\alpha}$ and $\pi_b(w) = u_b^{|w|_\alpha}$, we have $|u_a|_b = |w|_b / |w|_\alpha$, $|u_b|_a = |w|_a / |w|_\alpha$ and, for $y \in \mathrm{alph}(w) \setminus \{a, b\}$, $|u_a|_y = |u_b|_y = |w|_y / |w|_\alpha$. So

$$|u_a|_b + |u_b|_a + \sum_{y \in \mathrm{alph}(w) \setminus \{a, b\}} |u_b|_y = \frac{|w|}{|w|_\alpha} \qquad \text{and} \qquad q = p + (i-1) \frac{|w|}{|w|_\alpha}.$$

Since $0 \le p < |u_a| + |u_b|_a = |w|/|w|_\alpha$ and since $w = \prod_{j=1}^{|w|_\alpha} w_j$, this occurrence of the letter $x$ is a factor of $w_i$. Since this is true for any value of $j$, $1 \le j \le |u_a|_x$, and any value of $i$, $1 \le i \le |w|_\alpha$, we get $|w_i|_x = |u_a|_x$. □

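Fact 5.6. $\pi_{xy}(w_i) = \pi_{xy}(w_1)$ for all $i$ in $\{1, \ldots, |w|_\alpha\}$ and $x, y$ in $\mathrm{alph}(w)$.

Proof. Let $x, y \in \mathrm{alph}(w)$. There exists a letter $\beta \in \{a, b, c\}$ such that $\beta \ne x$ and $\beta \ne y$. We have $\pi_\beta(w) = u_\beta^{|w|_\alpha} = \prod_{i=1}^{|w|_\alpha} \pi_\beta(w_i)$. So $\pi_{xy}(u_\beta)^{|w|_\alpha} = \prod_{i=1}^{|w|_\alpha} \pi_{xy}(w_i)$. By Fact 5.5, $|u_\beta|_{\{x,y\}} = |w_i|_{\{x,y\}}$. Consequently, for each $i \in \{1, \ldots, |w|_\alpha\}$, $\pi_{xy}(u_\beta) = \pi_{xy}(w_i)$. In particular $\pi_{xy}(w_i) = \pi_{xy}(w_1)$. □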

Now let us recall a particular case of a more general result in trace theory [7] (see also [8,9] for more information about trace theory).

Lemma 5.7 (Cori and Perrin [7]). Let $u$ and $v$ be two words. The following conditions are equivalent:
(1) $u = v$.
(2) for all $x, y$ in $A$, $\pi_{xy}(u) = \pi_{xy}(v)$.
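Lemma 5.7 says that a word is determined by its projections on pairs of letters. The sketch below checks this on all short words over a small alphabet by verifying that the map from a word to its tuple of two-letter projections is injective. It assumes an alphabet with at least two letters and uses hypothetical function names; it is an experiment on bounded lengths, not a proof.

```python
from itertools import combinations, product


def pair_projections(w, alphabet):
    """Tuple of the projections of w on every two-letter subset of the alphabet."""
    return tuple(
        "".join(c for c in w if c in (x, y))
        for x, y in combinations(sorted(alphabet), 2)
    )


def projections_are_injective(alphabet, max_len):
    """True if no two distinct words of length <= max_len share all projections."""
    seen = {}
    for n in range(max_len + 1):
        for letters in product(sorted(alphabet), repeat=n):
            w = "".join(letters)
            key = pair_projections(w, alphabet)
            # setdefault stores w the first time key is met and returns the
            # stored word afterwards, so a mismatch signals a collision
            if seen.setdefault(key, w) != w:
                return False
    return True


print(pair_projections("acb", "abc"))
print(projections_are_injective("abc", 6))
```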

We can now conclude the proof of Proposition 5.1. Indeed by Fact 5.6 and Lemma 5.7, for all $i = 1, \ldots, |w|_\alpha$, $w_i = w_1$. So $w = w_1^{|w|_\alpha}$. Moreover, $|w_1|_\alpha = 1$. By Lemma 3.6, $w \in \mathrm{FW}$. □

6. Main result

Actually, we solve Conjecture 4.1 in a more general case than Proposition 5.1. We have a result that does not depend on the absence of constant letters.

Theorem 6.1. Let $w$ be a word with $\mathrm{Card}(\mathrm{alph}(w)) \ge 3$. Assume that, for each $x$ in $\mathrm{alph}(w)$, $\pi_x(w)$ belongs to FW and has a witness $f_x$ with $\mathrm{Card}(E_{f_x}) = 1$. Then $w$ belongs to FW and has a witness $f$ with $\mathrm{Card}(E_f) = 1$.

The proof of this theorem is based on the following result.

Proposition 6.2. Let $w$ be a word such that $\mathrm{Card}(\mathrm{alph}(w)) \ge 4$. If for each letter $x$, the word $\pi_x(w)$ has a witness $f_x$ with $\mathrm{Card}(E_{f_x}) = 1$, then (at least) one of the two following assertions holds:
(1) $w \in \mathrm{FW}$ and it has a witness $f$ with $\mathrm{Card}(E_f) = 1$.
(2) there exist two non-empty sets $B$ and $C$ with $\mathrm{alph}(w) = B \cup C$, $B \cap C = \emptyset$ such that for all $x$ in $B$ and $y$ in $C$, $y$ belongs to $C_{f_x}$.

Proof of Theorem 6.1. From Proposition 3.3, we can assume that the morphisms $f_x$ in the hypotheses are idempotent. We proceed by induction on $\mathrm{Card}(\mathrm{alph}(w))$.

When $\mathrm{Card}(\mathrm{alph}(w)) = 3$, Corollary 5.2 states that $w \in \mathrm{FW}$. Moreover, the proof of this corollary can be continued to show that $w$ has a witness $f$ with $\mathrm{Card}(E_f) = 1$.

From now on, assume that $\mathrm{Card}(\mathrm{alph}(w)) \ge 4$, and that we are in the second case of Proposition 6.2 (the first one is the expected conclusion of the proof): there exist two non-empty sets $B$ and $C$ such that $\mathrm{alph}(w) = B \cup C$, $B \cap C = \emptyset$, and for all $\alpha$ in $B$, $C \subseteq C_{f_\alpha}$. There exist an integer $n \ge 1$ and words $x_i \in C^*$ ($i = 0, \ldots, n$) and $y_i \in B^+$ ($i = 1, \ldots, n$) such that $x_i \ne \varepsilon$ if $i \in \{1, \ldots, n-1\}$ and

$$w = x_0 \prod_{i=1}^{n} y_i x_i.$$

Let $S = \{y_1, \ldots, y_n\}$. We have $\mathrm{alph}(S) = B$. Let $x \in B$. Let us denote by $g_x$ the restriction of $f_x$ to the set $B$. Since $C \subseteq C_{f_x}$, for $y \in S$, we have $f_x(\pi_x(y)) = \pi_x(y)$, that is, $g_x(\pi_x(y)) = \pi_x(y)$. From $B \cup C = \mathrm{alph}(w) = \{x\} \cup C_{f_x} \cup M_{f_x} \cup E_{f_x}$, $B \cap C = \emptyset$ and $C \subseteq C_{f_x}$, we deduce $\{x\} \cup M_{f_x} \cup E_{f_x} \subseteq B$. So $M_{g_x} = M_{f_x}$, $E_{g_x} = E_{f_x}$, $\mathrm{Card}(B) \ge 3$ and $g_x \ne \mathrm{Id}_B$.

So $S$ satisfies the hypotheses of Proposition 4.5. There exists a word $w'$ such that $\mathrm{alph}(w') = \mathrm{alph}(S)$ and
(1) for all $a \in \mathrm{alph}(w')$, $f_a(\pi_a(w')) = \pi_a(w')$.
(2) $w' \in \mathrm{FW}$ if and only if $S \in \mathrm{FS}$. More precisely, for any morphism $f \ne \mathrm{Id}$ on $\mathrm{alph}(w')$, $f(w') = w'$ if and only if for all $y$ in $S$, $f(y) = y$.

Since $\mathrm{Card}(\mathrm{alph}(w')) = \mathrm{Card}(\mathrm{alph}(S)) = \mathrm{Card}(B) < \mathrm{Card}(\mathrm{alph}(w))$, by the induction hypothesis, there exists a morphism $f \ne \mathrm{Id}$ on $B$ such that $\mathrm{Card}(E_f) = 1$ and $f(w') = w'$. So for all $y$ in $S$, $f(y) = y$. For $c \in C$, define $f(c) = c$. We get $f(w) = w$. □
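The decomposition $w = x_0 \prod_{i=1}^{n} y_i x_i$ used in this proof simply cuts $w$ into maximal blocks over $B$ and over its complement. A minimal sketch, assuming words are strings and $B$ is a set of characters (the function names and the sample word are illustrative):

```python
from itertools import groupby


def block_decomposition(w, B):
    """Split w into maximal blocks over B (the y_i) and over its complement (the x_i).

    Returns (xs, ys) with w = xs[0] + ys[0] + xs[1] + ... + ys[-1] + xs[-1],
    so that len(xs) == len(ys) + 1; xs[0] and xs[-1] may be empty.
    """
    xs, ys = [""], []
    for in_B, group in groupby(w, key=lambda c: c in B):
        block = "".join(group)
        if in_B:
            ys.append(block)
            xs.append("")   # placeholder for the x-block following this y-block
        else:
            xs[-1] = block
    return xs, ys


def reassemble(xs, ys):
    """Inverse of block_decomposition."""
    return xs[0] + "".join(y + x for y, x in zip(ys, xs[1:]))


xs, ys = block_decomposition("dabcdabc", set("abc"))
print(xs, ys, set(ys))  # set(ys) plays the role of the set S of the proof
```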

The rest of this section is devoted to the proof of Proposition 6.2. We are more specific about the hypotheses and we make some choices (without loss of generality).

Hypotheses on $w$. We assume $w$ is a word such that $\mathrm{Card}(\mathrm{alph}(w)) \ge 4$. We are going to do a proof by contraposition. So we assume the following two hypotheses.

Hypothesis 1. Either $w \notin \mathrm{FW}$, or, if $w \in \mathrm{FW}$, it has no witness $f$ with $\mathrm{Card}(E_f) = 1$.

Hypothesis 2. For all non-empty sets $B, C$ with $\mathrm{alph}(w) = B \cup C$, $B \cap C = \emptyset$, there exist $x \in B$, $y \in C$ such that $y \notin C_{f_x}$.

This hypothesis can also be written: for all non-empty subsets $B$ of $\mathrm{alph}(w)$ with $B \ne \mathrm{alph}(w)$, there exist $x \in B$, $y \notin B$ such that $y \notin C_{f_x}$.

Hypotheses on the $\pi_x(w)$'s. For each letter $x$, the word $\pi_x(w)$ has a witness $f_x$ with $\mathrm{Card}(E_{f_x}) = 1$. By Proposition 3.3, we can assume $f_x$ is idempotent. We simplify the notation by denoting by $E_x$ the set $E_{f_x}$, by $C_x$ the set $C_{f_x}$, and by $M_x$ the set $M_{f_x}$. We also denote by $e_x$ the letter such that $E_x = \{e_x\}$.


By Corollary 3.4,

$$\pi_x(w) \in (C_x \cup \{f_x(e_x)\})^*. \tag{1}$$

Moreover Lemma 3.6 and Hypothesis 1 imply

$$w \notin (C_x \cup \{x, f_x(e_x)\})^*. \tag{2}$$

The words $\pi_x(w)$ can have several (but finitely many) idempotent witnesses. We consider a particular idempotent witness $f_x$ such that

Hypothesis 3. For all idempotent witnesses $f$ of $\pi_x(w)$, if $E_f = \{y\}$ then $|f(y)| \le |f_x(e_x)|$.

We work in the rest of the proof with a letter having a minimal number of occurrences in $w$. Let $a \in \mathrm{minLetters}(w)$, that is, $a$ is a letter such that

Hypothesis 4. For all $x$ in $\mathrm{alph}(w)$, $|w|_a \le |w|_x$.

By Lemma 3.5, we assume

Hypothesis 5. $e_c = a$ for each letter $c \ne a$ such that $a \notin C_c$.

Fact 6.3. There exists (at least) one letter $c$ such that $e_c = a$.

Proof. By Hypothesis 2 (taking $B = \mathrm{alph}(w) \setminus \{a\}$ and $C = \{a\}$), there exists $c \ne a$ such that $a \notin C_c$. By Hypothesis 5, $e_c = a$. □

Let $b = e_a$. Eq. (1) can be rewritten (when $x = a$) as follows: $\pi_a(w) \in (C_a \cup \{f_a(b)\})^*$. From Eq. (2), we deduce that there exist $n \ge 1$ words $Z_1, \ldots, Z_n$ such that

$$w \in (C_a \cup \{a, f_a(b), Z_1, \ldots, Z_n\})^* \tag{3}$$

and for all $i$ with $1 \le i \le n$,
• $Z_i$ is a factor of $w$,
• $Z_i$ does not start with $a$ and does not end with $a$,
• $|Z_i|_a \ge 1$,
• $\pi_a(Z_i) = f_a(b)$.

The rest of the proof is divided into several steps that gradually make the structure of the word $w$ more precise, by deriving contradictions with the hypotheses.
• Step 1: Each $Z_i$ ($1 \le i \le n$) contains exactly one occurrence of the letter $a$.
• Step 2: $n = 1$, that is, there exist (unique) non-empty words $\alpha$, $\beta$, $Z$ such that $\pi_a(Z) = f_a(b) = \alpha\beta$, $Z = \alpha a \beta$ is a factor of $w$ and $w \in (C_a \cup \{a, f_a(b), Z\})^*$.
• Step 3: $(\alpha, \beta) = (b, c^m)$ or $(\alpha, \beta) = (c^m, b)$ for a letter $c \notin \{a, b\}$ and an integer $m \ge 1$. Moreover $f_a(b)$ is a factor of $w$.

Until the end of the proof, we assume $\alpha = b$ and $\beta = c^m$ (the symmetric case is left to the reader). In other terms, we assume $w \in (C_a \cup \{a, bc^m, bac^m\})^*$ with $bc^m$ and $bac^m$ factors of $w$.
• From Step 4 to the end of the proof, the aim is to prove (by induction) that for any integer $l \ge 1$,
$$w \in (K_{2l} \cup \{\gamma_{2l-1} \cdots \gamma_3 \gamma_1 a,\ \gamma_{2l} \gamma_{2l-2} \cdots \gamma_2 b c^m,\ \gamma_{2l} \gamma_{2l-1} \cdots \gamma_2 \gamma_1 b a c^m\})^*$$
where the non-empty words $\gamma_i$ have pairwise disjoint alphabets. This contradicts the fact that $w$ contains only a finite number of letters. In particular, Step 4 states the existence of $\gamma_1$, and Step 8 the existence of $\gamma_2$. The technical Steps 5–7, 9 and 10 are needed to state properties of the words $\gamma_i$ and the letters of $w$.

Step1: |Zi |a = 1, for all i , 1� i�n.

Proof. From Fact6.3, there exists a letterc such thatec = a. Letuandv be the words suchthatfc(a) = uav. By definition offc, uv �= ε and Card(uav)∩Cc = ∅. Moreover Eq. (1)states that�c(w) ∈ (Cc ∪ {uav})∗.Assume there exists an integeri such that 1� i�n and |Zi |a�2. Let P,X, S be the

words such thatZi = PaXaS, |P |a = 0, |S|a = 0. SinceZi does not start (resp., end)with a, P �= ε (resp.,S �= ε). SinceZi is a factor ofw, the word�c(Zi) is a factor of�c(w). Consequently�c(X) = vYu for a wordY. Sincea /∈ alph(uv), we deduce thatMc = alph(uv) ⊆ alph(X)\{a} ⊆ alph(Zi)\{a} ⊆ alph(fa(b)) = Ma ∪ {b}. By Corollary3.2, alph(w) = Ma ∪ {a, b} ∪ Ca = Mc ∪ {a} ∪ Cc ∪ {c}. ThusCa ⊆ Cc ∪ {c}.Let � = alph(uv). From�c(w) ∈ (Cc ∪ {uav})∗, we get|w|� = |w|a × |uv|. From

�a(w) ∈ (Ca ∪ {fa(b)})∗ and� ⊆ Ma ∪ {b} = alph(fa(b)), we get|w|� = |w|b ×|fa(b)|� = |w|b × (|uv| + |PYS|)� |w|b × |uv|. By Hypothesis4, |w|a� |w|b. Conse-quently |w|a = |w|b and |fa(b)|� = |uv|. This means in particular that alph(PYS) ∩alph(uv) = ∅. As �c(Zi) = �c(P )avYua�c(S), we can see that|Zi |a = 2 and|X|a = 0.Moreover�c(P ) = ε or u = ε. Similarly �c(S) = ε or v = ε. We consider the followingthree cases :�c(P ) �= ε, �c(S) �= ε, and�c(P ) = �c(S) = ε.Case�c(P ) �= ε: Let � be the first letter of�c(P ). From what precedes, we know

u = ε. But |fc(a)|�2. Sov �= ε and�c(S) = ε. From Eq. (3), we get�c(w) ∈ (�c(Ca) ∪{a, �c(fa(b)), �c(Z1), . . . , �c(Zn)})∗. The first letter of�c(fa(b)) is�. For eachj, 1�j�n,the first letter of�c(Zj ) is � or a. So�c(w) has�c(Zi) = �c(P )avYa as suffix, or�c(w)has a factor�c(Zi)d = �c(P )avYad whered is a letter in{a, �} ∪ Ca . Sinced �= c

andCa ⊆ Cc ∪ {c}, d ∈ {a, �} ∪ Cc and sod /∈ alph(v). We have a contradiction with�c(w) ∈ (Cc ∪ {av})∗.Case�c(S) �= ε: Is symmetric to the previous one.Case�c(P ) = �c(S) = ε: Let p, s be the integers such thatP = cp andS = cs . Since

P �= ε andS �= ε, we havep �= 0 ands �= 0. Letx be a letter different froma andc.Let us prove thata ∈ Cx . If a ∈ Ex∪Mx , byHypothesis5, ex = a. FromZi = cpaXacs ,

we deduce that bothacandcaare factors of�x(w). Soc ∈ Mx . This implies that, for twointegersk andl different from zero,�ac(w) ∈ (ckacl)∗. Since|X|a = 0 anda�ac(X)a is afactor of�ac(w), we getk+ l = |X|c. Let recall that for 1�j�n, |Zj |a�1 and|Zj |b = 1.Moreover|Zi |a = 2 and|w|a = |w|b. So by Eq. (3), the wordfa(b) = cpXcs is a factor ofw. Consequently�ac(fa(b)) = cp+s+|X|c is a factor of�ac(w). Sincep �= 0, �ac(fa(b))cannot be a factor of�ac(w) ∈ (ckacl)∗. We have a contradiction, and soa ∈ Cx .

We now prove thatc ∈ Cx . Assume by contradiction thatc /∈ Cx , that is,c ∈ Ex ∪Mx .Since bothacs andcpa are factors of�x(w), cp is a suffix offx(ex) andcs is a prefix offx(ex). So|fx(ex)|c�p + s andex �= c.Assume first|X|c �= 0. By Eq. (3), a�x(X)a is a factor of�x(w). Thusfx(ex) is a factor

of �x(X). In particularex ∈ alph(X) ⊆ alph(fa(b)). More precisely|X|c = |X|ex ×|fx(ex)|c = |fa(b)|ex × |fx(ex)|c. Consequently,|w|c = |w|ex × |fx(ex)|c = |w|b ×|fa(b)|ex × |fx(ex)|c = |w|b × |X|c < |w|b × |fa(b)|c = |w|c, which is impossible, andso|X|c = 0.Since�a(w) ∈ (Ca∪{fa(b)})∗, andc ∈ alph(fa(b)), |w|c = |w|b×|fa(b)|c = |w|b(p+

s). Eq. (1) states�x(w) ∈ (Cx ∪ {fx(ex)})∗. So|w|c = |w|ex × |fx(ex)|c� |w|ex (p + s).At the beginning of the proof, we have seen|w|a = |w|b. By Hypothesis4 |w|b� |w|ex .Previous inequalities show that|w|b = |w|ex and|fx(ex)|c = p + s. Now letPi andSibe words such thatw = PiZiSi . By Eq. (3), we can assume that�alph(fa(b))(Pi) ∈ fa(b)∗.It follows |Pi |c = 0 modulo(p + s). Since�x(w) starts withPicpa and sincea ∈ Cx ,we have|Picp| = 0 modulo(p + s). Thusp = 0 or s = 0. We have a contradiction. Soc ∈ Cx .In case�c(P ) = �c(P ) = ε, we have shown that forx /∈ {a, c}, {a, c} ⊆ Cx . This

contradicts Hypothesis2. �
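The counting argument used in Step 1 (if the word obtained from $w$ by erasing $c$ is a product of letters of $C_c$ and copies of $uav$, then the letters of $uv$ occur exactly $|w|_a \times |uv|$ times in $w$) can be checked on a toy instance. The Python sketch below is only an illustration under assumed data: the helper names `erase` and `count` and the words `u`, `v`, `w` are hypothetical and do not come from the paper.

```python
def erase(w, letters):
    """Erase every occurrence of the given letters from w."""
    return "".join(ch for ch in w if ch not in letters)


def count(w, letters):
    """Number of occurrences in w of letters belonging to `letters`."""
    return sum(1 for ch in w if ch in letters)


# Hypothetical data: f_c(a) = u a v with u = "x", v = "yx", and C_c = {"z"}.
u, v = "x", "yx"
w = "cxayxzxcayx"
letters_uv = set(u + v)

# Erasing c leaves a product of the letter z and copies of u a v.
print(erase(w, {"c"}))  # xayxzxayx

# Each copy of u a v contributes one a and |uv| letters of alph(uv).
print(count(w, letters_uv) == count(w, {"a"}) * len(u + v))  # True
```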

Now we prove

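Step 2: $n = 1$.

We need some notations. Let $\alpha_i$, $\beta_i$ be the words such that $Z_i = \alpha_i a \beta_i$ ($1 \le i \le n$). We assume $|\alpha_1| < |\alpha_2| < \cdots < |\alpha_n|$. By definition of the $Z_i$'s, $\alpha_i \ne \varepsilon$ and $\beta_i \ne \varepsilon$. We also denote by $\gamma_i$, $\gamma'_i$ ($1 \le i \le n$) the words such that $\alpha_i = \alpha_1\gamma_i$ and $\beta_i = \gamma'_i\beta_n$: $\gamma_1 = \varepsilon = \gamma'_n$.

Since $f_a(b) = \alpha_i\beta_i$ ($1 \le i \le n$), we have $\gamma_i\gamma'_i = \gamma_n = \gamma'_1$ ($1 \le i \le n$).

Before proving Step 2, we state seven intermediate steps. They gradually explain the structure of $w$.

Step 2.1: If $n \ge 2$, for each letter $c$ such that $e_c = a$, $c \in \mathrm{alph}(f_a(b))$.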

Proof. Assume by contradiction thatn�2 and that there exists a letterc /∈ alph(fa(b))such thatec = a. Letuandv be the words such thatfc(a) = uav. From Eq. (3), we deduce�c(w) ∈ (Ca\{c} ∪ {a, fa(b), Z1, . . . , Zn})∗ (moreover eachZi is a factor of�c(w)).We haveZ1 = �1a�2�2 andZ2 = �1�2a�2. Moreover|�1�2�2|b = |fa(b)|b = 1. Wedistinguish the three following cases:|�1|b = 1, |�2|b = 1 and|�2|b = 1.Case|�1|b = 1: Let�1 and�2 be the words such that�1 = �1b�2. SoZ1 = �1b�2a�2�2

(since�1 = �′1�n = �2�

′2�n = �2�2), Z2 = �1b�2�2a�2 and|�2|b = |�2�2|b = 0. Since

Z1 andZ2 are factors of�c(w), and since|�2| �= 0,u is a suffix of both�2 and�2�2.If b /∈ Cc, sinceb �= ec, b ∈ Mc = alph(uv). Sinceb /∈ alph(u), b ∈ alph(v). But �1b

is a prefix of eachZj (1�j�n) and, by Eq. (1), �c(w) ∈ (Cc ∪ {uav})∗. There exist twowordsv1 andv2 in C∗

a such thatv starts with�2v1�1b and�2�2v2�1b. So�2v1 = �2�2v2.This is not possible since|�2| �= 0 and alph(�2�2) ∩ Ca = ∅.So b ∈ Cc. Since|�2�2|a = 0, |�2�2|alph(u) = |�2|alph(u) implies |�2|alph(u) = 0. But

|�2| �= 0 andu is a suffix of�2�2. Sou = ε. As we have seenu is a suffix of�2 and�2�2,we can see thatv is a prefix of�2 or �2 is a prefix ofv.

If v is a prefix of�2, then it is also a prefix of�2�2. Since�2 �= ε, it follows that�2�2contains at least(|v| + 1) occurrences of letters of alph(v). This contradicts the fact that�c(w) ∈ (Cc ∪ {av})∗ since|�2�2|a = 0.So�2 is a prefix ofv with �2 �= v. The(|�2| + 1)th letter ofv is the(|�2| + 1)th letter

of �2�2. It is a letter of alph(fa(b)). Sinceb ∈ Cb, v is a prefix of both�2�1 and�2�2�1.We can prove thatv = ε similarly as foru. We have a contradiction withuv �= ε.Case|�2|b = 1: Is similar to the previous one.Case|�2|b = 1: Let �1 and�2 be the words such that�2 = �1b�2. We haveZ1 =

�1a�1b�2�2, Z2 = �1�1b�2a�2.Assumefirst|u|b�1.Sinceb�2a is a factor of�c(w),b�2 is a suffix ofu. Letters occurring

in �2 belong to alph(fa(b)). Let us consider an occurrence ofZ1 in �c(w). Letw1, w2 bewords in (Ca ∪ {fa(b), Z1, . . . , Zn})∗ such that�c(w) = w1Z1w2 = w1�1a�1b�2�2.Since�c(w) ∈ (Ca ∪ {uav})∗, the wordu is a suffix ofw1�1. But b occurs only infa(b),Z1, . . . , Zn−1 andZn. Moreoveraoccurs afterb inZ2, . . . , Zn. It follows thatw1 endswithfa(b) or withZ1, and sow1 ends withb�2�2. But then|b�2�2�1| = |b�2|. This contradicts�1 �= ε. So|u|b = 0.Similarly |v|b = 0 and sob ∈ Cc. Sinceb�2a is a factor of�c(w),u is a suffix of�2. There

exists a word�3 such that�2 = �3u. But nowb�3u�2 is a factor of�c(w) ∈ (Cc ∪{uav})∗.Since|�3�2|a = 0, if u �= ε we should have�2 = ε. This is not the case. Sou = ε andsimilarly v = ε. We have a contradiction.�

Step 2.2: If $n \ge 2$, for any $d \in C_a$, there exists a letter $c \in \mathrm{alph}(f_a(b))$ such that $c \in C_d$.

Proof. Let d ∈ Ca . Assume by contradiction that for all lettersc in alph(fa(b)), c /∈ Cd .By Corollary3.2, this means alph(fa(b)) ⊆ Md ∪Ed . By Eq. (2),w /∈ (Cd ∪{fd(ed), d})∗.Sinced /∈ alph(fa(b)), Eq. (3) implies alph(fa(b)) �= Md ∪ Ed .The wordsZ1 andZ2 are factors ofw and d /∈ alph(Z1Z2) = alph(fa(b)) ∪ {a}.

ConsequentlyZ1 andZ2 are factors of�d(w). Let recallZ1 = �1a�1 with �1 �= ε. By Step2.1 and Hypothesis5, a ∈ Cd . Since alph(�1) ⊆ Md ∪ Ed , fd(ed) has�1 as suffix. Sincealph(fa(b)) �= Md ∪ Ed , fd(ed) has a suffixxfa(b)i�1 for an integeri and a letterx in(Ed∪Md)\alph(fa(b)). In the sameway there exists a lettery in (Ed∪Md)\alph(fa(b))andan integerj such thatfd(ed) has a suffixyfa(b)j�2. This is not possible since|�1| �= |�2|.

Step 2.3: If $n \ge 2$ and $C_a \ne \emptyset$, then $w \in (C_a^+\{f_a(b), Z_1, \ldots, Z_n\})^* C_a^*$, or $w \in C_a^*(\{f_a(b), Z_1, \ldots, Z_n\}C_a^+)^*$.

Proof. Since alph(fa(b)) = Ma ∪Ea , we haveCa ∩ alph(fa(b)) = ∅. So for anyd ∈ Ca ,by Step 2.1 and Hypothesis5, a ∈ Cd . By Hypothesis2, there exist a letterd in Ca and alettery in alph(fa(b))∪{a} such thaty /∈ Cd . Sincea ∈ Cd ,y ∈ alph(fa(b)). In other terms,alph(fa(b)) is not a subset ofCd . In particular alph(fa(b)) ∩ (Md ∪ Ed) �= ∅. By Eq. (2),w /∈ (Cd ∪ {fd(ed), d})∗. Sinced /∈ alph(fa(b)), we deduce that alph(fd(ed)) = Md ∪Edis not a subset of alph(fa(b)), and so thatfd(ed) is not a factor offa(b). Sinced ∈ Ca , byStep 2.2, alph(fa(b)) ∩Cd �= ∅. Looking at (for instance)Z1, we see that the first letter orthe last letter offa(b) does not belong toCd .

If the first letter offa(b) does not belong toCd , letw1, w2 andw3 be the words suchthatw = w1w2w3, |w1|alph(fa(b)) = 0 andw2 ∈ {fa(b), Z1, . . . , Zn}. There exists a prefixy of w2 (and so a prefix offa(b) sincea ∈ Cd ) which is a suffix offd(ed). More preciselyfd(ed) = zy for a non-empty wordz such that alph(z) ∩ alph(fa(b)) = ∅ (z is a suffix ofw1): alph(z) ⊆ C+

a . We deduce that the occurrences offa(b) and of theZi ’s cannot bea prefix of�d(w) (and so ofw). By Eq. (1), �d(w) ∈ (Cd ∪ {fd(ed)})∗. So each of theoccurrences offa(b) and of theZi ’s must be preceded inw by d or by the last letter ofz.From Eq. (3), this proves the first possibility of Step 2.3.When the last letter offa(b) does not belong toCd , takingw1,w2 andw3 the words such

thatw = w1w2w3, w2 ∈ {fa(b), Z1, . . . , Zn} and |w3|alph(fa(b)) = 0, we can similarlyprove the second possibility of Step 2.3.�

Step 2.4: If $n \ge 2$ and $C_a \ne \emptyset$, for any letter $c$ such that $e_c = a$, $\gamma_i, \gamma'_i \in c^*$ for all $i$, $1 \le i \le n$.

Proof. Letcbe a letter such thatec = a and letu, v be thewords such thatfc(a) = uav.Wehaveu �= ε or v �= ε. We study Caseu �= ε (Casev �= ε is symmetric). By Eqs. (1) and (3),for i = 1, . . . , n, theword�c(Zi) = �c(�1�i )a�c(�

′i�n) is a factor of�c(w) ∈ (Cc∪{uav})∗.

So�c(�1�i ) is a suffix ofu, or,u is a suffix of�c(�1�i ).If �c(�1�n) is a suffix ofu different fromu, then for length reason, for eachi, 1� i�n,

�c(�1�i ) is also a suffix ofu different from u. From Step 2.3, we deduce for eachi,1� i�n, u ends withxi�c(�1�i ) wherexi is a letter inCa . Sincex1 andxn do not belongto alph(�1�n) ⊆ Ma ∪ Ea , �c(�1) = �c(�1�1) = �c(�1�n) which implies�c(�n) = ε, thatis, �n ∈ c+.Now let us consider the caseu is a suffix of�c(�1�n) (possiblyu = �c(�1�n)). We have

alph(�1�n) ⊆ Ma ∪Ea . So alph(u)∩Ca = ∅. So from Step 2.3,u is a suffix of�c(�1�i ) foreachi = 1, . . . , n. Letxbe theword such that�c(�1) = �c(�1�1) = xu. There exists awordy such that�c(�1�n) = xyu. The(|x| + 1)th letter of�c(�1) is the first letter ofu and thefirst letter ofyu. This letter belongs toMc. If y �= ε, since|�1�n|a = 0, by Step 2.3,v mustend withlxzwherel is a letter inCa andz is a prefix ofy. In particular|v|Ca �= 0. The word�c(Z1) (resp.,�c(Zn)) ends witha�c(�n�n) (resp.,a�c(�n)). Since�c(�n) and�c(�n�n)are prefixes ofv, Step 2.3 and|v|Ca �= 0 imply �c(�n) = �c(�n�n), that is,�c(�n) = ε. Acontradiction withy �= ε. Soy = ε and once again�n ∈ c+.For eachi, 1� i�n, the equality�n = �i�

′i implies�i , �

′i ∈ c∗. �

Step 2.5: If $n \ge 2$, $C_a = \emptyset$.

Proof. Assume by contradiction $n \ge 2$ and $C_a \ne \emptyset$.

First we can notice that, by Fact 6.3, there exists a letter $c$ such that $e_c = a$. As a consequence of Step 2.4, this letter $c$ is unique. By Hypothesis 2 (take $B = \mathrm{alph}(w) \setminus \{a, c\}$), there exists a letter $d \notin \{a, c\}$ such that $a \notin C_d$ or $c \notin C_d$. From uniqueness of $c$ and Hypothesis 5, $a \in C_d$ and so $c \notin C_d$.

Since $\gamma_n a = c^{|\gamma_n|} a$ is a factor of $\delta_d(w)$ (it is a factor of $\delta_d(Z_n)$), $\gamma_n$ is a suffix of $f_d(e_d)$. Since $\mathrm{Card}(\mathrm{alph}(f_d(e_d))) \ge 2$, there exist a letter $x \ne c$ and an integer $k \ge |\gamma_n|$ such that $f_d(e_d)$ ends with $xc^k$. By Step 2.3, $w$ has a prefix $Z_n$ or a factor in $C_a Z_n$. So $w$ has a prefix $\alpha_1\gamma_n a$ or a factor in $C_a\alpha_1\gamma_n a$. But still by Step 2.3, $w$ has a prefix $\alpha_1 a$ or a factor in $C_a\alpha_1 a$. Consequently $f_d(e_d)$ should end with $xc^{k-|\gamma_n|}$: this is not possible since $|\gamma_n| \ne 0$. □

Step 2.6: If $n \ge 2$, for any letter $c$ such that $e_c = a$, we have $C_c = \emptyset$.

Proof. Assume by contradiction that $c$ is a letter such that $e_c = a$ and $C_c \ne \emptyset$. From Step 2.5, we know that $\delta_a(w) = f_a(b)^{|w|_b}$. Consequently $\delta_a(\delta_c(w)) = \delta_c(f_a(b))^{|w|_b}$. Since $C_c \ne \emptyset$ and $\mathrm{alph}(w) = \mathrm{alph}(f_a(b)) \cup \{a\}$, we have $\mathrm{alph}(\delta_c(f_a(b))) \cap C_c \ne \emptyset$. More precisely, since $e_c = a$, $\mathrm{alph}(\delta_c(f_a(b))) = M_c \cup C_c$.

If $\delta_c(f_a(b))$ contains a factor $u_1 x u_2$ with $u_1, u_2 \in M_c^+$ and $x \in C_c^+$, then $\delta_c(w)$ must contain at least $|w|_b + 1$ occurrences of $f_c(a)$. This implies $|w|_a \ge |w|_b + 1$, a contradiction with $a \in \mathrm{minLetters}(w)$. So $\delta_c(f_a(b)) = xu$ or $\delta_c(f_a(b)) = ux$ with $x \in C_c^+$ and $u \in M_c^+$. Since the two cases are symmetric, we only consider the case $\delta_c(f_a(b)) = xu$.

We have $\delta_a(\delta_c(w)) = (xu)^{|w|_b}$ and, by Eq. (1), $\delta_c(w) \in (C_c \cup \{f_c(a)\})^*$. Since $|w|_b \ge |w|_a$ and since $x \in C_c^+$ (in particular $x \ne \varepsilon$) and $u \in M_c^+$, we must have $|w|_b = |w|_a$ and $\delta_c(w) = (x f_c(a))^{|w|_a}$. Since $|x f_c(a)|_a = 1$, defining a morphism $f'_c$ by $f'_c(a) = x f_c(a)$ and, for all $d \in C_c \cup M_c$, $f'_c(d) = \varepsilon$, we get a morphism that contradicts Hypothesis 3. So $C_c = \emptyset$. □

Step 2.7: If $n \ge 2$, $b$ is the only letter such that $e_b = a$.

Proof. By Fact 6.3, there exists at least one letter $c$ such that $e_c = a$. Assume there exist two different letters $c$ and $d$ with $e_c = a$ and $e_d = a$. By Step 2.6, $C_c = C_d = \emptyset$. Moreover by Step 2.5, $C_a = \emptyset$. By Proposition 5.1, $w$ belongs to FW. This contradicts Hypothesis 1.

So there exists exactly one letter $c$ such that $e_c = a$. Assume $c \ne b$. By Step 2.5, $C_a = \emptyset$. By Eq. (1), $\delta_a(w) = f_a(b)^{|w|_b}$. Moreover Eq. (3) can be written

\[ w \in (a^*\{f_a(b), Z_1, \ldots, Z_n\})^* a^*. \tag{4} \]

By Step 2.6, $C_c = \emptyset$. By Eq. (1), $\delta_c(w) = f_c(a)^{|w|_a}$. This implies $\pi_{ab}(w) = \pi_{ab}(\delta_c(w)) = \pi_{ab}(f_c(a))^{|w|_a} = (b^k a b^l)^{|w|_a}$ for two integers $k$ and $l$. Recall $|f_a(b)|_a = 0$, $|f_a(b)|_b = 1$, $|Z_i|_a = |Z_i|_b = 1$ ($i = 1, \ldots, n$). It follows

\[ w \in \{f_a(b)^k a f_a(b)^l, T_1, \ldots, T_n\}^{|w|_a}, \tag{5} \]

where, for $i = 1, \ldots, n$, $T_i = f_a(b)^{k-1} Z_i f_a(b)^l$ if $|\alpha_1\gamma_i|_b = 1$ and $T_i = f_a(b)^k Z_i f_a(b)^{l-1}$ if $|\gamma'_i\beta_n|_b = 1$ (recall $|f_a(b)|_b = 1 = |\alpha_1\gamma_i\gamma'_i\beta_n|_b$). Moreover for $i = 1, \ldots, n$, $\delta_c(T_i) = f_c(a)$ (note that $\delta_c(f_a(b)^k a f_a(b)^l) = f_c(a)$ if $f_a(b)^k a f_a(b)^l$ is a factor of $w$, which is not necessarily the case).

By Hypothesis 2, there exists a letter $d \notin \{a, c\}$ such that $a \notin C_d$ or $c \notin C_d$. By uniqueness of the letter $c$ (and by Hypothesis 5), $a \in C_d$. Thus $c \notin C_d$, that is, $c \in E_d \cup M_d$.

Now (in this step) we distinguish three complementary cases: $|\alpha_1|_b = 1$, $|\beta_n|_b = 1$, $|\gamma_n|_b = 1$. In each case, we get a contradiction. This will end this step, proving $c = b$.

Case $|\alpha_1|_b = 1$: In this case $T_1 = f_a(b)^{k-1} Z_1 f_a(b)^l$ and $T_n = f_a(b)^{k-1} Z_n f_a(b)^l$ are factors of $w$. Since $\delta_c(T_1) = \delta_c(T_n)$, we get $\delta_c(Z_1) = \delta_c(Z_n)$ which implies $\delta_c(\gamma_n) = \varepsilon$, that is, $\gamma_n \in c^+$. Since $\gamma_n a$ is a factor of $\delta_d(w)$, $f_d(e_d)$ ends with $\gamma_n$. More precisely there exist a letter $y \ne c$ and an integer $p$ such that $f_d(e_d)$ ends with $yc^p\gamma_n$. By Eq. (5), $yc^p$ is a suffix of $\delta_d(\alpha_1)$ if $\delta_d(\alpha_1) \notin c^*$ or a suffix of $\delta_d(\beta_n\alpha_1)$ otherwise. In both cases since $\alpha_1 a$ is a factor of $\delta_d(w)$ (and by Eq. (5)), $yc^p$ must be a suffix of $f_d(e_d)$, a contradiction with $\gamma_n \ne \varepsilon$.

Case $|\beta_n|_b = 1$: This case is symmetric to the previous one.

Case $|\gamma_n|_b = 1$: Here $T_1 = f_a(b)^k Z_1 f_a(b)^{l-1}$ and $T_n = f_a(b)^{k-1} Z_n f_a(b)^l$ are factors of $w$. From $\delta_c(T_1) = \delta_c(T_n)$, we deduce $\delta_c(\beta_n\alpha_1 a) = \delta_c(a\beta_n\alpha_1)$. But $|\beta_n\alpha_1|_a = 0$. So $\beta_n, \alpha_1 \in c^+$. Since $\alpha_1 a$ is a factor of $w$ (so of $\delta_d(w)$), $f_d(e_d)$ ends with the letter $c$. More precisely by Eq. (5), there exist a letter $y$ and an integer $p$ such that $f_d(e_d)$ ends with $yc^p\beta_n\alpha_1$. The word $yc^p$ is a suffix of $\delta_d(\gamma_n)$. But $\delta_d(\gamma_n)a$ is a factor of $\delta_d(w)$. This implies $f_d(e_d)$ ends with $yc^p$. We have a contradiction with $\alpha_1 \ne \varepsilon$. □

Proof of Step 2. FromSteps2.5–2.7,weknow�a(w) = fa(b)|w|b and�b(w) = fb(a)|w|a .So�b(fa(b))|w|b = �b(�a(w)) = �a(fb(a))|w|a . Let r be the primitive root of�b(fa(b)),

and letk, l be the integers such that�b(fa(b)) = rk and�a(fb(a)) = rl . There existtwo words r1, r2 such thatr = r1r2 and fb(a) ∈ r∗r1ar2r∗. From Eq. (3), �b(w) ∈{a, �b(fa(b)), �b(Z1), . . . , �b(Zn)}∗. It follows�b(Zi) ∈ r∗r1ar2r∗ (for eachi=1, . . . , n).In other terms (sinceZi = �1�ia�

′i�n), �b(�1�i ) ∈ r∗r1 and�b(�′

i�n) ∈ r2r∗.Assume first�b(�n) = ε. Since|�n|b� |fa(b)|b = 1, since�n �= ε, and since 0= |�1| <

|�2| < · · · < |�n|, we getn = 2 and�n = b. ThusZ1 = �1ab�n andZ2 = �1ba�n. ByHypothesis2, there exists a letterd /∈ {a, b} such thata /∈ Cd or b /∈ Cd . By unicity of bin Step 2.7 (and by Hypothesis5), a ∈ Cd andb /∈ Cd . Hereab andba are factors ofwand so of�d(w). Sofd(ed) starts and ends withb. The letterb belongs toMd (b �= ed ) and|fd(ed)|b�2. But|w|ed = |fa(b)|ed × |w|b� |w|b and|w|b = |fd(ed)|b × |w|ed �2|w|ed .This is impossible.So �b(�n) �= ε. From �b(�1) ∈ r∗r1, �b(�1�n) ∈ r∗r1, it follows �b(�n) ∈ (r2r1)

+.Moreover�b(�1) ∈ r∗r1 and�b(�n) ∈ r2r∗, or, (if r1 = ε or r2 = ε) �b(�1) ∈ r∗ and�b(�n) ∈ r∗. In all cases since|�1�n�n|b = 1, �1 �= ε and�n �= ε, we have�b(�1�n) �= ε.Consequently alph(�1�n)\{b} = alph(�n)\{b} = alph(r).As in case�b(�n) = ε, there existsd /∈ {a, b} such thata ∈ Cd andb /∈ Cd . Since

alph(w) = alph(r) ∪ {a, b}, d ∈ alph(r). Recall�a(w) = fa(b)|w|b . We get|w|ed =

|fa(b)|ed × |w|b� |w|b. From |w|b = |fd(ed)|b × |w|ed , |fd(ed)|b = 1. Let u andv bethe words such thatfd(ed) = ubv. We consider separately the three cases:|�1|b = 1(|�n�n|b = 0), |�n|b = 1 (|�1�n|b = 0), |�n|b = 1 (|�1�n|b = 0).Case|�1|b = 1: Let �1, �2 be the words such that�1 = �1b�2. By Eq. (1), �d(w) ∈

(Cd ∪ {fd(ed)})∗. Sincea ∈ Cd , andb�d(�2)a is a factor of�d(w), the wordv is a prefixof �d(�2). By Eq. (3), sinceCa = ∅, w starts with a factor ina∗�1b. Consequently, sincea ∈ Cd ,u is a suffix of�d(�1). Since alph(�b(�1�n)) = alph(�b(�n)), alph(uv) ⊆ alph(�n).The word�1�na is a factor ofw so of�d(w). But a ∈ Cd and�d(w) ∈ (Cd ∪ {ubv})∗, acontradiction with�1�na ends withubvtafor a non-empty wordt containing nobbut lettersin alph(uv).Case|�n|b = 1: Leads similarly to a contradiction.Case|�n|b = 1: Let �1, �2 be the words such that�n = �1b�2. Sincea�1b anb�2a are

factors ofw, u is a suffix of�d(�1) andv is a prefix of�d(�2). This leads once again to acontradiction with alph(�b(�1�n)) = alph(�b(�n)). �
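The proof of Step 2 relies on the primitive root $r$ of a word, that is, the shortest word of which the given word is a power. As a hedged illustration (not taken from the paper), the Python sketch below computes it by trying every divisor of the length; the function name `primitive_root` and the sample words are assumptions made for the example.

```python
def primitive_root(s):
    """Return the shortest word r such that s = r^k for some integer k >= 1.

    For the empty word, the empty word itself is returned.
    """
    n = len(s)
    for d in range(1, n + 1):
        # r must have a length dividing n and must rebuild s by repetition
        if n % d == 0 and s[:d] * (n // d) == s:
            return s[:d]
    return s


def exponent(s):
    """Return k such that s = primitive_root(s)^k (0 for the empty word)."""
    r = primitive_root(s)
    return len(s) // len(r) if r else 0


print(primitive_root("abab"), exponent("abab"))  # ab 2
print(primitive_root("aba"), exponent("aba"))    # aba 1
```

Two words that are powers of the same primitive root commute, which is the situation exploited when $r^k$ and $r^l$ are compared in the proof.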

Notation. From now on, we denote by $Z$, $\alpha$, $\beta$, respectively, the words $Z_1$, $\alpha_1$ and $\beta_1$. So $Z = \alpha a \beta$, $f_a(b) = \alpha\beta$ and Eq. (3) is $w \in (C_a \cup \{a, f_a(b), Z\})^*$. By definition of $Z$, $Z$ is a factor of $w$.

Step 3: There exists a letter $c \notin \{a, b\}$ such that $\alpha = b$ and $\beta \in c^+$, or, $\alpha \in c^+$ and $\beta = b$.

As for Step 2, the proof needs intermediate steps.

Step 3.1: The word $f_a(b)$ is a factor of $w$.

Proof. If $f_a(b)$ is not a factor of $w$, then $w \in (C_a \cup \{a, Z\})^*$. The number of occurrences of $a$ in $w$ is greater than or equal to the number of factors $Z$ in $w$, which is exactly $|w|_b$. From Hypothesis 4, $|w|_a = |w|_b$ and $w \in (C_a \cup \{Z\})^*$. Since $\mathrm{alph}(Z) \cap C_a = \emptyset$, Lemma 3.6 contradicts Hypothesis 1. □

Step 3.2: For all $c$ in $\mathrm{alph}(f_a(b))$, $a \in C_c$ or $\delta_c(\alpha) = \varepsilon$ or $\delta_c(\beta) = \varepsilon$.

Proof. Assume by contradiction there exists a letterc ∈ alph(fa(b)) such thata /∈ Cc,�c(�) �= ε and �c(�) �= ε. By Hypothesis5, ec = a. By Eq. (3), �c(w) ∈ (Ca ∪{a, �c(��), �c(�)a�c(�)})∗. Let u andv be the words such thatfc(a) = uav. We have�c(w) ∈ (Cc ∪ {uav})∗. Since�c(�)a�c(�) is a factor of�c(w), we can consider the fourfollowing complementary cases:

(1) uav is a factor of�c(�)a�c(�).(2) p�c(�)a�c(�) = uavq for wordsp �= ε, q �= ε.(3) �c(�)a�c(�)p = quav for wordsp �= ε, q �= ε.(4) �c(�)a�c(�) is a factor ofuav.

Case1: Let p andq be the words such that�c(�) = pu and�c(�) = vq. By Step 3.1,�c(w) = w1�c(��)w2 = w1puvqw2 for some wordsw1, w2 in (Ca ∪ {a, �c(��), �c(�)a�c(�)})∗. Sincepu �= ε and vq �= ε, this contradicts�c(w) ∈ (Cc ∪ {uav})∗ withCc ∩ alph(uav) = ∅.Case2: We have here�c(w) ∈ (Cc ∪ {uav})∗ = (Cc ∪ {p�c(�)av})∗ and�c(w) ∈

(Ca ∪ {a, �c(��), �c(�)a�c(�)})∗ = (Ca ∪ {a, �c(�)vq, �c(�)avq})∗. Since�c(�) �= ε and�c(�) ⊆ alph(u), we observe that�c(w) = w1p�c(�)avqw2 for awordw2 inC∗

c . Moreoversinceq �= ε, alph(q) ∩ alph(uv) = ∅. In particular alph(q) ∩ alph(p) = ∅. Consequently,w1 ∈ (Cc∪{p�c(�)av})∗ ∩ (Ca ∪{a, �c(�)vq, �c(�)avq})∗. By induction, we get�c(w) ∈(Cc ∪ {uavq})∗. But since alph(q) ∩ alph(uv) = ∅ and alph(�c(�)) ⊆ alph(u) (and soalph(�c(�)) ∩ Cc = ∅), this implies�c(��) is not a factor of�c(w), a contradiction withStep 3.1.Case3: Is symmetric to Case 2.Case4: We have here alph(�)\{c} ⊆ alph(u) and alph(�)\{c} ⊆ alph(v). In partic-

ular u �= ε andv �= ε. Since�c(w) ∈ (Cc ∪ {uav})∗ and�c(w) ∈ (Ca ∪ {a, �c(��),�c(�)a�c(�)})∗, there exist integersk, l and wordsu0, u1, . . . , uk, v0, . . . , vl such that

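\[ u = u_0\Big[\prod_{i=1}^{k} \delta_c(\alpha\beta)\,u_i\Big]\delta_c(\alpha) \quad\text{and}\quad v = \delta_c(\beta)\,v_0\prod_{i=1}^{l} \delta_c(\alpha\beta)\,v_i. \]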

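Since $\delta_c(\alpha) \ne \varepsilon$ and $\delta_c(\beta) \ne \varepsilon$, this implies in particular $w \in (C_a \cup \{\alpha\beta, \alpha a \beta\})^*$. More precisely, with

\[ U = u_0\Big[\prod_{i=1}^{k} \alpha\beta\,u_i\Big]\alpha \quad\text{and}\quad V = \beta\,v_0\prod_{i=1}^{l} \alpha\beta\,v_i \]

we have $w \in (C_c \cup \{UaV\})^*$. Since $C_c \cap \mathrm{alph}(UaV) = \emptyset$ and $|UaV|_a = 1$, Lemma 3.6 contradicts Hypothesis 1. □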

Step 3.3: There exists $c \in \mathrm{alph}(f_a(b))$ such that $\delta_c(\alpha) = \varepsilon$ or $\delta_c(\beta) = \varepsilon$. Moreover $f_a(b)$ is a factor of $w$.

Proof. Assume by contradiction that for allc ∈ alph(fa(b)), �c(�) �= ε and�c(�) �= ε. ByStep 3.2,a ∈ Cc. By Hypothesis2, since alph(w) = alph(fa(b)) ∪ Ca ∪ {a}, there existc ∈ alph(fa(b)), andd1 ∈ Ca ∪ {a} such thatd1 /∈ Cc. From what precedes,d1 �= a, thatis, d1 ∈ Ca (and soCa �= ∅).Assumefirst alph(�c(��))∩Cc �= ∅. Sinced1 ∈ Ca∩(Mc∪Ec),wehavefc(ec) = �1�2�3

with �1, �3 ∈ alph(��)∗ and�2 ∈ C+a .

When�1 = ε = �3, sincec ∈ alph(��), w ∈ (Cc ∪ {c, fc(ec)})∗, a contradiction withEq. (2).If �1 �= ε and�3 �= ε, the first letter of�c(�) (which is the first letter of�3) belongs to

Mc∪Ec. Recallw ∈ (Ca ∪{a, ��, �a�})∗ and�c(w) ∈ (Cc∪{fc(ec)})∗. Since alph(��)∩Cc �= ∅, looking at the first occurrence of�c(��) or of �c(�a�) in �c(w), we getfc(ec) ∈C+a alph(��)+, a contradiction withfc(ec) ∈ alph(��)+C+

a alph(��)+.So�1 = ε and�3 �= ε, or, �1 �= ε and�3 = ε. The two cases are symmetric. We treat

only one. Assume�1 = ε and�3 �= ε. Here�3 should be a prefix of�c(�) because offactor�c(�)a�c(�) in �c(w). Fromw ∈ (Cc ∪ {�2�3})∗ ∩ (Ca ∪ {a, ��, �a�})∗, we deducew ∈ (Cc\alph(�a�) ∪ {a, �2�a�, �2��})∗.LetK = Cc\alph(�a�). ObserveK = alph(w)\alph(�2��a). Letf ′

a be the idempotentmorphism defined byf ′

a(b) = �2��, f ′a(x) = ε for x /∈ K ∪ {b}, f ′

a(x) = x for x ∈ K.Since|�2|�1, |f ′

a(b)| > |fa(b)|. We have a contradiction with Hypothesis3.So alph(�c(��)) ∩Cc = ∅ that is, alph(�c(��)) ⊆ (Mc ∪Ec)∗. Recall that�c(�)a�c(�)

is a factor of�c(w). Consider its first occurrence:�c(w) = w1�c(�)a�c(�)w2. Sincea ∈Cc and �c(�) �= ε, fc(ec) is a prefix of�c(�)w2. Sinced1 ∈ alph(fc(ec)) and d1 /∈alph(�c(��)), there exists a letterxsuch thatx /∈ alph(�c(��)) andfc(ec) startswith a factorin�c(�)(�c(��))∗x. Since�c(�) �= ε,fc(ec) is alsoa factorofw1�c(�).Consequentlyfc(ec)starts with a factor in�c(��)+y with y /∈ alph(�c(��)). So�c(�(��)∗) ∩ �c(��)+ �= ∅, acontradiction with�c(�) �= ε and�c(�) �= ε. This ends the proof of Step 3.3.�

Proof of Step 3. From Step 3.3, there exists a letterc such that�c(�) = ε or �c(�) = ε.Assume�c(�) = ε (Case�c(�) = ε is similar), that is, alph(�) = {c}. We have�c(�) �= �since|��|b = 1, � �= ε, � �= ε. By Hypothesis2, there exist a letterx in alph(�)\{c}, and aletterd1 ∈ Ca ∪ {a} ∪ {c} such thatd1 /∈ Cx . Sincex �= c, �x(�) �= ε.We first consider the case�x(�) �= ε. By Step 3.2,a ∈ Cx . Sod1 ∈ Ca ∪{c}. As we have

seen in Step 3.3 that we cannot have�c(�) �= ε, �c(�) �= ε andd1 ∈ Ca , we can prove hered1 /∈ Ca , that is,d1 = c. Soc ∈ Mx ∪ Ex .

Note that alph(�x(�)) ∩ alph(fx(ex)) �= ∅ otherwise, sincex ∈ alph(�), we getw ∈({x, fx(ex)}∪Cx)∗, a contradictionwith Eq. (2). Theword�a�x(�) is a factor of�x(w). Letus consider its last occurrence. Letw1, w2 be the words such that�x(w) = w1�a�x(�)w2and�a�x(�) is not a factor ofw2.AssumeCa ∩ alph(fx(ex)) �= ∅. Sincea ∈ Cx and c ∈ alph(fx(ex)), fx(ex) is a

suffix ofw1� and sofx(ex) ends with a factor inC+a (��x(�))∗�. Moreover alph(�x(�)) ∩

alph(fx(ex)) �= ∅. Thusfx(ex) is a factor of�x(�)w2. It follows thatfx(ex) ends with afactor inC+

a (��x(�))+. This is impossible since�x(�) �= ε and� �= ε.SoCa ∩ alph(fx(ex)) = ∅, that is, alph(fx(ex)) ⊆ alph(��x(�)). In particular Card

(alph(��x(�)))�2. The word�x(�) starts withcky for an integerk and a lettery �= c.Recallc ∈ Mx ∪ Ex = alph(fx(ex)). The first occurrence ofc in w corresponds to thefirst occurrence of�a�x(�) or of ��x(�). Sincea ∈ Cx , it corresponds to an occurrenceof ��x(�) and consequentlyfx(ex) starts with�cky. Sinceacky is a factor of�x(w) anda ∈ Cx , fx(ex) starts withcky, a contradiction with� �= ε.So�x(�) = ε. It follows alph(�) ∈ c+ and alph(�) ∈ x+. From|��|b = 1, we get� = b

and� ∈ x+, or,� ∈ c+ and� = b. �

Notation. From now on, we assume $\alpha = b$ and $\beta = c^m$ for an integer $m \ge 1$ and a letter $c \notin \{a, b\}$ (the case $\alpha \in c^+$ and $\beta = b$ is symmetric; it is left to the reader). Eq. (3) can be reformulated $w \in (C_a \cup \{a, bc^m, bac^m\})^*$.
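To make this normal form concrete, the following Python sketch builds a toy word of the shape $(C_a \cup \{a, bc^m, bac^m\})^*$ and checks that erasing $a$ yields a fixed point of a morphism sending $b$ to $bc^m$. It is an illustration under assumptions: the value $m = 2$, the single constant letter `d`, the dictionary `f_a` and the helper names are hypothetical and are not taken from the paper.

```python
def erase(w, letters):
    """Erase every occurrence of the given letters from w."""
    return "".join(ch for ch in w if ch not in letters)


def apply(f, w):
    """Apply the morphism f (a dict letter -> word) to the word w."""
    return "".join(f[ch] for ch in w)


def is_idempotent(f):
    """Check f(f(x)) == f(x) for every letter x in the domain of f."""
    return all(apply(f, apply(f, x)) == apply(f, x) for x in f)


m = 2
# Assumed morphism on alph(w) \ {a}: b is expansive, c is erased, d is fixed.
f_a = {"b": "b" + "c" * m, "c": "", "d": "d"}

w = "d" + "bacc" + "bcc" + "a"   # a word of (C_a ∪ {a, bc^m, bac^m})^*
erased = erase(w, {"a"})

print(erased)                      # dbccbcc
print(apply(f_a, erased) == erased)  # True: a fixed point of f_a
print(is_idempotent(f_a))            # True
```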

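Step 4: There exist a word $\lambda_1$ and a set $K_1 \subseteq \mathrm{alph}(w)$ such that

\[ w \in (K_1 \cup \{\lambda_1 a, bc^m, \lambda_1 bac^m\})^* \]

with $K_1 \cap \mathrm{alph}(\lambda_1 abc^m) = \emptyset$ and $\mathrm{alph}(\lambda_1) \cap \{a, b, c\} = \emptyset$. Moreover $bc^m$ and $\lambda_1 bac^m$ are factors of $w$.

Once again let us state preliminary steps.

Step 4.1: $a \notin C_b$ and $a \notin C_c$.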

Proof. We prove $a \notin C_b$. (In the same way we can state $a \notin C_c$.)

Assume by contradiction $a \in C_b$. If moreover $c \in C_b$, then from $w \in (C_a \cup \{a, bc^m, bac^m\})^*$, we deduce (recall $b \notin C_a$) $w \in (\{b, f_b(e_b)\} \cup C_b)^*$, a contradiction with Eq. (2). So $c \in M_b \cup E_b$. Since $ac^m$ is a factor of $\delta_b(w)$, $f_b(e_b)$ starts with $c^m$. Let $u$ be the word such that $f_b(e_b) = c^m u$. If $|u|_c = 0$ then we get once again $w \in (\{b, f_b(e_b)\} \cup C_b)^*$, which leads to a contradiction. So $|u|_c \ne 0$. There exist integers $k$, $l$ and words $u_0, \ldots, u_k$ such that $k \ge 1$, $0 \le l \le m$ and

\[ u = u_0\Big[\prod_{i=1}^{k} c^m u_i\Big]c^l. \]

From $\delta_b(w) \in (C_a \cup \{a, c^m, ac^m\})^* \cap (C_b \cup \{c^m u\})^*$ and $\mathrm{alph}(u) \ne \{c\}$, we deduce $l \in \{0, m\}$: so we can assume $l = 0$ (taking $k + 1$ instead of $k$ and $u_{k+1} = \varepsilon$). Now let

\[ U = c^m u_0\prod_{i=1}^{k} bc^m u_i. \]

We get $w \in (C_b \cup \{bU, baU\})^*$. Consequently $\delta_a(w) \in (C_b \setminus \{a\} \cup \{bU\})^*$. Since $|U|_{e_b} = 1$ and $\mathrm{alph}(U) \cap C_b = \emptyset$, we can define an idempotent morphism $f'_a$ by $f'_a(e_b) = bU$, $f'_a(x) = \varepsilon$ for $x \in \mathrm{alph}(bU) \setminus \{e_b\}$ and $f'_a(x) = x$ for $x \in C_b \setminus \{a\}$. Observe $\mathrm{Card}(\mathrm{alph}(U)) \ge 2$ and so $U \ne c^m$. The fact that $|f'_a(e_b)| > |f_a(b)|$ contradicts Hypothesis 1. □

Step 4.2: $b \in C_c$ and $c \in C_b$.

Proof. Once again we only prove the case $c \in C_b$. The similar case $b \in C_c$ is left to the reader.

We assume by contradiction $c \in M_b \cup E_b$. By Step 4.1 and Hypothesis 5, $e_b = a$ and so $c \in M_b$. Let $u$ and $v$ be the words such that $f_b(a) = uav$. From Eq. (3), $\delta_b(w) \in (C_a \cup \{a, c^m, ac^m\})^*$. Moreover by definition of $Z$, $ac^m$ is a factor of $\delta_b(w)$.

If $v \ne \varepsilon$, $v$ starts with $c^m$, and, more precisely there exist integers $k$, $l$ and words $u_0, u_1, \ldots, u_k$, $v_0, \ldots, v_l$ such that

\[ u = u_0\prod_{i=1}^{k} c^m u_i \quad\text{and}\quad v = c^m v_0\prod_{i=1}^{l} c^m v_i. \]

Let

\[ U = u_0\prod_{i=1}^{k} bc^m u_i \quad\text{and}\quad V = c^m v_0\prod_{i=1}^{l} bc^m v_i. \]

We have $w \in (C_b \cup \{UabV, UbaV\})^*$.

By construction $C_b \cap \mathrm{alph}(UabV) = \emptyset$. If $w \in (C_b \cup \{UabV\})^*$ or if $w \in (C_b \cup \{UbaV\})^*$, by Lemma 3.6, we get a contradiction with Hypothesis 1. So $UabV$ and $UbaV$ are factors of $w$.

By Step 4.1, $a \notin C_c$. By Hypothesis 5, $a = e_c$. Since $ab$ and $ba$ are factors of $\delta_c(w)$, $b \in M_c$. Let $p$ and $s$ be the words such that $f_c(a) = pas$. We have $\pi_{ab}(w) = (b^{|p|_b} a b^{|s|_b})^{|w|_a}$ and $\pi_{ab}(w) \in \{b^{|U|_b} a b^{|V|_b + 1}, b^{|U|_b + 1} a b^{|V|_b}\}^*$. This is not possible since both $UabV$ and $UbaV$ occur in $w$.

So $v = \varepsilon$. Since $ac^m$ is a factor of $\delta_b(w)$, there exist an integer $k$ and words $u_0, \ldots, u_k$ such that

\[ u = c^m u_0\prod_{i=1}^{k} c^m u_i. \]

Let

\[ U = c^m u_0\prod_{i=1}^{k} bc^m u_i. \]

We get

\[ w \in (C_b \cup b(Uba)^*Ua)^*. \]

Moreover $baUa$ is a factor of $w$. By Step 4.1, $a \notin C_c$, $e_c = a$. Let $u'$, $v'$ be the words such that $f_c(a) = u'av'$. If $v' \ne \varepsilon$, looking at the last occurrence of $a$ in $w$, we see that $v' \in C_b^+$. But $aUa$ is a factor of $w$. So $v'$ is a prefix of $\delta_c(U)$: a contradiction with $\mathrm{alph}(U) \cap C_b = \emptyset$.

So $v' = \varepsilon$. Since $baUa$ is a factor of $w$, $UbaUa$ is a factor of $w$. This leads to $U \in b^+$, $u' = U$ and $Ub$ is a suffix of $u'$: this is impossible.

So $c \in C_b$. □

Proof of Step 4. BySteps 4.1 (andHypothesis5) and 4.2,eb = a andc ∈ Cb. Sinceacm isa factor ofw (and so of�b(w)) fb(a) endswitha. Let�1 be theword such thatfb(a) = �1a:�1 �= ε and alph(�1) ∩ {a, b, c} = ∅. From Eq. (3) and�b(w) ∈ (Cb ∪ {�1a})∗, we getStep 4 withK1 = Cb\{c} = Ca\alph(�1).By definitionZ = bacm is a factor ofw. Consequently�1bac

m is a factor ofw. ByStep 3.1,fa(b) = bcm is a factor ofw. �

Step 5: For $x \in \mathrm{alph}(\lambda_1)$, $a \in C_x$.

Proof. Assume by contradiction $a \notin C_x$. By Hypothesis 5, $e_x = a$. Since $bac^m$ is a factor of $\delta_x(w)$ (note $x \notin \{a, b, c\}$), $b \in M_x$ or $c \in M_x$. Since $bc^m$ is also a factor of $\delta_x(w)$, $b \in M_x$ and $c \in M_x$. Let $u$ and $v$ be the words such that $f_x(a) = uav$. If $u = \varepsilon$ then $f_x(a)$ starts with $ac$. But $\pi_{bc}(w)$ starts with $b$ by Step 4. We have a contradiction with Eq. (1) which implies $\pi_{bc}(w) \in \{\pi_{bc}(f_x(a))\}^*$.

So $u \ne \varepsilon$ and similarly $v \ne \varepsilon$. Since $bac^m$ is a factor of $\delta_x(w)$, $v$ starts with $c^m$ and $u$ ends with $b$ (this implies that $\lambda_1 a$ is not a factor of $w$). More precisely two cases are possible.

Case 1: There exist integers $k$, $l$ and words $u_0, \ldots, u_k$, $v_0, \ldots, v_l$ such that

\[ u = u_0\Big[\prod_{i=1}^{k} bc^m u_i\Big]\delta_x(\lambda_1)\,b \quad\text{and}\quad v = c^m v_0\prod_{i=1}^{l} bc^m v_i. \]

Case 2: There exist an integer $l$ and words $\lambda'_1, v_0, \ldots, v_l$ such that $\lambda'_1$ is a suffix of $\delta_x(\lambda_1)$,

\[ u = \lambda'_1 b \quad\text{and}\quad v = c^m v_0\prod_{i=1}^{l} bc^m v_i. \]

In the first case, taking

\[ U = u_0\Big[\prod_{i=1}^{k} bc^m u_i\Big]\lambda_1 b, \]

we get $w \in (C_x \cup \{Uav\})^*$. In the second case, $w \in (C_x \setminus \mathrm{alph}(\lambda_1) \cup \{\lambda_1 bav\})^*$. In both cases, we can construct a morphism that contradicts Hypothesis 1. So $a \in C_x$. □

Step 6: For $x \notin \{a, b, c\}$ with $a \in C_x$, we have $b \in C_x$ or $c \in C_x$.

Proof. Assume by contradiction $b \in M_x \cup E_x$ and $c \in M_x \cup E_x$. By Step 4, there exist words $w_1$, $w_2$ such that $w = w_1\lambda_1 bac^m w_2$ and $ba$ is not a factor of $w_1\lambda_1$ (we consider the first occurrence of $bac^m$ in $w$). Since $c \in M_x \cup E_x$ and $a \in C_x$ (by Step 5), $c^m\delta_x(w_2)$ must start with $f_x(e_x)$. So $f_x(e_x)$ starts with $c$. But $b \in M_x \cup E_x$ and by Step 4, $\pi_{bc}(w)$ starts with $b$. We get a contradiction with Eq. (1) which implies $\pi_{bc}(w) \in \{\pi_{bc}(f_x(e_x))\}^*$. □

Step 7: $\mathrm{Card}(\mathrm{alph}(\lambda_1)) = 1$.

Proof. Assume by contradiction Card(alph(�1))�2. By Hypothesis2 there existx ∈alph(�1) and y /∈ alph(�1) such thaty /∈ Cx . If for eachz ∈ alph(�1)\{x}, z ∈ Cx ,then by Step 4w ∈ ({x, fx(ex)} ∪ Cx)∗, a contradiction with Eq. (2). Thus there existsz ∈ alph(�1)\{x} such thatz ∈ Mx ∪ Ex .By Step 5,a ∈ Cx .Assumec ∈ Cx . Let us look at�{a,b,c,z}(w). By Step 4,�1bacm is a factor ofw. Sozbais

a factor of�{a,b,c,z}(w). Butbcm is also a factor ofw. Still by Step 4, such a factor must bepreceded inw by a letter inK1∪{a, c}. This impliesb ∈ Cx . It follows from what precedesand Step 4 that there exist a non-empty prefix�3 of �x(�1) (z ∈ alph(�3)) and a non-emptyword �2 (y ∈ alph(�2)) such thatfx(ex) = �2�3. Consequently withK2 = K1\alph(�2),from Step 4, we getw ∈ (K2∪ {�2�1a, bcm, �2�1bacm})∗. Let us consider the idempotentmorphismf ′

b defined on alph(w)\{b} by f ′b(a) = �2�1a, f

′b(t) = t for t ∈ K2 ∪ {c} and

f ′b(t) = ε for t ∈ alph(�2�1) = alph(w)\(K2 ∪ {a, b, c}). We have|f ′

b(a)| > |fb(a)|, acontradiction with Hypothesis3 sinceeb = a.Soc /∈ Cx . From Step 6,b ∈ Cx . Sinceacm is a factor of�x(w) anda ∈ Cx , the word

cm is a prefix offx(ex). From existence ofy andz, we deduce thatfx(ex) = cm�4�3 with�3 a non-empty prefix of�x(�1) and�4 a word. By Step 4,

w ∈ (bcm�4�1(bacm�4�1)

∗a ∪K1\alph(�4))∗.Let f ′

a be the morphism defined byf′a(b) = bcm�4�1, f

′a(t) = ε for t ∈ alph(c�4�1) and

f ′a(t) = t for t ∈ alph(w)\alph(abc�4�1). This idempotent morphism verifies|f ′

a(b)| >|fa(b)| = |bcm|, a contradiction with Hypothesis3. �

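Step 8: There exist a word $\lambda_2$ and a set $K_2 \subseteq \mathrm{alph}(w)$ such that

\[ w \in (K_2 \cup \{\lambda_1 a, \lambda_2 bc^m, \lambda_2\lambda_1 bac^m\})^* \]

with $K_2 \cap \mathrm{alph}(\lambda_2\lambda_1 abc) = \emptyset$ and $\mathrm{alph}(\lambda_2) \cap \mathrm{alph}(\lambda_1 abc) = \emptyset$.

Proof. Recall that Step 4 stated $w \in (K_1 \cup \{\lambda_1 a, bc^m, \lambda_1 bac^m\})^*$. By Step 7, there exists a letter $d$ such that $\mathrm{alph}(\lambda_1) = \{d\}$. By Step 5, $a \in C_d$. If $b \in C_d$ then $w \in (\{\lambda_1, f_d(e_d)\} \cup C_d)^*$, a contradiction with Eq. (2). So $b \notin C_d$ and, since $ba$ is a factor of $\delta_d(w)$ by Step 4, $f_d(e_d)$ ends with $b$. Let $\lambda_2$ be the word such that $f_d(e_d) = \lambda_2 b$. By Step 6, $c \in C_d$. By Step 4, since $\lambda_1 \in d^*$, we get

\[ \delta_d(w) \in (K_1 \setminus \mathrm{alph}(\lambda_2) \cup \{a, \lambda_2 bc^m, \lambda_2 bac^m\})^* \]

and so Step 8 is proved with $K_2 = K_1 \setminus \mathrm{alph}(\lambda_2)$. □

Step 9: For $x \in \mathrm{alph}(\lambda_2)$, $a \in C_x$.

The proof of this step, similar to that of Step 5, is left to the reader.

Step 10: $\mathrm{Card}(\mathrm{alph}(\lambda_2)) = 1$.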

Proof. Similarly to Step 7, assuming Card(alph(�2))�2, we can prove• there existx ∈ alph(�2) andy /∈ alph(�2) such thaty /∈ Cx .• there existsz ∈ (alph(�1)\{x}) ∩ (Ex ∪Mx).


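• $a \in C_x$ (by Step 9).
• If $c \in C_x$ then $b \in C_x$. Moreover, $f_x(e_x) = \alpha_3\alpha_4$ with $\alpha_4$ a non-empty suffix of $\delta_x(\alpha_2)$ and $\alpha_3$ a non-empty word. This leads, with $K_3 = K_2 \setminus \mathrm{alph}(\alpha_2)$, to

$$w \in (K_3 \cup \{\alpha_1 a, \alpha_3\alpha_2 bc^m, \alpha_3\alpha_2\alpha_1 bac^m\})^*.$$

If $d$ is the letter such that $\mathrm{alph}(\alpha_1) = \{d\}$, we can observe that $\alpha_2$ was constructed in Step 8 as the word such that $f_d(e_d) = \alpha_2 b$. We can construct a witness $f'_d$ of $\delta_d(w)$ such that $f'_d(e_d) = \alpha_3\alpha_2 b$, a contradiction with Hypothesis 3.
• So $c \notin C_x$. Because of the factor $ac^m$ in $\delta_x(w)$, $f_x(e_x)$ starts with $c^m$. But $\pi_{\{c,z\}}(w)$ starts with $z$, a contradiction with $z \in \mathrm{alph}(f_x(e_x))$ and $w \in \{f_x(e_x)\}^*$. □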

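Final contradiction. We have obtained for $l = 1$ the existence of a set $K_{2l} \subseteq \mathrm{alph}(w)$ and the existence of $2l$ words $\alpha_1, \ldots, \alpha_{2l}$ such that
• $w \in (K_{2l} \cup \{\alpha_{2l-1} \cdots \alpha_3\alpha_1 a,\ \alpha_{2l}\alpha_{2l-2} \cdots \alpha_2 bc^m,\ \alpha_{2l}\alpha_{2l-1} \cdots \alpha_2\alpha_1 bac^m\})^*$,
• $K_{2l} \cap \mathrm{alph}(\alpha_{2l}\alpha_{2l-1} \cdots \alpha_2\alpha_1 bac) = \emptyset$,
• $\mathrm{Card}(\mathrm{alph}(\alpha_i)) = 1$ ($1 \le i \le 2l$),
• $\mathrm{alph}(\alpha_i) \cap \{a, b, c\} = \emptyset$ ($1 \le i \le 2l$),
• $\mathrm{alph}(\alpha_i) \cap \mathrm{alph}(\alpha_j) = \emptyset$ ($1 \le i < j \le 2l$),
• $\alpha_{2l}\alpha_{2l-2} \cdots \alpha_2 bc^m$ and $\alpha_{2l}\alpha_{2l-1} \cdots \alpha_2\alpha_1 bac^m$ are factors of $w$,
• for $x \in \mathrm{alph}(\alpha_i)$ ($1 \le i \le 2l$), $a \in C_x$.

Assume this situation is true for an integer $l \ge 1$. As done during Steps 4–9, we can find two words $\alpha_{2l+1}$, $\alpha_{2l+2}$ such that the situation above is true for $l + 1$ (with $K_{2l+2} = K_{2l} \setminus \mathrm{alph}(\alpha_{2l+1}\alpha_{2l+2})$). Since the sets $\mathrm{alph}(\alpha_i)$ are non-empty and pairwise disjoint, this implies that $w$ has an infinite number of letters: a final contradiction. This ends the proof of Proposition 6.2.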

7. Conclusion

We have partially solved Conjecture 4.1 (a question is: how to simplify the proof?). Actually we proved more. In Proposition 5.1, Corollary 5.3 and Theorem 6.1, we always prove that $w$ has a witness with only one expansive letter. We wonder whether this is more generally true. More precisely, for $w$ a word, let $\mathrm{minCardExp}(w)$ be the value $\min\{\mathrm{Card}(E_f) \mid f \text{ witness of } w\}$. When Conjecture 4.1 holds, do we have

$$\mathrm{minCardExp}(w) \le \max\{\mathrm{minCardExp}(\delta_x(w)) \mid x \in \mathrm{alph}(w)\}?$$

There exist examples where the previous inequality becomes an equality. This is the case in our results. This can also happen in other cases. For instance, for $w = dababecbcbecbcbdabab$, we have $\mathrm{Card}(E_f) = 2$ for each witness $f$ of $w$ and for each witness $f$ of $\delta_x(w)$, whatever $x$ in $\mathrm{alph}(w)$. We do not know any example of equality for a word $w$ over a 4-letter alphabet with $\mathrm{minCardExp}(w) = 2$.

In most cases, the previous inequality is certainly strict. For instance, with $w = bdabdcbdcbda$, $\mathrm{minCardExp}(w) = 1$ but $\max\{\mathrm{minCardExp}(\delta_x(w)) \mid x \in \mathrm{alph}(w)\} = 2$.

To find other examples, the reader can focus on primitive words. Indeed a consequence of Theorem 3.1 is: for $w$ a word and $n \ge 1$ an integer, $f(w) = w$ if and only if $f(w^n) = w^n$.
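For small words, the quantities discussed above can be explored by exhaustive search. The sketch below rests on two assumptions that should be checked against the definitions given earlier in the paper: a witness is taken to be any morphism on $\mathrm{alph}(w)$, other than the identity, that fixes $w$, and a letter $x$ is counted as expansive when $|f(x)| \ge 2$. The function names are hypothetical, and `erase` stands for the erasing of all occurrences of a letter.

```python
from typing import Dict, Iterator, Optional


def erase(w: str, x: str) -> str:
    """Erase every occurrence of the letter x in w."""
    return w.replace(x, "")


def apply(f: Dict[str, str], w: str) -> str:
    """Apply the morphism f (given by the images of the letters) to w."""
    return "".join(f[x] for x in w)


def fixing_morphisms(w: str) -> Iterator[Dict[str, str]]:
    """Yield every morphism f on alph(w), other than the identity, with f(w) = w."""
    n = len(w)
    f: Dict[str, str] = {}

    def go(i: int, pos: int) -> Iterator[Dict[str, str]]:
        # Invariant: f(w[:i]) equals w[:pos].
        if i == n:
            if pos == n and any(f[x] != x for x in f):
                yield dict(f)
            return
        x = w[i]
        if x in f:
            if w.startswith(f[x], pos):
                yield from go(i + 1, pos + len(f[x]))
            return
        # First occurrence of x: its image must be a factor of w starting at pos.
        for length in range(n - pos + 1):
            f[x] = w[pos:pos + length]
            yield from go(i + 1, pos + length)
            del f[x]

    yield from go(0, 0)


def min_card_exp(w: str) -> Optional[int]:
    """Least number of letters with |f(x)| >= 2 over all f found, or None."""
    counts = [sum(len(u) >= 2 for u in f.values()) for f in fixing_morphisms(w)]
    return min(counts) if counts else None


w = "bdabdcbdcbda"
print(min_card_exp(w))
print({x: min_card_exp(erase(w, x)) for x in sorted(set(w))})
```

Under these assumptions, the word $bdabdcbdcbda$ is fixed by the morphism $b \mapsto bd$, $d \mapsto \varepsilon$, $a \mapsto a$, $c \mapsto c$, which has a single expansive letter, while the word obtained by erasing $d$, namely $babcbcba$, needs two. The search is exponential in the number of letters, so it is only meant for short words such as the examples of this section.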


The previous examples are not covered by our results. Indeed there exists $x$ in $\mathrm{alph}(w)$ such that $\mathrm{minCardExp}(E_{f_x}) = 2$. This shows that Conjecture 4.1 is still open on 4-letter alphabets.

Another question is: how is the conjecture stated for (bi)infinite words?

Acknowledgements

The authors would like to thank P. Séébold for useful discussions and for having pointed out Refs. [13,14]. Thanks also to M. Billaud for communications of answers [11,22] to his problem.

References

[1] J. Berstel, J. Karhumäki, Combinatorics on words – a tutorial, Bull. Europ. Assoc. Theoret. Comput. Sci. EATCS 78 (2003) 178–228.
[2] M. Billaud, A problem with words, Letter in Newsgroup Comp.theory, 1993.
[3] A. Carpi, A. De Luca, Words and special factors, Theoret. Comput. Sci. 259 (2001) 145–182.
[4] A. Carpi, A. De Luca, On the distribution of characteristic parameters of words I, RAIRO Theoret. Inform. Appl. 36 (2002) 67–96.
[5] A. Carpi, A. De Luca, On the distribution of characteristic parameters of words II, RAIRO Theoret. Inform. Appl. 36 (2002) 97–127.
[6] C. Choffrut, J. Karhumäki, Handbook of Formal Languages, Vol. 1: Combinatorics of Words, Springer, Berlin, 1997, pp. 329–438 (Chapter 6).
[7] R. Cori, D. Perrin, Automates et commutations partielles, RAIRO Theoret. Inform. Appl. 19 (1) (1985) 21–32.
[8] V. Diekert, Y. Métivier, Handbook of Formal Languages, Vol. 3, Partial Commutation and Traces, Springer, Berlin, 1997, pp. 457–533 (Chapter 8).
[9] V. Diekert, G. Rozenberg (Eds.), The Book of Traces, World Scientific, Singapore, 1995.
[10] C. Duboc, On some equations in free partially commutative monoids, Theoret. Comput. Sci. 46 (1986) 159–174.
[11] A. Geser, Your “problem with words”, Private communication to M. Billaud, 1993.
[12] D. Hamm, J. Shallit, Characterization of finite and one-sided infinite fixed points of morphisms on free monoids, Technical Report CS-99-17, July 1999, see http://www.math.uwaterloo.ca/~shallit/papers.html.
[13] T. Head, Fixed languages and the adult languages of 0L schemes, Internat. J. Comput. Math. 10 (1981) 103–107.
[14] T. Head, B. Lando, Fixed and stationary ω-words and ω-languages, in: G. Rozenberg, A. Salomaa (Eds.), The Book of L, Springer, Berlin, 1986, pp. 147–156.
[15] G.T. Herman, A. Walker, Context free languages in biological systems, Internat. J. Comput. Math. Section A 4 (1975) 369–391.
[16] F. Levé, G. Richomme, On a conjecture about finite fixed points of morphisms (extended abstract), actes de WORDS’03 (4th Internat. Conf. on Combinatorics on Words), Turku (Finland), TUCS (Turku Centre for Computer Science) General Publication, Vol. 27, 2003, pp. 198–206.
[17] F. Levé, P. Séébold, Proof of a conjecture on word complexity, Bull. Belgian Math. Soc. 8 (2001) 277–291.
[18] M. Lothaire, Combinatorics on words, Encyclopedia of Mathematics, Vol. 17, Addison-Wesley, Reading, MA, 1983. Reprinted from Cambridge University Press, Cambridge Mathematical Library, Cambridge, UK, 1997.
[19] M. Lothaire, Algebraic Combinatorics on Words, Encyclopedia of Mathematics, Vol. 90, Cambridge University Press, Cambridge, UK, 2002.
[20] J. Shallit, L.-W. Wang, On two-sided infinite fixed points of morphisms, Theoret. Comput. Sci. 270 (2002) 659–675.
[21] J. van Leeuwen, On the fixpoints of monogenic functions in free monoids, Semigroup Forum 10 (1975) 315–328.
[22] P. Zimmermann, Re: a problem with words, Private communication to M. Billaud, 1993.