Abelian-Primitive Partial Words*

F. Blanchet-Sadri¹ Nathan Fox²

October 31, 2012

* This material is based upon work supported by the National Science Foundation under Grant No. DMS–1060775. We thank the referees of a preliminary version of this paper for their very valuable comments and suggestions.

¹ Department of Computer Science, University of North Carolina, P.O. Box 26170, Greensboro, NC 27402–6170, USA, [email protected]

² Department of Mathematics, University of Minnesota-Twin Cities, 127 Vincent Hall, 206 Church St. SE, Minneapolis, MN 55455, USA

Abstract

In this paper we count the number of abelian-primitive partial words of a given length over a given alphabet size, which are partial words that are not abelian powers. Partial words are sequences that may have undefined positions called holes. This combinatorial problem was considered recently for full words (those without holes). It turns out that, even for the full word case, it is a nontrivial problem as opposed to the counting of the number of primitive full words, well-known to be easily derived using the Möbius function.

Keywords: Algorithms on words; Combinatorics on words; Partial words; Abelian-primitive partial words.

1 Introduction

A nonprimitive word is a power of a shorter word, that is, it has the form $u^m$ where $m > 1$. A primitive word is any word that is not nonprimitive. For example, abbabb is nonprimitive with primitive root abb, while abbbab is primitive. Primitive words have been much investigated due to their fundamental importance in algorithms and combinatorics on words: deciding if a word is primitive can be done in linear time in the length of the word [10]; a nonprimitive word can have only one primitive root; the number of primitive words of a fixed length over an alphabet of a fixed size is related to the Möbius function [16]; a long-standing open problem is whether or not the set of primitive words is a context-free language [13, 14].

Now, an abelian-nonprimitive word is an abelian power of a shorter word. Equivalently, an abelian-nonprimitive word has the form $u_1 u_2 \cdots u_m$, where $m > 1$ and each $u_i$ is a permutation of $u_1$ (here, $p = |u_1|$ is an exact abelian period). An abelian-primitive word is any word that is not abelian-nonprimitive. Returning to the example above, abbbab is abelian-nonprimitive since bab is a permutation of abb. In [12], Domaratzki and Rampersad showed that the set of abelian-primitive words is not a context-free language, showed that whether a word is in this set can be decided in linear time, found that an abelian-nonprimitive word can have more than one abelian root, and finally considered the enumeration of abelian-primitive words. They left as an open problem the counting of the number of abelian-primitive words of a given length over a given alphabet size. Related works include [9, 11, 15, 17]; in particular, Richmond and Shallit counted the number of abelian squares and gave an asymptotic estimate [17].


On the other hand, primitive partial words were introduced in [2] (see also [4]). Partial words are sequences that may have undefined positions called holes (a (full) word is simply a partial word without holes). Combinatorics on partial words was initiated in [1] and has been developing since (see, for instance, [3]). The counting of primitive partial words seems to be a very challenging combinatorial problem [5]. In the context of partial words, recent works related to abelian repetitions include [6, 7, 8].

In this paper, we settle Domaratzki and Rampersad’s open problem by giving a closed formula for the number of abelian-nonprimitive words of length $n$ over a $k$-letter alphabet (Theorem 2). We then extend our results to abelian-primitive partial words, which are partial words whose completions are abelian-primitive.

The contents of our paper are as follows: In Section 2, we count, given a set of positive integers $\{p_i\}$ and a positive integer $n$, the number of full words of length $n \operatorname{lcm}\{p_i\}$ over an alphabet of size $k$ with exact abelian periods $\{p_i\}$ (see the formula in Theorem 1; here, $\operatorname{lcm}\{p_i\}$ is the least common multiple of the $p_i$’s). Applying the principle of Inclusion-Exclusion on the prime factors of $n$ to this result gives a formula for the number of abelian-nonprimitive full words of length $n$ over an alphabet of size $k$. In Section 3, we prove that if $p$ and $q$ are relatively prime integers greater than one, then there exists a partial word of length $pq$ with one hole having exact abelian periods $p$ and $q$ but not period 1. Moreover, such a word exists with the hole in any desired position, and all such words have the same Parikh vector, up to exchanging letters. The proof is based on an algorithm to construct such a partial word that allows the hole to take any position, and the counts of the letters do not depend on the hole’s placement. Moreover, we show how our algorithm can be modified to count the number of abelian-nonprimitive words with relatively prime abelian periods $p$ and $q$ and one hole in $O\left((p+q)^2 f(p)\right)$ time, where $f(p)$ is the maximum time it takes to compute $\binom{p}{r}$ for some $r$. In Section 4, we discuss the case of abelian-nonprimitive partial words with arbitrarily many holes. We provide a lower bound for the minimum hole count for partial words with given exact abelian periods and we give an algorithm to compute the lexicographically least such partial word (where the hole symbol is lexicographically after every letter). Finally, in Section 5, we conclude with some remarks and future research directions.

We end this section by discussing some preliminary concepts used throughout the paper.

A partial word of length $n$ over an alphabet $A$ is a partial function $w : \{0, \ldots, n-1\} \to A$. We denote by $|w|$ the length of partial word $w$. A value for which $w$ is undefined is called a hole of $w$, denoted by ⋄. As an example, ab⋄abba⋄b is a partial word of length 9 over the alphabet $\{a, b\}$ where $w(2)$ and $w(7)$ are holes of $w$. If $w$ has no holes, we call it a full word. A completion of a partial word $w$ is a full word built by filling in the holes of $w$ with letters from $A$.

We denote by $\varepsilon$ the empty word of length zero. A nonempty block or factor of a partial word $w$ consists of consecutive values $w(i)w(i+1)\cdots w(j)$ for some $i, j$ such that $0 \le i \le j < |w|$ and is denoted by $w[i..j]$ or $w[i..j+1)$.

Given a partial word $w$ over an alphabet of $k$ letters, denoted by $1, \ldots, k$, its Parikh vector is the vector with $k$ entries where the $\ell$th entry is the number of occurrences of letter $\ell$ in $w$. If $\vec{v}$ is a Parikh vector, we denote the sum of its entries by $|\vec{v}|$, which we call $\vec{v}$’s length, and we denote its $\ell$th entry by $\|\vec{v}\|_\ell$. Partial words $w_1$ and $w_2$ are abelian-equivalent if there exist completions $w_1'$ of $w_1$ and $w_2'$ of $w_2$ such that $w_1'$ and $w_2'$ have the same Parikh vector. A Parikh vector $\vec{v}$ is an exact abelian period of a partial word $w$ (we also say that $p = |\vec{v}|$ is an exact abelian period of $w$ and that $w$ is exactly abelian $p$-periodic) if $|\vec{v}|$ divides $|w|$ and for each integer $i$ with $0 \le i < \frac{|w|}{|\vec{v}|}$ we have that $w\left[i|\vec{v}| \,..\, (i+1)|\vec{v}|\right)$ has a completion with Parikh vector $\vec{v}$ (note that the definition of an abelian period given by Constantinescu and Ilie in [9] is different from the definition in our paper). A partial word $w$ is abelian-nonprimitive if it has an exact abelian period $\vec{v}$ with $|\vec{v}| < |w|$. A partial word $w$ is abelian-primitive otherwise. For example, consider

w = abbbbabb⋄babbb⋄abbb⋄babbbabb⋄babb⋄bbabbbabbb⋄abbbbabb⋄babbb⋄. (1)

It is easy to check that the Parikh vector $\vec{v} = (1, 3)$ (or abbb) is an exact abelian period of $w$. Here, $|\vec{v}| = 4$, $\|\vec{v}\|_1 = 1$, and $\|\vec{v}\|_2 = 3$.
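The definitions above can be checked mechanically. The following is a minimal sketch, not code from the paper: it assumes the hole is written as `?` and that the function names `has_exact_abelian_period` and `is_abelian_primitive` are our own. A block has a completion with Parikh vector $\vec{v}$ exactly when no letter already occurs more often than $\vec{v}$ allows, and a period length $p$ admits some suitable $\vec{v}$ exactly when the per-letter maxima over all blocks sum to at most $p$.

```python
from collections import Counter

HOLE = "?"  # assumption: "?" stands in for the hole symbol


def has_exact_abelian_period(w, parikh):
    """parikh maps each letter to its required count, e.g. {"a": 1, "b": 3}."""
    p = sum(parikh.values())
    if p == 0 or len(w) % p != 0:
        return False
    for start in range(0, len(w), p):
        counts = Counter(w[start:start + p])
        counts.pop(HOLE, None)
        # The holes can absorb the missing letters iff no quota is exceeded.
        if any(counts[c] > parikh.get(c, 0) for c in counts):
            return False
    return True


def is_abelian_primitive(w):
    """True iff w has no exact abelian period of length less than |w|."""
    n = len(w)
    for p in range(1, n):
        if n % p:
            continue
        blocks = [Counter(w[s:s + p]) for s in range(0, n, p)]
        letters = {c for b in blocks for c in b if c != HOLE}
        # Some Parikh vector of length p fits every block iff the
        # per-letter maxima over all blocks sum to at most p.
        if sum(max(b[c] for b in blocks) for c in letters) <= p:
            return False
    return True


w = "abbbbabb?babbb?abbb?babbbabb?babb?bbabbbabbb?abbbbabb?babbb?"
print(has_exact_abelian_period(w, {"a": 1, "b": 3}))
```

Applied to the word in (1), the check confirms that each of the 15 blocks of length 4 can be completed to a permutation of abbb.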

2 Abelian-Nonprimitive Full Words

In this section, we expand Domaratzki and Rampersad’s method of counting abelian-primitive and abelian-nonprimitive words to a method of counting all such words of a given length.

First, we construct a matrix. Later, we show that this matrix is a component of a linear system that describes precisely the words which we will be counting.

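Definition 1. Given a sequence of positive integers $\{p_i\} = \{p_i\}_{0 \le i < m}$ and a positive integer $k$, let $M(\{p_i\}, k)$ be the square block matrix defined as follows:

\[
M(\{p_i\}, k) =
\begin{pmatrix}
\ell_0 I_k & 0 & 0 & \cdots & 0 & -\ell_{m-1} I_k \\
0 & \ell_1 I_k & 0 & \cdots & 0 & -\ell_{m-1} I_k \\
0 & 0 & \ell_2 I_k & \cdots & 0 & -\ell_{m-1} I_k \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \ell_{m-2} I_k & -\ell_{m-1} I_k \\
0 & 0 & 0 & \cdots & 0 & U_k
\end{pmatrix},
\]

where $I_k$ is the $k \times k$ identity matrix, $\ell_j = \frac{\operatorname{lcm}\{p_i\}}{p_j}$ for $0 \le j < m$, and $U_k$ is the $k \times k$ matrix with ones on the diagonal and in the first row.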

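For example, the matrix $M(\{4, 5, 6, 7\}, 3)$ is equal to

\[
\left(\begin{array}{rrrrrrrrrrrr}
105 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -60 & 0 & 0 \\
0 & 105 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -60 & 0 \\
0 & 0 & 105 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -60 \\
0 & 0 & 0 & 84 & 0 & 0 & 0 & 0 & 0 & -60 & 0 & 0 \\
0 & 0 & 0 & 0 & 84 & 0 & 0 & 0 & 0 & 0 & -60 & 0 \\
0 & 0 & 0 & 0 & 0 & 84 & 0 & 0 & 0 & 0 & 0 & -60 \\
0 & 0 & 0 & 0 & 0 & 0 & 70 & 0 & 0 & -60 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 70 & 0 & 0 & -60 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 70 & 0 & 0 & -60 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right).
\]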
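As a quick way to reproduce the example, the sketch below builds $M(\{p_i\}, k)$ as a list of integer rows directly from Definition 1. The helper names `lcm_all` and `build_M` are ours, introduced only for illustration.

```python
from functools import reduce
from math import gcd


def lcm_all(nums):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), nums, 1)


def build_M(periods, k):
    """Return M({p_i}, k) from Definition 1 as a list of rows."""
    m = len(periods)
    L = lcm_all(periods)
    ell = [L // p for p in periods]          # l_j = lcm{p_i} / p_j
    size, base = m * k, (m - 1) * k
    M = [[0] * size for _ in range(size)]
    for j in range(m - 1):
        for t in range(k):
            M[j * k + t][j * k + t] = ell[j]       # block l_j * I_k
            M[j * k + t][base + t] = -ell[m - 1]   # block -l_{m-1} * I_k
    for t in range(k):
        M[base + t][base + t] = 1   # diagonal of U_k
        M[base][base + t] = 1       # first row of U_k
    return M


for row in build_M([4, 5, 6, 7], 3):
    print(row)
```

With periods 4, 5, 6, 7 the least common multiple is 420, which gives the block scalars 105, 84, 70, and 60 seen in the example.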

Now, we prove a lemma that allows us to invert the matrices from Definition 1.

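Lemma 1. Let $\{E_i\}_{1 \le i \le m}$ be a set of invertible square matrices and let $\{Q_j\}_{1 \le j < m}$ be a set of matrices, where $Q_j$ has the same number of rows as $E_m$ and the same number of columns as $E_j$. Given a block matrix

\[
M =
\begin{pmatrix}
E_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & E_2 & 0 & \cdots & 0 & 0 \\
0 & 0 & E_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & E_{m-1} & 0 \\
Q_1 & Q_2 & Q_3 & \cdots & Q_{m-1} & E_m
\end{pmatrix},
\]

the inverse is

\[
M^{-1} =
\begin{pmatrix}
E_1^{-1} & 0 & 0 & \cdots & 0 & 0 \\
0 & E_2^{-1} & 0 & \cdots & 0 & 0 \\
0 & 0 & E_3^{-1} & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & E_{m-1}^{-1} & 0 \\
-E_m^{-1} Q_1 E_1^{-1} & -E_m^{-1} Q_2 E_2^{-1} & -E_m^{-1} Q_3 E_3^{-1} & \cdots & -E_m^{-1} Q_{m-1} E_{m-1}^{-1} & E_m^{-1}
\end{pmatrix}.
\]

Similarly, given a block matrix (where now $Q_j$ has the same number of rows as $E_j$ and the same number of columns as $E_m$)

\[
M =
\begin{pmatrix}
E_1 & 0 & 0 & \cdots & 0 & Q_1 \\
0 & E_2 & 0 & \cdots & 0 & Q_2 \\
0 & 0 & E_3 & \cdots & 0 & Q_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & E_{m-1} & Q_{m-1} \\
0 & 0 & 0 & \cdots & 0 & E_m
\end{pmatrix},
\]

the inverse is

\[
M^{-1} =
\begin{pmatrix}
E_1^{-1} & 0 & 0 & \cdots & 0 & -E_1^{-1} Q_1 E_m^{-1} \\
0 & E_2^{-1} & 0 & \cdots & 0 & -E_2^{-1} Q_2 E_m^{-1} \\
0 & 0 & E_3^{-1} & \cdots & 0 & -E_3^{-1} Q_3 E_m^{-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & E_{m-1}^{-1} & -E_{m-1}^{-1} Q_{m-1} E_m^{-1} \\
0 & 0 & 0 & \cdots & 0 & E_m^{-1}
\end{pmatrix}.
\]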

Now, we actually invert the matrices from Definition 1.

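Lemma 2. The matrix $(M(\{p_i\}, k))^{-1}$ is equal to

\[
\begin{pmatrix}
\frac{1}{\ell_0} I_k & 0 & 0 & \cdots & \frac{p_0}{p_{m-1}} (U_k)^{-1} \\
0 & \frac{1}{\ell_1} I_k & 0 & \cdots & \frac{p_1}{p_{m-1}} (U_k)^{-1} \\
0 & 0 & \frac{1}{\ell_2} I_k & \cdots & \frac{p_2}{p_{m-1}} (U_k)^{-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & (U_k)^{-1}
\end{pmatrix},
\]

where $(U_k)^{-1}$ is the $k \times k$ matrix with ones on the diagonal and negative ones on the off-diagonal first-row entries.

Proof. First, notice that the claimed inverse of $U_k$ is, in fact, the inverse of $U_k$. Lemma 1 applies to $M(\{p_i\}, k)$. The inverses of the diagonal entries (other than $U_k$) are precisely the reciprocal multiples of the identity, as claimed. Finally, the last column of the inverse has entries of the form

\[
-(\ell_{j-1} I_k)^{-1} (-\ell_{m-1} I_k) (U_k)^{-1}
= \frac{\ell_{m-1}}{\ell_{j-1}} (U_k)^{-1}
= \frac{\operatorname{lcm}\{p_i\} / p_{m-1}}{\operatorname{lcm}\{p_i\} / p_{j-1}} (U_k)^{-1}
= \frac{p_{j-1}}{p_{m-1}} (U_k)^{-1},
\]

as required.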

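For example, $M(\{4, 5, 6, 7\}, 3)^{-1}$ is equal to

\[
\left(\begin{array}{rrrrrrrrrrrr}
\frac{1}{105} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{4}{7} & -\frac{4}{7} & -\frac{4}{7} \\
0 & \frac{1}{105} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{4}{7} & 0 \\
0 & 0 & \frac{1}{105} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{4}{7} \\
0 & 0 & 0 & \frac{1}{84} & 0 & 0 & 0 & 0 & 0 & \frac{5}{7} & -\frac{5}{7} & -\frac{5}{7} \\
0 & 0 & 0 & 0 & \frac{1}{84} & 0 & 0 & 0 & 0 & 0 & \frac{5}{7} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1}{84} & 0 & 0 & 0 & 0 & 0 & \frac{5}{7} \\
0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{70} & 0 & 0 & \frac{6}{7} & -\frac{6}{7} & -\frac{6}{7} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{70} & 0 & 0 & \frac{6}{7} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{70} & 0 & 0 & \frac{6}{7} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right).
\]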
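Lemma 2 can be sanity-checked with exact rational arithmetic. The sketch below rebuilds $M(\{p_i\}, k)$, assembles the claimed inverse block by block, and tests whether their product is the identity. All function names here are illustrative assumptions rather than anything defined in the paper.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm_all(nums):
    return reduce(lambda a, b: a * b // gcd(a, b), nums, 1)


def build_M(periods, k):
    m, L = len(periods), lcm_all(periods)
    size, base = m * k, (m - 1) * k
    M = [[0] * size for _ in range(size)]
    for j in range(m - 1):
        for t in range(k):
            M[j * k + t][j * k + t] = L // periods[j]      # l_j * I_k
            M[j * k + t][base + t] = -(L // periods[-1])   # -l_{m-1} * I_k
    for t in range(k):
        M[base + t][base + t] = 1   # diagonal of U_k
        M[base][base + t] = 1       # first row of U_k
    return M


def claimed_inverse(periods, k):
    """Assemble the matrix stated in Lemma 2."""
    m, L = len(periods), lcm_all(periods)
    size, base = m * k, (m - 1) * k
    # U_k^{-1}: ones on the diagonal, -1 on the off-diagonal first-row entries.
    U_inv = [[1 if r == c else (-1 if r == 0 else 0) for c in range(k)]
             for r in range(k)]
    inv = [[Fraction(0)] * size for _ in range(size)]
    for j in range(m - 1):
        for r in range(k):
            inv[j * k + r][j * k + r] = Fraction(periods[j], L)   # 1 / l_j
            for c in range(k):
                inv[j * k + r][base + c] = (
                    Fraction(periods[j], periods[-1]) * U_inv[r][c])
    for r in range(k):
        for c in range(k):
            inv[base + r][base + c] = Fraction(U_inv[r][c])
    return inv


def is_identity_product(A, B):
    """True iff the matrix product A * B is the identity."""
    n = len(A)
    return all(
        sum(A[i][t] * B[t][j] for t in range(n)) == (1 if i == j else 0)
        for i in range(n) for j in range(n))


print(is_identity_product(build_M([4, 5, 6, 7], 3),
                          claimed_inverse([4, 5, 6, 7], 3)))
```

Using `Fraction` avoids floating-point error, so the equality test is exact.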

We now introduce some definitions of a number-theoretic flavor.

Definition 2. Given a set of abelian periods $\{p_i\}$, we define the set

\[
\mathrm{PER}\{p_i\} = \{ n \in \mathbb{Z} \mid 0 \le n \le \operatorname{lcm}\{p_i\} \text{ and there exists } j \text{ such that } p_j \text{ divides } n \}.
\]

We treat $\mathrm{PER}\{p_i\}$ as a set with a total ordering given by the values of its elements and we index the elements from zero. The notation $\mathrm{PER}\{p_i\}_j$ denotes the $(j+1)$st smallest element in $\mathrm{PER}\{p_i\}$. For example, if $p_0 = 4$ and $p_1 = 6$, then $\mathrm{PER}\{p_i\} = \{0, 4, 6, 8, 12\}$ and $\mathrm{PER}\{p_i\}_0 = 0$, $\mathrm{PER}\{p_i\}_1 = 4$, $\mathrm{PER}\{p_i\}_2 = 6$, $\mathrm{PER}\{p_i\}_3 = 8$, and $\mathrm{PER}\{p_i\}_4 = 12$.

Definition 3. Given a set of abelian periods $\{p_i\}$, we define the multiset

\[
\mathrm{GAP}\{p_i\} = \biguplus_{j=1}^{\|\mathrm{PER}\{p_i\}\| - 1} \left\{ \mathrm{PER}\{p_i\}_j - \mathrm{PER}\{p_i\}_{j-1} \right\}.
\]

Returning to the above example with $p_0 = 4$ and $p_1 = 6$, $\mathrm{GAP}\{p_i\} = \{4, 2, 2, 4\}$.
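Both constructs are straightforward to compute. The following sketch (with our own helper names `per` and `gap`) lists PER in increasing order and takes consecutive differences to obtain GAP as a list.

```python
from functools import reduce
from math import gcd


def lcm_all(nums):
    return reduce(lambda a, b: a * b // gcd(a, b), nums, 1)


def per(periods):
    """PER{p_i}: multiples of some p_j in [0, lcm{p_i}], in increasing order."""
    L = lcm_all(periods)
    return [n for n in range(L + 1) if any(n % p == 0 for p in periods)]


def gap(periods):
    """GAP{p_i}: differences of consecutive elements of PER{p_i}."""
    P = per(periods)
    return [b - a for a, b in zip(P, P[1:])]


print(per([4, 6]))  # [0, 4, 6, 8, 12]
print(gap([4, 6]))  # [4, 2, 2, 4]
```

The gaps always sum to the least common multiple of the periods, since they partition one block of that length.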

2.1 Number of full words with given abelian periods

We now have the necessary framework in place, so we can state and prove the following main result about counting abelian-nonprimitive words.

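Theorem 1. Given a positive integer $n$ and a set of positive integers $\{p_i\}_{0 \le i < m}$, the number of full words of length $n \operatorname{lcm}\{p_i\}$ over an alphabet of size $k$ with abelian periods $\{p_i\}$ is given by

\[
\sum_{x_1 + x_2 + \cdots + x_k = \gcd\{p_i\}} \ \prod_{g \in \mathrm{GAP}\{p_i\}}
\binom{g}{\frac{g x_1}{\gcd\{p_i\}} \ \ \frac{g x_2}{\gcd\{p_i\}} \ \cdots \ \frac{g x_k}{\gcd\{p_i\}}}^{n}.
\]

The sum $\sum$ runs over all partitions $x_1 + x_2 + \cdots + x_k$ of $\gcd\{p_i\}$, that is, over all ordered $k$-tuples of nonnegative integers with this sum.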

Proof. We give a construction that yields this many words and then prove that the construction yields all such words.

Let $\{p_i\}_{0 \le i < m}$ be in increasing order. For the $\ell$th letter of the alphabet, let $y_{\ell,i}$ be the number of times that character should appear in each $p_i$-block that starts at a position congruent to zero modulo $p_i$. Over the course of the entire word, there is a total of $n y_{\ell,i} \frac{\operatorname{lcm}\{p_j\}}{p_i}$ of letter $\ell$. Hence, for every pair $i_1$ and $i_2$, we have $n y_{\ell,i_1} \frac{\operatorname{lcm}\{p_j\}}{p_{i_1}} = n y_{\ell,i_2} \frac{\operatorname{lcm}\{p_j\}}{p_{i_2}}$, or simply $y_{\ell,i_1} \frac{\operatorname{lcm}\{p_j\}}{p_{i_1}} = y_{\ell,i_2} \frac{\operatorname{lcm}\{p_j\}}{p_{i_2}}$. This gives a total of $\binom{m}{2} = \frac{m(m-1)}{2}$ equations, from which we can select $m - 1$ that are linearly independent. We choose these $m - 1$ to be the equations of the form $y_{\ell,i} \frac{\operatorname{lcm}\{p_j\}}{p_i} = y_{\ell,m-1} \frac{\operatorname{lcm}\{p_j\}}{p_{m-1}}$ for $0 \le i < m - 1$.

Another relation that we know holds is that

\[
\sum_{\ell=1}^{k} y_{\ell,i} = p_i.
\]

Two or more of the equations of this form are linearly dependent with the equations from above and each other, so we only include one of them:

\[
\sum_{\ell=1}^{k} y_{\ell,m-1} = p_{m-1}.
\]

Finally, we introduce $k - 1$ dummy equations of the form $y_{\ell,m-1} = y_{\ell,m-1}$ for $2 \le \ell \le k$. The values of the $y_{\ell,i}$’s can yield words with all of the desired abelian periods only if they are integers. Hence, we must discover in what situation the above-described linear system admits integer solutions.

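Consider the ordering on the $y_{\ell,i}$’s where $y_{\ell_1,i_1}$ precedes $y_{\ell_2,i_2}$ if $i_1 < i_2$ or $i_1 = i_2$ and $\ell_1 < \ell_2$. In this case, the matrix of coefficients of the system we have described is precisely $M(\{p_i\}, k)$. For notational simplicity, let $M = M(\{p_i\}, k)$. Solving the system corresponds to finding the vector of $y_{\ell,i}$ values (which we denote by $\vec{y}$) such that $M\vec{y} = \vec{v}$, where $\vec{v}$ is all zeroes except for its last $k$ entries, the first of which is $p_{m-1}$ and the rest of which are the $y_{\ell,m-1}$’s ordered by increasing $\ell$ beginning with $\ell = 2$. The solution to this system is given by $\vec{y} = M^{-1}\vec{v}$. Let $\vec{u}$ be the $k$-dimensional vector that consists of the last $k$ entries of $\vec{v}$. By Lemma 2,

\[
(y_{\ell,i}) = \vec{y} = M^{-1}\vec{v}
= \begin{pmatrix}
\frac{1}{\ell_0} I_k & 0 & 0 & \cdots & \frac{p_0}{p_{m-1}} (U_k)^{-1} \\
0 & \frac{1}{\ell_1} I_k & 0 & \cdots & \frac{p_1}{p_{m-1}} (U_k)^{-1} \\
0 & 0 & \frac{1}{\ell_2} I_k & \cdots & \frac{p_2}{p_{m-1}} (U_k)^{-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & (U_k)^{-1}
\end{pmatrix}
\begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ \vec{u} \end{pmatrix}
\]
\[
= \begin{pmatrix}
0 & 0 & 0 & \cdots & \frac{p_0}{p_{m-1}} (U_k)^{-1} \vec{u} \\
0 & 0 & 0 & \cdots & \frac{p_1}{p_{m-1}} (U_k)^{-1} \vec{u} \\
0 & 0 & 0 & \cdots & \frac{p_2}{p_{m-1}} (U_k)^{-1} \vec{u} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & (U_k)^{-1} \vec{u}
\end{pmatrix}
= \begin{pmatrix}
\frac{p_0}{p_{m-1}} \vec{w} \\
\frac{p_1}{p_{m-1}} \vec{w} \\
\frac{p_2}{p_{m-1}} \vec{w} \\
\vdots \\
\vec{w}
\end{pmatrix},
\]

where

\[
\vec{w} = (U_k)^{-1} \vec{u}
= \begin{pmatrix}
p_{m-1} - \sum_{\ell=2}^{k} y_{\ell,m-1} \\
y_{2,m-1} \\
y_{3,m-1} \\
\vdots \\
y_{k,m-1}
\end{pmatrix}
= \begin{pmatrix}
y_{1,m-1} \\
y_{2,m-1} \\
y_{3,m-1} \\
\vdots \\
y_{k,m-1}
\end{pmatrix}
\]

(the first entry of $\vec{w}$ is a number; the rest are variables).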

Hence, in order for the entries of $M^{-1}\vec{v}$ to be integers, it is required that $p_{m-1}$ divides each of the $p_i$ times each of the $y_{\ell,m-1}$’s for $\ell > 1$ (and, it can be deduced that it must divide for $\ell = 1$ too). Hence, for each $0 \le i < m - 1$ and each $\ell$, we can write $p_{m-1} \mid p_i y_{\ell,m-1}$. Dividing both sides by $\gcd(p_i, p_{m-1})$ yields $\frac{p_{m-1}}{\gcd(p_i, p_{m-1})} \mid \frac{p_i}{\gcd(p_i, p_{m-1})} y_{\ell,m-1}$. Since $\gcd\left(\frac{p_{m-1}}{\gcd(p_i, p_{m-1})}, \frac{p_i}{\gcd(p_i, p_{m-1})}\right) = 1$, it must be that $\frac{p_{m-1}}{\gcd(p_i, p_{m-1})} \mid y_{\ell,m-1}$. Taking all possible $i$ for each $\ell$, we get that $\frac{p_{m-1}}{\gcd\{p_i\}} \mid y_{\ell,m-1}$. We also know that $0 \le y_{\ell,i} \le p_i$. For the rest of the proof, denote $\frac{y_{\ell,m-1}}{p_{m-1}} \gcd\{p_i\}$ by $x_\ell$, which can take any integer value in $[0..\gcd\{p_i\}]$ (since $0 \le y_{\ell,m-1} \le p_{m-1}$).

We now show that any choice of the $x_\ell$ values whose sum is $\gcd\{p_i\}$ yields a nonzero number of words of length $n \operatorname{lcm}\{p_i\}$ with the desired abelian periods. Furthermore, we count these words for each choice of the $x_\ell$ values.

For $1 \le \ell \le k$, let $x_\ell$ be a nonnegative integer, and let

\[
\sum_{\ell=1}^{k} x_\ell = \gcd\{p_i\}.
\]

We begin by constructing one full word with abelian periods $\{p_i\}$. By Definition 2, we can split each block of $\operatorname{lcm}\{p_i\}$ positions into factors of length $\mathrm{PER}\{p_i\}_j - \mathrm{PER}\{p_i\}_{j-1}$. We know that $\gcd\{p_i\}$ must divide the lengths of all of these factors. Hence, we can fill each such factor of length $g$ by $\frac{g x_1}{\gcd\{p_i\}}$ of the first letter followed by $\frac{g x_2}{\gcd\{p_i\}}$ of the second letter, and so on, up to $\frac{g x_k}{\gcd\{p_i\}}$ of the $k$th letter. This process results in each $p_i$-block having exactly $y_{\ell,i}$ of letter $\ell$, so this word has all the desired abelian periods.

Note, now, that if we permute the letters in any of the factors of variable length $g$, we preserve all of the abelian periods. Hence, we know that this construction generates a lower bound for the number of words with the given abelian periods. Also, since the linear system we constructed has specific solutions, all of which lead to this construction, this also gives an upper bound. Hence, the number of words with the given $x_\ell$ values with all of the desired abelian periods is given by the total number of permutations within each variable length factor independently. In each of the $n$ blocks of length $\operatorname{lcm}\{p_i\}$ in the whole word, there is one factor of length $g$ for each $g \in \mathrm{GAP}\{p_i\}$. Hence, the total number of words with all of the desired abelian periods for the values of $x_\ell$ is

\[
\prod_{g \in \mathrm{GAP}\{p_i\}}
\binom{g}{\frac{g x_1}{\gcd\{p_i\}} \ \ \frac{g x_2}{\gcd\{p_i\}} \ \cdots \ \frac{g x_k}{\gcd\{p_i\}}}^{n}.
\]

Summing this over all possible choices of the $x_\ell$ values yields the desired formula.

Example 1. To illustrate the construction in the proof of Theorem 1, given alphabet size $k = 2$ and abelian periods $p_0 = 4$ and $p_1 = 6$, we construct words with abelian periods $p_0$ and $p_1$. Initial computations show that $\gcd\{p_i\} = 2$, $\operatorname{lcm}\{p_i\} = 12$, and $\mathrm{PER}\{p_i\} = \{0, 4, 6, 8, 12\}$. We split each 12-block into factors of length $\mathrm{PER}\{p_i\}_j - \mathrm{PER}\{p_i\}_{j-1}$, that is, factors of length 4, 2, 2, 4. Choosing $x_1 = x_2 = 1$, we fill each such factor of length 2 by $\frac{2x_1}{\gcd\{p_i\}}$ a’s followed by $\frac{2x_2}{\gcd\{p_i\}}$ b’s and we fill each such factor of length 4 by $\frac{4x_1}{\gcd\{p_i\}}$ a’s followed by $\frac{4x_2}{\gcd\{p_i\}}$ b’s. This process results in (aabb)(ab)(ab)(aabb). The word (abab)(ab)(ba)(bbaa) with the abelian periods $p_0$ and $p_1$ can be obtained by permuting letters within the factors. The word generated if $x_1 = 2$ and $x_2 = 0$ is aaaaaaaaaaaa.
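The formula of Theorem 1 is easy to evaluate and to cross-check against exhaustive search on small inputs. The sketch below is one possible implementation under our own naming (`count_words`, `brute_force`, and the helpers are hypothetical); it enumerates all ordered $k$-tuples summing to $\gcd\{p_i\}$ and multiplies the multinomial coefficients over GAP.

```python
from functools import reduce
from itertools import product
from math import factorial, gcd


def lcm_all(nums):
    return reduce(lambda a, b: a * b // gcd(a, b), nums, 1)


def gap(periods):
    L = lcm_all(periods)
    P = [n for n in range(L + 1) if any(n % p == 0 for p in periods)]
    return [b - a for a, b in zip(P, P[1:])]


def multinomial(parts):
    """(sum of parts)! divided by the product of the factorials of the parts."""
    out = factorial(sum(parts))
    for x in parts:
        out //= factorial(x)
    return out


def compositions(total, k):
    """Yield all ordered k-tuples of nonnegative integers summing to total."""
    if k == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, k - 1):
            yield (first,) + rest


def count_words(periods, k, n=1):
    """Theorem 1: words of length n*lcm with all the given exact abelian periods."""
    d = reduce(gcd, periods)
    gaps = gap(periods)
    total = 0
    for xs in compositions(d, k):
        prod = 1
        for g in gaps:
            prod *= multinomial([g * x // d for x in xs])
        total += prod ** n
    return total


def brute_force(periods, k, n=1):
    """Exhaustive count, feasible only for short words."""
    length = n * lcm_all(periods)

    def ok(word):
        return all(
            sorted(word[s:s + p]) == sorted(word[:p])
            for p in periods for s in range(p, length, p))

    return sum(ok(w) for w in product(range(k), repeat=length))


print(count_words([4, 6], 2))  # 146
```

For the periods 4 and 6 of Example 1 over a binary alphabet, the choice $x_1 = x_2 = 1$ contributes $6 \cdot 2 \cdot 2 \cdot 6 = 144$ words and the two constant words contribute one each, for a total of 146.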

The formula in Theorem 1 contains $\mathrm{GAP}\{p_i\}$, though all we really care about are the values and their frequencies in that multiset, as multiplication is commutative. Hence, if we can figure out that information more quickly than computing the entire multiset, we obtain a formula that is faster to compute (though not asymptotically faster). In the case of only one or two abelian periods, we can glean more information about $\mathrm{GAP}\{p_i\}$. This leads to the following two corollaries.

Corollary 1. Given positive integers $n$ and $p$, the number of full words of length $np$ over an alphabet of size $k$ with abelian period $p$ is

\[
\sum_{x_1 + x_2 + \cdots + x_k = p} \binom{p}{x_1 \ x_2 \ \cdots \ x_k}^{n}.
\]

Proof. In this case, $\mathrm{PER}\{p\} = \{0, p\}$ and $\mathrm{GAP}\{p\} = \{p\}$.

Corollary 2. Let $n$ be a positive integer, let $p$ and $q$ be positive integers such that $p < q$, and set $p = p'd$ and $q = q'd$ where $\gcd(p, q) = d$. The number of full words of length $n \operatorname{lcm}(p, q)$ over an alphabet of size $k$ that are exactly abelian $p$-periodic and exactly abelian $q$-periodic is

\[
\sum_{x_1 + x_2 + \cdots + x_k = d}
\left(
\binom{p}{p'x_1 \ p'x_2 \ \cdots \ p'x_k}^{q'-p'+1}
\prod_{i=1}^{p'-1}
\left(
\binom{id}{ix_1 \ ix_2 \ \cdots \ ix_k}
\binom{p - id}{(p'-i)x_1 \ (p'-i)x_2 \ \cdots \ (p'-i)x_k}
\right)
\right)^{n}.
\]

In the case of $k = 2$, this becomes

\[
\sum_{x=0}^{d}
\left(
\binom{p}{p'x}^{q'-p'+1}
\prod_{i=1}^{p'-1}
\left(
\binom{id}{ix}
\binom{p - id}{(p'-i)x}
\right)
\right)^{n}.
\]

Proof. Given only two abelian periods $p < q$, we can compute the frequencies of the elements of $\mathrm{GAP}\{p, q\}$. Without loss of generality, assume that $p$ and $q$ are relatively prime, as $\mathrm{GAP}\{p, q\} = d\,\mathrm{GAP}\{p', q'\}$. Each element of $\mathrm{GAP}\{p, q\}$ can be no more than $p$. More specifically, instances of elements less than $p$ occur in pairs, and they occur whenever a $q$-abelian period ends inside a $p$-abelian period. Since each of the congruences $aq \equiv b \pmod{p}$ for $1 \le b < p$ has exactly one solution for $0 \le a < p$, there is exactly one split of a $p$-abelian period at each of those positions. Each of these splits contributes a $b$-term and a $(p - b)$-term to $\mathrm{GAP}\{p, q\}$. The rest of the terms ($q' - p' + 1$ of them) are $p$. We can write Theorem 1 as

\[
\sum_{x_1 + x_2 + \cdots + x_k = \gcd\{p_i\}} \ \prod_{g \in \mathrm{GAP}\{p_i\}}
\binom{g}{\frac{g x_1}{\gcd\{p_i\}} \ \ \frac{g x_2}{\gcd\{p_i\}} \ \cdots \ \frac{g x_k}{\gcd\{p_i\}}}^{n}
\]
\[
= \sum_{x_1 + x_2 + \cdots + x_k = d}
\left(
\binom{p}{p'x_1 \ p'x_2 \ \cdots \ p'x_k}^{q'-p'+1}
\prod_{i=1}^{p'-1}
\binom{id}{ix_1 \ ix_2 \ \cdots \ ix_k}^{2}
\right)^{n}
\]
\[
= \sum_{x_1 + x_2 + \cdots + x_k = d}
\left(
\binom{p}{p'x_1 \ p'x_2 \ \cdots \ p'x_k}^{q'-p'+1}
\prod_{i=1}^{p'-1}
\left(
\binom{id}{ix_1 \ ix_2 \ \cdots \ ix_k}
\binom{p - id}{(p'-i)x_1 \ (p'-i)x_2 \ \cdots \ (p'-i)x_k}
\right)
\right)^{n},
\]

as required.
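As a consistency check, the two-period formula of Corollary 2 can be compared numerically with the general formula of Theorem 1. The sketch below is illustrative only: `count_two_periods` and `count_words` are our own names, and the code assumes $p < q$ as in the corollary.

```python
from functools import reduce
from math import factorial, gcd


def multinomial(parts):
    out = factorial(sum(parts))
    for x in parts:
        out //= factorial(x)
    return out


def compositions(total, k):
    """All ordered k-tuples of nonnegative integers summing to total."""
    if k == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, k - 1):
            yield (first,) + rest


def gap(periods):
    L = reduce(lambda a, b: a * b // gcd(a, b), periods, 1)
    P = [n for n in range(L + 1) if any(n % p == 0 for p in periods)]
    return [b - a for a, b in zip(P, P[1:])]


def count_words(periods, k, n=1):
    """General formula of Theorem 1."""
    d = reduce(gcd, periods)
    total = 0
    for xs in compositions(d, k):
        prod = 1
        for g in gap(periods):
            prod *= multinomial([g * x // d for x in xs])
        total += prod ** n
    return total


def count_two_periods(p, q, k, n=1):
    """Corollary 2, assuming p < q."""
    d = gcd(p, q)
    pp, qq = p // d, q // d          # p' and q'
    total = 0
    for xs in compositions(d, k):
        term = multinomial([pp * x for x in xs]) ** (qq - pp + 1)
        for i in range(1, pp):
            term *= multinomial([i * x for x in xs])          # gap of length i*d
            term *= multinomial([(pp - i) * x for x in xs])   # gap of length p - i*d
        total += term ** n
    return total


print(count_two_periods(4, 6, 2), count_words([4, 6], 2))
```

The specialized version never builds PER or GAP, which is the saving the corollary is after.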

2.2 Number of abelian-nonprimitive full words with given length

The next theorem is the other main result about counting abelian-nonprimitive words. Given a length and using what we have shown already, we can determine exactly how many abelian-nonprimitive words of that length there are.

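Theorem 2. Let $n \ge 2$ be a positive integer whose set of prime factors is $P = \{p_i\}$. Let $Q = \left\{\frac{n}{p_i}\right\} = \{q_i\}$. Give $Q$ the standard total ordering, and let the indices range from 0 to $\|Q\| - 1$. The number of abelian-nonprimitive full words of length $n$ over an alphabet of size $k$ is given by

\[
\sum_{\substack{R \subseteq Q \\ R \ne \emptyset}} (-1)^{\|R\|+1}
\sum_{x_1 + x_2 + \cdots + x_k = \gcd(R)} \ \prod_{g \in \mathrm{GAP}(R)}
\binom{g}{\frac{g x_1}{\gcd(R)} \ \ \frac{g x_2}{\gcd(R)} \ \cdots \ \frac{g x_k}{\gcd(R)}}^{\frac{n}{\operatorname{lcm}(R)}}.
\]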

Proof. First, we show that any factor of $n$ besides $n$ itself must divide at least one of the $q_i$’s. Write $n$’s prime factorization as $n = p_0^{\alpha_0} p_1^{\alpha_1} \cdots \left(p_{\|P\|-1}\right)^{\alpha_{\|P\|-1}}$, and let $m \mid n$ and $m < n$. The prime factorization of $m$ must be of the form $m = p_0^{\beta_0} p_1^{\beta_1} \cdots \left(p_{\|P\|-1}\right)^{\beta_{\|P\|-1}}$, where for all $i$, $\beta_i \le \alpha_i$ and there exists $j$ such that $\beta_j < \alpha_j$. Hence, by the definition of $Q$, we know that $m \mid q_j$, as required.

Next, note that since every factor of $n$ other than $n$ itself divides one of the $q_i$’s, if a word of length $n$ has as abelian period any of $n$’s nonself factors, then it also has as abelian period at least one of the $q_i$’s. Thus, only counting abelian-nonprimitive words of length $n$ with abelian periods in $Q$ actually amounts to counting all abelian-nonprimitive words of length $n$.

Finally, for each of the $q_i$’s we count the number of words that have abelian period $q_i$, which is given by Corollary 1 (or by Theorem 1). By the Principle of Inclusion-Exclusion, we must then subtract all words that have pairs of those abelian periods, add back those with three of them, and so on, up to either adding or subtracting (depending on $\|Q\|$) the number of words with all of the abelian periods. These formulas are all obtained from Theorem 1, and adding them together in alternating fashion yields the desired formula.
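The inclusion-exclusion count of Theorem 2 can be sketched as follows. This is a hedged illustration with hypothetical function names (`count_abelian_nonprimitive`, `brute_force_nonprimitive`), and the exhaustive counter is included only to validate small cases.

```python
from functools import reduce
from itertools import combinations, product
from math import factorial, gcd


def lcm_all(nums):
    return reduce(lambda a, b: a * b // gcd(a, b), nums, 1)


def multinomial(parts):
    out = factorial(sum(parts))
    for x in parts:
        out //= factorial(x)
    return out


def compositions(total, k):
    if k == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, k - 1):
            yield (first,) + rest


def count_words(periods, k, n):
    """Theorem 1 for words of length n * lcm(periods)."""
    d, L = reduce(gcd, periods), lcm_all(periods)
    P = [t for t in range(L + 1) if any(t % p == 0 for p in periods)]
    gaps = [b - a for a, b in zip(P, P[1:])]
    total = 0
    for xs in compositions(d, k):
        prod = 1
        for g in gaps:
            prod *= multinomial([g * x // d for x in xs])
        total += prod ** n
    return total


def count_abelian_nonprimitive(n, k):
    """Theorem 2: inclusion-exclusion over the subsets R of Q = {n / prime}."""
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % r for r in range(2, p))]
    Q = [n // p for p in primes]
    total = 0
    for size in range(1, len(Q) + 1):
        for R in combinations(Q, size):
            sign = (-1) ** (size + 1)
            total += sign * count_words(list(R), k, n // lcm_all(R))
    return total


def brute_force_nonprimitive(n, k):
    def nonprimitive(w):
        return any(
            n % p == 0 and all(sorted(w[s:s + p]) == sorted(w[:p])
                               for s in range(p, n, p))
            for p in range(1, n))
    return sum(nonprimitive(w) for w in product(range(k), repeat=n))


print(count_abelian_nonprimitive(6, 2))  # 28
```

For $n = 6$ and $k = 2$ we have $Q = \{2, 3\}$: there are 20 words with abelian period 3, 10 with abelian period 2, and 2 with both, giving $20 + 10 - 2 = 28$.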


3 Abelian-Nonprimitive Partial Words with One Hole

In this section, we discuss results about abelian primitivity of partial words with exactly one hole. We show that having even only one hole greatly increases the flexibility of what abelian periods a word can have. We begin with the following lemma.

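Lemma 3. Let $p$, $q$, and $n$, $p < q$, be positive integers. We have $n\left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{n(q \bmod p)}{p} \right\rfloor = \left\lfloor \frac{nq}{p} \right\rfloor$.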

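Proof. Note that $\left\lfloor \frac{q}{p} \right\rfloor = \frac{q}{p} - \frac{q \bmod p}{p}$, $\left\lfloor \frac{n(q \bmod p)}{p} \right\rfloor = \frac{n(q \bmod p)}{p} - \frac{(n(q \bmod p)) \bmod p}{p}$, and $\left\lfloor \frac{nq}{p} \right\rfloor = \frac{nq}{p} - \frac{(nq) \bmod p}{p}$. Applying these facts yields

\[
\begin{aligned}
n\left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{n(q \bmod p)}{p} \right\rfloor
&= n\left(\frac{q}{p} - \frac{q \bmod p}{p}\right) + \frac{n(q \bmod p)}{p} - \frac{(n(q \bmod p)) \bmod p}{p} \\
&= \frac{nq}{p} - \frac{(n(q \bmod p)) \bmod p}{p} \\
&= \frac{nq}{p} - \frac{(nq) \bmod p}{p} \\
&= \left\lfloor \frac{nq}{p} \right\rfloor,
\end{aligned}
\]

as required.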
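A one-line numeric check of Lemma 3 may help when experimenting with the floor expressions that follow. The function name `lemma3_holds` is ours.

```python
def lemma3_holds(p, q, n):
    """Check n*floor(q/p) + floor(n*(q mod p)/p) == floor(n*q/p)."""
    return n * (q // p) + (n * (q % p)) // p == (n * q) // p


# Worked instance: p = 3, q = 5, n = 4 gives 4*1 + floor(8/3) = 6 = floor(20/3).
print(lemma3_holds(3, 5, 4))
```

Python's `//` and `%` on nonnegative integers match the floor and mod used in the lemma.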

Now, we define another number-theoretic construct. For positive integers $p$ and $q$, we set $p = p'd$ and $q = q'd$, where $\gcd(p, q) = d$.

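Definition 4. For positive integers $p$ and $q$, let

\[
\mathrm{sq}_{p,q}(x) = \left\lfloor \frac{(x+1)(q' \bmod p')}{p'} \right\rfloor - \left\lfloor \frac{x(q' \bmod p')}{p'} \right\rfloor - c,
\]

where $c = 0$ if $x \equiv 0 \bmod p'$ and $c = 1$ otherwise.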

We now prove a few simple properties about $\mathrm{sq}$, as this function is important in our further discussion.

Lemma 4. For integers $0 \le x < p'$, there are exactly $p' - (q' \bmod p') - 1$ values of $\mathrm{sq}_{p,q}(x)$ that equal $-1$; the rest all equal 0.

Proof. The value of $\mathrm{sq}_{p,q}(x)$ for $x$ in the desired domain is $-1$ precisely when $x \ne 0$ and

\[
\left(\gcd(p, q)(x+1)q'\right) \bmod p' > \left(\gcd(p, q)\,xq'\right) \bmod p'.
\]

This occurs for all values of $x > 0$ such that

\[
(xq') \bmod p' < p' - (q' \bmod p').
\]

By the Chinese Remainder Theorem, each such value from 0 to $p' - (q' \bmod p')$ (not inclusive at either end) occurs once for $0 \le x < p'$. There are exactly $p' - (q' \bmod p') - 1$ such values, so that is the number of $-1$ values.

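Lemma 5. The function $\mathrm{sq}_{p,q}(x)$ is $p'$-periodic.

Proof. Let $y = x + mp'$ for some integer $m$. Note that the $c$ values for $x$ and $y$ are the same. We have

\[
\begin{aligned}
&\left\lfloor \frac{(y+1)(q' \bmod p')}{p'} \right\rfloor - \left\lfloor \frac{y(q' \bmod p')}{p'} \right\rfloor - c \\
&= \left\lfloor \frac{(x + mp' + 1)(q' \bmod p')}{p'} \right\rfloor - \left\lfloor \frac{(x + mp')(q' \bmod p')}{p'} \right\rfloor - c \\
&= \left\lfloor \frac{(x+1)(q' \bmod p') + mp'(q' \bmod p')}{p'} \right\rfloor - \left\lfloor \frac{x(q' \bmod p') + mp'(q' \bmod p')}{p'} \right\rfloor - c \\
&= \left\lfloor \frac{(x+1)(q' \bmod p')}{p'} + m(q' \bmod p') \right\rfloor - \left\lfloor \frac{x(q' \bmod p')}{p'} + m(q' \bmod p') \right\rfloor - c \\
&= \left\lfloor \frac{(x+1)(q' \bmod p')}{p'} \right\rfloor + m(q' \bmod p') - \left\lfloor \frac{x(q' \bmod p')}{p'} \right\rfloor - m(q' \bmod p') - c \\
&= \left\lfloor \frac{(x+1)(q' \bmod p')}{p'} \right\rfloor - \left\lfloor \frac{x(q' \bmod p')}{p'} \right\rfloor - c,
\end{aligned}
\]

as required.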

Lemma 6. The equality $\mathrm{sq}_{p,q} = \mathrm{sq}_{p',q'}$ holds.

Proof. First, note that $c$ is the same for a given $x$ in both cases, as $\gcd(p', q') = 1$. Dividing $p$ and $q$ by their greatest common divisor causes the numerator to decrease by a factor of their greatest common divisor and it causes the denominator to decrease by that same factor. Hence, $\mathrm{sq}_{p',q'}$ gives the same result as $\mathrm{sq}_{p,q}$.
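The function of Definition 4 and the properties in Lemmas 4 to 6 can be explored with a few lines of code. This is a sketch under our own naming (`sq` as a plain Python function taking $p$, $q$, and $x$).

```python
from math import gcd


def sq(p, q, x):
    """sq_{p,q}(x) from Definition 4, with p' = p/gcd(p,q) and q' = q/gcd(p,q)."""
    d = gcd(p, q)
    pp, qq = p // d, q // d
    r = qq % pp                       # q' mod p'
    c = 0 if x % pp == 0 else 1
    return ((x + 1) * r) // pp - (x * r) // pp - c


# Values over one period for p = 5, q = 7 (so q' mod p' = 2):
print([sq(5, 7, x) for x in range(5)])  # [0, -1, 0, -1, 0]
```

In this instance exactly $5 - 2 - 1 = 2$ values equal $-1$, as Lemma 4 predicts.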

3.1 Main result for partial words with one hole

In this section, we show how to construct partial words with one hole with abelian periods not realized by full words.

Let $p$ and $q$ be relatively prime integers greater than one. We give an algorithm (Algorithm 1) to construct a partial word with exactly one hole of length $pq$ that is abelian $p$-periodic and abelian $q$-periodic but not 1-periodic. This algorithm allows the hole to take any position, and the counts of the letters do not depend on the hole’s placement. Also, we exclusively work over the alphabet $\{a, b\}$, as for a given alphabet, we can do this construction with any sub-alphabet of size 2.

We begin by finding properties of full words $u = wbw'$ and $v = waw'$ of length $pq$ over the binary alphabet $\{a, b\}$ such that $u$ is abelian $p$-periodic and $v$ is abelian $q$-periodic (and $ww'$ is a full word of length $pq - 1$), if such words exist. Let $x_a$ be the number of a’s in one abelian period of $u$, $x_b$ the number of b’s in one abelian period of $u$, $y_a$ the number of a’s in one abelian period of $v$, and $y_b$ the number of b’s in one abelian period of $v$. We can see immediately that $x_a q = y_a p - 1$ and $x_b q = y_b p + 1$. We also know that $y_a + y_b = q$ (and that $x_a + x_b = p$, though this can be deduced from the other three equations). Hence, these equations along with the dummy equation $y_b = y_b$ give the linear system

\[
\begin{pmatrix}
q & 0 & -p & 0 \\
0 & q & 0 & -p \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x_a \\ x_b \\ y_a \\ y_b \end{pmatrix}
=
\begin{pmatrix} -1 \\ 1 \\ q \\ y_b \end{pmatrix}.
\]

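We note that the inverse of the $4 \times 4$ matrix (which is $M(\{p, q\}, 2)$) is

\[
\begin{pmatrix}
\frac{1}{q} & 0 & \frac{p}{q} & -\frac{p}{q} \\
0 & \frac{1}{q} & 0 & \frac{p}{q} \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 1
\end{pmatrix},
\]

so the solution is given by

\[
\begin{pmatrix} x_a \\ x_b \\ y_a \\ y_b \end{pmatrix}
=
\begin{pmatrix}
\frac{1}{q} & 0 & \frac{p}{q} & -\frac{p}{q} \\
0 & \frac{1}{q} & 0 & \frac{p}{q} \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} -1 \\ 1 \\ q \\ y_b \end{pmatrix}
=
\begin{pmatrix}
p - \frac{1 + p y_b}{q} \\
\frac{1 + p y_b}{q} \\
q - y_b \\
y_b
\end{pmatrix}.
\]

If words $u$ and $v$ exist, all the entries in the solution vector must be integers. This happens if and only if $q$ divides $(1 + p y_b)$. This occurs if and only if $y_b \equiv -p^{-1} \pmod{q}$. More specifically, we can say that $y_b = q - (p^{-1} \bmod q)$, as this is the only such value in the interval $[0, q]$. If we can construct such words $u$ and $v$, their greatest lower bound $w{\diamond}w'$ is a partial word of length $pq$ with both abelian periods $p$ and $q$ that is not 1-periodic and has exactly one hole. We do not construct $u$ or $v$; rather, we construct their greatest lower bound directly.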
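The solution of the linear system can be computed directly from $p$ and $q$. The sketch below (function name `letter_counts` is ours) finds $p^{-1} \bmod q$ by a simple search and returns the four letter counts.

```python
def letter_counts(p, q):
    """Return (x_a, x_b, y_a, y_b) for relatively prime integers p, q > 1."""
    # Modular inverse of p modulo q by direct search (exists since gcd(p, q) = 1).
    p_inv = next(t for t in range(1, q) if (p * t) % q == 1)
    y_b = q - p_inv
    y_a = q - y_b
    x_b = (1 + p * y_b) // q   # exact division by the choice of y_b
    x_a = p - x_b
    return x_a, x_b, y_a, y_b


print(letter_counts(3, 5))  # (1, 2, 2, 3)
```

For $p = 3$ and $q = 5$ this gives $x_a q = 5 = y_a p - 1$ and $x_b q = 10 = y_b p + 1$, matching the two defining equations.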

Our algorithm, based on the above, is described as follows:

1. Let $h$ be the position of the hole and let $w$ be the word built so far (initialized to $\varepsilon$).

2. Keep track of how many of each letter are still needed in the current abelian periods (in variables called $n_{x_a}$, $n_{x_b}$, $n_{y_a}$, and $n_{y_b}$ that are updated appropriately throughout).

3. For each pair $(r, s)$ of consecutive elements in $\mathrm{PER}\{p, q\}$:

(a) In the case where $p$ divides $s$, if $r \le h < s$ then $w \leftarrow w\,a^{n_{x_a}} b^{n_{x_b}-1}$ and insert a hole in the proper position; otherwise $w \leftarrow w\,a^{n_{x_a}} b^{n_{x_b}}$;

(b) In the case where $q$ divides $s$, if $r \le h < s$ then $w \leftarrow w\,a^{n_{y_a}-1} b^{n_{y_b}}$ and insert a hole in the proper position; otherwise $w \leftarrow w\,a^{n_{y_a}} b^{n_{y_b}}$.

4. Return $w$.

Essentially, Algorithm 1 goes over every block between two elements of $\mathrm{PER}\{p, q\}$, and it fills in the remaining letters of whichever block ends there (as only one does, unless it is the end). It can fill these in any order, but it chooses to put all of the a’s before all of the b’s.

Example 2. Given the parameters $p = 3$, $q = 5$, and $h = 7$, Algorithm 1’s initial computations are $x_a = 1$, $x_b = 2$, $y_a = 2$, $y_b = 3$, and $\mathrm{PER}\{p, q\} = \{0, 3, 5, 6, 9, 10, 12, 15\}$. It outputs the partial word abbabba⋄bbababb which has a hole in position 7 and which is abelian 3- and 5-periodic, but not 1-periodic.

Theorem 3. Given as input two relatively prime integers $p$ and $q$ greater than one and a nonnegative integer $h$ such that $0 \le h < pq$, Algorithm 1 outputs a partial word of length $pq$ with exactly one hole at position $h$ having abelian periods $p$ and $q$ but not period 1. Moreover, all such words have the same Parikh vector, up to exchanging letters.


Algorithm 1 GENERATE ABPQ ONE HOLE (p, q, h)
Require: p and q relatively prime positive integers
Require: h nonnegative integer (hole position), 0 ≤ h < pq
  if q < p then
    swap(p, q)
  $y_b \leftarrow q - (p^{-1} \bmod q)$ and $y_a \leftarrow q - y_b$
  $x_b \leftarrow (1 + p y_b)/q$ and $x_a \leftarrow p - x_b$
  $n_{x_a} \leftarrow x_a$   // Number of additional a's needed in current p-block
  $n_{x_b} \leftarrow x_b$   // Number of additional b's needed in current p-block
  $n_{y_a} \leftarrow y_a$   // Number of additional a's needed in current q-block
  $n_{y_b} \leftarrow y_b$   // Number of additional b's needed in current q-block
  per ← PER {p, q}
  holed ← False
  last ← Nil
  w ← ε
  for item in per do
    if item = 0 then
      Do Nothing
    else if item mod p = 0 then
      if ¬holed ∧ item > h then
        holed ← True
        $v \leftarrow a^{n_{x_a}} b^{n_{x_b}-1}$
        v ← v[0..h − last) ⋄ v[h − last..|v|)
        w ← wv
        $n_{y_a} \leftarrow n_{y_a} - (n_{x_a} + 1)$ and $n_{y_b} \leftarrow n_{y_b} - (n_{x_b} - 1)$
      else
        $w \leftarrow w\,a^{n_{x_a}} b^{n_{x_b}}$
        $n_{y_a} \leftarrow n_{y_a} - n_{x_a}$ and $n_{y_b} \leftarrow n_{y_b} - n_{x_b}$
      $n_{x_a} \leftarrow x_a$ and $n_{x_b} \leftarrow x_b$
    else   // item mod q = 0
      if ¬holed ∧ item > h then
        holed ← True
        $v \leftarrow a^{n_{y_a}-1} b^{n_{y_b}}$
        v ← v[0..h − last) ⋄ v[h − last..|v|)
        w ← wv
        $n_{x_b} \leftarrow n_{x_b} - (n_{y_b} + 1)$ and $n_{x_a} \leftarrow n_{x_a} - (n_{y_a} - 1)$
      else
        $w \leftarrow w\,a^{n_{y_a}} b^{n_{y_b}}$
        $n_{x_a} \leftarrow n_{x_a} - n_{y_a}$ and $n_{x_b} \leftarrow n_{x_b} - n_{y_b}$
      $n_{y_a} \leftarrow y_a$ and $n_{y_b} \leftarrow y_b$
    last ← item
  return w


Proof. The result follows if we can show that the variables $n_{x_a}$, $n_{x_b}$, $n_{y_a}$, and $n_{y_b}$ are all nonnegative throughout the execution of the for loop. It is clear that all four are nonnegative initially. We now give bounds for all of these values over the course of the execution, and then we show that these bounds are nonnegative. These bounds are derived by considering the case with a hole in the first position, and then we extend them to any position.

There are a total of p blocks of length q, called q-blocks, starting at a position congruent to zero modulo q. Hence, there are p positions where such blocks end (including the end of the word). The minimum values that $n_{y_a}$ and $n_{y_b}$ obtain are both just before (or at, in the case of the last one) the end of one of these blocks. The only positions where $n_{x_a}$ and $n_{x_b}$ take values less than $x_a$ and $x_b$ are just after one of these q-blocks ends. Number the q-blocks from 1 to p. Let $m_{r-1}$ denote the number of p-blocks (beginning at indices congruent to zero modulo p) completely contained within the q-block numbered r. We claim that $m_{r-1} = \lfloor q/p \rfloor + \mathrm{sq}_{p,q}(r-1)$.

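Notice that $m_{r-1}$ is the difference between q and the amount subtracted on both ends for incomplete p-blocks, all divided by p. If $r - 1 = 0$, this obviously equals $\lfloor q/p \rfloor + \mathrm{sq}_{p,q}(r-1)$. Assume that $r - 1 > 0$ (so c = 1 in Definition 4). The number of letters to the left of the first complete p-block is given by $p - (((r-1)q) \bmod p)$. The number of letters to the right of the last complete p-block is given by $rq \bmod p$. Hence,

\[
\begin{aligned}
m_{r-1} &= \frac{q - \bigl(p - (((r-1)q) \bmod p)\bigr) - (rq \bmod p)}{p} \\
&= \frac{q - p + (((r-1)q) \bmod p) - (rq \bmod p)}{p} \\
&= \frac{q}{p} - 1 + \frac{((r-1)q) \bmod p}{p} - \frac{rq \bmod p}{p} \\
&= \left\lfloor \frac{q}{p} \right\rfloor + \frac{q \bmod p}{p} + \frac{((r-1)q) \bmod p}{p} - \frac{rq \bmod p}{p} - 1 \\
&= \left\lfloor \frac{q}{p} \right\rfloor + \frac{q \bmod p}{p} - \frac{(rq) \bmod p - ((r-1)q) \bmod p}{p} - 1 \\
&= \left\lfloor \frac{q}{p} \right\rfloor + \frac{r(q \bmod p)}{p} - \frac{(r-1)(q \bmod p)}{p} - \frac{(r(q \bmod p)) \bmod p}{p} + \frac{((r-1)(q \bmod p)) \bmod p}{p} - 1 \\
&= \left\lfloor \frac{q}{p} \right\rfloor + \frac{r(q \bmod p)}{p} - \frac{(r(q \bmod p)) \bmod p}{p} - \frac{(r-1)(q \bmod p)}{p} + \frac{((r-1)(q \bmod p)) \bmod p}{p} - 1 \\
&= \left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{r(q \bmod p)}{p} \right\rfloor - \left\lfloor \frac{(r-1)(q \bmod p)}{p} \right\rfloor - 1 \\
&= \left\lfloor \frac{q}{p} \right\rfloor + \mathrm{sq}_{p,q}(r-1),
\end{aligned}
\]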

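as required.

Now, we show that at the end of block r, the smallest value that $n_{y_b}$ obtains is

\[ r y_b - \left( r \left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{r(q \bmod p)}{p} \right\rfloor \right) x_b + 1, \]

the smallest value that $n_{y_a}$ obtains is $r y_a - \left( r \lfloor q/p \rfloor + \lfloor r(q \bmod p)/p \rfloor \right) x_a - 1$, the smallest value that $n_{x_b}$ obtains is $\left( r \lfloor q/p \rfloor + \lfloor r(q \bmod p)/p \rfloor + 1 \right) x_b - r y_b - 1$, and the smallest value that $n_{x_a}$ obtains is $\left( r \lfloor q/p \rfloor + \lfloor r(q \bmod p)/p \rfloor + 1 \right) x_a - r y_a + 1$.

Firstly, we rewrite all four of those expressions. Notice that

\[
\begin{aligned}
\sum_{i=0}^{r-1} m_i &= \sum_{i=0}^{r-1} \left( \left\lfloor \frac{q}{p} \right\rfloor + \mathrm{sq}_{p,q}(i) \right) \\
&= r \left\lfloor \frac{q}{p} \right\rfloor + \sum_{i=0}^{r-1} \mathrm{sq}_{p,q}(i) \\
&= r \left\lfloor \frac{q}{p} \right\rfloor + \sum_{i=0}^{r-1} \left( \left\lfloor \frac{(i+1)(q \bmod p)}{p} \right\rfloor - \left\lfloor \frac{i(q \bmod p)}{p} \right\rfloor - c_i \right) \\
&= r \left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{r(q \bmod p)}{p} \right\rfloor - (r-1).
\end{aligned}
\]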

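Applying this to $n_{y_b}$, we begin with $r y_b - \left( r \lfloor q/p \rfloor + \lfloor r(q \bmod p)/p \rfloor \right) x_b + 1$. We then replace the contents of the parentheses with $(r-1) + \sum_{i=0}^{r-1} m_i$, yielding

\[ r y_b - \left( (r-1) + \sum_{i=0}^{r-1} m_i \right) x_b + 1. \tag{2} \]

Similarly, the $n_{y_a}$, $n_{x_b}$, and $n_{x_a}$ expressions become, respectively,

\[ r y_a - \left( (r-1) + \sum_{i=0}^{r-1} m_i \right) x_a - 1, \tag{3} \]

\[ \left( r + \sum_{i=0}^{r-1} m_i \right) x_b - r y_b - 1, \tag{4} \]

\[ \left( r + \sum_{i=0}^{r-1} m_i \right) x_a - r y_a + 1. \tag{5} \]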

Secondly, we use induction on r to show that these four expressions are correct.

First, let r = 1. There is a total of $m_0$ blocks of length p subtracted from $n_{y_b}$ and $n_{y_a}$. The first block, though, was the exceptional one with the hole. Hence, $n_{y_b}$ actually has one less subtracted from it and $n_{y_a}$ one more. Thus, $n_{y_b}$ takes its minimum at $y_b - m_0 x_b + 1$ and $n_{y_a}$ takes its minimum at $y_a - m_0 x_a - 1$. Both of these match the required expressions. Next, $n_{x_a}$ has this minimal $n_{y_a}$ subtracted from it and $n_{x_b}$ has this minimal $n_{y_b}$ subtracted from it. This yields minimal values of $(1 + m_0) x_a - y_a + 1$ for $n_{x_a}$ and $(1 + m_0) x_b - y_b - 1$ for $n_{x_b}$, both matching the required expressions.

Now, assume that for $r \le s$, $n_{y_b}$ takes its minimum at (2), $n_{y_a}$ takes its minimum at (3), $n_{x_b}$ at (4), and $n_{x_a}$ at (5). Let $r = s + 1$. Over the course of this q-block, $n_{y_b}$ begins by having the previous minimum of $n_{x_b}$ subtracted from it, and $n_{y_a}$ begins by having the previous minimum of $n_{x_a}$ subtracted from it. This yields

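\[ y_b - \left( \left( s + \sum_{i=0}^{s-1} m_i \right) x_b - s y_b - 1 \right) = (s+1) y_b - \left( s + \sum_{i=0}^{s-1} m_i \right) x_b + 1 \]

for $n_{y_b}$ and

\[ y_a - \left( \left( s + \sum_{i=0}^{s-1} m_i \right) x_a - s y_a + 1 \right) = (s+1) y_a - \left( s + \sum_{i=0}^{s-1} m_i \right) x_a - 1 \]

for $n_{y_a}$. Then, $m_s x_a$ is subtracted from $n_{y_a}$ over the course of the next $m_s$ blocks of length p. Similarly, $m_s x_b$ is subtracted from $n_{y_b}$. This gives

\[ (s+1) y_a - \left( s + \sum_{i=0}^{s} m_i \right) x_a - 1 \]

for $n_{y_a}$ and

\[ (s+1) y_b - \left( s + \sum_{i=0}^{s} m_i \right) x_b + 1 \]

for $n_{y_b}$. These are both the required expressions.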

Next, $n_{x_a}$ becomes $x_a$ minus this minimum for $n_{y_a}$ and $n_{x_b}$ becomes $x_b$ minus this minimum for $n_{y_b}$. This gives, for $r = s + 1$, minimal values of $n_{x_a}$ and $n_{x_b}$ of

\[ \left( (s+1) + \sum_{i=0}^{s} m_i \right) x_a - (s+1) y_a + 1 \]

and

\[ \left( (s+1) + \sum_{i=0}^{s} m_i \right) x_b - (s+1) y_b - 1, \]

respectively, as required.

If the hole is not in the first position, the values with a +1 can only possibly be one smaller, except at the end of the word (where the hole must have been encountered). The values with a −1 cannot be smaller at all. Hence, we drop the +1 portions for the remainder of the proof.

Now that we have shown that these bounds are correct, all that remains is to show that they are nonnegative. Keep in mind that all four bounds are integers, so showing that they are strictly greater than −1 suffices to show that they are greater than or equal to 0. We work through the bound for $n_{y_b}$. The proof uses Lemma 3 as an early step.

For $n_{y_b}$, we have

\[
\begin{aligned}
n_{y_b} &\ge r y_b - \left( r \left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{r(q \bmod p)}{p} \right\rfloor \right) x_b \\
&= r y_b - \left\lfloor \frac{rq}{p} \right\rfloor x_b \\
&= r y_b - \left\lfloor \frac{rq}{p} \right\rfloor \left( \frac{1 + p y_b}{q} \right) \\
&\ge r y_b - \frac{rq}{p} \left( \frac{1 + p y_b}{q} \right) \\
&= r y_b - \frac{r}{p} - r y_b \\
&= -\frac{r}{p} \\
&\ge -1.
\end{aligned}
\]

Equality with −1 could only possibly occur at the end of the word, though, and by then the hole would have been inserted, restoring the +1 and forcing −1 not to occur. Therefore, $n_{y_b} \ge 0$, as required.

Similarly, we have

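\[
\begin{aligned}
n_{y_a} &\ge r y_a - \left( r \left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{r(q \bmod p)}{p} \right\rfloor \right) x_a - 1 > -1, \\
n_{x_b} &\ge \left( r \left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{r(q \bmod p)}{p} \right\rfloor + 1 \right) x_b - r y_b - 1 > -1, \\
n_{x_a} &\ge \left( r \left\lfloor \frac{q}{p} \right\rfloor + \left\lfloor \frac{r(q \bmod p)}{p} \right\rfloor + 1 \right) x_a - r y_a \ge -1.
\end{aligned}
\]

In the latter inequality, equality with −1 could only possibly occur at the end of the word, though, and by then the hole would have been inserted, restoring the +1 and forcing −1 not to occur. Therefore, $n_{x_a} \ge 0$, as required.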

Hence, all four of these variables never go negative over the course of the algorithm's execution. Therefore, the algorithm is correct, thus demonstrating the existence of a partial word with exactly one hole with abelian periods p and q but without period 1.
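The properties claimed in Theorem 3 are easy to test on a concrete word. The following Python sketch is a hypothetical helper, not part of the paper. It assumes the exact-block notion of abelian period used in this section (the length is a multiple of p and every length-p block must admit a completion with one common Parikh vector), and it writes the hole as `?`. Under that assumption a common Parikh vector exists exactly when the per-letter maxima over all blocks sum to at most p, since the holes can absorb any remaining deficit.

```python
from collections import Counter

HOLE = "?"  # assumption: ASCII stand-in for the hole symbol


def has_abelian_period(w, p, hole=HOLE):
    """True if the length-p blocks of w share a completion Parikh vector.

    Each block needs at most `need[c]` copies of letter c; a common Parikh
    vector of total p exists iff these maxima sum to at most p.
    """
    if p <= 0 or len(w) % p != 0:
        return False
    need = Counter()
    for i in range(0, len(w), p):
        counts = Counter(c for c in w[i:i + p] if c != hole)
        for c, n in counts.items():
            need[c] = max(need[c], n)
    return sum(need.values()) <= p


w = "abbabba?bbababb"  # the word of Example 2
print(has_abelian_period(w, 3), has_abelian_period(w, 5), has_abelian_period(w, 1))
```

For the word of Example 2 this reports abelian periods 3 and 5 and rejects period 1.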

The following corollary ties the previous theorem to the previous section, giving a strong result about pairs of abelian periods.

Corollary 3. Let p, q, and n be positive integers. There exists a partial word w of length n with at most one hole that is abelian p-periodic and abelian q-periodic but not 1-periodic.

Proof. Without loss of generality, let n = lcm (p, q) (if n < lcm (p, q), truncating the construction for that length at n characters gives a word with both abelian periods, and if n > lcm (p, q), repeating that word and then a truncation of it to make a length n word also preserves both abelian periods). If p and q are relatively prime, we can apply Theorem 3 to yield a word w with the desired properties. If p and q are not relatively prime, we note that Theorem 1 with k = 2 gives a count that is larger than 2. Hence, there exists at least one full word (which is a partial word with no holes) of length n that is abelian p-periodic and abelian q-periodic but not 1-periodic, as required.

Now, we use Theorem 1 and Theorem 3 to state and prove a bound on the number of abelian-nonprimitive partial words with exactly one hole.

Theorem 4. Let $n \ge 2$ be a positive integer whose set of prime factors is $P = \{p_i\}$. Let $Q = \{n/p_i\} = \{q_i\}$. Give Q the standard total ordering, and let the indices range from 0 to $\|Q\| - 1$. There are at most

\[ n \sum_{\substack{R \subseteq Q \\ R \ne \emptyset}} (-1)^{\|R\|+1} \sum_{x_1 + x_2 + \cdots + x_k = \gcd(R)} \; \prod_{g \in \mathrm{GAP}(R)} \binom{g}{\frac{g x_1}{\gcd(R)} \; \frac{g x_2}{\gcd(R)} \; \cdots \; \frac{g x_k}{\gcd(R)}}^{\frac{n}{\operatorname{lcm}(R)}} \]

abelian-nonprimitive partial words of length n with exactly one hole over an alphabet of size k, with equality holding if and only if $n = p^m$ for some prime number p and positive integer m.

Proof. The number of abelian-nonprimitive full words of length n is given by Theorem 1. Multiplying that result by n gives the total number of ways to replace exactly one letter in these words by a hole. Each word obtained in this way has exactly one hole and is abelian-nonprimitive of length n. We show that all words obtained in this way are distinct if and only if n is a power of some prime p, thereby showing that this is an upper bound on the total number of abelian-nonprimitive partial words of length n with exactly one hole.

For the forward direction, assume that n has at least two distinct prime factors. Let $p^r$ divide n for some prime p and some positive integer r such that $p^{r+1}$ does not divide n. Let $q = n/p^r$. Since $p^r$ and q are relatively prime, we can apply Theorem 3 to find a partial word with one hole of length n with abelian periods $p^r$ and q and without period 1. Algorithm 1 constructs this word in such a way that it can be obtained by replacing an a with a hole in some abelian-nonprimitive full word and by replacing a b with a hole in some other abelian-nonprimitive full word. Hence, the bound above counts this partial word twice, so it is a strict upper bound.

For the backward direction, let $n = p^m$ for some prime p and positive integer m. If the word is abelian-nonprimitive, it has an abelian period $p^{m-1}$. It cannot have any independent abelian period, so filling the hole of any abelian-nonprimitive word of length n with exactly one hole can be done in only one way to give a full abelian-nonprimitive word of length n. Hence, every word obtained from the latter by replacing a letter with a hole is distinct.

The following corollary restates the prime-power case with a simpler formula.

Corollary 4. Let p be a prime number, and let $m \ge 1$ be an integer. There are

\[ p^m \sum_{x_1 + x_2 + \cdots + x_k = p^{m-1}} \binom{p^{m-1}}{x_1 \; x_2 \; \cdots \; x_k}^{p} \]

abelian-nonprimitive partial words of length $p^m$ with exactly one hole over an alphabet of size k.

Proof. By Theorem 4, the bound is exact when $n = p^m$. Here, we have taken that bound and have applied Corollary 1.
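For small parameters the count in Corollary 4 can be compared against exhaustive enumeration. The Python sketch below is illustrative only. It reads the sum as ranging over compositions of $p^{m-1}$ into k nonnegative parts (the value forced by the multinomial coefficient), treats a one-hole word as abelian-nonprimitive when some proper divisor d of its length is an abelian period in the block sense of this section, and writes the hole as `?`. All function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import factorial


def compositions(total, parts):
    """All tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial(xs):
    out = factorial(sum(xs))
    for x in xs:
        out //= factorial(x)
    return out


def corollary4_count(p, m, k):
    """Closed form of Corollary 4 for length p**m and alphabet size k."""
    block = p ** (m - 1)
    return p ** m * sum(multinomial(xs) ** p for xs in compositions(block, k))


def brute_force_count(n, k, hole="?"):
    """Count one-hole words of length n over k letters with a proper abelian period."""
    letters = "abcdefghijklmnopqrstuvwxyz"[:k]

    def nonprimitive(w):
        for d in range(1, n):
            if n % d:
                continue
            need = Counter()
            for i in range(0, n, d):
                for c, cnt in Counter(w[i:i + d]).items():
                    if c != hole:
                        need[c] = max(need[c], cnt)
            if sum(need.values()) <= d:   # blocks share a completion
                return True
        return False

    total = 0
    for pos in range(n):
        for fill in product(letters, repeat=n - 1):
            w = "".join(fill[:pos]) + hole + "".join(fill[pos:])
            total += nonprimitive(w)
    return total


print(corollary4_count(2, 2, 2), brute_force_count(4, 2))  # 24 24
```

For length 4 over a binary alphabet both computations give 24, in line with the corollary.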

3.2 Number of abelian-nonprimitive partial words with one hole and given abelian periods

We give modifications of Algorithm 1 to count the number of abelian-nonprimitive partial words with exactly one hole and having relatively prime abelian periods p and q. Algorithm 2 counts words with the hole in a specific position; Algorithm 3 counts all one-hole words by summing over all hole positions.

Theorem 5.

• Given as input two relatively prime integers p and q greater than one, an integer k greater than one, and a nonnegative integer h such that 0 ≤ h < pq, Algorithm 2 outputs the number of partial words of length pq with exactly one hole at position h having abelian periods p and q but not period 1. It runs in $O((p + q) f(p))$ time (much smaller than the number of such words), where $f(n)$ is the maximum time it takes to compute $\binom{n}{r}$ for some r.

• Given as input two relatively prime integers p and q greater than one and an integer k greater than one, Algorithm 3 outputs the number of partial words of length pq with exactly one hole having abelian periods p and q but not period 1. It runs in $O((p + q)^2 f(p))$ time (much smaller than the number of such words), where $f(n)$ is the maximum time it takes to compute $\binom{n}{r}$ for some r.

Algorithm 2 COUNT ABPQ ONE HOLE (p, q, k, h)
Require: p and q relatively prime positive integers
Require: k ≥ 2 integer (alphabet size)
Require: h nonnegative integer (hole position), 0 ≤ h < pq
  if q < p then
    swap(p, q)
  $y_b \leftarrow q - (p^{-1} \bmod q)$ and $y_a \leftarrow q - y_b$
  $x_b \leftarrow (1 + p y_b)/q$ and $x_a \leftarrow p - x_b$
  $n_{x_a} \leftarrow x_a$   // Number of additional a's needed in current p-block
  $n_{x_b} \leftarrow x_b$   // Number of additional b's needed in current p-block
  $n_{y_a} \leftarrow y_a$   // Number of additional a's needed in current q-block
  $n_{y_b} \leftarrow y_b$   // Number of additional b's needed in current q-block
  per ← PER {p, q}
  total ← 1
  holed ← False
  last ← Nil
  for item in per do
    if item = 0 then
      Do Nothing
    else if item mod p = 0 then
      if ¬holed ∧ item > h then
        holed ← True
        $total \leftarrow total \times \binom{item - last - 1}{n_{x_a}}$
        $n_{y_a} \leftarrow n_{y_a} - (n_{x_a} + 1)$ and $n_{y_b} \leftarrow n_{y_b} - (n_{x_b} - 1)$
      else
        $total \leftarrow total \times \binom{item - last}{n_{x_a}}$
        $n_{y_a} \leftarrow n_{y_a} - n_{x_a}$ and $n_{y_b} \leftarrow n_{y_b} - n_{x_b}$
      $n_{x_a} \leftarrow x_a$ and $n_{x_b} \leftarrow x_b$
    else   // item mod q = 0
      if ¬holed ∧ item > h then
        holed ← True
        $total \leftarrow total \times \binom{item - last - 1}{n_{y_a} - 1}$
        $n_{x_b} \leftarrow n_{x_b} - (n_{y_b} + 1)$ and $n_{x_a} \leftarrow n_{x_a} - (n_{y_a} - 1)$
      else
        $total \leftarrow total \times \binom{item - last}{n_{y_a}}$
        $n_{x_a} \leftarrow n_{x_a} - n_{y_a}$ and $n_{x_b} \leftarrow n_{x_b} - n_{y_b}$
      $n_{y_a} \leftarrow y_a$ and $n_{y_b} \leftarrow y_b$
    last ← item
  return total × k × (k − 1)

Algorithm 3 COUNT ABPQ ONE HOLE ALL (p, q, k)
Require: p and q relatively prime positive integers
Require: k ≥ 2 integer (alphabet size)
  total ← 0
  per ← PER {p, q}
  first ← True
  last ← Nil
  for item in per do
    if first then
      first ← False
    else
      total ← total + (item − last) COUNT ABPQ ONE HOLE (p, q, k, last)
    last ← item
  return total

Proof. First, note that each of the four major code blocks inside the for loop keeps track of the number of a's and b's that would be added in the corresponding blocks in Algorithm 1. Then, the total is multiplied by all possible ways that those characters can be arranged, as each block is independent. No word is excluded from the count, because any word with the desired abelian periods and hole position can have its letters in each GAP {p, q}-length block permuted to match a word generated by Algorithm 1. Hence, Algorithm 2 does count the required quantity. In terms of its time bound, notice that item − last is always no greater than p, as PER {p, q} contains every multiple of p. Also, it contains no more than p + q elements. Hence, the running time of Algorithm 2 is $O((p + q) f(p))$.

Algorithm 3 attempts to put the hole in each block, of which there are ‖PER {p, q}‖ = O (p + q). Hence, the total running time is $O((p + q)^2 f(p))$, as required.
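To make the counting procedure concrete, here is a minimal Python sketch of Algorithms 2 and 3, offered as an illustration and not as the authors' code. It assumes PER {p, q} is the sorted set of multiples of p or q in [0, pq] (as listed in Example 2) and uses `math.comb` for the binomial coefficients; the helper name `per` is hypothetical.

```python
from math import comb, gcd


def per(p, q):
    """Sorted multiples of p or q in [0, lcm(p, q)] (assumed meaning of PER{p, q})."""
    n = p * q // gcd(p, q)
    return sorted(set(range(0, n + 1, p)) | set(range(0, n + 1, q)))


def count_abpq_one_hole(p, q, k, h):
    """Sketch of Algorithm 2: count words with the hole fixed at position h."""
    if q < p:
        p, q = q, p
    yb = q - pow(p, -1, q)          # modular inverse; needs Python 3.8+
    ya = q - yb
    xb = (1 + p * yb) // q
    xa = p - xb
    nxa, nxb, nya, nyb = xa, xb, ya, yb
    total, holed, last = 1, False, 0
    for item in per(p, q):
        if item == 0:
            continue
        if item % p == 0:
            if not holed and item > h:
                holed = True
                total *= comb(item - last - 1, nxa)
                nya -= nxa + 1
                nyb -= nxb - 1
            else:
                total *= comb(item - last, nxa)
                nya -= nxa
                nyb -= nxb
            nxa, nxb = xa, xb
        else:
            if not holed and item > h:
                holed = True
                total *= comb(item - last - 1, nya - 1)
                nxb -= nyb + 1
                nxa -= nya - 1
            else:
                total *= comb(item - last, nya)
                nxa -= nya
                nxb -= nyb
            nya, nyb = ya, yb
        last = item
    return total * k * (k - 1)


def count_abpq_one_hole_all(p, q, k):
    """Sketch of Algorithm 3: one call per block, weighted by the block length."""
    marks = per(p, q)
    return sum((item - last) * count_abpq_one_hole(p, q, k, last)
               for last, item in zip(marks, marks[1:]))


print(count_abpq_one_hole(3, 5, 2, 7))  # 144
```

For p = 3, q = 5, k = 2 and h = 7, the block-by-block product is 3 · 2 · 1 · 2 · 1 · 2 · 3 = 72, and the factor k(k − 1) = 2 accounts for exchanging the two letters. Algorithm 3 relies on the count being the same for every hole position inside one block, which holds in this sketch because h is used only in the comparison `item > h`.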

4 Abelian-Nonprimitive Partial Words with an Arbitrary Number of Holes

Attempting to generalize the results from the previous section to words with two or more holes causes some problems. Many of these issues stem from the fact that once two holes appear in a word, they cannot necessarily occupy any two positions in the word; placing one constrains the positions the other can take. In this section, we begin by dropping two constraints from the previous section: that our abelian periods are relatively prime and that we have only two abelian periods. From here, we investigate a few general results, and then we reintroduce the relatively prime condition and prove a result.

4.1 A lower bound for the minimum hole count

First, we prove a lower bound for the number of holes in a word with specific Parikh vector abelian periods.

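Theorem 6. Let k be the alphabet size, and let $\vec{p}_0$ through $\vec{p}_{m-1}$ be Parikh vectors. Let w be a partial word of length $\operatorname{lcm}\{|\vec{p}_i|\}$ with abelian periods given by these Parikh vectors. Then, w contains at least

\[ \mu(\{\vec{p}_i\}) = \sum_{\ell=1}^{k} \max_{0 \le j < m} \left\{ \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_{m-1}|} \|\vec{p}_{m-1}\|_\ell - \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_j|} \|\vec{p}_j\|_\ell \right\} \]

holes.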


Proof. We begin by constructing a system of linear equations that describes this situation. Consider the $\vec{p}_i$'s in increasing order (though exclude $\vec{p}_{m-1}$). For each letter $\ell$ in the alphabet, set

\[ \|\vec{p}_j\|_\ell \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_j|} - \|\vec{p}_{m-1}\|_\ell \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_{m-1}|} = h_{j,\ell} \]

for some integer $h_{j,\ell}$. This consists of a total of $k(m-1)$ equations. We know that $\sum_{\ell=1}^{k} \|\vec{p}_{m-1}\|_\ell = |\vec{p}_{m-1}|$, and we can also include dummy equations of the form $\|\vec{p}_{m-1}\|_\ell = \|\vec{p}_{m-1}\|_\ell$ for $2 \le \ell \le k$.

Take $\vec{v}$ to be the vector with entries $\|\vec{p}_j\|_\ell$ ordered such that $\|\vec{p}_{j_1}\|_{\ell_1}$ precedes $\|\vec{p}_{j_2}\|_{\ell_2}$ if $j_1 < j_2$ or if $j_1 = j_2$ and $\ell_1 < \ell_2$. We can compute the vector of $h_{j,\ell}$ values as the first $k(m-1)$ entries of $M(\{|\vec{p}_i|\}, k)\,\vec{v}$. A positive value of $h_{j,\ell}$ means that we must have that many holes corresponding to letter $\ell$ in abelian period $\vec{p}_j$ and to some other letter in abelian period $\vec{p}_{m-1}$. A negative value of $h_{j,\ell}$ means that we must have that many holes corresponding to letter $\ell$ in abelian period $\vec{p}_{m-1}$ and to some other letter in abelian period $\vec{p}_j$. Also, we define $h_{m-1,\ell} = 0$ for all values of $\ell$. The count

\[ \max_{0 \le j < m} \left\{ \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_{m-1}|} \|\vec{p}_{m-1}\|_\ell - \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_j|} \|\vec{p}_j\|_\ell \right\} \tag{6} \]

of holes for a given $\ell$ value comes from having all holes coming from a specific $\ell$ value for negative values of $h_{j,\ell}$. We can have no fewer than $\max_{0 \le j < m} \{-h_{j,\ell}\}$ holes for a given $\ell$ value.

We now show that we cannot have fewer holes than would result from summing over the possible $\ell$ values. Let w be a partial word with abelian periods given by $\{\vec{p}_i\}$, and let $\hat{w}_i$ be an arbitrary (but specific) completion of w preserving the abelian $\vec{p}_i$ period. For each letter $\ell$, $\hat{w}_{m-1}$ contains at least the number given in (6) of positions that have letter $\ell$ but that do not have letter $\ell$ in all of the other $\hat{w}_i$'s. We can sum over this bound to obtain a lower bound for the total number of holes in w.

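Example 3. To illustrate the proof of Theorem 6, let us return to the example word w from (1). Note that it has abelian periods $\vec{p}_0 = (1, 2)$ of length $|\vec{p}_0| = 3$, $\vec{p}_1 = (1, 3)$ of length $|\vec{p}_1| = 4$, and $\vec{p}_2 = (1, 4)$ of length $|\vec{p}_2| = 5$. The completions $\hat{w}_0$, $\hat{w}_1$ and $\hat{w}_2$ of w, having length $\operatorname{lcm}\{|\vec{p}_i|\} = 60$, that preserve, respectively, the abelian periods $\vec{p}_0$, $\vec{p}_1$ and $\vec{p}_2$ are (the underlined letters correspond to holes in w):

\[
\begin{aligned}
&abbbbabb\,\underline{a}\,babbb\,\underline{a}\,abbb\,\underline{a}\,babbbabb\,\underline{a}\,babb\,\underline{a}\,bbabbbabbb\,\underline{a}\,abbbbabb\,\underline{a}\,babbb\,\underline{a}, \\
&abbbbabb\,\underline{b}\,babbb\,\underline{b}\,abbb\,\underline{a}\,babbbabb\,\underline{b}\,babb\,\underline{a}\,bbabbbabbb\,\underline{b}\,abbbbabb\,\underline{b}\,babbb\,\underline{a}, \\
&abbbbabb\,\underline{b}\,babbb\,\underline{b}\,abbb\,\underline{b}\,babbbabb\,\underline{b}\,babb\,\underline{b}\,bbabbbabbb\,\underline{b}\,abbbbabb\,\underline{b}\,babbb\,\underline{b}.
\end{aligned}
\]

Here,

\[ h_{0,1} = \|\vec{p}_0\|_1 \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_0|} - \|\vec{p}_2\|_1 \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_2|} = 20 - 12 = 8. \]

Similarly, $h_{1,1} = 3$ and $h_{2,1} = 0$. So the maximum in (6) is $\max\{-8, -3, 0\} = 0$. We can also compute $h_{0,2} = -8$, $h_{1,2} = -3$, and $h_{2,2} = 0$, giving a maximum of the $-h_{j,2}$'s as 8. Therefore, $\mu(\{\vec{p}_i\}) = 0 + 8 = 8$, implying that the minimum number of holes is achieved by w.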
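The bound of Theorem 6 is a direct computation on the Parikh vectors. The short Python sketch below evaluates it; the function names `mu` and `lcm_all` are hypothetical, and Parikh vectors are represented as tuples of letter counts whose last element in the list plays the role of $\vec{p}_{m-1}$.

```python
from functools import reduce
from math import gcd


def lcm_all(ns):
    return reduce(lambda a, b: a * b // gcd(a, b), ns, 1)


def mu(vectors):
    """Lower bound of Theorem 6; the last vector is the reference period."""
    lengths = [sum(v) for v in vectors]
    big = lcm_all(lengths)
    ref, ref_len = vectors[-1], lengths[-1]
    total = 0
    for letter in range(len(ref)):
        # the term with j = m - 1 is 0, so each maximum is nonnegative
        total += max(big // ref_len * ref[letter] - big // n * v[letter]
                     for v, n in zip(vectors, lengths))
    return total


print(mu([(1, 2), (1, 3), (1, 4)]))  # 8
```

On the vectors of Example 3 the two per-letter maxima are 0 and 8, giving the bound 8.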

4.2 Main result for partial words with arbitrarily many holes

We call a Parikh vector nontrivial if it contains at least two nonzero entries. In the next result, the notation ⟨a, b⟩ denotes

\[ \{ c \in \mathbb{Z} \mid c = \alpha a + \beta b,\ \alpha, \beta \ge 0 \} \setminus \{0\}. \]


Theorem 7. Let $\{p_i\} = \{p_0, \ldots, p_{m-1}\}$ be a set of at least two positive integers. Assume that, for some pair of integers $0 \le q, r < m$, we have an integer $\nu = g \frac{\operatorname{lcm}\{p_i\}}{\operatorname{lcm}(p_q, p_r)}$ for some integer g. If $g \notin \langle p_q, p_r \rangle$, $\gcd(p_q, p_r) \mid g$, and for all integers $0 \le s < m$ with $s \ne q$ and $s \ne r$, $g p_s \in \langle p_q, p_r \rangle$, then there is a choice of nontrivial Parikh vectors $\{\vec{p}_i\}$ where $|\vec{p}_i| = p_i$ such that $\mu(\{\vec{p}_i\}) = \nu$. The converse holds if the $p_i$'s are pairwise relatively prime and the alphabet is binary.

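Proof. We can assume without loss of generality that the alphabet size k is two. If $k \ge 3$, we can choose a subset of two letters and use only those (which cannot increase the minimum number of holes). Assume there is a choice of Parikh vectors $\{\vec{p}_i\}$ where $|\vec{p}_i| = p_i$ such that $\mu(\{\vec{p}_i\}) = \nu$. We have

\[ \nu = \max_{0 \le j < m} \left\{ \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} \|\vec{p}_{m-1}\|_1 - \frac{\operatorname{lcm}\{p_i\}}{p_j} \|\vec{p}_j\|_1 \right\} + \max_{0 \le j < m} \left\{ \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} \|\vec{p}_{m-1}\|_2 - \frac{\operatorname{lcm}\{p_i\}}{p_j} \|\vec{p}_j\|_2 \right\}. \]

Let $0 \le q < m$ be the maximum j for the first part; let $0 \le r < m$ be the maximum j for the second part. Also, let $a_i = \|\vec{p}_i\|_1$. We have that this equals

\[
\begin{aligned}
&\frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} a_{m-1} - \frac{\operatorname{lcm}\{p_i\}}{p_q} a_q + \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} (p_{m-1} - a_{m-1}) - \frac{\operatorname{lcm}\{p_i\}}{p_r} (p_r - a_r) \\
&= \frac{\operatorname{lcm}\{p_i\}}{\operatorname{lcm}(p_q, p_r)} \left( a_{m-1} \frac{\operatorname{lcm}(p_q, p_r)}{p_{m-1}} - a_q \frac{\operatorname{lcm}(p_q, p_r)}{p_q} + (p_{m-1} - a_{m-1}) \frac{\operatorname{lcm}(p_q, p_r)}{p_{m-1}} - (p_r - a_r) \frac{\operatorname{lcm}(p_q, p_r)}{p_r} \right) \\
&= \frac{\operatorname{lcm}\{p_i\}}{\operatorname{lcm}(p_q, p_r)} \left( a_r \frac{\operatorname{lcm}(p_q, p_r)}{p_r} - a_q \frac{\operatorname{lcm}(p_q, p_r)}{p_q} \right) \\
&= \frac{\operatorname{lcm}\{p_i\}}{\operatorname{lcm}(p_q, p_r)} \left( a_r \frac{p_q}{\gcd(p_q, p_r)} - a_q \frac{p_r}{\gcd(p_q, p_r)} \right) = \nu.
\end{aligned}
\]

We refer to this as Expression 1. Notice that $g = a_r \frac{p_q}{\gcd(p_q, p_r)} - a_q \frac{p_r}{\gcd(p_q, p_r)}$.

First, we prove the direction that holds in all cases. Assume that there is some pair of integers $0 \le q, r < m$ such that there exists $g \notin \langle p_q, p_r \rangle$ such that $\gcd(p_q, p_r) \mid g$, and for all integers $0 \le s < m$ with $s \ne q$ and $s \ne r$, $g p_s \in \langle p_q, p_r \rangle$. Let $\nu = g \frac{\operatorname{lcm}\{p_i\}}{\operatorname{lcm}(p_q, p_r)}$. Without loss of generality, we assume $p_q < p_r$ (as we are only interested in the existence of a solution). We start by finding values of $a_q$ and $a_r$ such that $g = a_r \frac{p_q}{\gcd(p_q, p_r)} - a_q \frac{p_r}{\gcd(p_q, p_r)}$. We then show that for all integers $0 \le s < m$ with $s \ne q$ and $s \ne r$, we can find $a_s$ such that $\nu = \mu(\{\vec{p}_i\})$. We repeatedly refer to Lemma 1 in [18] (modified slightly to fit our definitions): Let $x \in \mathbb{Z}$. Then $x \notin \langle n_1, n_2 \rangle$ if and only if $\gcd(n_1, n_2) \nmid x$ or $x = \frac{n_1 n_2}{\gcd(n_1, n_2)} - \alpha n_1 - \beta n_2$ for some $\alpha, \beta \in \mathbb{N} \setminus \{0\}$.

Since $g \notin \langle p_q, p_r \rangle$ and $\gcd(p_q, p_r) \mid g$, we can write $g = \frac{p_q p_r}{\gcd(p_q, p_r)} - \alpha p_q - \beta p_r$ for some $\alpha, \beta \in \mathbb{N} \setminus \{0\}$. Let $\gcd(p_q, p_r) \mid a_q$ and $\gcd(p_q, p_r) \mid a_r$, and let $\alpha = \frac{p_r - a_r}{\gcd(p_q, p_r)}$ and $\beta = \frac{a_q}{\gcd(p_q, p_r)}$. These are both positive integers. Then, we obtain that

\[
\begin{aligned}
g &= \frac{p_q p_r}{\gcd(p_q, p_r)} - \alpha p_q - \beta p_r \\
&= \frac{p_q p_r}{\gcd(p_q, p_r)} - \frac{p_r - a_r}{\gcd(p_q, p_r)} p_q - \frac{a_q}{\gcd(p_q, p_r)} p_r \\
&= \frac{p_q a_r}{\gcd(p_q, p_r)} - \frac{p_r a_q}{\gcd(p_q, p_r)},
\end{aligned}
\]

as needed.

We show that for all relevant s we can find $a_s$ such that $\nu = \mu(\{\vec{p}_i\})$. More specifically, we show that if we can find $a_s$ such that $\frac{p_s}{p_q} a_q \le a_s \le \frac{p_s}{p_r} a_r$, then such $a_s$ satisfy the conditions where q and r are maximal. Then we show that we can find such $a_s$.

We go through the full argument for q; the argument for r is similar, but with the inequalities reversed. Assume that $\frac{p_s}{p_q} a_q \le a_s \le \frac{p_s}{p_r} a_r$. We have $\frac{p_s}{p_q} a_q \le a_s$ or $p_s a_q \le a_s p_q$, which implies $a_q \frac{p_s}{\gcd(p_q, p_s)} \le a_s \frac{p_q}{\gcd(p_q, p_s)}$. Then $a_q \frac{\operatorname{lcm}(p_q, p_s)}{p_q} \le a_s \frac{\operatorname{lcm}(p_q, p_s)}{p_s}$, so $\frac{\operatorname{lcm}\{p_i\}}{p_q} a_q \le \frac{\operatorname{lcm}\{p_i\}}{p_s} a_s$ or $-\frac{\operatorname{lcm}\{p_i\}}{p_q} a_q \ge -\frac{\operatorname{lcm}\{p_i\}}{p_s} a_s$. We deduce $\frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} a_{m-1} - \frac{\operatorname{lcm}\{p_i\}}{p_q} a_q \ge \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} a_{m-1} - \frac{\operatorname{lcm}\{p_i\}}{p_s} a_s$, so that q is maximal.

We find such $a_s$. The difference between the terms $\frac{p_s}{p_r} a_r$ and $\frac{p_s}{p_q} a_q$ is

\[ \frac{p_s}{p_r} a_r - \frac{p_s}{p_q} a_q = \frac{p_s (a_r p_q - a_q p_r)}{p_q p_r} = \frac{p_s\, g \gcd(p_q, p_r)}{p_q p_r}. \]

We show that $\frac{p_s\, g \gcd(p_q, p_r)}{p_q p_r} - \left( \frac{p_s}{p_r} a_r - \left\lfloor \frac{p_s}{p_r} a_r \right\rfloor \right) \ge 0$, indicating that there is an integer between the required values.

Note that $\frac{p_s}{p_r} a_r - \left\lfloor \frac{p_s}{p_r} a_r \right\rfloor = \frac{p_s a_r \bmod p_r}{p_r}$. Also, since $g p_s \in \langle p_q, p_r \rangle$, we can write $g p_s = \gamma p_q + \delta p_r$ for some positive integers $\gamma$ and $\delta$. Hence, $\frac{p_s\, g \gcd(p_q, p_r)}{p_q p_r} - \left( \frac{p_s}{p_r} a_r - \left\lfloor \frac{p_s}{p_r} a_r \right\rfloor \right) \ge 0$ if and only if $\frac{p_s\, g \gcd(p_q, p_r)}{p_q p_r} - \frac{p_s a_r \bmod p_r}{p_r} \ge 0$. This is equivalent to $p_s\, g \gcd(p_q, p_r) \ge p_q (p_s a_r \bmod p_r)$ or $(\gamma p_q + \delta p_r) \gcd(p_q, p_r) \ge p_q (p_s a_r \bmod p_r)$. In addition,

\[ g p_s = \gamma p_q + \delta p_r = p_s \left( \frac{p_q a_r}{\gcd(p_q, p_r)} - \frac{p_r a_q}{\gcd(p_q, p_r)} \right). \]

This holds if and only if $\gcd(p_q, p_r)(\gamma p_q + \delta p_r) = p_s p_q a_r - p_s p_r a_q$, if and only if $(p_s a_r - \gcd(p_q, p_r)\gamma)\, p_q = (p_s a_q + \gcd(p_q, p_r)\delta)\, p_r$, if and only if $(p_s a_r - \gcd(p_q, p_r)\gamma) \frac{p_q}{\gcd(p_q, p_r)} = (p_s a_q + \gcd(p_q, p_r)\delta) \frac{p_r}{\gcd(p_q, p_r)}$.

Let $\gamma = \frac{p_s a_r \bmod p_r}{\gcd(p_q, p_r)}$. We know that this is an integer, and we also see that

\[ \frac{p_r}{\gcd(p_q, p_r)} \;\Big|\; \left( \frac{p_s a_r}{\gcd(p_q, p_r)} - \gamma \right), \]

which is required for a choice of $\gamma$ since $\gcd\left( \frac{p_q}{\gcd(p_q, p_r)}, \frac{p_r}{\gcd(p_q, p_r)} \right) = 1$. Substituting this into the inequality chain yields an equivalent statement that $\delta p_r \gcd(p_q, p_r) \ge 0$, which is obviously true. Therefore, for each relevant s, we can find $a_s$ with the desired properties.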

Now, we prove the converse in the case of the $p_i$'s being pairwise relatively prime. From Expression 1, $g = a_r p_q - a_q p_r$ since $\gcd(p_q, p_r) = 1$. Also, $\gcd(p_q, p_r) \mid g$. What remains to be shown is that $g \notin \langle p_q, p_r \rangle$ and that for all integers $0 \le s < m$ with $s \ne q$ and $s \ne r$, $g p_s \in \langle p_q, p_r \rangle$.

Firstly, suppose towards a contradiction that $g \in \langle p_q, p_r \rangle$. Then, there exist positive integers $\alpha$ and $\beta$ such that $g = a_r p_q - a_q p_r = \alpha p_q + \beta p_r$. We have, from this, $(a_r - \alpha) p_q = (a_q + \beta) p_r$. Since $p_q$ and $p_r$ are relatively prime, $p_r \mid (a_r - \alpha)$. We know, though, that $a_r < p_r$ and that $\alpha < a_r$, and since $a_r - \alpha > 0$, we have $p_r$ dividing a positive integer less than itself, a contradiction.

Secondly, we show that for all integers $0 \le s < m$ with $s \ne q$ and $s \ne r$, $g p_s$ can be written as $\alpha p_q + \beta p_r$ for positive integers $\alpha$ and $\beta$. This is true if we have $a_r p_q p_s - a_q p_r p_s = \alpha p_q + \beta p_r$. Rewriting yields $(a_r p_s - \alpha) p_q = (a_q p_s + \beta) p_r$. Since $p_q$ and $p_r$ are relatively prime, it must be the case that $p_r \mid (a_r p_s - \alpha)$ and that $p_q \mid (a_q p_s + \beta)$. Let $\gamma = \frac{a_r p_s - \alpha}{p_r}$. We claim that $a_r p_s > p_r$ and that there is a choice of $\alpha$ such that $\gamma \in \mathbb{N}$ and $\gamma p_q > a_q p_s$. We can find $\alpha$ such that $p_r \mid (a_r p_s - \alpha)$. Then, the required value of $\beta = \gamma p_q - a_q p_s$ (derived from the expression $(a_r p_s - \alpha) p_q = (a_q p_s + \beta) p_r$) is positive as well, because $\gamma p_q \ge a_s p_q > a_q p_s$.

We use the maximal nature of our choices of q and r to prove the claim. The inequalities we use are strict because the pairwise relative primeness of the abelian periods implies a different number of forced holes for each one when paired with $p_{m-1}$. We know that for all $s \ne r$,

\[ \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} (p_{m-1} - a_{m-1}) - \frac{\operatorname{lcm}\{p_i\}}{p_r} (p_r - a_r) > \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} (p_{m-1} - a_{m-1}) - \frac{\operatorname{lcm}\{p_i\}}{p_s} (p_s - a_s). \]

This can be rewritten as $\frac{\operatorname{lcm}\{p_i\}}{p_s} (p_s - a_s) > \frac{\operatorname{lcm}\{p_i\}}{p_r} (p_r - a_r)$. Dividing both sides by $\frac{\operatorname{lcm}\{p_i\}}{\operatorname{lcm}(p_r, p_s)}$ yields $\frac{\operatorname{lcm}(p_r, p_s)}{p_s} (p_s - a_s) > \frac{\operatorname{lcm}(p_r, p_s)}{p_r} (p_r - a_r)$, or $a_r p_s > a_s p_r$, because $\operatorname{lcm}(p_r, p_s) = p_r p_s$. Since $a_s > 0$, this implies that $a_r p_s > p_r$, which is the first part of the claim.

We also know that for all $s \ne q$,

\[ \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} a_{m-1} - \frac{\operatorname{lcm}\{p_i\}}{p_q} a_q > \frac{\operatorname{lcm}\{p_i\}}{p_{m-1}} a_{m-1} - \frac{\operatorname{lcm}\{p_i\}}{p_s} a_s. \]

This can be rewritten as $\frac{\operatorname{lcm}\{p_i\}}{p_s} a_s > \frac{\operatorname{lcm}\{p_i\}}{p_q} a_q$. Dividing both sides by $\frac{\operatorname{lcm}\{p_i\}}{\operatorname{lcm}(p_q, p_s)}$ yields $\frac{\operatorname{lcm}(p_q, p_s)}{p_s} a_s > \frac{\operatorname{lcm}(p_q, p_s)}{p_q} a_q$, or $a_s p_q > a_q p_s$, because $\operatorname{lcm}(p_q, p_s) = p_q p_s$.

Let $\alpha = a_r p_s \bmod p_r$. This implies that $0 < \alpha < p_r$. We can then write $a_r p_s = \gamma p_r + \alpha$. In order for $\beta$ to be a positive integer, we need $\gamma p_q > a_q p_s$. To accomplish this, we show that $\gamma \ge a_s$. We begin with the known expression $a_r p_s - a_s p_r > 0$. We can rewrite this as $\gamma p_r + \alpha - a_s p_r > 0$. This yields that $(\gamma - a_s) p_r > -\alpha$. We know, though, that $-\alpha > -p_r$, so, since all variables here are integers, it must be that $(\gamma - a_s) p_r \ge 0$, which implies that $\gamma \ge a_s$, as required.

Example 4. To illustrate Theorem 7, let $\{p_i\} = \{2, 3, 5\}$ (these are pairwise relatively prime):

p_q | p_r | lcm{p_i}/lcm(p_q, p_r) | g | ν  | {p_i} (two choices of Parikh vectors)
2   | 3   | 5                      | 1 | 5  | {(1, 1), (2, 1), (3, 2)} or {(1, 1), (1, 2), (2, 3)}
3   | 5   | 2                      | 4 | 8  | {(1, 1), (2, 1), (2, 3)} or {(1, 1), (1, 2), (3, 2)}
2   | 5   | 3                      | 3 | 9  | {(1, 1), (2, 1), (4, 1)} or {(1, 1), (1, 2), (1, 4)}
3   | 5   | 2                      | 7 | 14 | {(1, 1), (2, 1), (1, 4)} or {(1, 1), (1, 2), (4, 1)}
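The hypothesis of Theorem 7 can be tested by direct search. The Python sketch below is an illustration under stated assumptions: membership in ⟨a, b⟩ is decided by trying every nonnegative coefficient of a, candidates g are searched only in the range 1 to lcm(p_q, p_r) − 1 (an assumption based on the fact that every larger multiple of the gcd lies in ⟨p_q, p_r⟩), and the function names are hypothetical.

```python
from functools import reduce
from math import gcd


def lcm_all(ns):
    return reduce(lambda a, b: a * b // gcd(a, b), ns, 1)


def in_semigroup(c, a, b):
    """True if c = alpha*a + beta*b with alpha, beta >= 0 and c != 0."""
    if c <= 0:
        return False
    return any((c - alpha * a) % b == 0 for alpha in range(c // a + 1))


def achievable_nu(periods):
    """Values nu = g * lcm{p_i} / lcm(p_q, p_r) meeting the hypothesis of Theorem 7."""
    big = lcm_all(periods)
    found = set()
    for i, pq in enumerate(periods):
        for j, pr in enumerate(periods):
            if i >= j:
                continue                      # each unordered pair once
            d = gcd(pq, pr)
            pair_lcm = pq * pr // d
            for g in range(1, pair_lcm):
                if g % d or in_semigroup(g, pq, pr):
                    continue
                others = [ps for s, ps in enumerate(periods) if s not in (i, j)]
                if all(in_semigroup(g * ps, pq, pr) for ps in others):
                    found.add(g * big // pair_lcm)
    return sorted(found)


print(achievable_nu([2, 3, 5]))  # [5, 8, 9, 14]
```

For the periods {2, 3, 5} the search returns exactly the four values of ν listed in the table of Example 4.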


4.3 An algorithm to compute partial words with given abelian periods

We conclude this section by giving an algorithm, Algorithm 4, that, given a set $\{\vec{p}_i\}$ of Parikh vectors, computes the lexicographically least partial word (with ⋄ lexicographically after every letter in the alphabet) of length $\operatorname{lcm}\{|\vec{p}_i|\}$ with all of those Parikh vectors specifying abelian periods.

Given abelian periods {~pi}, start with w = ε, and return w at the end.

1. Keep track of how many of each letter may still be added in the current abelian periods.

2. Repeat lcm {|~pi|} times:

(a) Let $\ell$ be the earliest letter that can still be added to all current abelian periods (or ⋄ if no such letter exists);

(b) Take $w = w\ell$;

(c) If $\ell \ne$ ⋄, subtract one from the number of letter $\ell$ still allowed in each abelian period;

(d) For each abelian period $\vec{p}_i$, if $|\vec{p}_i|$ divides $|w|$, reset the letters' remaining count for all letters in that abelian period.

Example 5. To illustrate Algorithm 4, let $\{\vec{p}_i\} = \{(1, 1), (1, 2), (3, 2)\}$; building the word, we get abba⋄bab⋄ab⋄ab⋄ab⋄abbab⋄abba⋄⋄:

Iteration | w         | q0[1] | q0[2] | q1[1] | q1[2] | q2[1] | q2[2]
0         | ε         | 1     | 1     | 1     | 2     | 3     | 2
1         | ab        | 1     | 1     | 0     | 1     | 2     | 1
2         | abb       | 1     | 0     | 1     | 2     | 2     | 0
3         | abba      | 1     | 1     | 0     | 2     | 1     | 0
4         | abba⋄     | 1     | 1     | 0     | 2     | 3     | 2
5         | abba⋄b    | 1     | 1     | 1     | 2     | 3     | 1
6         | abba⋄ba   | 0     | 1     | 0     | 2     | 2     | 1
7         | abba⋄bab  | 1     | 1     | 0     | 1     | 2     | 0
8         | abba⋄bab⋄ | 1     | 1     | 1     | 2     | 2     | 0
...       | ...       | ...   | ...   | ...   | ...   | ...   | ...

This word has all the desired abelian periods and is the lexicographically least such word.
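The greedy procedure is short enough to run directly. Below is a minimal Python sketch of Algorithm 4, given as an illustration and not as the authors' code. It writes the hole as `?`, represents Parikh vectors as tuples of counts, and maps letter indices to the characters of a hypothetical `alphabet` parameter.

```python
HOLE = "?"  # assumption: ASCII stand-in for the hole symbol


def build_abelian_nonprimitive(vectors, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Greedy sketch of Algorithm 4: earliest admissible letter, else a hole."""
    k = len(vectors[0])
    lengths = [sum(v) for v in vectors]
    remaining = [list(v) for v in vectors]   # the vectors q_i of the pseudocode
    w = []
    while True:
        # earliest letter still allowed by every current abelian period
        letter = next((l for l in range(k)
                       if all(q[l] > 0 for q in remaining)), None)
        if letter is None:
            w.append(HOLE)
        else:
            w.append(alphabet[letter])
            for q in remaining:
                q[letter] -= 1
        done = True
        for i, n in enumerate(lengths):
            if len(w) % n == 0:
                remaining[i] = list(vectors[i])   # a block of period i just ended
            else:
                done = False
        if done:
            return "".join(w)


print(build_abelian_nonprimitive([(1, 1), (1, 2), (3, 2)]))
# abba?bab?ab?ab?ab?abbab?abba??
```

On the input of Example 5 the sketch returns the same word of length 30, with `?` standing for the hole.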

Theorem 8. Given a set $\{\vec{p}_i\}$ of Parikh vectors, Algorithm 4 computes the lexicographically least partial word (with ⋄ lexicographically after every letter in the alphabet) of length $\operatorname{lcm}\{|\vec{p}_i|\}$ with all of those Parikh vectors specifying abelian periods.

Proof. We need to prove three facts about this algorithm:

(i) The word w generated has length lcm {|~pi|};

(ii) The word w has all of the required abelian periods;

(iii) The word w is lexicographically least of all words of that length with those abelian periods (with the hole after all nonhole characters).


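Algorithm 4 BUILD ABELIAN NONPRIMITIVE ($\{\vec{p}_i\}$)
Require: $\{\vec{p}_i\}$ is a set of m Parikh vectors, $\vec{p}_0, \ldots, \vec{p}_{m-1}$, over an alphabet of size k with letters 1 through k
  $\{\vec{q}_i\} \leftarrow \{\vec{p}_i\}$
  w ← ε
  done ← False
  repeat
    $\ell \leftarrow 1$
    i ← 0
    while $\ell \le k$ and i < m do
      if $\vec{q}_i[\ell] = 0$ then
        $\ell \leftarrow \ell + 1$
        i ← 0
      else
        i ← i + 1
    if $\ell \le k$ then
      $w \leftarrow w\ell$
      for i ← 0 to m − 1 do
        $\vec{q}_i[\ell] \leftarrow \vec{q}_i[\ell] - 1$
    else
      w ← w⋄
    done ← True
    for i ← 0 to m − 1 do
      if $|w| \bmod |\vec{p}_i| = 0$ then
        $\vec{q}_i \leftarrow \vec{p}_i$
      else
        done ← False
  until done
  return w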


For (i), note that |w| increases by one every iteration of the repeat-until loop. This loop terminates the first time that the lengths of all the abelian periods divide |w|, which occurs when $|w| = \operatorname{lcm}\{|\vec{p}_i|\}$, as required.

For (ii), we show that $w[r|\vec{p}_i| \,..\, (r+1)|\vec{p}_i|)$ has a completion with Parikh vector $\vec{p}_i$ for every integer r for which this factor is defined. Just before position $r|\vec{p}_i|$ is computed, $|w| = r|\vec{p}_i|$, so $|w| \bmod |\vec{p}_i| = 0$. This means that $\vec{q}_i = \vec{p}_i$ just before this factor is computed. Over the course of computing this factor, a letter (nonhole) is only added if $\vec{q}_i$ has that entry greater than zero, and, if such a letter is added, $\vec{q}_i$ has that entry decremented. Hence, no more than $\vec{p}_i[\ell]$ of letter $\ell$ are added in this factor, so $w[r|\vec{p}_i| \,..\, (r+1)|\vec{p}_i|)$ has a completion with Parikh vector $\vec{p}_i$, as required.

For (iii), suppose towards a contradiction that a word $w'$ has the same abelian periods as w and the same length as w but $w' < w$ lexicographically. Let i be the least position where $w(i) \ne w'(i)$. Clearly, it must be that $w'(i) < w(i)$. On iteration i + 1 of the repeat-until loop, the algorithm would have considered the letter $w'(i)$ before rejecting it in favor of $w(i)$. Hence, some $\vec{q}_j$ at that point would have had a zero in the position corresponding to $w'(i)$. This means that w had used up all of the $\vec{p}_j$ Parikh vector's $w'(i)$'s already in the current abelian period of length $|\vec{p}_j|$. Since $w[0..i) = w'[0..i)$, $w'$ must also have used up those letters. The word $w'$, though, has $w'(i)$ in position i, which would be too many of that letter for that abelian period. This is a contradiction, so w must be lexicographically least.

It is important to note that, while Algorithm 4 gives the lexicographically least word with the desired properties, it does not, in general, give a partial word with the minimum number of holes. For example, running Algorithm 4 on input {(1, 2), (1, 3), (1, 4)} yields

abbbbabb⋄babbb⋄abbb⋄abb⋄babbb⋄abbbbabb⋄babbb⋄abbbbabb⋄babbb⋄,

which contains nine holes, but

abbbbabb⋄babbb⋄abbb⋄babbbabb⋄babb⋄bbabbbabbb⋄abbbbabb⋄babbb⋄

has the same abelian periods and only eight holes (which is the minimum number possible by the µ bound of Theorem 6).

Algorithm 4 still works (in a theoretical sense) if given an infinite set of Parikh vectors. This effect can be simulated by passing it increasing subsets of the infinite set and seeing what the output is. Running Algorithm 4 on the infinite set {(α, α) | α ≥ 1} ∪ {(α, α + 1) | α ≥ 1} (a set of Parikh vectors with every length greater than 1) yields the word $(abbab⋄)^\omega$, meaning that this word has every abelian period 2 and higher.
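The claim about $(abbab⋄)^\omega$ can be spot-checked numerically for a range of period lengths. The Python sketch below is a sanity check under stated assumptions, not a proof: it writes the hole as `?`, it inspects the blocks starting at multiples of the period length p, and it uses the fact that the word has period 6, so the first six such blocks already cover every block that can occur. The function name is hypothetical.

```python
BASE = "abbab?"  # one period of the infinite word, with ? as the hole


def blocks_fit(alpha, odd):
    """Check the blocks of length p against the Parikh vector for this alpha.

    p = 2*alpha + 1 with vector (alpha, alpha + 1) when `odd` is True,
    p = 2*alpha with vector (alpha, alpha) otherwise.
    """
    p = 2 * alpha + (1 if odd else 0)
    max_a, max_b = alpha, alpha + (1 if odd else 0)
    for r in range(6):                       # block starts repeat modulo 6
        block = "".join(BASE[(r * p + i) % 6] for i in range(p))
        if block.count("a") > max_a or block.count("b") > max_b:
            return False                     # no completion with this vector
    return True


print(all(blocks_fit(alpha, odd) for alpha in range(1, 40) for odd in (False, True)))
```

A block admits a completion with the given Parikh vector exactly when its letter counts do not exceed the vector's entries, since the holes can supply the missing letters.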

Proposition 1. The word $(abbab\diamond)^\omega$ is the lexicographically least partial word with abelian periods given by every Parikh vector in

{(α, α) | α ≥ 1} ∪ {(α, α + 1) | α ≥ 1} .

Proof. We must show that this word has the desired abelian periods and that it is the lexicographically least such word. First, note that it has abelian period (1, 1), and, hence, all the abelian periods in {(α, α) | α ≥ 1}. Also, notice that it has abelian period (1, 2). There are six possibilities for a block of length 5: abbab, bbab⋄, bab⋄a, ab⋄ab, b⋄abb, and ⋄abba. All of these are abelian equivalent to aabbb, which has Parikh vector (2, 3).

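Now, let α ≥ 3 and let p = 2α + 1. Notice that $\alpha = \frac{p-1}{2}$ and $\alpha + 1 = \frac{p+1}{2}$. There are six possibilities for a block of length p:

• $(abbab\diamond)^{\lfloor p/6 \rfloor}\,(abbab\diamond)[0..p \bmod 6)$;

• $(bbab\diamond a)^{\lfloor p/6 \rfloor}\,(bbab\diamond a)[0..p \bmod 6)$;

• $(bab\diamond ab)^{\lfloor p/6 \rfloor}\,(bab\diamond ab)[0..p \bmod 6)$;

• $(ab\diamond abb)^{\lfloor p/6 \rfloor}\,(ab\diamond abb)[0..p \bmod 6)$;

• $(b\diamond abba)^{\lfloor p/6 \rfloor}\,(b\diamond abba)[0..p \bmod 6)$;

• $(\diamond abbab)^{\lfloor p/6 \rfloor}\,(\diamond abbab)[0..p \bmod 6)$.

All of these have at least $2\lfloor p/6 \rfloor$ occurrences of a in them and at least $3\lfloor p/6 \rfloor$ occurrences of b. Table 1 shows the number of each letter for each possible value of p mod 6 (since p is odd, only the residues 1, 3, and 5 occur). Rows are numbered in the order of the six blocks listed above.

Table 1: Letter counts in factors of $(abbab\diamond)^\omega$ of length p

| Block | Letter | p mod 6 = 1 | p mod 6 = 3 | p mod 6 = 5 |
|---|---|---|---|---|
| 1 | a | $2\cdot\frac{p-1}{6}+1=\frac{p+2}{3}\le\alpha$ | $2\cdot\frac{p-3}{6}+1=\frac{p}{3}<\alpha$ | $2\cdot\frac{p-5}{6}+2=\frac{p+1}{3}<\alpha$ |
| 1 | b | $3\cdot\frac{p-1}{6}=\frac{p-1}{2}<\alpha+1$ | $3\cdot\frac{p-3}{6}+2=\frac{p+1}{2}=\alpha+1$ | $3\cdot\frac{p-5}{6}+3=\frac{p+1}{2}=\alpha+1$ |
| 2 | a | $2\cdot\frac{p-1}{6}=\frac{p-1}{3}<\alpha$ | $2\cdot\frac{p-3}{6}+1=\frac{p}{3}<\alpha$ | $2\cdot\frac{p-5}{6}+1=\frac{p-2}{3}<\alpha$ |
| 2 | b | $3\cdot\frac{p-1}{6}+1=\frac{p+1}{2}=\alpha+1$ | $3\cdot\frac{p-3}{6}+2=\frac{p+1}{2}=\alpha+1$ | $3\cdot\frac{p-5}{6}+3=\frac{p+1}{2}=\alpha+1$ |
| 3 | a | $2\cdot\frac{p-1}{6}=\frac{p-1}{3}<\alpha$ | $2\cdot\frac{p-3}{6}+1=\frac{p}{3}<\alpha$ | $2\cdot\frac{p-5}{6}+2=\frac{p+1}{3}<\alpha$ |
| 3 | b | $3\cdot\frac{p-1}{6}+1=\frac{p+1}{2}=\alpha+1$ | $3\cdot\frac{p-3}{6}+2=\frac{p+1}{2}=\alpha+1$ | $3\cdot\frac{p-5}{6}+2=\frac{p-1}{2}<\alpha+1$ |
| 4 | a | $2\cdot\frac{p-1}{6}+1=\frac{p+2}{3}\le\alpha$ | $2\cdot\frac{p-3}{6}+1=\frac{p}{3}<\alpha$ | $2\cdot\frac{p-5}{6}+2=\frac{p+1}{3}<\alpha$ |
| 4 | b | $3\cdot\frac{p-1}{6}=\frac{p-1}{2}<\alpha+1$ | $3\cdot\frac{p-3}{6}+1=\frac{p-1}{2}<\alpha+1$ | $3\cdot\frac{p-5}{6}+2=\frac{p-1}{2}<\alpha+1$ |
| 5 | a | $2\cdot\frac{p-1}{6}=\frac{p-1}{3}<\alpha$ | $2\cdot\frac{p-3}{6}+1=\frac{p}{3}<\alpha$ | $2\cdot\frac{p-5}{6}+1=\frac{p-2}{3}<\alpha$ |
| 5 | b | $3\cdot\frac{p-1}{6}+1=\frac{p+1}{2}=\alpha+1$ | $3\cdot\frac{p-3}{6}+1=\frac{p-1}{2}<\alpha+1$ | $3\cdot\frac{p-5}{6}+3=\frac{p+1}{2}=\alpha+1$ |
| 6 | a | $2\cdot\frac{p-1}{6}=\frac{p-1}{3}<\alpha$ | $2\cdot\frac{p-3}{6}+1=\frac{p}{3}<\alpha$ | $2\cdot\frac{p-5}{6}+2=\frac{p+1}{3}<\alpha$ |
| 6 | b | $3\cdot\frac{p-1}{6}=\frac{p-1}{2}<\alpha+1$ | $3\cdot\frac{p-3}{6}+1=\frac{p-1}{2}<\alpha+1$ | $3\cdot\frac{p-5}{6}+2=\frac{p-1}{2}<\alpha+1$ |

In all cases, the required quantity is less than or equal to either α (for a's) or α + 1 (for b's). Thus, this word has all of the desired abelian periods.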

Finally, we show that $w = (abbab\diamond)^\omega$ is the lexicographically least word with all of the desired abelian periods. Suppose towards a contradiction that we have a partial word w′ with the desired abelian periods such that w′ precedes w lexicographically. Let i be the smallest index such that w′(i) ≠ w(i). We must have w′(i) < w(i), so w(i) ≠ a. If w(i) = b, then w′(i) = a. If i mod 6 = 1, then w′ does not have abelian period (1, 1). If i mod 6 = 2 or i mod 6 = 4, then w′ does not have abelian period (1, 2). Hence, w(i) = ⋄. In this case, if w′(i) = a, w′ does not have abelian period (1, 2), and if w′(i) = b, w′ does not have abelian period (1, 1). This contradicts the fact that w′ has all of the desired abelian periods, so w is the lexicographically least such word.
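The period claims in this proof can be spot-checked numerically on a finite prefix of the word. The sketch below is only a sanity check, not part of the proof. It assumes that a block admits a completion with a given Parikh vector exactly when no letter occurs in it more often than the vector allows; the helper name `has_abelian_period`, the prefix length, and the range of α tested are choices made for illustration.

```python
from collections import Counter

HOLE = "\u22c4"  # the hole symbol

def has_abelian_period(word, parikh, alphabet="ab"):
    """True if every block of length |parikh| (the last one may be shorter)
    uses no letter more often than parikh allows; holes absorb the rest."""
    period = sum(parikh)
    for start in range(0, len(word), period):
        counts = Counter(word[start:start + period])
        if any(counts[letter] > limit for letter, limit in zip(alphabet, parikh)):
            return False
    return True

prefix = ("abbab" + HOLE) * 50  # finite prefix of the infinite word
checked = all(
    has_abelian_period(prefix, (a, a)) and has_abelian_period(prefix, (a, a + 1))
    for a in range(1, 41)
)
print(checked)
```

Replacing the hole by a letter breaks one of the two smallest periods, in line with the last step of the proof: for instance, `has_abelian_period("abbaba" * 5, (1, 2))` is false because the block aba contains two a's.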

5 Conclusion

In Theorem 6, we have proven a lower bound for the number of holes in a word with specific Parikh vector abelian periods. We do not know for certain whether all sets of Parikh vectors can achieve precisely this many holes. We strongly believe that this bound is tight, in the sense that given $\{\vec{p}_i\}$ there exists a partial word with exactly this many holes and abelian periods $\{\vec{p}_i\}$.

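Conjecture 1. Let k be the alphabet size, and let $\vec{p}_0$ through $\vec{p}_{m-1}$ be Parikh vectors. There exists a partial word of length $\operatorname{lcm}\{|\vec{p}_i|\}$ with abelian periods given by these Parikh vectors with exactly

$$\sum_{\ell=1}^{k} \max_{0 \le j < m} \left\{ \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_{m-1}|}\,\|\vec{p}_{m-1}\|_{\ell} - \frac{\operatorname{lcm}\{|\vec{p}_i|\}}{|\vec{p}_j|}\,\|\vec{p}_j\|_{\ell} \right\}$$

holes.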
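As a worked illustration, the quantity in Conjecture 1 can be evaluated directly. The sketch below assumes that $\|\vec{p}\|_{\ell}$ denotes the entry of $\vec{p}$ for the ℓ-th letter; the function name `conjectured_hole_count` is hypothetical.

```python
from functools import reduce
from math import gcd

def conjectured_hole_count(parikh_vectors):
    """Evaluate the sum in Conjecture 1, reading ||p||_l as the l-th entry of p
    (an assumption about the notation)."""
    periods = [sum(p) for p in parikh_vectors]
    total_len = reduce(lambda x, y: x * y // gcd(x, y), periods)  # lcm of the |p_i|
    last, last_len = parikh_vectors[-1], periods[-1]
    holes = 0
    for letter in range(len(last)):
        # Occurrences of this letter allowed by the last vector over the whole word.
        reference = total_len // last_len * last[letter]
        holes += max(
            reference - total_len // n * p[letter]
            for p, n in zip(parikh_vectors, periods)
        )
    return holes

print(conjectured_hole_count([(1, 2), (1, 3), (1, 4)]))  # 8
```

For the input {(1, 2), (1, 3), (1, 4)} this gives 8, the hole count of the second word displayed after Algorithm 4. Because the scaled entries of each vector sum to the word length, the value does not depend on which vector is listed last.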

We believe that the converse of Theorem 7 holds if and only if the $p_i$'s are pairwise relatively prime, though we have not yet been able to prove this.

In addition, a World Wide Web server interface has been established at

www.uncg.edu/cmp/research/primitivity4

for automated use in particular of a program “Nonprimitive Words” that, given as input a set of Parikh vectors, outputs a shortest partial word with all specified exact abelian periods along with the word's length, determines if this word is abelian minimal in these abelian periods (with respect to filling holes), that is, whether this word has holes that, if any is filled in, will give a partial word with the same abelian periods, calculates the theoretical minimum number of holes for these abelian periods, and counts the number of full words with the specified abelian periods over the given alphabet as well as the number of full abelian-nonprimitive words of the aforementioned length over the given alphabet.

References

[1] J. Berstel and L. Boasson. Partial words and a theorem of Fine and Wilf. Theoretical Computer Science, 218:135–141, 1999.

[2] F. Blanchet-Sadri. Primitive partial words. Discrete Applied Mathematics, 148:195–213, 2005.

[3] F. Blanchet-Sadri. Algorithmic Combinatorics on Partial Words. Chapman & Hall/CRC Press, Boca Raton, FL, 2008.

[4] F. Blanchet-Sadri and A. R. Anavekar. Testing primitivity on partial words. Discrete Applied Mathematics, 155:279–287, 2007.

[5] F. Blanchet-Sadri and M. Cucuringu. Counting primitive partial words. Journal of Automata, Languages and Combinatorics.

[6] F. Blanchet-Sadri, J. I. Kim, R. Mercas, W. Severa, S. Simmons, and D. Xu. Avoiding abelian squares in partial words. Journal of Combinatorial Theory, Series A, 119:257–270, 2012.

[7] F. Blanchet-Sadri, S. Simmons, and D. Xu. Abelian repetitions in partial words. Advances in Applied Mathematics, 48:194–214, 2012.

[8] F. Blanchet-Sadri, A. Tebbe, and A. Veprauskas. Fine and Wilf's theorem for abelian periods in partial words. In JM 2010, 13iemes Journees Montoises d'Informatique Theorique, Amiens, France, 2010.


[9] S. Constantinescu and L. Ilie. Fine and Wilf's theorem for abelian periods. Bulletin of the European Association for Theoretical Computer Science, 89:167–170, 2006.

[10] M. Crochemore and W. Rytter. Text algorithms. Oxford University Press, New York, NY, 1994.

[11] E. Czeizler, L. Kari, and S. Seki. On a special class of primitive words. Theoretical Computer Science, 411:617–630, 2010.

[12] M. Domaratzki and N. Rampersad. Abelian primitive words. In G. Mauri and A. Leporati, editors, DLT 2011, 15th International Conference on Developments in Language Theory, Milano, Italy, volume 6795 of Lecture Notes in Computer Science, pages 204–215, Berlin, Heidelberg, 2011. Springer-Verlag.

[13] P. Domosi, S. Horvath, and M. Ito. Context-Free Languages and Primitive Words. World Scientific, 2011. To appear.

[14] P. Domosi, S. Horvath, M. Ito, L. Kaszonyi, and M. Katsura. Formal languages consisting of primitive words. In Z. Esik, editor, FCT 1993, International Conference on Fundamentals of Computation Theory, volume 710 of Lecture Notes in Computer Science, pages 194–203, Berlin, Heidelberg, 1993. Springer-Verlag.

[15] G. Fici, T. Lecroq, A. Lefebvre, and E. Prieur-Gaston. Computing abelian periods in words. PSC 2011, Prague Stringology Conference, Prague, Czech Republic, pages 184–196, 2011.

[16] M. Lothaire. Combinatorics on Words. Cambridge University Press, Cambridge, 1997.

[17] L. B. Richmond and J. Shallit. Counting abelian squares. The Electronic Journal of Combinatorics, 16:R72, 2009.

[18] J. C. Rosales. Fundamental gaps of numerical semigroups generated by two elements. Linear Algebra and its Applications, 405:200–208, 2005.
