
Numer Algor (2011) 58:203–233. DOI 10.1007/s11075-011-9453-x

ORIGINAL PAPER

Inverse functions of polynomials and its applications to initialize the search of solutions of polynomials and polynomial systems

Joaquin Moreno · A. Saiz

Received: 5 May 2010 / Accepted: 16 February 2011 / Published online: 9 March 2011
© Springer Science+Business Media, LLC 2011

Abstract In this paper we present a new algorithm for solving polynomial equations based on the Taylor series of the inverse function of a polynomial, f_P(y). The foundations of the computation of such series have been developed by the authors in some recent papers, proceeding as follows: given a polynomial function y = P(x) = a_0 + a_1 x + ... + a_m x^m, with a_i ∈ R, 0 ≤ i ≤ m, and a real number u such that P′(u) ≠ 0, we obtain an analytic function f_P(y) that satisfies x = f_P(P(x)) around x = u. Besides, we also introduce a new (completely different) proof of the theorems involved in the construction of f_P(y), which provides a better radius of convergence of its Taylor series and a more general perspective that could allow its application to other kinds of equations, not only polynomials. Finally, we illustrate with some examples how f_P(y) could be used for solving polynomial systems. This question has already been treated by the authors in preceding works in a very complex and laborious way, which we want to overcome by using the algorithm introduced in this paper.

Keywords Newton's method · Quasi-Newton methods · Inverse function of polynomials · Polynomial zeros · Polynomial systems zeros · Algorithms · Nonlinear equations

J. Moreno (B) · A. Saiz
Departamento de Matemática Aplicada, EUAT, U.P. de Valencia, Camino de Vera, 14, 46022 Valencia, Spain
e-mail: [email protected]

A. Saiz
e-mail: [email protected]


1 Introduction

The purpose of this paper is to contribute to the solution of polynomial equations and polynomial systems, given by the following expressions:

$$a_0 + a_1 x + \cdots + a_m x^m = 0, \qquad (1)$$

$$F(x_1, \ldots, x_p) = (f_1(x_1, \ldots, x_p), \ldots, f_q(x_1, \ldots, x_p)) = (0, \ldots, 0). \qquad (2)$$

The search for solutions of (1) has contributed to the development of mathematics throughout the centuries, from Sumerian (third millennium B.C.), Babylonian (second millennium B.C.) and Egyptian (second millennium B.C.) times until the present day.

Particularly, the study of quadratic, cubic and fourth degree equations motivated the introduction of some important concepts of mathematics, such as irrational and complex numbers. Galois theory was motivated by this same problem of solving (1). This theory not only included the proof of the non-existence of solutions of (1) by radicals for m ≥ 5, but also introduced the ideas of groups and ideals, which motivated the development of modern algebra.

Finally, (1) has influenced the early development of numerical computing, above all in the important case of computer algebra, where solving (1) for large m is needed. This is where equation (1) keeps playing its role nowadays, both as a research problem and as a part of computing tasks.

Polynomial equation (1) and polynomial system (2) arise in many important mathematical areas, such as finite element methods, optimization with or without constraints, or nonlinear least squares problems. On the other hand, they also appear in a large number of fields of science, such as physics, chemistry, biology, geophysics, engineering and industry, see [1].

In all these contexts, most practical algorithms for solving them are iterative. Given an initial approximation, x_0, a sequence of iterates x_k, k = 1, 2, ..., is generated in such a way that, hopefully, the approximation to some solution is progressively improved. Convergence is not guaranteed in the general case, and no global procedures are provided in order to find such a convenient approximation x_0.

In the case of polynomials, see [2–6] as samples of such algorithms. For polynomial systems, the most important methods are Newton's and quasi-Newton methods. In [7–9] the reader can find the first steps in the development of this line of research, which led to a large amount of work in the late sixties and seventies, brilliantly summarized in [10]. Since then there has been a proliferation of research that can be placed in this framework, and new methods have also been obtained, from which we cite some of the more recent ones [11–22].

It is in the search for the above-mentioned approximations, x_0, where the inverse function of a polynomial, f_P(y), might contribute to improving such algorithms, by giving a general method that allows us to locate zeros inside regions of R^p small enough to guarantee convergence.


This paper is organized as follows. In Section 2, functional equation (3) is introduced, h(x) being the unknown. In order to construct the Taylor series of h(x), all its partial derivatives at zero are calculated, and we show how to do this.

In Section 3, with the aim of studying the convergence of the Taylor series of h(x), a lower and an upper bound are provided for all the partial derivatives of the function h(x) at zero.

In Section 4, from the analytic function h, the inverse function, f_P(y), of a polynomial P(x) is obtained.

Computing approximations of the Taylor series of h(x) requires a great number of operations. Therefore, in order to avoid this difficulty, in Section 5 a lower bound, H_l(x), and an upper bound, H_u(x), of the function h(x) are constructed, which allows us to evaluate it with a much smaller operational cost. Some examples are provided to illustrate how inverse functions of polynomials can be used, with the goal of finding initial approximations for solving polynomials and polynomial systems.

Finally, in Section 6 we explain our conclusions and future lines of research.

Throughout this paper, the degree of the polynomial P will be denoted by m; Z is the set of integer numbers; R, the set of real numbers; C, the set of complex numbers; x = (x_2, ..., x_m) is a vector of R^{m−1}; 0 is the zero vector in R^{m−1}; the first partial derivatives of a function f are denoted by f_x, the second ones by f_{xy}, and, for higher orders, f^{(i_1,...,i_m)}(x) will denote the partial derivative of f with respect to the first variable i_1 times, ..., and with respect to the m-th variable i_m times, evaluated at the point x. Finally, the expression h^{(p,0,...,0,1(i),0,...,0,q)} will denote that the number 1 is placed in the i-th position.

2 Calculation of the derivatives of the function h at the point zero

In this section we provide new proofs of the theorems introduced in [23, 24] in order to construct the inverse function of a polynomial.

The main result of this section is Theorem 5, which gives, in an explicit way, the partial derivatives of the function h(x) at zero. This theorem is deduced from Theorems 2, 3 and 4, which, in turn, are based on Lemmas 1, 2 and 3.

Definition 1 The functional equation given by

$$P(x_2, \ldots, x_m, h(x_2, \ldots, x_m)) = x_m h(x)^m + \cdots + x_2 h(x)^2 - h(x) + 1 = 0 \qquad (3)$$

is defined, where m > 1, x = (x_2, x_3, ..., x_m) ∈ R^{m−1} and h : R^{m−1} → C is the unknown to be solved for.

Theorem 1 Functional equation (3) is solvable.


Proof It is easy to see that the relation ⪯, defined in C and given by

$$r_i \preceq r_j \iff \begin{cases} r_i = r_j; \\ \operatorname{Real}(r_i) < \operatorname{Real}(r_j), & \text{if } \operatorname{Real}(r_i) \neq \operatorname{Real}(r_j); \\ \operatorname{Imag}(r_i) < \operatorname{Imag}(r_j), & \text{if } \operatorname{Real}(r_i) = \operatorname{Real}(r_j) \text{ and } \operatorname{Imag}(r_i) \neq \operatorname{Imag}(r_j), \end{cases}$$

is a total order relation. Given x^0 = (x_2^0, ..., x_m^0) ∈ R^{m−1}, (3) defines a polynomial equation of degree p, with 1 ≤ p ≤ m. Solving for h(x^0), there are p solutions in C, which we order in the following manner: r_1^{(x^0)} ⪯ r_2^{(x^0)} ⪯ ... ⪯ r_p^{(x^0)}. Then the function

$$h : \mathbb{R}^{m-1} \to \mathbb{C}, \qquad h(x^0) = \begin{cases} r_1^{(x^0)}, & \text{if } x^0 \neq (0, \ldots, 0), \\ 1, & \text{if } x^0 = (0, \ldots, 0), \end{cases}$$

is a solution of (3). □
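The proof suggests a direct numerical way to evaluate a solution of (3) at a given point x = (x_2, ..., x_m): solve the polynomial equation in the unknown h(x) and select the first root with respect to the total order ⪯. The following sketch is our own illustration (Python with numpy, which the paper does not use); it produces one particular solution of the functional equation and checks the residual of (3).

```python
import numpy as np

def h_value(x):
    """Evaluate a solution of (3) at x = (x_2, ..., x_m).

    Equation (3) reads x_m h^m + ... + x_2 h^2 - h + 1 = 0, so the coefficients
    of the polynomial in h are, from highest to lowest degree, [x_m, ..., x_2, -1, 1].
    """
    x = list(x)
    if all(c == 0 for c in x):
        return 1.0                      # h(0, ..., 0) = 1 by definition
    coeffs = x[::-1] + [-1.0, 1.0]      # [x_m, ..., x_2, -1, 1]
    roots = np.roots(coeffs)
    # total order of Theorem 1: first by real part, then by imaginary part
    roots = sorted(roots, key=lambda r: (r.real, r.imag))
    return roots[0]

# quick check for m = 2, x = (x_2,) = (0.1,)
r = h_value([0.1])
print(r, 0.1 * r**2 - r + 1)   # the second value should be close to 0
```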

The next Lemma is easy to prove.

Lemma 1 Let h(x) be a solution of (3); then the first-order partial derivatives of h(x) are given by

$$h_{x_2}(x) = \frac{h(x)^2}{-P_h(x,h(x))}, \quad h_{x_3}(x) = \frac{h(x)^3}{-P_h(x,h(x))}, \quad \ldots, \quad h_{x_m}(x) = \frac{h(x)^m}{-P_h(x,h(x))}, \qquad (4)$$

for all x ∈ R^{m−1} with P_h(x, h(x)) ≠ 0.

Lemma 2 If h(x) is a solution of (3), then the equalities

$$h_{x_i x_j}(x) = h_{x_{i-1} x_{j+1}}(x), \quad 3 \le i \le m-2,\; 4 \le j \le m-1,\; i < j, \qquad (5)$$

$$h_{x_i x_i}(x) = h_{x_{i-1} x_{i+1}}(x), \quad 3 \le i \le m-1, \qquad (6)$$

are satisfied.

Proof From (4), the second derivative of h^2(x) with respect to x_{i−1} and x_j is

$$(h^2)_{x_{i-1} x_j} = 2(h\,h_{x_{i-1}})_{x_j} = 2\,h_{x_i x_j}. \qquad (7)$$

On the other hand,

$$(h^2)_{x_j x_{i-1}} = 2(h\,h_{x_j})_{x_{i-1}} = 2\,h_{x_{j+1} x_{i-1}}. \qquad (8)$$

As (7) equals (8), (5) follows. Now,

$$2h_{x_i x_i} = 2(h_{x_i})_{x_i} = 2(h\,h_{x_{i-1}})_{x_i} = (h^2)_{x_{i-1} x_i} = (h^2)_{x_i x_{i-1}} = 2(h\,h_{x_i})_{x_{i-1}} = 2\,h_{x_{i+1} x_{i-1}}. \qquad (9)$$

Thus, the result is proven. □

From the previous Lemma, the next one can be easily proven too.


Lemma 3 With the same hypotheses as Lemma 2, the next equalities hold:

$$h^{(q_2,\cdots,q_i,\cdots,q_j,\cdots,q_m)}(x) = h^{(q_2,\cdots,q_{i-1}+1,\,q_i-1,\cdots,\,q_j-1,\,q_{j+1}+1,\cdots,q_m)}(x), \qquad (10)$$

with 3 ≤ i ≤ m − 2, 4 ≤ j ≤ m − 1, i < j and q_i, q_j > 0;

$$h^{(q_2,\cdots,q_i,\cdots,q_m)}(x) = h^{(q_2,\cdots,q_{i-1}+1,\,q_i-2,\,q_{i+1}+1,\cdots,q_m)}(x), \qquad (11)$$

with 3 ≤ i ≤ m − 1 and q_i > 1.

Theorem 2 If h(x) is a solution of (3), then it satisfies the first-order partial derivative equation

$$1 + (2x_2-1)h(x) + (3x_3 + 4x_2^2 - x_2)h_{x_2}(x) + (4x_4 + 6x_2x_3 - 2x_3)h_{x_3}(x) + \cdots + (mx_m + 2(m-1)x_2x_{m-1} - (m-2)x_{m-1})h_{x_{m-1}}(x) + (2mx_2x_m - (m-1)x_m)h_{x_m}(x) = 0, \qquad (12)$$

for all x such that P_h(x, h(x)) ≠ 0.

Proof

$$\bigl((2x_2-1)h(x)+1\bigr)P_h(x,h(x)) = \bigl((2x_2-1)h(x)+1\bigr)\bigl(-1 + 2x_2h(x) + 3x_3h(x)^2 + \cdots + mx_mh(x)^{m-1}\bigr)$$
$$= -1 + 2x_2h(x) + 3x_3h(x)^2 + \cdots + mx_mh(x)^{m-1} + h(x) - 2x_2h(x)^2 - 3x_3h(x)^3 - \cdots - mx_mh(x)^m - 2x_2h(x) + 4x_2^2h(x)^2 + 6x_2x_3h(x)^3 + \cdots + 2mx_2x_mh(x)^m. \qquad (13)$$

In agreement with (3), h(x) − 1 = x_m h(x)^m + ... + x_2 h(x)^2. Then, dividing by P_h(x, h(x)), (13) becomes

$$(2x_2-1)h(x)+1 = (3x_3 + 4x_2^2 - x_2)\frac{h(x)^2}{P_h(x,h(x))} + (4x_4 + 6x_2x_3 - 2x_3)\frac{h(x)^3}{P_h(x,h(x))} + \cdots + (mx_m + 2(m-1)x_2x_{m-1} - (m-2)x_{m-1})\frac{h(x)^{m-1}}{P_h(x,h(x))} + (2mx_2x_m - (m-1)x_m)\frac{h(x)^m}{P_h(x,h(x))}. \qquad (14)$$

And taking (4) into account, the result follows. □


Theorem 3 Let h satisfy equation (3), with m > 2 (without loss of generality). We consider the n-th derivative of h(x), h^{(q_2,...,q_m)}(x), with q_2 + ... + q_m = n, and the integer number T, defined as

$$T = 2q_2 + 3q_3 + \cdots + mq_m \ge 2n. \qquad (15)$$

Then there are three unique integer numbers p, q ≥ 0 and r, with 3 ≤ r ≤ m − 1, that verify either

$$T = mq + 2p,\quad n = q + p; \qquad\text{or}\qquad T = mq + 2p + r,\quad n = q + p + 1. \qquad (16)$$

Besides this,

$$h^{(q_2,\ldots,q_m)}(x) = \begin{cases} h^{(p,0,\cdots,0,1(r-1),0,\cdots,q)}(x), & \text{if } T = mq + 2p + r, \text{ with } p + q + 1 = n,\\[2pt] h^{(p,0,\cdots,0,q)}(x), & \text{if } T = mq + 2p, \text{ with } p + q = n. \end{cases} \qquad (17)$$

Proof First of all, let us assume that (T − 2n)/(m − 2) ∈ Z. Then q = (T − 2n)/(m − 2) ≥ 0 and p = n − q satisfy (16). Otherwise, let q, r_1, p ∈ Z be such that

$$q = \frac{(T - 2n) - r_1}{m - 2}, \quad \text{with } 1 \le r_1 \le m - 3, \quad\text{and}\quad p = n - q - 1. \qquad (18)$$

Consequently, T = mq + 2p + r_1 + 2. Then, taking r = r_1 + 2, the numbers q, p and r verify (16).

Equation (17) is a consequence of the following statements:

(a) Each time Lemma 3 is applied to h^{(q_2,...,q_m)}(x), the values of T and n are preserved.

(b) After applying Lemma 3 a finite number of times to h^{(q_2,...,q_m)}(x), it becomes either

$$h^{(q_2,\ldots,q_m)}(x) = h^{(u,0,\cdots,0,1,0,\cdots,v)}(x), \quad u + 1 + v = n,\quad T = 2u + r + mv, \qquad (19)$$

or

$$h^{(q_2,\ldots,q_m)}(x) = h^{(u,0,\cdots,0,v)}(x), \quad u + v = n,\quad T = 2u + mv. \qquad (20)$$

Then, the result follows. □

We need the next Definition in order to prove Theorem 4 by induction. It is also important due to the reasons given in Corollary 2.

Definition 2 Let h(x) be a solution of (3); then the set of n-th partial derivatives of h(x), D_n(x), is defined as

$$D_n(x) = \{\, h^{(n,0,\cdots,0)}(x),\ h^{(n-1,1,0,\cdots,0)}(x),\ h^{(n-1,0,1,0,\cdots,0)}(x),\ \ldots,\ h^{(n-1,0,\cdots,0,1)}(x),\ h^{(n-2,1,0,\cdots,0,1)}(x),\ h^{(n-2,0,1,0,\cdots,0,1)}(x),\ \ldots,\ h^{(n-2,0,\cdots,0,2)}(x),\ \ldots,\ h^{(1,1,0,\cdots,0,n-2)}(x),\ h^{(1,0,1,0,\cdots,0,n-2)}(x),\ \ldots,\ h^{(1,0,\cdots,0,n-1)}(x),\ h^{(0,1,0,\cdots,0,n-1)}(x),\ h^{(0,0,1,0,\cdots,0,n-1)}(x),\ \ldots,\ h^{(0,0,\cdots,0,n)}(x)\,\}. \qquad (21)$$

Besides, in D_n(x) we define the relation given by

$$h^{(q_2,\cdots,q_m)}(x) \prec h^{(p_2,\cdots,p_m)}(x) \iff (\text{if } i,\ 2 \le i \le m, \text{ is the first subscript satisfying } q_i \neq p_i, \text{ then } q_i > p_i). \qquad (22)$$

Corollary 1 Relation (22) is an order relation.

Corollary 2 In agreement with (17), each n-th derivative of h(x) equals at least one of the derivatives of the set D_n(x).

Theorem 4 If t ∈ D_n(0), then either

$$t = h^{(p,0,\ldots,0,q)}(0) = \frac{(2p + mq)!}{(1 + p + (m-1)q)!}, \qquad (23)$$

or

$$t = h^{(p_1,0,\ldots,0,1(i),0,\cdots,0,q_1)}(0) = \frac{(2p_1 + i + 1 + mq_1)!}{(p_1 + i + 1 + (m-1)q_1)!}, \quad \text{with } 2 \le i \le m-2. \qquad (24)$$

Proof The proof is done by induction; the result is easily checked for D_2(0). Now assume that the result is true for all the elements of D_{n−1}(0). In order to prove the result for each element of D_n(0), we apply the induction method again, using the order relation (22).

In fact, the result is true for the first element of D_n(0), since differentiating equation (12) n times with respect to x_2, at the point 0, we obtain

$$(n + 1)\,h^{(n,0,\cdots,0)}(0) = 2n(2n - 1)\,h^{(n-1,0,\cdots,0)}(0). \qquad (25)$$

As h^{(n−1,0,...,0)}(0) ∈ D_{n−1}(0), then

$$h^{(n,0,\cdots,0)}(0) = \frac{2n(2n-1)(2n-2)!}{(n+1)\,n!} = \frac{(2n)!}{(n+1)!}. \qquad (26)$$

Having done this, suppose now that the result is true for the k-th element of D_n(0). The (k+1)-th one is either h^{(p,0,...,0,q)}(0), with p + q = n, or h^{(p_1,0,...,0,1(i),0,...,0,q_1)}(0), with p_1 + 1 + q_1 = n and 2 ≤ i ≤ m − 2.

Let us begin by assuming that the (k+1)-th element of D_n(0) is h^{(p,0,...,0,q)}(0). Differentiating equation (12) p times with respect to x_2 and q times with respect to x_m at the point 0,

$$\bigl(q(m-1) + p + 1\bigr)h^{(p,0,\cdots,0,q)}(0) - mq\,h^{(p,0,\cdots,0,1,q-1)}(0) = 2p(mq + 2p - 1)\,h^{(p-1,0,\cdots,0,q)}(0). \qquad (27)$$

Solving for h^{(p,0,...,0,q)}(0),

$$h^{(p,0,\cdots,0,q)}(0) = \frac{mq}{q(m-1)+p+1}\,h^{(p,0,\cdots,0,1,q-1)}(0) + \frac{2p(mq+2p-1)}{q(m-1)+p+1}\,h^{(p-1,0,\cdots,0,q)}(0). \qquad (28)$$

According to (22), h^{(p,0,...,0,1,q−1)}(0) is the k-th element of D_n(0). Thus,

$$h^{(p,0,\cdots,0,1,q-1)}(0) = \frac{(2p + mq - 1)!}{(p + (m-1)q)!}. \qquad (29)$$

Since h^{(p−1,0,...,0,q)}(0) ∈ D_{n−1}(0),

$$h^{(p-1,0,\cdots,0,q)}(0) = \frac{(2p - 2 + mq)!}{(p + q(m-1))!}. \qquad (30)$$

Substituting (29) and (30) in (28),

$$h^{(p,0,\cdots,0,q)}(0) = \frac{mq\,(2p+mq-1)!}{(p+(m-1)q+1)!} + \frac{2p\,(2p+mq-1)!}{((m-1)q+p+1)!} = \frac{(2p+mq)!}{((m-1)q+p+1)!}. \qquad (31)$$

Now, let the (k+1)-th element of D_n(0) be h^{(p_1,0,...,0,1(i),0,...,0,q_1)}(0). Differentiating again, at the point 0, equation (12) p_1 times with respect to x_2, q_1 times with respect to x_m and once with respect to x_{i+1}, and taking into account Lemma 3, we arrive at

$$(mq_1 + i + 1)\,h^{(p_1,0,\cdots,0,1(i-1),0,\cdots,q_1)}(0) - \bigl(q_1(m-1) + p_1 + i + 1\bigr)h^{(p_1,0,\cdots,0,1(i),\cdots,q_1)}(0) + (mq_1 + 2p_1 + i)\,2p_1\,h^{(p_1-1,0,\cdots,0,1(i),0,\cdots,q_1)}(0) = 0. \qquad (32)$$

Solving for h^{(p_1,0,...,0,1(i),...,q_1)}(0),

$$h^{(p_1,0,\cdots,0,1(i),\cdots,q_1)}(0) = \frac{mq_1 + i + 1}{q_1(m-1) + p_1 + i + 1}\,h^{(p_1,0,\cdots,0,1(i-1),0,\cdots,q_1)}(0) + \frac{(mq_1 + 2p_1 + i)\,2p_1}{q_1(m-1) + p_1 + i + 1}\,h^{(p_1-1,0,\cdots,0,1(i),0,\cdots,q_1)}(0). \qquad (33)$$

According to the order relation (22), h^{(p_1,0,...,0,1(i−1),0,...,0,q_1)}(0) is the k-th element of D_n(0), so

$$h^{(p_1,0,\cdots,0,1(i-1),0,\cdots,0,q_1)}(0) = \frac{(2p_1 + mq_1 + i)!}{(p_1 + (m-1)q_1 + i)!}. \qquad (34)$$

As h^{(p_1−1,0,...,0,1(i),0,...,0,q_1)}(0) ∈ D_{n−1}(0),

$$h^{(p_1-1,0,\cdots,0,1(i),0,\cdots,0,q_1)}(0) = \frac{(2p_1 + mq_1 + i - 1)!}{(p_1 + q_1(m-1) + i)!}. \qquad (35)$$

Substituting (34) and (35) in (33),

$$h^{(p_1,0,\cdots,0,1(i),\cdots,q_1)}(0) = \frac{(mq_1 + i + 1)(2p_1 + mq_1 + i)!}{(p_1 + (m-1)q_1 + i + 1)!} + \frac{2p_1\,(2p_1 + mq_1 + i)!}{(p_1 + q_1(m-1) + i + 1)!} = \frac{(2p_1 + mq_1 + i + 1)!}{(p_1 + q_1(m-1) + i + 1)!}, \qquad (36)$$

which finishes the proof of the Theorem. □

Theorem 5 Let h(x) be a solution of equation (3); then the equality

$$h^{(q_2,\cdots,q_m)}(0) = \frac{(2q_2 + 3q_3 + \cdots + mq_m)!}{(q_2 + 2q_3 + \cdots + (m-1)q_m + 1)!} \qquad (37)$$

is satisfied, with q_2 + ... + q_m = n.

Proof It is a consequence of Theorems 3, 4 and Corollary 2. �
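Formula (37) is straightforward to evaluate. The following sketch (Python; our own illustration, not code from the paper) computes h^{(q_2,...,q_m)}(0) for a given multi-index and reproduces, as a check, the pure x_2-derivatives (2n)!/(n+1)! that appear in (26).

```python
from math import factorial

def h_deriv_at_zero(q):
    """h^{(q_2, ..., q_m)}(0) according to (37).

    q is the multi-index (q_2, ..., q_m); entry i of q (0-based) corresponds
    to the variable x_{i+2}.
    """
    T = sum((i + 2) * qi for i, qi in enumerate(q))          # 2q_2 + 3q_3 + ... + m q_m
    S = sum((i + 1) * qi for i, qi in enumerate(q)) + 1      # q_2 + 2q_3 + ... + (m-1)q_m + 1
    return factorial(T) // factorial(S)

# pure x_2 derivatives give (2n)!/(n+1)!, cf. (26)
print([h_deriv_at_zero((n,)) for n in range(1, 6)])   # 1, 4, 30, 336, 5040
# a mixed derivative for m = 5: q = (1, 0, 2, 1) gives 15!/12! = 2730
print(h_deriv_at_zero((1, 0, 2, 1)))
```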

3 An upper and a lower bound for the n-th derivative of h at 0

In order to study the convergence of the Taylor series of h(x), we will need to bound the partial derivatives of the function h(x) computed in the previous section. Here we deal with this issue. Lemmas 4 and 5 play a technical role with the aim of proving the main results.

Lemma 4 Let n > 0 be an integer number. Then the inequality

$$\frac{n^n}{e^{n-1}} < n! \qquad (38)$$

holds.

Proof

$$\log(n!) = \sum_{i=2}^{n}\log(i) \ge \int_{1}^{n}\log(x)\,dx = \bigl[x\log(x) - x\bigr]_{1}^{n} = \log(n^n) - n + 1. \qquad (39)$$

Consequently the result is proved. □

Lemma 5 Let X, n > 1 and K be positive integer numbers such that X − n + 1 > 0. Then the inequality

$$\frac{X^X}{(X-n+1)^{X-n+1}} \le \frac{(X+K)^X}{(X-n+1+K)^{X-n+1}} \qquad (40)$$

is verified.

Proof The function

$$f(t) = \frac{(X+t)^X}{(X-n+1+t)^{X-n+1}},$$

with t ∈ [0, K], is increasing, since f′(t) > 0 for all t ∈ [0, K]. Thus f(0) < f(K), which proves (40). □

Theorem 6 Let h(x) be a solution of (3), and let h^{(q_2,...,q_m)}(0), with q_2 + ... + q_m = n, be any of its n-th derivatives. Then the following inequality holds:

$$h^{(q_2,\cdots,q_m)}(0) = \frac{(2q_2 + 3q_3 + \cdots + mq_m)!}{(q_2 + 2q_3 + \cdots + (m-1)q_m + 1)!} \le n!\,\frac{1}{n}\,\frac{m^{2q_2+3q_3+\cdots+mq_m}}{(m-1)^{q_2+2q_3+\cdots+(m-1)q_m}}. \qquad (41)$$

Proof Consider T = 2q_2 + 3q_3 + ... + mq_m. Then

$$h^{(q_2\cdots q_m)}(0) = \frac{T!}{(T-n+1)!}. \qquad (42)$$

Taking logarithms,

$$\log\bigl(h^{(q_2\cdots q_m)}(0)\bigr) = \log\!\left(\frac{T!}{(T-n+1)!}\right) = \sum_{i=T-n+2}^{T}\log(i) \le \int_{T-n+2}^{T+1}\log(x)\,dx = \bigl[x\log(x) - x\bigr]_{T-n+2}^{T+1} = \log\bigl((T+1)^{T+1}\bigr) - T - 1 - \bigl(\log\bigl((T-n+2)^{T-n+2}\bigr) - T + n - 2\bigr) = \log\!\left(\frac{(T+1)^{T+1}}{(T-n+2)^{T-n+2}}\right) - n + 1. \qquad (43)$$

Therefore,

$$h^{(q_2\cdots q_m)}(0) \le \frac{(T+1)^{T+1}}{(T-n+2)^{T-n+2}}\,e^{-n+1}. \qquad (44)$$

Setting X = T + 1 in (44) and taking K = (m − 2)q_2 + (m − 3)q_3 + ... + q_{m−1} − 1, in agreement with (40) we get

$$h^{(q_2\cdots q_m)}(0) \le \frac{(T+1)^{T+1}}{(T-n+2)^{T-n+2}}\,e^{-n+1} = \frac{X^X}{(X-n+1)^{X-n+1}}\,e^{-n+1} \le \frac{(X+K)^X}{(X-n+1+K)^{X-n+1}}\,e^{-n+1} \le \frac{(mn)^T}{((m-1)n)^{T-n+1}}\,e^{-n+1} \le \frac{m^T}{(m-1)^{T-n}}\,n^{n-1}e^{-n+1} = \frac{1}{n}\,\frac{m^T}{(m-1)^{T-n}}\,n^n e^{-n+1}, \qquad (45)$$

since X + K = T + 1 + K = mn and X − n + 1 + K = T + 2 − n + K = (m − 1)n + 1. And taking into account (38),

$$h^{(q_2\cdots q_m)}(0) \le \frac{1}{n}\,\frac{m^T}{(m-1)^{T-n}}\,n!, \qquad (46)$$

and the inequality is proven. □
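As a quick sanity check of (41) (a small Python sketch of ours, not part of the paper), one can compare the exact value given by (37) with the bound for a few random multi-indices.

```python
import random
from math import factorial

def exact(q, m):
    # left-hand side of (41), i.e. formula (37)
    T = sum(i * qi for i, qi in zip(range(2, m + 1), q))
    S = sum((i - 1) * qi for i, qi in zip(range(2, m + 1), q)) + 1
    return factorial(T) / factorial(S)

def bound(q, m):
    # right-hand side of (41)
    n = sum(q)
    T = sum(i * qi for i, qi in zip(range(2, m + 1), q))
    return factorial(n) / n * m ** T / (m - 1) ** (T - n)

random.seed(0)
for _ in range(5):
    m = random.randint(2, 6)
    q = [random.randint(0, 3) for _ in range(m - 1)]
    if sum(q) == 0:
        continue
    assert exact(q, m) <= bound(q, m) + 1e-9
    print(m, q, exact(q, m), bound(q, m))
```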

Theorem 7 Given

$$T_1 = \frac{m^m}{(m-1)^{m-1}},$$

the inequality

$$h^{(q_2,\cdots,q_m)}(0) \le n!\,\frac{1}{T_1}\,\frac{1}{n}\,T_1^{\,n}, \quad \forall\, n \ge 1,\ \forall\, m \ge 2, \qquad (47)$$

holds, with q_2 + ... + q_m = n.

Proof First of all, for the sake of clarity, we introduce the notation

$$d(q_2,\cdots,q_m) = \frac{h^{(q_2,\cdots,q_m)}(0)}{q_2!\cdots q_m!}. \qquad (48)$$

Let P_1(x) and Q_1(x) be the polynomials

$$P_1(x) = m(mx + m - 1)\cdots(mx + 1),\ \forall x \ge 0, \quad\text{and}\quad Q_1(x+1) = ((m-1)x + m)\cdots((m-1)x + 2),\ \forall x \ge 0. \qquad (49)$$

It is easy to show that the function

$$f(x) = \frac{P_1(x)(x+1)}{Q_1(x+1)\,x}$$

verifies:

1. if m = 2:
   a. it is increasing in [4, ∞);
   b. its restriction to the integers, {f(n)}_{n≥4}, is an increasing sequence;
2. if m ≥ 3:
   a. it is increasing in [3, ∞);
   b. its restriction to the integers, {f(n)}_{n≥3}, is an increasing sequence;
3. the sequence f(n) converges to T_1 as n → ∞.

Therefore, if m = 2, f(n) ≤ T_1 for all n ≥ 4, and if m ≥ 3, f(n) ≤ T_1 for all n ≥ 3.

First, in the case m = 2, (47) holds for n = 1, 2, 3. For n = k, k > 3, we can write

$$\frac{d(k+1)(k+1)}{d(k)\,k} = \frac{P_1(k)(k+1)}{Q_1(k+1)\,k} = f(k) \le T_1.$$

Hence

$$d(k+1) \le T_1\,d(k)\,\frac{k}{k+1}.$$

The result is concluded by induction.

Now, let us consider the case m ≥ 3. Then our plan is:

1. We are going to prove that

$$d(0,\ldots,0,n) \le \frac{1}{n}\,T_1^{\,n-1}, \quad \forall\, n \ge 1. \qquad (50)$$

Notice that the number of arguments of d is m − 1.
2. Using (50), (47) will be shown.

1. For m ≥ 3, the inequalities

$$f(1) = 2m \le f(2) = \frac{9m-3}{4} \le f(3) = \frac{4(4m-2)(4m-1)}{9(3m-1)}$$

are satisfied. Then {f(n)}_{n≥1} is increasing. For n = 1, d(0, ..., 0, 1) satisfies (50). For n = k, k > 1, we have that

$$\frac{d(0,\ldots,0,k+1)(k+1)}{d(0,\ldots,0,k)\,k} = \frac{P_1(k)(k+1)}{Q_1(k+1)\,k} = f(k) \le T_1.$$

Hence

$$d(0,\ldots,0,k+1) \le d(0,\ldots,0,k)\,\frac{k}{k+1}\,T_1.$$

Therefore (50) follows by the induction hypothesis.

2. For n = 0, (47) is obvious. For n > 0,

$$d(0,\ldots,0,n)\,n! = \frac{(mn)!}{((m-1)n+1)!} = \frac{(mq_2 + \cdots + mq_m)!}{((m-1)q_2 + (m-1)q_3 + \cdots + (m-1)q_m + 1)!} = \frac{mn}{(m-1)n+1}\cdot\frac{mn-1}{(m-1)n}\cdots\frac{(2q_2 + 3q_3 + \cdots + mq_m)!}{(q_2 + 2q_3 + \cdots + (m-1)q_m + 1)!}. \qquad (51)$$

Notice that the last term of the product in (51) equals d(q_2, ..., q_m) q_2! ... q_m!. Since

$$\frac{2q_2 + 3q_3 + \cdots + mq_m + K}{q_2 + 2q_3 + \cdots + (m-1)q_m + 1 + K} \ge 1,$$

with q_2 + ... + q_m = n ≥ 1, for every integer K ≥ 1, all the factors of the product in (51) are greater than or equal to 1. Hence d(q_2, ..., q_m) q_2! ... q_m! ≤ d(0, ..., 0, n) n! ≤ (1/n) T_1^{n−1} n! by (50), which is (47), so the proof is completed. □

Theorem 8 Let h(x) be a solution of (3), and let h^{(q_2,...,q_m)}(0), with q_2 + ... + q_m = n, be any of its n-th derivatives. Then the following inequality holds:

$$h^{(q_2,\cdots,q_m)}(0) = \frac{(2q_2 + 3q_3 + \cdots + mq_m)!}{(q_2 + 2q_3 + \cdots + (m-1)q_m + 1)!} \ge n!\,\frac{1}{n}\,\frac{1}{3.75}\,3.75^{\,n}, \quad \forall n \ge 1. \qquad (52)$$

Proof The proof is similar to the previous one. □

4 Construction of the inverse function of a polynomial

In this section, from the analytic function h(x), the inverse function of a polynomial, f_P, is constructed.

Theorem 9 Let h : R^{m−1} → R be the analytic function

$$h(x) = \sum_{n=0}^{\infty}\;\sum_{q_2+\cdots+q_m=n} d(q_2,\cdots,q_m)\,x_2^{q_2}\cdots x_m^{q_m}, \qquad (53)$$

with d(q_2, ..., q_m) introduced in (48). Then h converges in the region

$$R = \left\{(x_2,\cdots,x_m) \in \mathbb{R}^{m-1} :\ \left|\frac{m^m}{(m-1)^{m-1}}x_m\right| + \cdots + \left|\frac{m^3}{(m-1)^2}x_3\right| + \left|\frac{m^2}{m-1}x_2\right| < 1\right\}. \qquad (54)$$

Furthermore, h is a solution of (3).


Proof Given the series

$$S(x) = \sum_{n=1}^{\infty}\;\sum_{q_2+\cdots+q_m=n}\frac{1}{n}\,\frac{n!}{q_2!\cdots q_m!}\left(\frac{m^2}{m-1}|x_2|\right)^{q_2}\cdots\left(\frac{m^m}{(m-1)^{m-1}}|x_m|\right)^{q_m}, \qquad (55)$$

and the function

$$C(x) = \frac{m^2}{m-1}|x_2| + \cdots + \frac{m^m}{(m-1)^{m-1}}|x_m|, \qquad (56)$$

using the Taylor expansion

$$\log(1 - C(x)) = -\sum_{n=1}^{\infty}\frac{C(x)^n}{n}, \qquad (57)$$

and in agreement with

$$C(x)^n = \sum_{q_2+\cdots+q_m=n}\frac{n!}{q_2!\cdots q_m!}\left(\frac{m^2}{m-1}|x_2|\right)^{q_2}\cdots\left(\frac{m^m}{(m-1)^{m-1}}|x_m|\right)^{q_m}, \qquad (58)$$

we have

$$\log(1 - C(x)) = -S(x). \qquad (59)$$

For these reasons, and taking into account (41) and (48), we can proceed as follows:

$$|h(x)| \le 1 + \sum_{n=1}^{\infty}\;\sum_{q_2+\cdots+q_m=n} d(q_2,\cdots,q_m)|x_2|^{q_2}\cdots|x_m|^{q_m} \le 1 + \sum_{n=1}^{\infty}\;\sum_{q_2+\cdots+q_m=n}\frac{1}{n}\,\frac{n!}{q_2!\cdots q_m!}\left(\frac{m^2}{m-1}|x_2|\right)^{q_2}\cdots\left(\frac{m^m}{(m-1)^{m-1}}|x_m|\right)^{q_m} = 1 - \log\!\left(1 - \left(\frac{m^2}{m-1}|x_2| + \cdots + \frac{m^m}{(m-1)^{m-1}}|x_m|\right)\right). \qquad (60)$$

Thus, the result is proven. □
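A small sketch (Python, our own illustration) that sums the series (53) up to a given total degree and checks that the value approximately satisfies the functional equation (3) at a point of the region R.

```python
from math import factorial
from itertools import product

def d(q, m):
    """d(q_2, ..., q_m) = h^{(q_2,...,q_m)}(0) / (q_2! ... q_m!), see (37) and (48)."""
    T = sum(i * qi for i, qi in zip(range(2, m + 1), q))
    S = sum((i - 1) * qi for i, qi in zip(range(2, m + 1), q)) + 1
    den = factorial(S)
    for qi in q:
        den *= factorial(qi)
    return factorial(T) / den

def h_series(x, degree):
    """Partial sum of (53) up to total degree `degree` at x = (x_2, ..., x_m)."""
    m = len(x) + 1
    total = 0.0
    for q in product(range(degree + 1), repeat=m - 1):
        if sum(q) <= degree:
            term = d(q, m)
            for xi, qi in zip(x, q):
                term *= xi ** qi
            total += term
    return total

# a point inside the region R of (54) for m = 3: (9/2)|x_2| + (27/4)|x_3| < 1
x = (0.05, 0.02)
s = h_series(x, 12)
print(s, x[1] * s**3 + x[0] * s**2 - s + 1)   # the residual should be close to 0
```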

Theorem 10 Given

1. the polynomial function

$$y = P(x) = a_0 + a_1 x + \cdots + a_m x^m, \qquad (61)$$

where a_0, a_1, ..., a_m are real numbers with a_1, a_m ≠ 0;

2. the functions X_i, 2 ≤ i ≤ m,

$$X_i(y) = \frac{(a_0 - y)^{i-1}a_i}{(-a_1)^i}; \qquad (62)$$

3. the function

$$C_P(y) = \left|X_2(y)\frac{m^2}{m-1}\right| + \cdots + \left|X_m(y)\frac{m^m}{(m-1)^{m-1}}\right|. \qquad (63)$$

Then

1. The series

$$f_P(y) = \frac{a_0 - y}{-a_1}\,h(X_2(y),\cdots,X_m(y)), \qquad (64)$$

defined in the region

$$R_P = \{y \in \mathbb{R} :\ C_P(y) < 1\}, \qquad (65)$$

is the inverse function of P(x), h(x) being the function given in (53).
2. If f_P(0) is well defined, then it is a root of P(x), which will be called r.
3. r is either the smallest positive root, if a_0/(−a_1) > 0, or the greatest negative root, if a_0/(−a_1) < 0.

Proof

1. In fact, as h(x) is a solution of (3), then

$$f_P(y) = \frac{a_0 - y}{-a_1}\,h(X_2(y),\ldots,X_m(y)) = \frac{a_0 - y}{-a_1}\bigl(1 + X_2(y)h^2(X_2(y),\cdots,X_m(y)) + \cdots + X_m(y)h^m(X_2(y),\cdots,X_m(y))\bigr). \qquad (66)$$

Substituting (62) in (66), one gets

$$y = a_0 + a_1 f_P(y) + a_2 f_P(y)^2 + \cdots + a_m f_P(y)^m.$$

2. It is obvious.
3. Four cases are distinguished: a_0 > 0, a_1 < 0; a_0 > 0, a_1 > 0; a_0 < 0, a_1 < 0; and a_0 < 0, a_1 > 0.

Let a_0 > 0 and a_1 < 0. Suppose that r = f_P(0) < 0. Since f_P(0) is well defined, f_P(y) exists at least in the interval [0, a_0]. As P′(0) = a_1 < 0, there exists δ > 0 such that P′(x) < 0 for all x ∈ (−δ, δ). Let x_0 < 0, with x_0 ∈ (−δ, δ). Then y_0 = P(x_0) > a_0. Since P is continuous in [r, x_0], there exists x_1 ∈ [r, x_0] such that P(x_1) = a_0 < y_0. Therefore f_P(a_0) = 0 and f_P(a_0) = x_1, with x_1 ≠ 0. This is a contradiction, which proves that r = f_P(0) > 0.

On the other hand, r is the smallest positive root, since f_P is continuous, with f_P(a_0) = 0.

The proofs of the other cases are very similar. □


Definition 3 Given the polynomial P(x) = a_0 + a_1 x + ... + a_m x^m, and according to (65), if

$$C_P(0) = \frac{m^2}{m-1}\left|\frac{a_0a_2}{(-a_1)^2}\right| + \frac{m^3}{(m-1)^2}\left|\frac{a_0^2a_3}{(-a_1)^3}\right| + \cdots + \frac{m^m}{(m-1)^{m-1}}\left|\frac{a_0^{m-1}a_m}{(-a_1)^m}\right| < 1, \qquad (67)$$

then the series

$$f_P(0) = \frac{a_0}{-a_1}\sum_{n=0}^{\infty}\;\sum_{q_2+\cdots+q_m=n} d(q_2,\cdots,q_m)\left(\frac{a_0a_2}{(-a_1)^2}\right)^{q_2}\left(\frac{a_0^2a_3}{(-a_1)^3}\right)^{q_3}\cdots\left(\frac{a_0^{m-1}a_m}{(-a_1)^m}\right)^{q_m} \qquad (68)$$

is well defined, and it is said to be the associated series of P(x), from now on the AS of P(x).

Example 1 Find the smallest positive root of the polynomial P(x) = x^5 + 3x^4 + x^3 − 2x^2 − 6x + 3/2 = 0.

In agreement with (67),

$$C_P(0) = \frac{5^2}{4}\left(\frac{3}{6^2}\right) + \frac{5^3}{4^2}\left(\frac{9}{4\cdot 6^3}\right) + \frac{5^4}{4^3}\left(\frac{81}{8\cdot 6^4}\right) + \frac{5^5}{4^4}\left(\frac{81}{16\cdot 6^5}\right) = \frac{89975}{131072} < 1. \qquad (69)$$

(C_P(0) can be computed using the code shown in Table 1, by defining the polynomial as S := x^5 + 3x^4 + x^3 − 2x^2 − 6x + 3/2 and running the command CP.) Thus, the AS of P(x) is well defined, and the required root is exactly provided by the series

$$f_P(0) = \frac{1}{4}\sum_{n=0}^{\infty}\;\sum_{q_2+\cdots+q_5=n}\frac{(2q_2 + 3q_3 + 4q_4 + 5q_5)!}{(1 + q_2 + 2q_3 + 3q_4 + 4q_5)!\,q_2!\cdots q_5!}\left(\frac{-3}{6^2}\right)^{q_2}\left(\frac{9}{4\cdot 6^3}\right)^{q_3}\left(\frac{81}{8\cdot 6^4}\right)^{q_4}\left(\frac{81}{16\cdot 6^5}\right)^{q_5}. \qquad (70)$$

The results obtained for n = 0, 1, 2, 3, 4 are 0.25, 0.2338867188, 0.2355127335, 0.2353500022 and 0.2353629259, respectively, 0.2353... being (to four digits) the smallest positive root of the polynomial. The calculations have been performed following the code introduced in Table 1, by defining the polynomial as S := x^5 + 3x^4 + x^3 − 2x^2 − 6x + 3/2 and running the commands SumaParcial[0], SumaParcial[1], SumaParcial[2], SumaParcial[3] and SumaParcial[4].

Table 1 Implementation using Mathematica software for computing the AS of any polynomial

S := (Input for setting the polynomial to solve)
m := Exponent[S, x]; (Input for setting the polynomial degree)
a0 := Coefficient[S, x, 0];
a1 := Coefficient[S, x, 1];
(Input for computing d(q2, ..., qm) introduced in (48), with l = {q2, ..., qm})
d[l_] := Block[{t1, t2, t3, t4, t5, t6},
  t1 := Table[0, {i, 1, m - 1}];
  t2 := Table[i, {i, 2, m}];
  t3 := Table[i, {i, 1, m - 1}];
  t4 := Apply[Plus, t2.l];
  t5 := Apply[Plus, t3.l] + 1;
  t6 := Apply[Plus, l];
  If[l == t1, 1, t4!/(t5! Apply[Times, l!])]]
k := a0/(-a1);
(Input for computing the list {X2(0), ..., Xm(0)} according to (62))
X := Block[{t1, t2, t3, t4},
  t1 := Drop[CoefficientList[S, x], 2];
  t2 := a0/(-a1);
  t3 := Table[t2^i, {i, 1, m - 1}];
  t4 = 1/(-a1) t1 t3];
(Input for computing the m1 first terms of the AS)
SumaParcial[m1_] := N[Block[{f1, f2, l1, f3, f4, L1, L2},
  f1 = List[##] &;
  f2 = Flatten[Array[f1, Table[#1 + 1, {m - 1}], 0], m - 2] &;
  l1[q_] := Select[f2[#1], Apply[Plus, ##] == q &] &;
  f3[l2_] := Apply[Times, l2];
  f4[l3_] := d[l3];
  L1[j_] := Map[f3, Table[X^l1[j][j][[v]], {v, 1, Length[l1[j][j]]}]] //. {0^0 -> 1};
  L2[j_] := Map[f4, l1[j][j]];
  Suma[p1_] := Apply[Plus, L1[p1] L2[p1]];
  k Sum[Suma[i], {i, 0, m1}]], 10];
(Input for computing CP(0) in concordance with (67))
CP := Apply[Plus, Abs[m Table[(m/(m - 1))^i, {i, 1, m - 1}] X]]
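As an alternative to the Mathematica code of Table 1, the partial sums of (70) can be reproduced with a short script. The following Python sketch is our own illustration; it directly implements (67) and (68) for the polynomial of Example 1.

```python
from math import factorial
from itertools import product

# P(x) = 3/2 - 6x - 2x^2 + x^3 + 3x^4 + x^5, written as a = [a_0, ..., a_m]
a = [1.5, -6.0, -2.0, 1.0, 3.0, 1.0]
m = len(a) - 1

def d(q):
    """d(q_2, ..., q_m) from (48) and (37)."""
    T = sum(i * qi for i, qi in zip(range(2, m + 1), q))
    S = sum((i - 1) * qi for i, qi in zip(range(2, m + 1), q)) + 1
    den = factorial(S)
    for qi in q:
        den *= factorial(qi)
    return factorial(T) / den

# X_i(0) = a_0^{i-1} a_i / (-a_1)^i, 2 <= i <= m, as in (62)
X = [a[0] ** (i - 1) * a[i] / (-a[1]) ** i for i in range(2, m + 1)]

# convergence test (67)
CP0 = sum(abs(Xi) * m ** i / (m - 1) ** (i - 1) for i, Xi in zip(range(2, m + 1), X))
print("C_P(0) =", CP0)          # must be < 1 for the AS to be well defined

def partial_sum(N):
    """Sum of the terms of (68) with q_2 + ... + q_m <= N, times a_0/(-a_1)."""
    s = 0.0
    for q in product(range(N + 1), repeat=m - 1):
        if sum(q) <= N:
            term = d(q)
            for Xi, qi in zip(X, q):
                term *= Xi ** qi
            s += term
    return a[0] / (-a[1]) * s

for N in range(5):
    print(N, partial_sum(N))    # 0.25, 0.2338..., 0.2355..., ... as reported above
```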

If the series is not convergent, then Theorem 11 and Corollaries 3 and 4 provide a solution to this question, by shifting the polynomial P along the X axis.

Theorem 11 Given the polynomial function (61) and t ∈ R with P′(t) ≠ 0, we consider P_t(x) = P(t + x). Then the function

$$f_{P_t}(y) = \frac{P(t) - y}{-P^{(1)}(t)}\,h\bigl(Y_2(t,y),\ldots,Y_m(t,y)\bigr), \qquad (71)$$

defined in

$$R_{P_t} = \left\{(t,y) \in \mathbb{R}^2 :\ |Y_2(t,y)|\frac{m^2}{m-1} + \cdots + |Y_m(t,y)|\frac{m^m}{(m-1)^{m-1}} < 1\right\},$$

is the inverse of P_t(x) around y = P(t), where

$$Y_i(t,y) = \frac{(P(t) - y)^{i-1}P^{(i)}(t)}{i!\,(-P^{(1)}(t))^i},\quad 2 \le i \le m. \qquad (72)$$

Proof Let us write the Taylor formula of P(x) around x = t:

$$P(y) = P(t) + P^{(1)}(t)(y-t) + \frac{P^{(2)}(t)}{2!}(y-t)^2 + \cdots + \frac{P^{(m)}(t)}{m!}(y-t)^m. \qquad (73)$$

Replacing (y − t) by x in (73) leads to

$$P_t(x) = P(x+t) = P(t) + P^{(1)}(t)x + \frac{P^{(2)}(t)}{2!}x^2 + \cdots + \frac{P^{(m)}(t)}{m!}x^m.$$

The result follows by applying Theorem 10 to P_t(x). □

Corollary 3 Under the same assumptions as Theorem 11, let f_P be the function

$$f_P(y) = t + f_{P_t}(y), \qquad (74)$$

with (t, y) ∈ R_{P_t}. Then (74) is the inverse of P(x) around x = t.

Proof Consider y = P(t_1) with (t_1, y) ∈ R_{P_t}. Then

$$f_P(y) = t + f_{P_t}(y) = t + f_{P_t}(P(t_1)) = t + f_{P_t}(P_t(t_1 - t)) = t + t_1 - t = t_1.$$

The result follows. □

It is easy to prove the following Corollary.

Corollary 4 With the same hypotheses as Theorem 10, if f_P(0) is not convergent, and if r is a root of P(x) such that P′(r) ≠ 0, then there exist a neighborhood of r, V_r, and t ∈ V_r such that the series f_{P_t}(0) is convergent, with r = t + f_{P_t}(0).

Example 2 Find the smallest positive root of the polynomial P(x) = 3 − 2x − 3x^2 + 2x^3 − 4x^4 + x^5 = 0.

As C_P(0) > 1, the AS of P(x) is not convergent. According to Corollary 4, let u = 2/3 be a lower bound¹ of the positive roots of P(x), and

$$P_u(x) = P(x + u) = 3 - 2\left(\frac{2}{3} + x\right) - 3\left(\frac{2}{3} + x\right)^2 + 2\left(\frac{2}{3} + x\right)^3 - 4\left(\frac{2}{3} + x\right)^4 + \left(\frac{2}{3} + x\right)^5. \qquad (75)$$

Then C_{P_u}(0) < 1, and therefore the AS of P_u is convergent and the required root is given by f_P(0) = u + f_{P_u}(0).

¹This bound can be computed following any of the well-known methods, see [25].
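The shift used in Example 2 is easy to reproduce numerically. The sketch below (Python, our own code, not the authors') computes the coefficients of P_t(x) = P(t + x) via the Taylor formula (73) and evaluates condition (67) for t = 0 and t = 2/3.

```python
from math import factorial

# P(x) = 3 - 2x - 3x^2 + 2x^3 - 4x^4 + x^5
a = [3.0, -2.0, -3.0, 2.0, -4.0, 1.0]
m = len(a) - 1

def shifted_coeffs(t):
    """[P(t), P'(t)/1!, P''(t)/2!, ...] = coefficients of P_t(x) = P(t + x)."""
    c, out = list(a), []
    for i in range(m + 1):
        out.append(sum(ck * t ** k for k, ck in enumerate(c)) / factorial(i))
        c = [k * ck for k, ck in enumerate(c)][1:]   # differentiate once
    return out

def CP0(b):
    """Condition (67) applied to the polynomial with coefficients b = [b_0, ..., b_m]."""
    return sum(abs(b[0] ** (i - 1) * b[i] / (-b[1]) ** i) * m ** i / (m - 1) ** (i - 1)
               for i in range(2, m + 1))

print(CP0(shifted_coeffs(0.0)))      # > 1: the AS of P is not convergent
print(CP0(shifted_coeffs(2.0 / 3)))  # < 1: the AS of P_u, u = 2/3, is convergent
```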

Proposition 1 Suppose that

$$C^1_P(y) = |X_2(y)| + \cdots + |X_m(y)| \le \frac{(m-1)^{m-1}}{m^m}, \qquad (76)$$

where X_i(y), 2 ≤ i ≤ m, are defined by (62). Then f_P, defined by (64), satisfies the inequality

$$|f_P(y)| \le \left|\frac{a_0 - y}{-a_1}\right|\frac{m}{m-1}, \quad \forall\, y \in R_P. \qquad (77)$$

Proof From (51) we arrive at

$$d(0,\ldots,0,n)\,n! \ge d(q_2,\ldots,q_m)\,q_2!\cdots q_m!, \quad\text{with } q_2 + \cdots + q_m = n. \qquad (78)$$

Therefore

$$|f_P(y)| \le \left|\frac{a_0 - y}{-a_1}\right|\sum_{n=0}^{\infty}\;\sum_{q_2+\cdots+q_m=n} d(q_2,\cdots,q_m)\left|\frac{(a_0-y)a_2}{(-a_1)^2}\right|^{q_2}\cdots\left|\frac{(a_0-y)^{m-1}a_m}{(-a_1)^m}\right|^{q_m}$$
$$\le \left|\frac{a_0 - y}{-a_1}\right|\sum_{n=0}^{\infty}\;\sum_{q_2+\cdots+q_m=n}\frac{d(0,\cdots,0,n)\,n!}{q_2!\cdots q_m!}\left|\frac{(a_0-y)a_2}{(-a_1)^2}\right|^{q_2}\cdots\left|\frac{(a_0-y)^{m-1}a_m}{(-a_1)^m}\right|^{q_m}$$
$$= \left|\frac{a_0 - y}{-a_1}\right|\sum_{n=0}^{\infty} d(0,\cdots,0,n)\left(\left|\frac{(a_0-y)a_2}{(-a_1)^2}\right| + \cdots + \left|\frac{(a_0-y)^{m-1}a_m}{(-a_1)^m}\right|\right)^n \le \left|\frac{a_0 - y}{-a_1}\right|\sum_{n=0}^{\infty} d(0,\cdots,0,n)\left(\frac{(m-1)^{m-1}}{m^m}\right)^n. \qquad (79)$$

Taking into account that 1 is the smallest positive root of x^m − mx + (m − 1) = 0, then

$$\frac{m-1}{m}\sum_{n=0}^{\infty} d(0,\cdots,0,n)\left(\frac{(m-1)^{m-1}}{m^m}\right)^n = 1, \qquad (80)$$

and the result follows. □


5 Bound functions of h(x)

Computing the series f_P(0) has a high operational cost. So, in this section an upper and a lower bound are constructed with the purpose of approximating its value; Example 3 shows their effectiveness. Finally, Example 4 illustrates how the inverse functions f_P can be used to find an initial approximation in the case of polynomial systems.

Definition 4 Let us assume, without loss of generality, that the coordinates x_2, ..., x_p are negative and x_{p+1}, ..., x_m positive. Then we define P_n(x), N_n(x), E_n(x) and O_n(x), for every positive integer n, by

$$P_n(x) = \sum_{2t+v=n}\;\sum_{\substack{q_2+\cdots+q_p=2t\\ q_{p+1}+\cdots+q_m=v}} (-1)^{2t}\,\frac{h^{(q_2\cdots q_m)}(0)}{q_2!\cdots q_m!}\,|x_2|^{q_2}\cdots|x_m|^{q_m}, \qquad (81)$$

$$N_n(x) = -\sum_{2t+1+v=n}\;\sum_{\substack{q_2+\cdots+q_p=2t+1\\ q_{p+1}+\cdots+q_m=v}} (-1)^{2t+1}\,\frac{h^{(q_2\cdots q_m)}(0)}{q_2!\cdots q_m!}\,|x_2|^{q_2}\cdots|x_m|^{q_m}, \qquad (82)$$

$$E_n(x) = \sum_{2t+v=n}\;\sum_{\substack{q_2+\cdots+q_p=2t\\ q_{p+1}+\cdots+q_m=v}} \frac{1}{n}\,\frac{n!}{q_2!\cdots q_m!}\,|x_2|^{q_2}\cdots|x_m|^{q_m}, \qquad (83)$$

$$O_n(x) = \sum_{2t+1+v=n}\;\sum_{\substack{q_2+\cdots+q_p=2t+1\\ q_{p+1}+\cdots+q_m=v}} \frac{1}{n}\,\frac{n!}{q_2!\cdots q_m!}\,|x_2|^{q_2}\cdots|x_m|^{q_m}. \qquad (84)$$

Corollary 5 h(x) can be written as

$$h(x) = h^+(x) - h^-(x), \quad\text{with}\quad h^+(x) = 1 + \sum_{n=1}^{\infty} P_n(x) \ \text{ and }\ h^-(x) = \sum_{n=1}^{\infty} N_n(x). \qquad (85)$$

Proof It follows from Definition 4. □

Remark 1 Given the functions

$$V(x) = x_2 + \cdots + x_m \quad\text{and}\quad U(x) = |x_2| + \cdots + |x_m|, \qquad (86)$$

using the Taylor expansions of the functions log(1 − V(x)) and log(1 − U(x)), in a similar way as was done in (59), and in agreement with Definition 4, we have

$$\log\bigl((1 - V(x))(1 - U(x))\bigr) = -2\sum_{n=1}^{\infty} E_n(x), \qquad (87)$$

$$\log\!\left(\frac{1 - V(x)}{1 - U(x)}\right) = 2\sum_{n=1}^{\infty} O_n(x). \qquad (88)$$

Theorem 12 Under the same hypotheses as Theorem 9, the function h(x) is bounded by the functions H_u(x) and H_l(x) as follows.

1. If |T_1 x_2| + ... + |T_1 x_m| < 1, with

$$T_1 = \frac{m^m}{(m-1)^{m-1}},$$

and the coordinates x_2, x_3, ..., x_m are positive, then

$$h(x) \le H_u(x), \qquad H_u(x) = 1 - \frac{1}{T_1}\log\bigl(1 - V(T_1x_2,\cdots,T_1x_m)\bigr). \qquad (89)$$

Also h(x) ≥ H_l(x), with

$$H_l(x) = 1 - \frac{1}{3.75}\log\bigl(1 - V(3.75x_2,\cdots,3.75x_m)\bigr). \qquad (90)$$

2. On the contrary, if there are negative coordinates, say x_2, ..., x_p, then

$$h(x) \le H_u(x), \qquad H_u(x) = 1 - \frac{1}{2T_1}\log\bigl((1 - V(T_1x_2,\cdots,T_1x_m))(1 - U(T_1x_2,\cdots,T_1x_m))\bigr) - \frac{1}{7.5}\log\!\left(\frac{1 - V(3.75x_2,\cdots,3.75x_m)}{1 - U(3.75x_2,\cdots,3.75x_m)}\right). \qquad (91)$$

Also h(x) ≥ H_l(x_2, ..., x_m), with

$$H_l(x) = 1 - \frac{1}{7.5}\log\bigl((1 - V(3.75x_2,\cdots,3.75x_m))(1 - U(3.75x_2,\cdots,3.75x_m))\bigr) - \frac{1}{2T_1}\log\!\left(\frac{1 - V(T_1x_2,\cdots,T_1x_m)}{1 - U(T_1x_2,\cdots,T_1x_m)}\right). \qquad (92)$$


Proof

1. Assume that all the coordinates are positive. From (47) we arrive at

$$h(x) = 1 + \sum_{n=1}^{\infty}\;\sum_{q_2+\cdots+q_m=n}\frac{h^{(q_2\cdots q_m)}(0)}{q_2!\cdots q_m!}\,x_2^{q_2}\cdots x_m^{q_m} \le 1 + \frac{1}{T_1}\sum_{n=1}^{\infty}\;\sum_{q_2+\cdots+q_m=n}\frac{T_1^n}{n}\,\frac{n!}{q_2!\cdots q_m!}\,x_2^{q_2}\cdots x_m^{q_m} = 1 + \frac{1}{T_1}\sum_{n=1}^{\infty}\;\sum_{q_2+\cdots+q_m=n}\frac{1}{n}\,\frac{n!}{q_2!\cdots q_m!}\,(T_1x_2)^{q_2}\cdots(T_1x_m)^{q_m}. \qquad (93)$$

Following the same reasoning as that used in formulas (55), (56), (57), (58) and (59), one gets

$$1 + \frac{1}{T_1}\sum_{n=1}^{\infty}\;\sum_{q_2+\cdots+q_m=n}\frac{1}{n}\,\frac{n!}{q_2!\cdots q_m!}\,(T_1x_2)^{q_2}\cdots(T_1x_m)^{q_m} = 1 - \frac{1}{T_1}\log\bigl(1 - (T_1x_2 + \cdots + T_1x_m)\bigr), \qquad (94)$$

and (89) is proven. In a similar way, if (52) is used, then (90) is obtained.

2. Let us assume now that there are negative coordinates. In this case we only have to prove (91), since the proof of the remaining inequality is very similar. Taking into account inequalities (47) and (52) again, we bound P_n and N_n as follows:

$$0 \le \frac{1}{3.75}E_n(3.75x_2,\cdots,3.75x_m) \le P_n(x) \le \frac{1}{T_1}E_n(T_1x_2,\cdots,T_1x_m), \quad \forall n > 0,$$
$$0 \le \frac{1}{3.75}O_n(3.75x_2,\cdots,3.75x_m) \le N_n(x) \le \frac{1}{T_1}O_n(T_1x_2,\cdots,T_1x_m), \quad \forall n > 0. \qquad (95)$$

Therefore, from (87) and (88), we can write

$$-\frac{1}{7.5}\log\bigl((1 - V(3.75x_2,\cdots,3.75x_m))(1 - U(3.75x_2,\cdots,3.75x_m))\bigr) \le \sum_{n=1}^{\infty}P_n(x) \le -\frac{1}{2T_1}\log\bigl((1 - V(T_1x_2,\cdots,T_1x_m))(1 - U(T_1x_2,\cdots,T_1x_m))\bigr), \qquad (96)$$

$$\frac{1}{7.5}\log\!\left(\frac{1 - V(3.75x_2,\cdots,3.75x_m)}{1 - U(3.75x_2,\cdots,3.75x_m)}\right) \le \sum_{n=1}^{\infty}N_n(x) \le \frac{1}{2T_1}\log\!\left(\frac{1 - V(T_1x_2,\cdots,T_1x_m)}{1 - U(T_1x_2,\cdots,T_1x_m)}\right). \qquad (97)$$

The result follows from Corollary 5. □

Corollary 6 Given t ∈ R with P′(t) ≠ 0, let F^u_{P_t}(y) and F^l_{P_t}(y) be the functions

$$F^u_{P_t}(y) = \frac{P(t) - y}{-P'(t)}\,H_u\bigl(Y_2(t,y),\cdots,Y_m(t,y)\bigr), \qquad (98)$$

$$F^l_{P_t}(y) = \frac{P(t) - y}{-P'(t)}\,H_l\bigl(Y_2(t,y),\cdots,Y_m(t,y)\bigr), \qquad (99)$$

with Y_i, 2 ≤ i ≤ m, introduced in (72). Then, if

$$\frac{P(t) - y}{-P'(t)} > 0,$$

f_{P_t}(y) can be bounded as follows:

$$F^l_{P_t}(y) \le f_{P_t}(y) \le F^u_{P_t}(y). \qquad (100)$$

On the contrary, if

$$\frac{P(t) - y}{-P'(t)} < 0,$$

$$F^u_{P_t}(y) \le f_{P_t}(y) \le F^l_{P_t}(y). \qquad (101)$$

Proof It is a consequence of Theorem 12 applied to the function h of (71). □
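The bound functions of Theorem 12 and Corollary 6 translate directly into code. The sketch below (Python, our own; the constant 3.75 comes from Theorem 8 and T_1 from Theorem 7) evaluates H_l, H_u at a point and the corresponding bounds F^l_{P_t}(0), F^u_{P_t}(0) for a shifted polynomial.

```python
from math import log, factorial

def H_bounds(x, m):
    """(H_l(x), H_u(x)) as in (89)-(92); x = (x_2, ..., x_m)."""
    T1 = m ** m / (m - 1) ** (m - 1)
    V = lambda c: sum(c * xi for xi in x)          # V(c x_2, ..., c x_m)
    U = lambda c: sum(abs(c * xi) for xi in x)     # U(c x_2, ..., c x_m)
    if all(xi >= 0 for xi in x):                   # case 1 of Theorem 12
        Hu = 1 - log(1 - V(T1)) / T1
        Hl = 1 - log(1 - V(3.75)) / 3.75
    else:                                          # case 2: some coordinate negative
        Hu = (1 - log((1 - V(T1)) * (1 - U(T1))) / (2 * T1)
                - log((1 - V(3.75)) / (1 - U(3.75))) / 7.5)
        Hl = (1 - log((1 - V(3.75)) * (1 - U(3.75))) / 7.5
                - log((1 - V(T1)) / (1 - U(T1))) / (2 * T1))
    return Hl, Hu

def Y_at(a, t):
    """([Y_2(t,0), ..., Y_m(t,0)], P(t), P'(t)) per (72), a = [a_0, ..., a_m]."""
    m = len(a) - 1
    derivs, c = [], list(a)
    for i in range(m + 1):
        derivs.append(sum(ck * t ** k for k, ck in enumerate(c)))      # P^{(i)}(t)
        c = [k * ck for k, ck in enumerate(c)][1:]
    P, dP = derivs[0], derivs[1]
    return [P ** (i - 1) * derivs[i] / (factorial(i) * (-dP) ** i) for i in range(2, m + 1)], P, dP

def F_bounds(a, t):
    """(F^l_{P_t}(0), F^u_{P_t}(0)) of (98)-(99)."""
    Y, P, dP = Y_at(a, t)
    Hl, Hu = H_bounds(Y, len(a) - 1)
    return P / (-dP) * Hl, P / (-dP) * Hu

# example: the polynomial of Example 3 at t = 5/3 (cf. the G^l_P(t_2) step of Table 2)
a = [105.0, 21.0, -50.0, -10.0, 5.0, 1.0]
print(F_bounds(a, 5.0 / 3))
```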

Remark 2 We recall that (a sketch of these computations is given below):

1. if P has positive roots and P(x) = (x − A)Q(x) + R, where R ≥ 0 and A > 0 is the first integer such that all the coefficients of Q are nonnegative, then A is an upper bound of the positive roots of P (if A = 0 then P(x) has no positive roots);

2. if P has positive roots, then B = 1/A, A > 0, is a lower bound of the positive roots of P, A being an upper bound of the positive roots of P_1(x) = x^n P(1/x).

Theorem 13 Let G^l_P and G^u_P be the functions

$$G^l_P(t) = t + F^l_{P_t}(0) \quad\text{and}\quad G^u_P(t) = t + F^u_{P_t}(0). \qquad (102)$$

Assume first that P has positive roots. Consider the sequence t_n, n ≥ 0, with t_0 = 0, given by

$$t_{n+1} = \begin{cases} G^l_P(t_n), & \text{if } |T_1Y_2(t_n,0)| + \cdots + |T_1Y_m(t_n,0)| \le 0.99 \text{ and } P_{t_n}(0)/(-P'_{t_n}(0)) > 0,\\[2pt] t_n + L_{t_n}, & \text{if } |T_1Y_2(t_n,0)| + \cdots + |T_1Y_m(t_n,0)| > 0.99 \text{ or } P_{t_n}(0)/(-P'_{t_n}(0)) < 0, \end{cases} \qquad (103)$$

where L_{t_n} is a lower bound of the positive roots of P_{t_n}, according to the preceding remark. Then t_n converges to the smallest positive root of P.

Suppose now that P has negative roots. Consider the sequence t_n, n ≥ 0, with t_0 = 0, given by

$$t_{n+1} = \begin{cases} G^u_P(t_n), & \text{if } |T_1Y_2(t_n,0)| + \cdots + |T_1Y_m(t_n,0)| \le 0.99 \text{ and } P_{t_n}(0)/(-P'_{t_n}(0)) < 0,\\[2pt] t_n + U_{t_n}, & \text{if } |T_1Y_2(t_n,0)| + \cdots + |T_1Y_m(t_n,0)| > 0.99 \text{ or } P_{t_n}(0)/(-P'_{t_n}(0)) > 0, \end{cases} \qquad (104)$$

where U_{t_n} is an upper bound of the negative roots of P_{t_n}, according to the preceding remark. Then t_n converges to the greatest negative root of P.

Proof We prove the Theorem for sequence (103), since for (104) the proof is practically the same. It is obvious that

$$T_1 \ge 4, \quad \forall\, m \ge 2. \qquad (105)$$

1. t_{n+1} ≥ t_n ≥ 0 for all n. Indeed, if t_{n+1} = G^l_P(t_n), then, under the given conditions, the inequalities

$$0.01 \le 1 - U(T_1Y_2(t_n,0),\cdots,T_1Y_m(t_n,0)) \le 1,$$
$$0.01 \le 1 - V(T_1Y_2(t_n,0),\cdots,T_1Y_m(t_n,0)) \le 2,$$
$$0.01 \le 1 - U(3.75\,Y_2(t_n,0),\cdots,3.75\,Y_m(t_n,0)) \le 1,$$
$$0.01 \le 1 - V(3.75\,Y_2(t_n,0),\cdots,3.75\,Y_m(t_n,0)) \le 2 \qquad (106)$$

hold, and it is satisfied that

$$\log(0.01) \le \log\!\left(\frac{1 - V(T_1Y_2(t_n,0),\cdots,T_1Y_m(t_n,0))}{1 - U(T_1Y_2(t_n,0),\cdots,T_1Y_m(t_n,0))}\right) \le \log(200) \qquad (107)$$

and

$$2\log(0.01) \le \log\bigl((1 - V(3.75\,Y_2(t_n,0),\cdots,3.75\,Y_m(t_n,0)))(1 - U(3.75\,Y_2(t_n,0),\cdots,3.75\,Y_m(t_n,0)))\bigr) \le \log(2). \qquad (108)$$

Therefore, using (105), we have that

$$H_l(Y_2(t_n,0),\cdots,Y_m(t_n,0)) \ge 1 - \frac{\log(2)}{7.5} - \frac{\log(200)}{8} > 0.24528 \qquad (109)$$

and

$$F^l_{P_{t_n}}(0) = \frac{P_{t_n}(0)}{-P'_{t_n}(0)}\,H_l(Y_2(t_n,0),\cdots,Y_m(t_n,0)) \ge 0. \qquad (110)$$

On the contrary, if t_{n+1} = t_n + L_{t_n}, then L_{t_n} is also greater than zero, since it is a lower bound of the positive roots of P_{t_n} according to Remark 2. Consequently, from (110) it follows that t_n is an increasing sequence, with t_n ≥ 0 for all n.

2. The sequence {t_n}_0^∞ is bounded, as we are going to see next. Let r be the smallest positive root of P. In concordance with Corollary 3 we can say that

$$r = f_P(0) = t + f_{P_t}(0) \ge t + F^l_{P_t}(0). \qquad (111)$$

From (111), if r_{t_n} is the smallest positive root of P_{t_n}, then either

$$r = t_n + r_{t_n} = t_n + f_{P_{t_n}}(0) > t_n + F^l_{P_{t_n}}(0) \ge t_n, \qquad (112)$$

or

$$r = t_n + r_{t_n} \ge t_n + L_{t_n} \ge t_n. \qquad (113)$$

Therefore {t_n}_0^∞ is a bounded sequence.

3. Finally, we prove that {t_n}_0^∞ converges to r. It is an increasing and bounded sequence, thus it is convergent. Assume now that T = lim_{n→∞} t_n. Then

$$T = T + \frac{P_T(0)}{-P'_T(0)}\,H_l(Y_2(T,0),\cdots,Y_m(T,0)),$$

and from Corollary 4 there exists an integer N such that t_{n+1} = t_n + F^l_{P_{t_n}}(0) for all n ≥ N, and by (109) H_l(Y_2(T, 0), ..., Y_m(T, 0)) > 0. Consequently P(T)/(−P′(T)) = 0 and T = r. The proof is finished. □

The next algorithm, introduced for finding the positive roots of a polynomial P(x) with precision δ, is based on equation (103) of Theorem 13.

In the following examples, Algorithm 1 can be used either for finding an initial approximation to initialize the search of solutions, or as an alternative algorithm for solving the polynomial equation.

Algorithm 1 Algorithm for finding the real roots of P(x) = a_0 + a_1x + ... + a_mx^m with precision δ.

Step 0. Take i = 0, t_0 = 0 and n = 0.
Step 1. Compute t_{i+1} following equation (103).
  If t_{i+1} = G^l_P(t_i) Then
    compute the error ε_{i+1} as ε_{i+1} = |G^u_P(t_i) − G^l_P(t_i)|
  Else
    ε_{i+1} = 2δ
  End If
Step 2.
  If ε_{i+1} > δ Then
    i = i + 1
    Go to Step 1
  Else
    n = n + 1
    r_n = t_{i+1}   (* the root r_n is achieved with the required precision *)
    If r_n + δ is an upper bound of the positive roots of P(x) (Remark 2) or all the coefficients of P_{t_n} preserve the sign Then
      Stop   (* there are no more positive roots *)
    Else
      Take i = 0 and t_0 = r_n + δ.
      Go to Step 1   (* the search for a new root is initialized *)
    End If
  End If

Notice that ε_{i+1} = 2δ is defined by convenience.

Note: The algorithm for computing the negative roots can trivially be adapted by considering the positive roots of the new polynomial Q(w) = P(−w), i.e., by the change of variable w = −x.
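A compact sketch of the iteration (103) that drives Algorithm 1 for the smallest positive root. It is written in Python and receives, as parameters, helper functions assumed to behave like the sketches given after Corollary 6 and Remark 2 (F_bounds, Y_at, lower_bound_positive_roots); the names and structure are ours, not the paper's, and the restart logic for subsequent roots is omitted.

```python
def smallest_positive_root(a, delta, F_bounds, Y_at, lower_bound_positive_roots):
    """Sketch of the sequence (103): a = [a_0, ..., a_m], delta = precision.

    Assumed behaviour of the callables (see the earlier sketches):
      F_bounds(a, t)                -> (F^l_{P_t}(0), F^u_{P_t}(0))
      Y_at(a, t)                    -> ([Y_2(t,0), ..., Y_m(t,0)], P(t), P'(t))
      lower_bound_positive_roots(b) -> lower bound of the positive roots of the
                                       polynomial with coefficients b
    """
    from math import factorial
    m = len(a) - 1
    T1 = m ** m / (m - 1) ** (m - 1)

    def shifted(t):
        # coefficients of P_t(x) = P(t + x), i.e. P^{(i)}(t)/i!
        c, out = list(a), []
        for i in range(m + 1):
            out.append(sum(ck * t ** k for k, ck in enumerate(c)) / factorial(i))
            c = [k * ck for k, ck in enumerate(c)][1:]
        return out

    t = 0.0
    while True:
        Y, P, dP = Y_at(a, t)
        if T1 * sum(abs(y) for y in Y) <= 0.99 and P / (-dP) > 0:
            Fl, Fu = F_bounds(a, t)                      # step of type G^l_P(t)
            t_next, err = t + Fl, abs(Fu - Fl)
        else:
            L = lower_bound_positive_roots(shifted(t))   # step of type t + L_t
            t_next, err = t + L, 2 * delta
        if err <= delta:
            return t_next
        t = t_next

# usage (with the helpers sketched above):
# r1 = smallest_positive_root([105., 21., -50., -10., 5., 1.], 1e-11,
#                             F_bounds, Y_at, lower_bound_positive_roots)
```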

Example 3 Compute the positive roots of P(x) = 105 + 21x − 50x^2 − 10x^3 + 5x^4 + x^5 with precision δ = 1 × 10^{−11}.

We have solved this example by applying Algorithm 1 in agreement with Table 2, where we have defined SY(t_i) = T_1 Σ_{n=2}^{5} |Y_n(t_i, 0)|.

Remark 3 In the case of r_1, the accuracy goes from three exact digits to fifteen in only one step. In the case of r_2, the accuracy goes from four exact digits to eleven in only one step.

Table 2 Detailed results of the computations of the two positive roots of P(x)

Num. | SY(t_i)            | P_{t_i}(0)/(−P′_{t_i}(0)) | L_{t_{i−1}}   | t_i                                   | ε_i
0    | –                  | –                          | –             | 0                                     | –
1    | 1017.25 > 0.99     | –                          | L_{t_0} = 1   | t_0 + L_{t_0} = 1                     | 2δ
2    | –                  | –                          | –             | –                                     | ε_1 > δ
3    | 8.1 > 0.99         | –                          | L_{t_1} = 2/3 | t_1 + L_{t_1} = 5/3                   | 2δ
4    | –                  | –                          | –             | –                                     | ε_2 > δ
5    | 0.26 < 0.99        | 380/5943 > 0               | –             | G^l_P(t_2) = 1.73204                  | < 0.0002
6    | –                  | –                          | –             | –                                     | ε_3 > δ
7    | 3.8×10^{−5} < 0.99 | 7.4×10^{−6} > 0            | –             | G^l_P(t_3) = 1.732050807568877288     | < 4×10^{−16}
8    | –                  | –                          | –             | r_1 = 1.73205080756887                | ε_4 < δ
9    | –                  | –                          | –             | r_1 + δ is not an upper bound         | –
10   | –                  | −0.002 < 0                 | L_{t_0} = 2/3 | t_0 + L_{t_0} = 2.40071               | 2δ
11   | –                  | –                          | –             | –                                     | ε_1 > δ
12   | 28.8 > 0.99        | –                          | L_{t_1} = 2/9 | t_1 + L_{t_1} = 2.62293               | 2δ
13   | –                  | –                          | –             | –                                     | ε_2 > δ
14   | 0.503 < 0.99       | 0.023 > 0                  | –             | G^l_P(t_2) = 2.64565                  | < 0.0003
15   | –                  | –                          | –             | –                                     | ε_3 > δ
16   | 0.0019 < 0.99      | 9.5×10^{−5} > 0            | –             | G^l_P(t_3) = 2.64575131106417         | < 9×10^{−12}
17   | –                  | –                          | –             | r_2 = 2.6457513110                    | ε_4 < δ
18   | –                  | –                          | –             | r_2 + δ is an upper bound: stop       | –

Example 4 Use the inverse functions, f_P, in order to find an initial approximation for solving the system

$$f_1(x,y,z) = 6y^2 + 20y + 2x + 44z - 170 = 0,$$
$$f_2(x,y,z) = 3y^3 - 43y - 7x - 6z + 100 = 0,$$
$$f_3(x,y,z) = z^3 - 79z + 6x^2 - 10y + 4 = 0, \qquad (114)$$

by Newton's method.

This problem was treated by the authors in [26] in a very complex way, which we intend to improve upon in future research.

We solve for x in the first and second equations:

$$x = \varphi^1_1(y,z) = 85 - 10y - 3y^2 - 22z, \qquad x = \varphi^1_2(y,z) = \frac{100 - 43y + 3y^3 - 6z}{7}. \qquad (115)$$


We solve for y in the second and third equations. The second one is considered as a polynomial in the variable y, φ^2_2 being its AS:

$$y = \varphi^2_2(x,z) = \frac{100 - 7x - 6z}{43}\sum_{n=0}^{\infty} d(0,n)\,Y(x,z)^n, \qquad y = \varphi^2_3(x,z) = \frac{4 + 6x^2 - 79z + z^3}{10}, \qquad (116)$$

where, according to (37) and (48),

$$d(0,n) = \frac{1}{2n+1}\binom{3n}{n} \quad\text{and}\quad Y(x,z) = \frac{3(100 - 7x - 6z)^2}{79507}.$$

Finally, we solve for z in the third and first equations:

$$z = \varphi^3_3(x,y) = \frac{4 + 6x^2 - 10y}{79}\sum_{n=0}^{\infty} d(0,n)\,Z(x,y)^n, \qquad z = \varphi^3_1(x,y) = \frac{85 - x - 10y - 3y^2}{22}, \qquad (117)$$

where

$$Z(x,y) = \frac{(4 + 6x^2 - 10y)^2}{493039}.$$

As a generalization of Corollary 4, applied to the second equation of system (114), considered as a polynomial in the variable y, and to the third equation of system (114), considered as a polynomial in the variable z, it is easily proven that if a is a root of (114), then there exists a neighborhood of a where φ^3_3(x, y) and φ^2_2(x, z) are well defined. In agreement with this result, we are interested in searching for the regions where φ^2_2 and φ^3_3 are convergent. It is easy to see that such regions are, respectively,

$$R_1 = \{(x,z):\ -62.66 \le 100 - 7x - 6z \le 62.66\}, \qquad R_2 = \{(x,y):\ -270.2 \le 4 + 6x^2 - 10y \le 270.2\}. \qquad (118)$$

On the other hand, taking into account (77),

$$|y| = |\varphi^2_2(x,z)| \le \left|\frac{100 - 7x - 6z}{43}\right|\frac{3}{2} \le 2.18 \ \text{ in } R_1, \qquad |z| = |\varphi^3_3(x,y)| \le \left|\frac{4 + 6x^2 - 10y}{79}\right|\frac{3}{2} \le 5.13 \ \text{ in } R_2. \qquad (119)$$

In accordance with (119), one gets

$$|x| = |\varphi^1_2(y,z)| = \left|\frac{100 - 43y + 3y^3 - 6z}{7}\right| \le 27.6. \qquad (120)$$


From (119) and (120) we arrive at

$$R_3 = \{(x,y,z):\ -27.6 \le x \le 27.6,\ -2.18 \le y \le 2.18,\ -5.13 \le z \le 5.13\}. \qquad (121)$$

Finally, we compute a region R_4, a parallelepiped included in R_1 ∩ R_2 ∩ R_3, given by

$$R_4 = \{(x,y,z):\ 3.93 \le x \le 6.43,\ 0.82 \le y \le 1.82,\ 1.87 \le z \le 4.87\}. \qquad (122)$$

The reason for choosing R_4 in this way is that its simplicity eases its computational treatment. Then, given as initial approximation a point inside R_4, for instance x = 3.93, y = 0.82 and z = 1.87, the root (6, 1, 3) is reached by Newton's method after six iterations with six exact digits of accuracy. Table 3 shows the results obtained with other initial approximations.

Table 3 Results taking the vertices and the central point of R_4

Initial approximation | Root                        | Number of iterations | Accuracy: exact digits
(3.93, 0.87, 1.87)    | (6, 1, 3)                   | 6                    | 6
(6.43, 0.87, 1.87)    | (6, 1, 3)                   | 5                    | 6
(3.93, 1.87, 1.87)    | (6, 1, 3)                   | 6                    | 6
(6.43, 1.87, 1.87)    | (6, 1, 3)                   | 6                    | 6
(3.93, 1.87, 4.87)    | (4.75543, 2.66156, 1.47158) | 9                    | 6
(3.93, 0.87, 4.87)    | (6, 1, 3)                   | 6                    | 6
(6.43, 0.87, 4.87)    | (6, 1, 3)                   | 5                    | 6
(6.43, 1.87, 4.87)    | (6, 1, 3)                   | 6                    | 6
(5.18, 1.32, 3.37)    | (6, 1, 3)                   | 5                    | 6
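The Newton iterations reported in Table 3 are easy to reproduce. The sketch below (Python with numpy; our own code, not the authors') applies Newton's method to system (114) from the central point of R_4.

```python
import numpy as np

def F(v):
    x, y, z = v
    return np.array([
        6*y**2 + 20*y + 2*x + 44*z - 170,
        3*y**3 - 43*y - 7*x - 6*z + 100,
        z**3 - 79*z + 6*x**2 - 10*y + 4,
    ])

def J(v):
    x, y, z = v
    return np.array([
        [2.0,   12*y + 20,   44.0],
        [-7.0,  9*y**2 - 43, -6.0],
        [12*x,  -10.0,       3*z**2 - 79],
    ])

v = np.array([5.18, 1.32, 3.37])       # central point of R_4 (last row of Table 3)
for k in range(10):
    step = np.linalg.solve(J(v), F(v))
    v = v - step
    if np.linalg.norm(step) < 1e-12:
        break
print(k + 1, v)                        # converges to the root (6, 1, 3), cf. Table 3
```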

6 Conclusion

The main purpose of our research in this field is to provide a general method for finding an initial approximation that lets us initialize the search for solutions of polynomials and polynomial systems.

Following this line of investigation, we have shown in Sections 2–4 how to construct the inverse function of a polynomial in an analytical way by computing its Taylor series.

The inverse function of a polynomial has become central to our aims: if its series is convergent, then we obtain the solution by evaluating it at zero, and if it is not, then we have shown in Theorem 10 and Corollaries 3 and 4 how to proceed in this case, in such a way that the solution can always be obtained in the real case according to Algorithm 1. We will address the complex case in future investigations.

The part of Algorithm 1 related to the computation of the lower bounds L_{t_n} slows the process down, so we are working on several ideas to avoid this drawback.

Another problem is the computation of the Taylor series, which requires a great operational cost. To overcome this difficulty, at least partly, we have introduced in Section 5 logarithmic functions that bound the Taylor series of the polynomial inverse function, obtaining very good approximations in all the examples we have solved so far, with a very fast rate of convergence, even better than Newton's method (see Example 3 as a sample). We conjecture, in agreement with our tests, that the rate of convergence might be of fourth order, which we hope to prove shortly for the general case.

In Example 4 we illustrate how the inverse function of a polynomial can be used in order to solve polynomial systems. We are mainly going to focus our research effort on this case, with the aim of providing at least an initial approximation from which other methods, for example Newton's method, can converge. Besides, as a consequence of this research, we find it very probable that a new and faster algorithm will arise, not only for searching for an initial approximation, but also for solving such systems completely on its own, as happened in the polynomial case.

We intend to replace the older results about polynomial systems we have previously published, see [26] (they could give the wrong impression of a complexity that we have since overcome), by others we find much better and simpler, on which we are now working, based on the findings of this paper.

Finally, we want to generalize these ideas to solving any equation or system of equations, not only polynomial ones.

References

1. Pérez, R., Rocha, V.L.: Recent applications and numerical implementation of quasi-Newton methods for solving nonlinear systems of equations. Numer. Algorithms 35, 261–285 (2004)
2. Pan, V.Y.: Solving a polynomial equation: some history and recent progress. SIAM Rev. 39(2), 187–220 (1997)
3. Pan, V.Y.: On approximating polynomial zeros: modified quadtree (Weyl's) construction and improved Newton's iteration. Research Report 2894, INRIA, Sophia-Antipolis, France (1996)
4. Pan, V.Y.: Optimal and nearly optimal algorithms for approximating polynomial zeros. Comput. Math. Appl. 31, 97–138 (1996)
5. McNamee, J.M.: A bibliography on roots of polynomials. J. Comput. Appl. Math. 47, 391–394 (1993)
6. Pan, V.Y.: Fast and efficient parallel evaluation of the zeros of a polynomial having only real zeros. Comput. Math. Appl. 17, 1475–1480 (1989)
7. Davidon, W.C.: Variable metric methods for minimization. Research and Development Report ANL-5990 Rev. (1959)
8. Broyden, C.G., Luss, D.: A class of methods for solving nonlinear simultaneous equations. Math. Comput. 19, 577–593 (1965)
9. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970)
10. Dennis, J.E., Moré, J.J.: Quasi-Newton methods, motivation and theory. SIAM Rev. 19, 46–89 (1977)
11. Martínez, J.M.: Practical quasi-Newton methods for solving nonlinear systems. J. Comput. Appl. Math. 124, 97–121 (2000)
12. Zhang, J.Z., Chen, L.H., Deng, N.Y.: A family of scaled factorized Broyden-like methods for nonlinear least squares problems. SIAM J. Optim. 10(4), 1163–1179 (2000)
13. Brezinski, C.: Classification of quasi-Newton methods. Numer. Algorithms 33, 123–135 (2003)
14. Eriksson, J., Gulliksson, M.E.: Local results for the Gauss–Newton method on constrained rank-deficient nonlinear least squares. Math. Comput. 73(248), 1865–1883 (2003)
15. Birgin, E.G., Krejić, N., Martínez, J.M.: Globally convergent inexact quasi-Newton methods for solving nonlinear systems. Numer. Algorithms 32, 249–260 (2003)
16. Yabe, H., Martínez, H.J., Tapia, R.A.: On sizing and shifting the BFGS update within the sized-Broyden family of secant updates. SIAM J. Optim. 15(1), 139–160 (2004)
17. An, H.-B.: On convergence of the additive Schwarz preconditioned inexact Newton method. SIAM J. Numer. Anal. 43(5), 1850–1871 (2005)
18. Bader, B.W.: Tensor–Krylov methods for solving large-scale systems of nonlinear equations. SIAM J. Numer. Anal. 43(3), 1321–1347 (2005)
19. Cordero, A., Torregrosa, J.R.: Variants of Newton's method using fifth-order quadrature formulas. Appl. Math. Comput. 190, 686–698 (2007)
20. Marek, J.S.: Convergence of a generalized Newton and inexact generalized Newton algorithms for solving nonlinear equations with nondifferentiable terms. Numer. Algorithms 50(4), 401–415 (2008)
21. Haijun, W.: On new third-order convergent iterative formulas. Numer. Algorithms 48, 317–325 (2008)
22. Hueso, J.L., Martínez, E., Torregrosa, J.R.: Modified Newton's method for systems of nonlinear equations with singular Jacobian. J. Comput. Appl. Math. 224, 77–83 (2009)
23. Moreno, J.: Explicit construction of inverse function of polynomials. Int. J. Appl. Sci. Comput. 11(1), 53–64 (2004)
24. Moreno, J.: Inverse functions of polynomials around all its roots. Int. J. Appl. Sci. Comput. 11(2), 72–84 (2004)
25. Demidovich, B.P., Maron, I.A.: Cálculo Numérico Fundamental. Paraninfo, Madrid (1985)
26. Moreno, J., Casabán, M.C., Rodríguez-Álvarez, M.J.: An algorithm to initialize the search of solutions of polynomial systems. Comput. Math. Appl. 50, 919–933 (2005)