Multimedia Indexing with the SMART system


Journal of Visual Languages and Computing (2000) 11, 405–438, doi:10.1006/jvlc.2000.0169, available online at http://www.idealibrary.com


P. MARESCA,* A. GUERCIO,† T. ARNDT‡ AND G. TORTORA†

*Dipartimento di Informatica e Sistemistica, University of Napoli, Italy, †Dipartimento di Matematica ed Informatica, University of Salerno, Italy and ‡Department of Computer and Information Science, Cleveland State University, 1860 E. 18th St., Cleveland, OH 44114, U.S.A.

Received 30 November 1999; in revised form 8 April 2000; accepted 22 May 2000

The storage and retrieval of multimedia data is a crucial problem in multimedia information systems due to the huge storage requirements. It is necessary to provide an efficient methodology for the indexing of multimedia data for rapid retrieval. The aim of this paper is to introduce a methodology to represent, simplify, store, retrieve and reconstruct an image from a repository. An algebraic representation of the spatio-temporal relations present in a document is constructed from an equivalent graph representation and used to index the document. We use this representation to simplify and later reconstruct the complete index. This methodology has been tested by implementation of a prototype system called Simplified Modeling to Access and ReTrieve multimedia information (SMART). Experimental results show that the complexity of an index of a 2D document is O(n*(n−1)/k) with k ≥ 2 as opposed to the O(n*(n−1)/2) known so far. Since k depends on the number of objects in an image, more complex documents have lower overall complexity. © 2000 Academic Press

Keywords: multimedia database, content-based retrieval, multimedia indexing

1. Introduction

THE STORAGE AND RETRIEVAL of multimedia data is a crucial problem in multimedia information systems due to the huge storage requirements of multimedia data. Therefore, it is necessary to provide an efficient methodology for the indexing of multimedia data for rapid retrieval of the data. The time complexity for handling such indexes is an important consideration.

Multimedia data can be divided into image data, audio data, video data, etc. In this paper we will refer to a multimedia document when we want to specify an object which can be of any multimedia type. In general, our discussion will refer to images and image indexing, since our prototype system handles images. However, the methodology introduced in this work can be easily extended to handle audio, video, and three-dimensional scenes, and in fact some of our examples will refer to multimedia documents of these types.

Several methodologies have been introduced for indexing multimedia data. An extended survey on these techniques can be found in [1]. For brevity, in this paper we review only some of the most important results connected to our research.

1045-926X/00/080405 + 34 $35.00/0 © 2000 Academic Press

Figure 1. Taxonomy of image indexing techniques


Image indexes can be divided into two categories: attached index and index by content [2]. While indexes in the first category describe the image itself without any reference to the specific content, those in the second category provide information about the entities in the images and they can be classified as either syntactic or semantic. In [3], the indexes by content are divided into pictorial, textual and hot spot categories. The schema in Figure 1 summarizes the taxonomy.

In surveying previous work, we will focus our attention on pictorial indexes whose semantics are obtained by looking at the

• geometric and morphological features;
• spatial relationships among the elements.

The geometrical and morphological indexes describe features of the image by using the color [4–6], the form [7–10] or the texture [11] of the objects of the image, or by using structures like trees or contractive functions [11, 12]. The spatial relationships among the elements of the image describe the position of the objects in the image. Indexing techniques based on spatial relationships are very useful even though sometimes a high time complexity for the storage and the retrieval of the indexes is required. While the 2D-string [13] and its extensions [14–17] provide a useful technique for indexing complex images, the Virtual Image (VI) technique [18–20] introduces an efficient approach for indexing images with a small number of objects. In [21, 22] the temporal relations of Allen [23], the 2D-strings of Chang [13] and the topological relations of Egenhofer [24] are used for the construction of a logical model of the image. Allen’s interval relations and Egenhofer’s topological relations are combined to create the 2D Projection Interval Relationships (2D-PIR). The image is then represented via a graph whose nodes are labeled with the names of the objects while the arcs are labeled with the 2D-PIR.

The aim of this paper is to introduce a methodology to represent, simplify, store, retrieve and reconstruct an image from a repository, using several layers of abstractions of the image itself. If we want to build an indexing methodology which is not domain specific and whose complexity is not tied to the number of objects in the image, it is necessary to find a mechanism which reduces the known time complexity of O(n²), particularly if the image contains a large number of objects n. We believe that it is necessary to define abstract representations which can be more efficiently handled without losing information necessary for the reconstruction of the image.

The major results of the present research are the following:

• An algebraic representation corresponding to well-known image indexing techniques. The algebraic representation has the advantage of being easier to manipulate than the equivalent graph representation.
• Development of simplification and reconstruction algorithms for the algebraic form of the index.
• Implementation of a prototype system based on the methodologies developed.
• Experimental results show that the methodology results in an image index having a lower complexity than previously known indexes which represent the same information.

The organization of the rest of this paper is as follows. We introduce an indexing technique which uses the spatio-temporal algebraic expression (STAE) as an intermediate abstract representation: starting from the abstract description via 2D-strings and the graph representation, the STAE provides a useful step towards the final simplification of the index. The spatio-temporal expressions are described in Section 2, while in Section 3 we discuss the simplification of the index of the image. The architecture of the SMART prototype system is defined in Section 4. The experimental results with the SMART prototype are presented and discussed in Section 5. The results show that increasing the number of objects in an image increases the opportunity to remove redundant arcs from the graph representation, therefore reducing, in most cases, the O(n²) complexity of the corresponding STAE. This means that less storage is required for the image and that faster indexing is possible, with little overhead for the STAE reconstruction. Conclusions are given in Section 6. Appendix A contains all the acronyms used in this paper, while Appendix B gives the Prolog implementation of the simplification phase of the SMART system.

2. The Spatio-Temporal Algebraic Expression

In this section, we briefly introduce a general methodology for the representation of multimedia data via a directed and labeled graph called the spatio-temporal relation graph (STRG). The idea is to model multimedia data with a graph whose nodes are labeled with symbols representing the objects contained in the document (audio or visual according to the data type) and whose directed arcs are labeled with the spatial and/or temporal relation between the two objects. We will then show how this representation can be transformed into a more useful algebraic representation.

Figure 2. Steps in the index construction

Each of the seven basic Allen relations has an inverse (Table 1). Since equals is its own inverse, this gives 13 distinct relations. We will use these relations in building the STRG graph.

The initial steps in the index construction process are shown in Figure 2 for a simple image with few objects. We start with the original image and do edge detection on the image (this step may be skipped if we are doing manual indexing). The next step is to identify the objects of interest in the image and represent each of these objects with a symbol. This representation is known as a symbolic image. Given the symbolic image, we now find the minimum bounding rectangles (MBRs). Once we have the MBRs, we project the MBR for each object on both the x- and y-axis. We now have intervals for each object on both the x- and y-axis, so we find Allen’s relation between each pair of objects on both the x- and y-axis.
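The projection step can be illustrated with a short sketch. It is not the SMART implementation: the function names, the relation names (with `^-1` marking an inverse) and the MBR layout `(xmin, ymin, xmax, ymax)` are assumptions made for illustration, and the relations follow Allen's standard interval definitions.

```python
def allen_relation(p, q):
    """Name of Allen's relation between intervals p = (start, end) and q."""
    (ps, pe), (qs, qe) = p, q
    if ps >= pe or qs >= qe:
        raise ValueError("intervals must have positive length")
    if pe < qs:
        return "before"
    if qe < ps:
        return "before^-1"
    if pe == qs:
        return "meets"
    if qe == ps:
        return "meets^-1"
    if ps == qs and pe == qe:
        return "equals"
    if ps == qs:  # same start, different end
        return "starts" if pe < qe else "starts^-1"
    if pe == qe:  # same end, different start
        return "finishes" if ps > qs else "finishes^-1"
    if qs < ps and pe < qe:
        return "during"
    if ps < qs and qe < pe:
        return "during^-1"
    # Only the two partial-overlap cases remain.
    return "overlaps" if ps < qs else "overlaps^-1"


def mbr_relation(a, b):
    """Pair (x-relation, y-relation) for two MBRs given as (xmin, ymin, xmax, ymax)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return (allen_relation((ax0, ax1), (bx0, bx1)),
            allen_relation((ay0, ay1), (by0, by1)))


# Two hypothetical objects: the first lies to the left of the second
# and overlaps it vertically.
house = (0, 0, 4, 3)
tree = (6, 1, 8, 5)
print(mbr_relation(house, tree))
```

Applying `mbr_relation` to every pair of objects yields the arc labels of the graph described next.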

A common technique in image indexing is to represent the spatial relations as a graph where each node represents an object and each arc between nodes represents the spatial relation between the objects (see, for example, [25]). We will follow this well-known technique. Let i and j be two objects of a document; the arc (i, j) will be labeled with one of the Allen relations according to the relation existing between i and j in each dimension (e.g. two for images and one for audio). In this way, we are able to model any multimedia document. The graph is without doubt a useful instrument for several applications, but in some cases it is not an easy one to manipulate. This is a problem that needs to be overcome because we will want to manipulate the index in order to reduce the redundant information in the index, thus saving storage space. The graph structure represents the spatial relation between each pair of objects, but this amount of information is not really necessary since some relations imply others. Consider, for example, the relations between three integers, a, b, and c. These might be a > b, b > c, and a > c. Obviously, the third relation is redundant since it is implied by the first two. Similarly, with spatial relations, some relations imply others. We want to exploit this characteristic in order to reduce the complexity of the index.

Since algebra is a very powerful tool, we transform an STRG into an equivalent expression, called STAE, which is easier to manipulate. We now give some definitions.

Definition 2.1. Let G be an STRG graph which represents a generic multimedia document. An algebraic representation of G is a 5-tuple (V, Rel, Op, E, E(G)) where:

• V is a vocabulary of symbols.
• Rel is the set of 13 Allen relations.
• Op is the set of the operators {+, >, ?} where
    + is the or operator,
    > is the connecting or link operator,
    ? is the sequence operator.
• E ∈ V is the empty node.
• E(G) is the algebraic expression associated with G.

Table 1. Allen’s relations and their inverse

R                    R⁻¹
p equals q      =    p equals⁻¹ q      =
p before q      <    p before⁻¹ q      >
p meets q       3    p meets⁻¹ q       2
p overlaps q    C    p overlaps⁻¹ q    /
p starts q      (    p starts⁻¹ q      [
p finishes q    )    p finishes⁻¹ q    ]
p during q      %    p during⁻¹ q      $

The algebraic expression should completely characterize the graph G in order to maintain the correspondence between the two representations. In other words, G should be equivalent to E(G). The equivalence guarantees that, given a document, it will always be possible to go from its graph representation to its algebraic representation and vice versa, allowing us to use whichever representation is most useful at the moment.

Definition 2.2. Let G(N, A) be an STRG with n nodes and m labeled arcs. An STAE associated with G, E(G), is defined as a sequence of n subsequences:

E(G) = subseq_1 ? subseq_2 ? … ? subseq_n

A subsequence is of the form

subseq_i = head_i > (alternative_i) for i = 1, …, n

A head_i is a node of the graph and each subsequence has a different head. The alternatives have the following form:

alternative_i = (body^i_{s1,t1} + body^i_{s2,t2} + … + body^i_{sk,tl})

where

body^i_{sr,th} = R_{sr} n_{th}

with R_{sr} ∈ 2^Rel and n_{th} ∈ N ∪ {E} − {head_i}.

To obtain the E(G) for a graph G we take the following steps: each labeled arc of the graph G, ((a, b), R) (with a, b ∈ V and R ∈ 2^Rel), is represented by the link relation a > Rb. It is our convention to attach the label R to the tail symbol. This notation can be specified for the different types of graphs by using the sequence operator ‘?’ and the or operator ‘+’. In Figure 3(a)–(c), we show three graphs and their corresponding algebraic representation. In case (b) it is possible to obtain the description of each single arc by using the distribution property of > with respect to +, i.e.

a > (R1b + R2c + R3d) = a > R1b + a > R2c + a > R3d.

The empty node is a node that does not contain any value. Each node of the graph that does not have arcs leaving the node can be considered attached to such a node. In this case, we write a > RE. The value of R is irrelevant since we suppose that a > RE = a > E ∀R ∈ 2^Rel. For each R there is an RE which represents the empty body. The empty body is necessary for the representation of the nodes with no arcs leaving the node itself. If we consider the graph STRG of Figure 4, the STAE associated with this STRG is

E(G) = a > RE ? b > R1a ? c > (R7a + R8b + R3d) ? d > (R5a + R4b).

Figure 3. Three graphs and their corresponding algebraic representation

Figure 4. An example of an STRG

This expression contains four subsequences: a > RE, b > R1a, c > (R7a + R8b + R3d) and d > (R5a + R4b). The subsequence c > (R7a + R8b + R3d) has c as head and R7a + R8b + R3d as alternative. In turn this alternative has three bodies R7a, R8b and R3d. Also RE is a body, but it is an empty body. If we analyze this expression we can deduce several things regarding the graph G(N, A). First of all, the expression has four subsequences, i.e. |N| = 4, and the number of bodies is 6, i.e. |A| = 6 (the empty body is not counted). Moreover, we can write:

N = {set of heads} = {a, b, c, d} and A = {((b, a), R1), ((c, a), R7), ((c, b), R8), ((c, d), R3), ((d, a), R5), ((d, b), R4)}

Let us observe that ((a, E), R), for each R, represents the dummy arc which links the empty node and therefore it is not considered in A.
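As an illustration of how N and A can be read off an STAE, the sketch below stores the expression of Figure 4 as a mapping from each head to its list of bodies. The data layout and the function names are assumptions chosen for this example, and the rendering writes the link operator as `>` and the sequence operator as `?`.

```python
# Each head maps to its bodies (relation label, tail node).
# An empty list stands for the empty body RE.
stae = {
    "a": [],
    "b": [("R1", "a")],
    "c": [("R7", "a"), ("R8", "b"), ("R3", "d")],
    "d": [("R5", "a"), ("R4", "b")],
}


def nodes(expr):
    """N: the set of heads."""
    return set(expr)


def arcs(expr):
    """A: one labeled arc ((head, tail), R) per non-empty body."""
    return {((head, tail), rel)
            for head, bodies in expr.items()
            for rel, tail in bodies}


def render(expr):
    """Write the expression back in algebraic form."""
    parts = []
    for head, bodies in expr.items():
        if not bodies:
            parts.append(f"{head} > RE")
        elif len(bodies) == 1:
            rel, tail = bodies[0]
            parts.append(f"{head} > {rel}{tail}")
        else:
            alternative = " + ".join(f"{rel}{tail}" for rel, tail in bodies)
            parts.append(f"{head} > ({alternative})")
    return " ? ".join(parts)


print(len(nodes(stae)), len(arcs(stae)))
print(render(stae))
```

Because the dummy arc to the empty node is never stored, the number of arcs is simply the total number of list entries.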

The operators satisfy several properties. The link operator > satisfies the invertibility property a > Rb = b > R⁻¹a. This assertion is trivially true since a R b ⇔ b R⁻¹ a. The alternative operator + satisfies the commutative and the associative property:

(commutative property) a > R1b + a > R2c = a > R2c + a > R1b
(associative property) a > R1b + (a > R2b + a > R3c) = (a > R1b + a > R2b) + a > R3c

Therefore, we can write with no ambiguity a > R1b + a > R2b + a > R3c. The distributive property of > with respect to + is also valid: a > (R1b + R2b + R3c) = a > R1b + a > R2b + a > R3c.

The sequence operator ? satisfies the commutative property: subseq1 ? subseq2 = subseq2 ? subseq1. This is due to the fact that it is possible to construct E(G) starting from any node of G. The associative property is also valid: subseq1 ? (subseq2 ? subseq3) = (subseq1 ? subseq2) ? subseq3. Therefore, we can write subseq1 ? subseq2 ? … ? subseqn without ambiguity.
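The invertibility property is the one used later to move between equivalent expressions, so a small sketch may help. It assumes the same head-to-bodies mapping as a data structure, and the relation names (with `^-1` marking an inverse) are illustrative rather than taken from the SMART code.

```python
BASE = ["before", "meets", "overlaps", "starts", "finishes", "during"]

# equals is its own inverse; every other relation is paired with its inverse.
INVERSE = {"equals": "equals"}
for name in BASE:
    INVERSE[name] = name + "^-1"
    INVERSE[name + "^-1"] = name


def invert_arc(expr, head, tail):
    """Rewrite head > R tail as tail > R^-1 head, returning a new expression."""
    result = {h: list(bodies) for h, bodies in expr.items()}
    rel = next(r for r, t in result[head] if t == tail)
    result[head].remove((rel, tail))
    result.setdefault(tail, []).append((INVERSE[rel], head))
    return result


expr = {"a": [("before", "b")], "b": []}
flipped = invert_arc(expr, "a", "b")
print(flipped)
```

Inverting an arc changes which head owns the body but never the total number of bodies, and inverting twice restores the original expression.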

It is clear that both the graph and the expressions are built taking into account the relations between the projections on the axes. In other words, if ((a, b), R) is a labeled arc of the graph G, then a, b ∈ V and, in particular, R ∈ 2^Rel, that is, R is a k-tuple of Allen’s relations with k equal to the number of the axes where the objects are projected. As an example, k = 2 (axes x and y) if G represents a two-dimensional image.

3. Simplification of an STAE

We can operate with the graph STRG or with the algebraic expression indifferently. The importance of the use of the STAE is due to the fact that they are easier to manipulate than the graphs. The attempt to simplify an STAE is done using logic deduction rules, eliminating bodies in a subsequence corresponding to the elimination of a redundant arc in the graph. In a multimedia document with n objects, we need to use n nodes and n*(n−1)/2 arcs. Since the number of arcs in an STRG is O(n²), the increase of n may generate some problems in terms of time and space complexity.

Let us introduce a set of rules, called Spatio-Temporal Deduction Rules (STRD), which will be used to build a set of Spatio-Temporal Simplification Rules (STSR). Thanks to those rules, it will be possible to eliminate redundant information from the abstract representation of the document (STAE or STRG).

The Spatio-Temporal Deduction Table (STDT) will allow us to produce the simplification rules from the deduction rules, where possible. This table can be seen as an inferential engine and, of course, will require a cost in terms of space and time for its storage and its management. Moreover, an efficient algorithm for the simplification is necessary in order to reduce the algebraic expressions.

Table 2. The spatio-temporal deduction table (STDT)

R1/R2   =  %  (  )  3  <  C  $  [  ]  2  >  /
=       =  %  (  )  3  <  C  $  [  ]  2  >  /
%       %  %  %  %  ?  ?  ?  ?  ?  ?  ?  ?  ?
(       (  %  (  %  ?  ?  ?  ?  ?  /  2  >  ?
)       )  %  %  )  3  <  C  ?  C  ?  ?  ?  ?
3       3  <  3  <  <  <  <  ?  3  ?  ?  ?  ?
<       <  <  <  <  <  <  <  ?  <  ?  ?  ?  ?
C       C  ?  ?  ?  <  <  ?  ?  C  ?  ?  ?  ?
$       $  ?  ?  ?  <  <  ?  $  $  $  >  >  ?
[       [  ?  ?  ?  <  <  ?  ?  [  $  2  >  ?
]       ]  ?  ?  ?  3  <  ?  $  $  ]  >  >  ?
2       2  ?  >  2  ?  ?  ?  ?  ?  2  >  >  >
>       >  ?  >  >  ?  ?  ?  ?  ?  >  >  >  >
/       /  ?  ?  ?  ?  ?  ?  ?  ?  /  >  >  ?

Fortunately, the cost of this table is not relevant if we consider that in most cases it is able to reduce the O(n²) complexity necessary for the management of the arcs. The O(n²) complexity is undesirable especially with a large n. The advantage of our approach is in the fact that, using the simplification rules, increasing the number of objects increases the possibility to apply deduction rules and therefore to simplify the final algebraic expression.

The STDT contains 169 possible cases. The table is a matrix STRD(i, j) of dimension 13 × 13, and provides the value of R3 given R1 and R2, which are, respectively, the row index and the column index. The symbol ‘?’ (don’t care) is used in the table when no deductions can be made. The deduction rules can be two-, three- and n-dimensional. Before describing how the simplification of the algebraic expressions can be performed we need to introduce some definitions.
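A table lookup of this kind can be sketched as follows. The layout mirrors the STDT (row R1, column R2, 13 × 13 entries), but the encoding is an assumption made for illustration: relation symbols are plain characters, before and its inverse are written `<` and `>`, and the function name `deduce` is borrowed from the predicate described in Section 4 rather than from the Prolog source.

```python
SYMBOLS = ["=", "%", "(", ")", "3", "<", "C", "$", "[", "]", "2", ">", "/"]

# One line per row R1, in the order of SYMBOLS; columns R2 follow the same order.
TABLE_TEXT = """
= % ( ) 3 < C $ [ ] 2 > /
% % % % ? ? ? ? ? ? ? ? ?
( % ( % ? ? ? ? ? / 2 > ?
) % % ) 3 < C ? C ? ? ? ?
3 < 3 < < < < ? 3 ? ? ? ?
< < < < < < < ? < ? ? ? ?
C ? ? ? < < ? ? C ? ? ? ?
$ ? ? ? < < ? $ $ $ > > ?
[ ? ? ? < < ? ? [ $ 2 > ?
] ? ? ? 3 < ? $ $ ] > > ?
2 ? > 2 ? ? ? ? ? 2 > > >
> ? > > ? ? ? ? ? > > > >
/ ? ? ? ? ? ? ? ? / > > ?
"""

ROWS = [line.split() for line in TABLE_TEXT.strip().splitlines()]
assert len(ROWS) == 13 and all(len(row) == 13 for row in ROWS)


def deduce(r1, r2):
    """Return R3 such that A R1 B and B R2 C imply A R3 C, or None for '?'."""
    entry = ROWS[SYMBOLS.index(r1)][SYMBOLS.index(r2)]
    return None if entry == "?" else entry


print(deduce("]", "<"))   # a deduction is available
print(deduce("%", "3"))   # don't care: nothing can be deduced
```

Storing the table costs a constant 169 entries regardless of the number of objects in the document, which is why its cost does not grow with n.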

Definition 3.1. Two STAEs are equivalent if and only if they represent the same document.

This definition reflects the fact that there may be multiple STAEs which represent the same document depending on how the indexing is performed.

Given an STAE E, we can reduce only the number of bodies inside the subsequences of E. The heads cannot be touched since they represent the nodes of the graph and therefore that information cannot be lost. Let

R1b, R2c and R3c

be three bodies of two subsequences of E, where R1b and R3c belong to the subsequence with head a and R2c belongs to the subsequence with head b, and let

a > R1b ? b > R2c ⇒ a > R3c

be an STRD written by using the STAE formalism. Then

a > R1b ? b > R2c ? a > R3c ⇔ a > R1b ? b > R2c

is a simplification rule of the STAE (STAESR). We can simplify E by cutting the body R3c from the alternative whose head is a. If in the process of simplification it happens that for a certain head, say a, all the bodies of the alternative have been eliminated, we write a > RE, and we use the empty body so that we do not lose the head a.
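A single application of such a rule can be sketched as below. This is an illustrative reading, not the Prolog code of Appendix B: the expression is a mapping from heads to bodies, the rule table holds just one STDT entry (R1 = `]`, R2 = `<`, R3 = `<`), and all names are hypothetical.

```python
# One deduction rule taken from the STDT: A ] B and B < C imply A < C.
RULES = {("]", "<"): "<"}


def deduce(r1, r2):
    return RULES.get((r1, r2))


def find_redundant(expr):
    """Return (a, R3, c) for the first body R3 c of head a that is implied
    by a > R1 b and b > R2 c, or None if the expression is irreducible."""
    for a, bodies in expr.items():
        for r1, b in bodies:
            for r2, c in expr.get(b, []):
                r3 = deduce(r1, r2)
                if c != a and r3 is not None and (r3, c) in bodies:
                    return a, r3, c
    return None


def simplify_once(expr):
    """Apply one STAESR; return the reduced expression or None."""
    hit = find_redundant(expr)
    if hit is None:
        return None
    a, r3, c = hit
    result = {h: list(bodies) for h, bodies in expr.items()}
    result[a].remove((r3, c))   # the head stays even if its list becomes empty
    return result


expr = {"c": [("]", "a"), ("<", "f")], "a": [("<", "f")], "f": []}
reduced = simplify_once(expr)
print(reduced)
```

Keeping the head in the mapping with an empty list plays the role of writing a > RE when every body of an alternative has been eliminated.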

Definition 3.2. Given a document S with n objects and its graph G, an STAE E(G) is complete if no STAESR has been applied; i.e. when G has n nodes and n*(n−1)/2 arcs.

Definition 3.3. Given a document S with n objects and its graph G, an STAE E(G) is irreducible if no STAESR can be directly applied.

Definition 3.4. Given a document S with n objects and its graph G, an STAE E(G) is reducible if at least one STAESR can be directly applied.

Definition 3.5. Given a document S with n objects and its graph G, an STAE E′(G) obtained from the direct application of an STAESR to E(G) is called a reduction of E.

Definition 3.6. Given two equivalent STAE E1 and E2, E1 is more reducible than E2 if and only if the reduction of E1 has fewer bodies than the reduction of E2.

Definition 3.7. An STAE E is simplifiable if an STAESR can be directly or indirectly applied.

Proposition 3.1. A reducible STAE E is simplifiable. The reverse is not true.

Proof. If the expression E is reducible, then according to Definition 3.4 at least one STAESR can be directly applied. Therefore, since an STAE on which an STAESR can be directly or indirectly applied is simplifiable (Definition 3.7), it follows that E is simplifiable. On the other hand, if E is simplifiable it may be the case that simplifications can only be applied in an indirect manner. In this case, E is irreducible. □

Proposition 3.2. An irreducible STAE can be simplifiable.

Proof. Let E be an irreducible expression. Then no STAESR can be directly applied to E. On the other hand, these rules could be applied indirectly. That is, we substitute for E an equivalent expression E′ (e.g. substituting in E some relations with their inverse relations), which could be simplifiable. □

Definition 3.8. An STAE E′ obtained from an STAE E by directly or indirectly applying an STAESR is called a simplified E or a simplification of E.

Definition 3.9. An STAE E is minimal if the maximum number of possible simplifications have been applied.

Proposition 3.3. Given an STAE E there could be more than one minimal STAE, but all the minimal STAE for E are equivalent and have the same number of bodies.

Proof. Suppose that we find an expression E1min equivalent to an STAE E by applying to E the maximum possible number of simplifications. Let E2min be another minimal expression of E; then, because E1min and E2min are equivalent to E, they are also equivalent to each other. Besides this, the number of bodies in E1min is equal to the number of bodies in E2min since, if we found, for example, that E2min has more bodies than E1min, then E2min would not be a minimization of E. □

Proposition 3.4. It is not true that, given two equivalent expressions, one is more simplifiable than the other.

Proof. Let us consider two equivalent STAE E1 and E2. Let us suppose that E1 is more simplifiable than E2. This means that E1min contains fewer bodies than E2min. But according to Proposition 3.3, this is not possible. Therefore, we have a proof by contradiction. □

In order to clarify Definition 3.9, we observe that in order to find the minimal form we need to locate the equivalent STAE of E which allows the application of the maximum number of simplifications. In the process of the search for the minimal expression, we can find irreducible forms, which could be transformed to equivalent ones on which it is still possible to perform further simplifications.

Theorem 3.1. An irreducible STAE is not necessarily minimal. On the contrary, a minimal STAE is always irreducible.

Proof. The second assertion is trivially true. We will prove instead the first assertion. Given an STAE E, it is possible to apply STAESR to it until an irreducible STAE is obtained. Such an expression is not necessarily minimal; in fact, from Proposition 3.2 we know that an irreducible STAE may be simplifiable. If we can simplify the expression, this means that we can find an equivalent expression with a smaller number of bodies. □

Now, we have all the definitions necessary for the introduction of the Minimization and Simplification Procedure.

Definition 3.10. Given an STAE E, the Simplification of E is a process which inputs E and outputs, if possible, an equivalent expression of E with a smaller number of bodies.

Definition 3.11. Given an STAE E, the Minimization of E is a process which inputs E and outputs, if possible, an equivalent minimal expression of E.

Finding a minimal form of an STAE means not only applying some simplification rules, but being sure that there are no equivalent forms which can be further simplified and produce a smaller form. So the procedure of minimization requires the use of the procedure of simplification. In Figure 5 the minimization algorithm is described.

The complex part of the algorithm described in Figure 5 is where the algorithm looks for the existence of a more reducible form. In reality, the only way to look for a more reducible form is by using the simplification. In other words, establishing the existence of a more reducible expression is reduced to searching for an expression equivalent to E(i+1), which we call E(i+2), and seeing if it is reducible.

Therefore, proving the existence of a more reducible expression and finding one become the same thing. Moreover, supposing we find an E(i+2), the process should be repeated until we locate an expression E(k), for an undefined k, such that there are no other equivalent expressions which are more reducible. It is this difficulty that makes the minimization problem impractical; therefore, we will only try to simplify our expressions to an acceptable form. In Figure 6, the simplification algorithm is described.

This procedure inputs a representation of the document via an STAE E = E(0) and locates a series of equivalent expressions E(1), E(2), …, E(k) until we find an E(k) which is a good simplification of E. This can be decided by choosing the number of bodies that we would like to eliminate from the original expression. Of course, we do not know if such a simplification exists; therefore we need to add to the search, as a stopping condition, the condition k ≥ n for a given n.
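The bounded search can be sketched as a loop that alternates direct reductions with relation inversions. This is one possible reading of the procedure, not the flow-chart of Figure 6: the greedy choice of the first inversion that exposes a reduction, the small rule table (four STDT entries) and all names are assumptions for illustration.

```python
RULES = {("]", "<"): "<", (")", "C"): "C", ("<", ")"): "<", ("/", ">"): ">"}
INVERSE = {"]": ")", ")": "]", "<": ">", ">": "<", "C": "/", "/": "C", "=": "="}


def count_bodies(expr):
    return sum(len(bodies) for bodies in expr.values())


def find_redundant(expr):
    """First body (a, R3, c) implied by a > R1 b and b > R2 c, or None."""
    for a, bodies in expr.items():
        for r1, b in bodies:
            for r2, c in expr.get(b, []):
                r3 = RULES.get((r1, r2))
                if c != a and r3 is not None and (r3, c) in bodies:
                    return a, r3, c
    return None


def reduce_fully(expr):
    """Apply direct simplification rules one at a time until irreducible."""
    result = {h: list(b) for h, b in expr.items()}
    hit = find_redundant(result)
    while hit is not None:
        a, r3, c = hit
        result[a].remove((r3, c))
        hit = find_redundant(result)
    return result


def invert(expr, head, rel, tail):
    """Equivalent expression with head > R tail rewritten as tail > R^-1 head."""
    result = {h: list(b) for h, b in expr.items()}
    result[head].remove((rel, tail))
    result.setdefault(tail, []).append((INVERSE[rel], head))
    return result


def simplify(expr, target, max_attempts):
    """Stop once `target` bodies are eliminated or after `max_attempts` tries."""
    current = reduce_fully(expr)
    start = count_bodies(expr)
    attempts = 0
    while start - count_bodies(current) < target and attempts < max_attempts:
        attempts += 1
        candidates = (invert(current, h, r, t)
                      for h, bodies in current.items()
                      for r, t in bodies if r in INVERSE)
        reducible = next((c for c in candidates if find_redundant(c)), None)
        if reducible is None:
            break                      # no single inversion helps
        current = reduce_fully(reducible)
    return current


expr = {"a": [("C", "e")], "c": [("]", "a"), ("C", "e")], "e": []}
result = simplify(expr, target=1, max_attempts=5)
print(result)
```

In this small case the input is irreducible, yet inverting c > ]a exposes a rule that removes one body. The loop always terminates, because every accepted inversion is followed by at least one removal.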

Figure 5. Minimization flow-chart

As an example, let us give a one-dimensional document; such a document represents an audio clip. This is shown in Figure 7.

E = E(0): c > ( ]a + >b + <d + <f + Ce ) ?
          a > ( <f + %b + <d + Ce ) ?
          f > ( >b + ]d + /e ) ?
          b > ( <d + <e ) ?
          d > ( /e )

This expression contains 6 heads and 15 bodies (the sixth head, e, has only the empty body and is not written out). Suppose we attempt to eliminate at least 5 bodies by using no more than 5 attempts. First of all, we observe that E(0) is reducible and we can apply the following STRD rules:

c > ]a ? a > <f ⇒ c > <f
c > ]a ? a > <d ⇒ c > <d

Figure 6. The Simplification algorithm flow-chart

Figure 7. One-dimensional document

We obtain the reduced expression of E(0):

E(1): c > ( ]a + >b + Ce ) ?
      a > ( <f + %b + <d + Ce ) ?
      f > ( >b + ]d + /e ) ?
      b > ( <d + <e ) ?
      d > ( /e )

With E(1) we have eliminated only two bodies; therefore, we need to look for a possible further reduction. E(1) is equivalent to E(2), obtained by replacing some relational operations with their inverse. We replace

c > ]a with a > )c and f > ]d with d > )f

and we obtain

E(2): c > ( >b + Ce ) ?
      a > ( )c + <f + %b + <d + Ce ) ?
      f > ( >b + /e ) ?
      b > ( <d + <e ) ?
      d > ( /e + )f )

E(2) is still reducible. We apply the following STRD rules:

a > )c ? c > Ce ⇒ a > Ce
a > <d ? d > )f ⇒ a > <f

and we obtain the reduced expression of E(2):

E(3): c > ( >b + Ce ) ?
      a > ( )c + %b + <d ) ?
      f > ( >b + /e ) ?
      b > ( <d + <e ) ?
      d > ( /e + )f )

With E(3) we have eliminated two more bodies and we can still perform three attempts. Again from E(3) we replace some relational operations with their inverse, such as

b > <d with d > >b and b > <e with e > >b

We obtain

E(4): c > ( >b + Ce ) ?
      a > ( )c + %b + <d ) ?
      f > ( >b + /e ) ?
      e > ( >b ) ?
      d > ( /e + )f + >b )

E(4) is still reducible by applying the STRD rule:

d > /e ? e > >b ⇒ d > >b

and we obtain the reduced expression of E(4):

E(5): c > ( >b + Ce ) ?
      a > ( )c + %b + <d ) ?
      f > ( >b + /e ) ?
      e > ( >b ) ?
      d > ( /e + )f )

E(5) is equivalent to E but has 6 heads and only 10 bodies. We have obtained the desired reduction in only three attempts. In the following section, we will discuss how we rebuild the original information from a simplified version of the expression.
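The steps of this example can be replayed mechanically to check the body counts. The sketch below is only a bookkeeping aid under stated assumptions: the expression is stored as a mapping from heads to bodies, before and its inverse are written `<` and `>`, only the four STDT entries used in the example are encoded, and the helper names are hypothetical.

```python
RULES = {("]", "<"): "<", (")", "C"): "C", ("<", ")"): "<", ("/", ">"): ">"}
INVERSE = {"]": ")", "<": ">"}


def relation(expr, head, tail):
    """Relation label on the body of `head` that points to `tail`."""
    return next(r for r, t in expr[head] if t == tail)


def apply_rule(expr, a, b, c):
    """Use a > R1 b ? b > R2 c => a > R3 c to drop the body R3 c of head a."""
    r3 = RULES[(relation(expr, a, b), relation(expr, b, c))]
    expr[a].remove((r3, c))


def invert(expr, head, tail):
    """Replace head > R tail by the equivalent tail > R^-1 head."""
    r = relation(expr, head, tail)
    expr[head].remove((r, tail))
    expr[tail].append((INVERSE[r], head))


def bodies(expr):
    return sum(len(v) for v in expr.values())


e = {  # E(0); the head e carries only the empty body
    "c": [("]", "a"), (">", "b"), ("<", "d"), ("<", "f"), ("C", "e")],
    "a": [("<", "f"), ("%", "b"), ("<", "d"), ("C", "e")],
    "f": [(">", "b"), ("]", "d"), ("/", "e")],
    "b": [("<", "d"), ("<", "e")],
    "d": [("/", "e")],
    "e": [],
}
counts = [bodies(e)]

apply_rule(e, "c", "a", "f"); apply_rule(e, "c", "a", "d")   # E(1)
counts.append(bodies(e))
invert(e, "c", "a"); invert(e, "f", "d")                     # E(2)
apply_rule(e, "a", "c", "e"); apply_rule(e, "a", "d", "f")   # E(3)
counts.append(bodies(e))
invert(e, "b", "d"); invert(e, "b", "e")                     # E(4)
apply_rule(e, "d", "e", "b")                                 # E(5)
counts.append(bodies(e))

print(counts)
```

The counts after E(0), E(1), E(3) and E(5) come out as 15, 13, 11 and 10, and all six heads are still present at the end, b now carrying only the empty body.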

4. Architecture of the SMART System

In this section, we will describe the architecture of the Simplified Modeling to Access and ReTrieve (SMART) tool. This tool can construct, simplify, represent in a repository and reconstruct a multimedia document. SMART has been implemented on the Windows platform using Logic Programming Associates Prolog version 2.6 3d (LPA-Prolog). The tool implements the reconstruction of STAE in a simple and exhaustive manner. SMART is an intelligent system which is able to examine any multimedia document and greatly reduce the amount of information in its representation by applying spatio-temporal deduction rules. The reduction in the amount of information increases greatly as the number of objects in the document increases. Furthermore, the reduction in the amount of information stored does not result in the loss of any information. SMART can reconstruct all the information it simplifies, demonstrating its redundancy.

The prototype inputs the abstract representation of the document using Allen’s relations and creates the equivalent STAE, which is subsequently simplified and stored in the Prolog repository for later retrieval of the document. In the retrieval of the document, the equivalent simplified STAE is found and the original STAE is reconstructed.

An example of simplification and reconstruction of a document is shown in Figure 8, while in Figure 9 we display the relative portion of the code.

The architecture of SMART is illustrated in Figure 10. As shown in the figure, there are two phases. The first consists of the construction of an abstraction of the document, its representation using an STAE, its simplification, and finally its storage in the repository. The second phase consists of the retrieval of the simplified expression from the repository, and its reconstruction by a process of reverse abstraction by which the original document is recovered.

Figure 8. The SMART environment

We will comment on the methodology applied in this work by following these two phases. In the first phase, the SMART tool uses the implementation of the spatio-temporal rules of deduction (STRD), from which the spatio-temporal reduction rules (STRER) are derived, in terms of Prolog rules using the powerful list data structure of Prolog.

Rather than examining the technical details of the implementation, we will describe the implementation of a logical deduction rule and of the corresponding simplification rule, leaving the relevant code to Appendix B. The elementary deduction predicate is of the following type:

deduce (R1, R2, R3)

Given the relations R1, R2 and R3, the predicate evaluates to true if and only if

A R1 B and B R2 C ⇒ A R3 C

is an STRD. Otherwise, it is false.
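The behaviour of deduce/3 can be pictured as a lookup in a table of facts. The following is a minimal Python sketch of that idea, not the LPA-Prolog code of SMART; the name DEDUCE_FACTS and the choice of five sample triples (copied from the deduction rules listed in Appendix B) are assumptions made for illustration.

```python
# Illustrative subset of the one-dimensional deduction facts of Appendix B.
# A triple (r1, r2, r3) reads: A r1 B and B r2 C  =>  A r3 C.
# DEDUCE_FACTS is a hypothetical name; SMART stores these as Prolog facts.
DEDUCE_FACTS = {
    ("ug", "sp", "sp"),
    ("sp", "sp", "sp"),
    ("pad", "pad", "sp"),
    ("do", "sd", "sd"),
    ("incs", "incs", "incs"),
}


def deduce(r1: str, r2: str, r3: str) -> bool:
    """Return True if and only if (r1, r2, r3) is a known deduction rule."""
    return (r1, r2, r3) in DEDUCE_FACTS


if __name__ == "__main__":
    print(deduce("pad", "pad", "sp"))  # a rule present in the sample table
    print(deduce("sp", "sd", "sp"))    # not in the sample table
```

As in the Prolog version, a triple that is not listed is simply treated as "not an STRD".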

Figure 9. An example of simplification and reconstruction of a document with 13 objects


The job of the predicate deduce/3 (/3 indicates that the relation deduce has arity 3) is to intercept all of the STRDs in the expression and, in the affirmative case, with the help of the predicate cancel/6, which we will examine shortly, allow its simplification.

In order to implement the two-dimensional STRDs, it is necessary to utilize the AND of two deduce/3 predicates. In fact, let

A (R1, R2) B, B (R3, R4) C and A (R5, R6) C

be three terms of an STAE. In order to verify that

A (R1, R2) B and B (R3, R4) C ⇒ A (R5, R6) C (1)

is a two-dimensional STRD, it suffices to verify the AND of the following two predicates:

(a) deduce (R1, R3, R5),

(b) deduce (R2, R4, R6)

Figure 10. Architecture of SMART


The first deduce/3 verifies the deduction with respect to the x-axis, while the second one verifies it with respect to the y-axis. By the definition of a two-dimensional STRD, whenever (1) is an STRD the predicates (a) and (b) must both take on the value true.

The STRDs needed to simplify other media types can be implemented in the same way. For audio we need one-dimensional STRDs (and therefore a single deduce/3 predicate); for video we need three-dimensional STRDs (therefore, besides the predicates (a) and (b), we need a third deduce/3 for the time axis).
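The per-axis construction generalizes directly to any number of dimensions. The Python sketch below illustrates this under stated assumptions: the helper name deduce_nd and the sample FACTS set are hypothetical, and the triples are a small subset of the rules in Appendix B.

```python
from typing import Sequence, Set, Tuple

# Hypothetical sample of one-dimensional deduction facts (subset of Appendix B).
FACTS: Set[Tuple[str, str, str]] = {
    ("sp", "sp", "sp"),
    ("pad", "pad", "sp"),
    ("pad", "pr", "sp"),
    ("do", "sd", "sd"),
}


def deduce_nd(first: Sequence[str], second: Sequence[str],
              third: Sequence[str], facts: Set[Tuple[str, str, str]]) -> bool:
    """AND of one deduce check per axis.

    first, second and third hold one relation per axis for the terms
    A..B, B..C and A..C: one entry for audio, two for images, three for video.
    """
    if not (len(first) == len(second) == len(third)):
        raise ValueError("all terms must have the same number of axes")
    return all((r1, r2, r3) in facts
               for r1, r2, r3 in zip(first, second, third))


if __name__ == "__main__":
    # Two-dimensional case: x-axis and y-axis must both be deducible.
    print(deduce_nd(("sp", "pad"), ("sp", "pad"), ("sp", "sp"), FACTS))
```

A single failing axis makes the whole deduction fail, which is exactly the AND of predicates (a) and (b) extended with further axes.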

The STAE may be represented using simple lists. Recall that a and b are two objects and Rx and Ry represent the spatial relations with respect to the x- and y-axis, respectively. The convention adopted is to represent every term of the STAE of type a ' (Rx, Ry) b as the sublist [a, b, Rx, Ry]. Therefore, every explicit STAE of the type:

E: a_1 ' (Rx_{1,2}, Ry_{1,2}) a_2 + a_1 ' (Rx_{1,3}, Ry_{1,3}) a_3 + … + a_1 ' (Rx_{1,n}, Ry_{1,n}) a_n and
a_2 ' (Rx_{2,3}, Ry_{2,3}) a_3 + a_2 ' (Rx_{2,4}, Ry_{2,4}) a_4 + … + a_2 ' (Rx_{2,n}, Ry_{2,n}) a_n and
… and a_{n−1} ' (Rx_{n−1,n}, Ry_{n−1,n}) a_n

will be represented as the list:

E = [a_1, a_2, Rx_{1,2}, Ry_{1,2}, a_1, a_3, Rx_{1,3}, Ry_{1,3}, …, a_1, a_n, Rx_{1,n}, Ry_{1,n}, a_2, a_3, Rx_{2,3}, Ry_{2,3}, a_2, a_4, Rx_{2,4}, Ry_{2,4}, …, a_2, a_n, Rx_{2,n}, Ry_{2,n}, …, a_{n−1}, a_n, Rx_{n−1,n}, Ry_{n−1,n}].
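The flat-list convention can be illustrated with a short Python sketch that converts between a list of four-component terms and the flat list. The function names terms_to_flat and flat_to_terms are hypothetical and only serve the illustration; SMART itself works directly on Prolog lists.

```python
from typing import List, Tuple

# A term is (object_a, object_b, relation_on_x, relation_on_y).
Term = Tuple[str, str, str, str]


def terms_to_flat(terms: List[Term]) -> List[str]:
    """Concatenate the terms into one flat list, four entries per term."""
    flat: List[str] = []
    for term in terms:
        flat.extend(term)
    return flat


def flat_to_terms(flat: List[str]) -> List[Term]:
    """Split a flat STAE list back into its four-component terms."""
    if len(flat) % 4 != 0:
        raise ValueError("a flat STAE list must contain a multiple of 4 entries")
    return [tuple(flat[i:i + 4]) for i in range(0, len(flat), 4)]


if __name__ == "__main__":
    flat = ["u", "q", "pad", "pad", "u", "z", "sp", "sp"]
    print(flat_to_terms(flat))
```

Because every term occupies exactly four consecutive positions, no separators are needed in the stored list.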

The STRDs implemented with the deduce/3 predicates are given in Appendix B. Once the STRDs have been implemented, it is very easy to implement the STAESR (simplification rules) with the help of the exist/5 and cancel/6 predicates described below.

/* exist/5 */
exist(A1, C1, Rr1, Rr2, [A1, C1, Rr1, Rr2 | _]).
exist(A1, C1, Rr1, Rr2, [_, _, _, _ | D]) :-
    exist(A1, C1, Rr1, Rr2, D).

Comment: once the variables A1, C1, Rr1 and Rr2 have been instantiated and input to the predicate exist/5 along with the expression, the value true will be returned if the list [A1, C1, Rr1, Rr2] is a sublist of the expression.

/* cancel/6 */
cancel(_, _, _, _, [], []).
cancel(A1, C1, R11, R21, [A1, C1, R11, R21 | D], M) :-
    cancel(A1, C1, R11, R21, D, M), !.
cancel(A1, C1, R11, R21, [T1, T2, T3, T4 | D], [T1, T2, T3, T4 | M]) :-
    cancel(A1, C1, R11, R21, D, M).

Comment: once the variables A1, C1, R11, and R21 have been instantiated and input to the predicate cancel/6 along with the expression, a new expression equal to the initial one minus the sublist [A1, C1, R11, R21] will be returned.
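For readers less familiar with Prolog list patterns, the two predicates can be paraphrased in Python. This is only a sketch under assumptions: it covers the case in which all four arguments are instantiated (the Prolog exist/5 can also bind unbound variables, which is not modelled here), and it returns a new list instead of binding an output argument.

```python
from typing import List


def exist(a: str, c: str, rx: str, ry: str, expr: List[str]) -> bool:
    """True if [a, c, rx, ry] occurs as a four-element term of the flat list."""
    target = [a, c, rx, ry]
    return any(expr[i:i + 4] == target for i in range(0, len(expr), 4))


def cancel(a: str, c: str, rx: str, ry: str, expr: List[str]) -> List[str]:
    """Return a copy of expr without the term [a, c, rx, ry]."""
    target = [a, c, rx, ry]
    result: List[str] = []
    for i in range(0, len(expr), 4):
        term = expr[i:i + 4]
        if term != target:
            result.extend(term)  # keep every term that does not match
    return result


if __name__ == "__main__":
    expr = ["u", "q", "pad", "pad", "u", "z", "sp", "sp", "q", "z", "sp", "pr"]
    print(exist("u", "z", "sp", "sp", expr))
    print(cancel("u", "z", "sp", "sp", expr))
```

Both functions step through the list four entries at a time, which corresponds to the [_, _, _, _ | D] pattern of the Prolog clauses.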

As for the second phase, which begins after the retrieval of the expression from the repository, the expression must be manipulated using the rules of spatio-temporal reconstruction (STRR), which may have three levels. This means that for a simplified expression we must always make all possible attempts to reconstruct the complete expression. The rules of spatio-temporal reconstruction are essentially the inverted versions of the spatio-temporal deduction rules.

The predicate:

deduce (R1, R2, R3)

returns a true value, when the relations R1, R2, and R3 are known, if and only if

A R1 B and B R2 C ⇒ A R3 C

is an STRD, where A R1 B, B R2 C, and A R3 C are three terms of an expression. Otherwise, a false value is returned.

In the same way, we can derive all the other reconstruction rules. All the SMART reconstruction rules are given in Appendix B.
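The idea of running a deduction rule "in reverse" can be sketched as follows: whenever two stored terms A..B and B..C allow a deduction on both axes and the term A..C is missing, the deduced term is added back. The Python sketch below is an illustration under assumptions, not the SMART reconstruction algorithm: it applies only the direct chain rule (no inverse relations, no three levels), and the names DEDUCE and reconstruct, as well as the four sample rules taken from Appendix B, are chosen for the example.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]
Rel2D = Tuple[str, str]

# Hypothetical sample of deduction rules, keyed by (r1, r2) and giving r3.
DEDUCE: Dict[Pair, str] = {
    ("pad", "sp"): "sp",
    ("pad", "pr"): "sp",
    ("sp", "sp"): "sp",
    ("ug", "sp"): "sp",
}


def reconstruct(terms: Dict[Pair, Rel2D]) -> Dict[Pair, Rel2D]:
    """Add every term that the sample rules can deduce from stored terms."""
    result = dict(terms)
    changed = True
    while changed:  # repeat until no new term can be added
        changed = False
        for (a, b), (rx1, ry1) in list(result.items()):
            for (b2, c), (rx2, ry2) in list(result.items()):
                if b2 != b or a == c:
                    continue
                if (a, c) in result or (c, a) in result:
                    continue  # the pair is already described
                rx = DEDUCE.get((rx1, rx2))
                ry = DEDUCE.get((ry1, ry2))
                if rx is not None and ry is not None:
                    result[(a, c)] = (rx, ry)
                    changed = True
    return result


if __name__ == "__main__":
    simplified = {("u", "q"): ("pad", "pad"), ("q", "z"): ("sp", "pr")}
    print(reconstruct(simplified))
```

In this example the term for the pair (u, z) is not stored, yet it is recovered from the terms for (u, q) and (q, z).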


5. Experimental Results

The SMART system has been tested on a set of 12 synthetic images consisting of an increasing number of objects arranged randomly in two dimensions. The icons representing the objects of the images, their names (in Italian) and the symbols used in the expressions are shown in Figure 11.

The test cases are summarized in Table 3 and in Figures 12 and 13. Two of the test cases (I7 and I11) are shown in the following in order to observe the reconstruction of the spatio-temporal algebraic expressions. In order to understand the expressions, refer to Figure 11, where the symbols identifying the icons are given, and to Appendix B, where the acronyms used in SMART for Allen’s relations are given. Each expression consists of a sequence of terms where each term has four components: the two objects and their relation in the x and y directions, respectively. Thus, the first term in the complete expression of Figure 12 represents the fact that objects u (uomo – man) and q (quaderno – picture) have the pad relation (before with adjacency) in both x and y directions. Note the number of terms that SMART has eliminated in the simplified expressions.

Complete expression:[u,q,pad,pad,u,z,sp,sp,u,l,sp,sp,u,i,sp,sp,u,s,sp,sp,u,r,sp,sp,u,v,sp,sp,u,o,sp,sp,q,z,sp,pr,q,l,sp,inc,q,i,sp,sp,q,s,sp,sp,q,r,sp,pr,q,v,sp,sp,q,o,sp,sp,z,l,sp,dad,z,i,sp,sp,z,s,sp,sp,z,r,sp,incs,z,v,sp,sp,z,o,sp,sp,l,i,incs,sp,l,s,sp,sp,l,v,sp,sp,l,o,sp,sp,i,s,sp,sp,i,r,sp,sd,i,v,sp,pr,i,o,sp,sp,s,r,pr,sd,s,v,sp,dad,s,o,sp,sp,r,v,pr,sp,r,o,sp,sp,v,o,sp,sp].

SMART simplified expression:[u,q,pad,pad,q,z,sp,pr,q,l,sp,inc,q,r,sp,pr,z,l,sp,dad,z,i,sp,sp,z,r,sp,incs,l,i,incs,sp,l,s,sp,sp,l,v,sp,sp,i,s,sp,sp,i,r,sp,sd,i,v,sp,pr,s,r,pr,sd,s,v,sp,dad,s,o,sp,sp,r,v,pr,sp,v,o,sp,sp].

Manually simplified expression:[u,q,pad,pad,q,z,sp,pr,q,l,sp,inc,q,r,sp,pr,z,l,sp,dad,z,i,sp,sp,z,r,sp,incs,l,i,incs,sp,l,s,sp,sp,l,v,sp,sp,i,s,sp,sp,i,r,sp,sd,i,v,sp,pr,s,r,pr,sd,s,v,sp,dad,s,o,sp,sp,r,v,pr,sp,v,o,sp,sp].

Complete expression:[i,n,sp,sd,i,c,sp,sd,i,d,sp,sd,i,m,sp,sd,i,s,sp,sd,i,a,sp,sd,i,r,sp,sd,i,h,sp,sd,i,l,sp,sd,i,u,sp,dad,i,q,sp,sd,i,z,sp,sd,n,c,inc,sp,n,d,inc,sp,n,m,sp,sp,n,s,sp,sp,n,a,sp,do,n,r,sp,pad,n,h,sp,sp,n,l,sp,pad,n,u,sp,sp,n,q,sp,do,n,z,sp,sp,c,d,inc,sd,c,m,sp,inc,c,s,sp,sd,c,a,sp,sd,c,r,sp,sd,c,h,sp,sd,c,l,sp,sd,c,u,sp,pr,c,q,sp,sd,c,z,sp,sd,d,m,sp,sp,d,s,sp,pr,d,a,sp,sd,d,r,sp,do,d,h,sp,pr,d,l,sp,do,d,u,sp,sp,d,q,sp,sd,d,z,sp,pr,m,s,inc,sd,m,a,incd,sd,m,r,incd,sd,m,h,sp,sd,m,l,sp,sd,m,u,sp,ins,m,q,sp,sd,m,z,sp,sd,s,a,ins,sd,s,r,pr,sd,s,h,sp,do,s,l,sp,sd,s,u,sp,sp,s,q,sp,sd,s,z,sp,do,a,r,incd,sp,a,h,sp,sp,a,l,sp,sp,a,u,sp,sp,a,q,sp,ug,a,z,sp,sp,r,h,sp,sp,r,l,sp,ug,r,u,sp,sp,r,q,sp,sd,r,z,sp,sp,h,l,pr,sd,h,u,pr,sp,h,q,pad,sd,h,z,sp,inc,l,u,pr,sp,l,q,pr,sd,l,z,pr,sp,u,q,pr,sd,u,z,pr,sd,q,z,incd,sp].

SMART simplified expression:[i,n,sp,sd,i,c,sp,sd,i,m,sp,sd,i,u,sp,dad,n,c,inc,sp,n,d,inc,sp,n,m,sp,sp,n,s,sp,sp,n,a,sp,do,n,r,sp,pad,c,d,inc,sd,c,m,sp,inc,c,s,sp,sd,c,u,sp,pr,d,s,sp,pr,d,r,sp,do,d,h,sp,pr,d,z,sp,pr,m,s,inc,sd,m,a,incd,sd,m,r,incd,sd,m,h,sp,sd,m,u,sp,ins,s,a,ins,sd,s,r,pr,sd,s,h,sp,do,s,z,sp,do,a,r,incd,sp,a,q,sp,ug,r,h,sp,sp,r,l,sp,ug,h,l,pr,sd,h,u,pr,sp,h,q,pad,sd,h,z,sp,inc,l,u,pr,sp,l,q,pr,sd,l,z,pr,sp,u,q,pr,sd,u,z,pr,sd,q,z,incd,sp].

Manually simplified expression:[i,n,sp,sd,i,c,sp,sd,i,m,sp,sd,i,u,sp,dad,n,c,inc,sp,n,d,inc,sp,n,m,sp,sp,n,s,sp,sp,n,a,sp,do,n,r,sp,pad,c,d,inc,sd,c,m,sp,inc,c,s,sp,sd,c,u,sp,pr,d,s,sp,pr,d,r,sp,do,d,h,sp,pr,d,z,sp,pr,m,s,inc,sd,m,a,incd,sd,m,r,incd,sd,m,h,sp,sd,m,u,sp,ins,s,a,ins,sd,s,r,pr,sd,s,h,sp,do,s,z,sp,do,a,r,incd,sp,a,q,sp,ug,r,h,sp,sp,r,l,sp,ug,h,l,pr,sd,h,u,pr,sp,h,q,pad,sd,h,z,sp,inc,l,u,pr,sp,l,q,pr,sd,l,z,pr,sp,u,q,pr,sd,u,z,pr,sd,q,z,incd,sp].

Figure 11. The icons, names, and symbols (in parentheses)

Table 3. Results of the simplification

Image   No. of      No. of terms in the    No. of terms in the SMART   No. of terms
        objects N   complete expression    simplified expression       eliminated
                    N*(N−1)/2              N*(N−1)/K                   by SMART
I1      3           3                      2                           1
I2      4           6                      4                           2
I3      5           10                     8                           2
I4      6           15                     11                          4
I5      7           21                     15                          6
I6      8           28                     18                          10
I7      9           36                     19                          17
I8      10          45                     26                          19
I9      11          55                     29                          26
I10     12          66                     39                          27
I11     13          78                     41                          37
I12     14          91                     46                          45

Figure 12. Image I7 (9 objects)

Figure 13. Image I11 (13 objects)

We can add the trivial results shown in Table 4.

Analyzing Table 3 and Figures 14 and 15, we can see that for complete expressions with n objects the number of terms necessary is always equal to n*(n−1)/2. Due to this, as n increases the curve of the terms of the complete expression increases very rapidly. For the simplified expression, on the other hand, the curve increases much less rapidly. In general, we can say that the number of terms to represent in this case is approximately equal to n*(n−1)/k with k ≥ 2. We can justify this by noting that the simplification process starts with a complete expression of size n*(n−1)/2 and eliminates some number of terms. Note that we are not saying that k is a constant; it may depend on n, on the topology of the image, or on some other factors. In the future, we plan to investigate further to determine what k depends on.

Table 4. Trivial results

No. of objects   No. of terms in the    No. of terms in the
                 complete expression    simplified expression
0                0                      0
1                0                      0
2                1                      1

Figure 14. The histogram of the complete and simplified expressions

Figure 15. Graph of the complete and simplified expressions

We will now give a new definition.

Definition 5.1. Let I be an image with n objects (n ≥ 3). Let E be the complete expression which represents I and let E′ be the simplified expression of E. We will define the simplification factor found by SMART for an expression E′ as the quantity k (k ≥ 2) which divides the quantity n*(n−1) in such a way that the number of terms in the expression E′ is exactly n*(n−1)/k. When k = 2, the expression is not simplified.
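Definition 5.1 amounts to a one-line computation. The Python sketch below shows it, using the term counts reported in Table 3 as sample inputs; the function names complete_terms and simplification_factor are hypothetical.

```python
def complete_terms(n: int) -> int:
    """Number of terms in the complete expression: n*(n-1)/2."""
    return n * (n - 1) // 2


def simplification_factor(n: int, simplified_terms: int) -> float:
    """Value k such that simplified_terms == n*(n-1)/k (Definition 5.1)."""
    if n < 2 or simplified_terms <= 0:
        raise ValueError("need at least two objects and one stored term")
    return n * (n - 1) / simplified_terms


if __name__ == "__main__":
    # Image I7 of Table 3: 9 objects, 19 terms after simplification.
    print(complete_terms(9))
    print(round(simplification_factor(9, 19), 3))
```

An expression that is not simplified at all keeps n*(n−1)/2 terms, so the function returns exactly 2 in that case.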

The simplification factors found by SMART in the test cases are shown in Table 5. Figure 16 graphs the simplification factor.

For the 12 images SMART found a simplification factor k which tends to increase as the number of objects increases. Thanks to the simplification, the complexity necessary to represent the terms in the complete expressions can be greatly reduced as n increases. In order to obtain a good estimate of k, it would be necessary to perform a statistical analysis with a much larger number of expressions than the 12 we have used.

In any case, the results obtained so far are very interesting. They show that SMART simplifies the expressions which represent a multimedia document by increasing the factor 2 which divides the quantity n*(n−1). The factor introduced by SMART is k (with k ≥ 2). As a starting point for future research into the actual value of k we note that, based on the test cases observed, k is strongly influenced by: the number of objects in the document; the spatial arrangement of the objects in the document; and the topology used for the construction of the expressions.

We have observed that as the number of objects increases, the number of simplifications inevitably increases. Beyond this, the arrangement of the objects can greatly influence the simplifications since the table of spatial deductions contains some indeterminate situations corresponding to particular spatio-temporal situations. For example, the image I7 has an arrangement of objects which allows for a large amount of simplification. Finally, the topology chosen for the construction of the expressions (the order in which the objects are considered) is another factor which can influence k.

Table 5. Simplification factor found by SMART

Image   No. of objects   Simplification factor
I1      3                3
I2      4                3
I3      5                2.5
I4      6                2.727
I5      7                2.8
I6      8                3.111
I7      9                3.789
I8      10               3.461
I9      11               3.793
I10     12               3.384
I11     13               3.804
I12     14               3.956

Figure 16. Graph of the simplification factor found by SMART (K with SMART compared with K = 2 without SMART)

Figure 17. Raphael’s Transfiguration

Summarizing the results of the experimental session on 20 samples of multimedia documents we can say the following. The degree of simplification is greater when the number of objects is higher. The reduced expression has the advantage of saving storage space in the repository.

Figure 18. SMART simplification and reconstruction of a complex image

We have also tested SMART on realistic images involving a large number of objects. An example of one of these images (Raphael’s painting Transfiguration) is shown in Figure 17. A total of 29 objects were identified for indexing.

The original index contains 812 terms, while the simplified expression contains 488 terms; thus the savings for this index is 39.9%. The simplification factor for this example is 3.328. (These figures appear to count the x and y relations of each term separately: 29 objects give 29*28/2 = 406 terms, that is 812 relations, and 812/(488/2) ≈ 3.328.) This is a significant reduction given that the most important objects in the painting (the people) do not exhibit any particularly regular positions, which might lead us to think that there is not much simplification to take advantage of. The original, simplified, and reconstructed expressions are shown in Figure 18, which is a screen shot of SMART working on this example.
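The two figures quoted for this example can be checked with a few lines of Python. The sketch below is illustrative only; the function names are hypothetical, and the second function rests on the assumption that the counts 812 and 488 tally the x and y relations separately (two per term), which is an interpretation of the reported numbers rather than something stated by SMART.

```python
def savings_percent(original: int, simplified: int) -> float:
    """Percentage of index entries removed by the simplification."""
    if original <= 0:
        raise ValueError("the original index must not be empty")
    return 100.0 * (original - simplified) / original


def factor_from_relation_counts(n: int, simplified_relations: int) -> float:
    """Simplification factor k, ASSUMING two relations (x and y) per term."""
    simplified_terms = simplified_relations / 2
    return n * (n - 1) / simplified_terms


if __name__ == "__main__":
    print(round(savings_percent(812, 488), 1))
    print(round(factor_from_relation_counts(29, 488), 3))
```

Under that assumption both reported values, the 39.9% savings and the factor 3.328, follow from the same pair of counts.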

6. Conclusions and Future Research

The major result of the present research is the development of an image indexing technique based on an algebraic representation which allows for the simplification of the index. This leads to a smaller index size, as shown by the experimental results, including results on complex images. We foresee that this will be an important result for use in digital libraries that have a huge number of images, audio, or videos. In such a situation, the multimedia documents may need to be stored on tertiary storage such as robotic jukeboxes, or they may be stored at various places around the network. The main secondary storage requirements then are for the indexes, so the size of the indexes is an important factor. We can use our indexing technique in this situation to save greatly on the storage requirements. Depending on the documents involved, and based on our experimentation, we might expect to save 40% or more on storage. When presented with a query, we can then scan the simplified indexes in order to decide which ones to reconstruct (we need only reconstruct the ones which contain the objects we are interested in).

The reconstruction algorithms used in SMART are described fully in another work [26]. In the future we intend to continue to investigate the variable k in the formula for the complexity of the indexes to see if we can more precisely determine its value. We will also extend the SMART indexing system to other types of multimedia data, in particular to video and audio. The SMART-produced indexes will be integrated into our Java-based multimedia database system which allows multimedia data to be stored in and retrieved from any relational database management system [27].

The SMART system currently works with multimedia documents which are images. Such documents have relations between any two objects in two dimensions. The system may be easily extended to audio and video [28] objects by noting that pairs of objects in audio clips have relations in one dimension while video clips may be treated as a sequence of image documents (where we have one image per frame or one image per keyframe).

Acknowledgements

Angela Guercio’s research is supported by C.N.R. through the Program ‘Short Term Mobility’. Timothy Arndt’s research is supported by a new faculty research grant from Cleveland State University. This work has been carried out partially under the financial support of the Ministero dell’Università e della Ricerca Scientifica e Tecnologica (MURST) in the framework of the Project ‘Design Methodologies and Tools of High Performance Systems for Distributed Applications (MOSAICO)’.

References

1. A. Yoshitaka & T. Ichikawa (1999) A survey on content-based retrieval for multimedia databases. IEEE Transactions on Knowledge and Data Engineering 11(1), 81–93.

2. P. D. Bruza (1993) Stratified information disclosure. Thesis Publishers, Amsterdam.

3. M. De Marsico, L. Cinque & S. Levialdi (1997) Indexing pictorial document by their content: a survey of current techniques. Image and Vision Computing 15, 119–141.

4. E. Binaghi, I. Gagliardi & R. Schettini (1992) Indexing and fuzzy logic-based retrieval of color images. In: IFIP Transactions A-7, Visual Database System II (Knuth, Wegner, eds). Elsevier Publishers, Amsterdam, pp. 84–95.

5. B. M. Mehtre, M. S. Kankanhalli, A. D. Narasimhalu & G. Chang Man (1995) Color matching for image retrieval. Pattern Recognition Letters 16(3), 325–331.

6. M. J. Swain & D. H. Ballard (1991) Color indexing. International Journal of Computer Vision 7(1), 11–32.

7. A. Califano & R. Mohan (1994) Multimedia indexing for recognizing visual shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-16(4), 373–392.

8. C. Dyer & R. T. Chin (1986) Model-based recognition in robot vision. ACM Computing Survey 8(1), 67–108.

9. R. Mehrota & J. E. Gary (1995) Similar-shape retrieval in shape data management. IEEE Computer 57–62.

10. F. Stein & G. Medioni (1990) Efficient two-dimensional object recognition. In: Proceedings of 10th International Conference on Pattern Recognition, Vol. 1. IEEE CS Press, Los Alamitos, CA, Order No. 2062-17, pp. 238–249.

11. A. Pentland, R. Picard & S. Scaroff (1994) PhotoBook: tools for content-based manipulation of image databases. In: Storage and Retrieval for Image and Video Databases II, Vol. 2185, SPIE, Bellingham, WA, pp. 34–46.

12. M. Nappi, G. Polese & G. Tortora (1996) Fractal based indexing in image database. In: Proceedings of the First IAPR International Workshop on Image Databases and Multi Media Search, Amsterdam, the Netherlands, pp. 162–169.

13. S. K. Chang, Q. Y. Shi & C. W. Yan (1989) Iconic indexing by 2D-strings. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-9(3), 413–427.

14. S. K. Chang & E. Jungert (1991) Pictorial data management based upon the theory of the symbolic projections. Journal of Visual Languages and Computing 2, 195–215.

15. C. C. Chang & C. F. Lee (1995) Relative coordinates oriented symbolic string for spatial relationship retrieval. Pattern Recognition 28(4), 563–570.

16. S. Y. Lee & F. J. Hsu (1990) 2D C-string a new spatial knowledge representation for image database systems. Pattern Recognition 23(10), 1077–1087.

17. S. Y. Lee, M. C. Yang & J. W. Chen (1992) 2D B-string: a spatial knowledge representation for image database systems. In: Proceedings of Second International Computer Science Conference ICSC’92, pp. 609–615.

18. A. F. Abate, M. Nappi, G. Tortora & M. Tucci (1996) Assisted browsing in a diagnostic image database. In: Proceedings of the Workshop on Advanced Visual Interfaces, AVI’96 (T. Catarci, M. F. Costabile, S. Levialdi, G. Santucci eds). ACM Press, pp. 223–232.

19. G. Petraglia, M. Sebillo, M. Tucci & G. Tortora (1993) Towards normalized iconic indexing. In: Proceedings of IEEE Symposium on Visual Languages, St. Louis, pp. 392–394.

20. G. Petraglia, M. Sebillo, M. Tucci & G. Tortora (1996) Virtual image for spatial reasoning in image databases. Internal Report, Dipartimento di Informatica ed Applicazioni, Università di Salerno.

21. M. Nabil, A. H. H. Ngu & J. Shepherd (1996) Picture similarity retrieval using the 2D projection interval representation. IEEE Transactions on Knowledge and Data Engineering 8(4), 533–539.

22. M. Nappi, G. Polese & G. Tortora (1996) An image indexing technique based on contractive functions. Journal of Computing and Information 2(1) Special Issue: Proceeding of Eighth International Conference on Computing and Information ICCI’96, University of Waterloo, On, Canada, pp. 957–978.

23. J. F. Allen (1983) Maintaining knowledge about temporal intervals. Communications of ACM 26, 832–843.

24. M. J. Egenhofer (1991) Point-set topological spatial relations. International Journal of Geographical Information Systems 5(2), 161–174.

25. V. N. Gudivada & V. V. Raghavan (1995) Design and evaluation of algorithms for image retrieval by spatial similarity. ACM Transactions on Information Systems 13(2), 115–144.

26. P. Maresca, T. Arndt & A. Guercio. Deduction Rules for Simplifying Multimedia Representation. Submitted for publication.

27. T. Arndt, A. Guercio & P. Maresca. A layered approach to multimedia database systems. Proceedings of Multimedia Storage and Archiving Systems IV, SPIE Vol. 3846, pp. 441–448.

28. T. Arndt & S. K. Chang (1989) Image sequence compression by iconic indexing. In: Proceedings of IEEE Workshop on Visual Languages, Rome, Italy, pp. 177–182.


Appendix A1: Acronyms used in the paper

STAE: spatio-temporal algebraic expressions
STRG: spatio-temporal relation graph
STRD: spatio-temporal rules of deduction
STSR: spatio-temporal simplification rules
STRR: spatio-temporal reconstruction rules
STAESR: STAE simplification rule
STRER: spatio-temporal reduction rules
SMART: simplified modeling to access and retrieve
STDT: spatio-temporal deduction table

Appendix B: Simplification Phase of SMART

Table B1 gives the relationship between the symbols used in the STDT and the algebraic expressions and their equivalent acronyms used in the SMART knowledge base.

Table B1. Relations, symbols, and acronyms

Name of relation                        Symbol used in the STAE   Acronym used in SMART
Equal                                   "                         ug
Before                                  C                         pr
After                                   /                         do
Strictly before                         (                         sp
Strictly after                          '                         sd
Includes                                %                         inc
Included                                $                         in
Includes with adjac. to the left        (                         incs
Includes with adjac. to the right       )                         incd
Included with adjac. to the left        [                         ins
Included with adjac. to the right       ]                         ind
Before with adjacency                   3                         pad
After with adjacency                    2                         dad

/* SMART simplification phase */
smart :-
    take(E),
    simplify(E, Eout),
    save(Eout).

/* Retrieve E from Express.txt */
take(E) :-
    fopen('Express.txt', 0),
    see('Express.txt'),
    inpos(0),
    read(E),
    fclose('Express.txt'),
    seen.

/* Save Eout in filesol.txt */
save(Eout) :-
    fcreate('filesol.txt', 0),
    tell('filesol.txt'),
    outpos(0),
    write(Eout),
    write(' . '),
    fclose('filesol.txt'),
    told.

/* Simplify E and put the result in Eout */
simplify(E, Eout) :-
    three_reduce(E, Eout1),
    two_reduce(Eout1, Eout2),
    one_reduce(Eout2, Eout).

/* Try to simplify the expression using the relations as they are stored */
three_reduce(E, Eout) :-
    reduce3(E),
    get_solution(Eout).

reduce3(Exp) :-
    ( exist(A, B, R1, R2, Exp),
      exist(B, X, R3, R4, Exp),
      ( exist_ded3(A, X, R1, R2, R3, R4, Exp, Exp1)
      ; exist_ded_inv3(X, A, R1, R2, R3, R4, Exp, Exp1)
      ),
      reduce3(Exp1), !
    )
    ;
    ( save_solution(Exp) ), !.      /* save Exp in a file */

exist_ded3(A, X, R1, R2, R3, R4, Exp, Exp1) :-
    exist(A, X, R5, R6, Exp),
    deduce(R1, R3, R5),
    deduce(R2, R4, R6),
    cancel(A, X, R5, R6, Exp, Exp1), !.

exist_ded_inv3(X, A, R1, R2, R3, R4, Exp, Exp1) :-
    exist(X, A, R7, R8, Exp),
    inverse(R7, R9),
    inverse(R8, R10),
    deduce(R1, R3, R9),
    deduce(R2, R4, R10),
    cancel(X, A, R7, R8, Exp, Exp1), !.

/* Try to simplify the expression inverting the relations of the second term */
two_reduce(E, Eout) :-
    reduce2(E), !,
    get_solution(Eout).

reduce2(Exp) :-
    ( exist(A, B, R1, R2, Exp),
      exist(X, B, R3, R4, Exp),
      inverse(R3, R7),
      inverse(R4, R8),
      ( exist_ded2(A, X, R1, R2, R7, R8, Exp, Exp1)
      ; exist_ded_inv2(X, A, R1, R2, R7, R8, Exp, Exp1)
      ),
      reduce2(Exp1), !
    )
    ;
    ( save_solution(Exp) ).

exist_ded2(A, X, R1, R2, R7, R8, Exp, Exp1) :-
    exist(A, X, R5, R6, Exp),
    deduce(R1, R7, R5),
    deduce(R2, R8, R6),
    cancel(A, X, R5, R6, Exp, Exp1), !.

exist_ded_inv2(X, A, R1, R2, R7, R8, Exp, Exp1) :-
    exist(X, A, R9, R10, Exp),
    inverse(R9, R11),
    inverse(R10, R12),
    deduce(R1, R7, R11),
    deduce(R2, R8, R12),
    cancel(X, A, R9, R10, Exp, Exp1), !.

/* Try to simplify the expression inverting the relations of the first term */
one_reduce(E, Eout) :-
    reduce1(E), !,
    get_solution(Eout).

reduce1(Exp) :-
    ( exist(A, B, R1, R2, Exp),
      inverse(R1, R3),
      inverse(R2, R4),
      exist(A, X, R5, R6, Exp),
      ( exist_ded1(B, X, R3, R4, R5, R6, Exp, Exp1)
      ; exist_ded_inv1(X, B, R3, R4, R5, R6, Exp, Exp1)
      ),
      reduce1(Exp1), !
    )
    ;
    ( save_solution(Exp) ).         /* put Exp in a file */

exist_ded1(B, X, R3, R4, R5, R6, Exp, Exp1) :-
    exist(B, X, R7, R8, Exp),
    deduce(R3, R5, R7),
    deduce(R4, R6, R8),
    cancel(B, X, R7, R8, Exp, Exp1), !.

exist_ded_inv1(X, B, R3, R4, R5, R6, Exp, Exp1) :-
    exist(X, B, R9, R10, Exp),
    inverse(R9, R11),
    inverse(R10, R12),
    deduce(R3, R5, R11),
    deduce(R4, R6, R12),
    cancel(X, B, R9, R10, Exp, Exp1), !.

/* The solution is saved in sol.txt */
save_solution(E) :-
    fcreate('sol.txt', 0),
    tell('sol.txt'),
    outpos(0),
    write(E),
    write(' . '),
    fclose('sol.txt'),
    told.

/* The solution is retrieved from sol.txt */
get_solution(E) :-
    fopen('sol.txt', 0),
    see('sol.txt'),
    inpos(0),
    read(E),
    fclose('sol.txt'),
    del('sol.txt'),
    seen.

/* Does A,C,R1,R2 exist in E? */
exist(A1, C1, Rr1, Rr2, [A1, C1, Rr1, Rr2 | _]).
exist(A1, C1, Rr1, Rr2, [_, _, _, _ | D]) :-
    exist(A1, C1, Rr1, Rr2, D).

/* SPATIO-TEMPORAL DEDUCTION RULES */
deduce(ug,ug,ug).     deduce(ug,inc,inc).   deduce(ug,incs,incs). deduce(ug,incd,incd).
deduce(ug,pad,pad).   deduce(ug,sp,sp).     deduce(ug,pr,pr).     deduce(ug,in,in).
deduce(ug,ins,ins).   deduce(ug,ind,ind).   deduce(ug,dad,dad).   deduce(ug,sd,sd).
deduce(ug,do,do).

deduce(inc,ug,inc).   deduce(inc,inc,inc).  deduce(inc,incs,inc). deduce(inc,incd,inc).

deduce(incs,ug,incs). deduce(incs,inc,inc). deduce(incs,incs,incs). deduce(incs,incd,inc).
deduce(incs,ind,do).  deduce(incs,dad,dad). deduce(incs,sd,sd).

deduce(incd,ug,incd). deduce(incd,inc,inc). deduce(incd,incs,inc). deduce(incd,incd,incd).
deduce(incd,pad,pad). deduce(incd,sp,sp).   deduce(incd,pr,pr).   deduce(incd,ins,pr).

deduce(pad,ug,pad).   deduce(pad,inc,sp).   deduce(pad,incs,pad). deduce(pad,incd,sp).
deduce(pad,pad,sp).   deduce(pad,sp,sp).    deduce(pad,pr,sp).    deduce(pad,ins,pad).

deduce(sp,ug,sp).     deduce(sp,inc,sp).    deduce(sp,incs,sp).   deduce(sp,incd,sp).
deduce(sp,pad,sp).    deduce(sp,sp,sp).     deduce(sp,pr,sp).     deduce(sp,ins,sp).

deduce(pr,ug,pr).     deduce(pr,pad,sp).    deduce(pr,sp,sp).     deduce(pr,ins,pr).

deduce(in,ug,in).     deduce(in,pad,sp).    deduce(in,sp,sp).     deduce(in,in,in).
deduce(in,ins,in).    deduce(in,ind,in).    deduce(in,dad,sd).    deduce(in,sd,sd).

deduce(ins,ug,ins).   deduce(ins,pad,sp).   deduce(ins,sp,sp).    deduce(ins,in,in).
deduce(ins,ins,ins).  deduce(ins,ind,in).   deduce(ins,dad,dad).  deduce(ins,sd,sd).

deduce(ind,ug,ind).   deduce(ind,pad,pad).  deduce(ind,sp,sp).    deduce(ind,in,in).
deduce(ind,ins,in).   deduce(ind,ind,ind).  deduce(ind,dad,sd).   deduce(ind,sd,sd).

deduce(dad,ug,dad).   deduce(dad,incs,sd).  deduce(dad,incd,dad). deduce(dad,ind,dad).
deduce(dad,dad,sd).   deduce(dad,sd,sd).    deduce(dad,do,sd).

deduce(sd,ug,sd).     deduce(sd,incs,sd).   deduce(sd,incd,sd).   deduce(sd,ind,sd).
deduce(sd,dad,sd).    deduce(sd,sd,sd).     deduce(sd,do,sd).

deduce(do,ug,do).     deduce(do,ind,do).    deduce(do,dad,sd).    deduce(do,sd,sd).

/* Cancel A,C,R1,R2 from the expression and return the resulting expression */
cancel(_, _, _, _, [], []).
cancel(A1, C1, R11, R21, [A1, C1, R11, R21 | D], M) :-
    cancel(A1, C1, R11, R21, D, M), !.
cancel(A1, C1, R11, R21, [T1, T2, T3, T4 | D], [T1, T2, T3, T4 | M]) :-
    cancel(A1, C1, R11, R21, D, M).

/* The predicate inverse(R, ?) returns the spatio-temporal relation which is the inverse of R */
inverse(ug,ug).    inverse(inc,in).    inverse(incs,ins).  inverse(incd,ind).
inverse(pad,dad).  inverse(pr,do).     inverse(do,pr).     inverse(sp,sd).
inverse(in,inc).   inverse(ins,incs).  inverse(ind,incd).  inverse(dad,pad).
inverse(sd,sp).
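The control flow of the first reduction level (reduce3 with exist_ded3 and exist_ded_inv3) can be paraphrased in Python. The sketch below is an illustration under assumptions, not a port of the SMART code: terms are held in a dictionary instead of a flat list, the file-based save_solution/get_solution steps are omitted, only a small sample of the deduction rules is included, and the names DEDUCE, INVERSE, reduce_once and simplify are hypothetical.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]
Rel2D = Tuple[str, str]

# Sample of the deduction rules above, keyed by (r1, r2) and giving r3.
DEDUCE: Dict[Pair, str] = {
    ("pad", "sp"): "sp", ("pad", "pr"): "sp", ("sp", "sp"): "sp",
}
# Inverse relations as listed above.
INVERSE: Dict[str, str] = {
    "ug": "ug", "inc": "in", "incs": "ins", "incd": "ind", "pad": "dad",
    "pr": "do", "do": "pr", "sp": "sd", "in": "inc", "ins": "incs",
    "ind": "incd", "dad": "pad", "sd": "sp",
}


def reduce_once(terms: Dict[Pair, Rel2D]) -> bool:
    """Remove one term that two other stored terms make deducible."""
    for (a, b), (r1, r2) in list(terms.items()):
        for (b2, x), (r3, r4) in list(terms.items()):
            if b2 != b or a == x:
                continue
            rx, ry = DEDUCE.get((r1, r3)), DEDUCE.get((r2, r4))
            if rx is None or ry is None:
                continue
            if terms.get((a, x)) == (rx, ry):       # direct case
                del terms[(a, x)]
                return True
            stored = terms.get((x, a))              # term stored as X..A
            if stored is not None and \
                    (INVERSE[stored[0]], INVERSE[stored[1]]) == (rx, ry):
                del terms[(x, a)]
                return True
    return False


def simplify(terms: Dict[Pair, Rel2D]) -> Dict[Pair, Rel2D]:
    """Apply reduce_once until no further term can be removed."""
    result = dict(terms)
    while reduce_once(result):
        pass
    return result


if __name__ == "__main__":
    complete = {("u", "q"): ("pad", "pad"), ("q", "z"): ("sp", "pr"),
                ("u", "z"): ("sp", "sp")}
    print(simplify(complete))
```

The second and third levels of the listing follow the same pattern, with the relations of the second or first term inverted before the deduction is attempted.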