
A Semantic Approach and a Web Tool for Contextual Annotation of Photos Using Camera Phones

Windson Viana1, José Bringel Filho2, Jérôme Gensel, Marlène Villanova-Oliver, Hervé Martin

Laboratoire d’Informatique de Grenoble, équipe STEAMER

681, rue de la Passerelle, 38402 Saint Martin d’Hères, France {carvalho,bringel, gensel, villanov, martin}@imag.fr

Abstract. The increasing number of personal digital photos on the Web makes their management, retrieval and visualization a difficult task. Annotating these images with Semantic Web technologies is an emerging solution to compensate for the lack of description in photo files. However, existing manual annotation tools are time consuming for users. In this context, this paper proposes a semi-automatic approach for annotating photos and photo collections that combines OWL-DL ontologies with contextual metadata acquired by mobile devices. We also describe a mobile and Web-based system, called PhotoMap, which provides automatic annotation of the spatial, temporal and social contexts of a photo (i.e., where, when, and who). PhotoMap uses Web Services and Semantic Web reasoning methods to infer information about the taken photos and to improve both browsing and retrieval.

Keywords: Semantic Web, ontologies, context sensing, mobile devices, photo annotation, and spatial Web 2.0.

1 Introduction

One of the main motivations for taking personal photos is to keep a trace of a situation that can be shared afterwards with family and friends through Web photo gallery systems (e.g., Picasa). Shooting new pictures has become easier with modern digital cameras, and the storage cost of these devices has decreased enormously. Hence, the number of photos in personal digital collections has grown rapidly, and looking for a specific photo to share has become a frustrating activity [5][7][10]. A similar difficulty is found in off-the-shelf Web image search engines: a keyword search generally returns images that the user did not expect [7]. Both problems are related to the lack of description in image files. One solution to facilitate organization and retrieval is to attach machine-readable annotations to the photos. This image metadata describes both the

1 Supported by CAPES - Brasil
2 Supported by the Programme Alban, the European Union Programme of High Level Scholarships for Latin America, scholarship no. E06D104158BR

photo context (e.g., where and when the picture has been taken) and information about the photo content (e.g., who are the people in the photo, what are they doing) [3][4][6]. Such annotations can be exploited by Web image search engines to let users retrieve their photos by searching on content and context information instead of only querying filenames. In addition, context annotation of a photo supports the design of better image browsing and management systems [6]. Several Web applications and research works provide tools to help create manual annotations. For instance, Flickr allows the association of spatial information (location tags) and free-text keywords with photos. Other applications, such as [2] and [3], offer more powerful metadata descriptions by using conceptual graphs and spatial ontologies in RDF. Despite innovative annotation tools such as the ESP Game3, manual annotation remains a time-consuming and tedious activity. The multimedia research community has proposed systems that automatically suggest image annotations. These systems index photos by their low-level visual features (e.g., colors, textures, shapes) and extract annotations with algorithms that identify similarities between a new image and pre-annotated pictures. However, there is a semantic gap between the annotations recommended by these systems and the annotations users actually want [6][10].

In the near future, new mobile devices may change photo annotation radically. These devices will have high-resolution built-in cameras, and most users will progressively replace their traditional digital cameras with this new generation of mobile devices. Moreover, the progressive addition of sensors to mobile devices (e.g., GPS, RFID readers, and temperature sensors) allows the acquisition of a large amount of contextual information about the user's situation when she uses her camera phone to take a photo [9]. The interpretation of, and inference over, this context data generates new information that can be employed to annotate photos automatically [5]. In addition, with mobile applications users can assign metadata at the capture point of their photos, avoiding the so-called "time lag problem" of postponed desktop annotation [5]. In this context, we propose a semi-automatic approach for annotating photos using OWL-DL ontologies and mobile devices. In this article, we present an ontology called ContextPhoto that allows manual and automatic context annotation of photos and photo collections. In order to validate the proposed context annotation process, we have also designed and developed a mobile and Web information system. This novel system, called PhotoMap, advances over related mobile annotation systems by providing automatic annotation of the spatial, temporal and social contexts of a photo (i.e., where, when, and who was nearby). PhotoMap also offers a Web interface for spatial and temporal navigation in photo collections. This interface exploits spatial Web 2.0 services to show where the user took her photos and the itinerary she followed to take them. In addition, users can look into the inferred and manual annotations of their photos and exploit them for retrieval purposes.

This article is organized as follows: section 2 presents our ontology for photo annotation; section 3 gives an overview of the proposed Web and mobile system; section 4 discusses related work in image annotation and photo visualization; and, finally, we conclude in section 5 and outline future work.

3 http://www.espgame.org/

2 ContextPhoto Ontology

Annotation is the main means of associating semantics with an image. Annotation highlights the significant role of photos in restoring forgotten memories of visited places, party events, and people. In addition, annotation allows the development of better organization, retrieval and visualization processes for personal image management. Existing approaches for multimedia annotation employ several representation structures to associate metadata with images. These structures may be attribute-value pairs inserted in the header of image files (e.g., the EXIF and IPTC formats), or more expressive representations such as the MPEG-7 standard and RDF/OWL ontologies. In the context of the Semantic Web, using ontologies to represent annotations is the most suitable way to make the content machine-understandable. Ontologies can reduce the semantic gap between what image search engines find and what people expect to retrieve when they search. Moreover, with a formal and explicit description, reasoning methods can be employed to infer information about the content of a photo and its related context. In the field of multimedia annotation, different ontologies for image metadata representation have been proposed [11][12][2][3]. These ontologies are well suited for extensive content-oriented annotation. In our approach, however, we are more interested in representing the context of the user (social, spatial, temporal) when she took her photos. We claim, as the authors of [6] and [7] do, that contextual metadata are particularly useful for photo organization and retrieval since, for instance, knowing the location and time of a photo often says a lot about the photo itself before a single pixel is examined. A graphical representation of our ontology, called ContextPhoto, is presented in Fig. 1. ContextPhoto is an OWL-DL ontology for annotating photos and photo collections with contextual metadata and textual content descriptions.

[Fig. 1 diagram omitted. Panel (a) shows the collection structure: an EventCollection has an Interval (HasInterval), Track Points (HasTrackPoint) and Photos (HasPhoto); each Photo has a ContentAnnotation (HasAnnotation) and a Shot Context (HasCtxAnnotation). Panel (b) details the Shot Context subclasses: Spatial (physical location; latitude, longitude, elevation; city name; country), Temporal (day, month, year; time of day; day of week), Spatiotemporal (season, weather conditions, temperature, light status), Computational (camera properties, nearby Bluetooth devices) and Social (nearby people, known people).]

Fig. 1. ContextPhoto Ontology

The main idea of ContextPhoto is to associate a spatial itinerary and a temporal interval with a photo collection. The authors of [6] and [4] have pointed out that grouping photos into events (e.g., a vacation, a tourist visit, a party) is one of the most common ways people recall and organize their photos. The EventCollection concept of ContextPhoto represents a photo collection associated

with an event. A time interval and an ordered list of track points (timestamp and geographic coordinates) are linked to EventCollection (see Fig. 1a). The EventCollection property hasPhotos describes the photos (Photo) related to the collection. The Photo concept contains the basic image properties (e.g., name, size, width and height). Besides these properties, each photo has two types of annotation: a content annotation (Content Annotation) and contextual annotations (Shot Context). An annotation system can exploit our ontology for manual content annotation and can suggest keyword tags derived from the shot context annotation. The integration of ContextPhoto with other image annotation ontologies, such as the Visual Descriptor Ontology (VDO) [12], can also be envisioned to offer wider description possibilities. ContextPhoto supports five types of contextual metadata: spatial (Spatial Context), temporal (Temporal Context), social (Social Context), computational (Computational Context), and spatiotemporal (SpatioTemporal Context). These concepts correspond to the major elements for describing a photo (i.e., where, when, who, with what) [6][4][7].
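To make this structure concrete, the sketch below shows how the core ContextPhoto classes and properties of Fig. 1 could be declared with the Jena ontology API; the namespace URI and the exact identifiers are assumptions made for illustration, since the paper defines the ontology in OWL-DL rather than in code.

```java
import org.apache.jena.ontology.ObjectProperty;
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;

public class ContextPhotoSkeleton {

    // Hypothetical namespace; the real ontology URI is not given in the paper.
    static final String NS = "http://example.org/contextphoto#";

    public static OntModel build() {
        OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);

        OntClass eventCollection   = m.createClass(NS + "EventCollection");
        OntClass photo             = m.createClass(NS + "Photo");
        OntClass shotContext       = m.createClass(NS + "ShotContext");
        OntClass spatialCtx        = m.createClass(NS + "SpatialContext");
        OntClass temporalCtx       = m.createClass(NS + "TemporalContext");
        OntClass socialCtx         = m.createClass(NS + "SocialContext");
        OntClass computationalCtx  = m.createClass(NS + "ComputationalContext");
        OntClass spatioTemporalCtx = m.createClass(NS + "SpatioTemporalContext");

        // The five context types are modelled as specialisations of ShotContext.
        shotContext.addSubClass(spatialCtx);
        shotContext.addSubClass(temporalCtx);
        shotContext.addSubClass(socialCtx);
        shotContext.addSubClass(computationalCtx);
        shotContext.addSubClass(spatioTemporalCtx);

        // A collection is linked to its photos, and each photo to its shot-context annotation.
        ObjectProperty hasPhoto = m.createObjectProperty(NS + "hasPhoto");
        hasPhoto.addDomain(eventCollection);
        hasPhoto.addRange(photo);

        ObjectProperty hasCtxAnnotation = m.createObjectProperty(NS + "hasCtxAnnotation");
        hasCtxAnnotation.addDomain(photo);
        hasCtxAnnotation.addRange(shotContext);

        return m;
    }
}
```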

Location is the most useful information for recalling personal photos. A person may remember a picture using an address ("Champs Élysées avenue") or a more imprecise description of a place ("Disneyland"). However, systems that acquire location information (GPS, A-GPS) describe location in terms of georeferenced coordinates (latitude, longitude, elevation, coordinate system). Hence, the Spatial Context concept incorporates several semantic levels of location description. For instance, in order to describe places geometrically (polygons, lines, points), ContextPhoto imports the NeoGeo4 ontology. NeoGeo is an OWL-DL representation of the core concepts of GML (Geography Markup Language), an open interchange format defined by the Open Geospatial Consortium (OGC) to express geographical features. Using the gml:Point concept, ContextPhoto defines a subclass of Spatial Context to represent GPS coordinates. We have also added properties to store the elevation data and the imprecision information of the GPS receiver (number of satellites, quality data). The other levels of location representation are defined using textual descriptions for complete addresses and hierarchical descriptions for imprecise locations (Europe → Paris → Disneyland).

Time is another important aspect of the organization and retrieval of personal photos. However, a specific date is not the temporal attribute most used when a person tries to find a photo chronologically [6]. When a photo was not taken on a date that can be easily remembered (e.g., a birthday, Christmas), people use the month, the day of week, the time of day, and/or the year to find the desired photo. Thus, the Temporal Context concept allows the association of an instant (date and time) with a photo, along with the different time interpretations and attributes listed above. In order to represent time intervals and time instants in ContextPhoto and to allow temporal reasoning over the photo annotations, we reuse the concepts of the OWL-Time ontology5. The SpatioTemporal Context concept contains attributes of the shot context of a photo that depend on both time and location data to be computed. In this first version of ContextPhoto, we have defined only one spatiotemporal class: the physical environment. This concept has properties that

4 http://mapbureau.com/neogeo/neogeo.owl
5 http://www.w3.org/TR/owl-time/

describe the season, the temperature, the weather conditions, and the daylight status (e.g., day, night, after sunset) at the time the user took her photo. These spatiotemporal attributes are employed both for search purposes and to enrich the information describing a photo. The Computational Context concept describes all the digital devices present at the time of the photo shot (i.e., the camera and the surrounding Bluetooth devices). This concept groups two other classes: Camera and BluetoothDevice. The Camera class describes the characteristics of the digital camera and the camera settings used to take the photograph (e.g., aperture, shutter speed, and focal length). This concept integrates the core attributes of the EXIF format. The BluetoothDevice class contains the Bluetooth addresses of the nearby devices. This concept plays a central role in the inference of the photo's social context.
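As an illustration of the metadata produced for a single shot, the following Jena sketch instantiates a minimal shot context: a GPS position with GPS quality data, a time instant, the camera settings, and one nearby Bluetooth device. All URIs, property names, and values are assumptions for the example.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

public class AnnotatePhoto {

    // Hypothetical namespace; the real ontology URI is not given in the paper.
    static final String NS = "http://example.org/contextphoto#";

    public static Model annotate() {
        Model m = ModelFactory.createDefaultModel();
        Property hasCtx = m.createProperty(NS, "hasCtxAnnotation");

        // Spatial context: raw GPS position plus GPS quality information.
        Resource spatial = m.createResource(NS + "spatialCtx_42")
                .addProperty(m.createProperty(NS, "latitude"),  m.createTypedLiteral(45.1885))
                .addProperty(m.createProperty(NS, "longitude"), m.createTypedLiteral(5.7245))
                .addProperty(m.createProperty(NS, "elevation"), m.createTypedLiteral(212.0))
                .addProperty(m.createProperty(NS, "nbSatellites"), m.createTypedLiteral(6));

        // Temporal context: the shot instant; facets such as day of week are added server-side.
        Resource temporal = m.createResource(NS + "temporalCtx_42")
                .addProperty(m.createProperty(NS, "instant"), "2007-05-12T18:35:00");

        // Computational context: camera settings and one nearby Bluetooth address.
        Resource computational = m.createResource(NS + "computationalCtx_42")
                .addProperty(m.createProperty(NS, "hasCamera"),
                        m.createResource(NS + "camera_42")
                                .addProperty(m.createProperty(NS, "aperture"), "f/2.8"))
                .addProperty(m.createProperty(NS, "hasNearbyDevice"),
                        m.createResource(NS + "btDevice_1")
                                .addProperty(m.createProperty(NS, "btAddress"), "00:12:D2:AB:34:F1"));

        // The photo individual is linked to its shot-context elements.
        m.createResource(NS + "photo_42")
                .addProperty(m.createProperty(NS, "name"), "IMG_0042.jpg")
                .addProperty(hasCtx, spatial)
                .addProperty(hasCtx, temporal)
                .addProperty(hasCtx, computational);
        return m;
    }
}
```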

One of the innovative characteristics of the ContextPhoto ontology is its ability to describe the social context of a photo. The description of the social context is based on the proposal of [5] and of FoafMobile6 to use the Bluetooth addresses of personal devices to detect the presence of a person. The main idea is to associate a hash value of a Bluetooth address with a unique Friend-Of-A-Friend7 (FOAF) profile. FOAF is an RDF ontology that allows the description of a person (name, email, personal image) and of her social network (the person's acquaintances). Thus, we can use the nearby Bluetooth addresses stored in the Computational Context element to determine whether these addresses identify friends of the user and, afterwards, to annotate the photo with the names of the nearby acquaintances. In order to represent acquaintances and the user, we define in ContextPhoto two classes that can be elements of the Social Context concept of a photo: the Person and Owner classes. Person is defined as a subclass of foaf:Person that has a Bluetooth device (see Formula 1). Owner describes the owner of the photo, i.e., the photographer and user of our system (see Formula 2). The inference process that identifies nearby people is presented in detail in section 3.2.

Person ⊑ foaf:Person ⊓ ∃hasBTDevice.ctxt:BTDevice    (1)

Owner ⊑ Person ⊓ ∃hasEventCollection.ctxt:EventCollection    (2)
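The link between a captured Bluetooth address and a FOAF profile relies on publishing a digest of the address in the profile, in the spirit of foaf:mbox_sha1sum. A minimal sketch of that hashing step is given below; the normalisation rule is our own assumption.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class BtHash {

    /** Returns the lower-case hexadecimal SHA-1 digest of a Bluetooth address. */
    public static String sha1OfBtAddress(String btAddress) throws NoSuchAlgorithmException {
        // Normalise so that "00:12:D2:AB:34:F1" and "0012D2AB34F1" hash identically (assumed rule).
        String normalised = btAddress.replace(":", "").toUpperCase();

        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        byte[] bytes = digest.digest(normalised.getBytes());

        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Publishing only the digest avoids exposing the raw device address in the public FOAF profile.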

3 PhotoMap

PhotoMap is a mobile and Web information system for the contextual annotation of photos taken by mobile users. The three main goals of PhotoMap are: (1) to offer a mobile application enabling users to take pictures and to group their photos into event collections; (2) to propose a Web system that organizes the user's photos using the acquired spatial and temporal data; and (3) to improve users' recall of their photos by showing the inferred spatial, temporal, and social information.

6 http://jibbering.com/discussion/Bluetooth-presence.1
7 http://www.foaf-project.org/

3.1 General Overview

The PhotoMap system is structured according to a client-server model. The PhotoMap client is a mobile application running on J2ME8-enabled devices. Using this mobile application, users can create collections of photos representing events (parties, concerts, tourist visits). The user can give a name and a textual description when she starts a collection. Fig. 2a illustrates a scenario of use of the PhotoMap mobile application. The PhotoMap client runs a background process that monitors the current physical location of the mobile device. The mobile application accesses the device sensors (e.g., built-in GPS, Bluetooth-enabled GPS receiver) via the Location API8

(i.e., JSR 179) or via the Bluetooth API8 (i.e., JSR 82). The gathered coordinates (latitude, longitude, and elevation) are stored in order to build a list of track points. This list represents the itinerary followed by the user to take the pictures. The acquired track list is associated with the metadata of the current photo collection.
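A condensed sketch of this client-side acquisition, based on the standard JSR 179 (Location) and JSR 82 (Bluetooth) APIs, is shown below; the class itself is hypothetical, and error handling and the PhotoMap-specific data structures are omitted.

```java
import java.util.Vector;

import javax.bluetooth.DeviceClass;
import javax.bluetooth.DiscoveryAgent;
import javax.bluetooth.DiscoveryListener;
import javax.bluetooth.LocalDevice;
import javax.bluetooth.RemoteDevice;
import javax.bluetooth.ServiceRecord;
import javax.microedition.location.Criteria;
import javax.microedition.location.Location;
import javax.microedition.location.LocationProvider;
import javax.microedition.location.QualifiedCoordinates;

/** Gathers one track point and the nearby Bluetooth addresses (J2ME sketch). */
public class ContextSensor implements DiscoveryListener {

    private final Vector btAddresses = new Vector();

    /** Reads the current GPS position through the Location API (JSR 179). */
    public double[] readTrackPoint() throws Exception {
        Criteria criteria = new Criteria();
        criteria.setAltitudeRequired(true);
        LocationProvider provider = LocationProvider.getInstance(criteria);
        Location location = provider.getLocation(60);          // timeout in seconds
        QualifiedCoordinates c = location.getQualifiedCoordinates();
        return new double[] { c.getLatitude(), c.getLongitude(), c.getAltitude() };
    }

    /** Starts a Bluetooth inquiry (JSR 82); discovered addresses arrive asynchronously. */
    public void scanNearbyDevices() throws Exception {
        DiscoveryAgent agent = LocalDevice.getLocalDevice().getDiscoveryAgent();
        agent.startInquiry(DiscoveryAgent.GIAC, this);
    }

    public void deviceDiscovered(RemoteDevice device, DeviceClass cod) {
        btAddresses.addElement(device.getBluetoothAddress());
    }

    public void inquiryCompleted(int discType) { /* the annotation can now be written */ }

    public void servicesDiscovered(int transID, ServiceRecord[] records) { }

    public void serviceSearchCompleted(int transID, int respCode) { }
}
```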

Fig. 2. PhotoMap use and the generated annotation

In addition, the PhotoMap client captures the photo context when a user takes a picture with her camera phone. The mobile client gets the device's geographic position, the date and time information, the Bluetooth addresses of the nearby devices, and the configuration properties of the digital camera. All this metadata is stored for each photo of the collection. After taking a picture, the user can add manual annotations to each photo. This feature reduces the time lag problem that occurs with desktop annotation tools [1], since the user is still involved in the shot situation when she writes the photo's textual metadata. The user indicates the end of the current photo collection with two clicks in the interface of the PhotoMap client. The mobile application then generates the collection annotation using the ContextPhoto ontology. The textual description of the collection, its start and end instants, and the track point list are added to the annotation. For each photo, the gathered context (i.e., the computational, spatial, and temporal contexts) and the possible manual annotation are stored in the ontology instantiation. The taken photos and the generated annotation are stored in the mobile device file system (see Fig. 2b). Afterward, the user uploads her

8 http://java.sun.com/javame/

photos and the annotation metadata of the collection to the PhotoMap Web server. The user can execute this upload directly from her mobile phone (e.g., via an HTTP connection). PhotoMap offers an alternative in order to avoid data service charges: the user can transfer her photo collections and their annotations to a desktop computer (e.g., via a USB connection) and then use the Web interface of PhotoMap to finish the upload process. The PhotoMap server is a J2EE Web-based application. Besides acting as an upload gateway, the PhotoMap server is in charge of indexing the photos and of running the inference process over the associated contextual annotations. Fig. 3 shows an overview of the PhotoMap architecture.

Fig. 3. Overview of the PhotoMap architecture

After the transmission of a collection and its semantic metadata, the PhotoMap server reads the annotations associated with each photo. The spatial and temporal information is translated into a more useful representation. For example, geographical coordinates are transformed into a textual representation such as city and country names. In addition, PhotoMap accesses off-the-shelf Web Services in order to get information about the physical environment in which each photo was taken (e.g., temperature, weather conditions). Furthermore, PhotoMap uses Semantic Web technologies to infer information about the social context of each photo: it automatically identifies the user's friends who were nearby at the shot time of each photo. The inferred information about each photo is added to the ContextPhoto instantiation. The generated annotations are exploited by the visualization tool of PhotoMap. After an indexing process, the PhotoMap Web site allows the user to view her photo collections and the itineraries followed to take them. The user can see on a map, using spatial Web 2.0 services, the places where she took her photos. She also has access to the rich generated annotation.

3.2 Interpretation, Inference and Indexing Processes

Fig. 4 shows the sequence of the three fundamental processes performed by the PhotoMap server in order to increase knowledge about the shot context of a photo and to optimize the mechanisms for spatial and temporal queries. When a user sends a photo collection to PhotoMap, the annotation contains only information about the

computational, spatial and temporal contexts of a photo. Thus, PhotoMap accesses off-the-shelf Web Services in order to augment the description of these context annotations. This approach, proposed by [6], reduces the development cost and profits from the advantages of Web Services technology (i.e., reuse, standardization). First, PhotoMap interprets the gathered spatial metadata. The PhotoMap server uses a Web Service to transform the GPS data of each photo into physical addresses. The AddressFinder Web Service9 offers a hierarchical description of an address at different levels of precision (i.e., from country and city name only to a complete address). All the Web Service responses are stored in the datatype properties of the Spatial Context subclasses. The second interpretation phase is the separation of the temporal attributes of the date/time property. The PhotoMap server computes the day of week, month, time of day, and year properties from the instant value. Next, the PhotoMap server gets information about the physical environment as it was at the photo shot time. PhotoMap derives the temperature, season, light status, and weather conditions using the GPS data and the date/time annotated by the PhotoMap mobile client. We use the Weather Underground10 Web Service to get the weather conditions and temperature, and the Sunrise and Sunset Times11 Web Service to get the light status. The season property is easily calculated using the date and GPS data.
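The temporal part of this interpretation step can be sketched as follows; the day-of-week, time-of-day and season facets are derived from the annotated instant and the GPS latitude (the time-of-day boundaries and the month-based season rule are our own simplifying assumptions, as the paper does not specify them).

```java
import java.util.Calendar;
import java.util.Date;

public class TemporalInterpreter {

    /** Derives the day-of-week facet, e.g. "Saturday". */
    public static String dayOfWeek(Date instant) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(instant);
        String[] names = { "Sunday", "Monday", "Tuesday", "Wednesday",
                           "Thursday", "Friday", "Saturday" };
        return names[cal.get(Calendar.DAY_OF_WEEK) - 1];
    }

    /** Derives a coarse time-of-day facet (boundaries are illustrative). */
    public static String timeOfDay(Date instant) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(instant);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        if (hour < 6)  return "night";
        if (hour < 12) return "morning";
        if (hour < 18) return "afternoon";
        return "evening";
    }

    /** Derives the season from the month and the hemisphere given by the GPS latitude. */
    public static String season(Date instant, double latitude) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(instant);
        int month = cal.get(Calendar.MONTH);              // 0 = January
        String[] north = { "winter", "winter", "spring", "spring", "spring", "summer",
                           "summer", "summer", "autumn", "autumn", "autumn", "winter" };
        String[] south = { "summer", "summer", "autumn", "autumn", "autumn", "winter",
                           "winter", "winter", "spring", "spring", "spring", "summer" };
        return latitude >= 0 ? north[month] : south[month];
    }
}
```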

Fig. 4. Interpretation, inference and indexing process

After the interpretation process, the PhotoMap server executes the inference process in order to derive the social context of a photo. This process gets the FOAF profiles of the user's friends and tries to identify who was present at the moment of a photo shot. In our approach, we use the FOAF profile of the photo's owner as the starting point of a search. We navigate through the foaf:knows and rdfs:seeAlso properties to get the FOAF profiles of the people she knows. After that, we repeat the same process with the friends' profiles in order to find the friends of the owner's friends. All the found profiles are used to instantiate individuals of the Person class. After the instantiation of the Person and Owner individuals, we use a rule-based engine to infer which acquaintances were present at the moment of the photo shot. Formula 3 shows the SWRL rule used to infer the presence of an owner's friend.
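Before the rule is applied, the profile-gathering step can be sketched with the Jena API as follows: starting from the owner's FOAF profile, the crawler follows foaf:knows together with rdfs:seeAlso links and aggregates the reachable profile documents (the helper class is hypothetical and error handling is omitted).

```java
import java.util.HashSet;
import java.util.Set;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.StmtIterator;
import org.apache.jena.vocabulary.RDFS;

/** Depth-limited crawl of the FOAF profiles reachable from the owner's profile (sketch). */
public class FoafCrawler {

    static final String FOAF = "http://xmlns.com/foaf/0.1/";

    private final Set<String> visited = new HashSet<String>();
    private final Model graph = ModelFactory.createDefaultModel();

    /** Loads the profile at the given URL and follows foaf:knows / rdfs:seeAlso links. */
    public Model crawl(String profileUrl, int depth) {
        if (depth < 0 || !visited.add(profileUrl)) {
            return graph;
        }
        Model profile = ModelFactory.createDefaultModel().read(profileUrl);
        graph.add(profile);

        Property knows = profile.createProperty(FOAF, "knows");
        StmtIterator it = profile.listStatements(null, knows, (RDFNode) null);
        while (it.hasNext()) {
            Resource friend = it.next().getObject().asResource();
            // The friend's own profile document is usually pointed to by rdfs:seeAlso.
            if (friend.hasProperty(RDFS.seeAlso)) {
                String friendProfile =
                        friend.getProperty(RDFS.seeAlso).getObject().asResource().getURI();
                crawl(friendProfile, depth - 1);
            }
        }
        return graph;
    }
}
```

Invoked with the owner's profile URL and a depth of 2, the crawl gathers the owner's profile, her friends' profiles, and the friends-of-friends profiles used to instantiate the Person individuals.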

9 http://ashburnarcweb.esri.com/
10 www.weatherunderground.com/
11 http://www.earthtools.org/

At the end of the inference process, individuals representing the nearby acquaintances are associated with the Social Context of the current photo. Hence, the final OWL annotation of a photo contains spatial, temporal, spatiotemporal, computational and social information. The PhotoMap server then executes an indexing process in order to optimize browsing and interaction methods. The number of photo collection annotations is expected to grow quickly in our system. To avoid sequential searches over these OWL annotations, spatial and temporal indexes are generated for each collection using the PostgreSQL database extended with the PostGIS module.

Owner(?owner) ^ Person(?person) ^ SocialContext(?scctxt) ^ ComputationalContext(?compctxt) ^ BTDevice(?btDv) ^ foaf:knows(?person, ?owner) ^ hasBTDevice(?person, ?btDv) ^ hasContextElement(?scctxt, ?owner) ^ hasContextElement(?compctxt, ?btDv) → hasContextElement(?scctxt, ?person)    (3)
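The paper encodes this rule in SWRL; as an illustration only, an equivalent formulation in Jena's own rule syntax, executed with Jena's generic rule reasoner, might look as follows (the ctxt namespace and the property URIs are assumptions).

```java
import java.util.List;

import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.reasoner.rulesys.GenericRuleReasoner;
import org.apache.jena.reasoner.rulesys.Rule;

public class SocialContextInference {

    // Equivalent of rule (3) in Jena rule syntax; all URIs are illustrative.
    static final String RULES =
        "@prefix ctxt: <http://example.org/contextphoto#>. " +
        "@prefix foaf: <http://xmlns.com/foaf/0.1/>. " +
        "[nearbyFriend: " +
        "  (?owner    rdf:type ctxt:Owner) " +
        "  (?person   rdf:type ctxt:Person) " +
        "  (?scctxt   rdf:type ctxt:SocialContext) " +
        "  (?compctxt rdf:type ctxt:ComputationalContext) " +
        "  (?btDv     rdf:type ctxt:BTDevice) " +
        "  (?person foaf:knows ?owner) " +
        "  (?person ctxt:hasBTDevice ?btDv) " +
        "  (?scctxt   ctxt:hasContextElement ?owner) " +
        "  (?compctxt ctxt:hasContextElement ?btDv) " +
        "  -> (?scctxt ctxt:hasContextElement ?person) ]";

    /** Applies the rule to the annotation model and returns the inference-augmented model. */
    public static InfModel inferNearbyFriends(Model annotations) {
        List<Rule> rules = Rule.parseRules(RULES);
        GenericRuleReasoner reasoner = new GenericRuleReasoner(rules);
        return ModelFactory.createInfModel(reasoner, annotations);
    }
}
```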

3.3 Browsing and Querying Photos

The PhotoMap Web site offers graphical interfaces for navigating and querying the users' captured photo collections. We decided to design the PhotoMap Web site around map-based interfaces for browsing photos. Usability studies [8] show that map interfaces offer more interactivity advantages than browsing photos only through hierarchical location links (e.g., Europe -> France -> Grenoble). Furthermore, with a map-based interface, we can easily represent the itineraries followed by the users when taking their photos. Fig. 5 shows a screen shot of the PhotoMap Web site.

Fig. 5. Screen shot of the PhotoMap Web Site

The rectangular region positioned at the left side of the Web site is called the menu-search view. This window shows the PhotoMap main functionalities, the social network of the user, and a keyword search engine for her photo collections. On the right side, from top to bottom, are presented the event-collection view, the spatial view, and the temporal query window. When a user enters the PhotoMap Web site, PhotoMap uses the temporal index to determine the ten latest collections. PhotoMap then shows, in the event-collection view, a thumbnail of the first photo of each collection together with the collection name annotated by the user. The latest collection is selected and the itinerary followed by the user is displayed in the spatial view. Placemarks are also inserted in the map for each photo, and the user can click on them to view the photo and the generated annotation. Fig. 5b shows the contextual information of a selected photo.

The spatial view displays maps using the Google Maps API. Navigating the map in spatial query mode changes the displayed map and also the visible collections in the event-collection view. In order to perform this operation, PhotoMap queries the spatial database, which is constrained to return only the collections intersecting a defined view box. PhotoMap uses the bounding box coordinates of the spatial view and the current zoom value in order to calculate the view box coordinates. The temporal query window can be used to restrict the displayed collections (e.g., to show only the collections intersecting the view box and taken in January). A mouse click on a placemark shows the photo and the context annotation. PhotoMap reads the collection and photo annotations using the Jena API.
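The spatial filtering described above can be expressed as a single query against the PostGIS index; the sketch below, with assumed table and column names and current PostGIS functions, returns the collections whose itinerary intersects the view box computed from the map.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

public class CollectionSpatialQuery {

    /**
     * Returns the identifiers of the collections whose itinerary geometry intersects
     * the view box of the map (WGS84 coordinates). Table and column names are assumptions.
     */
    public static List<Integer> collectionsInViewBox(Connection db, double minLon, double minLat,
                                                     double maxLon, double maxLat) throws Exception {
        String sql =
            "SELECT collection_id FROM collection_index " +
            "WHERE itinerary && ST_MakeEnvelope(?, ?, ?, ?, 4326) " +
            "  AND ST_Intersects(itinerary, ST_MakeEnvelope(?, ?, ?, ?, 4326))";

        PreparedStatement stmt = db.prepareStatement(sql);
        for (int i = 0; i < 2; i++) {           // the same envelope is bound twice
            stmt.setDouble(i * 4 + 1, minLon);
            stmt.setDouble(i * 4 + 2, minLat);
            stmt.setDouble(i * 4 + 3, maxLon);
            stmt.setDouble(i * 4 + 4, maxLat);
        }
        List<Integer> ids = new ArrayList<Integer>();
        ResultSet rs = stmt.executeQuery();
        while (rs.next()) {
            ids.add(rs.getInt("collection_id"));
        }
        return ids;
    }
}
```

The bounding-box operator (&&) lets the spatial index prune candidate collections before the exact intersection test is applied.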

4 Related Work

The possibility of accessing both the spatial and temporal data of an image is exploited by the tools PhotoCompas [6] and WWMX [8]. PhotoCompas uses Web Services to derive information about the location and the physical environment (e.g., weather conditions, temperature). PhotoCompas computes image clusters using the time and spatial data in order to represent events as recall cues. All the generated information is used for browsing the photos of the collection. WWMX is a Microsoft project with the goal of browsing large databases of images using maps. WWMX offers different possibilities for displaying images on a map and uses GIS databases in order to index the location information of the photos.

In these approaches, obtaining the context information associated with the images is difficult. Although traditional digital cameras equipped with GPS receivers exist, these devices are not widely used. In addition, the manual association of location tags with an image is a time-consuming task. Our approach addresses this issue by using mobile devices as a source of context metadata in order to increase the quantity and quality of contextual information. Some research works propose the annotation of photos using mobile devices. For instance, the MMM Image Gallery [7] proposes a mobile application for semi-automatic photo annotation using Nokia 3650 devices. At photo shot time, the application captures the GSM cell ID, the user identity and the date/time of the photo. The image is sent to the server together with its annotation. The server combines the low-level properties of the image with the location information in order to derive information about its content (e.g., is the Eiffel Tower the photo subject?). After that, the user uses a mobile XHTML browser to read and validate the generated annotation. The imprecision of the GSM

cell ID and the slow response time of the server during the image upload and validation processes were identified as the major usability problems of MMM [7].

ZoneTag [1] is a Yahoo! research prototype allowing users to upload their photos to Flickr directly from their mobile phones. ZoneTag leverages location information from camera phones (GSM cell ID and GPS coordinates) and suggests tags to be added to the photo. ZoneTag derives its suggestions from the tags used by the community for the same location and from the tags previously assigned to the user's photos.

In [5], the design of a mobile and Web system combines Bluetooth address acquisition, FOAF profiles and face detection tools in order to identify the people in a photo. The authors suggest that the generated annotation be represented in an RDF file and embedded in the header of the image. They propose to use the Bluetooth addresses of the nearby devices as key inputs to a FOAF Query Service able to deliver the requested FOAF profiles. This approach does not seem easily feasible to us. First, this proposition does not take into account the distributed nature of FOAF profiles. Moreover, conceiving a service that indexes all the FOAF profiles of the Web is not realistic. In our approach, we use the FOAF profile of the photo owner as the starting point of a search from which we access the profiles that are useful to annotate the photo (i.e., the profiles of friends and of friends of friends).

The major difference between these works and our approach is that PhotoMap is both a mobile application for semi-automatic image annotation and a Web system for the organization and retrieval of personal image collections. Besides its ability to acquire context automatically, the PhotoMap mobile client allows users to create their event collections and to attach textual annotations to them. The PhotoMap Web site allows the user to view her photo collections and also the itinerary she followed to take them.

5 Conclusion and Future Work

Context information can be useful for the organization and retrieval of images on the Web. We have described, in this article, a novel mobile and Web system that captures and infers context metadata for photos. We have proposed an OWL-DL ontology to annotate event collections of photos, designed to support temporal, spatial, and social reasoning. The PhotoMap Web site offers interfaces for spatial and temporal queries over the user's photo collections. In future work, we will define other query modes for the PhotoMap Web site. Instead of only querying their own collections, users will be able to search public photos using all the contextual attributes (e.g., show photos near the Eiffel Tower, in winter, before sunset). Usability tests will be performed in order to evaluate the PhotoMap mobile application and the visualization Web site. Finally, PhotoMap will be extended towards a Web recommendation system for tourist itineraries based on a community collaboration model.

References

1. Ames, M., Naaman, M., Why We Tag: Motivations for Annotation in Mobile and Online Media. Proc. of Conference on Human Factors in computing systems (CHI 2007), 2007.

2. Hollink, L., Nguyen, G., Schreiber, G., Wielemaker, J., Wielinga, B., Worring, M., Adding Spatial Semantics to Image Annotations. Proc. of 4th International Workshop on Knowledge Markup and Semantic Annotation, 2004.

3. Lux, M., Klieber, W., Granitzer, M., Caliph & Emir: Semantics in Multimedia Retrieval and Annotation, Proc. of 19th International CODATA Conference 2004: The Information Society: New Horizons for Science, Berlin, Germany, 2004.

4. Matellanes, A., Evans, A., Erdal, B., Creating an application for automatic annotations of images and video, Proc. of 1st International Workshop on Semantic Web Annotations for Multimedia (SWAMM), Edinburgh, Scotland, 2006.

5. Monaghan, F., O'Sullivan, D., Automating Photo Annotation using Services and Ontologies, Proc. of 7th International Conference on Mobile Data Management (MDM'06), Washington, DC, USA, May 2006, IEEE Computer Society, p. 79-82.

6. Naaman, M., Harada, S., Wang, Q., Garcia-Molina, H., Paepcke, A., Context data in geo-referenced digital photo collections. Proc. of 12th ACM international Conference on Multimedia (MULTIMEDIA '04), New York, NY, USA, 2004, ACM, p.196-203.

7. Sarvas, R., Herrarte, E., Wilhelm, A., Davis, M., Metadata creation system for mobile images, Proc. of 2nd International Conference on Mobile Systems, Applications, and Services (MobiSys '04), Boston, MA, USA, 2004, ACM, p. 36-48.

8. Toyama, K., Logan, R., Roseway, A., Geographic location tags on digital images. Proc. of 11th ACM International Conference on Multimedia (MULTIMEDIA '03), Berkeley, CA, USA, November 2003, ACM Press, p. 156-166.

9. Yamaba, T., Takagi, A., Nakajima, T., Citron: A context information acquisition framework for personal devices, Proc. of 11th International Conference on Embedded and real-Time Computing Systems and Applications. 2005.

10. Wang, L., Khan, L., Automatic image annotation and retrieval using weighted feature selection, Journal of Multimedia Tools and Applications, 2006, ACM, p.55-71.

11. Schreiber, A. Th., Dubbeldam, B., Wielemaker, J., Wielinga, B. J. Ontology-based photo annotation. IEEE Intelligent Systems, 2001.

12. Athanasiadis, Th., Tzouvaras, V., Petridis, K., Precioso, F., Avrithis, Y., Kompatsiaris, Y., Using a Multimedia Ontology Infrastructure for Semantic Annotation of Multimedia Content, Proc. of 5th International Workshop on Knowledge Markup and Semantic Annotation (SemAnnot '05), Galway, Ireland, November 2005.