A framework for analysis of large database of old art paintings


Da Rugna Jérôme (a), Chareyron Gaël (a), Ruven Pillay (b) and Morwena Joly (b)

(a) Pôle universitaire Léonard de Vinci, ESILV, 12 avenue L. de Vinci, Paris - La Défense, France
(b) C2RMF - UMR 171 - Palais du Louvre - 14 quai F. Mitterrand - 75001 Paris, France

ABSTRACT

For many years, museums and countries have been organizing the high-definition digitization of their collections, generating massive amounts of data for each object. In this paper, we focus only on art painting collections. Even so, we face a very large database of heterogeneous data: the image collection includes very old and recent scans of negative photos, digital photos, multispectral and hyperspectral acquisitions, X-ray acquisitions, and front, back and lateral photos. Moreover, art paintings suffer from many kinds of degradation: cracks, softening, artifacts, human damage and deterioration over time. It therefore appears necessary to develop specific approaches and methods dedicated to digital art painting analysis. Consequently, this paper presents a complete framework to evaluate, compare and benchmark image processing algorithms devoted to this task.

Keywords: Color, framework, painting, boosting

1. INTRODUCTION

For many years, museums and countries have been organizing the high-definition digitization of their collections [1, 2]. Their main objective is to use all acquired images to provide efficient art painting analysis that helps the restorer in his task. Light, heat, moisture, air pollutants, dust, dirt, insects, physical vibration and impact can lead to slow deterioration of, or sudden damage to, a painting. Paintings in art galleries and museums are kept in conditions that protect them from these causes of material deterioration. Digital images are, first of all, a complete history of a painting and, second, a powerful new tool for the restorer. High-definition photography (for example 100 pixels per centimeter) may be combined with X-ray, UV acquisition or any other digital representation. The restorer uses this bundle of images as one input to his reasoning when making restoration choices. Ideally, digital images may also be used to analyze paintings [3–10]. Understanding art across artists, art schools, history and painting techniques is one of the most fundamental goals of art study. For example, the underlying questions can be: will digital imaging allow us to extract previously unknown information among hundreds of expert studies? Are we able to quantify and characterize the link between human perception and art painting? A new and large domain is open for imaging science: it is necessary to develop specific approaches and algorithms dedicated to digital art painting analysis. The massive data generated by museum laboratories form a complex mix of multimodal images (photography, X-ray, etc.) and symbolic information (author, school, painting techniques, etc.). To exploit all the intrinsic information contained in such data, dedicated systems must be proposed and used. Consequently, this paper presents a complete framework to evaluate, compare and benchmark image processing algorithms devoted to this task.

The first part introduces the images, the diversity of acquisition methods and the underlying difficulties linked to our problem. A single painting may be described by a very large variety of images: scans from 1950 to 2010, from X-ray to infrared, from front view to back view and lateral view, etc. The second part presents the overall schema of our framework: a general framework that integrates different kinds of procedures and image descriptors, and that provides tools to compare and benchmark several algorithms on specific tasks. Technically, our distributed solution is based on database servers, multiple image processing servers and an automatic statistical reporting server. Clients are web-based and provide an intuitive interface to explore data and results. The third part discusses coarse pre-processing algorithms. To avoid the bias generated by source diversity, a pre-processing step must be carried out for all images. To improve the efficiency of this step, we introduce a new approach to crack detection designed for old paintings: using well-known edge detectors and a dedicated learning algorithm, we are able to separate edges from cracks effectively.

Send correspondence to Da Rugna Jérôme. E-mail: jerome.da [email protected]

2. HETEROGENEOUS DATA

Art painting techniques are varied and may differ greatly. Some works are painted on stone, others are collages, etc. Moreover, acquisition methods are numerous and diverse. The multimodal image problem is one of the biggest challenges of this framework: it requires, first of all, identifying and classifying paintings and image acquisitions. Let us describe the heterogeneous data we face.

2.1 Paintings

The art of painting has changed considerably from the 12th or 13th century (our oldest paintings) to today. Of course, pictorial styles cannot be compared: Botticelli and Mondrian do not paint using the same technique! To classify a painted artwork, three different categories have been defined:

• Shape - Table 1 illustrates this category: even if most paintings are rectangular or square, a significant share of paintings have a different shape.

• Material - Table 2 lists some of the defined material categories. Most paintings use wood or paper.

• Painting techniques - Associated with a material, the painting style is very important: collage on stone, drawing on wood, etc. are examples of very different kinds of paintings. Table 3 illustrates these categories.

Table 1: Principal shape classification

arch | circular | hexagonal | octagonal
oval | rectangular | square | triangular
triptych

Table 2: Principal material support classification

black amber | bone | canvas | card
cement | coating | earth | enamel
glass | metal | aluminum | copper alloy
silver | mineral | pigment | mortar
amber | ivory | panel | amaranth
fir | ebony | plywood | poplar
paper | plant matter | plaster | silk textile
skin | stone | alabaster | garnet
limestone | marble | terra cotta

2.2 Painting acquisition

Museums and dedicated laboratories have long acquired image representations of art, and especially of paintings: the database contains images dating back to 1930! Since each image in the database may be useful for efficient painting analysis, photographic techniques are precisely described and classified. Table 4 illustrates these categories. For a single painting, very different images may be present, such as conventional photography, X-ray, multispectral acquisitions, scans or raking-light views: to be exploited, this rich content requires dedicated algorithms [11, 12].

Also, the quality of images depends on various factors. We may note variations in:

• color resolution - gray level, RGB, multispectral and hyperspectral are the main categories. Few RGB images use sRGB or any other device-independent process [13, 14].

• image resolution - from less than 1 megapixel to hundreds of megapixels, the variation in image resolution is huge. One difficulty is to combine an 800x800 X-ray image with a 10000x10000 photograph [15].

• illuminant - some pictures were taken under museum lights, others under laboratory light and others under a specific illuminant. Color constancy, or any other process directly linked to color, requires a color-management pre-process to be efficient [13, 16, 17].

Table 3: Principal painting technique classification

cane work | collage | drawing
drawing/bistre | drawing/ink | drawing/graphite pencil
drawing/gouache | drawing/charcoal | drawing/water color
enamel | engraving | fresco
fresco | glazing | gilding
marquetry | metal | painting
veneering | print

Table 4: Principal photographic techniques

black and white digitization | ultraviolet fluorescence | color reflection light digitization
infrared (IR) | laser scans | gas chromatography
panoramic digitization | petrographic microscopy | photography
photomacrography | x-ray | raking light photography
microscopy | multi spectral digitization

3. A FRAMEWORK FOR ART PAINTING ANALYSIS

Most image analysis studies are devoted to specific works [12], especially those concerned with image understanding [18–21]. Our main goal is to develop a general framework that can accommodate a large variety of paintings. For certain kinds of paintings, only strokes provide reliable information to distinguish artists; for others, the color palette is the main information. The framework presented here can potentially be applied to digitized paintings of all cultures and provides art theorists with specific quantifiable features of the paintings that could be useful for their classification or analysis. Features include brush strokes, a set of global or local features of paintings, or the color palette for patches of a painting.

Figure 1: Some examples of the 268 images of the painting "The virgin with a blue diadem", Raphael and Penni Giovanni Francesco: (a) color-calibrated high-definition photography, 237 pixels/cm; (b) gray-level infrared, 73 pixels/cm; (c) raking-light (left) color photography, 73 pixels/cm; (d) fluorescent illuminant, UV photography, 73 pixels/cm; (e) X-ray radiography, 122 pixels/cm. Copyright C2RMF, photographs: Jean-Louis Bellec and Ruven Pillay (1(a)), Lambert Elsa (1(b), 1(c), 1(d), 1(e)).

Figure 2: 100% zoom of high-definition images, panels (a) to (d); the extracted zone size is 2 cm by 2 cm. Copyright C2RMF, photographs: Jean-Louis Bellec and Ruven Pillay.

The main challenge in creating the framework is to define a structure that covers a vast range of artworks and that can accommodate classical techniques as well as future image processing algorithms. In the previous section, we showed that the data are very heterogeneous, so a dedicated framework is required to process them efficiently. Indeed, our goal is to benchmark several algorithms on the database.

The required features of the framework are:

• interact with the C2RMF database or any other digital painting databases

• describe each image (acquisition technique, support, historical information ...) and painting

• save and execute algorithms and programs without any bias regarding the heterogeneous data

• interact with statistical tools (like R) to exploit saved results.

Figure 3: Overall framework: it describes how the image database and algorithm database are designed.

As described in Figure 3, in the first step we design an SQL database to store photographs of paintings. We chose a restricted number of paintings to test the process: 176 paintings from 16 authors. This represents 7 different painting techniques, 12 support materials, 17 acquisition types, photographs from 1939 to 2010 and about one thousand images. This sample provides enough data to test our framework. In this part we store:

• image description

• author and short biography

• original size of the painting

• painting symbolic information (date, support, shape, etc)

• acquisition date

• acquisition technique.

With these data we can categorize paintings by date, painting technique, data size, etc. We can therefore choose to work only with a sub-sample of the database. Indeed, algorithms can be specific to one acquisition technique or to a specific image resolution.
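As an illustration of the sub-sampling described above, the following is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and demo rows are hypothetical and do not reproduce the actual C2RMF database layout; they only mirror the kinds of fields listed above (author, support, shape, acquisition date and technique).

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative assumptions,
# not the actual layout of the database described in the paper.
SCHEMA = """
CREATE TABLE painting (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    author TEXT,
    year INTEGER,
    support TEXT,
    shape TEXT,
    technique TEXT,
    width_cm REAL,
    height_cm REAL
);
CREATE TABLE image (
    id INTEGER PRIMARY KEY,
    painting_id INTEGER NOT NULL REFERENCES painting(id),
    acquisition_technique TEXT NOT NULL,
    acquisition_year INTEGER,
    pixels_per_cm REAL
);
"""

def build_demo_db():
    """Create an in-memory database with one invented painting and three images."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO painting VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (1, "Demo panel", "Unknown", 1510, "poplar", "rectangular",
         "painting", 50.0, 70.0),
    )
    conn.executemany(
        "INSERT INTO image VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "x-ray", 1955, 122.0),
            (2, 1, "photography", 2008, 237.0),
            (3, 1, "infrared", 2008, 73.0),
        ],
    )
    return conn

def select_subsample(conn, technique, min_pixels_per_cm):
    """Return ids of images acquired with one technique at a minimum resolution."""
    rows = conn.execute(
        "SELECT id FROM image "
        "WHERE acquisition_technique = ? AND pixels_per_cm >= ? "
        "ORDER BY id",
        (technique, min_pixels_per_cm),
    ).fetchall()
    return [row[0] for row in rows]
```

With this layout, an algorithm that only makes sense on high-resolution photographs could be restricted with a call such as `select_subsample(conn, "photography", 100)`.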

In the second step we define a process to benchmark algorithms. We develop a specific database to store programs and their results. The database contains source code, and the computation can be distributed across different computers and operating systems. All results are saved in the database. It is important to store as much information as possible for each algorithm and type of algorithm. A wide variety of algorithms for studying paintings may be addressed:

• point-based procedure

• area-based procedure

• perspective analysis

• brush strokes analysis

• analysis of crack

• analysis of composition
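The benchmarking step can be pictured with the following minimal sketch, which runs every registered algorithm on every image and saves one result row per pair. The `result` table, the two toy measures and the use of a single scalar value per run are assumptions made for illustration; the framework itself stores source code and distributes computation, which this sketch does not attempt.

```python
import sqlite3
import time

def mean_gray(pixels):
    """Toy area-based measure: mean gray level of a 2D list of numbers."""
    values = [v for row in pixels for v in row]
    return sum(values) / len(values)

def gray_range(pixels):
    """Toy point-based measure: difference between brightest and darkest pixel."""
    values = [v for row in pixels for v in row]
    return float(max(values) - min(values))

def run_benchmark(conn, algorithms, images):
    """Apply each algorithm to each image and store value and run time.

    algorithms: dict mapping a name to a function of a 2D pixel list.
    images: dict mapping an image id to a 2D pixel list.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS result ("
        "algorithm TEXT, image_id INTEGER, value REAL, seconds REAL)"
    )
    for name, func in algorithms.items():
        for image_id, pixels in images.items():
            start = time.perf_counter()
            value = func(pixels)
            elapsed = time.perf_counter() - start
            conn.execute(
                "INSERT INTO result VALUES (?, ?, ?, ?)",
                (name, image_id, value, elapsed),
            )
    conn.commit()
```

Because every run lands in one table keyed by algorithm and image, comparing two procedures on the same sub-sample reduces to a query on that table.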

Each algorithm can use the complete image or a small part of it; likewise, scale is important. For the same painting, we may have a high-resolution image (600 pixels per centimeter) and a low-resolution image (10 pixels per centimeter), so spatial resolution must be taken into account. Moreover, for the same painting, there are photographs taken in the mid-1950s and photographs taken last year: this information needs to be introduced into the process.
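To make the role of spatial resolution concrete, the sketch below converts a physical window size into pixels and computes the factor needed to bring two acquisitions of the same painting to a common resolution. The function names are hypothetical, and choosing the lower resolution as the common target is one possible convention, not necessarily the one used in the framework.

```python
def window_size_px(size_cm, pixels_per_cm):
    """Side length in pixels of a square window of size_cm centimeters."""
    return max(1, round(size_cm * pixels_per_cm))

def scale_factor(source_ppcm, target_ppcm):
    """Factor by which an image must be resized to reach target_ppcm."""
    return target_ppcm / source_ppcm

def resized_shape(width_px, height_px, factor):
    """Pixel dimensions of an image after resizing by factor."""
    return (max(1, round(width_px * factor)), max(1, round(height_px * factor)))

def common_resolution(resolutions_ppcm):
    """Assumed convention: compare images at the lowest available resolution."""
    return min(resolutions_ppcm)
```

With the figures quoted above, a 1 cm window covers 600 pixels in the high-resolution image but only 10 pixels in the low-resolution one, so a procedure tuned in pixels on one image cannot be reused unchanged on the other.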

Finally, we use all the data in a statistical environment to produce reports. R provides powerful statistical functions and an intuitive interface to SQL databases.

In this part we have presented a complete framework for the examination of the visual properties of paintings. This framework can help developers benchmark several algorithms on a large variety of artworks. An algorithm can easily be applied to a specific set of paintings categorized by period or by acquisition method. Moreover, results are automatically generated to simplify the comparison of several processes on the same dataset. In the next section, we focus on the required pre-processing step, especially crack removal.

4. COARSE PRE-PROCESSING

Faced with this heterogeneous, multimodal image database, a pre-processing step is required. Indeed, algorithm development needs to take into account:

• damaged zone map - This map is a binary image where damaged zones (other than cracks) are marked. These zones are defined by an expert.

• crack map - This map is a binary image where cracks are marked. Cracks are extracted using a semi-automatic method described in the next paragraph.

• color management data - A bundle of color and physical properties (acquisition or estimated illuminant, multi spectral wave array, etc.) constitutes the parameters of the color process to embed in color-based algorithms [16, 22–27].

• crack-less reconstructed images - The framework is also able to reconstruct the image to avoid crack artifacts. Two methods are proposed, one using an anisotropic diffusion algorithm [28] and one using a mean filter along the crack map [29, 30].
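The mean-filter reconstruction can be sketched as follows. This is a simplified stand-in written for illustration, not the method of [29, 30]: each pixel marked in the binary crack map is replaced by the mean of the unmarked pixels in a small square neighborhood, and the `radius` parameter is an assumption.

```python
def fill_cracks(image, crack_map, radius=1):
    """Replace crack pixels by the mean of nearby non-crack pixels.

    image: 2D list of gray levels.
    crack_map: 2D list of 0/1 flags with the same shape (1 marks a crack).
    radius: half-width of the square neighborhood (illustrative default).
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if not crack_map[y][x]:
                continue  # only crack pixels are modified
            neighbors = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
                if not crack_map[ny][nx]
            ]
            if neighbors:
                out[y][x] = sum(neighbors) / len(neighbors)
            # a crack pixel with no clean neighbor is left unchanged
    return out
```

Wide cracks may need a larger radius or several passes, since pixels surrounded only by other crack pixels are left untouched by this version.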

Figure 4: Semi-automatic crack identification process

Figure 4 illustrates our semi-automatic method for crack identification. This identification is divided into three steps:

1. Four patches are extracted. This step is based on an edge-rate estimator and SIFT [31] point identification. First, SIFT interest points are computed with a constraint of a minimum distance between neighboring points. We could choose random points, but SIFT points are mostly located on edges and corners, which is better suited to our goal. Then, a classical edge-rate estimator [13] is computed on each window of size 1 cm × 1 cm centered on these points. Finally, the windows maximizing the rate are selected.

2. For each patch, an expert weights each edge/crack detector: the result is obtained as the simple sum of all detector outputs multiplied by their weights. Many crack detectors may be used [29, 30, 32–36]. At present, the selected detectors cover the diversity of the principal approaches: top-hat based [37], Canny detector [38], SUSAN detector [39] and valley based [40, 41].

3. The last step consists in identifying all cracks in the image using the previous weights. This allows the crack identification to be adjusted for each 1 cm × 1 cm patch of the image. Finally, the expert validates the extracted map. This map is saved in the database and associated with the image.
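The weighted combination used in steps 2 and 3 can be sketched as below. The text only specifies a sum of detector outputs multiplied by expert weights; the assumption that each detector returns a response in [0, 1], the final threshold used to binarize the map, and the function and detector names are all illustrative additions.

```python
def fuse_detectors(responses, weights):
    """Weighted sum of detector responses, pixel by pixel.

    responses: dict mapping a detector name to a 2D list of responses,
               assumed here to lie in [0, 1] and to share one shape.
    weights: dict mapping the same names to expert-chosen weights.
    """
    names = list(responses)
    height = len(responses[names[0]])
    width = len(responses[names[0]][0])
    return [
        [sum(weights[n] * responses[n][y][x] for n in names) for x in range(width)]
        for y in range(height)
    ]

def crack_map(fused, threshold):
    """Binary crack map: 1 where the fused response reaches the threshold.

    The threshold is an assumption of this sketch; in the described process
    the expert validates the extracted map.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in fused]
```

For instance, with weights 0.75 for a top-hat response and 0.25 for a Canny response, a pixel flagged only by Canny gets a fused score of 0.25 and is rejected at a threshold of 0.5, which is one way the weighting could separate ordinary edges from cracks.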

5. CONCLUSION AND FUTURE WORK

In this paper, we have discussed the definition of a complete system to analyze artworks. This framework aims to help implement, benchmark and improve the quality of algorithms devoted to artwork analysis. This work is only the start of a development to explore the possibilities of automatic visual examination of digitized paintings. The experimental part of this paper, on crack detection, reveals the potential of the framework to benchmark several algorithms. We believe that the framework may be very useful to classify cracks and other features. Improving the quality of the results provided by existing solutions will be a next step. For example, a more efficient crack identification may be obtained by combining several maps extracted from different representations of a painting. This is our future work.

REFERENCES

[1] Gombrich, E. H., [The Story of Art], Phaidon, London (1995).

[2] Barni, M., Pelagotti, A., and Piva, A., “Image processing for the analysis and conservation of paintings: opportunities and challenges,” Signal Processing Magazine, IEEE 22(5), 141–144 (2005).

[3] Stork, D. G. and Coddington, J., eds., [Computer Image Analysis in the Study of Art], SPIE Press (2008).

[4] Comelli, D., Nevin, A., Valentini, G., Osticioli, I., Castellucci, E. M., Toniolo, L., Gulotta, D., and Cubeddu, R., “Insights into Masolino’s wall paintings in Castiglione Olona: Advanced reflectance and fluorescence imaging analysis,” Journal of Cultural Heritage In Press, Corrected Proof (2010).

[5] Shen, J., “Stochastic modeling western paintings for effective classification,” Pattern Recognition 42(2), 293–301 (2009). Learning Semantics from Multimedia Content.

[6] Lau, D., Villis, C., Furman, S., and Livett, M., “Multispectral and hyperspectral image analysis of elemental and micro-Raman maps of cross-sections from a 16th century painting,” Analytica Chimica Acta 610(1), 15–24 (2008).

[7] Carcagnì, P., Patria, A. D., Fontana, R., Greco, M., Mastroianni, M., Materazzi, M., Pampaloni, E., and Pezzati, L., “Multispectral imaging of paintings by optical scanning,” Optics and Lasers in Engineering 45(3), 360–367 (2007). Optical Diagnostics and Monitoring: Optical Characterization Methods and Techniques.

[8] Leder, H., Carbon, C.-C., and Ripsas, A.-L., “Entitling art: Influence of title information on understanding and appreciation of paintings,” Acta Psychologica 121(2), 176–198 (2006).

[9] Kobayasi, M. and Muroya, T., “A spatial wave-length analysis of coarseness or fineness of color variation in painting arts,” Pattern Recognition Letters 24(11), 1737–1749 (2003). Colour Image Processing and Analysis. First European Conference on Colour in Graphics, Imaging, and Vision (CGIV 2002).

[10] Hradil, D., Grygar, T., Hradilová, J., and Bezdicka, P., “Clay and iron oxide pigments in the history of painting,” Applied Clay Science 22(5), 223–236 (2003).

[11] Sharma, G., [Digital Color Imaging Handbook], CRC Press, Inc. (2002).

[12] Liu, Y., Zhang, D., Lu, G., and Ma, W., “A survey of content-based image retrieval with high-level semantics,” Pattern Recognition 40(1), 262–282 (2007).

[13] Lukac, R. and Plataniotis, K. N., [Color Image Processing: Methods and Applications], CRC Press (2007).

[14] Schanda, J., [Colorimetry: Understanding the CIE System], Wiley (2007).

[15] Thirion, J.-P., “Image matching as a diffusion process: an analogy with Maxwell’s demons,” Medical Image Analysis 2(3), 243–260 (1998).

[16] Tominaga, S. and Wandell, B. A., “Natural scene-illuminant estimation using the sensor correlation,” Proceedings of the IEEE 90, 42–56 (2002).

[17] Gevers, T. and Smeulders, A., “Color based object recognition,” Pattern Recognition 32, 453–464 (1999).

[18] Ciocca, G. and Schettini, R., “Content-based similarity retrieval of trademarks using relevance feedback,” Pattern Recognition 34, 1639–1655 (2004).

[19] Gevers, T. and Stokman, H., “Robust histogram construction from color invariants for object recognition,” IEEE Trans. on Pattern Analysis and Machine Intelligence 26(1), 113–118 (2004).

[20] Bonaiuto, J. and Itti, L., “The use of attention and spatial information for rapid facial recognition in video,” Image and Vision Computing 24(6), 557–563 (2006).

[21] Simone, F. D., Ticca, D., Dufaux, F., Ansorge, M., and Ebrahimi, T., “A comparative study of color image compression standards using perceptually driven quality metrics,” in [Applications of Digital Image Processing XXXI], 7073, 70730Z–70730Z–11, SPIE (2008).

[22] Drbohlav, O. and Leonardis, A., “Towards correct and informative evaluation methodology for texture classification under varying viewpoint and illumination,” Computer Vision and Image Understanding 114(4), 439–449 (2010). Special issue on Image and Video Retrieval Evaluation.

[23] Wang, Y. and Samaras, D., “Estimation of multiple directional illuminants from a single image,” Image and Vision Computing 26(9), 1179–1195 (2008).

[24] Zhou, W. and Kambhamettu, C., “A unified framework for scene illuminant estimation,” Image and Vision Computing 26(3), 415–429 (2008). 15th Annual British Machine Vision Conference.

[25] Finlayson, G. and Hordley, S., “Selection for gamut mapping colour constancy,” Image and Vision Computing 17(8), 597–604 (1999).

[26] Huang, K.-Q., Wang, Q., and Wu, Z.-Y., “Natural color image enhancement and evaluation algorithm based on human visual system,” Computer Vision and Image Understanding 103(1), 52–63 (2006).

[27] Li, X., Tao, D., Gao, X., and Lu, W., “A natural image quality evaluation metric,” Signal Processing 89, 548–555 (Apr. 2009).

[28] Weickert, J., [Anisotropic Diffusion in Image Processing], ECMI Series (1998).

[29] Giakoumis, I., Nikolaidis, N., and Pitas, I., “Digital image processing techniques for the detection and removal of cracks in digitized paintings,” IEEE Trans. Image Processing 15(1), 178 (2006).

[30] Bergman R, Maurer R, N. H. R. C. P. G. D., “Comprehensive solutions for removal of dust and scratches from image,” J. Electron. Imaging 17(1), 013010 (2008).

[31] Lowe, D., “Object recognition from local scale-invariant features,” in [Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on], 2, 1150–1157 vol. 2 (1999).

[32] Zana F, K. J. C., “Segmentation of vessel-like patterns using mathematical morphology and curvature evaluation,” IEEE Trans. Image Processing 10(7), 1010 (2001).

[33] Eberly D, Gardner R, M. B. P. S. S. C., “Ridges for image analysis,” J. Math. Imaging Vis. 4(4), 353 (1994).

[34] Iyer, S. and Sinha, S., “A robust approach for automatic detection and segmentation of cracks in underground pipeline images,” Image Vision Computing 23(10), 921 (2005).

[35] Papari, G. and Petkov, N., “Edge and line oriented contour detection: State of the art,” Image and Vision Computing 29(2-3), 79–103 (2011).

[36] Lin, Z., Jiang, J., and Wang, Z., “Edge detection in the feature space,” Image and Vision Computing 29(2-3), 142–154 (2011).

[37] Abas, F. and Martinez, K., “Classification of painting cracks for content-based retrieval,” in [IS&T/SPIE’s 15th Annual Symposium Electronic Imaging 2003: Machine Vision Applications in Industrial Inspection XI], (2003).

[38] Canny, J., “A computational approach to edge detection,” IEEE Trans. on Pattern Analysis and Machine Intelligence 8(6), 679–698 (1986).

[39] Smith, S. and Brady, J., “SUSAN - a new approach to low level image processing,” Int Journal of Computer Vision 23(1), 45+ (1997).

[40] AM Lopez A M, L. F., “Evaluation of methods for ridge and valley detection,” IEEE Trans. Pattern Anal. Mach. Intell. 21(4), 327 (1999).

[41] Gauch J, P. S., “Multiresolution analysis of ridges and valleys in grey-scale images,” IEEE Trans. Pattern Anal. Mach. Intell. 15(6), 635 (1993).