Producing Stylized Videos Using the AnimVideo Rendering Tool
Rafael B. Gomes,1 Lucas M. Oliveira,1 Laurindo S. Britto-Neto,1 Tiago S. Santos,1
Gilbran S. Andrade,1 Bruno M. Carvalho,1 Luiz M. G. Goncalves2
1 Departamento de Informatica e Matematica Aplicada, Universidade Federal do Rio Grande do Norte, Campus Universitario, S/N, Lagoa Nova, Natal, RN 59.072-970, Brazil
2 Departamento de Engenharia de Computacao e Automacao, Universidade Federal do Rio Grande do Norte, Campus Universitario, S/N, Lagoa Nova, Natal, RN 59.072-970, Brazil
Received 15 August 2008; accepted 2 March 2009
ABSTRACT: Stylized rendering is the process of generating images or videos that can have the visual appeal of pieces of art, expressing the visual and emotional characteristics of artistic styles. A major problem in stylizing videos is the absence of temporal coherence, something that results in flickering of the structural drawing elements (such as brush strokes or curves), also known as swimming. This article describes the AnimVideo rendering tool that was developed for stylizing videos with temporal coherence. The temporal coherence is achieved by first fully segmenting the input video with a fast fuzzy segmentation algorithm that uses hybrid color spaces and motion information. The result of the segmentation algorithm is used to constrain the result of an optical flow algorithm, given as dense optical flow maps that are then used to correctly move, remove, or add structural drawing elements. The combination of these two methods is referred to as constrained optical flow, and we also provide the option of initializing the optical flow computation with displacement maps computed by homographies that map objects in adjacent frames. Also, we briefly describe some stylized rendering methods that were implemented in the tool. Finally, experimental results are shown, including snapshots of the tool's interface and illustrative examples of the produced renderings that validate the proposed techniques. © 2009 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 19, 100–110, 2009; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ima.20180
Key words: stylized rendering; intraobject temporal coherence;
fuzzy segmentation; constrained optical flow; hybrid color spaces
I. INTRODUCTION
Stylized rendering is the process of generating images or videos
that can have the visual appeal of pieces of art, expressing the visual
and emotional characteristics of artistic styles. This class of techniques was originally named Non-Photorealistic Rendering (NPR). As this negative definition points out, NPR is a class of techniques defined by what they do not aim for: the realistic rendering of artificial scenes. Another way of defining stylized rendering or NPR techniques is that they aim to reproduce the renderings of artistic techniques, trying to express feelings and moods in the rendered scenes.
There are many stylized rendering techniques published in the
literature, and some are currently being used in application areas as
movies, games, advertisement, and technical illustrations. Some
stylized rendering techniques include pen-and-ink drawings (Win-
kenbach and Salesin, 1994), cartoon shading, mosaics (Di Blasi and
Gallo, 2005), impressionist-style rendering (Litwinowicz, 1997),
and water-coloring (Bousseau et al., 2006). For comprehensive
reviews of several NPR techniques and applications, the reader
should refer to (Gooch and Gooch, 2001; Strothotte and Schlecht-
weg, 2002).
Animation techniques can convey information that cannot be
simply captured by shooting a real scene with a video camera.
However, such kind of animation is labor intensive and requires a
fair amount of artistic skill. On the other hand, one could use styl-
ized rendering techniques and graphical tools to generate highly
abstracted animations with little user intervention, thus, making it
possible for nonartist users to create their own animations with less
effort. However, there is a major problem in video stylization,
which is the absence of temporal coherence. Temporal incoherence
occurs when drawing elements move in undesired directions, pro-
ducing a distracting effect. This effect happens in the form of flick-
ering of the structural drawing elements (such as brush strokes or
curves), and it is also known as swimming. Because of this problem,
many animations have been produced with animators working with
a single or a few frames at a time, thus, increasing the artistic and
computational efforts needed to produce the animation.
The AnimVideo rendering tool (http://www.lablic.dimap.
ufrn.br/animvideo) (initially known as AVP) described in this arti-
cle employs a fast fuzzy segmentation algorithm for segmenting
input videos as 3D volumes, and an optical flow algorithm for
enforcing intraobject temporal coherence in the animations. The
Correspondence to: B. M. Carvalho; e-mail: [email protected]
Grant sponsors: This work was partially supported by Universal and PDPG-TI CNPq grants.
© 2009 Wiley Periodicals, Inc.
structure of the article is described next. Section II talks about video
stylization techniques published in the literature, whereas Section
III describes how the AnimVideo rendering tool works, how the
user interacts with it, and some artistic styles implemented in the
AnimVideo rendering tool. Section IV shows illustrative experi-
ments, presenting some frames of the animations produced. Finally,
Section V presents some conclusions.
II. STYLIZING VIDEOS
The main objective of stylized rendering techniques is the produc-
tion of animations from real videos using automatic or semi-auto-
matic techniques. Thus, stylized rendering techniques allow a user
with little or no artistic training to generate animations with little
effort, when comparing to the task of creating an animation from
scratch. Video stylization also offers the choice of mixing real mov-
ies with stylized objects, by superimposing them on the video or by
segmenting objects and rendering them with an artistic style.
As mentioned in Section I, above, if the input for the stylized
video is a normal video, there is the problem of temporal incoher-
ence, mainly due to brightness variations caused by shadowing,
noise, and changes in illumination. This temporal incoherence
appears in the form of swimming, where features of the animation
move and change their intensities within the rendered animation.
This flickering can happen because borders of objects may be blurry
and thus detected in the wrong place or because static areas are
being rendered differently each time, due to some slight brightness
changes.
The first technique to address this problem was proposed by Litwinowicz (1997) to maintain temporal coherence in video sequences
stylized using an impressionist style. Litwinowicz’s technique
advocates the use of an optical flow method for tracking movement
in the scene and moving, adding or removing brush strokes from
frame to frame. An extension to this approach was proposed by Hertzmann and Perlin (2000), which detects areas of change from frame to frame and paints over them, i.e., keeping the brush strokes of the static areas, and enforcing intraobject temporal coherence by warping the brush strokes' control points using the output of an optical flow method.
Wang et al. (2004) proposed a method for creating cartoon animations from video sequences by dividing the problem into two steps as follows: a segmentation step, used to isolate the objects that would be rendered using the style throughout the animation, in this case performed using a mean shift segmentation algorithm for end-to-end video shot segmentation; and a coherence step, where the user selects constraint points on key-frames of the video shot through a graphical interface. The selected points are then used for interpolating the region boundaries between key-frames. Since the key-frames have to be similar for the interpolation process to generate nice results (a typical interval between key-frames is between 10 and 15 frames), there is still significant interaction with the user in the rendering process to ensure temporal coherence of the animation.
Another step toward a system for producing temporally coherent NPR animations, but with less user interaction, was proposed by Collomosse et al. (2005). Their framework, called
‘‘Stroke Surfaces,’’ works in a similar way to the technique of
Wang et al. (2004), in the sense that it also treats the input video as
a 3D spatio-temporal volume. The idea, again, is to segment the
input video, achieving an abstract description of the video, in this
case called Intermediate Representation (IR). This intermediate representation is processed by a video analysis front end that uses heuristics and user intervention to associate regions into semantic volumes, and feeds them into the rendering back end. The back end allows some interaction with the user, including the option of rendering the video using a few different NPR techniques. The intermediate rep-
resentation also stores local motion estimates for video objects, and
these can be used to calculate a homography that can be used to
recover internal edge positions. However, since this homography
calculation assumes both planar motions and rigid objects, the intra-
object temporal coherence can be affected in the case of curved
objects or movements and nonrigid transformations. This also limits
the use of this technique for achieving intraobject temporal coher-
ence, since any object with high-curvature parts or with nonrigid
motion would pose a problem that would probably result in wrong
mappings of drawing elements.
Thus, a technique that enforces temporal coherence is needed, one that may use information from the user to guarantee that nonrigid objects can also be handled. Ways of interaction between the user and the tool must also be provided, as necessary. The methods developed in this direction, described next, are implemented in the AnimVideo rendering tool proposed in this work.
Similarly to other tools developed to address the same problem, the AnimVideo tool assumes that the video was preacquired, since the goal is to postprocess the video to produce an animation. Because of the huge gamut of choices an artist can make in the course of creating such an animation, the main goal of the AnimVideo tool is to provide powerful software tools for manipulating the input videos and creating animations. In the next section, we will
describe how the methods implemented in the AnimVideo render-
ing tool enforce temporal coherence, what information is needed
from the user, and how the interaction between the user and the tool
takes place.
III. THE ANIMVIDEO RENDERING TOOL
Our method for producing intraobject temporally coherent stylized
animations is divided into three parts, the segmentation of the input
video, followed by the calculation of the Constrained Optical Flow
map, and the rendering of objects using some artistic style. Interac-
tions between the parts mentioned above can be seen in Figure 1 for
the mosaic rendering style. For a different style, only the parts
related with the rendering and creation/update/deletion of drawing
structural primitives (the two boxes on the top and bottom right of
Fig. 1) have to be replaced.
The technique implemented in the AnimVideo rendering tool to
enforce temporal coherence also treats input videos as spatio-tem-
poral volumes, and segments the volume into space-time objects.
The segmentation step is performed by using a semi-automatic
region-growing segmentation technique (Carvalho et al., 2005,
2006), where the user interaction is performed by selecting seeds
for the objects being segmented. Since the video is treated as a 3D
volume, the user can easily select seeds on several different frames,
solving the problem of segmenting objects that appear later in the
video. The segmentation algorithm (Carvalho et al., 2005, 2006) is
designed to be very fast, so the user can execute it, refine the seg-
mentation by adding/removing seeds, and run the program again,
and still produce a good segmentation in a reasonable time (less
than 5 s).
The temporal coherence of object boundaries can be easily
obtained by following the boundaries of the segmented objects,
whereas intraobject temporal coherence is enforced by computing a
constrained optical flow, limited to the area of the segmented object itself. The optical flow is computed individually for each object, using the method of Proesmans et al. (1994), which generates a dense map
with good estimates of the optical flow. The results of the optical
flow can also be used to generate a motion emphasis effect in a
painterly technique or any other stroke-based stylized rendering
technique. We now proceed to describe in more detail the techni-
ques used in the AnimVideo rendering tool.
A. Fuzzy Segmentation Using Hybrid Color Spaces and Motion Information. To segment the color video shots, we use a
multiobject fuzzy segmentation algorithm (Herman and Carvalho,
2001; Carvalho et al., 2005) based on hybrid color spaces and
motion information. This approach extends a previous one (Udupa
and Samarasekera, 1996) working with arbitrary digital spaces
(Herman, 1998). By definition, a digital space is a pair (V, π), where V is a set and π is a symmetric binary relation on V such that V is connected under π. In the theory presented below, we refer to the elements of the set V as spels, which is short for spatial elements, even though here we deal only with videos that are segmented as
3D volumes. This method simultaneously computes the grade of membership of the spels of a video to a number of objects. Each grade is a number between 0 and 1, where 0 indicates that the spel definitely does not belong to the object and 1 indicates that it definitely does.
To compute the grade of membership, we assign, to every or-
dered pair (c,d) of spels, a real number (in the range [0,1]), which is
referred to as the fuzzy connectedness of c to d (a concept intro-
duced by Rosenfeld (1979)). In the approach used in our tool, fuzzy
connectedness is defined in the following general manner. We call a sequence of spels a chain, and its links are the ordered pairs of consecutive spels in the sequence. The strength of a link is also a fuzzy concept, i.e., for every ordered pair (c,d) of spels, we assign a real number (between 0 and 1), which we define as the strength of the link from c to d. We say that the ψ-strength of a link is the appropriate value of a fuzzy spel affinity function ψ: V² → [0,1], i.e., a function that assigns a value (between 0 and 1) to every pair of spels in V. A set U (⊆ V) is said to be ψ-connected if, for every pair of spels in U, there is a chain in U of positive ψ-strength from the first to the second spel of the pair. A chain is formed by one or more links and the ψ-strength of a chain is the ψ-strength of its weakest link; the ψ-strength of a chain with only one spel in it is 1 by definition.
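The weakest-link definition above can be sketched in a few lines of Python. This is an illustrative sketch only: the intensity-based affinity `psi` below is a hypothetical stand-in for the fuzzy spel affinities defined later in this section.

```python
def chain_strength(chain, psi):
    """psi-strength of a chain: the strength of its weakest link.
    A chain with a single spel has strength 1 by definition."""
    if len(chain) < 2:
        return 1.0
    return min(psi(c, d) for c, d in zip(chain, chain[1:]))

# Toy affinity on four "spels": closer intensities give stronger links.
intensities = {0: 0.9, 1: 0.8, 2: 0.3, 3: 0.85}
psi = lambda c, d: 1.0 - abs(intensities[c] - intensities[d])

print(chain_strength([0], psi))        # single-spel chain: strength 1
print(chain_strength([0, 1, 3], psi))  # weakest link is (0, 1)
print(chain_strength([0, 2, 3], psi))  # weakest link is (0, 2)
```

Note how the chain through spel 2 (the intensity outlier) is much weaker than the chain avoiding it; this is exactly why fuzzy connectedness assigns spel 2 a low grade of membership in the bright object.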
Since we are dealing with the simultaneous segmentation of multiple objects, we define an M-semisegmentation of V as a function σ that maps each c ∈ V into an (M + 1)-dimensional vector σ^c = (σ^c_0, σ^c_1, ..., σ^c_M), where σ^c_m represents the grade of membership of the spel c in the mth object, and σ^c_0 is always equal to max_{1 ≤ m ≤ M} σ^c_m. An M-segmentation is defined as an M-semisegmentation σ for which σ^c_0 is positive for every spel c. An M-fuzzy graph is then defined as a pair (V, Ψ), where V is a nonempty finite set and Ψ = (ψ_1, ..., ψ_M), with ψ_m (for 1 ≤ m ≤ M) being a fuzzy spel affinity.
Here, we use a property that states that a spel d is associated with an object n if, and only if, there is a chain of maximal strength (located entirely inside the nth object) connecting a seed spel c ∈ V_n to d. We present the proofs associated with this property in a previously published theorem (Carvalho et al., 2005). In that work, we
show that there is one, and only one, M-semisegmentation that
satisfies the properties stated in it.
The descriptions of the original and the fast multiobject fuzzy segmentation algorithms (MOFS and Fast-MOFS, respectively) can be found in our previous work (Carvalho et al., 2005). Fast-MOFS assumes that the affinity functions can take only a small number of values without significantly affecting the quality of the segmentations. In the experiments shown there, the affinity functions are rounded to three decimal places, allowing the segmentations to be computed much faster (with speedup factors around seven) without any visible degradation of the results.
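The rounding idea behind Fast-MOFS can be illustrated as follows. This is our own sketch, not the paper's code: rounding affinities to three decimal places collapses them onto at most 1001 distinct values, which allows spels to be queued in a fixed array of discrete priority buckets instead of a general priority queue.

```python
def quantize_affinity(value, decimals=3):
    """Round an affinity in [0, 1] onto a fixed grid of 10**decimals + 1
    values, so spels can be queued in an array of discrete buckets."""
    scale = 10 ** decimals
    return round(value * scale) / scale

raw = [0.123456, 0.123501, 0.999999, 0.0004]
print([quantize_affinity(v) for v in raw])  # → [0.123, 0.124, 1.0, 0.0]
```

With only 1001 possible strengths, the greedy front propagation can pop the next-strongest spel in constant time, which is one source of the roughly sevenfold speedup reported above.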
To apply the algorithms mentioned above to image segmentation, we still have to define the fuzzy spel affinities ψ_m (for 1 ≤ m ≤ M).

Figure 1. Diagram showing the interactions between the parts of our method for generating intraobject, temporally coherent, stylized animations.

Usually this is done by a computer program, based on some
minimal information supplied by a user (Udupa and Samarasekera,
1996; Herman, 1998; Carvalho et al., 1999). The idea is that, even
though the user probably would not be able to define, mathemati-
cally, the characteristics of the objects he/she wants to segment, it
is easy for him/her to select points that belong to them. A computer
program can then compute some statistics based on the neighbor-
hoods of the selected spels and use these statistics to compute the
fuzzy spel affinities. Since we are dealing with color videos, the previous approach (Udupa and Samarasekera, 1996; Herman, 1998; Carvalho et al., 1999) used to create the fuzzy spel affinities ψ_m (for
The idea behind the usage of color and motion information to segment color videos is that, in general, segmentation algorithms have problems segmenting objects whose colors are similar to the background. In this case, motion information may
help to distinguish objects. On the other hand, segmenting videos
using motion information alone may lead to problems, since motion
information can be unreliable on the boundaries of objects.
Khan and Shah (2001) propose a maximum a posteriori probability (MAP) method to segment objects in a video by using several
features, such as spatial coordinates, color, and motion. Weights are
assigned to each feature according to a group of heuristics, and a
clustering algorithm is used to segment the first frame of the video.
Then, the segmentation is propagated to the other frames of the
video.
Since our previous algorithm (Herman and Carvalho, 2001;
Carvalho et al., 2005) is a semiautomatic one, requiring user
interaction, the applicability of this methodology is restricted to
the segmentation of preacquired video shots. The adaptation of
the method consists in the construction of fuzzy affinities that
incorporate the hybrid color spaces and the motion information of
each particular object. Besides that, the video is treated as a 3D
volume, with the frames being z slices, allowing the method to
easily handle temporally nonconvex objects or segment several
objects with similar characteristics as a single one. This is done
by using the high-level knowledge about the objects introduced
in the method through the selection of seed spels by the user.
The algorithm chosen for this application was the Fast-MOFS previously described (Carvalho et al., 2005), because it allows the user to segment the video shot, evaluate the quality of the segmentation, add and/or remove seed spels, and recalculate the segmentation in a short time.
B. Motion Estimation. There are many published methods for
estimating motion from videos. The methods based on the optical
flow equation (Horn and Schunck, 1981; Lucas and Kanade, 1981;
Singh, 1991) estimate the motion of intensities in a pair of succes-
sive frames of video, allowing the recovery of approximate image
velocities by measuring spatio-temporal derivatives. According to Horn and Schunck (1981), "optical flow is the distribution of apparent velocities of movement of brightness patterns in an image." A review of earlier methods for computing the optical flow field can be seen in the work of Beauchemin and Barron (1995), as well as a taxonomy for the methods, which are classified into differential, frequency-
based, correlation-based, multiple motion, and temporal refinement
methods. Other surveys about earlier methods can be found in the
works of Aggarwal and Nandhakumar (1988), Otte and Nagel
(1994), and Stiller and Konrad (1999).
There are several problems that can increase the complexity of
the computation of the optical flow, such as occlusion, motions of
semitransparent objects, nonrigid objects, nonuniform illumination,
and noise. Besides these problems, because optical flow was usually
computed in the scale of resolution defined by the visual sensor
(Horn and Schunck, 1981), it is not appropriate to estimate large
image motions (Beauchemin and Barron, 1995). To circumvent
that, several methods were either created (Anandan, 1989) or
adapted (McCane et al., 2001) to perform hierarchical computation
of optical flow fields.
The idea here is to use motion information to aid the segmenta-
tion of objects in videos, specially when one object occludes
another object of a similar color. The motion information is incor-
porated in the fuzzy segmentation method as part of the fuzzy spel
affinities. This is done by computing a dense optical flow map
between all frames of the video shot. McCane et al. (2001) analyze the behavior of seven optical flow methods when applied to several complex synthetic scenes and controlled real scenes with ground truth, and come to the conclusion that the most accurate method is a multiresolution implementation of the method proposed by Proesmans et al. (1994).
The method of Proesmans et al. (1994) computes the optical
flow of a pair of images by employing a set of nonlinear diffusion
equations that integrate the traditional differential approach with a
correlation method. The optical flow is computed in an iterative
dual scheme, with both the forward and backward flow being com-
puted. This is done by computing the optical flow from frame n to frame n + 1 and from frame n + 1 to frame n. During these processes, consistency maps are computed and possibly are fed back
into the optical flow computation for another iteration. This aniso-
tropic diffusion allows the smoothing of the flow fields, encourag-
ing smoothing within regions but attenuating smoothing across
boundaries by using the consistency maps, and thus, increasing
flow stability within objects while maintaining flow discontinuities
between objects (Novins et al., 1998). The result of the optical flow computed by the method of Proesmans et al. (1994) is controlled by three parameters: the number of iterations for which the optical flow is computed, the number of hierarchical levels used to compute the optical flow, and the smoothness parameter λ that controls the amount of anisotropic diffusion.
C. Selecting the Color Channels. There are many color
spaces used for color image segmentation, but several different
studies (Gauch and Hsia, 1992; Liu and Yang, 1994; Cheng et al., 2001) show that no single color model is the most appropriate for segmenting all kinds of color images. This makes the selection of the color space a very important step in color image segmentation: if we can select the color channels with the aim of maximizing color separation, we can improve the accuracy of the color video segmentation. The heuristics used here to achieve this is based on the selection of the color channels with the lowest correlation values between them. The reasoning is that the channels selected in this way increase the variety of information (diversification of information) used in the fuzzy affinity functions employed in the fuzzy segmentation algorithm.
According to Hair et al. (2005), Pearson's correlation coefficient measures the intensity, or degree, of association between two variables, i.e., the linear dependency between two variables. Given
two variables, the Pearson correlation coefficient assumes a value between −1 and 1, where the value 1 indicates that the two variables have a perfect positive correlation, i.e., the variables present the same linear distribution. On the other hand, the value −1 indicates a perfect negative correlation, i.e., when the value of one variable increases, the other one decreases. A value of 0 indicates that there is no correlation between the two variables. The Pearson
correlation can be defined as follows:
$$X_{i,j} = \frac{\sum_{t=1}^{n}\left(p_t^i - \bar p_i\right)\left(p_t^j - \bar p_j\right)}{\sqrt{\sum_{t=1}^{n}\left(p_t^i - \bar p_i\right)^2}\,\sqrt{\sum_{t=1}^{n}\left(p_t^j - \bar p_j\right)^2}} \quad (1)$$

where p_t^i and p_t^j are the values of the spel t for the channels i and j, and p̄_i and p̄_j are the means of the values of the spels for the channels i and j, respectively. The value of n is defined by the number of
spels in the neighborhood of the seed spels selected by the user, i.e.,
the correlation is computed only with the spel values in areas
selected by the user as being representatives of the objects to be
segmented. The matrix X contains the correlation values between
all k channels analyzed, with no maximum limit on the number of
color models analyzed. The heuristics used to select the channels is
the following: The first channel selected is the one with the lowest
correlation with all the other channels while the second channel is
the one with the lowest correlation value to the first one selected.
Finally, the third channel selected is the one with the lowest correla-
tion values to the first two selected. To find the channel with the
lowest correlation values to the other channels, we compute, for every channel i, the value Y_i, which indicates the amount of correlation of this channel to the other k − 1 channels, and is given by the following:
$$Y_i = k - \sum_{j=1}^{k} \left|X_{i,j}\right| \quad (2)$$
where X_{i,j} is the correlation value between the channels i and j. High Y_i values indicate that the channel i has low correlation with the other channels. The first selected channel (ch1) is the one with the highest Y_i value, whereas the second channel (ch2) is the channel with the lowest X_{i,ch1} value, for 1 ≤ i ≤ k, i ≠ ch1, and the third channel is the channel that minimizes the value of X_{i,ch1} + X_{i,ch2}, for 1 ≤ i ≤ k, i ≠ ch1 and i ≠ ch2.
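The channel-selection heuristics can be sketched as follows. This is a sketch under our own assumptions: the seed-neighborhood samples are gathered into an (n × k) array, absolute correlation values are used throughout, and the helper name `select_channels` is ours.

```python
import numpy as np

def select_channels(samples):
    """Select 3 of k channels with the lowest mutual Pearson correlation.
    samples: (n, k) array of spel values from the seed neighborhoods."""
    X = np.abs(np.corrcoef(samples, rowvar=False))  # (k, k) |correlation|
    k = X.shape[0]
    Y = k - X.sum(axis=1)              # Eq. (2): high Y means low correlation
    ch1 = int(np.argmax(Y))            # least correlated with all others
    rest = [i for i in range(k) if i != ch1]
    ch2 = min(rest, key=lambda i: X[i, ch1])
    rest = [i for i in rest if i != ch2]
    ch3 = min(rest, key=lambda i: X[i, ch1] + X[i, ch2])
    return ch1, ch2, ch3

# Channels 0 and 1 are identical, so at most one of them should be chosen.
rng = np.random.default_rng(0)
a, c, d = rng.normal(size=(3, 200))
chans = select_channels(np.column_stack([a, a, c, d]))
print(chans)
```

Duplicated channels receive low Y_i values and are avoided, which is the intended diversification-of-information effect.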
D. Incorporating Motion and Color Information. The fuzzy spel affinities that incorporate motion and color information are built in a similar manner to what was done in Eq. (4) for gray-level images. In this case, the three color channels selected using the heuristics above and two motion channels (ū and v̄, which give the horizontal and vertical components of the optical flow) are used to compose the affinity functions. Thus, the color component of the fuzzy spel affinities is given by the following:

$$\psi_m(c,d)_{\mathrm{color}} = \frac{\sum_{i=1}^{3}\left(\rho_{g_{i,m},h_{i,m}}(g_i) + \rho_{a_{i,m},b_{i,m}}(a_i)\right)}{6} \quad (3)$$
where g_{i,m} is the mean and h_{i,m} is the standard deviation of the average values of the color channel i, for 1 ≤ i ≤ 3, for all pairs of neighboring spels belonging to V_m, and a_{i,m} is the mean and b_{i,m} is the standard deviation of the absolute difference of the values for all pairs of neighboring spels belonging to V_m for the color channel i. The motion component of the fuzzy spel affinities is based on precomputed optical flow maps that are combined as follows:

$$\psi_m(c,d)_{\mathrm{motion}} = \frac{\rho_{g_{\bar v,m},h_{\bar v,m}}(g_{\bar v}) + \rho_{a_{\bar v,m},b_{\bar v,m}}(a_{\bar v})}{6} + \frac{\rho_{g_{\bar u,m},h_{\bar u,m}}(g_{\bar u}) + \rho_{a_{\bar u,m},b_{\bar u,m}}(a_{\bar u})}{6} \quad (4)$$
where the functions g, h, a, and b have the same definition as above, but are computed over the values of the motion components ū and v̄. Depending on the input video and the objects that one wants to segment, the motion may be more or less important in discerning the objects. Weights are therefore assigned to both the color and motion components of the fuzzy spel affinities, ψ_m(c,d)_color and ψ_m(c,d)_motion, so the fuzzy spel affinities are given by the following:

$$\psi_m(c,d) = \begin{cases} w_1\,\psi_m(c,d)_{\mathrm{motion}} + w_2\,\psi_m(c,d)_{\mathrm{color}}, & \text{if } (c,d) \in \pi, \\ 0, & \text{otherwise}, \end{cases} \quad (5)$$

where w_1 and w_2 are weights such that w_1 + w_2 = 1.0. Comparisons
of the results produced by the MOFS algorithm with the original
fuzzy affinities and the affinities described here can be seen in
(Oliveira, 2007).
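Equation (5) reduces to a small amount of code. The sketch below is ours, not the tool's implementation: it assumes the two component affinities are already available as functions and uses an explicit `adjacent` flag to stand for membership in the adjacency relation π.

```python
def combined_affinity(psi_color, psi_motion, w1, w2):
    """Build the fuzzy spel affinity of Eq. (5): a weighted sum of the
    motion and color components for adjacent spels, 0 otherwise."""
    assert abs(w1 + w2 - 1.0) < 1e-9, "weights must satisfy w1 + w2 = 1"
    def psi(c, d, adjacent=True):
        if not adjacent:               # (c, d) not in the relation pi
            return 0.0
        return w1 * psi_motion(c, d) + w2 * psi_color(c, d)
    return psi

# Constant toy components: motion affinity 1.0, color affinity 0.5.
psi = combined_affinity(lambda c, d: 0.5, lambda c, d: 1.0, w1=0.5, w2=0.5)
print(psi(0, 1))         # 0.5 * 1.0 + 0.5 * 0.5 = 0.75
print(psi(0, 1, False))  # nonadjacent spels have affinity 0.0
```

Raising w1 shifts the segmentation toward motion cues, which is useful exactly in the similar-color, different-motion cases discussed above.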
E. Optical Flow. Several works, such as (Litwinowicz, 1997;
Hertzmann and Perlin, 2000), have proposed the use of an optical
flow method for enforcing temporal coherence. However, the local nature of optical flow computation and its sensitivity to noise somewhat limit the applicability of such techniques. To
overcome that, we proposed a method (Gomes et al., 2007a) where
an optical flow algorithm is used for enforcing temporal coherence,
but with the search area for the spel matching restricted by object
boundaries obtained by the segmentation algorithm. Thus, the opti-
cal flow information can be used to enforce intraobject temporal co-
herence on these sequences.
F. Constrained Optical Flow. We decided to use a multiresolution implementation of the optical flow algorithm proposed by Proesmans et al. (1994) because it produces a very dense optical flow map (with one motion estimate per spel), and because it was evaluated by McCane et al. (2001) as the best (in accuracy and consistency) among several algorithms when applied to three complex synthetic scenes and one real scene. The algorithm uses a system of six nonlinear diffusion equations that computes forward and backward disparity maps.
The Constrained Optical Flow can then be computed by limiting
the search area for the optical flow algorithm to the area occupied
by the same object in the next frame. This is defined as follows:
Given a 3D image I (the video stored as a sequence of frames) and
an M-segmentation map σ, the constrained optical flow of the spels belonging to object k is computed over the image I_k, which is defined by the following:

$$I_k(x,y,z) = \begin{cases} I(x,y,z), & \text{if } \sigma_k^{(x,y,z)} \neq 0, \\ -1, & \text{otherwise}, \end{cases} \quad (6)$$

where σ_k^{(x,y,z)} ≠ 0 indicates that the spel (x,y,z) belongs, with grade of membership σ_k^{(x,y,z)}, to object k, and the value −1 is used as a flag
to signal the optical flow algorithm that its computation should not
include the spel (x,y,z). Thus, the Constrained Optical Flow calcu-
lated from two successive frames for the whole frame is given by
the union of nonnull flow vectors of the calculated Constrained Op-
tical Flow from the individual objects, i.e., the individual flow maps
computed for each object that is going to be stylized are combined prior to the rendering step. It is important to note that the Constrained Optical Flow computation is needed only for the objects we want to render with intraobject temporal coherence. Limiting the optical flow search area in the constrained optical flow results in flow maps with much less error than the global optical flow counterpart (Gomes et al., 2007a), as can be seen in
Figure 2.
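A minimal sketch of the masking in Eq. (6) and of the union of per-object flow maps follows. The function names and array layouts are our own assumptions; the actual flow solver in the tool is the method of Proesmans et al.

```python
import numpy as np

def mask_object(frame, membership_k):
    """Eq. (6): keep spels with nonzero grade of membership to object k,
    and flag every other spel with -1 so the flow solver skips it."""
    return np.where(membership_k != 0, frame.astype(np.float64), -1.0)

def combine_flows(per_object_flows):
    """Union of the nonnull flow vectors of each object's flow map
    (each map is an HxWx2 array of (u, v) estimates, zero where undefined)."""
    combined = np.zeros_like(per_object_flows[0])
    for flow in per_object_flows:
        nonnull = np.any(flow != 0, axis=-1)
        combined[nonnull] = flow[nonnull]
    return combined

frame = np.array([[10, 20], [30, 40]])
membership = np.array([[0.8, 0.0], [0.0, 0.5]])  # object k on the diagonal
masked = mask_object(frame, membership)
print(masked)  # spels outside the object become -1
```

Because each per-object flow is computed only inside its own mask, spel matches can never cross an object boundary, which is what suppresses the swimming of drawing elements near silhouettes.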
G. Homography Seeding. A framework for producing temporally coherent stylized videos was proposed in previous work by Collomosse et al. (2005). The idea is to segment objects treating the
video as a 3D image. The object boundaries stored, frame to frame,
are used to compute local motion estimates for video objects that
are then used to model the interframe motion as a homography.
A homography is defined in 2D space as a mapping from a
point on a ground plane as seen from one camera to the same point
on the ground plane as seen from a second camera (Hartley and Zisserman,
2003) (or in the next frame, in our case). This has many
practical applications; most notably it provides a method for com-
posing a scene by pasting 2D or 3D objects into an image or video
with the correct pose. More formally, if we have two cameras, a and b,
looking at points $p_i$ in a plane, the homography maps the projection
${}^b p_i$ of $p_i$ in b to the point ${}^a p_i$ in a as follows:

$${}^a p_i = K_a H_{ba} K_b^{-1} \, {}^b p_i, \qquad (7)$$

where $H_{ba}$ is given by the following:

$$H_{ba} = R - \frac{t\, n^T}{d}. \qquad (8)$$

In Eq. (8) above, R is the rotation matrix by which frame b is
rotated in relation to frame a; t is the translation vector from a to b;
n and d are the normal vector of the plane and the distance to the
plane, respectively; and $K_a$ and $K_b$ are the cameras' intrinsic parameter
matrices (Hartley and Zisserman, 2003).
When the image region in which the homography is computed is
small, or the image has been acquired with a large focal length, an
affine homography is a more appropriate model for image displacements.
An affine homography is a special type of general homography
whose last row of $H_{ba}$ is fixed to $h_{31} = h_{32} = 0$ and $h_{33} = 1$.
In our case, homogeneous coordinates are used in practice to implement
the homography transformation, because matrix multiplication
cannot directly perform the division required for perspective projection;
the homography thus becomes a linear transformation followed by a division.
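This use of homogeneous coordinates can be sketched as follows. It is a minimal NumPy illustration under our own naming (`apply_homography` is a hypothetical helper, not part of the tool's API): points are lifted to $(x, y, 1)$, multiplied by the $3 \times 3$ matrix, and divided by the third coordinate.

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2D points through a 3x3 homography: lift to homogeneous
    coordinates (x, y, 1), multiply, then divide by the third coordinate."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # the perspective division

# An affine homography: last row fixed to h31 = h32 = 0, h33 = 1,
# so the division is by 1 and the map is linear-plus-translation.
H_affine = np.array([[2.0, 0.0, 1.0],
                     [0.0, 2.0, -1.0],
                     [0.0, 0.0, 1.0]])
```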
By the formalism above, one can note that the homography is,
in general, neither a linear nor an affine transformation, and that it assumes both
planar motion and rigid objects, thus limiting the accuracy of this
method when applied to a wide range of objects and motions. Some
of the objects used in the current work are nonrigid; for example,
the frog shown in Figure 3 has nonrigid motion of its
articulations. We introduce an improvement
over the above cited algorithms by treating nonrigid objects as a
connected set of rigid components, as few as necessary, each one
modeled separately. So, a homography (rigid) can be calculated for
each one of these components, allowing the tracking of the object
in every frame.
Figure 2. Application of the multiresolution implementation of the optical flow algorithm of Proesmans et al. to two subsequent frames of the
Pooh sequence, on the whole image (a) and constrained to the segmented object only (b). By looking at the original sequence, we can see that
the Constrained Optical Flow yields better results, especially close to the borders of Winnie the Pooh. Only part of each frame is shown, to
emphasize the differences between the results.

This simple approach allows the tracking of the object in every
frame, with some help from the user who, presumably, knows the
objects to be tracked in the movie. An initialization is necessary, in
which the user breaks each nonrigid object into its components. Then,
the tracking of each component is performed using an approach
based on correlation measures. This simple approach has proven
sufficient for the image sequences used in this work.
Other tracking approaches were not tried here, since this one solves
our problem with good performance, as will be shown in
the experiments. Besides, tracking nonrigid objects in
video sequences is a well-known topic in the field of Computer
Vision, and several works are available in the literature (Tissainayagam
and Suter, 2005).
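A correlation-based component tracker of this kind can be sketched as below. This is a toy exhaustive normalized cross-correlation search (a real implementation would restrict the search to a window around the component's previous position), and `track_component` is an illustrative name, not the tool's actual function.

```python
import numpy as np

def track_component(template, frame):
    """Locate a rigid component in the next frame by exhaustive
    normalized cross-correlation: slide the template over the frame
    and return the (row, col) of the best-correlated patch."""
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -np.inf, (0, 0)
    for y in range(frame.shape[0] - th + 1):
        for x in range(frame.shape[1] - tw + 1):
            patch = frame[y:y+th, x:x+tw]
            p = patch - patch.mean()
            denom = np.sqrt((t**2).sum() * (p**2).sum())
            score = (t * p).sum() / denom if denom > 0 else -np.inf
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos
```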
The framework developed in our rendering tool uses the homog-
raphy-based interframe motion estimates to seed the flow map used
in the constrained optical flow, observing that the homography is
applied to object components instead of a global one for each object
to be tracked. This approach has two potential advantages over the
constrained optical flow described above. First, a good motion esti-
mate speeds up the computation of the optical flow maps. Second,
the optical flow maps become smoother, especially close to the bor-
ders of the objects, since the homography maps the object shapes in
two adjacent frames. The use of the constrained optical flow to pro-
cess the homography motion estimates allows us to overcome the
limitations of the homography method described by Collomosse
et al. (2005) when dealing with nonrigid objects and nonplanar
motion.
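Turning a per-component homography into an initial flow field for seeding can be sketched as follows; this is an illustrative NumPy fragment (the name `homography_to_flow` is hypothetical): every spel of the component is mapped through H, and the displacement $H(p) - p$ becomes its initial flow vector.

```python
import numpy as np

def homography_to_flow(H, mask, shape):
    """Seed an initial flow field from a component homography: each
    spel in the component mask is mapped through H and the
    displacement (H(p) - p) becomes its initial flow vector."""
    flow = np.zeros(shape + (2,))
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=1).astype(float)
    mapped = pts @ H.T
    mapped = mapped[:, :2] / mapped[:, 2:3]  # perspective division
    flow[ys, xs, 0] = mapped[:, 0] - xs
    flow[ys, xs, 1] = mapped[:, 1] - ys
    return flow
```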
H. FusionFlow. We mentioned before that the main goal of the
AnimVideo tool is to provide powerful software tools for manip-
ulating the input videos and creating animations. Thus, we devel-
oped the AnimVideo tool in a modular way, making it easy to
add other segmentation, tracking, motion estimation, and render-
ing methods.
There are many other optical flow methods, and the Middlebury
optical flow page (http://vision.middlebury.edu/flow) provides
a ranking of several state-of-the-art methods.
The data sets and evaluation methods used in the Middlebury
optical flow ranking (Baker et al., 2007) emphasize problems
associated with nonrigid motion, real sensor noise, complex natu-
ral scenes, and motion discontinuities. They achieve that by
including data sets with realistic synthetic sequences, nonrigid
motion where the ground-truth is known, stereo pairs of static
scenes, and high frame-rate video used to assess interpolation
error. Four quality measures are used for ranking the optical flow
methods, two for flow accuracy, and two for frame interpolation
quality, where the two measures of flow accuracy are the angular
error and the end-point error.
The ability to easily make use of different methods for parts of
the animation’s creation process was exercised here by using
another optical flow method to generate an animation. The method
chosen is the FusionFlow method, proposed by Lempitsky et al.
(2008), which is currently ranked as one of the top four methods on
the Middlebury optical flow page in all four measures.
The FusionFlow method (Lempitsky et al., 2008) formulates the
optical flow computation as a graph cut problem: it iteratively fuses
a set of candidate solutions (proposals) using minimum cuts, and it
models the optical flow estimation using pairwise Markov Random
Fields, as was done in (Heitz and Bouthemy, 1993; Black and
Anandan, 1996). The energy function used has two terms: a data
term, which measures how well the flow field describes the matching
between pixels in the two images, and a spatial term, which penalizes
changes in horizontal and vertical flow between adjacent pixels.
The distance between two pixels in the data term is the Euclidean
distance in RGB space, computed after high-pass filtering of the
input data to make the data term more robust to illumination and
exposure changes.
The proposals were computed using the Lucas-Kanade (LK)
method (Lucas and Kanade, 1981) that usually produces accurate
estimates for textured regions but not for textureless regions, and
the Horn-Schunck (HS) method (Horn and Schunck, 1981), which
usually produces accurate estimates for regions with smooth motion
but over-smooths areas with motion discontinuities. Apart from
those computed proposals, some constant flow fields were also
used.
The process starts by randomly choosing one of the LK or HS
proposals as an initial solution, randomly visiting all other pro-
posals, and fusing them with the current solution, one by one. The
constant flow fields are then computed using clusters of flow vectors
of the fused solution produced after this first pass. Then, the fusion
process is repeated twice for all proposals, now including the constant
flow proposals. It is important to emphasize that the fusions do
not increase the energy of the solution, and it was observed by
Lempitsky et al. (2008) that the final solution "always has an energy
that is much smaller than the energy of the best proposal." After
performing this discrete optimization, a standard conjugate gradient
method is used to perform a local optimization that produces more
accurate flow estimates in areas where the proposal solutions were
not diverse enough.
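The fusion step can be illustrated with a much-simplified sketch. The real method solves each fusion move as a minimum-cut problem on a pairwise MRF; the toy version below (our own names, with a quadratic stand-in energy) only preserves the defining property that a fusion never increases the energy, by selecting the lowest-energy labeling among the current solution, the proposal, and a greedy per-spel mix.

```python
import numpy as np

def energy(flow, data, lam=1.0):
    """Toy MRF energy: quadratic data term plus pairwise smoothness
    penalizing flow differences between 4-neighbors."""
    d = np.sum((flow - data) ** 2)
    s = (np.sum((flow[1:] - flow[:-1]) ** 2)
         + np.sum((flow[:, 1:] - flow[:, :-1]) ** 2))
    return d + lam * s

def fuse(current, proposal, data, lam=1.0):
    """One simplified fusion move: build a greedy per-spel mix of the
    two solutions, then keep whichever of {current, proposal, mix} has
    the lowest energy -- so a fusion never increases the energy."""
    pick_current = (np.sum((current - data) ** 2, axis=-1, keepdims=True)
                    <= np.sum((proposal - data) ** 2, axis=-1, keepdims=True))
    mix = np.where(pick_current, current, proposal)
    return min([current, proposal, mix], key=lambda f: energy(f, data, lam))
```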
Frames of two animations generated using the flow maps
produced by the FusionFlow method are shown in Section IV.
Figure 3. The Frog, a model with nonrigid motion between frames. [Color figure can be
viewed in the online issue, which is available at www.interscience.wiley.com.]
I. Stylized Rendering. As mentioned above, the tool can be used
with any stylized rendering technique, as long as it is implemented
in the framework of the AnimVideo rendering tool. So, to validate
the AnimVideo rendering tool, we have implemented five artistic
styles.
The first artistic style implemented is impressionist painting. Similarly to what is done in the original work (Litwinowicz, 1997),
we render the brush strokes according to a predetermined size,
using the average color of the region where the brush stroke is
placed. The difference between our method and the previous
method (Litwinowicz, 1997) is that we use the constrained optical
flow as described in Section III, resulting in much less error in the
rendering of brush strokes in our case. We can also use the optical
flow information to create velocity and motion-emphasis
effects, by using the orientation and magnitude of the flow vectors to
determine the size and orientation of the brush strokes, or their life
span, respectively.
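Mapping flow vectors to stroke parameters can be sketched as below. The function name and the two parameters (`base_len`, `gain`) are illustrative choices, not values from the tool; the actual stroke model also controls color and life span.

```python
import numpy as np

def stroke_params(flow, base_len=4.0, gain=2.0):
    """Derive per-stroke orientation (radians) and length from the
    flow: direction follows the flow vector, and length grows with
    its magnitude, giving a motion-emphasis effect."""
    u, v = flow[..., 0], flow[..., 1]
    angle = np.arctan2(v, u)
    length = base_len + gain * np.hypot(u, v)
    return angle, length
```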
The second artistic style described here is mosaicing. In a previ-
ous work (Gomes et al., 2007b), we have implemented techniques
for performing mosaics using different initial tile distributions, such
as the Centroidal Voronoi Diagrams (CVDs), similarly to the works
of Hausner (2001) and Faustino and Figueiredo (2005) for static
mosaics, or the Distance Transform Matrices (DTM), as done by Di
Blasi and Gallo (2005) to distribute quadrilateral tiles, all designed
for generating static mosaics. After computing the DTM of an
image, we can compute the gradient and the level line matrices,
which determine the tile orientations and positions, respectively.
However, the method of Di Blasi and Gallo (2005) handles only
tiles of the same size. On the other hand, since we segment the input
video into disjoint objects, our method can associate different char-
acteristics with them, such as the tile size, emphasizing regions
close to borders, as is done by Faustino and Figueiredo (2005).
Tile removal or addition becomes necessary when objects move
closer or further away from the camera, or when new parts of the
scene appear in the video. This is done to maintain a consistent
appearance of the tiles in the animation, i.e., a homogeneously
dense animated mosaic. To control the addition/removal of tiles, we
defined a threshold that specifies the maximal superposition that
two tiles can have; overlapping tiles appear as if one had been
cut to make space for the other, something common in real-life mosaics.

Figure 4. Stylized renderings produced using the AnimVideo rendering tool, showing the
rendering styles of colored sand bottle (a and b), watercolor (c and d), and the combination of
watercolor (background) and impressionism (frog) (e and f). [Color figure can be viewed in
the online issue, which is available at www.interscience.wiley.com.]

By playing with this threshold, we can achieve
more or less tightly packed tiles in areas where the video is
changing.
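The superposition test can be sketched with axis-aligned rectangles; this is a simplification (mosaic tiles are generally polygons) and the names below are ours, not the tool's.

```python
def overlap_fraction(a, b):
    """Fraction of tile b covered by tile a; tiles are axis-aligned
    (x, y, w, h) rectangles, a simplification of the real tile shapes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    return (ix * iy) / (bw * bh)

def keep_tile(existing, new, max_overlap=0.5):
    """Place a new tile only if no existing tile exceeds the maximal
    superposition threshold on it."""
    return all(overlap_fraction(t, new) <= max_overlap for t in existing)
```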
Another artistic style implemented simulates a typical art craft
from the Northeastern region of Brazil that uses colored sand to
compose landscape images on the inner surface of transparent glass
bottles. Since the visual interaction takes place solely at the inner
glass surface of the bottle, we implemented a method for generating
2D procedural sand textures (Britto-Neto and Carvalho, 2007) that
can then be combined to compose images similar to the ones pro-
duced by the artists. We also implemented two techniques to mimic
effects created by the artists using their tools. The images generated
can then be texture mapped into the inner surface of a 3D glass bot-
tle model. (Artists can also create pictures between two flat pieces
of glass, producing a "painting" that can be laid on a flat surface.)
By implementing these techniques (here called CSand, short for
colored sand), we allow users to create not only bottles with images
similar to the ones produced by the artists, but also animations
using these sand bottles, something close to impossible in real life
with the original technique.
A related technique implemented is the one named Sandbox
(Britto-Neto, 2007), where movies are shown as if they were playing
on a sandbox, and objects inside it push sand around as they move
about. This method, inspired by the work of Sumner and co-workers
(O'Brien et al., 1999), is used more effectively on movies,
where the background is static and there are few objects moving on
it. After the objects of interest are segmented with the segmentation
module of the AnimVideo tool, we generate depth masks for each
moving object, followed by the computation of the compression of
the sand under the objects and the dislocation of sand on the edges
of the objects. Finally, some erosion is performed to smooth out the
sand ripples generated by the objects.
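The compression/displacement step can be illustrated with a very small sketch. This is our own toy stand-in, not the Sandbox implementation: the displaced volume is simply piled uniformly on the border spels, and a single smoothing pass stands in for the erosion step.

```python
import numpy as np

def push_sand(height, obj_mask, depth=1.0):
    """Compress the sand under the object's depth mask, pile the
    displaced volume on the spels bordering the object, then smooth
    once (a stand-in for the erosion step)."""
    h = height.astype(float)
    neighbors = np.zeros_like(obj_mask)
    neighbors[1:, :] |= obj_mask[:-1, :]
    neighbors[:-1, :] |= obj_mask[1:, :]
    neighbors[:, 1:] |= obj_mask[:, :-1]
    neighbors[:, :-1] |= obj_mask[:, 1:]
    border = neighbors & ~obj_mask          # spels adjacent to the object
    h[obj_mask] -= depth                    # compression under the object
    if border.any():
        h[border] += depth * obj_mask.sum() / border.sum()  # displaced sand
    # one smoothing pass (periodic boundaries, for brevity) as "erosion"
    return (h + np.roll(h, 1, 0) + np.roll(h, -1, 0)
              + np.roll(h, 1, 1) + np.roll(h, -1, 1)) / 5.0
```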
The fifth style implemented in the AnimVideo tool is the Watercolor style proposed by Bousseau et al. (2006) to perform image
stylization. This method consists of creating a simplified, abstracted
version of the input image and applying textures that simulate a
watercolor appearance to this abstracted image. This technique was
later extended to handle videos in (Bousseau et al., 2007), where the
temporal coherence was achieved by using texture advection along
lines of optical flow, whereas the video abstraction was performed
with 3D morphology filters, treating the time (i.e., the frames of the
video) as the third dimension, in the same way it is done here.
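Texture advection along optical flow lines can be sketched as a backward warp of the paper texture. This is a one-directional, nearest-neighbor illustration under our own naming; the published method blends bidirectional advections and uses higher-quality sampling.

```python
import numpy as np

def advect_texture(texture, flow):
    """Backward-warp a paper texture along the flow so that texture
    features follow the motion (nearest-neighbor sampling, edge
    positions clamped to the image border)."""
    h, w = texture.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs - flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys - flow[..., 1]).astype(int), 0, h - 1)
    return texture[src_y, src_x]
```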
IV. RESULTS
To validate our tool, we have made experiments producing stylized
renderings. Figure 4 shows examples of stylized animations
produced using the AnimVideo rendering tool: Figures 4a and 4b
show two frames of an animation using the CSand technique,
Figures 4c and 4d show two frames of a video rendered using the
watercolor style of Bousseau et al. (2006, 2007), and Figures 4e and 4f
show two frames of an impressionist rendering of the frog, with the
background rendered as a watercolor. Finally, Figures 5a and 5b show
two input frames, whereas Figures 5c, 5d, 6a, and 6b show the directional
painting renderings of two frames from two different input videos
of the Middlebury evaluation data sets, generated using
the flow maps produced by the FusionFlow method (Lempitsky
et al., 2008).

Figure 5. Two original frames (a and b) and the corresponding stylized frames (c and d)
generated using a directional painting style. [Color figure can be viewed in the online issue,
which is available at www.interscience.wiley.com.]
On the left of Figure 5d, where the orange ball is located, we
can see the effect of wrong flow vectors being used for mapping the
brush strokes. This is exactly the case where a segmentation step
can help, since a previous segmentation of the ball as a separate
object can be used to restrict the search space of the optical flow
method to the area where the ball is located in the next frame.
The quality of the stylized videos is very dependent on the quality
of the segmentation of the objects. If the segmentation is not
good, the rendering module will incorrectly render the parts of the
object that were mistakenly segmented. Of course, very noisy
videos will affect the quality of the Constrained Optical Flow result,
even to the point of making it useless. However, the segmentation
method described here has been successfully used to segment very
diverse videos, some of which contained several occluded objects,
and moving shadows (Oliveira, 2007). The speed of the segmenta-
tion process is also important, since the user can interact with the
program, adding and/or removing seed spels, and reprocess the
video. The average time for segmenting the videos mentioned here
was about 4 μs per spel on a Pentium 4 3.0 GHz.
Another potential issue is the aperture problem, an underlying
limitation of optical flow methods. Possible
approaches to solve this problem are the conversion of the
motion problem to a stereo problem and finding the correspondence
between a number of points in the image at time t to the image at
time t + δt, or the computation of the optical flow and the use of its
geometrical properties to deduce 3D information about the scene
and the motion. Here, we adopt an alternative solution to attenuate
the aperture problem by using homographies to estimate motion,
breaking the image objects into rigid components, and then using
these motion estimates as the initial flow field of the optical flow
method of Proesmans et al. (1994). Such estimates
could also be used as one of the proposals in the fusion process of
the FusionFlow method (Lempitsky et al., 2008).
V. CONCLUSIONS
We have proposed a tool for generating stylized renderings of videos.
The tool implements a method for enforcing temporal coherence
that is based on the full segmentation of the input video shot using a
fast fuzzy segmentation algorithm, followed by the computation of
optical flow maps, which are produced by restricting the search area
of the optical flow method to the corresponding segmented object.
The segmented objects can then be used as different layers in the
rendering process, thus providing us with many options in the rendering
phase, such as rendering different objects using different
artistic styles, as is done with the frog and the background in
Figures 4e and 4f. The approach
based on homographies can be used to produce good estimations
for motion to serve as an initial solution for the iterative optical
flow computation. We have also shown that flow maps produced by
a different optical flow method can be used to generate animations,
due to the modular structure of the tool.
In the experiments, we produced several frames, for several
animations, using the rendering modules of the AnimVideo tool,
with the rendering styles of Mosaic, CSand, Sandbox, Impressionist
painting and Watercolor. The AnimVideo tool was designed to eas-
ily allow the addition of plugins containing new rendering styles or
segmentation and point tracking techniques (used to compute the
homographies), so it could be also used as a framework for the
development of well-known and new rendering styles in undergrad-
uate computer graphics classes. Other future works include the
addition of other computer vision techniques to allow more robust
processing of highly complex video shots.
ACKNOWLEDGMENTS
A preliminary version of this article (Oliveira et al., 2008) was pre-
sented at the 12th International Workshop on Combinatorial Image
Analysis, which took place in Buffalo, NY, in April 7–9, 2008. The
authors thank Stefan Roth, V. Lempitsky, and C. Rother for making
available to us the flow maps used to generate Figures 5 and 6.
REFERENCES
J.K. Aggarwal and N. Nandhakumar, On the computation of motion from
sequences of images—A review, Proc IEEE 76 (1988), 917–935.
P. Anandan, A computational framework and an algorithm for the measure-
ment of visual motion, Int J Comput Vis 2 (1989), 283–310.
Figure 6. Two stylized frames generated using a directional painting
style. [Color figure can be viewed in the online issue, which is
available at www.interscience.wiley.com.]
S. Baker, S. Roth, D. Scharstein, M.J. Black, J.P. Lewis, and R. Szeliski, A
database and evaluation methodology for optical flow, Int Conf Comput
Vis’07 1 (2007), 1–8.
S.S. Beauchemin and J.L. Barron, The computation of optical flow, ACM
Comput Surv 27 (1995), 433–466.
M.J. Black and P. Anandan, The robust estimation of multiple motions:
Parametric and piecewise-smooth flow fields, Comput Vis Image Under-
stand 63 (1996), 75–104.
A. Bousseau, M. Kaplan, J. Thollot, and F. Sillion, Interactive color render-
ing with temporal coherence and abstraction, Proc Int Symp Non-Photoreal-
istic Anim Render 1 (2006), 141–149.
A. Bousseau, F. Neyret, J. Thollot, and D. Salesin, Video watercolorization
using bidirectional texture advection, ACM Trans Graphics (Proc SIG-
GRAPH) 26 (2007), 104.
L.S. Britto-Neto, Renderizações não fotorealísticas para estilização de
imagens e vídeos usando areia colorida, Master's thesis, Universidade Fed-
eral do Rio Grande do Norte, 2007.
L.S. Britto-Neto and B.M. Carvalho, Message in a bottle: Stylized rendering
of sand movies, Proc XX Braz Symp Comput Graphics Image Process (SIB-
GRAPI’07), IEEE, Los Alamitos, CA, 2007, pp. 11–18.
B.M. Carvalho, C.J. Gau, G.T. Herman, and T.Y. Kong, Algorithms for
fuzzy segmentation, Pattern Anal Appl 2 (1999), 73–81.
B.M. Carvalho, G.T. Herman, and T.Y. Kong, Simultaneous fuzzy segmen-
tation of multiple objects, Discrete Appl Math 151 (2005), 55–77.
B.M. Carvalho, L.M. Oliveira, and G.S. Silva, Fuzzy segmentation of color
video shots, Proc DGCI, Springer-Verlag, London, 2006, Vol. 4245, pp.
402–407.
H.D. Cheng, H. Jiang, Y. Sun, and J.I. Wang, Color image segmentation:
Advances and prospects, Pattern Recognit 34 (2001), 2259–2281.
J.P. Collomosse, D. Rowntree, and P.M. Hall, Stroke surfaces: Temporally
coherent artistic animations from video, IEEE Trans Visual Comput Graph
11 (2005), 540–549.
G. Di Blasi and G. Gallo, Artificial mosaics, Vis Comput 21 (2005), 373–383.
G. Faustino and L. Figueiredo, Simple adaptive mosaic effects, Proc SIB-
GRAPI 1 (2005), 315–322.
J. Gauch and C. Hsia, A comparison of three color image segmentation algo-
rithms in four color spaces, SPIE Vis Commun Image Process'92 1818 (1992),
1168–1181.
R.B. Gomes, T.S. Santos, and B.M. Carvalho, Coerência temporal intra-
objeto para NPR utilizando fluxo óptico restrito, Revista Eletrônica de Ini-
ciação Científica 7 (2007a), 2007205.
R.B. Gomes, T.S. Santos, and B.M. Carvalho, Mosaic animations from video
inputs, Proc IEEE Pacific-Rim Symp Image Video Technol, Springer-Ver-
lag, London, 2007b, Vol. 4872, pp. 87–99.
B. Gooch and A. Gooch, Non-photorealistic rendering, AK Peters, Natick,
MA, 2001.
J. Hair, B. Black, B. Babin, R. Anderson, and R. Tatham, Multivariate data
analysis, 6th edition, Prentice Hall, Upper Saddle River, NJ, 2005.
R. Hartley and A. Zisserman, Multiple view geometry in computer vision,
Cambridge University, Cambridge, UK, 2003.
A. Hausner, Simulating decorative mosaics, Proc ACM SIGGRAPH 1
(2001), 207–214.
F. Heitz and P. Bouthemy, Multimodal estimation of discontinuous optical
flow using Markov random fields, Trans Pattern Anal Appl 17 (1993), 185–
203.
G.T. Herman, Geometry of digital spaces, Springer, Danvers, MA, 1998.
G.T. Herman and B.M. Carvalho, Multiseeded segmentation using fuzzy
connectedness, IEEE Trans Pattern Anal Mach Intell 23 (2001), 460–474.
A. Hertzmann and K. Perlin, Painterly rendering for video and interaction,
Proc NPAR 1 (2000), 7–12.
B. Horn and B. Schunck, Determining optical flow, Artif Intell 17 (1981),
185–203.
S. Khan and M. Shah, Object based segmentation of video using color,
motion and spatial information, Proc IEEE CVPR 2 (2001), 746–751.
V. Lempitsky, S. Roth, and C. Rother, Fusionflow: Discrete-continuous opti-
mization for optical flow estimation, CVPR08 1 (2008), 1–8.
P. Litwinowicz, Processing images and video for an impressionist effect,
Proc ACM SIGGRAPH 1 (1997), 407–414.
J. Liu and Y.-H. Yang, Multiresolution color image segmentation, IEEE
Trans Pattern Anal Mach Intell 16 (1994), 689–700.
B. Lucas and T. Kanade, An iterative image registration technique with an
application to stereo vision, 7th Int Joint Conf Artif Intell 1 (1981), 674–679.
B. McCane, K. Novins, D. Crannitch, and B. Galvin, On benchmarking
optical flow, Comput Vis Image Understand 84 (2001), 126–143.
K. Novins, D. Mason, S. Mills, B. Galvin, and B. McCane, Recovering
motion fields: An evaluation of eight optical flow algorithms, In 9th British
Machine Vision Conference, Southampton, UK, 1998, pp. 195–204.
J.F. O’Brien, R. Sumner, and J.K. Hodgins, Animating sand, mud, and
snow, Comput Graph Forum 18 (1999), 17–26.
L.M. Oliveira, Segmentação fuzzy de imagens e vídeos, Master's thesis,
Universidade Federal do Rio Grande do Norte, Natal, Brazil, 2007.
L.M. Oliveira, L.S. Britto-Neto, R.B. Gomes, T.S. Santos, G.S. Andrade,
and B.M. Carvalho, ‘‘Producing stylized renderings using the AVP render-
ing tool,’’ In Image Analysis—From Theory to Applications, R.P. Barneva
and V.E. Brimkov (Editors), Research Publishing, Singapore, 2008, pp. 55–
64.
M. Otte and H.-H. Nagel, Optical flow estimation: Advances and compari-
sons, Eur Conf Comput Vis 1 (1994), 51–60.
M. Proesmans, L.V. Gool, E. Pauwels, and A. Oosterlinck, Determination of
optical flow and its discontinuities using non-linear diffusion, Proc 3rd
ECCV 2 (1994), 295–304.
A. Rosenfeld, Fuzzy digital topology, Inform Contr 40 (1979), 76–87.
A. Singh, Optic flow computation: A unified perspective, IEEE Computer
Society Press, Los Alamitos, CA, 1991.
C. Stiller and J. Konrad, Estimating motion in image sequences: A tutorial
on modeling and computation of 2D motion, IEEE Signal Process Mag 16
(1999), 70–91.
T. Strothotte and S. Schlechtweg, Non-photorealistic computer graphics:
Modeling, rendering and animation, Morgan Kaufman, San Francisco, CA,
2002.
P. Tissainayagam and D. Suter, Object tracking in image sequences using
point features, Pattern Recognit 38 (2005), 105–113.
J.K. Udupa and S. Samarasekera, Fuzzy connectedness and object definition:
Theory, algorithms, and applications in image segmentation, Graph Model
Image Process 58 (1996), 246–261.
J. Wang, Y. Xu, H.-Y. Shum, and M.F. Cohen, Video tooning, ACM Trans
Graphics 23 (2004), 574–583.
G. Winkenbach and D.H. Salesin, Computer-generated pen-and-ink illustra-
tion, In Proc SIGGRAPH 1994, ACM SIGGRAPH, Orlando, FL, 1994, pp.
91–100.