Post on 31-Jan-2023
© Association of Art Historians 2011 2
Pictorial Grammar: Chomsky, John Willats, and the Rules of Representation
Paul Smith
Noam Chomsky’s work has had enormous impact, not only within linguistics,
but also upon the humanities, social sciences, and sciences more generally. It has
certainly been of especial importance for philosophy because it has revivified the
stalled debate between Rationalists and Empiricists as to whether our knowledge is
innate or acquired with its proposal that our capacity to understand and produce our
own language – and to learn others too – is grounded in the fact that we are born in
possession of the principles governing all possible grammatical sentences.1
The discipline of art history has remained almost completely impervious to
Chomsky’s work, however. Instead it has preferred of late to draw its understanding
of what is language-like about pictures from structuralism,2 or from critical revisions
of this body of theory that nevertheless preserve its foundational assumption that
both signifiers and their associated signifieds are conventional and arbitrary. Applied
uncritically to pictures, this would imply that their signs simply cannot be iconic, in
the Peircean sense of resembling what they stand for, which would raise the question
why so many depictions do indeed look convincing. The other main theory of
representation to draw analogies between pictures and language (under the broader
rubric of ‘the language of art’) that still enjoys some currency among art historians
is Gombrich’s. But this provides no more satisfying resolution to the matter of the
arbitrariness or iconicity of depiction than its rival. Instead – as Richard Wollheim
observed – it contradicts its own thesis that pictorial signs can be illusionistic (or
strongly iconic) by maintaining at the same time that they are always conventional,
in the sense of subject to arbitrary rules of interpretation.3 (Gombrich also regards
painting as a ‘code’ of sorts.) Given this impasse, not the least advantage of applying
Chomskian principles to the analysis of pictures is that it generates an account that
makes it possible to accommodate iconicity and conventionality together amenably.
In a career spanning forty years, the psychologist (and trained engineer and artist)
John Willats developed an account of precisely this kind, largely on the basis of the
‘standard theory’ Chomsky elaborated in his first two books.4 It is therefore relatively
unsophisticated compared with the more recent versions of Chomsky’s theory,5
which can specify the operations of linguistic grammar in minute detail in terms
that embrace syntax and semantics (or structure and sense) simultaneously, and at
the same time explain how we can in practice implement grammar economically in
our speech.6 Willats’s account is none the less serviceable for pictures since it rests on
a conception of their operations that derives from those concepts most fundamental
to Chomsky’s account, and which have endured through the several reforms it has
Detail from Paul Cézanne, Victor Chocquet Seated, c. 1877 (plate 18)
DOI: 10.1111/j.1467-8365.2011.00835.x
Art History | ISSN 0141-6790 34 | 2 | June 2011 | pages XX-XX
undergone. It also turns out that pictures possess only a restricted set of semantic
features organized by a relatively small number of syntactical rules. So, rather as
Chomsky’s work could reconfigure our understanding of the role of nature and
nurture in language acquisition, Willats’s account is powerful enough to reconfigure
our understanding of what is natural and what is conventional about depiction. It is
capable, in other words, not only of transcending the limitations of semiotic theory,
but also those of philosophical theories (unfamiliar to most art historians) which
regard pictures as seamless and transparent icons that resemble what they represent
naturally (as opposed to conventionally).
In short, what Willats argues on this score is that pictures are neither wholly
conventional nor naturally iconic. Instead he proposes that pictures have a natural
basis, and non-conventional characteristics, because they issue from our innate
capacity to ‘map’ percepts (or mental representations) of real scenes and their
components according to rules that preserve many of their objective properties. At
the same time, however, he also argues that convention plays a substantial role in
depiction, since it is this that decides which of the manifold possible variants of these
rules are employed by particular groups and cultures.
Willats acknowledged his debt to various aspects of Chomsky’s thinking in
several places. In his magnum opus, Art and Representation of 1997,7 he recognized both
the impact that Chomsky’s theory of syntax made on his understanding of pictorial
structure, and his indebtedness to the pioneering research into ‘picture grammars’
undertaken in the 1960s by David Huffman, Max Clowes, and Adolfo Guzman
under the impact of Chomsky’s theories.8 And later, in Making Sense of Children’s Drawings
of 2005, Willats declared that his explanation of how children learn to draw was
premised upon Chomsky’s ideas about our innate capacity for producing speech.9
Willats also recognized Chomsky’s more indirect influence in both books by stating
that the other main source of his general theory was David Marr’s Vision of 1982,10 a
work that explicitly modelled its use of computational principles to explain how the
brain converted raw data into conscious percepts upon Chomsky’s use of similar
principles to explain how speech transformed its own ‘base’ content.11
Notwithstanding this, fleshing out Willats’s account will involve analysing a
good deal more explicitly than he does himself how his theory applies Chomskian
principles, since he only rarely cites specific passages of Chomsky’s writings, and is
reluctant to pursue theoretical generalizations. More specifically, demonstrating
that pictures have a grammar similar in kind to the grammar Chomsky discerned in
language will involve demonstrating a series of more basic facts: first, that pictures
have parts which can be segmented out of their larger structures; second, that syntax
operates on their component parts; third, that they map shapes and spatial relations
in a grammatical way; fourth, that the grammatical rules operating in any picture
are innately grounded; and, finally, that these rules map, or transform, a ‘deeper’
perceptual content.
It must be acknowledged that the type of account to be developed solely by
extrapolating from Willats must necessarily be restricted to characterizing how
pictures perform their most basic cognitive function, namely that of rendering
objects in forms that allow them to be recognized. I nevertheless hope to show
that developing a theory of this kind (or of pictorial grammar pure and simple) is
worth the effort. One justification for doing so is that the reality of such a grammar
is vigorously contested.12 Another is that the analytical categories provided by
Willats greatly enhance our understanding of how pictures work. What is more,
even a rudimentary account of pictorial grammar is capable of making sense of the
complex meanings pictures sometimes achieve when they bend, or subvert, its rules.
And although it would be unwise to attach too much weight to any further claims
about the potential of a theory of pictorial grammar, I will none the less tentatively
suggest in my conclusion that it may be capable of explaining aspects of depiction
other than shape or spatial relations, and may also contribute towards a better
understanding of the relationship between pictures and language. A general theory
of pictorial grammar is, in other words, a worthwhile aspiration for art historians as
for others.
Convention, Iconicity, and Mapping
A convenient route into the differences between semiotic and natural resemblance
theories of depiction is provided by the arguments advanced by Yve-Alain Bois in
his essay, ‘The Semiology of Cubism’, of 1992,13 and by Richard Wollheim in his
response to it in his essay, Formalism and its Kinds, of 1995.14 In the ‘principles’ laid out
at the outset of his essay, Bois argues that the pictorial sign in cubist collage from
1912 is used in the same way as either the arbitrary ‘signifier’ (described by Saussure)
or the (Peircean) index, and that its iconic use is all but entirely suspended.15 An
index is simple enough in that it is a sign that refers to its referent by grace of a
causal or existential bond of some kind. Signifiers are more complex. In the first
place, they do not refer to things and the like, but signify their own signifieds.
More specifically, because signifiers are distinguished from one another within
any ‘system’, or language, by the play of phonic differences alone,16 the signifieds
that system can generate are solely a function of the differences it can generate, and
hence are as arbitrary as their associated signifiers. It is also important to emphasize
that the signifiers actually present in any utterance do not signify by virtue of their
relationship to one another alone, but by virtue of their relationship to their absent
relatives as well.17 Bois extends these arguments by analogy to propose that the
meanings of pictorial signifiers are a function of the whole system they constitute
(although he is unclear about how this category should be applied), and are therefore
arbitrary in a strong sense. Signs thus conceived are evidently incapable of reference,18
at any event, and only mean what they do because convention assigns them their
meaning.
Because Bois sees no role for iconicity in cubist collage, he does not allow that the
games at work in some cubist collages – often flagged by the words ‘jouer’ or ‘jeu’
or suchlike – depend on playing with, or undermining, the function that iconic signs
fulfil when used more straightforwardly. In Picasso’s Bowl with Fruit, Violin and Wineglass
of 1912 (plate 1), for instance, it is ambiguous whether the piece of paper representing
a wooden violin is painted to look like wood – and is therefore a straightforwardly
iconic sign – or whether it is painted to look like faux-bois wallpaper – and is therefore
a sign that imitates a sign that routinely employs iconicity for the purposes of posing
as indexical. But even while the ambiguity of such signs calls their own authority
into question, their ability to do so is ultimately parasitic upon the reliability of the
overwhelming majority of iconic signs to refer.19 So even though Picasso cut out the
illustrations of fruit in the top left-hand corner in rough angular shapes that place
them in the pictorial equivalent of quotation marks, the joke only works because they
still manage to resemble their referents.
Although Bois’ essay does not purport to be a general theory of depiction,
Wollheim nevertheless argues in Formalism and its Kinds that no theory of the kind it
implies could ever be tenable.20 Wollheim’s aversion to Bois’ way of thinking arises
from his commitment to the idea that ‘seeing-in’ and iconicity are both fundamental
to depiction generally,21 even overtly non-figurative painting; from which it follows
that any theory maintaining the possibility that a painting’s content can be non-
representational falsifies what is most central to it. What underlies Wollheim’s
opposition to Bois, in other words, is what he construes as the ‘latent formalism’
apparent in his concern for structure. Rather than tackling this commitment head
on, however, Wollheim prefers to criticize it indirectly by countering Bois’ more
general (and, to be fair, largely implicit) claim that pictures are language-like
in structure. And he does this by identifying a series of conditions drawn from
Chomsky’s thinking that representational pictures ought to be able to satisfy were
they language-like, but which they cannot because (in his view) they are not.22 These
all require that pictures possess grammatical features of a kind that Bois cannot
discern in them (any more than structuralism can in language).23 But Wollheim’s
critique goes further because it purports to show that there is no such thing as
pictorial grammar tout court, which – if right – must mean that Willats’s account of
depiction collapses along with Bois’.
1 Pablo Ruiz y Picasso, Bowl with Fruit, Violin, and Wineglass, 1912. Charcoal, black chalk, watercolour, oil, coarse charcoal or black pigment in binding medium, on newspaper (Le Journal, 6 and 9 December 1912), blue and white laid charcoal papers, supported by thin cardboard, 64 × 49.5 cm. Philadelphia: Philadelphia Museum of Art (A. E. Gallatin Collection, 1952-61-106). © Succession Picasso/DACS, London. 2010. Photo: Philadelphia Museum of Art.
Wollheim’s most fundamental objection to the idea of pictorial grammar is that
pictures cannot involve ‘syntactical rules’ because, unlike sentences, they cannot be
‘segmented into smaller units’ or ‘ultimately basic units’ upon which syntax operates.
Wollheim denies, in short, that pictures consist of lexical (and other kinds of) items
ordered by syntax. This claim can be understood better by grasping how it relates
to Flint Schier’s criticism of the notion of pictorial grammar in Deeper into Pictures of
1986, a work Wollheim regarded highly.24 This contends that ‘pictures are weakly
decomposable or compositional but not strongly so’ like ‘sentences’, and hence
contain nothing ‘which plays the role of a word’ (or ‘morpheme’) in them.25 Schier
consequently claims that in order to understand a picture we have no need to grasp
anything akin to the syntax operating in a sentence, or ‘the relevant grammatical
rules governing [the] composition … of its parts’.26 Rather, he believes that there
are ‘no grammatical rules’ governing the composition of pictures, and he sees ‘no
place for a grammar or syntax of pictures’ at all.27 Indeed, Schier sums up Wollheim’s
position along with his own when he asserts that ‘There is no need for a grammar in
Chomsky’s sense … for iconic systems’,28 since a picture is just ‘built up iconically out
of its parts’. Our manifest ability to recognize a whole icon is, in other words, simply
a function of our ability to recognize ‘the objects and states of affairs represented by
[its] parts’.29 What this seems to suggest is that pictures iconify by grace of natural
resemblance alone, or in virtue of being somehow – or in some unspecified sense –
isomorphic to their referents.
One of the more important implications of such a view is that pictures resemble
things without making recourse to any independent syntax of their own which
might complicate their relationship to their referents. So, even though Wollheim
eschews the notion of pictorial grammar altogether, his arguments about the
naturalness of iconicity nevertheless accord with the Port-Royal grammarians’ view
that there was nothing in the grammar of language that exceeded the requirement
that it should be able to reflect the natural order of things. This view is highly
reductive, however, as Chomsky famously demonstrated when he showed that the
‘surface structure’ of a sentence maps – or transforms – a ‘deep structure’ according
to rules proper to itself.30 It follows that inasmuch as he embraces a closely analogous
conception of pictorial grammar, Willats maintains that pictorial structure is to some
extent sui generis too.
At the same time as challenging Wollheim’s and Schier’s account of
representation for being too strongly naturalistic, Willats implicitly contests Bois’
view for being too strongly conventionalist. This is because he envisages pictorial
mapping as a process that preserves a measure of identity between the structures of
pictures and those of the objects and scenes they represent. Willats suggests, more
particularly, that pictorial syntax requires semantic elements like lines, and higher-
order structures like whole shapes, to fall into arrangements consistent with those
formed by their real counterparts. Lines, in other words, must occlude one another
since this is what edges do, and shapes must have continuous surfaces because this is
how they are in actual objects.
Although it may seem that pictures cannot both transform their referents and
preserve some of their features, Willats argues synthetically that pictorial structures
possess what might be thought of as a qualified isomorphism with the structure of
things. He seems to regard pictures, in other words, as closely akin to maps, which
transform what they represent regularly, and not just haphazardly. This sense of
pictures as approximating to maps is shared by Andrew Harrison, who argues that
they succeed in plotting many of the relations between things, even though they do
so according to the rules of their own syntax.31 Evidently, the great advantage of a
view such as this is that it makes it possible to maintain perfectly coherently that all
viable pictorial structures can be formalized in sets of normative or conventional
rules (like those of schoolboy grammar), and that they will be transparent or iconic if
they transform percepts in ways that allow these to be recoverable. Put another way,
it is implicit in Willats’s account of representation that pictures succeed in referring to
things because their structures are ultimately constrained by what Richard Gregory
called ‘the grammar of vision’ (a formulation cited by Chomsky).32
For and against Pictorial Grammar – 1. Segmentability
Before it is possible to establish anything of this complexity, any theory of pictorial
grammar must first of all establish that pictures have component semantic parts
of a kind that syntax can operate on, or assemble into larger units of sense. This
is what Willats argues, and by doing so contests Wollheim’s view of pictorial
compositionality, which denies that pictures have parts of this kind. It is important to
realize, however, that Wollheim does not suggest that pictures have no parts tout
court. Indeed, he accepted that they do in an article of 1966, when he concurred with
Michael Podro’s view that a ‘higher-level [pictorial] composition’ can have ‘basic’, or
‘sub-components’,33 arguing that many Old Master paintings have structures whose
component elements ‘are put together and contained in’ larger pictorial structures.34
Rather, Wollheim’s argument is that the parts of a picture are not the same in kind as
the discrete, and ultimately irreducible, semantic units (or whole component parts
that make sense on their own) into which sentences can be decomposed according to
Chomsky. This more restricted claim nevertheless runs directly against the grain of
the Chomskian argument Wollheim made in 1973 that ‘lexical items … concatenated
into complexes’, and ‘syntax and semantics’, ‘belong to the essence of art itself’.35
One way of understanding Wollheim’s volte-face is that his views were altered
by reading Schier. But it is also likely that his close dialogue with Nelson Goodman
throughout the 1970s affected his views, and in particular the distinction Goodman
proposed in Languages of Art (1976) between a ‘finitely differentiated’ system or
‘notational scheme’, such as the printed notation we use for language, and the
notational scheme of painting.36 In more detail, this
argument holds that whereas printed ‘characters’ are
‘disjoint’, and readily identified as ‘different’ from one
another by virtue of their membership of one ‘class’
(such as ‘a’) or another (such as ‘d’), nothing of the sort
can be said of the components of pictures.37 Instead,
pictures have ‘syntactically dense’ systems that lack
clear internal differentiation,38 from which it follows
that they cannot be ordered by syntax.
Goodman’s views were contested almost as soon
as they appeared by Curtis Carter, who argued that
‘shapes’ are indeed disjoint components of pictures
that exhibit character class membership,39 and hence
that they are readily seen as ‘constitutive elements’ of
‘larger units’ akin to ‘phonemes’, ‘morphemes’, and
‘spoken words’.40 His argument falls, however, because
it does not clearly establish what the defining limits of
a shape are. In a similar vein, Willats’s predecessors,
Huffman and Clowes, argued that structures such as
2 Drawing of a square produced by a 3½-year-old, figure 3.9 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press. Adapted from figure 10b from Jean Hayes, ‘Children’s visual descriptions’, Cognitive Science, 2: 1, January–March 1978, 14. © Cognitive Science Society.
3 Paul Cézanne, Provencal Landscape with a Red Roof, or The Pine at L’Estaque, 1875–76. Oil on canvas, 72 × 58 cm. Paris: Musée de l’Orangerie (Collection Jean Walther et Paul Guillaume). Photo: © RMN (Musée de l’Orangerie)/Franck Raux.
line junctions and vertices, being composed of more simple elements in the form
of lines, thereby imply the operation of pictorial syntax on disjoint units. It is a
weakness of Huffman’s theory, however, that it does not closely specify the character
of these units, and of Clowes’s and Guzman’s that they characterized lines somewhat
vaguely as the boundaries of ‘regions’.41
In the light of these partial successes, it was an important part of Willats’s
achievement that he clearly identified pictorial units of the kind Goodman and
Wollheim regarded as indiscernible by developing the analytical category of the
‘picture primitive’.42 These items come in a small number of simple forms: lines
denoting edges, points indicating luminosity, and colours referring to hues and tones
simultaneously. Put this way, the argument seems trite. But it actually represented
a major advance over earlier attempts to identify basic pictorial units because it
involved distinguishing a pictorial semantic unit from the marks carrying it. A line,
in other words, is no more identical to a pencil or pen mark, or a set of brushstrokes,
or the interstices between tesserae, than a word is the same as the sound in which it is
expressed.43 Nor is a point the same as a blob, or a colour the same as a brush mark.
Once this is appreciated, it starts to appear as though Wollheim’s objection to pictorial
segmentability involves conflating pictorial marks, which are not always readily
segmentable, with the primitives they carry, which are. Or to put it the other way
round, a picture decomposes straightforwardly once its units are correctly specified,
just as a sentence does.
What Willats achieved by identifying the semantic units of the picture in
this manner is closely comparable to what Chomsky achieved by inventing the
‘rewrite rule’,44 that is, the technique of characterizing the components of phrases
‘generatively’ in terms of explicit lexical categories such as noun, verb, adjective and
the like.45 And, indeed, Willats not only regards picture primitives as elements of the
picture’s ‘denotation system’, where they play a role akin to the ‘constituents’ of the
‘lexicon’ in language,46 but casts them as lexical units on account of how they refer
to ‘scene primitives’, or to the basic perceptual units of the scene they depict. Lighter
and darker points, for example, refer to the degrees of luminosity we extract from the
luminous array, and lines refer to the edges we extract from it. This means that lines
are indeed genuine units of sense, or that they are components of larger structures
that stand for things rather like nouns. Furthermore, Willats supports this inference
by showing that children’s drawings sometimes use what he calls ‘shape modifiers’,
or marks appended to a line, to specify particular aspects of its shape – almost as
adjectives – illustrating his claim with a drawing of a cube (plate 2), where these marks
stipulate several such properties.47
Being ‘primitive’, a line can evidently be decomposed out of the larger semantic
unit of a vertex, rather as a word can be isolated from a phrase. A vertex, moreover,
can be decomposed out of the larger unit of a surface, which can itself be extracted
from a whole shape, which can in turn be isolated from the whole scene depicted,
just as phrases can be decomposed out of larger phrases and ultimately whole
sentences. (The caveat is that it is nevertheless misleading to think that the units
of sense to be identified at different levels of pictorial structure have an equivalent
valence to the units present at different levels of linguistic structure.)
There is an initially plausible objection to the idea of pictorial segmentability,
however, which is that pictures do not have units that we see in isolation in the same
way as we (seem to) hear words individually. It might be claimed, for example,
that when we fixate on a particular line in a drawing we cannot isolate it from
the lines adjacent to, or conjoint with, it, whereas we hear words as having sharp
boundaries. The fact is, however, that our impression that we hear individual words
independently of the larger structures they form is retrospective. That is, we extract
individual words from the larger sequences in which they are rolled out only after
the fact, even if this is not how it seems to us.48 There is therefore a genuine analogy
4 Paul Cézanne, Woman with a Coffee-Pot, c. 1895. Oil on canvas, 130 × 95 cm. Paris: Musée d’Orsay. Photo: © RMN (Musée d’Orsay)/Hervé Lewandowski.
to be made between the phenomenology of parsing a sentence into its constituents
and what happens when we identify a picture’s units only in the context of the
complete work, or of a large section of it. The area behind the screen of trees in the
centre of Cézanne’s Provencal Landscape with a Red Roof of the mid-1870s (plate 3) is a good
example of this last phenomenon at work, since it is almost impossible to identify its
constituent primitives, or shapes, when seen in isolation (from close-to), but these
are readily identified when a sufficient distance is attained for the whole painting to
be visible.
Cézanne’s mature paintings in general tend to exhibit compositionality overtly,
since they make little attempt to hide the fact that the objects in them are made up
of smaller components. In Woman with a Coffee-Pot of c. 1895 (plate 4), for example, it
is difficult to resist the temptation to see the upper and lower halves of the sitter as
joined together in the same way as the two sections of the coffee-pot to her right,
as both ‘objects’ readily parse into roughly conical or cylindrical components
of the kind Irving Biederman calls ‘geons’, or ‘viewpoint invariant, volumetric
primitive[s]’ analogous to ‘phoneme[s]’.49 It remains a problem, however, that many
paintings appear so fluent that it can seem as though Wollheim’s argument against
compositionality must prevail. We may nevertheless be aware of their component
parts as such subliminally, since one part of our perceptual mechanism is devoted
to picking out areas of high contrast like lines, even though what that mechanism
detects is amalgamated with lower frequency information in conscious seeing (rather
as edges are when we see real scenes).50 Compositionality is thus a real feature of our
perception of pictures, even when it is not obviously so.
One symptom of compositionality is that pictures and speech are both rolled out
in units. With speech, it is obvious enough that we fabricate sense in ever-larger units
consisting of words, whole and embedded phrases, and whole sentences. But Willats
shows that we do the same kind of thing in pictures too, when, in an argument about
occlusion in pictures, he contrasts the natural sequence in which we draw an object
with a possible sequence we almost never use.51 The natural drawing (plate 5) shows
that we set down all the lines making up the front face of the object first, then repeat
the process to form the adjacent face, and finally draw its partially occluded faces
5 Stages of drawing a rectangular object, figure 5 from John Willats, ‘The third domain: The role of pictorial images in picture perception and production’, Axiomathes, 13: 1, March 2002, 1–15. Photo: With kind permission from Springer Science and Business Media.
6 Stages of drawing a rectangular object, figure 6 from John Willats, ‘The third domain: The role of pictorial images in picture perception and production’, Axiomathes, 13: 1, March 2002, 1–15. Photo: With kind permission from Springer Science and Business Media.
one after the other. The second (plate 6) shows how it is possible to begin drawing
the same object by drawing some of its various edges without joining them up into
faces. The fact, however, that we almost never use this sequence strongly implies that
we are predisposed to assemble pictures out of whole ‘units’ of the very kind whose
existence Wollheim contests, at the earliest possible stage of the picture-making or
picture-perceiving process.52
2. Syntax
Two of the key sources of Willats’s ideas about pictorial syntax were Huffman’s
article, ‘Impossible Objects as Nonsense Sentences’,53 and Clowes’s ‘On Seeing
Things’, both of 1971. Both of these used line drawings of ‘anomalous’ objects such as
an impossible polyhedron (plate 7) to reveal the rules governing the combination of
edges in properly-formed objects and pictures, in imitation of Chomsky’s technique
of using ambiguous and nonsense sentences to reveal the rules governing well-
formed sentences and their component ‘strings’.54 By this account, the sentence
‘colourless green ideas sleep furiously’ could appear to be a well-formed string
inasmuch as the adjectives, noun, verb and adverb are all in the right place, but it is
anomalous nevertheless because it violates (and hence reveals) the ‘selectional rule’ that
only certain kinds of ‘lexical items’ can be placed together if a sentence is to make
sense. By contrast, there is no such problem with the sentence ‘revolutionary
new ideas appear infrequently’, for all that it is superficially identical in structure.55
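The contrast between the two sentences can be made concrete in a small Python sketch. It is a toy illustration only: the lexicon, the feature names (`concrete`, `abstract`, `animate`) and the single word-order pattern are hypothetical simplifications introduced here, not Chomsky's own formalism. It shows that both sentences pass a check on the order of lexical categories, while only the first fails a check on which items may be combined.

```python
# Toy illustration: every entry and feature below is a hypothetical
# simplification, not a claim about Chomsky's actual rules.
LEXICON = {
    "colourless":    {"cat": "Adj", "needs": {"concrete"}},
    "green":         {"cat": "Adj", "needs": {"concrete"}},
    "revolutionary": {"cat": "Adj", "needs": {"abstract"}},
    "new":           {"cat": "Adj", "needs": set()},
    "ideas":         {"cat": "N", "has": {"abstract"}},
    "sleep":         {"cat": "V", "needs": {"animate"}},
    "appear":        {"cat": "V", "needs": set()},
    "furiously":     {"cat": "Adv"},
    "infrequently":  {"cat": "Adv"},
}

# The one category sequence this sketch accepts: Adj Adj N V Adv.
PATTERN = ["Adj", "Adj", "N", "V", "Adv"]


def is_well_ordered(sentence):
    """True if the words occur in the permitted order of categories."""
    return [LEXICON[w]["cat"] for w in sentence.split()] == PATTERN


def selection_violations(sentence):
    """List (word, noun) pairs where the noun lacks a required feature."""
    words = sentence.split()
    noun = next(w for w in words if LEXICON[w]["cat"] == "N")
    noun_features = LEXICON[noun]["has"]
    violations = []
    for w in words:
        entry = LEXICON[w]
        # Only adjectives and verbs carry requirements in this toy lexicon.
        if entry["cat"] in ("Adj", "V") and entry["needs"] - noun_features:
            violations.append((w, noun))
    return violations


ANOMALOUS = "colourless green ideas sleep furiously"
ORDINARY = "revolutionary new ideas appear infrequently"
```

Under these assumptions `is_well_ordered` accepts both sentences, whereas `selection_violations(ANOMALOUS)` reports three clashes and `selection_violations(ORDINARY)` reports none.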
One of the more notable discoveries of Huffman and Clowes was the principle
that the rules governing the lawful combination of edges in a rectilinear object can
be expressed as variations of four basic categories of line junction, the V-, W-, Y-, and
T-junctions (plate 8), to use Huffman’s nomenclature; or the Ell, Arrow, Fork, and Tee
junctions, to use Clowes’s.56 This made it possible at least to envisage that pictorial grammar
could be described systematically in terms of clear categories
like those Chomsky had isolated. It also provided strong evidence for the existence of
something closely analogous to the kind of ‘well-formed’ structure, or ‘string’, that
Chomsky had discerned in grammatical sentences.
At the same time, Huffman and Clowes developed systems of labelling their
diagrams which showed the convexity or concavity of an edge clearly (a technique
they compared to ‘parsing’ a sentence),57 but which also made it possible readily
to identify ‘forbidden’, or ‘ungrammatical’,58 combinations of edges as well. They
were consequently able to specify a number of rules that they proposed pictures
could not infringe without being ill-formed, which included the rule that the same
surface cannot exist on two sides of the same edge, and the rule that the character of
an edge must be identical on both sides of the line describing it (viz. either convex
or concave).59 They also made it clear that all the junctions in an object must be
susceptible of being resolved coherently together, just as the elements of a sentence
must, if it is to be well- and not ill-formed.
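The labelling rule described above lends itself to a mechanical check. The following Python sketch is illustrative only and is not drawn from Huffman or Clowes: it assumes a hypothetical representation in which each junction is a dictionary assigning a label ('+' for convex, '-' for concave, '>' for occluding) to every line that meets it, and it tests only the single rule that a line must carry the same label at each junction it enters. It does not encode the full catalogue of permitted junctions.

```python
from typing import Dict, List

# Hypothetical data layout: one dictionary per junction, mapping the name of
# each line meeting at that junction to the label the junction gives it.
# '+' = convex edge, '-' = concave edge, '>' = occluding edge.
Junction = Dict[str, str]


def inconsistent_lines(junctions: List[Junction]) -> List[str]:
    """Return the lines that receive different labels at different junctions."""
    seen: Dict[str, str] = {}
    clashes: List[str] = []
    for junction in junctions:
        for line, label in junction.items():
            if line in seen and seen[line] != label and line not in clashes:
                clashes.append(line)
            # Remember the first label assigned to each line.
            seen.setdefault(line, label)
    return clashes


def is_well_formed(junctions: List[Junction]) -> bool:
    """A drawing passes this simplified test if no line changes character."""
    return not inconsistent_lines(junctions)


well_formed = [{"a": "+", "b": ">"}, {"a": "+", "c": "-"}]
ill_formed = [{"a": "+", "b": ">"}, {"a": "-", "c": ">"}]
print(is_well_formed(well_formed))     # True
print(inconsistent_lines(ill_formed))  # ['a']
```

On this simplified reading, a drawing in which line a is parsed as convex at one end and concave at the other is flagged as 'ungrammatical', much as the drawing of an impossible object would be.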
In this way Huffman and Clowes gave grounds for thinking that Wollheim
was wrong to contest the reality of pictorial grammar on the basis that there is
nothing analogous in pictures to the ‘well-formed strings’ of grammatical sentences
because it cannot be said of them, as it can of language, that ‘the meaning [of a
well-formed] string … is determined by its syntactical form plus [its] vocabulary,
or lexicon’. A further distinctive feature of Huffman’s and Clowes’s arguments is
that they maintained that we interpret shapes from the bottom-up, first of all by
identifying simple structures like line junctions which are governed by only a few
rules, then by identifying the larger, coherent structures these create, and ultimately
by identifying the whole scene such structures form
together. Interpreting shapes this way is clearly
akin to the way that Chomsky suggests we interpret
sentences, by first identifying their smaller units and
then combining these into larger and more complex
structures. Chomsky notoriously developed two
methods of representing the consequences of this
process, bracketing sentences and representing them in
the form of ‘trees’ (plate 9), both of which show clearly
how sentences are formed out of nesting structures,
or several levels of sense. Huffman’s and Clowes’s
diagrams have no structure of this sort because neither
they nor Guzman realized that a description of the
rules governing line junctions and their combinations
is subject to higher-level rules governing the manner
in which the picture projects, or maps, three-
dimensional spatial relations into two dimensions.60
But in characterizing the ‘drawing system’ at work in
the picture as a set of rules of precisely this sort, Willats
made it possible to see how it was also the top level of
pictorial syntax to which all subsidiary structures are
subservient.
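The equivalence between bracketing and tree diagrams can be made concrete in a few lines of code. The sketch below is an illustration under stated assumptions, not a reproduction of plate 9: the constituent labels S, NP and VP, and the grouping of the words, are a simplified and hypothetical parse of 'A wise man is honest'. It shows only that a single nested structure yields both a bracketed string and a countable number of levels.

```python
from typing import Tuple, Union

# A node is either a word (str) or a tuple of the form (label, child, child, ...).
Tree = Union[str, Tuple]


def bracket(node: Tree) -> str:
    """Render a nested structure as a labelled bracketing."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"


def depth(node: Tree) -> int:
    """Count the levels of nesting above the individual words."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1:])


# Hypothetical, simplified parse used for illustration only.
sentence = ("S", ("NP", "a", "wise", "man"), ("VP", "is", "honest"))
print(bracket(sentence))  # [S [NP a wise man] [VP is honest]]
print(depth(sentence))    # 2
```

A picture analysed in the same way would place the drawing system at the root, with junctions and the larger structures they form as its subordinate nodes.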
The realization that projection was part of pictorial
syntax was only implicit in the research Willats
published in the book, Drawing Systems of 1972, co-
authored with Fred Dubery. Nonetheless, this text
offered a clear description of how the various different
methods of projection each impart a specific character
to vertices and line junctions because they articulate
them differently. As Willats later made explicit, the
same vertex or junction can look quite different in
7 Impossible polyhedron, figure 2e from David Huffman, ‘Impossible objects as nonsense sentences’, in Bernard Meltzer and Donald Michie, eds, Machine Intelligence, vol. 6, Edinburgh, 1971, 295–323. Photo: Courtesy of Edinburgh University Press.
8 Look up list of labelled line junctions in drawings of rectangular objects, figure 5.4 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press. Originally figure 6 from David Huffman, ‘Impossible objects as nonsense sentences’, in Bernard Meltzer and Donald Michie, eds, Machine Intelligence, vol. 6, Edinburgh, 1971, 295–323. Photo: Courtesy of Edinburgh University Press.
9 Tree diagram of the surface structure of the sentence, ‘A wise man is honest’, from Noam Chomsky, ‘Linguistic contribution: present’ [1967], in Language and Mind, Cambridge, 2006 [1968], 26. Photo: © Cambridge University Press.
vertical oblique projection (plate 10), for example, from
how it looks in oblique projection proper (plate 11).
More importantly, his early research insisted that not
all projection systems bestow a particular ‘secondary
geometry’ on shapes as a function of their ‘primary
geometry’, or because they map spatial relationships
in a scene as they appear from a single viewpoint.61 He
argued instead that some drawing systems can only
be understood in terms of their secondary geometry
alone, or through how they map relationships in a
scene without reference to viewpoint.
Willats and Dubery coined the term ‘drawing
system’ to identify the different projection systems
that fell under this more comprehensive definition.
And, crucially, they were able to show that all forms of
vanishing-point perspective and parallel projection,
including complex systems such as three-point and
inverted perspective, or isometric and axonometric
projection, could be explained by systematic rules.
Later, in Art and Representation, Willats extended the
concept of the drawing system to very simple
projection systems that do not appear to be rule-
governed at all at first glance. Perhaps most remarkably
he even discerned a kind of regularity in the
‘enclosure’ system used by very young children (plate
12). Even though such ‘enclosures’ do not map shape
systematically but merely describe brute extension in
individual shapes, they none the less give some sense of
the basic topological relations – such as adjacency – that
obtain between the elements of an object.62
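The idea that a drawing system is a rule mapping spatial relations in a scene onto the picture surface can be illustrated computationally. The Python sketch below is a simplification under stated assumptions: the function names, the coordinate convention (x across, y up, z receding in depth), and the 45-degree default angle are hypothetical choices made for illustration, and the formulas are the usual textbook forms of these parallel projections rather than Willats's own notation. Topological systems such as enclosure are not covered.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]  # (x across, y up, z depth), assumed convention
Point2 = Tuple[float, float]         # position on the picture surface


def orthogonal(p: Point3) -> Point2:
    """Depth is simply discarded, so only the front face is shown."""
    x, y, _ = p
    return (x, y)


def vertical_oblique(p: Point3) -> Point2:
    """Depth is represented by moving straight up the picture surface."""
    x, y, z = p
    return (x, y + z)


def horizontal_oblique(p: Point3) -> Point2:
    """Depth is represented by moving sideways across the picture surface."""
    x, y, z = p
    return (x + z, y)


def oblique(p: Point3, angle_deg: float = 45.0, scale: float = 1.0) -> Point2:
    """Depth is represented by moving along an oblique direction."""
    x, y, z = p
    angle = math.radians(angle_deg)
    return (x + scale * z * math.cos(angle), y + scale * z * math.sin(angle))


corner = (1.0, 1.0, 1.0)  # one rear vertex of a unit cube
print(orthogonal(corner))          # (1.0, 1.0)
print(vertical_oblique(corner))    # (1.0, 2.0)
print(horizontal_oblique(corner))  # (2.0, 1.0)
print(oblique(corner))
```

The same vertex thus lands in a different place, and meets its neighbouring edges at different angles, under each system, which is one way of seeing how each system gives line junctions their own character.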
10 Drawing of a rectilinear object in vertical oblique projection, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.
11 Drawing of a rectilinear object in oblique projection, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.
12 Drawing of a rectilinear object in enclosure, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.
Although it was shortly after the publication of
Drawing Systems that Willats realized his conception of
projection had analogies in Chomsky’s conception
of syntax,63 only in his last book did he explicitly
compare the drawing system to the high-level syntax
that organizes a sentence’s components into a coherent
whole.64 And although Willats remained reticent
about exactly how the analogy might play out, he
nevertheless implies that the drawing system might
be compared to the ‘parameters’ that specify such
things as the sequence of subject, verb and object in a
particular language,65 or which determine whether
a language is in Chomskian terms ‘left-branching’
or ‘right-branching’ in structure.66 It was clear in
any event that the rules of the drawing system take
precedence over rules of the sort described by Huffman
and Clowes, and that in consequence a picture’s
units can indeed be described in terms of the kind of
branched ‘nesting’ structures into which Chomsky
analysed sentences. It is no surprise therefore that the
type of object recognition known as ‘syntactic pattern
recognition’ has proposed to analyse drawings and
visual scenes into tree diagrams of precisely this kind
(plate 13 and plate 14).67
One of the most significant aspects of Willats’s final
conception of the drawing system is its claim that the
syntax of any such system organizes a picture’s parts
exclusively in its own terms. This means that a picture’s
secondary geometry will inevitably produce a pictorial
space that is neither flat, nor the same as that of its real
counterpart, but which constitutes a ‘third domain’
13 Pictorial patterns for scene A and picture F, figure 1.1 from King Sun Fu, Syntactic Pattern Recognition and Applications, first edn, Englewood Cliffs, NJ, 1982, 2. © Prentice Hall, 1982. Photo: Reprinted by kind permission of Pearson Education Inc., Upper Saddle River, NJ.
14 Hierarchical structural descriptions of scene A and picture F, figure 1.2 from King Sun Fu, Syntactic Pattern Recognition and Applications, first edn, Englewood Cliffs, NJ, 1982, 3. © Prentice Hall, 1982. Photo: Reprinted by kind permission of Pearson Education Inc., Upper Saddle River, NJ.
15 Picture-faces from Ludwig Wittgenstein, Lectures and Conversations on Aesthetics, Psychology and Religious Belief, ed. Cyril Barrett, Oxford, 1966, section 10, 4. Photo: By permission of John Wiley & Sons.
between the two that we can recover only from its drawing system.68 This radical
argument has several consequences, the most important of which for the purposes
of my argument is that it reopens the gap between shape percepts and pictures, a
gap which – as already mentioned – had been completely overlooked in naturalistic
accounts of iconicity, and all but ignored in the early picture grammars too.
The most obvious conclusion to be drawn from Willats’s arguments about
pictorial syntax is that it is the glue that binds the elements of a picture into a
coherent whole. By corollary, then, Wollheim and Schier are not able to explain
how pictures hold together. This shortcoming is evident in Schier’s discussion of a
diagram composed of a rectangle surrounded by dots, which is taken to represent a
conference table surrounded by delegates. Even though Schier accepts that this image
uses a ‘compositional’ system the ‘grammar and lexicon’ of which are parasitic upon
‘the linguistic environment’, he maintains that ‘the significance of its composition’
requires ‘no special or prior stipulation’, and can be explained by its ‘quasi-natural’
‘grammar’ alone,69 or in effect its ability to replicate the structure of the scene it
represents. He therefore argues that while such a drawing may contain ‘sub-iconic
parts’ in the form of lines and dots that do not signify anything outside their broader
pictorial context, their contribution to the whole ‘is not determined conventionally’
or by the operations of syntax.70 Even excusing Schier’s curious insistence on the
sheer conventionality of linguistic grammar, which takes no account of its innate
16 Henri Matisse, Blue Nude (La Grenouille), 1952. Lithograph by Mourlot after the gouache découpée (published in Verve, 9: 35–6, 1958), 18 × 16 cm. © Succession H. Matisse/DACS 2010.
basis, the problem remains that his account does not explain why we come to see
these parts differently in context from how we see them in isolation. That is, he
simply fails to see that this phenomenon can only be explained by the operations of a
syntax that is not ‘bracket-free’,71 but bracketed like Chomsky’s, or which determines
the form and sense of the units upon which it operates by combining them into
larger units themselves subordinated to yet larger units.
The problem comes into plainer view still in Schier’s account of the ability of a
‘sub-iconic’ pair of dots, which iconify nothing on their own, to iconify eyes within
a picture-face. His argument here is that they can do so in this context because
they function as units of a whole icon.72 But again this does not explain what it is
about being placed in such a context that makes the dots look so different. Indeed, it
would seem that the only credible explanation for the fact that they come to life – as
eyes – is that the grammar of picture-faces mimics the peculiarly holistic grammar of
face perception.73 The point here is that this form of perceptual grammar is unique
because it binds eyes and the other elements of faces together into a meaningful
whole. Within this gestalt the dot-eyes enjoy a special relationship to each other as a
pair, to the other features of the face, and to the rim of the face; with the consequence
that they look radically different in this setting from how they look outside it. This idea
would certainly seem to explain why the same feature looks different on different
faces, and why people find it hard to recognize a person from one facial feature alone.
If the grammar of picture-faces does in fact mimic that of face perception it
would help to explain several aspects of their peculiarity. It makes sense, for instance,
of the way that even very small changes to a schematic picture-face will normally
result in significant changes to its ‘expression’, as Wittgenstein observed (plate 15).74
It would also explain why the round regions denoting the breasts of the figure in
Matisse’s Blue Nude (plate 16) seem unusually compelling, without quite turning into
eyes, when viewed within the context provided by her upraised arms and torso. The
peculiarity of this kind of grammar would, in addition, account for the fact that there
is what Andrew Harrison has called a ‘minimal syntax’ to a picture-face,75 or that it
can only work when it organizes a sufficient number of features within a sufficiently
replete structure. As Harrison was fond of observing, a vivid demonstration of
minimal syntax is that it is all but impossible to draw the Cheshire cat disappearing.76
Too much information, as in Tenniel’s illustration, will yield a cat obscured by
foliage; but too little will produce ‘a grin without a cat’ – something Alice declared to
be ‘the most curious thing I ever saw in my life!’
The peculiar, holistic, grammar of the face also poses a problem for artists,
inasmuch as it means that faces will look different in kind from other objects in
the picture. Most artists successfully dissemble any problems of this kind, but
Cézanne sometimes overcame them, as Merleau-Ponty pointed out, by ‘painting [the]
face “as an object”’, or a mere assemblage of shapes.77 His reasons for doing so were
undoubtedly complex, but the fact that he took such drastic steps in portraits like
Woman with a Coffee-Pot demonstrates how pictorial syntax was – and is – an inescapable
reality.
Another way of describing what happens when we see a face as a meaningful
whole as opposed to a mere collection of shapes is that the syntax allows us to
perceive the relation between its parts synchronically rather than piecemeal. Pictorial
syntax also allows us to see a whole picture as a single coherent unit the meaning of
which is present all at once, just as linguistic syntax makes the sense of a complete
sentence present as such. The synchronicity of sense in both kinds of output is
explained better by the idea that syntax subsumes the meanings of the constituent
parts of a picture or a sentence within the meaning of the whole. The effect is, of
course, normally involuntary, and recent research shows that the unconscious impulse
towards synchronic coherence results in unexpected similarities between the ways
in which we attend to both pictures and language. Robert Solso has demonstrated,
for example, that a spectator familiar with pictorial conventions will move her eyes
around a painting in an attempt to discover ‘thematic patterns’ in it,78 rather as
Alan Kennedy has shown that the eye movements we deploy when reading are not
linear, but involve both retrospective movements designed to check the sense we
have already acquired and proleptic movements used to anticipate the sense we are
about to encounter.79 At all events, such findings strongly suggest that it is simplistic
to assert that a painting is perceived instantaneously while a piece of poetry is heard
sequentially.80 Rather, it would seem that syntax bestows a degree of synchronicity
upon both.
3. Grammaticality
Willats’s conception of pictorial grammar as something that results in structures
analogous to well-formed strings – or words arranged in grammatically correct
structures – implies that pictures will normally display a form of grammatical
correctness.
A simple example of a well-formed pictorial string is a line junction that
obeys the rules of occlusion described by Huffman and others. To be well formed
at a higher level, a picture must use projection systematically. The majority of
pictures observe these rules and look right as a consequence, so much so that their
grammaticality is elusive. This obedience becomes apparent when they are compared
with pictures that do contravene the rules. Huffman used drawings of anomalous
objects in this spirit, demonstrating how ungrammatical combinations of lines
will render a picture spatially incoherent in ways foreign to well-formed drawings.
So too, Willats argues that many Byzantine and ‘Orthodox’ paintings employing
several drawing systems simultaneously look anomalous, nowadays, since they
contravene the rule that projection should be consistent throughout the picture, as
it is in photographs and paintings in perspective.81 It is clear, then, that Wollheim’s
protestations against the possibility of ‘ill-formed
or ungrammatical’ strings in pictures, and Schier’s
denial of the very possibility of an ‘ill-formed [iconic]
symbol’ genuinely analogous to ‘a sentence that is
ungrammatical in Chomsky’s sense’,82 simply run
counter to the phenomenal facts.
If grammar operates throughout every level of a
picture’s structure, it must not only employ a drawing
system and a denotation system that are each internally
consistent to qualify as well formed, but also ensure
that these are compatible with each other. Most of
the time this is the case, and, conversely, when a
picture’s drawing system clashes with its denotation
system its ill-formedness is normally patent. Willats’s
argument on this count is complex, however, since his
ideas about compatibility involve a dense distinction
between drawing and denotation systems that are
‘view-based’ and those that are ‘object-based’.
The basis of this distinction is that either kind
17 Drawing of a rectilinear object in fold-out projection, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.
of system maps a corresponding kind of percept. View-based systems are simple
enough in this regard, being projection systems such as perspective which map how
spatial relations look from a particular point of view, or denotation systems which
use primitives like contours to express the views presented by edges and surfaces.
By contrast, object-based systems articulate the more elusive percepts that David
Marr called ‘object-centred descriptions’, or the mnemonic descriptions of shapes in
general, from no particular point of view, which we derive from objects by tracking
their silhouettes back to the three-dimensional forms capable of generating them.83
Although controversial, Marr’s arguments in favour of the existence of object-
centred descriptions are compelling.84 It would be difficult to understand how we
could recognize things unless we could match the views they present to relatively
simple representations of invariant shapes of a kind our limited memory is capable
of storing. Moreover, only recourse to something like object-centred descriptions
can explain how we can be aware of the solidity and depth objects have even though
they only present us with their surfaces. At all events, Willats argues persuasively that
18 Paul Cézanne, Victor Chocquet Seated, c. 1877. Oil on canvas, 46 × 38 cm. Columbus, OH: Columbus Museum of Art (Howald Fund Purchase, 1950.024). Photo: Columbus Museum of Art.
many projection systems are predominantly object-based, including several forms
of parallel projection, since these preserve the relative lengths of the sides of objects
as they really are, rather than as they appear. So too are primitives of the kind Willats
calls ‘regions’, which stand for entire volumes rather than for surfaces as they appear.
The ‘stick’ regions children use to represent limbs, for example, do exactly this, and
should not be mistaken for rudimentary contours.
Because the two kinds of system are fundamentally different, trouble inevitably
arises when a picture combines a projection system of one kind with a denotation
system of the other. Children, for example, get into difficulties as they mature when
they try to project views while remaining attached to the regions they had formerly
deployed in drawings projected by enclosure. So, for example, when they experiment
with ‘fold-out’ projection in an attempt to capture how the sides of a rectilinear
object appear to join up (plate 17), they still use regions to denote the objective shape
of these sides, with the result that their drawings look very odd indeed.85 Willats also
identifies several other uncomfortable combinations of projection and denotation
system,86 which further support the claim that a grammatical relation between the
two kinds of systems is normative. Ironically enough, so does Schier’s conference dot
diagram, since it is an ill-formed icon of the very kind whose possibility he denies.
The more speciic problem with it is that both its drawing system and its denotation
19 Paul Cézanne, Pot of Primroses and Fruit, c. 1888–90. Oil on canvas, 46 × 56.25 cm. London: The Courtauld Gallery (Samuel Courtauld Bequest 1948). Photo: © The Samuel Courtauld Trust.
system are ambiguous. So it could be that it uses orthogonal projection to render a
bird’s-eye view of the contours of the table and the surrounding delegates, but it can
just as easily be seen as employing enclosure to articulate the same items in the form
of regions. It is unclear in any case whether we should interpret it according to the
rules of view-based or object-based systems.
Schier’s diagram is fit for purpose nonetheless. One reason for this is that we
do not require diagrams to be consistent throughout in how they represent things.
But, according to Willats, we also find images of this kind acceptable because we are
accustomed to the fact that there are ‘degrees of grammaticalness’ in pictures, just
as there are in normal sentences according to Chomsky.87 Willats argues
further that we do not simply put up with ill-formed icons, but sometimes relish the
way they bend the rules, just as we enjoy grammatical play in poetry.88
Willats gives numerous examples of such creative play in pictures. He shows, for
example, how Klee creates a range of meanings by flouting Huffman’s rule that a line
indicating a leading edge along one section of its course must not represent the fleeing
edge of the same surface along another.89 Willats also gives numerous examples of
the playful use of the anomaly known as ‘accidental’ or ‘false attachment’,90 which
describes what happens when lines belonging to objects located at different depths
are allowed to join up, or to run into one another, with the result that they appear
to lie in the same plane. Cézanne exploited false attachment to considerable effect
20 Paul Cézanne, Still Life with Commode, c. 1887–88. Oil on canvas, 62.2 × 78.7 cm. Cambridge, MA: Fogg Art Museum, Harvard University Art Museums (Bequest from the Collection of Maurice Wertheim, Class 1906). Photo: The Bridgeman Art Library.
in his Victor Chocquet Seated of 1877 (plate 18), where a number of lines converge upon
the sitter’s wrist, making it look as though the planes to which they correspond are
‘pinned’ together at this point.91 It needs emphasizing, however, that pictures of this
sort are not completely ungrammatical, as they can only perform their tricks if they
are grammatical for the most part.92
A more radical subversion of pictorial grammar employed by Cézanne and
the cubists is effected by the device known as ‘passage’, which describes their use
of marks interposed between or across occluding contours. Passage is particularly
evident in the area immediately to the right of the right edge of the pear in the lower
centre of Pot of Primroses and Fruit of c. 1888–90 (plate 19), where it elides the transition
between the base of the pot and the tabletop. Passage is significant because it is
unclear what kind of device it is. It might at a pinch be seen as a lexical constituent
of the painting that Cézanne used in order to register the vacillation edges undergo
when fixated;93 but it can just as easily be regarded as a ‘functional’ constituent of
the work that Cézanne used more synthetically to play down the breaks in depth
that contours normally create. If so, it only has meaning in so far
as it intervenes in the relationships between the picture’s lexical items, just as the
‘functional’ constituents of a sentence (conjunctions, prepositions, ‘determiners’ like
‘the’ and ‘a’, and ‘complementizers’, including auxiliaries and ‘modals’ that modify
verbs) only contribute to its sense by specifying relations between its lexical parts.
But although words that are ambiguous between lexical and functional constituents
are largely absent from language, this does nothing to threaten the idea that passage
is a form of creative ungrammaticality. It merely points to the fact that pictorial
grammar is different from its linguistic counterpart.
Cézanne also exploited higher-level grammatical anomalies by employing
more than one projection system in the same work. Willats argues that Cézanne
did this in his Still Life with Commode of 1887–88 (plate 20) by employing one form of
projection approximating to vertical oblique for the table at the front and another
approximating to horizontal oblique for the commode behind it. The two systems
are not massively anomalous, however, especially since Cézanne used neither strictly,
and may in fact have been using a disguised form of oblique projection for the
commode.94 So the picture is not just an ill-formed icon but a work that ‘warps’ space
for expressive effect.95
Adults’ pictures can couple projection and denotation systems anomalously, just
as children’s do. Willats shows, for example, that van Gogh resorted to combinations
of this kind when he used vanishing-point perspective in conjunction with regions.96
Arguably, however, Cézanne used more complex and ambiguous combinations,
employing forms of parallel projection that render reasonably convincing views
while preserving aspects of the objective character of shapes, and employing
primitives in the form of areas of contrasting tone and colour that serve equally well
as contours denoting edges, or the boundaries of regions denoting three-dimensional
shapes. In Still Life with Commode, at any rate, loosely defined parallel projection systems
appear hand-in-hand with ambiguous contrasts of this sort, creating an expressive
ambiguity about whether the scene is a view or a more objective depiction.97
The expressive potential of parallel drawing systems is not analysed in great
detail by Willats, but it is implied by his equivocation over their nature, which
he characterizes as view-based in some places and as object-based in others. It is
apparent, in any case, that such systems are inherently ambivalent. For one thing, a
drawing in parallel projection produced from memory is, by transforming an object-
centred description, normally plausible as a view as long as it observes the rules of
occlusion. For another, parallel projection preserves the to-and-fro that occurs within
perceptual acts, between the object-centred description and views, or how we both
infer object-centred descriptions from views and use them to fill out our sense of
the objects that we view. It would therefore seem that it is their very ambivalence
that makes parallel projection systems expressive. At any event, Cézanne seems to have
exploited their dual nature to such ends, capitalizing on this duality’s capacity to
produce a sense of shape in something approximating to its plenitude.
The coherence and usefulness of the idea of pictorial grammaticality
notwithstanding, it has to be admitted that language possesses a far richer variety
of constituents than pictures. In the first place, its ‘lexical’ constituents fall into
a number of classes and can perform a wide variety of roles, most of which are
impossible for the narrow range of constituents to be found in pictures.98 Language
also possesses multifarious functional constituents, which can generate complex
syntactic relationships between its lexical parts, whereas nothing of the sort applies to
pictures. Wollheim is therefore on firm ground when he claims that pictures cannot
be segmented ‘into parts that can be categorised … according to the contribution
they make to the meaning of the whole’,99 or that we have no need to classify the
‘basic units’ of pictures into ‘general categories’ equivalent to ‘noun, verb, adverb’
etc. But such objections do nothing to undermine the idea of pictorial grammar
per se since parsing a picture only ever need involve distinguishing a few kinds of
primitives. And even these would seem to have some lexical diversity, and perhaps also
at least a suggestion of functional capacity.
4. Innateness
A key assumption in Willats’s account, which sharply differentiates it from
conventionalist alternatives, is that pictorial competence is innately grounded in
the same way as language according to Chomsky.100 Chomsky argues that children
develop the ability to understand the grammar of their own language because they
are born with ‘universal grammar’, or the basic generative principles underlying the
possibility of all actual grammatical structures. Hence, simply through exposure
to their native language, they can parse and understand what they hear, and
begin to generate grammatical sentences.101 Put another way, Chomsky argues
that children learn their own language spontaneously, by acquiring ‘I-language’,
or the ‘internalized language’ consisting of the rules governing all of its possible
sentences.102 In short, then, it is an innate linguistic competence that grounds
children’s ability to make ‘infinite use of finite means’.103 And, for Willats, the same is
true of their ability to draw.104
By way of evidence, Willats shows that a child who has learned to use a particular
drawing system to represent one kind of object already has a more general command
of the system, and so can use it inventively – even when it is enclosure – to depict
an enormously wide range of shapes.105 He also points out that children come
to command more and more complex rules of depiction as they mature,106 just
as they deploy increasingly complex grammatical rules in their speech. This is
implicitly because an increasingly sophisticated command of grammar manifests
itself spontaneously in their brains as these develop. This conclusion is supported
by the fact that children’s drawing competence develops in a series of stages whose
sequence is normative. It appears, at any event, that they normally begin to draw
using regions within enclosure to represent volumes, then move on to using more
refined regions, typically lines, to represent cylindrical volumes such as limbs, and
afterwards adopt the use of contours to denote edges within an orthogonal projection
system. Moreover, it is only once having mastered this system that children learn
to command rudimentary parallel projection systems, and eventually learn to
use oblique projection proper when they reach artistic maturity. By corollary,
Willats seems to imply that children will not learn vanishing-point perspective
spontaneously, but must be taught its rules,107 which would explain why it appears
so late in ontogenetic development, when it does at all. The point to emphasize is
that this sequence is apparent in many cultures.108 Willats shows, for example, that
Western children’s drawing of the human figure almost always progresses from
‘tadpole’ figures to ‘stick’ figures, and then to figures drawn in contours (plate 21).
And he even demonstrates that many of the drawings that researchers elicited from
children of the Jiri valley in Papua New Guinea, who were wholly unused to the
practice of drawing, exhibit a similar morphology (plate 22).
21 Drawings of a man made by Western children of successively greater ages, Florence L. Goodenough Collection, Penn State University Archives, Pennsylvania State University Libraries. Photo: With permission. Figure 13.7 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997.
22 Drawings by children from the Jiri valley in Papua New Guinea, adapted from figure 13.7 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press. Originally from figures 2 and 3 from Margaret Martlew and Kevin J. Connolly, ‘Human figure drawings by schooled and unschooled children in Papua New Guinea’, Child Development, 67: 6, December 1996, 2743–62. Photo: By permission of John Wiley & Sons.
23 A drawing of a scene in oblique projection that includes the transparency error, from John Willats, ‘How children learn to represent three-dimensional space in drawings’, in G. Butterworth, ed., The Child’s Representation of the World, London, 1977, 189–202. Photo: With kind permission of Springer Science and Business Media.
The fact that children’s ability to understand speech and drawings greatly
outstrips their ability to produce them may, by contrast, seem to undermine the
argument that innate grammar underlies both kinds of competence. It simply goes to
show, however, that producing output in either form is complicated by the need to
co-ordinate grammatical competence with performative, and predominantly motor,
skills.109 Indeed, Harrison has argued that the capacity of most people to recognize
pictures of far greater sophistication than they can produce can only be explained by
the existence of a ‘generative grammar’ of the pictorial that we all share.110
The fact that children characteristically make errors in both their speech
and their drawings could also seem to contradict the idea of innate grammatical
competence. But as far as speech goes, it is clear that many of their errors are
grammatical in the sense that they occur when a child applies a rule that works
perfectly well for one situation to another for which it is inappropriate, for instance
when they add ‘ed’ to the root of an irregular verb to turn it into the past tense.111
Willats argues in a similar vein that some of the errors in children’s drawings can
be attributed to misapplications of grammatical rules.112 An example might be the
‘transparency error’, which children commit when they draw the line denoting
the edge of a solid object which is occluded – as happens in plate 23, where the line
denoting the top edge of the table top is visible through the box in front.113 Such
errors can at any rate be seen as the result of a failure on the part of the child to grasp
how the particular task in hand demands a modification to the rules for drawing
shapes she had previously acquired. What is more, the fact that children often learn
to correct such mistakes themselves without having to be taught the error of their
ways suggests that they possess an innate sense of what can count as grammatical
drawing, just as they can work out what constitutes grammatical speech.114 And
indeed, in this drawing, the child spontaneously added shading to obscure the
illegitimate contour.
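The speech half of this comparison can be made concrete with a small sketch. It is offered only as an illustrative toy, not as a model taken from Chomsky or Willats: the three-word lexicon and the function names are assumptions made for the example. It shows how an error such as ‘goed’ follows from applying a perfectly good rule where an exception ought to block it.

```python
# Hypothetical mini-lexicon of irregular past tenses (illustrative only).
IRREGULAR_PAST = {"go": "went", "hold": "held", "break": "broke"}


def regular_rule(verb):
    """The general rule: add 'ed' to the root (just 'd' after a final 'e')."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def child_past(verb):
    """A speaker who has the rule but not yet the exceptions applies it everywhere."""
    return regular_rule(verb)


def mature_past(verb):
    """A speaker who has both checks for an exception before applying the rule."""
    return IRREGULAR_PAST.get(verb, regular_rule(verb))
```

On this sketch the two speakers agree on every regular verb and differ only where an exception exists, which is the sense in which such errors are rule-governed rather than random.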
Even the almost complete immunity children exhibit towards adults’ attempts
to correct their errors can be attributed to their developing command of grammar.
They will, in other words, normally only elect to use the conventional norms when
their competence has reached the point where they are ready to do so.115 The broader
significance of the argument that competence is innately grounded is that it makes
sense of facts that naturalists like Schier acknowledge but cannot explain. So, for
example, Schier acknowledges that our ability to recognize a wide range of pictures
is ‘generative’, in the sense of productive, and that our capacity to produce a wide
variety of icons exhibits ‘generativity’ as well,116 rather as Wollheim seems to,117 but
denies that these competences are grounded in any innate grammar. This means that
his account offers no explanation of what makes generativity possible. But Willats’s
argument that we are born with the capacity to map shapes and their relations fills
the explanatory gap perfectly.
5. Transformation
It is apparent from the general drift of Willats’s account that his conception of the
mapping rules implemented in pictures is indebted to Chomsky’s thinking. His use
of the word ‘transformations’118 to describe the forms in which pictures render their
‘deep’ perceptual content also makes a clear allusion to Chomsky’s early idea that
a sentence transforms a deep grammatical structure into a surface structure. To all
intents and purposes, therefore, Willats would seem to regard pictorial grammar as
analogous to the universal grammar that Chomsky envisaged in the form of a set of
algorithms determining the ‘transformations’ speech makes out of its underlying
contents. Admittedly, this conception of grammar
begs some intractable questions about what exactly
the substrate to speech consists of.119 But if it is
reasonable to assume that language maps some pre-articulate
linguistic equivalent of thought, or even
thought itself, into a public form,120 then by analogy
it seems reasonable to assume that pictures transform
shape percepts into a form that makes them publicly
available.121 This idea – or its principle – finds support
in the fact that computer programs capable of mapping
shape descriptions into visual form are readily
written.122
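A minimal sketch of what such a program might look like is given here, purely by way of illustration. It is not taken from Willats or from the programs cited: the cube, the function names, and the 45-degree angle and half-scale used for the oblique system are all assumptions made for the example. It maps a three-dimensional shape description onto the picture plane under two of the drawing systems discussed above.

```python
import math

# Hypothetical shape description: the eight corners of a unit cube as (x, y, z).
CUBE = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]


def orthogonal(point):
    """Orthogonal projection: keep width and height, discard depth."""
    x, y, _z = point
    return (x, y)


def oblique(point, angle_deg=45.0, scale=0.5):
    """Oblique projection: shift each point along a slanted direction
    in proportion to its depth, so that receding edges become visible."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x + scale * z * math.cos(a), y + scale * z * math.sin(a))


def project(shape, rule):
    """Apply one mapping rule to every point of a shape description."""
    return [rule(p) for p in shape]
```

Under the orthogonal rule the front and back corners of the cube coincide, so only one face is shown; under the oblique rule all eight corners fall on distinct points of the picture surface. The same scene description thus yields different pictures according to which rule is applied.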
A more specific similarity between the
transformational work done by pictures and language
is that both employ forms of ‘deletion’ when they
convert their deep content into surface output.
Hence, the surface structure of the sentence, ‘A
wise man is honest’, is more concise than the deep
structure it corresponds to (plate 24), since, typically, it
eliminates syntactic constituents that its counterpart
makes explicit.123 Notwithstanding, deletion of this
kind has no serious deleterious effect on the sense
of most sentences. Something similar, although
not identical, is true of deletion in line drawings,
which only inform us about the edges of objects
and their relations at the cost of excluding semantic
information about the shape of objects provided
by shading. They can, as a result, be a little unclear
about the contiguity of, or distance between, objects,
compared with fully elaborated drawings employing
cast shadows.124 But most of the time, line drawings
provide perfectly good working representations of
objects and scenes, perhaps because they map what
Marr calls the ‘2½D sketch’, or the skeletal preview of
a scene that we construct for the purposes of grasping
its main features prior to forming it in detail.125 In
short, deletion allows both forms of output to be more
succinct than the underlying contents they transform,
enabling us to articulate meaning efficiently. And,
although he opposes the idea of pictorial grammar
altogether, Wollheim nevertheless seems to confirm
this conclusion since he subscribes to the idea that
deletion of a kind described by Chomsky operates
in pictures, arguing more specifically that it plays a
role within ‘thematization’, or the process whereby
the artist gives salience to particular features of the
eventual painting at the expense of features that were
significant in its earlier stages.126
24 Tree diagram of the deep structure of the sentence, ‘A wise man is honest’, from Noam Chomsky, ‘Linguistic contribution: present’ [1967], in Language and Mind, Cambridge, 2006 [1968], 26. Photo: © Cambridge University Press.
25 Example of stimulus objects in an experiment on the perception of degraded objects, figure 16 from Irving Biederman, ‘Recognition-by-components: A theory of human image understanding’, Psychological Review, 94: 2, 1987, 115–47. © 1987, American Psychological Association. Photo: Reprinted with permission.
The kinds of deletion at work in language and pictures are nevertheless different
inasmuch as it is syntactical elements that are most readily abbreviated in the one and semantic
elements in the other. Hence, while speech from which syntactic information has
been deleted is normally clear, removing the same kind of information from line
drawings can have a catastrophic effect. Biederman has produced drawings, for
example, which show that while it is perfectly possible to remove a segment of a
continuous line without making it particularly difficult to grasp the shape it implies,
eliminating a vertex or cusp with the function of specifying the relationships
between the adjoining edges of the same object can sometimes make it almost
impossible to decipher its shape (plate 25).127 Hence, and although they are meant
principally to illustrate our perception of objects, his drawings demonstrate that
pictures of mundane objects and not just faces must possess a ‘minimal syntax’.
Transformational rules are also among the things that allow language the capacity
for ‘movement’,128 which means that it is possible to change the sense of some
sentences by altering the order of their constituent phrases and words. Movement
thus allows us to turn a statement into a question or a modal expression such as a
surmise or speculation with ease, whereas pictures can do nothing of the kind.129 So
too, language can embed phrases within a sentence ‘recursively’ – most obviously
perhaps in The House that Jack Built, where ‘that’ is used in each iteration to embed
another phrase within the sentence enunciated – while nothing strictly analogous is
possible in pictures, as Wollheim rightly points out (pictures within pictures being
most closely analogous to parentheses). But neither of these limitations implies that
pictures have no grammar at all. Rather, and once more, they merely show that
pictures use a relatively simple grammar appropriate to their particular content.
Nature vs Nurture
Any language must be at least potentially accessible to its users, since a logically ‘private
language’ is not a language at all.130 One way of explaining how all languages are
capable of being understood in principle – provided their vocabulary is learned – is
that they transform the same underlying content according to rules derived from an
innate, universal grammar. Language is thus in a very strong, and significant, sense
‘natural’ in the last instance, rather than purely or wholly ‘conventional’. Schier
evidently fails to grasp this argument. But Willats is almost equally obtuse when he
argues that language is ‘conventional’ unlike drawing.131 It is, of course, true that
words are unlike picture primitives inasmuch as the form of any primitive will map
some of the properties of what it stands for. Hence a line which is darker than the
figure it encloses will map a contour more effectively than one that is lighter.132 It
is also the case that while most sentences do not mirror the structure of the events
they refer to, any regular drawing system will preserve at least some of the spatial
relationships obtaining in the scene it represents. However, neither of these differences
means that language is wholly conventional while depiction is wholly natural. Rather,
the difference between language and depiction is less substantive, and narrower.
As regards the role of convention in language, and setting aside the issue of
the lexicon, the most obvious way in which convention determines the surface
structures of any particular language is that it sets its many parameters arbitrarily.
Convention can nevertheless only set parameters that universal grammar makes
available, and it must set them according to the small set of options it allows. This
means that many of the widely divergent forms of different languages can be
attributed to the different ways in which they happen to set the same parameters.
The word orders of English and Japanese, for instance, are grossly different,133 but
the structures of both are consistent with the possibilities sanctioned by universal
grammar. And, by corollary, only very few languages have the form OVS, while OSV
is almost non-existent, which suggests that universal grammar makes them very
difficult, or nigh-on impossible, to generate.
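The idea of a parameter can be illustrated with a deliberately simplified sketch. It is a toy under stated assumptions, not Chomsky’s formalism: the single binary setting, the function names, and the decision to fix the subject in first position are all choices made for the example. A single switch governing whether the verb precedes or follows its object yields the English and the Japanese orders, and nothing else.

```python
def verb_phrase(verb, obj, head_initial):
    """Order the verb (the head) before or after its object,
    according to one binary parameter."""
    return [verb, obj] if head_initial else [obj, verb]


def clause(subject, verb, obj, head_initial):
    """Build a clause with the subject first (an assumption of this toy),
    followed by the verb phrase."""
    return [subject] + verb_phrase(verb, obj, head_initial)


# The only two orders this toy grammar can generate.
ENGLISH_LIKE = clause("S", "V", "O", head_initial=True)
JAPANESE_LIKE = clause("S", "V", "O", head_initial=False)
```

However the switch is set, the toy cannot produce an object-first order such as OVS or OSV, which gives a rough sense of how a small set of options might make some surface forms easy to generate and others difficult or impossible.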
By analogy, or just as the widely different ‘particular grammar(s)’ of all
languages are generated by the same innate rules in the last instance,134 so the
diverse surface pictorial ‘grammars’ sanctioned by the various academies and
different cultures all express rules consistent with universal pictorial grammar.
This suggests that convention gets into depiction by a more elliptical route than we
commonly imagine, by deciding which of the rules made available by universal
pictorial grammar will be used in practice. So, for example, while a particular
culture’s preference for certain forms of drawing and denotation systems is a matter
of convention,135 all drawing systems are still ultimately generated by universal
pictorial grammar. By the same token, if it is convention that sanctions the use of
widely different kinds of mark among different cultures and groups for establishing
the same picture primitives, it is universal pictorial grammar that permits a line
to stand for an edge or a luminous spot to refer to a point in space. Moreover,
although it may be convention that decides what counts as an acceptable degree of
grammaticality in pictures, it is universal pictorial grammar that makes it possible to
speak of pictorial grammaticality at all.136
Language and depiction are therefore similar in the respect that convention
shapes the output they generate on the basis of innate universal grammar. But
they are different inasmuch as pictorial form is closely constrained by its innate
grammar to conform to a limited set of norms whereas language can assume a wide
multiplicity of particular, and extremely complex, grammatical forms. Put more
prosaically, the effect of convention on pictures is small beer compared to its effect
upon speech. So it is that the vast majority of pictures are transparent to people of all
cultures, whereas the different languages are relatively opaque to non-native speakers
irrespective of the question of unfamiliar vocabulary.
Conclusions
It is fair to say then, that one major achievement of Willats’s account of representation,
suitably modified, is that by furnishing a conception of pictorial grammar as
something to which nature and convention both contribute it can relieve the profound
antinomy between pictorial syntax and iconicity that dogged earlier debates about
representation. But perhaps the most general conclusion to be drawn from Willats’s
work is that grammatical structure is central to how, and what, pictures articulate.
This basic idea has many possibilities. One is that a more developed theory of
pictorial grammar might allow us to specify more closely just what pictures can, and
cannot, achieve by pictorial means alone, apart from brute depiction. For example, it
still needs to be decided whether a picture can represent a particular object, or kind
of object, by showing a cluster of visible properties alone, or whether it can only do
so by recruiting concepts from language. By corollary, the same sort of analysis might
make it clearer how pictures are dependent on the services of concepts when they
show particular aspects of things,137 and on more complex alien structures imported
from language when they make statements,138 or tell stories.139
Closer attention to the constraints imposed by grammatical structure might
also produce a different, or at any rate a more complete, understanding of why
some pictorial forms evolved as they did than is presently available. So, for instance,
although the composite forms used to articulate the human figure in antique Greek
vase painting are readily described as conventional schemata, it might enrich our
understanding of them to appreciate how they are determined by the orthogonal
projection system demanded by curved, rotatable, surfaces that do not readily tolerate
more sophisticated drawing systems dependent on a fixed viewpoint.
Another possibility of a theory of pictorial grammar is that it might reveal how
mapping rules apply to colour, as well as to drawing, in painting. More specifically,
it might be capable of expressing what kind of regularities must obtain between
colours in a representational painting if they are to stand successfully for colours
in the world, which they cannot do punctually because chromatic effects like
contrast,140 and chromostereopsis, operate differently on a flat surface from how
they operate in depth.
Willats’s radical idea that the grammar of projection makes all pictorial space sui
generis might be extended, as well, into an analysis of the space produced by artists’
individual styles of drawing, particularly if combined with his idea of how artists
disguise their drawing systems and play with the rules of concatenation. An analysis
of this sort might reveal, for example, how Cézanne’s peculiarly elastic space lends
itself to the expression of our tendency inside acts of seeing to probe the world
before us for the meanings it has for our potentially grasping physical hands and
mobile body.
Willats’s more specific ideas about the more basic semantic constituents of
pictures have a similar potential inasmuch as they might explain how an artist’s
individual style is closely dependent on such things as the particular manner in
which it articulates pictorial features like edges. More particularly, the notion taken
from Frédo Durand that pictorial marks bestow ‘attributes’, including colour, tone,
transparency, texture, thickness, ‘wiggling’ and orientation upon lines and other
picture primitives, might start to explain why some artists’ way of using marks gives
the objects and spatial relations in their pictures a characteristic look of their own.141
In sum, Willats’s account of pictorial grammar is pregnant with possibilities for
specifying the historical and aesthetic particularity of pictures and for explaining
how their different forms came about, because it offers a way of analysing pictorial
structure more precisely than previously, and at the same time vividly demonstrates
just how, and to what extent, this structure is responsible for the meanings and effects
that pictures produce.
Notes
This article owes much to the writings of and conversation with
several colleagues and friends, especially John Willats, Jason
Gaiger, Andrew Harrison, Michael Podro, and Richard Wollheim.
I am also indebted to the J. Paul Getty Trust for a scholar grant
which allowed me to pursue the research it presents.
1 On ‘capacities’, see Noam Chomsky, Rules and Representations, Oxford,
1980, 4–5.
2 Throughout, ‘picture’ is used in the strict sense to mean
‘representational picture’.
3 Richard Wollheim, ‘Reflections on Art and Illusion’, in On Art and the Mind,
London, 1973, 261–89, esp. 266–7.
4 Noam Chomsky, Syntactic Structures, The Hague, 1957, and Aspects of the
Theory of Syntax, Cambridge, MA, 1965.
5 For an outstandingly lucid account of Chomsky’s thinking organized
according to its various phases, see Vivian Cook and Mark Newson,
Chomsky’s Universal Grammar: An Introduction, third edn, Oxford, 2007. For a
perceptive account of his theories from a broader perspective, see Neil
Smith, Chomsky: Ideas and Ideals, second edn, Cambridge, 2004.
6 We can do this by applying a small number of overarching ‘principles’
governing the broad possibilities of sentence and phrase construction,
and a series of ‘parameters’ governing word and phrase order, which
together constitute a relatively simple hierarchical system capable of
generating all the specific rules required. See Smith, Chomsky, 60–94.
7 John Willats, Art and Representation: New Principles in the Analysis of Pictures,
Princeton, NJ, 1997, xii.
8 John Willats, Art and Representation, 279. For a review of the literature,
see W. F. Miller and A. C. Shaw, ‘Linguistic methods in picture
processing: A survey’, AFIPS Joint Computer Conferences: Proceedings of the
December 9–11, 1968 (fall joint computer conference), part 1, 279–90.
9 John Willats, Making Sense of Children’s Drawings, Mahwah, NJ, 2005, ix–x.
10 Willats, Art and Representation, xii and 19–20, and Children’s Drawings, x.
11 See David Marr, Vision: A Computational Investigation into Human Representation
and Processing of Visual Information, New York, 1982, 28–9 and 356–7, and
Noam Chomsky, Aspects of the Theory of Syntax, 14.
12 Criticisms of Chomsky’s early work often misconstrued what he
said, particularly about ‘deep structure’. See ‘Deep Structure’ [1975],
reprinted in Noam Chomsky, On Language, New York, 2007, esp. 171–2.
For an over-zealous attack on Chomsky’s method, see Robert D. Levine
and Paul M. Postal, ‘A corrupted linguistics’, in Peter Collier and David
Horowitz, eds, The Anti-Chomsky Reader, San Francisco, CA, 2004.
13 Yve-Alain Bois, ‘The semiology of cubism’, in Lynn Zelevansky, ed.,
Picasso and Braque: A Symposium, New York, 1992, 169–221. Wollheim also
refers to Rosalind Krauss, ‘The motivation of the sign’, in Zelevansky,
Picasso and Braque, 261–305.
14 Richard Wollheim, Formalism and its Kinds, Fundació Antoni Tàpies,
Barcelona, 1995.
15 Bois, ‘The semiology of cubism’, 173. For an argument in favour of the
existence of the ‘conventional sign’ in painting almost contemporary
to cubism, see Teodor de Wyzewa, ‘Wagnerian Painting’ [1897], trans.
Paul Smith, in Charles Harrison and Paul Wood, eds, Art in Theory: An
Anthology of Changing Ideas, Oxford, 1992, 18.
16 Bois, ‘The semiology of cubism’, 173–4. By contrast Chomsky
argues that language has ‘infinite’ productive capacity, implying that
phonocentrism can only impose practical limits on what it can say.
See Chomsky, Theory of Syntax, 8. Cf. Steven Pinker, The Language Instinct:
The New Science of Language and Mind, London, 1995, 84.
17 Bois, ‘The semiology of cubism’, 174. Cf. Krauss, ‘The motivation of
the sign’, 262–3, and Rosalind Krauss, ‘In the Name of Picasso’ [1981],
in The Originality of the Avant-Garde and Other Modernist Myths, Cambridge,
MA, 1996, 34–7.
18 Cf. Andrew Harrison, ‘A minimal syntax for the pictorial: The
pictorial and the linguistic – analogies and disanalogies’, in Salim
Kemal and Ivan Gaskell, eds, The Language of Art History, Cambridge,
1991, 213, which argues that any account of linguistic or pictorial
meaning that takes no account of reference is ‘absurd’.
19 Ludwig Wittgenstein, On Certainty, Oxford, 1969, §160, and Anthony
Grayling, Wittgenstein, Oxford, 1996, 95–6.
20 Wollheim also attacks Krauss here. See Wollheim, Formalism and its Kinds,
28 and 38, note 12. Cf. the argument that cubism is not about ‘the
nature of the sign’ in Flint Schier, ‘Painting after art? Comments on
Wollheim’, in Norman Bryson, Michael Ann Holly, and Keith Moxey,
eds, Visual Theory: Painting and Interpretation, Cambridge, 1991, 154.
21 Richard Wollheim, Painting as an Art, London, 1987, 43–100. For a
succinct analysis of Wollheim’s position, see Harrison, ‘Minimal
syntax’, 220–4.
22 Wollheim, Formalism and its Kinds, 26–7. All subsequent quotations
from Wollheim are from this section of text. The essay is reprinted,
in an abbreviated and modified form, as ‘On formalism and pictorial
organisation’, Journal of Aesthetics and Art Criticism, 59: 2, Spring 2001,
127–37.
23 Cf. Flint Schier, Deeper into Pictures: An Essay on Pictorial Representation,
Cambridge, 1986, 150, which paraphrases Roger Scruton’s criticism
of Barthes’ semiotics for its inability to adduce a ‘grammatical
rule’ connecting the meaning of a system’s ‘significant parts’ to the
meaning of the ‘ensemble’.
24 Wollheim, Formalism and its Kinds, 1995, 37, note 10.
25 Schier, Deeper into Pictures, 67–8.
26 Schier, Deeper into Pictures, 67.
27 Schier, Deeper into Pictures, 65–6.
28 Schier, Deeper into Pictures, 66.
29 Schier, Deeper into Pictures, 66–7.
30 Chomsky, Theory of Syntax, vi. Cf. Max Clowes, ‘On seeing things’,
Artificial Intelligence, 2: 1, Spring 1971, 79–116, esp. 80.
31 Harrison, ‘Minimal syntax’, 216–17 and 225–7. For the view that
pictures cannot map appearances constituted by Gestalt and constancy
effects (inter al.), see E. H. Gombrich in ‘Mirror and Map’ [1975], in The
Image and the Eye, Oxford, 1982, 172–214.
32 Cited in Noam Chomsky, ‘On Cognitive Capacity’ [1975], reprinted in
On Language, 3–35, esp. 8.
33 Michael Podro, ‘Formal elements and theories of modern art’, British
Journal of Aesthetics, 6: 4, 1966, 329–38, esp. 335.
34 Richard Wollheim, ‘Form, elements, modernity: Reply to Michael
Podro’, The British Journal of Aesthetics, 6: 4, 1966, 339–45, esp. 344.
35 Richard Wollheim, ‘Giovanni Morelli and the origins of scientific
connoisseurship’, On Art and the Mind: Essays and Lectures, London, 1973,
200–1.
36 Goodman, Languages of Art: An Approach to a Theory of Symbols, Indianapolis,
IN, 1976, 133.
37 Goodman, Languages of Art, 131–41.
38 Goodman, Languages of Art, 225–32.
39 Curtis Carter, ‘Painting and language: A pictorial syntax of shapes’,
Leonardo, 9: 2, Spring 1976, 111–18, esp. 112–15.
40 Carter, ‘Painting and language’, 114–16.
41 Clowes, ‘On seeing things’, 84–7, and Willats, Art and Representation, 94
(on Guzman).
42 Willats, Art and Representation, 4 and 93–100.
43 Willats, Art and Representation, 8–9, 98–100, and 220, and Children’s
Drawings, 11 and 63–8.
44 The rule was first applied in Chomsky, Syntactic Structures, 26. Cf. Cook
and Newson, Chomsky’s Universal Grammar, 28–32. See Clowes, ‘On
seeing things’, 80–1 for the analogy with pictures.
45 On this, technical, sense of ‘generative’ (as synonymous with
‘explicit’) see Cook and Newson, Chomsky’s Universal Grammar, 35–6,
which prohibits its use as a synonym for ‘productive’. For a more
liberal view, see Neil Smith, ‘Chomsky’s science of language’, in James
McGilvray, ed., The Cambridge Companion to Chomsky, Cambridge, 2005,
21–41 and 295–6, esp. 296, note 4.
46 Willats, Children’s Drawings, 77 and 88.
47 Willats, Children’s Drawings, 82–97, and Art and Representation, 309–10, 356,
and 370.
48 Pinker, The Language Instinct, 159–61.
49 Irving Biederman, ‘Visual object recognition’, in Stephen Kosslyn
and Daniel Osherson, eds, Visual Cognition, vol. 2, 121–65, esp. 131 (on
‘natural parsing region[s]’ in the human figure) and 139 (on the geon
and parsing). Cf. Donald Hoffman and Manish Singh, ‘Vision: Form
perception’, in Lynn Nadel, ed., Encyclopedia of Cognitive Science, vol. 4,
London, 2003, 486–90.
50 Clara Casco and Daniela Guzzon, ‘The aesthetic experience of
“contour binding”’, Spatial Vision, 21: 3–5, 2008, 291–314.
51 John Willats, ‘The third domain: The role of pictorial images in
picture perception and production’, Axiomathes, 13: 1, March 2002,
1–15, esp. 11–12. Cf. Willats, Children’s Drawings, 178–9.
52 In his early television programmes, Rolf Harris used to draw in a
sequence of this kind with the intention of withholding the meaning
of the picture until it was almost completed.
53 David Huffman, ‘Impossible objects as nonsense sentences’, in
Bernard Meltzer and Donald Michie, eds, Machine Intelligence, vol. 6,
Edinburgh, 1971, 295–323.
54 Huffman, ‘Impossible objects’, 323 (where implicit reference is made
to Chomsky), and Clowes, ‘On seeing things’, 79–80. Cf. Willats,
Art and Representation, 29, 175, 272, 279 (citing Clowes), and 282, and
Children’s Drawings, 199.
55 Chomsky, Aspects of the Theory of Syntax, 148–9.
56 Huffman, ‘Impossible objects’, 301–4, and Clowes, ‘On seeing things’,
86–7. Cf. Willats, Art and Representation, 114–15, and Biederman, ‘Visual
object recognition’, 127–9.
57 Huffman, ‘Impossible objects’, 305, and Clowes, ‘On seeing things’,
79–80 and 87–91. Cf. Willats, Art and Representation, 118.
58 Clowes, ‘On seeing things’, 89, and Huffman, ‘Impossible objects’,
306–13.
59 Huffman, ‘Impossible objects’, 304–11, and Clowes, ‘On seeing
things’, 104–6. Cf. Willats, Art and Representation, 113.
60 On the analogy between drawings and maps, see Fred Dubery and
John Willats, Drawing Systems, London, 1972, 9, and Willats, Art and
Representation, 70–6.
61 Willats, Art and Representation, 10–13 and 369.
62 Willats, Art and Representation, 70–84, and Children’s Drawings, 68–70.
63 Willats, Children’s Drawings, ix–x.
64 Willats, Children’s Drawings, 77 and 88. The analogy is implicit in Art and
Representation, but is suggested in a citation from Clowes, ‘On seeing
things’, 273.
65 Smith, Chomsky, 79. Cf. Andrew Carnie, Syntax: A Generative Introduction,
second edn, Oxford, 2007, 19 and 23. Cf. Pinker, The Language Instinct,
234, which includes OVS among the rare forms.
66 Noam Chomsky, The Minimalist Programme, Cambridge, MA, 1995, 35,
and Theory of Syntax, 12–14, and 83–5.
67 King Sun Fu, Syntactic Pattern Recognition, Englewood Cliffs, NJ, 1982, 1–7.
68 Willats, ‘The third domain’, 5. This radical view finds some support
in Michael Podro, Depiction, New Haven and London, 1998, 9, which
argues that our sense of the work a line performs will affect the course,
and the outcome, of our visual experience of what it represents.
69 Schier, Deeper into Pictures, 65.
70 Schier, Deeper into Pictures, 67. Cf. 170.
71 Harrison, ‘Minimal syntax’, 229–30.
72 Schier, Deeper into Pictures, 73–4.
73 Doris Tsao and Margaret Livingstone, ‘Mechanisms of face
perception’, Annual Review of Neuroscience, 31, July 2008, 411–37, esp.
418–20.
74 Ludwig Wittgenstein, Lectures and Conversations on Aesthetics,
Psychology and Religious Belief, ed. Cyril Barrett, Oxford, 1966, 10, §4.
75 Harrison, ‘Minimal syntax’. The essay’s title may allude to
Chomsky, Theory of Syntax, 3, which refers to ‘minimal syntactically
functioning units’.
76 Harrison, personal communication.
77 Maurice Merleau-Ponty, ‘Cézanne’s doubt’, in Galen Johnson,
ed., The Merleau-Ponty Aesthetics Reader: Philosophy and Painting, Evanston, IL,
1993, 59–75, this quotation 66. See also The Phenomenology of Perception,
London, 1962, 322.
78 Robert Solso, Cognition and the Visual Arts, Cambridge, MA, 1996,
147.
79 Alan Kennedy, The Psychology of Reading, London, 1984, 126–39. I
am grateful to Katherine Shingler for this reference.
80 See Gotthold Lessing, ‘Laocoon’, in Selected Prose Works of G. E.
Lessing, London, 1879, chs IV, XVI, and XVII.
81 Willats, Art and Representation, 341–6, and ‘The rules of
representation’, in Paul Smith and Caroline Wilde, eds, A Companion to
Art Theory, Oxford, 2002, 417–18.
82 Schier, Deeper into Pictures, 66, note 1.
83 Willats, Art and Representation, 20–1, 40–1, and 170–4. Marr,
Vision, 313–17.
84 See Vicki Bruce, Patrick Green and Mark Georgeson, Visual
Perception: Physiology, Psychology & Ecology, fourth edn, Hove, 2003,
276–89, for succinct accounts of Marr’s (and Nishihara’s) theory of
the role played by object-centred co-ordinates in object recognition,
and of Biederman’s alternative.
85 Willats, Children’s Drawings, 16, 104–8, 122–3, and 146, and Art and
Representation, 93, 177–8, 182–5, and 316–17.
86 Willats, Art and Representation, 165–7.
87 Willats, Art and Representation, 29 (citing Chomsky, Theory of Syntax,
148).
88 Willats, ‘Anomaly in the service of expression’, Art and
Representation, 248–67, and Children’s Drawings, 18.
89 Willats, Art and Representation, 267.
90 Clowes, ‘On seeing things’, 104. Cf. Willats, Art and Representation,
30 (on Guzman) and 357, note 1 (on Huffman).
91 Meyer Schapiro, Paul Cézanne, New York, 2004, 62.
92 Willats, Art and Representation, 364.
93 Paul Smith, Interpreting Cézanne, London, 1996, 46.
94 Willats, Art and Representation, 48 and 51, and Children’s Drawings,
197.
95 Cf. Merleau-Ponty, ‘Cézanne’s doubt’, 64, which offers some
dubious remarks about another ‘warped’ table in Cézanne, and
Huffman, ‘Impossible objects’, 312–13, on how ‘ungrammatical’
combinations of lines can lead to a ‘warped’ appearance.
96 Willats, Art and Representation, 154.
97 Willats, Art and Representation, 225.
98 This means that even grammatically simple sentences can
express abstract concepts like number, while pictures can only show
particulars, although this allows them to exemplify the sensuous
properties and relational properties of objects more fully than even
the most poetic language. See Nelson Goodman, ‘Art and inquiry’,
Proceedings and Addresses of the American Philosophical Association, vol. 41,
1967–1968, 5–19, esp. 12.
99 Richard Wollheim, ‘On pictorial representation’, in Rob van
Gerwen, ed., Richard Wollheim on the Art of Painting, Cambridge, 2001, 14.
Cf. Alex Potts, ‘Sign’, in Robert Nelson and Richard Shiff, eds, Critical
Terms for Art History, Chicago, IL, 1996, 22.
100 Willats, Children’s Drawings, 13, 15, 221.
101 Chomsky, Rules and Representations, 4.
102 See Smith, Chomsky, 28–32, and Cook and Newson, Chomsky’s
Universal Grammar, 13–19.
103 Chomsky, Theory of Syntax, v.
104 Willats, Children’s Drawings, 8, 78, and 170.
105 Willats, Children’s Drawings, 77–8.
106 Willats, Children’s Drawings, 19, 40, 233.
107 Willats, Children’s Drawings, 147, 171, 177, 216, and 235. Children, such
as Stephen Wiltshire, who draw in perspective spontaneously are
exceptional. See Oliver Sacks, An Anthropologist on Mars, New York, 1995,
185–7.
108 Willats, Children’s Drawings, 159, and Art and Representation, 289 and 311–15.
109 Cf. Willats, Art and Representation, 165, and Children’s Drawings, 37–9, 60–1,
and 76–7.
110 Harrison, ‘Minimal syntax’, 229.
111 Cf. Willats, Art and Representation, 175.
112 Willats, Children’s Drawings, 8–9.
113 Willats, Children’s Drawings, 179–80.
114 Willats, Children’s Drawings, 170, 181, 201, and 229.
115 Willats, Children’s Drawings, 172. Cf. Carnie, Syntax, 21.
116 Schier, Deeper into Pictures, 55, 66, note 1, and 151. Cf. Harrison,
‘Minimal syntax’, 221, which argues that ‘Wollheim clearly does
think that the pictorial is in some central sense “generative” despite
his opposition to “semiotic” theories of it’.
117 Harrison, ‘Minimal syntax’, 221.
118 Willats, Art and Representation, 171–2 and 289.
119 See Chomsky, Theory of Syntax, 17–18 and 63–4, and The Minimalist
Programme, 223. Cf. Smith, Chomsky, 43–4.
120 Cf. Ludwig Wittgenstein, Philosophical Investigations, Oxford, 1953, §§319,
320, 329, and 335, and Pinker, The Language Instinct, 55–82.
121 Cf. Wittgenstein, Philosophical Investigations, 198e.
122 See Willats, Art and Representation, 171–2.
123 On the condition of deletion, see Chomsky, Theory of Syntax, 179–84,
and Language and Mind, 50–2. Cf. Cook and Newson, Chomsky’s Universal
Grammar, 280.
124 Willats, Art and Representation, 138–41.
125 Willats, Art and Representation, 112 and 152–3. Marr, Vision, 268–94.
126 Wollheim, Painting as an Art, 23 and 359, note 16.
127 Irving Biederman, ‘Recognition-by-components: A theory of human
image understanding’, Psychological Review, 94: 2, 1987, 115–47.
128 Smith, Chomsky, 60–8, and Cook and Newson, Chomsky’s Universal
Grammar, 20–3 and 32–5.
129 Harrison, ‘Minimal syntax’, 229.
130 See Marie McGinn, Wittgenstein and the Philosophical Investigations, London,
1997, 113–42.
131 Willats, Art and Representation, 146, 364, note 6, and Children’s Drawings,
13–14.
132 Willats, Art and Representation, 331–3.
133 Pinker, The Language Instinct, 203–4.
134 Chomsky, Theory of Syntax, 6.
135 Cf. Willats, Art and Representation, 34 and 353–4, which emphasizes the
function and purpose to which the different systems are put.
136 Willats, Art and Representation, 341.
137 See Victor Burgin, ‘Seeing sense’, in The End of Art Theory: Criticism and
Postmodernity, Basingstoke and London, 1986, 51–70, and Wittgenstein,
Philosophical Investigations, 193e–214e.
138 Schier, Deeper into Pictures, 120–5. Cf. Harrison, ‘Minimal syntax’, esp.
219. Cf. Wittgenstein, Philosophical Investigations, §244, and Hans Sluga,
‘“Whose house is that?”: Wittgenstein on the self’, in Hans Sluga and
David Stern, eds, The Cambridge Companion to Wittgenstein, Cambridge,
1996, 340.
139 Wollheim, Painting as an Art, 188–9.
140 Cf. Jean-Désiré Régnier, De la lumière et de la couleur chez les grands maîtres
anciens, Paris, 1865, esp. 31–8.
141 John Willats and Frédo Durand, ‘Defining pictorial style: Lessons
from linguistics and computer graphics’, Axiomathes, 15: 3, September
2005, 1–27, esp. 3–7, and 12–21.
Pictorial Grammar: Chomsky, John Willats, and the Rules of Representation
Paul Smith
This article argues that there is such a thing as pictorial
grammar, despite objections to the very idea by
philosophers. It proposes that John Willats developed
a theory of this grammar on the basis of Chomsky’s
early work, which demonstrates that pictures are
segmentable into semantic units, and that these
are organized by syntax into larger, grammatically
coherent structures. It also argues that pictorial
grammar, thus conceived, is innately grounded
and that it operates to transform an underlying
perceptual content into a surface form. The advantages
of this theory are that it allows us to specify how
pictures produce meanings both by obeying and by
transgressing grammaticality, and to reconfigure
our present understanding of what is conventional
or arbitrary, and what is natural and iconic, about
depiction. It concludes with a consideration of the
further possibilities of a theory of pictorial grammar.
Paul Smith is author of Seurat and the Avant-Garde (Yale, 1997), co-
editor of The Blackwell Companion to Art Theory (2002), and editor
of an annotated translation of Marius Roux’s novel, The Substance and the
Shadow (Penn State, 2007), whose central character is modelled on Cézanne.
He is also editor of the recently published volume, Seurat Re-Viewed (Penn
State, 2010), and is presently working on a phenomenological study of Cézanne’s
perspective and colour.