‘Pictorial Grammar: Chomsky, John Willats, and the Rules of Representation’, Art History, vol....

© Association of Art Historians 2011

Pictorial Grammar: Chomsky, John Willats, and the Rules of Representation

Paul Smith

Noam Chomsky’s work has had enormous impact, not only within linguistics,

but also upon the humanities, social sciences, and sciences more generally. It has

certainly been of especial importance for philosophy because it has revivified the

stalled debate between Rationalists and Empiricists as to whether our knowledge is

innate or acquired with its proposal that our capacity to understand and produce our

own language – and to learn others too – is grounded in the fact that we are born in

possession of the principles governing all possible grammatical sentences.1

The discipline of art history has remained almost completely impervious to

Chomsky’s work, however. Instead it has preferred of late to draw its understanding

of what is language-like about pictures from structuralism,2 or from critical revisions

of this body of theory that nevertheless preserve its foundational assumption that

both signifiers and their associated signifieds are conventional and arbitrary. Applied

uncritically to pictures, this would imply that their signs simply cannot be iconic, in

the Peircean sense of resembling what they stand for, which would raise the question

why so many depictions do indeed look convincing. The other main theory of

representation to draw analogies between pictures and language (under the broader

rubric of ‘the language of art’) that still enjoys some currency among art historians

is Gombrich’s. But this provides no more satisfying resolution to the matter of the

arbitrariness or iconicity of depiction than its rival. Instead – as Richard Wollheim

observed – it contradicts its own thesis that pictorial signs can be illusionistic (or

strongly iconic) by maintaining at the same time that they are always conventional,

in the sense of subject to arbitrary rules of interpretation.3 (Gombrich also regards

painting as a ‘code’ of sorts.) Given this impasse, not the least advantage of applying

Chomskian principles to the analysis of pictures is that it generates an account that

makes it possible to accommodate iconicity and conventionality together amenably.

In a career spanning forty years, the psychologist (and trained engineer and artist)

John Willats developed an account of precisely this kind, largely on the basis of the

‘standard theory’ Chomsky elaborated in his first two books.4 It is therefore relatively

unsophisticated compared with the more recent versions of Chomsky’s theory,5

which can specify the operations of linguistic grammar in minute detail in terms

that embrace syntax and semantics (or structure and sense) simultaneously, and at

the same time explain how we can in practice implement grammar economically in

our speech.6 Willats’s account is none the less serviceable for pictures since it rests on

a conception of their operations that derives from those concepts most fundamental

to Chomsky’s account, and which have endured through the several reforms it has

Detail from Paul Cézanne, Victor Chocquet Seated, c. 1877 (plate 18)

DOI: 10.1111/j.1467-8365.2011.00835.x
Art History | ISSN 0141-6790 34 | 2 | June 2011 | pages XX-XX

undergone. It also turns out that pictures possess only a restricted set of semantic

features organized by a relatively small number of syntactical rules. So, rather as

Chomsky’s work could reconfigure our understanding of the role of nature and

nurture in language acquisition, Willats’s account is powerful enough to reconfigure

our understanding of what is natural and what is conventional about depiction. It is

capable, in other words, not only of transcending the limitations of semiotic theory,

but also those of philosophical theories (unfamiliar to most art historians) which

regard pictures as seamless and transparent icons that resemble what they represent

naturally (as opposed to conventionally).

In short, what Willats argues on this score is that pictures are neither wholly

conventional nor naturally iconic. Instead he proposes that pictures have a natural

basis, and non-conventional characteristics, because they issue from our innate

capacity to ‘map’ percepts (or mental representations) of real scenes and their

components according to rules that preserve many of their objective properties. At

the same time, however, he also argues that convention plays a substantial role in

depiction, since it is this that decides which of the manifold possible variants of these

rules are employed by particular groups and cultures.

Willats acknowledged his debt to various aspects of Chomsky’s thinking in

several places. In his magnum opus, Art and Representation of 1997,7 he recognized both

the impact that Chomsky’s theory of syntax made on his understanding of pictorial

structure, and his indebtedness to the pioneering research into ‘picture grammars’

undertaken in the 1960s by David Huffman, Max Clowes, and Adolfo Guzman

under the impact of Chomsky’s theories.8 And later, in Making Sense of Children’s Drawings

of 2005, Willats declared that his explanation of how children learn to draw was

premised upon Chomsky’s ideas about our innate capacity for producing speech.9

Willats also recognized Chomsky’s more indirect influence in both books by stating

that the other main source of his general theory was David Marr’s Vision of 1980,10 a

work that explicitly modelled its use of computational principles to explain how the

brain converted raw data into conscious percepts upon Chomsky’s use of similar

principles to explain how speech transformed its own ‘base’ content.11

Notwithstanding this, fleshing out Willats’s account will involve analysing a

good deal more explicitly than he does himself how his theory applies Chomskian

principles since he only rarely cites specific passages of Chomsky’s writings, and is

reluctant to pursue theoretical generalizations. More specifically, demonstrating

that pictures have a grammar similar in kind to the grammar Chomsky discerned in

language will involve demonstrating a series of more basic facts: first, that pictures

have parts which can be segmented out of their larger structures; second, that syntax

operates on their component parts; third, that they map shapes and spatial relations

in a grammatical way; fourth, that the grammatical rules operating in any picture

are innately grounded; and, finally, that these rules map, or transform, a ‘deeper’

perceptual content.

It must be acknowledged that the type of account to be developed solely by

extrapolating from Willats must necessarily be restricted to characterizing how

pictures perform their most basic cognitive function, namely that of rendering

objects in forms that allow them to be recognized. I nevertheless hope to show

that developing a theory of this kind (or of pictorial grammar pure and simple) is

worth the effort. One justification for doing so is that the reality of such a grammar

is vigorously contested.12 Another is that the analytical categories provided by

Willats greatly enhance our understanding of how pictures work. What is more,

even a rudimentary account of pictorial grammar is capable of making sense of the

complex meanings pictures sometimes achieve when they bend, or subvert, its rules.

And although it would be unwise to attach too much weight to any further claims

about the potential of a theory of pictorial grammar, I will none the less tentatively

suggest in my conclusion that it may be capable of explaining aspects of depiction

other than shape or spatial relations, and may also contribute towards a better

understanding of the relationship between pictures and language. A general theory

of pictorial grammar is, in other words, a worthwhile aspiration for art historians as

for others.

Convention, Iconicity, and Mapping

A convenient route into the differences between semiotic and natural resemblance

theories of depiction is provided by the arguments advanced by Yve-Alain Bois in

his essay, ‘The Semiology of Cubism’, of 1992,13 and by Richard Wollheim in his

response to it in his essay, Formalism and its Kinds, of 1995.14 In the ‘principles’ laid out

at the outset of his essay, Bois argues that the pictorial sign in cubist collage from

1912 is used in the same way either as the arbitrary ‘signifier’ (described by Saussure)

or the (Peircean) index, and that its iconic use is all but entirely suspended.15 An

index is simple enough in that it is a sign that refers to its referent by grace of a

causal or existential bond of some kind. Signifiers are more complex. In the first

place, they do not refer to things and the like, but signify their own signifieds.

More specifically, because signifiers are distinguished from one another within

any ‘system’, or language, by the play of phonic differences alone,16 the signifieds

that system can generate are solely a function of the differences it can generate, and

hence are as arbitrary as their associated signifiers. It is also important to emphasize

that the signifiers actually present in any utterance do not signify by virtue of their

relationship to one another alone, but by virtue of their relationship to their absent

relatives as well.17 Bois extends these arguments by analogy to propose that the

meanings of pictorial signifiers are a function of the whole system they constitute

(although he is unclear about how this category should be applied), and are therefore

arbitrary in a strong sense. Signs thus conceived are evidently incapable of reference,18

at any event, and only mean what they do because convention assigns them their

meaning.

Because Bois sees no role for iconicity in cubist collage, he does not allow that the

games at work in some cubist collages – often flagged by the words ‘jouer’ or ‘jeu’

or suchlike – depend on playing with, or undermining, the function that iconic signs

fulfil when used more straightforwardly. In Picasso’s Bowl with Fruit, Violin and Wineglass

of 1912 (plate 1), for instance, it is ambiguous whether the piece of paper representing

a wooden violin is painted to look like wood – and is therefore a straightforwardly

iconic sign – or whether it is painted to look like faux-bois wallpaper – and is therefore

a sign that imitates a sign that routinely employs iconicity for the purposes of posing

as indexical. But even while the ambiguity of such signs calls their own authority

into question, their ability to do so is ultimately parasitic upon the reliability of the

overwhelming majority of iconic signs to refer.19 So even though Picasso cut out the

illustrations of fruit in the top left-hand corner in rough angular shapes that place

them in the pictorial equivalent of quotation marks, the joke only works because they

still manage to resemble their referents.

Although Bois’ essay does not purport to be a general theory of depiction,

Wollheim nevertheless argues in Formalism and its Kinds that no theory of the kind it

implies could ever be tenable.20 Wollheim’s aversion to Bois’ way of thinking arises

from his commitment to the idea that ‘seeing-in’ and iconicity are both fundamental

to depiction generally,21 even overtly non-figurative painting; from which it follows

that any theory maintaining the possibility that a painting’s content can be

non-representational falsifies what is most central to it. What underlies Wollheim’s

opposition to Bois, in other words, is what he construes as the ‘latent formalism’

apparent in his concern for structure. Rather than tackling this commitment head

on, however, Wollheim prefers to criticize it indirectly by countering Bois’ more

general (and, to be fair, largely implicit) claim that pictures are language-like

in structure. And he does this by identifying a series of conditions drawn from

Chomsky’s thinking that representational pictures ought to be able to satisfy were

they language-like, but which they cannot because (in his view) they are not.22 These

all require that pictures possess grammatical features of a kind that Bois cannot

discern in them (any more than structuralism can in language).23 But Wollheim’s

critique goes further because it purports to show that there is no such thing as

pictorial grammar tout court, which – if right – must mean that Willats’s account of

depiction collapses along with Bois’.

1 Pablo Ruiz y Picasso, Bowl with Fruit, Violin, and Wineglass, 1912. Charcoal, black chalk, watercolour, oil, coarse charcoal or black pigment in binding medium, on newspaper (Le Journal, 6 and 9 December 1912), blue and white laid charcoal papers, supported by thin cardboard, 64 × 49.5 cm. Philadelphia: Philadelphia Museum of Art (A. E. Gallatin Collection, 1952-61-106). © Succession Picasso/DACS, London 2010. Photo: Philadelphia Museum of Art.

Wollheim’s most fundamental objection to the idea of pictorial grammar is that

pictures cannot involve ‘syntactical rules’ because, unlike sentences, they cannot be

‘segmented into smaller units’ or ‘ultimately basic units’ upon which syntax operates.

Wollheim denies, in short, that pictures consist of lexical (and other kinds of) items

ordered by syntax. This claim can be understood better by grasping how it relates

to Flint Schier’s criticism of the notion of pictorial grammar in Deeper into Pictures of

1986, a work Wollheim regarded highly.24 This contends that ‘pictures are weakly

decomposable or compositional but not strongly so’ like ‘sentences’, and hence

contain nothing ‘which plays the role of a word’ (or ‘morpheme’) in them.25 Schier

consequently claims that in order to understand a picture we have no need to grasp

anything akin to the syntax operating in a sentence, or ‘the relevant grammatical

rules governing [the] composition … of its parts’.26 Rather, he believes that there

are ‘no grammatical rules’ governing the composition of pictures, and he sees ‘no

place for a grammar or syntax of pictures’ at all.27 Indeed, Schier sums up Wollheim’s

position along with his own when he asserts that ‘There is no need for a grammar in

Chomsky’s sense … for iconic systems’,28 since a picture is just ‘built up iconically out

of its parts’. Our manifest ability to recognize a whole icon is, in other words, simply

a function of our ability to recognize ‘the objects and states of affairs represented by

[its] parts’.29 What this seems to suggest is that pictures iconify by grace of natural

resemblance alone, or in virtue of being somehow – or in some unspecified sense –

isomorphic to their referents.

One of the more important implications of such a view is that pictures resemble

things without making recourse to any independent syntax of their own which

might complicate their relationship to their referents. So, even though Wollheim

eschews the notion of pictorial grammar altogether, his arguments about the

naturalness of iconicity nevertheless accord with the Port-Royal grammarians’ view

that there was nothing in the grammar of language that exceeded the requirement

that it should be able to reflect the natural order of things. This view is highly

reductive, however, as Chomsky famously demonstrated when he showed that the

‘surface structure’ of a sentence maps – or transforms – a ‘deep structure’ according

to rules proper to itself.30 It follows that inasmuch as he embraces a closely analogous

conception of pictorial grammar, Willats maintains that pictorial structure is to some

extent sui generis too.

At the same time as challenging Wollheim’s and Schier’s account of

representation for being too strongly naturalistic, Willats implicitly contests Bois’

view for being too strongly conventionalist. This is because he envisages pictorial

mapping as a process that preserves a measure of identity between the structures of

pictures and those of the objects and scenes they represent. Willats suggests, more

particularly, that pictorial syntax requires semantic elements like lines, and higher-

order structures like whole shapes, to fall into arrangements consistent with those

formed by their real counterparts. Lines, in other words, must occlude one another

since this is what edges do, and shapes must have continuous surfaces because this is

how they are in actual objects.

Although it may seem that pictures cannot both transform their referents and

preserve some of their features, Willats argues synthetically that pictorial structures

possess what might be thought of as a qualified isomorphism with the structure of

things. He seems to regard pictures, in other words, as closely akin to maps, which

transform what they represent regularly, and not just haphazardly. This sense of

pictures as approximating to maps is shared by Andrew Harrison, who argues that

they succeed in plotting many of the relations between things, even though they do

so according to the rules of their own syntax.31 Evidently, the great advantage of a

view such as this is that it makes it possible to maintain perfectly coherently that all

viable pictorial structures can be formalized in sets of normative or conventional

rules (like those of schoolboy grammar), and that they will be transparent or iconic if

they transform percepts in ways that allow these to be recoverable. Put another way,

it is implicit in Willats’s account of representation that pictures succeed in referring to

things because their structures are ultimately constrained by what Richard Gregory

called ‘the grammar of vision’ (a formulation cited by Chomsky).32

For and against Pictorial Grammar – 1. Segmentability

Before it is possible to establish anything of this complexity, any theory of pictorial

grammar must first of all establish that pictures have component semantic parts

of a kind that syntax can operate on, or assemble into larger units of sense. This

is what Willats argues, and by doing so contests Wollheim’s view of pictorial

compositionality, which denies that they have parts of this kind. It is important to

realize, however, that Wollheim does not suggest that pictures have no parts tout

court. Indeed, he accepted that they do in an article of 1966, when he concurred with

Michael Podro’s view that a ‘higher-level [pictorial] composition’ can have ‘basic’, or

‘sub-components’,33 arguing that many Old Master paintings have structures whose

component elements ‘are put together and contained in’ larger pictorial structures.34

Rather, Wollheim’s argument is that the parts of a picture are not the same in kind as

the discrete, and ultimately irreducible, semantic units (or whole component parts

that make sense on their own) into which sentences can be decomposed according to

Chomsky. This more restricted claim nevertheless runs directly against the grain of

the Chomskian argument Wollheim made in 1973 that ‘lexical items … concatenated

into complexes’, and ‘syntax and semantics’, ‘belong to the essence of art itself’.35

One way of understanding Wollheim’s volte-face is that his views were altered

by reading Schier. But it is also likely that his close dialogue with Nelson Goodman

throughout the 1970s affected his views, and in particular the distinction Goodman

proposed in Languages of Art (1976) between a ‘finitely differentiated’ system or

‘notational scheme’, such as the printed notation we use for language, and the

notational scheme of painting.36 In more detail, this

argument holds that whereas printed ‘characters’ are

‘disjoint’, and readily identified as ‘different’ from one

another by virtue of their membership of one ‘class’

(such as ‘a’) or another (such as ‘d’), nothing of the sort

can be said of the components of pictures.37 Instead,

pictures have ‘syntactically dense’ systems that lack

clear internal differentiation,38 from which it follows

that they cannot be ordered by syntax.

Goodman’s views were contested almost as soon

as they appeared by Curtis Carter, who argued that

‘shapes’ are indeed disjoint components of pictures

that exhibit character class membership,39 and hence

that they are readily seen as ‘constitutive elements’ of

‘larger units’ akin to ‘phonemes’, ‘morphemes’, and

‘spoken words’.40 His argument falls, however, because

it does not clearly establish what the defining limits of

a shape are. In a similar vein, Willats’s predecessors,

Huffman and Clowes, argued that structures such as

2 Drawing of a square produced by a 3½-year-old, figure 3.9 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press. Adapted from figure 10b from Jean Hayes, ‘Children’s visual descriptions’, Cognitive Science, 2: 1, January–March 1978, 14. © Cognitive Science Society.

3 Paul Cézanne, Provencal Landscape with a Red Roof, or The Pine at L’Estaque, 1875–76. Oil on canvas, 72 × 58 cm. Paris: Musée de l’Orangerie (Collection Jean Walther et Paul Guillaume). Photo: © RMN (Musée de l’Orangerie)/Franck Raux.

line junctions and vertices, being composed of more simple elements in the form

of lines, thereby imply the operation of pictorial syntax on disjoint units. It is a

weakness of Huffman’s theory, however, that it does not closely specify the character

of these units, and of Clowes’s and Guzman’s that they characterized lines somewhat

vaguely as the boundaries of ‘regions’.41

In the light of these partial successes, it was an important part of Willats’s

achievement that he clearly identified pictorial units of the kind Goodman and

Wollheim regarded as indiscernible by developing the analytical category of the

‘picture primitive’.42 These items come in a small number of simple forms: lines

denoting edges, points indicating luminosity, and colours referring to hues and tones

simultaneously. Put this way, the argument seems trite. But it actually represented

a major advance over earlier attempts to identify basic pictorial units because it

involved distinguishing a pictorial semantic unit from the marks carrying it. A line,

in other words, is no more identical to a pencil or pen mark, or a set of brushstrokes,

or the interstices between tesserae, than a word is the same as the sound in which it is

expressed.43 Nor is a point the same as a blob, or a colour the same as a brush mark.

Once this is appreciated, it starts to appear as though Wollheim’s objection to pictorial

segmentability involves conflating pictorial marks, which are not always readily

segmentable, with the primitives they carry, which are. Or to put it the other way

round, a picture decomposes straightforwardly once its units are correctly specified,

just as a sentence does.

What Willats achieved by identifying the semantic units of the picture in

this manner is closely comparable to what Chomsky achieved by inventing the

‘rewrite rule’,44 that is, the technique of characterizing the components of phrases

‘generatively’ in terms of explicit lexical categories such as noun, verb, adjective and

the like.45 And, indeed, Willats not only regards picture primitives as elements of the

picture’s ‘denotation system’, where they play a role akin to the ‘constituents’ of the

‘lexicon’ in language,46 but casts them as lexical units on account of how they refer

to ‘scene primitives’, or to the basic perceptual units of the scene they depict. Lighter

and darker points, for example, refer to the degrees of luminosity we extract from the

luminous array, and lines refer to the edges we extract from it. This means that lines

are indeed genuine units of sense, or that they are components of larger structures

that stand for things rather like nouns. Furthermore, Willats supports this inference

by showing that children’s drawings sometimes use what he calls ‘shape modifiers’,

or marks appended to a line, to specify particular aspects of its shape – almost as

adjectives – illustrating his claim with a drawing of a cube (plate 2), where these marks

stipulate several such properties.47

Being ‘primitive’ a line can evidently be decomposed out of the larger semantic

unit of a vertex, rather as a word can be isolated from a phrase. A vertex, moreover,

can be decomposed out of the larger unit of a surface, which can itself be extracted

from a whole shape, which can in turn be isolated from the whole scene depicted,

just as phrases can be decomposed out of larger phrases and ultimately whole

sentences. (The caveat is that it is nevertheless misleading to think that the units

of sense to be identified at different levels of pictorial structure have an equivalent

valence to the units present at different levels of linguistic structure.)
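
The nesting of units described here can be pictured, very roughly, as a tree. The following Python sketch is offered only as an illustration under stated assumptions: the class name `Unit`, the function name `primitives`, and the toy scene with its labels are hypothetical and are not drawn from Willats. It also simplifies matters by treating each line as belonging to a single vertex, whereas in a real drawing a line would normally join two.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    """A pictorial unit at one level of structure (hypothetical representation)."""
    kind: str                      # "line", "vertex", "surface", "shape" or "scene"
    name: str
    parts: List["Unit"] = field(default_factory=list)


def primitives(unit: Unit) -> List[str]:
    """Return the names of every line contained in a unit, in reading order."""
    if unit.kind == "line":
        return [unit.name]
    found: List[str] = []
    for part in unit.parts:
        found.extend(primitives(part))
    return found


# Toy scene (an assumption for illustration): one shape, one surface, two vertices.
v1 = Unit("vertex", "v1", [Unit("line", "e1"), Unit("line", "e2")])
v2 = Unit("vertex", "v2", [Unit("line", "e3"), Unit("line", "e4")])
front = Unit("surface", "front face", [v1, v2])
box = Unit("shape", "box", [front])
scene = Unit("scene", "scene", [box])

print(primitives(scene))   # the lines recovered from the whole scene
print(primitives(v1))      # the lines recovered from a single vertex
```

The point of the sketch is simply that the same extraction procedure works at every level, which is the sense in which a line can be isolated from a vertex much as a word can be isolated from a phrase; it makes no claim about the relative weight of units at each level.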

There is an initially plausible objection to the idea of pictorial segmentability,

however, which is that pictures do not have units that we see in isolation in the same

way as we (seem to) hear words individually. It might be claimed, for example,

that when we fixate on a particular line in a drawing we cannot isolate it from

the lines adjacent to, or conjoint with, it, whereas we hear words as having sharp

boundaries. The fact is, however, that our impression that we hear individual words

independently of the larger structures they form is retrospective. That is, we extract

individual words from the larger sequences in which they are rolled out only after

the fact, even if this is not how it seems to us.48 There is therefore a genuine analogy

4 Paul Cézanne, Woman with a Coffee-Pot, c. 1895. Oil on canvas, 130 × 95 cm. Paris: Musée d’Orsay. Photo: © RMN (Musée d’Orsay)/Hervé Lewandowski.

to be made between the phenomenology of parsing a sentence into its constituents

and what happens when we identify a picture’s units only in the context of the

complete work, or of a large section of it. The area behind the screen of trees in the

centre of Cézanne’s Provencal Landscape with a Red Roof of the mid-1870s (plate 3) is a good

example of this last phenomenon at work, since it is almost impossible to identify its

constituent primitives, or shapes, when seen in isolation (from close-to), but these

are readily identified when a sufficient distance is attained for the whole painting to

be visible.

Cézanne’s mature paintings in general tend to exhibit compositionality overtly,

since they make little attempt to hide the fact that the objects in them are made up

of smaller components. In Woman with a Coffee-Pot of c. 1895 (plate 4), for example, it

is difficult to resist the temptation to see the upper and lower halves of the sitter as

joined together in the same way as the two sections of the coffee-pot to her right,

as both ‘objects’ readily parse into roughly conical or cylindrical components

of the kind Irving Biederman calls ‘geons’, or ‘viewpoint invariant, volumetric

primitive[s]’ analogous to ‘phoneme[s]’.49 It remains a problem, however, that many

paintings appear so fluent that it can seem as though Wollheim’s argument against

compositionality must prevail. We may nevertheless be aware of their component

parts as such subliminally, since one part of our perceptual mechanism is devoted

to picking out areas of high contrast like lines, even though what that mechanism

detects is amalgamated with lower frequency information in conscious seeing (rather

as edges are when we see real scenes).50 Compositionality is thus a real feature of our

perception of pictures, even when it is not obviously so.

One symptom of compositionality is that pictures and speech are both rolled out

in units. With speech, it is obvious enough that we fabricate sense in ever-larger units

consisting of words, whole and embedded phrases, and whole sentences. But Willats

shows that we do the same kind of thing in pictures too, when, in an argument about

occlusion in pictures, he contrasts the natural sequence in which we draw an object

with a possible sequence we almost never use.51 The natural drawing (plate 5) shows

that we set down all the lines making up the front face of the object first, then repeat

the process to form the adjacent face, and finally draw its partially occluded faces

5 Stages of drawing a rectangular object, figure 5 from John Willats, ‘The third domain: The role of pictorial images in picture perception and production’, Axiomathes, 13: 1, March 2002, 1–15. Photo: With kind permission from Springer Science and Business Media.

6 Stages of drawing a rectangular object, figure 6 from John Willats, ‘The third domain: The role of pictorial images in picture perception and production’, Axiomathes, 13: 1, March 2002, 1–15. Photo: With kind permission from Springer Science and Business Media.

one after the other. The second (plate 6) shows how it is possible to begin drawing

the same object by drawing some of its various edges without joining them up into

faces. The fact, however, that we almost never use this sequence strongly implies that

we are predisposed to assemble pictures out of whole ‘units’ of the very kind whose

existence Wollheim contests, at the earliest possible stage of the picture-making or

picture-perceiving process.52

2. Syntax

Two of the key sources of Willats’s ideas about pictorial syntax were Huffman’s

article, ‘Impossible Objects as Nonsense Sentences’,53 and Clowes’s ‘On Seeing

Things’, both of 1971. Both of these used line drawings of ‘anomalous’ objects such as

an impossible polyhedron (plate 7) to reveal the rules governing the combination of

edges in properly-formed objects and pictures, in imitation of Chomsky’s technique

of using ambiguous and nonsense sentences to reveal the rules governing well-

formed sentences and their component ‘strings’.54 By this account, the sentence

‘colourless green ideas sleep furiously’ could appear to be a well-formed string

inasmuch as the adjectives, noun, verb and adverb are all in the right place, but it is

anomalous nevertheless because it violates (and hence reveals) the ‘selectional rule’ that

only certain kinds of ‘lexical items’ can be placed together if a sentence is to make

sense of itself. By contrast, there is no such problem with the sentence ‘revolutionary

new ideas appear infrequently’, for all that it is superficially identical in structure.55
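
The contrast between these two sentences can be made concrete in a toy model. The Python sketch below is a hedged illustration only, not a rendering of Chomsky's formalism: the lexicon, the semantic features attached to each word (`physical`, `abstract`, `animate`), the fixed category pattern, and the function names `categories_ok` and `selection_violations` are all assumptions introduced here for the purpose.

```python
# Hypothetical toy lexicon: categories and features are assumptions for illustration.
LEXICON = {
    "colourless":    {"cat": "Adj", "requires": {"physical"}},
    "green":         {"cat": "Adj", "requires": {"physical"}},
    "revolutionary": {"cat": "Adj", "requires": {"abstract"}},
    "new":           {"cat": "Adj", "requires": set()},
    "ideas":         {"cat": "N",   "features": {"abstract"}},
    "sleep":         {"cat": "V",   "requires": {"animate"}},
    "appear":        {"cat": "V",   "requires": set()},
    "furiously":     {"cat": "Adv"},
    "infrequently":  {"cat": "Adv"},
}

# The surface order of categories shared by both example sentences.
PATTERN = ["Adj", "Adj", "N", "V", "Adv"]


def categories_ok(words):
    """True if the words fall into the expected order of lexical categories."""
    return [LEXICON[w]["cat"] for w in words] == PATTERN


def selection_violations(words):
    """List the adjectives and verbs whose requirements the noun does not meet."""
    noun = next(w for w in words if LEXICON[w]["cat"] == "N")
    features = LEXICON[noun]["features"]
    return [
        w for w in words
        if LEXICON[w]["cat"] in ("Adj", "V")
        and not LEXICON[w]["requires"] <= features
    ]


anomalous = "colourless green ideas sleep furiously".split()
acceptable = "revolutionary new ideas appear infrequently".split()

for sentence in (anomalous, acceptable):
    print(categories_ok(sentence), selection_violations(sentence))
```

Both sentences pass the category check, which is why they look identical in structure; only the first fails the second check, which is where the anomaly lies on this simplified reading.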

One of the more notable discoveries of Huffman and Clowes was the principle

that the rules governing the lawful combination of edges in a rectilinear object can

be expressed as variations of four basic categories of line junction, the V-, W-, Y-, and

T-junctions (plate 8), to use Huffman’s nomenclature; or the Ell, Arrow, Fork, and Tee

junctions, to use Clowes’s.56 This made it possible at least to envisage that

pictorial grammar could be described systematically in terms of clear categories

like those Chomsky had isolated. It also provided strong evidence for the existence of

something closely analogous to the kind of ‘well-formed’ structure, or ‘string’, that

Chomsky had discerned in grammatical sentences.

At the same time, Huffman and Clowes developed systems of labelling their

diagrams which showed the convexity or concavity of an edge clearly (a technique

they compared to ‘parsing’ a sentence),57 but which also made it possible readily

to identify ‘forbidden’, or ‘ungrammatical’,58 combinations of edges as well. They

were consequently able to specify a number of rules that they proposed pictures

could not infringe without being ill-formed, which included the rule that the same

surface cannot exist on two sides of the same edge, and the rule that the character of

an edge must be identical on both sides of the line describing it (viz. either convex

or concave).59 They also made it clear that all the junctions in an object must be

susceptible of being resolved coherently together, just as the elements of a sentence

must, if it is to be well- and not ill-formed.
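The requirement that labelled junctions be resolved coherently together can be illustrated with a minimal sketch. It assumes a simplified reading of the labelling scheme, in which each line receives one of three hypothetical labels ("+" for convex, "-" for concave, ">" for occluding) at every junction it meets, and it checks only that no line is labelled differently at its two ends. The function name and data layout are illustrative assumptions, not Huffman's or Clowes's own notation, and the full catalogue of permitted junctions is not reproduced here.

```python
from typing import Dict, List

# Hypothetical encoding: junction name -> {line name: label at that junction}.
# Labels (an assumption for this sketch): "+" convex, "-" concave, ">" occluding.
Labelling = Dict[str, Dict[str, str]]


def inconsistent_lines(labelling: Labelling) -> List[str]:
    """Return the lines that carry different labels at different junctions."""
    first_label: Dict[str, str] = {}
    clashes = set()
    for lines in labelling.values():
        for line, label in lines.items():
            # Remember the first label seen for a line; flag any later disagreement.
            if first_label.setdefault(line, label) != label:
                clashes.add(line)
    return sorted(clashes)


# A line "ab" labelled convex at both of its ends resolves coherently.
coherent = {"A": {"ab": "+", "ac": "-"}, "B": {"ab": "+"}, "C": {"ac": "-"}}

# The same line labelled convex at one end and concave at the other does not.
anomalous = {"A": {"ab": "+"}, "B": {"ab": "-"}}

print(inconsistent_lines(coherent))
print(inconsistent_lines(anomalous))
```

A drawing that passes this check is not thereby well formed; the check captures only one of the constraints described above.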

In this way Huffman and Clowes gave grounds for thinking that Wollheim

was wrong to contest the reality of pictorial grammar on the basis that there is

nothing analogous in pictures to the ‘well-formed strings’ of grammatical sentences

because it cannot be said of them, as it can of language, that ‘the meaning [of a

well-formed] string … is determined by its syntactical form plus [its] vocabulary,

or lexicon’. A further distinctive feature of Huffman’s and Clowes’s arguments is

that they maintained that we interpret shapes from the bottom-up, first of all by

identifying simple structures like line junctions which are governed by only a few

rules, then by identifying the larger, coherent structures these create, and ultimately


by identifying the whole scene such structures form

together. Interpreting shapes this way is clearly

akin to the way that Chomsky suggests we interpret

sentences, by first identifying their smaller units and

then combining these into larger and more complex

structures. Chomsky notoriously developed two

methods of representing the consequences of this

process, bracketing sentences and representing them in

the form of ‘trees’ (plate 9), both of which show clearly

how sentences are formed out of nesting structures,

or several levels of sense. Huffman’s and Clowes’s

diagrams have no structure of this sort because neither

they nor Guzman realized that the

rules governing line junctions and their combinations

are subject to higher-level rules governing the manner

in which the picture projects, or maps, three-

dimensional spatial relations into two dimensions.60

But in characterizing the ‘drawing system’ at work in

the picture as a set of rules of precisely this sort, Willats

made it possible to see how it was also the top level of

pictorial syntax to which all subsidiary structures are

subservient.
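The equivalence between bracketing and tree diagrams mentioned above can be sketched in a few lines. The example assumes a deliberately simplified constituent analysis of ‘a wise man is honest’ (a sentence S made up of a noun phrase NP and a verb phrase VP); the labels and the tuple encoding are illustrative assumptions rather than a transcription of Chomsky's diagram.

```python
from typing import Tuple, Union

# A tree is either a word (a leaf) or a tuple (label, child, child, ...).
Tree = Union[str, Tuple]


def to_brackets(tree: Tree) -> str:
    """Render a nested tree as a labelled bracketing."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "[" + label + " " + " ".join(to_brackets(c) for c in children) + "]"


def depth(tree: Tree) -> int:
    """Count the levels of nesting above the words themselves."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(child) for child in tree[1:])


# Simplified, assumed analysis: S -> NP VP.
sentence = ("S", ("NP", "a", "wise", "man"), ("VP", "is", "honest"))

print(to_brackets(sentence))
print(depth(sentence))
```

Both renderings carry the same information: the tuple nesting is the tree, and the bracketed string is that tree written out on one line.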

The realization that projection was part of pictorial

syntax was only implicit in the research Willats

published in the book, Drawing Systems of 1972, co-

authored with Fred Dubery. Nonetheless, this text

offered a clear description of how the various different

methods of projection each impart a specific character

to vertices and line junctions because they articulate

them differently. As Willats later made explicit, the

same vertex or junction can look quite different in

7 Impossible polyhedron, figure 2e from David Huffman, ‘Impossible objects as nonsense sentences’, in Bernard Meltzer and Donald Michie, eds, Machine Intelligence, vol. 6, Edinburgh, 1971, 295–323. Photo: Courtesy of Edinburgh University Press.

8 Look-up list of labelled line junctions in drawings of rectangular objects, figure 5.4 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press. Originally figure 6 from David Huffman, ‘Impossible objects as nonsense sentences’, in Bernard Meltzer and Donald Michie, eds, Machine Intelligence, vol. 6, Edinburgh, 1971, 295–323. Photo: Courtesy of Edinburgh University Press.

9 Tree diagram of the surface structure of the sentence, ‘A wise man is honest’, from Noam Chomsky, ‘Linguistic contribution: present’ [1967], in Language and Mind, Cambridge, 2006 [1968], 26. Photo: © Cambridge University Press.


vertical oblique projection (plate 10), for example, from

how it looks in oblique projection proper (plate 11).

More importantly, his early research insisted that not

all projection systems bestow a particular ‘secondary

geometry’ on shapes as a function of their ‘primary

geometry’, or because they map spatial relationships

in a scene as they appear from a single viewpoint.61 He

argued instead that some drawing systems can only

be understood in terms of their secondary geometry

alone, or through how they map relationships in a

scene without reference to viewpoint.
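For those systems that can be described through primary geometry, the difference between oblique and vertical oblique projection noted above can be sketched as two mapping rules from scene coordinates to picture coordinates. The formulation is an assumption based on the usual textbook account of parallel projection, in which depth is added to the picture coordinates along a fixed direction. The function names, the 45-degree angle and the foreshortening factor of 0.5 are hypothetical choices, not values given by Willats.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]  # (x across, y up, z depth) in the scene
Point2 = Tuple[float, float]         # (x across, y up) on the picture surface


def oblique(p: Point3, angle_deg: float = 45.0, scale: float = 0.5) -> Point2:
    """Oblique projection: depth recedes along a diagonal of the picture."""
    x, y, z = p
    a = math.radians(angle_deg)
    return (x + scale * z * math.cos(a), y + scale * z * math.sin(a))


def vertical_oblique(p: Point3, scale: float = 0.5) -> Point2:
    """Vertical oblique projection: depth is stacked straight up the picture."""
    x, y, z = p
    return (x, y + scale * z)


# The front and back ends of one receding edge of a rectilinear object.
front = (1.0, 1.0, 0.0)
back = (1.0, 1.0, 2.0)

print(oblique(front), oblique(back))
print(vertical_oblique(front), vertical_oblique(back))
```

Under these assumptions the same receding edge is drawn as a diagonal in the first system and as a vertical in the second, which is one way of seeing why the junctions at its ends take on a different character in each.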

Willats and Dubery coined the term ‘drawing

system’ to identify the different projection systems

that fell under this more comprehensive definition.

And, crucially, they were able to show that all forms of

vanishing-point perspective and parallel projection,

including complex systems such as three-point and

inverted perspective, or isometric and axonometric

projection, could be explained by systematic rules.

Later, in Art and Representation, Willats extended the

concept of the drawing system to very simple

projection systems that do not appear to be rule-

governed at all at first glance. Perhaps most remarkably

he even discerned a kind of regularity in the

‘enclosure’ system used by very young children (plate

12). Even though such ‘enclosures’ do not map shape

systematically but merely describe brute extension in

individual shapes, they none the less give some sense of

the basic topological relations – such as adjacency – that

obtain between the elements of an object.62

10 Drawing of a rectilinear object in vertical oblique projection, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.

11 Drawing of a rectilinear object in oblique projection, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.

12 Drawing of a rectilinear object in enclosure, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.


Although it was shortly after the publication of

Drawing Systems that Willats realized his conception of

projection had analogies in Chomsky’s conception

of syntax,63 only in his last book did he explicitly

compare the drawing system to the high-level syntax

that organizes a sentence’s components into a coherent

whole.64 And although Willats remained reticent

about exactly how the analogy might play out, he

nevertheless implies that the drawing system might

be compared to the ‘parameters’ that specify such

things as the sequence of subject, verb and object in a

particular language,65 or which determine whether

a language is in Chomskian terms ‘left-branching’

or ‘right-branching’ in structure.66 It was clear in

any event that the rules of the drawing system take

precedence over rules of the sort described by Huffman

and Clowes, and that in consequence a picture’s

units can indeed be described in terms of the kind of

branched ‘nesting’ structures into which Chomsky

analysed sentences. It is no surprise therefore that the

type of object recognition known as ‘syntactic pattern

recognition’ has proposed to analyse drawings and

visual scenes into tree diagrams of precisely this kind

(plate 13 and plate 14).67

One of the most significant aspects of Willats’s final

conception of the drawing system is its claim that the

syntax of any such system organizes a picture’s parts

exclusively in its own terms. This means that a picture’s

secondary geometry will inevitably produce a pictorial

space that is neither flat, nor the same as that of its real

counterpart, but which constitutes a ‘third domain’

13 Pictorial patterns for scene A and picture F, figure 1.1 from King Sun Fu, Syntactic Pattern Recognition and Applications, first edn, Englewood Cliffs, NJ, 1982, 2. © Prentice Hall, 1982. Photo: Reprinted by kind permission of Pearson Education Inc., Upper Saddle River, NJ.

14 Hierarchical structural descriptions of scene A and picture F, figure 1.2 from King Sun Fu, Syntactic Pattern Recognition and Applications, first edn, Englewood Cliffs, NJ, 1982, 3. © Prentice Hall, 1982. Photo: Reprinted by kind permission of Pearson Education Inc., Upper Saddle River, NJ.

15 Picture-faces from Ludwig Wittgenstein, Lectures and Conversations on Aesthetics, Psychology and Religious Belief, ed. Cyril Barrett, Oxford, 1966, section 10, 4. Photo: By permission of John Wiley & Sons.


between the two that we can recover only from its drawing system.68 This radical

argument has several consequences, the most important of which for the purposes

of my argument is that it reopens the gap between shape percepts and pictures, a

gap which – as already mentioned – had been completely overlooked in naturalistic

accounts of iconicity, and all but ignored in the early picture grammars too.

The most obvious conclusion to be drawn from Willats’s arguments about

pictorial syntax is that it is the glue that binds the elements of a picture into a

coherent whole. By corollary, then, Wollheim and Schier are not able to explain

how pictures hold together. This shortcoming is evident in Schier’s discussion of a

diagram composed of a rectangle surrounded by dots, which is taken to represent a

conference table surrounded by delegates. Even though Schier accepts that this image

uses a ‘compositional’ system the ‘grammar and lexicon’ of which are parasitic upon

‘the linguistic environment’, he maintains that ‘the significance of its composition’

requires ‘no special or prior stipulation’, and can be explained by its ‘quasi-natural’

‘grammar’ alone,69 or in effect its ability to replicate the structure of the scene it

represents. He therefore argues that while such a drawing may contain ‘sub-iconic

parts’ in the form of lines and dots that do not signify anything outside their broader

pictorial context, their contribution to the whole ‘is not determined conventionally’

or by the operations of syntax.70 Even excusing Schier’s curious insistence on the

sheer conventionality of linguistic grammar, which takes no account of its innate

16 Henri Matisse, Blue Nude (La Grenouille), 1952. Lithograph by Mourlot after the gouache découpée (published in Verve, 9: 35–6, 1958), 18 × 16 cm. © Succession H. Matisse/DACS 2010.


basis, the problem remains that his account does not explain why we come to see

these parts differently in context from how we see them in isolation. That is, he

simply fails to see that this phenomenon can only be explained by the operations of a

syntax that is not ‘bracket-free’,71 but bracketed like Chomsky’s, or which determines

the form and sense of the units upon which it operates by combining them into

larger units themselves subordinated to yet larger units.

The problem comes into plainer view still in Schier’s account of the ability of a

‘sub-iconic’ pair of dots, which iconify nothing on their own, to iconify eyes within

a picture-face. His argument here is that they can do so in this context because

they function as units of a whole icon.72 But again this does not explain what it is

about being placed in such a context that makes the dots look so different. Indeed, it

would seem that the only credible explanation for the fact that they come to life – as

eyes – is that the grammar of picture-faces mimics the peculiarly holistic grammar of

face perception.73 The point here is that this form of perceptual grammar is unique

because it binds eyes and the other elements of faces together into a meaningful

whole. Within this gestalt the dot-eyes enjoy a special relationship to each other as a

pair, to the other features of the face, and to the rim of the face; with the consequence

that they look radically different in this setting from how they look outside it. This idea

would certainly seem to explain why the same feature looks different on different

faces, and why people find it hard to recognize a person from one facial feature alone.

If the grammar of picture-faces does in fact mimic that of face perception it

would help to explain several aspects of their peculiarity. It makes sense, for instance,

of the way that even very small changes to a schematic picture-face will normally

result in significant changes to its ‘expression’, as Wittgenstein observed (plate 15).74

It would also explain why the round regions denoting the breasts of the figure in

Matisse’s Blue Nude (plate 16) seem unusually compelling, without quite turning into

eyes, when viewed within the context provided by her upraised arms and torso. The

peculiarity of this kind of grammar would, in addition, account for the fact that there

is what Andrew Harrison has called a ‘minimal syntax’ to a picture-face,75 or that it

can only work when it organizes a sufficient number of features within a sufficiently

replete structure. As Harrison was fond of observing, a vivid demonstration of

minimal syntax is that it is all but impossible to draw the Cheshire cat disappearing.76

Too much information, as in Tenniel’s illustration, will yield a cat obscured by

foliage; but too little will produce ‘a grin without a cat’ – something Alice declared to

be ‘the most curious thing I ever saw in my life!’

The peculiar, holistic, grammar of the face also poses a problem for artists,

inasmuch as it means that faces will look different in kind from other objects in

the picture. Most artists successfully dissemble any problem of this kind, but

Cézanne sometimes overcame it, as Merleau-Ponty pointed out, by ‘painting [the]

face “as an object”’, or a mere assemblage of shapes.77 His reasons for doing so were

undoubtedly complex, but the fact that he took such drastic steps in portraits like

Woman with a Coffee-Pot demonstrates how pictorial syntax was – and is – an inescapable

reality.

Another way of describing what happens when we see a face as a meaningful

whole as opposed to a mere collection of shapes is that the syntax allows us to

perceive the relation between its parts synchronically rather than piecemeal. Pictorial

syntax also allows us to see a whole picture as a single coherent unit the meaning of

which is present all at once, just as linguistic syntax makes the sense of a complete

sentence present as such. The synchronicity of sense in both kinds of output is

explained better by the idea that syntax subsumes the meanings of the constituent


parts of a picture or a sentence within the meaning of the whole. The effect is, of

course, normally involuntary, and recent research shows that the unconscious impulse

towards synchronic coherence results in unexpected similarities between the ways

in which we attend to both pictures and language. Robert Solso has demonstrated,

for example, that a spectator familiar with pictorial conventions will move her eyes

around a painting in an attempt to discover ‘thematic patterns’ in it,78 rather as

Alan Kennedy has shown that the eye movements we deploy when reading are not

linear, but involve both retrospective movements designed to check the sense we

have already acquired and proleptic movements used to anticipate the sense we are

about to encounter.79 At all events, such findings strongly suggest that it is simplistic

to assert that a painting is perceived instantaneously while a piece of poetry is heard

sequentially.80 Rather, it would seem that syntax bestows a degree of synchronicity

upon both.

3. Grammaticality

Willats’s conception of pictorial grammar as something that results in structures

analogous to well-formed strings – or words arranged in grammatically correct

structures – implies that pictures will normally display a form of grammatical

correctness.

A simple example of a well-formed pictorial string is a line junction that

obeys the rules of occlusion described by Huffman and others. To be well formed

at a higher level, a picture must use projection systematically. The majority of

pictures observe these rules and look right as a consequence, so much so that their

grammaticality is elusive. This obedience becomes apparent when they are compared

with pictures that do contravene the rules. Huffman used drawings of anomalous

objects in this spirit, demonstrating how ungrammatical combinations of lines

will render a picture spatially incoherent in ways foreign to well-formed drawings.

So too, Willats argues that many Byzantine and ‘Orthodox’ paintings employing

several drawing systems simultaneously look anomalous, nowadays, since they

contravene the rule that projection should be consistent throughout the picture, as

it is in photographs and paintings in perspective.81 It is clear, then, that Wollheim’s

protestations against the possibility of ‘ill-formed

or ungrammatical’ strings in pictures, and Schier’s

denial of the very possibility of an ‘ill-formed [iconic]

symbol’ genuinely analogous to ‘a sentence that is

ungrammatical in Chomsky’s sense’,82 simply run

counter to the phenomenal facts.

If grammar operates throughout every level of a

picture’s structure, it must not only employ a drawing

system and a denotation system that are each internally

consistent to qualify as well formed, but also ensure

that these are compatible with each other. Most of

the time this is the case, and, conversely, when a

picture’s drawing system clashes with its denotation

system its ill-formedness is normally patent. Willats’s

argument on this count is complex, however, since his

ideas about compatibility involve a dense distinction

between drawing and denotation systems that are

‘view-based’ and those that are ‘object-based’.

The basis of this distinction is that either kind

17 Drawing of a rectilinear object in fold-out projection, from figure 13.2 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press.


of system maps a corresponding kind of percept. View-based systems are simple

enough in this regard, being projection systems such as perspective which map how

spatial relations look from a particular point of view, or denotation systems which

use primitives like contours to express the views presented by edges and surfaces.

By contrast, object-based systems articulate the more elusive percepts that David

Marr called ‘object-centred descriptions’, or the mnemonic descriptions of shapes in

general, from no particular point of view, which we derive from objects by tracking

their silhouettes back to the three-dimensional forms capable of generating them.83

Although controversial, Marr’s arguments in favour of the existence of object-

centred descriptions are compelling.84 It would be difficult to understand how we

could recognize things unless we could match the views they present to relatively

simple representations of invariant shapes of a kind our limited memory is capable

of storing. Moreover, only recourse to something like object-centred descriptions

can explain how we can be aware of the solidity and depth objects have even though

they only present us with their surfaces. At all events, Willats argues persuasively that

18 Paul Cézanne, Victor Chocquet Seated, c. 1877. Oil on canvas, 46 × 38 cm. Columbus, OH: Columbus Museum of Art (Howald Fund Purchase, 1950.024). Photo: Columbus Museum of Art.


many projection systems are predominantly object-based, including several forms

of parallel projection, since these preserve the relative lengths of the sides of objects

as they really are, rather than as they appear. So too are primitives of the kind Willats

calls ‘regions’, which stand for entire volumes rather than for surfaces as they appear.

The ‘stick’ regions children use to represent limbs, for example, do exactly this, and

should not be mistaken for rudimentary contours.

Because the two kinds of system are fundamentally different, trouble inevitably

arises when a picture combines a projection system of one kind with a denotation

system of the other. Children, for example, get into difficulties as they mature when

they try to project views while remaining attached to the regions they had formerly

deployed in drawings projected by enclosure. So, for example, when they experiment

with ‘fold-out’ projection in an attempt to capture how the sides of a rectilinear

object appear to join up (plate 17), they still use regions to denote the objective shape

of these sides, with the result that their drawings look very odd indeed.85 Willats also

identifies several other uncomfortable combinations of projection and denotation

system,86 which further support the claim that a grammatical relation between the

two kinds of systems is normative. Ironically enough, so does Schier’s conference dot

diagram, since it is an ill-formed icon of the very kind whose possibility he denies.

The more specific problem with it is that both its drawing system and its denotation

19 Paul Cézanne, Pot of Primroses and Fruit, c. 1888–90. Oil on canvas, 46 × 56.25 cm. London: The Courtauld Gallery (Samuel Courtauld Bequest 1948). Photo: © The Samuel Courtauld Trust.


system are ambiguous. So it could be that it uses orthogonal projection to render a

bird’s-eye view of the contours of the table and the surrounding delegates, but it can

just as easily be seen as employing enclosure to articulate the same items in the form

of regions. It is unclear in any case whether we should interpret it according to the

rules of view-based or object-based systems.

Schier’s diagram is fit for purpose nonetheless. One reason for this is that we

do not require diagrams to be consistent throughout in how they represent things.

But, according to Willats, we also find images of this kind acceptable because we are

accustomed to the fact that there are ‘degrees of grammaticalness’ in pictures, just

as there are in normal sentences according to Chomsky.87 Willats also argues

further that we do not simply put up with ill-formed icons, but sometimes relish the

way they bend the rules, just as we enjoy grammatical play in poetry.88

Willats gives numerous examples of such creative play in pictures. He shows, for

example, how Klee creates a range of meanings by flouting Huffman’s rule that a line

indicating a leading edge along one section of its course must not represent the fleeing

edge of the same surface along another.89 Willats also gives numerous examples of

the playful use of the anomaly known as ‘accidental’ or ‘false attachment’,90 which

describes what happens when lines belonging to objects located at different depths

are allowed to join up, or to run into one another, with the result that they appear

to lie in the same plane. Cézanne exploited false attachment to considerable effect

20 Paul Cézanne, Still Life with Commode, c. 1887–88. Oil on canvas, 62.2 × 78.7 cm. Cambridge, MA: Fogg Art Museum, Harvard University Art Museums (Bequest from the Collection of Maurice Wertheim, Class 1906). Photo: The Bridgeman Art Library.


in his Victor Chocquet Seated of 1877 (plate 18), where a number of lines converge upon

the sitter’s wrist, making it look as though the planes to which they correspond are

‘pinned’ together at this point.91 It needs emphasizing, however, that pictures of this

sort are not completely ungrammatical, as they can only perform their tricks if they

are grammatical for the most part.92

A more radical subversion of pictorial grammar employed by Cézanne and

the cubists is effected by the device known as ‘passage’, which describes their use

of marks interposed between or across occluding contours. Passage is particularly

evident in the area immediately to the right of the right edge of the pear in the lower

centre of Pot of Primroses and Fruit of c. 1888–90 (plate 19), where it elides the transition

between the base of the pot and the tabletop. Passage is significant because it is

unclear what kind of device it is. It might at a pinch be seen as a lexical constituent

of the painting that Cézanne used in order to register the vacillation edges undergo

when fixated;93 but it can just as easily be regarded as a ‘functional’ constituent of

the work that Cézanne used more synthetically to play down the breaks in depth

that contours normally create. If so, it only has meaning in so far

as it intervenes in the relationships between the picture’s lexical items, just as the

‘functional’ constituents of a sentence (conjunctions, prepositions, ‘determiners’ like

‘the’ and ‘a’, and ‘complementizers’, including auxiliaries and ‘modals’ that modify

verbs) only contribute to its sense by specifying relations between its lexical parts.

But although words that are ambiguous between lexical and functional constituents

are largely absent from language, this does nothing to threaten the idea that passage

is a form of creative ungrammaticality. It merely points to the fact that pictorial

grammar is different from its linguistic counterpart.

Cézanne also exploited higher-level grammatical anomalies by employing

more than one projection system in the same work. Willats argues that Cézanne

did this in his Still Life with Commode of 1887–88 (plate 20) by employing one form of

projection approximating to vertical oblique for the table at the front and another

approximating to horizontal oblique for the commode behind it. The two systems

are not massively anomalous, however, especially since Cézanne used neither strictly,

and may in fact have been using a disguised form of oblique projection for the

commode.94 So the picture is not just an ill-formed icon but a work that ‘warps’ space

for expressive effect.95

Adults’ pictures can couple projection and denotation systems anomalously, just

as children’s do. Willats shows, for example, that van Gogh resorted to combinations

of this kind when he used vanishing-point perspective in conjunction with regions.96

Arguably, however, Cézanne used more complex and ambiguous combinations,

employing forms of parallel projection that render reasonably convincing views

while preserving aspects of the objective character of shapes, and employing

primitives in the form of areas of contrasting tone and colour that serve equally well

as contours denoting edges, or the boundaries of regions denoting three-dimensional

shapes. In Still Life with Commode, at any rate, loosely defined parallel projection systems

appear hand-in-hand with ambiguous contrasts of this sort, creating an expressive

ambiguity about whether the scene is a view or a more objective depiction.97

The expressive potential of parallel drawing systems is not analysed in great

detail by Willats, but it is implied by his equivocation over their nature, which

he characterizes as view-based in some places and as object-based in others. It is

apparent, in any case, that such systems are inherently ambivalent. For one thing, a

drawing in parallel projection produced from memory is, by transforming an object-

centred description, normally plausible as a view as long as it observes the rules of


occlusion. For another, parallel projection preserves the to-and-fro that occurs within

perceptual acts, between the object-centred description and views, or how we both

infer object-centred descriptions from views and use them to fill out our sense of

the objects that we view. It would therefore seem that it is their very ambivalence

that makes parallel projection systems expressive. At any event, Cézanne seems to have

exploited their dual nature to such ends, capitalizing on this duality’s capacity to

produce a sense of shape in something approximating to its plenitude.

The coherence and usefulness of the idea of pictorial grammaticality

notwithstanding, it has to be admitted that language possesses a far richer variety

of constituents than pictures. In the first place, its ‘lexical’ constituents fall into

a number of classes and can perform a wide variety of roles, most of which are

impossible for the narrow range of constituents to be found in pictures.98 Language

also possesses multifarious functional constituents, which can generate complex

syntactic relationships between its lexical parts, whereas nothing of the sort applies to

pictures. Wollheim is therefore on firm ground when he claims that pictures cannot

be segmented ‘into parts that can be categorised … according to the contribution

they make to the meaning of the whole’,99 or that we have no need to classify the

‘basic units’ of pictures into ‘general categories’ equivalent to ‘noun, verb, adverb’

etc. But such objections do nothing to undermine the idea of pictorial grammar

per se since parsing a picture only ever need involve distinguishing a few kinds of

primitives. And even these would seem to have some lexical diversity, and perhaps also

at least a suggestion of functional capacity.

4. Innateness

A key assumption in Willats’s account, which sharply differentiates it from

conventionalist alternatives, is that pictorial competence is innately grounded in

the same way as language according to Chomsky.100 Chomsky argues that children

develop the ability to understand the grammar of their own language because they

are born with ‘universal grammar’, or the basic generative principles underlying the

possibility of all actual grammatical structures. Hence, simply through exposure

to their native language, they can parse and understand what they hear, and

begin to generate grammatical sentences.101 Put another way, Chomsky argues

that children learn their own language spontaneously, by acquiring ‘I-language’,

or the ‘internalized language’ consisting of the rules governing all of its possible

sentences.102 In short, then, it is an innate linguistic competence that grounds

children’s ability to make ‘infinite use of finite means’.103 And, for Willats, the same is

true of their ability to draw.104

By way of evidence, Willats shows that a child who has learned to use a particular

drawing system to represent one kind of object already has a more general command

of the system, and so can use it inventively – even when it is enclosure – to depict

an enormously wide range of shapes.105 He also points out that children come

to command more and more complex rules of depiction as they mature,106 just

as they deploy increasingly complex grammatical rules in their speech. This is

implicitly because an increasingly sophisticated command of grammar manifests

itself spontaneously in their brains as these develop. This conclusion is supported

by the fact that children’s drawing competence develops in a series of stages whose

sequence is normative. It appears, at any event, that they normally begin to draw

using regions within enclosure to represent volumes, then move on to using more

refined regions, typically lines, to represent cylindrical volumes such as limbs, and

afterwards adopt the use of contours to denote edges within an orthogonal projection

system. Moreover, it is only once having mastered this system that children learn

to command rudimentary parallel projection systems, and eventually learn to

use oblique projection proper when they reach artistic maturity. By corollary,

Willats seems to imply that children will not learn vanishing-point perspective

spontaneously, but must be taught its rules,107 which would explain why it appears

so late in ontogenetic development, when it does at all. The point to emphasize is

that this sequence is apparent in many cultures.108 Willats shows, for example, that

most Western children’s drawing of the human figure almost always progresses from

‘tadpole’ figures to ‘stick’ figures, and then to figures drawn in contours (plate 21).

And he even demonstrates that many of the drawings that researchers elicited from

children of the Jiri valley in Papua New Guinea, who were wholly unused to the

practice of drawing, exhibit a similar morphology (plate 22).

21 Drawings of a man made by Western children of successively greater ages, Florence L. Goodenough Collection, Penn State University Archives, Pennsylvania State University Libraries. Photo: With permission. Figure 13.7 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997.

22 Drawings by children from the Jiri valley in Papua New Guinea, adapted from figure 13.7 from John Willats, Art and Representation: New Principles in the Analysis of Pictures, Princeton, NJ, 1997. © Princeton University Press, 1997. Photo: Reprinted by Permission of Princeton University Press. Originally from figures 2 and 3 from Margaret Martlew and Kevin J. Connolly, ‘Human figure drawings by schooled and unschooled children in Papua New Guinea’, Child Development, 67: 6, December 1996, 2743–62. Photo: By permission of John Wiley & Sons.

23 A drawing of a scene in oblique projection that includes the transparency error, from John Willats, ‘How children learn to represent three-dimensional space in drawings’, in G. Butterworth, ed., The Child’s Representation of the World, London, 1977, 189–202. Photo: With kind permission of Springer Science and Business Media.

The fact that children’s ability to understand speech and drawings greatly

outstrips their ability to produce them may, by contrast, seem to undermine the

argument that innate grammar underlies both kinds of competence. It simply goes to

show, however, that producing output in either form is complicated by the need to

co-ordinate grammatical competence with performative, and predominantly motor,

skills.109 Indeed, Harrison has argued that the capacity of most people to recognize

pictures of far greater sophistication than they can produce can only be explained by

the existence of a ‘generative grammar’ of the pictorial that we all share.110

The fact that children characteristically make errors in both their speech

and their drawings could also seem to contradict the idea of innate grammatical

competence. But as far as speech goes, it is clear that many of their errors are

grammatical in the sense that they occur when a child applies a rule that works

perfectly well for one situation to another for which it is inappropriate, for instance

when they add ‘ed’ to the root of an irregular verb to turn it into the past tense.111

Willats argues in a similar vein that some of the errors in children’s drawings can

be attributed to misapplications of grammatical rules.112 An example might be the

‘transparency error’, which children commit when they draw the line denoting

the edge of a solid object which is occluded – as happens in plate 23, where the line

denoting the top edge of the table top is visible through the box in front.113 Such

errors can at any rate be seen as the result of a failure on the part of the child to grasp

how the particular task in hand demands a modification to the rules for drawing

shapes she had previously acquired. What is more, the fact that children often learn

to correct such mistakes themselves without having to be taught the error of their

ways suggests that they possess an innate sense of what can count as grammatical

drawing, just as they can work out what constitutes grammatical speech.114 And

indeed, in this drawing, the child spontaneously added shading to obscure the

illegitimate contour.
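
The over-regularization error in speech described above can be made explicit in a short sketch. It is offered only as an illustration under stated assumptions, not as a model proposed by Chomsky, Pinker, or Willats: the toy lexicon `IRREGULAR_PAST` and the functions `regular_past`, `child_past`, and `adult_past` are hypothetical names introduced here. The sketch shows how a rule that works perfectly well for regular verbs yields an error when it is applied to an irregular root, and how a stored exception blocks the rule.

```python
# Hypothetical toy lexicon of irregular past tenses (an assumption for
# illustration; a real lexicon would be far larger).
IRREGULAR_PAST = {"go": "went", "hold": "held", "break": "broke"}


def regular_past(verb):
    """The general rule: add 'ed' to the root ('d' after a final 'e')."""
    return verb + ("d" if verb.endswith("e") else "ed")


def child_past(verb):
    """An over-regularizing speaker applies the general rule to every verb."""
    return regular_past(verb)


def adult_past(verb):
    """A mature speaker lets a stored irregular form block the general rule."""
    return IRREGULAR_PAST.get(verb, regular_past(verb))


if __name__ == "__main__":
    for verb in ("walk", "go", "break"):
        print(verb, child_past(verb), adult_past(verb))
```

On this sketch the child's ‘goed’ is not a random mistake but the regular output of a rule, which is the sense in which such errors are ‘grammatical’.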

Even the almost complete immunity children exhibit towards adults’ attempts

to correct their errors can be attributed to their developing command of grammar.

They will, in other words, normally only elect to use the conventional norms when

their competence has reached the point where they are ready to do so.115 The broader

significance of the argument that competence is innately grounded is that it makes

sense of facts that naturalists like Schier acknowledge but cannot explain. So, for

example, Schier acknowledges that our ability to recognize a wide range of pictures

is ‘generative’, in the sense of productive, and that our capacity to produce a wide

variety of icons exhibits ‘generativity’ as well,116 rather as Wollheim seems to,117 but

denies that these competences are grounded in any innate grammar. This means that

his account offers no explanation of what makes generativity possible. But Willats’s

argument that we are born with the capacity to map shapes and their relations fills

the explanatory gap perfectly.

5. Transformation

It is apparent from the general drift of Willats’s account that his conception of the

mapping rules implemented in pictures is indebted to Chomsky’s thinking. His use

of the word ‘transformations’118 to describe the forms in which pictures render their

‘deep’ perceptual content also makes a clear allusion to Chomsky’s early idea that

a sentence transforms a deep grammatical structure into a surface structure. To all

intents and purposes, therefore, Willats would seem to regard pictorial grammar as

analogous to the universal grammar that Chomsky envisaged in the form of a set of

algorithms determining the ‘transformations’ speech makes out of its underlying

contents. Admittedly, this conception of grammar

begs some intractable questions about what exactly

the substrate to speech consists of.119 But if it is

reasonable to assume that language maps some pre-

articulate linguistic equivalent of thought, or even

thought itself, into a public form,120 then by analogy

it seems reasonable to assume that pictures transform

shape percepts into a form that makes them publicly

available.121 This idea – or its principle – finds support

in the fact that computer programs capable of mapping

shape descriptions into visual form are readily

written.122
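
The point that such programs are readily written can be illustrated with a minimal sketch. It is not taken from Willats or from the programs cited in note 122, and everything named in it is an assumption made for illustration: the functions `box`, `oblique_project`, and `draw`, the 45-degree receding angle, and the foreshortening ratio of 0.5. It takes an object-centred description of a box (vertices and edges) and maps it into picture lines by oblique projection, so that edges in the scene are denoted by lines in the picture.

```python
import math


def box(width, height, depth):
    """Hypothetical object-centred description: 8 vertices and 12 edges."""
    vertices = [(x, y, z)
                for x in (0.0, width)
                for y in (0.0, height)
                for z in (0.0, depth)]
    # Two vertices are joined by an edge when they differ in one coordinate.
    edges = [(i, j)
             for i in range(8) for j in range(i + 1, 8)
             if sum(a != b for a, b in zip(vertices[i], vertices[j])) == 1]
    return vertices, edges


def oblique_project(point, angle_deg=45.0, scale=0.5):
    """Map a 3D point to the picture plane by oblique projection.

    Front faces (z = 0) keep their true shape; depth is shown by shifting
    points along a receding direction at a fixed angle.
    """
    x, y, z = point
    angle = math.radians(angle_deg)
    return (x + scale * z * math.cos(angle),
            y + scale * z * math.sin(angle))


def draw(vertices, edges, angle_deg=45.0, scale=0.5):
    """Return the picture as a list of 2D line segments, one per edge."""
    points = [oblique_project(v, angle_deg, scale) for v in vertices]
    return [(points[i], points[j]) for i, j in edges]


if __name__ == "__main__":
    vertices, edges = box(2.0, 1.0, 1.0)
    for start, end in draw(vertices, edges):
        print(start, end)
```

Because every receding edge is shifted by the same offset, edges that are parallel in the scene stay parallel in the picture, which is the defining property of the parallel projection systems discussed in the text.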

A more specific similarity between the

transformational work done by pictures and language

is that both employ forms of ‘deletion’ when they

convert their deep content into surface output.

Hence, the surface structure of the sentence, ‘A

wise man is honest’, is more concise than the deep

structure it corresponds to (plate 24), since, typically, it

eliminates syntactic constituents that its counterpart

makes explicit.123 Notwithstanding, deletion of this

kind has no serious deleterious effect on the sense

of most sentences. Something similar, although

not identical, is true of deletion in line drawings,

which only inform us about the edges of objects

and their relations at the cost of excluding semantic

information about the shape of objects provided

by shading. They can, as a result, be a little unclear

about the contiguity of, or distance between, objects,

compared with fully elaborated drawings employing

cast shadows.124 But most of the time, line drawings

provide perfectly good working representations of

objects and scenes, perhaps because they map what

Marr calls the ‘2½D sketch’, or the skeletal preview of

a scene that we construct for the purposes of grasping

its main features prior to forming it in detail.125 In

short, deletion allows both forms of output to be more

succinct than the underlying contents they transform,

enabling us to articulate meaning efficiently. And,

although he opposes the idea of pictorial grammar

altogether, Wollheim nevertheless seems to confirm

this conclusion since he subscribes to the idea that

deletion of a kind described by Chomsky operates

in pictures, arguing more specifically that it plays a

role within ‘thematization’, or the process whereby

the artist gives salience to particular features of the

eventual painting at the expense of features that were

significant in its earlier stages.126

24 Tree diagram of the deep structure of the sentence, ‘A wise man is honest’, from Noam Chomsky, ‘Linguistic contribution: present’ [1967], in Language and Mind, Cambridge, 2006 [1968], 26. Photo: © Cambridge University Press.

25 Example of stimulus objects in an experiment on the perception of degraded objects, figure 16 from Irving Biederman, ‘Recognition-by-components: A theory of human image understanding’, Psychological Review, 94: 2, 1987, 115–47. © 1987, American Psychological Association. Photo: Reprinted with permission.

The kinds of deletion at work in language and

pictures are nevertheless different inasmuch as it

is syntactical elements that are most readily abbreviated in the one and semantic

elements in the other. Hence, while speech from which syntactic information has

been deleted is normally clear, removing the same kind of information from line

drawings can have a catastrophic effect. Biederman has produced drawings, for

example, which show that while it is perfectly possible to remove a segment of a

continuous line without making it particularly difficult to grasp the shape it implies,

eliminating a vertex or cusp with the function of specifying the relationships

between the adjoining edges of the same object can sometimes make it almost

impossible to decipher its shape (plate 25).127 Hence, and although they are meant

principally to illustrate our perception of objects, his drawings demonstrate that

pictures of mundane objects and not just faces must possess a ‘minimal syntax’.

Transformational rules are also among the things that allow language the capacity

for ‘movement’,128 which means that it is possible to change the sense of some

sentences by altering the order of their constituent phrases and words. Movement

thus allows us to turn a statement into a question or a modal expression such as a

surmise or speculation with ease, whereas pictures can do nothing of the kind.129 So

too, language can embed phrases within a sentence ‘recursively’ – most obviously

perhaps in The House that Jack Built, where ‘that’ is used in each iteration to embed

another phrase within the sentence enunciated – while nothing strictly analogous is

possible in pictures, as Wollheim rightly points out (pictures within pictures being

most closely analogous to parentheses). But neither of these limitations implies that

pictures have no grammar at all. Rather, and once more, they merely show that

pictures use a relatively simple grammar appropriate to their particular content.

Nature vs Nurture

Any language must be at least potentially accessible to its users, since a logically ‘private

language’ is not a language at all.130 One way of explaining how all languages are

capable of being understood in principle – provided their vocabulary is learned – is

that they transform the same underlying content according to rules derived from an

innate, universal grammar. Language is thus in a very strong, and significant, sense

‘natural’ in the last instance, rather than purely or wholly ‘conventional’. Schier

evidently fails to grasp this argument. But Willats is almost equally obtuse when he

argues that language is ‘conventional’ unlike drawing.131 It is, of course, true that

words are unlike picture primitives inasmuch as the form of any primitive will map

some of the properties of what it stands for. Hence a line which is darker than the

figure it encloses will map a contour more effectively than one that is lighter.132 It

is also the case that while most sentences do not mirror the structure of the events

they refer to, any regular drawing system will preserve at least some of the spatial

relationships obtaining in the scene it represents. However, neither of these differences

means that language is wholly conventional while depiction is wholly natural. Rather,

the difference between language and depiction is less substantive, and narrower.

As regards the role of convention in language, and setting aside the issue of

the lexicon, the most obvious way in which convention determines the surface

structures of any particular language is that it sets its many parameters arbitrarily.

Convention can nevertheless only set parameters that universal grammar makes

available, and it must set them according to the small set of options it allows. This

means that many of the widely divergent forms of different languages can be

attributed to the different ways in which they happen to set the same parameters.

The word orders of English and Japanese, for instance, are grossly different,133 but

the structures of both are consistent with the possibilities sanctioned by universal

grammar. And, by corollary, only very few languages have the form OVS, while OSV

is almost non-existent, which suggests that universal grammar makes them very

difficult, or nigh-on impossible, to generate.

By analogy, or just as the widely different ‘particular grammar(s)’ of all

languages are generated by the same innate rules in the last instance,134 so the

diverse surface pictorial ‘grammars’ sanctioned by the various academies and

different cultures all express rules consistent with universal pictorial grammar.

This suggests that convention gets into depiction by a more elliptical route than we

commonly imagine, by deciding which of the rules made available by universal

pictorial grammar will be used in practice. So, for example, while a particular

culture’s preference for certain forms of drawing and denotation systems is a matter

of convention,135 all drawing systems are still ultimately generated by universal

pictorial grammar. By the same token, if it is convention that sanctions the use of

widely different kinds of mark among different cultures and groups for establishing

the same picture primitives, it is universal pictorial grammar that permits a line

to stand for an edge or a luminous spot to refer to a point in space. Moreover,

although it may be convention that decides what counts as an acceptable degree of

grammaticality in pictures, it is universal pictorial grammar that makes it possible to

speak of pictorial grammaticality at all.136

Language and depiction are therefore similar in the respect that convention

shapes the output they generate on the basis of innate universal grammar. But

they are different inasmuch as pictorial form is closely constrained by its innate

grammar to conform to a limited set of norms whereas language can assume a wide

multiplicity of particular, and extremely complex, grammatical forms. Put more

prosaically, the effect of convention on pictures is small beer compared to its effect

upon speech. So it is that the vast majority of pictures are transparent to people of all

cultures, whereas the different languages are relatively opaque to non-native speakers

irrespective of the question of unfamiliar vocabulary.

Conclusions

It is fair to say then, that one major achievement of Willats’s account of representation,

suitably modified, is that by furnishing a conception of pictorial grammar as

something to which nature and convention both contribute it can relieve the profound

antinomy between pictorial syntax and iconicity that dogged earlier debates about

representation. But perhaps the most general conclusion to be drawn from Willats’s

work is that grammatical structure is central to how, and what, pictures articulate.

This basic idea has many possibilities. One is that a more developed theory of

pictorial grammar might allow us to specify more closely just what pictures can, and

cannot, achieve by pictorial means alone, apart from brute depiction. For example, it

still needs to be decided whether a picture can represent a particular object, or kind

of object, by showing a cluster of visible properties alone, or whether it can only do

so by recruiting concepts from language. By corollary, the same sort of analysis might

make it clearer how pictures are dependent on the services of concepts when they

show particular aspects of things,137 and on more complex alien structures imported

from language when they make statements,138 or tell stories.139

Closer attention to the constraints imposed by grammatical structure might

also produce a different, or at any rate a more complete, understanding of why

some pictorial forms evolved as they did than is presently available. So, for instance,

although the composite forms used to articulate the human figure in antique Greek

vase painting are readily described as conventional schemata, it might enrich our

understanding of them to appreciate how they are determined by the orthogonal

projection system demanded by curved, rotatable, surfaces that do not readily tolerate

more sophisticated drawing systems dependent on a fixed viewpoint.

Another possibility of a theory of pictorial grammar is that it might reveal how

mapping rules apply to colour, as well as to drawing, in painting. More specifically,

it might be capable of expressing what kind of regularities must obtain between

colours in a representational painting if they are to stand successfully for colours

in the world, which they cannot do punctually because chromatic effects like

contrast,140 and chromostereopsis, operate differently on a flat surface from how

they operate in depth.

Willats’s radical idea that the grammar of projection makes all pictorial space sui

generis might be extended, as well, into an analysis of the space produced by artists’

individual styles of drawing, particularly if combined with his idea of how artists

disguise their drawing systems and play with the rules of concatenation. An analysis

of this sort might reveal, for example, how Cézanne’s peculiarly elastic space lends

itself to the expression of our tendency inside acts of seeing to probe the world

before us for the meanings it has for our potentially grasping physical hands and

mobile body.

Willats’s more specific ideas about the more basic semantic constituents of

pictures have a similar potential inasmuch as they might explain how an artist’s

individual style is closely dependent on such things as the particular manner in

which it articulates pictorial features like edges. More particularly, the notion taken

from Frédo Durand that pictorial marks bestow ‘attributes’, including colour, tone,

transparency, texture, thickness, ‘wiggling’ and orientation upon lines and other

picture primitives, might start to explain why some artists’ way of using marks gives

the objects and spatial relations in their pictures a characteristic look of their own.141

In sum, Willats’s account of pictorial grammar is pregnant with possibilities for

specifying the historical and aesthetic particularity of pictures and for explaining

how their different forms came about, because it offers a way of analysing pictorial

structure more precisely than previously, and at the same time vividly demonstrates

just how, and to what extent, this structure is responsible for the meanings and effects

that pictures produce.

Notes

This article owes much to the writings of and conversation with

several colleagues and friends, especially John Willats, Jason

Gaiger, Andrew Harrison, Michael Podro, and Richard Wollheim.

I am also indebted to the J. Paul Getty Trust for a scholar grant

which allowed me to pursue the research it presents.

1 On ‘capacities’, see Noam Chomsky, Rules and Representations, Oxford,

1980, 4–5.

2 Throughout, ‘picture’ is used in the strict sense to mean

‘representational picture’.

3 Richard Wollheim, ‘Reflections on Art and Illusion’, in On Art and the Mind,

London, 1973, 261–89, esp. 266–7.

4 Noam Chomsky, Syntactic Structures, The Hague, 1957, and Aspects of the

Theory of Syntax, Cambridge, MA, 1965.

5 For an outstandingly lucid account of Chomsky’s thinking organized

according to its various phases, see Vivian Cook and Mark Newson,

Chomsky’s Universal Grammar: An Introduction, third edn, Oxford, 2007. For a

perceptive account of his theories from a broader perspective, see Neil

Smith, Chomsky: Ideas and Ideals, second edn, Cambridge, 2004.

6 We can do this by applying a small number of overarching ‘principles’

governing the broad possibilities of sentence and phrase construction,

and a series of ‘parameters’ governing word and phrase order, which

together constitute a relatively simple hierarchical system capable of

generating all the specific rules required. See Smith, Chomsky, 60–94.

7 John Willats, Art and Representation: New Principles in the Analysis of Pictures,

Princeton, NJ, 1997, xii.

8 John Willats, Art and Representation, 279. For a review of the literature,

see W. F. Miller and A. C. Shaw, ‘Linguistic methods in picture

processing: A survey’, AFIPS Joint Computer Conferences: Proceedings of the

December 9–11, 1968 (fall joint computer conference), part 1, 279–90.

9 John Willats, Making Sense of Children’s Drawings, Mahwah, NJ, 2005, ix–x.

10 Willats, Art and Representation, xii and 19–20, and Children’s Drawings, x.

11 See David Marr, Vision: A Computational Investigation into Human Representation

and Processing of Visual Information, New York, 1982, 28–9 and 356–7, and

Noam Chomsky, Aspects of the Theory of Syntax, 14.

12 Criticisms of Chomsky’s early work often misconstrued what he

said, particularly about ‘deep structure’. See ‘Deep Structure’ [1975],

reprinted in Noam Chomsky, On Language, New York, 2007, esp. 171–2.

For an over-zealous attack on Chomsky’s method, see Robert D. Levine

and Paul M. Postal, ‘A corrupted linguistics’, in Peter Collier and David

Horowitz, eds, The Anti Chomsky Reader, San Francisco, CA, 2004.

13 Yve-Alain Bois, ‘The semiology of cubism’, in Lynn Zelevansky, ed.,

Picasso and Braque: A Symposium, New York, 1992, 169–221. Wollheim also

refers to Rosalind Krauss, ‘The motivation of the sign’, in Zelevansky,

Picasso and Braque, 261–305.

14 Richard Wollheim, Formalism and its Kinds, Fundació Antoni Tàpies,

Barcelona, 1995.

15 Bois, ‘The semiology of cubism’, 173. For an argument in favour of the

existence of the ‘conventional sign’ in painting almost contemporary

to cubism, see Teodor de Wyzewa, ‘Wagnerian Painting’ [1897], trans.

Paul Smith, in Charles Harrison and Paul Wood, eds, Art in Theory: An

Anthology of Changing Ideas, Oxford, 1992, 18.

16 Bois, ‘The semiology of cubism’, 173–4. By contrast Chomsky

argues that language has ‘infinite’ productive capacity, implying that

phonocentrism can only impose practical limits on what it can say.

See Chomsky, Theory of Syntax, 8. Cf. Steven Pinker, The Language Instinct:

The New Science of Language and Mind, London, 1995, 84.

17 Bois, ‘The semiology of cubism’, 174. Cf. Krauss, ‘The motivation of

the sign’, 262–3, and Rosalind Krauss, ‘In the Name of Picasso’ [1981],

in The Originality of the Avant-Garde and Other Modernist Myths, Cambridge,

MA, 1996, 34–7.

18 Cf. Andrew Harrison, ‘A minimal syntax for the pictorial: The

pictorial and the linguistic – analogies and disanalogies’, in Salim

Kemal and Ivan Gaskell, eds, The Language of Art History, Cambridge,

1991, 213, which argues that any account of linguistic or pictorial

meaning that takes no account of reference is ‘absurd’.

19 Ludwig Wittgenstein, On Certainty, Oxford, 1969, §160, and Anthony

Grayling, Wittgenstein, Oxford, 1996, 95–6.

20 Wollheim also attacks Krauss here. See Wollheim, Formalism and its Kinds,

28 and 38, note 12. Cf. the argument that cubism is not about ‘the

nature of the sign’ in Flint Schier, ‘Painting after art? Comments on

Wollheim’, in Norman Bryson, Michael Ann Holly, and Keith Moxey,

eds, Visual Theory: Painting and Interpretation, Cambridge, 1991, 154.

21 Richard Wollheim, Painting as an Art, London, 1987, 43–100. For a

succinct analysis of Wollheim’s position, see Harrison, ‘Minimal

syntax’, 220–4.

22 Wollheim, Formalism and its Kinds, 26–7. All subsequent quotations

from Wollheim are from this section of text. The essay is reprinted,

in an abbreviated and modified form, as ‘On formalism and pictorial

organisation’, Journal of Aesthetics and Art Criticism, 59: 2, Spring 2001,

127–37.

23 Cf. Flint Schier, Deeper into Pictures: An Essay on Pictorial Representation,

Cambridge, 1986, 150, which paraphrases Roger Scruton’s criticism

of Barthes’ semiotics for its inability to adduce a ‘grammatical

rule’ connecting the meaning of a system’s ‘significant parts’ to the

meaning of the ‘ensemble’.

24 Wollheim, Formalism and its Kinds, 1995, 37, note 10.

25 Schier, Deeper into Pictures, 67–8.

26 Schier, Deeper into Pictures, 67.




30 Chomsky, Theory of Syntax, vi. Cf. Max Clowes, ‘On seeing things’,

Artificial Intelligence, 2: 1, Spring 1971, 79–116, esp. 80.

31 Harrison, ‘Minimal syntax’, 216–17 and 225–7. For the view that

pictures cannot map appearances constituted by Gestalt and constancy

effects (inter al.), see E. H. Gombrich in ‘Mirror and Map’ [1975], in The

Image and the Eye, Oxford, 1982, 172–214.

32 Cited in Noam Chomsky, ‘On Cognitive Capacity’ [1975], reprinted in

On Language, 3–35, esp. 8.

33 Michael Podro, ‘Formal elements and theories of modern art’, British

Journal of Aesthetics, 6: 4, 1966, 329–38, esp. 335.

34 Richard Wollheim, ‘Form, elements, modernity: Reply to Michael

Podro’, The British Journal of Aesthetics, 6: 4, 1966, 339–45, esp. 344.

35 Richard Wollheim, ‘Giovanni Morelli and the origins of scientific

connoisseurship’, On Art and the Mind: Essays and Lectures, London, 1973,

200–1.

36 Goodman, Languages of Art: An Approach to a Theory of Symbols, Indianapolis,

IN, 1976, 133.

37 Goodman, Languages of Art, 131–41.

38 Goodman, Languages of Art, 225–32.

39 Curtis Carter, ‘Painting and language: A pictorial syntax of shapes’,

Leonardo, 9: 2, Spring 1976, 111–18, esp. 112–15.

40 Carter, ‘Painting and language’, 114–16.

41 Clowes, ‘On seeing things’, 84–7, and Willats, Art and Representation, 94

(on Guzman).

42 Willats, Art and Representation, 4 and 93–100.

43 Willats, Art and Representation, 8–9, 98–100, and 220, and Children’s

Drawings, 11 and 63–8.

44 The rule was first applied in Chomsky, Syntactic Structures, 26. Cf. Cook

and Newson, Chomsky’s Universal Grammar, 28–32. See Clowes, ‘On

seeing things’, 80–1 for the analogy with pictures.

45 On this, technical, sense of ‘generative’ (as synonymous with

‘explicit’) see Cook and Newson, Chomsky’s Universal Grammar, 35–6,

which prohibits its use as a synonym for ‘productive’. For a more

liberal view, see Neil Smith, ‘Chomsky’s science of language’, in James

McGilvray, ed., The Cambridge Companion to Chomsky, Cambridge, 2005,

21–41 and 295–6, esp. 296, note 4.

46 Willats, Children’s Drawings, 77 and 88.

47 Willats, Children’s Drawings, 82–97, and Art and Representation, 309–10, 356,

and 370.

48 Pinker, The Language Instinct, 159–61.

49 Irving Biederman, ‘Visual object recognition’, in Stephen Kosslyn

and Daniel Osherson, eds, Visual Cognition, vol. 2, 121–65, esp. 131 (on

‘natural parsing region[s]’ in the human figure) and 139 (on the geon

and parsing). Cf. Donald Hoffman and Manish Singh, ‘Vision: Form

perception’, in Lynn Nadel, ed., Encyclopedia of Cognitive Science, vol. 4,

London, 2003, 486–90.

50 Clara Casco and Daniela Guzzon, ‘The aesthetic experience of

“contour binding”’, Spatial Vision, 21: 3–5, 2008, 291–314.

51 John Willats, ‘The third domain: The role of pictorial images in

picture perception and production’, Axiomathes, 13: 1, March 2002,

1–15, esp. 11–12. Cf. Willats, Children’s Drawings, 178–9.

52 In his early television programmes, Rolf Harris used to draw in a

sequence of this kind with the intention of withholding the meaning

of the picture until it was almost completed.

53 David Huffman, ‘Impossible objects as nonsense sentences’, in

Bernard Meltzer and Donald Michie, eds, Machine Intelligence, vol. 6,

Edinburgh, 1971, 295–323.

54 Huffman, ‘Impossible objects’, 323 (where implicit reference is made

to Chomsky), and Clowes, ‘On seeing things’, 79–80. Cf. Willats,

Art and Representation, 29, 175, 272, 279 (citing Clowes), and 282, and

Children’s Drawings, 199.

55 Chomsky, Aspects of the Theory of Syntax, 148–9.

56 Huffman, ‘Impossible objects’, 301–4, and Clowes, ‘On seeing things’,

86–7. Cf. Willats, Art and Representation, 114–15, and Biederman, ‘Visual

object recognition’, 127–9.

57 Huffman, ‘Impossible objects’, 305, and Clowes, ‘On seeing things’,

79–80 and 87–91. Cf. Willats, Art and Representation, 118.

58 Clowes, ‘On seeing things’, 89, and Huffman, ‘Impossible objects’,

306–13.

59 Huffman, ‘Impossible objects’, 304–11, and Clowes, ‘On seeing

things’, 104–6. Cf. Willats, Art and Representation, 113.

60 On the analogy between drawings and maps, see Fred Dubery and

John Willats, Drawing Systems, London, 1972, 9, and Willats, Art and

Representation, 70–6.

61 Willats, Art and Representation, 10–13 and 369.

62 Willats, Art and Representation, 70–84, and Children’s Drawings, 68–70.

63 Willats, Children’s Drawings, ix–x.

64 Willats, Children’s Drawings, 77 and 88. The analogy is implicit in Art and

Representation, but is suggested in a citation from Clowes, ‘On seeing

things’, 273.

65 Smith, Chomsky, 79. Cf. Andrew Carnie, Syntax: A Generative Introduction,

second edn, Oxford, 2007, 19 and 23. Cf. Pinker, The Language Instinct,

234, which includes OVS among the rare forms.

66 Noam Chomsky, The Minimalist Programme, Cambridge, MA, 1995, 35,

and Theory of Syntax, 12–14, and 83–5.

67 King Sun Fu, Syntactic Pattern Recognition, Englewood Cliffs, NJ, 1982, 1–7.

68 Willats, ‘The third domain’, 5. This radical view finds some support

in Michael Podro, Depiction, New Haven and London, 1998, 9, which

argues that our sense of the work a line performs will affect the course,

and the outcome, of our visual experience of what it represents.



70 Schier, Deeper into Pictures, 67. Cf. 170.

71 Harrison, ‘Minimal syntax’, 229–30.


73 Doris Tsao and Margaret Livingstone, ‘Mechanisms of face

perception’, Annual Review of Neuroscience, 31, July 2008, 411–37, esp.

418–20.

74 Ludwig Wittgenstein, Lectures and Conversations on Aesthetics,

Psychology and Religious Belief, ed. Cyril Barrett, Oxford, 1966, 10, §4.

75 Harrison, ‘Minimal syntax’. The essay’s title may allude to

Chomsky, Theory of Syntax, 3, which refers to ‘minimal syntactically

functioning units’.

76 Harrison, personal communication.

77 Maurice Merleau-Ponty, ‘Cézanne’s doubt’, in Galen Johnson,

ed., The Merleau-Ponty Aesthetics Reader: Philosophy and Painting, Evanston, IL,

1993, 59–75, this quotation 66. See also The Phenomenology of Perception,

London, 1962, 322.

78 Robert Solso, Cognition and the Visual Arts, Cambridge, MA, 1996,

147.

79 Alan Kennedy, The Psychology of Reading, London, 1984, 126–39. I

am grateful to Katherine Shingler for this reference.

80 See Gotthold Lessing, ‘Laocoon’, in Selected Prose Works of G. E.

Lessing, London, 1879, chs IV, XVI, and XVII.

81 Willats, Art and Representation, 341–6, and ‘The rules of

representation’, in Paul Smith and Caroline Wilde, eds, A Companion to

Art Theory, Oxford, 2002, 417–18.

82 Schier, Deeper into Pictures, 66, note 1.

83 Willats, Art and Representation, 20–1, 40–1, and 170–4. Marr,

Vision, 313–17.

84 See Vicki Bruce, Patrick Green and Mark Georgeson, Visual

Perception: Physiology, Psychology & Ecology, fourth edn, Hove, 2003,

276–89, for succinct accounts of Marr’s (and Nishihara’s) theory of

the role played by object-centred co-ordinates in object recognition,

and of Biederman’s alternative.

85 Willats, Children’s Drawings, 16, 104–8, 122–3, and 146, and Art and

Representation, 93, 177–8, 182–5, and 316–17.

86 Willats, Art and Representation, 165–7.

87 Willats, Art and Representation, 29 (citing Chomsky, Theory of Syntax,

148).

88 Willats, ‘Anomaly in the service of expression’, Art and

Representation, 248–67, and Children’s Drawings, 18.

89 Willats, Art and Representation, 267.

90 Clowes, ‘On seeing things’, 104. Cf. Willats, Art and Representation,

30 (on Guzman) and 357, note 1 (on Huffman).

91 Meyer Schapiro, Paul Cézanne, New York, 2004, 62.


93 Paul Smith, Interpreting Cézanne, London, 1996, 46.

94 Willats, Art and Representation, 48 and 51, and Children’s Drawings,

197.

95 Cf. Merleau-Ponty, ‘Cézanne’s doubt’, 64, which offers some

dubious remarks about another ‘warped’ table in Cézanne, and

Huffman, ‘Impossible objects’, 312–13, on how ‘ungrammatical’

combinations of lines can lead to a ‘warped’ appearance.



98 This means that even grammatically simple sentences can

express abstract concepts like number, while pictures can only show

particulars, although this allows them to exemplify the sensuous

properties and relational properties of objects more fully than even

the most poetic language. See Nelson Goodman, ‘Art and inquiry’,

Proceedings and Addresses of the American Philosophical Association, vol. 41,

1967–1968, 5–19, esp. 12.

99 Richard Wollheim, ‘On pictorial representation’, in Rob van

Gerwen, ed., Richard Wollheim on the Art of Painting, Cambridge, 2001, 14.

Cf. Alex Potts, ‘Sign’, in Robert Nelson and Richard Shiff, eds, Critical

Terms for Art History, Chicago, IL, 1996, 22.

100 Willats, Children’s Drawings, 13, 15, 221.

101 Chomsky, Rules and Representations, 4.

102 See Smith, Chomsky, 28–32, and Cook and Newson, Chomsky’s

Universal Grammar, 13–19.

103 Chomsky, Theory of Syntax, v.

104 Willats, Children’s Drawings, 8, 78, and 170.

105 Willats, Children’s Drawings, 77–8.

106 Willats, Children’s Drawings, 19, 40, 233.

107 Willats, Children’s Drawings, 147, 171, 177, 216, and 235. Children, such

as Stephen Wiltshire, who draw in perspective spontaneously are

exceptional. See Oliver Sacks, An Anthropologist on Mars, New York, 1995,

185–7.

108 Willats, Children’s Drawings, 159, and Art and Representation, 289 and 311–15.

109 Cf. Willats, Art and Representation, 165, and Children’s Drawings, 37–9, 60–1,

and 76–7.

110 Harrison, ‘Minimal syntax’, 229.

111 Cf. Willats, Art and Representation, 175.

112 Willats, Children’s Drawings, 8–9.

113 Willats, Children’s Drawings, 179–80.

114 Willats, Children’s Drawings, 170, 181, 201, and 229.

115 Willats, Children’s Drawings, 172. Cf. Carnie, Syntax, 21.

116 Schier, Deeper into Pictures, 55, 66, note 1, and 151. Cf. Harrison,

‘Minimal syntax’, 221, which argues that ‘Wollheim clearly does

think that the pictorial is in some central sense “generative” despite

his opposition to “semiotic” theories of it’.


118 Willats, Art and Representation, 171–2 and 289.

119 See Chomsky, Theory of Syntax, 17–18 and 63–4, and The Minimalist

Programme, 223. Cf. Smith, Chomsky, 43–4.

120 Cf. Ludwig Wittgenstein, Philosophical Investigations, Oxford, 1953, §§319,

320, 329, and 335, and Pinker, The Language Instinct, 55–82.

121 Cf. Wittgenstein, Philosophical Investigations, 198e.

122 See Willats, Art and Representation, 171–2.

123 On the condition of deletion, see Chomsky, Theory of Syntax, 179–84,

and Language and Mind, 50–2. Cf. Cook and Newson, Chomsky’s Universal

Grammar, 280.


125 Willats, Art and Representation, 112 and 152–3. Marr, Vision, 268–94.

126 Wollheim, Painting as an Art, 23 and 359, note 16.

127 Irving Biederman, ‘Recognition-by-components: A theory of human

image understanding’, Psychological Review, 94: 2, 1987, 115–47.

128 Smith, Chomsky, 60–8, and Cook and Newson, Chomsky’s Universal

Grammar, 20–3 and 32–5.


130 See Marie McGinn, Wittgenstein and the Philosophical Investigations, London,

1997, 113–42.

131 Willats, Art and Representation, 146, 364, note 6, and Children’s Drawings,

13–14.


133 Pinker, The Language Instinct, 203–4.

134 Chomsky, Theory of Syntax, 6.

135 Cf. Willats, Art and Representation, 34 and 353–4, which emphasizes the

function and purpose to which the different systems are put.


137 See Victor Burgin, ‘Seeing sense’, in The End of Art Theory: Criticism and

Postmodernity, Basingstoke and London, 1986, 51–70, and Wittgenstein,

Philosophical Investigations, 193e–214e.

138 Schier, Deeper into Pictures, 120–5. Cf. Harrison, ‘Minimal syntax’, esp.

219. Cf. Wittgenstein, Philosophical Investigations, §244, and Hans Sluga,

‘“Whose house is that?”: Wittgenstein on the self’, in Hans Sluga and

David Stern, eds, The Cambridge Companion to Wittgenstein, Cambridge,

1996, 340.

139 Wollheim, Painting as an Art, 188–9.

140 Cf. Jean-Désiré Régnier, De la lumière et de la couleur chez les grands maîtres

anciens, Paris, 1865, esp. 31–8.

141 John Willats and Frédo Durand, ‘Defining pictorial style: Lessons

from linguistics and computer graphics’, Axiomathes, 15: 3, September

2005, 1–27, esp. 3–7, and 12–21.


Pictorial Grammar: Chomsky, John Willats, and the Rules of Representation

Paul Smith

This article argues that there is such a thing as pictorial

grammar, despite objections to the very idea by

philosophers. It proposes that John Willats developed

a theory of this grammar on the basis of Chomsky’s

early work, which demonstrates that pictures are

segmentable into semantic units, and that these

are organized by syntax into larger, grammatically

coherent structures. It also argues that pictorial

grammar, thus conceived, is innately grounded

and that it operates to transform an underlying

perceptual content into a surface form. The advantages

of this theory are that it allows us to specify how

pictures produce meanings both by obeying and by

transgressing grammaticality, and to reconfigure

our present understanding of what is conventional

or arbitrary, and what is natural and iconic, about

depiction. It concludes with a consideration of the

further possibilities of a theory of pictorial grammar.

Paul Smith is author of Seurat and the Avant-Garde (Yale, 1997), co-

editor of The Blackwell Companion to Art Theory (2002), and editor

of an annotated translation of Marius Roux’s novel, The Substance and the

Shadow (Penn State, 2007), whose central character is modelled on Cézanne.

He is also editor of the recently published volume, Seurat Re-Viewed (Penn

State, 2010), and is presently working on a phenomenological study of Cézanne’s

perspective and colour.
