
Listening to the Past

Audio Records of Accents of English

edited by

Raymond Hickey

Table of Contents

1 Analysing early audio recordings
Raymond Hickey

2 British Library sound recordings of vernacular speech
They were lost and now they are found
Jonathan Robinson

3 Twentieth-century Received Pronunciation: Prevocalic /r/
Anne Fabricius

4 Twentieth-century Received Pronunciation: Stop articulation
Raymond Hickey

5 London’s Cockney in the twentieth century
Stability or cycles of contact-driven change?
Paul Kerswill and Eivind Torgersen

6 The origins of Liverpool English
Kevin Watson and Lynn Clark

7 Tyneside English
Dominic Watt and Paul Foulkes

8 Scotland – Glasgow and the Central Belt
Jane Stuart-Smith and Eleanor Lawson

9 Early recordings of Irish English
Raymond Hickey

10 Evidence of American Regional Dialects in Early Recordings
Matthew J. Gordon and Christopher Strelluf

11 New England
Daniel Ezra Johnson and David Durian

12 Upper Midwestern English
Thomas Purnell, Eric Raimy, Joseph Salmons

13 Western United States
Valerie Fridland and Tyler Kendall

14 Analysis of the Ex-Slave Recordings
Erik R. Thomas

15 Archival Data on Earlier Canadian English
Charles Boberg

16 Canadian Raising in Newfoundland?
Insights from early vernacular recordings
Sandra Clarke, Paul De Decker and Gerard Van Herk

17 The Caribbean - Trinidad and Jamaica
Shelome Gooden and Kathy-Ann Drayton

18 Early recordings from Ghana
A variationist approach to the phonological history of an Outer Circle variety
Magnus Huber

19 Earlier South African English
Ian Bekker

20 Early twentieth century Tristan da Cunha h’English
Daniel Schreier

21 Open vowels in historical Australian English
Felicity Cox

22 Early New Zealand English: the closing diphthongs
Márton Sóskuthy, Jennifer Hay, Margaret Maclagan, Katie Drager and Paul Foulkes

23 The development of recording technology
Raymond Hickey

Index

Contributors

IAN BEKKER is Associate Professor at North-West University in Potchefstroom, South Africa, having previously spent ten years lecturing at Rhodes University, Grahamstown. His main research interest is the sociophonetics of South African English, from both a synchronic and diachronic perspective. His main current research project is focused on the role of Johannesburg in the formation of South African English.

CHARLES BOBERG is Associate Professor of Linguistics at McGill University in Montreal, Canada. His main research interests are in language variation and change, with a particular focus on Canadian English. He is a co-author, with William Labov and Sharon Ash, of the Atlas of North American English: Phonetics, Phonology and Sound Change (2006) and the author of The English Language in Canada: Status, History and Comparative Analysis (2010). His current research projects concern dialect variation in the phonological nativization of loan words and accent and dialect in North American film and television.

LYNN CLARK is a senior lecturer in Linguistics at the University of Canterbury, New Zealand. Her main research interests are language variation and change, sociophonetics and usage-based models of language. She has worked on several varieties of English including Scots and Scottish English, the English of Polish migrants, Liverpool English and, more recently, New Zealand English. Her work has recently appeared in English Language and Linguistics, Language Awareness, English World-Wide and Language, Variation and Change.

SANDRA CLARKE is a Professor Emerita of Linguistics at Memorial University of Newfoundland. Her research deals with social and regional variation, with particular focus on Newfoundland and Canadian English, as well as the indigenous Algonquian varieties of Labrador. Recent publications include Newfoundland and Labrador English (2010).
She is coordinator of the online Dialect Atlas of Newfoundland and Labrador English (2013), which documents regional variation in the traditional speech of the province.

FELICITY COX is Associate Professor in Phonetics in the Department of Linguistics at Macquarie University. Her main research interest is phonetic variation and change in Australian English. She is the author of Australian English: Pronunciation and Transcription (2012) and co-author of the Australian edition of Introduction to Language (Fromkin, Rodman, Hyams, Cox and Thornton, 2015). She has recently published in the Journal of the International Phonetic Association, Journal of the Acoustical Society of America, Journal of Child Language and Journal of Speech, Language and Hearing Research.

PAUL DE DECKER is Assistant Professor in Linguistics at Memorial University. His main research interests are in the areas of sociophonetics and dialectology, where he focusses on phonetic creativity in forms of quoted speech. He is co-investigator of the SSHRC-funded project Allophony in Newfoundland English: Production,
Perception and Variation, which uses ultrasound imaging to examine articulatory patterns in Irish-settled areas of Newfoundland. His research also covers data collection methods and analysis of vocalic variation in English.

KATIE DRAGER is Associate Professor of Sociolinguistics at the University of Hawai‘i at Mānoa. Her research focuses on sociophonetic variation and the link between social factors and linguistic variation during the perception of speech. Her recent publications have appeared in Journal of Phonetics, Language and Speech, and Language Variation and Change.

KATHY-ANN DRAYTON is a Lecturer in Linguistics at The University of the West Indies, St Augustine, Trinidad and Tobago. Her main research interests are the phonology of Eastern Caribbean Englishes and English Creoles, with a particular focus on prosodic systems. She recently completed her dissertation on the prosodic structure of Trinidadian English Creole. She is currently involved in a project investigating the sociolinguistic aspects of prosodic variation in Trinidadian English, especially as it relates to ethnicity and socioeconomic status.

DAVID DURIAN completed his PhD in linguistics at the Ohio State University. His work focuses heavily on language change and variation trends affecting vowels in US English among speakers born in the nineteenth, twentieth and twenty-first centuries. He is currently employed as an independent researcher and is the Principal Researcher at the Durian Linguistics Lab.

ANNE H. FABRICIUS is Associate Professor of English Language at Roskilde University, Denmark. She trained in linguistics in Australia and in Denmark. Her main research interests are variation and change in the elite sociolect of England and the UK, commonly referred to as Received Pronunciation or BBC English. She has published on this topic in English World-Wide, Language Variation and Change, Journal of Sociolinguistics and Journal of the International Phonetic Association.
She also works on sociophonetic methodological questions and has published in this area with Dominic Watt, University of York.

PAUL FOULKES is Professor in the Department of Language and Linguistic Science, University of York. He has previously held posts at the Universities of Cambridge, Newcastle, and Leeds, and in 2008 was a Visiting Erskine Fellow at the University of Canterbury, New Zealand. His teaching and research interests include forensic phonetics, laboratory phonology, phonological development, and sociolinguistics. He has worked on over 200 forensic cases from the UK, Ghana and New Zealand.

VALERIE FRIDLAND is a Professor of Linguistics at the University of Nevada in Reno, NV. A sociolinguist, her research focus is primarily sociophonetics. She is currently involved in research supported by the National Science Foundation measuring the production and perception of vowel changes in regional US dialects. Her recent publications include papers in the Journal of Phonetics, the Journal of the Acoustical Society of America, Lingua and American Speech.

SHELOME GOODEN is an Associate Professor of Linguistics at the University of Pittsburgh. Her research focuses primarily on the prosodic classification and intonational phonology of Caribbean (English) Creole varieties and secondarily on language and identity and sociocultural aspects of language use. She has published articles and book chapters on a variety of topics including the phonological and phonetic properties of reduplication, stress and intonation in Jamaican Creole, past tense marking in Belizean Creole, and language and identity in Pittsburgh African American English. Her most recent publication (with Jennifer Bloomquist) is on African American Language in Pittsburgh and the Lower Susquehanna Valley.

MATTHEW J. GORDON is Associate Professor of English at the University of Missouri-Columbia (USA). He studies language variation and change particularly in the context of dialects of American English. He is the author of Labov: A Guide for the Perplexed (2013).

JENNIFER HAY is a Professor of Linguistics at the University of Canterbury, and is Director of the New Zealand Institute of Language, Brain and Behaviour. She has published widely on topics relating to New Zealand English, sociophonetics, morphology and laboratory phonology. She oversees the Origins of New Zealand English corpora, and is currently working on compiling a large audio-visual corpus of Christchurch Earthquake Stories.

RAYMOND HICKEY is Professor and Chair of English Linguistics at the University of Duisburg and Essen. His main research interests are varieties of English, Late Modern English and general questions of language contact, variation and change. Recent book publications include Motives for Language Change (2003), Legacies of Colonial English (2004), Dublin English. Evolution and Change (2005), Irish English. History and Present-day Forms (2007), The Handbook of Language Contact (2010), Eighteenth-Century English (2010), Areal Features of the Anglophone World (2012), The Sound Structure of Modern Irish (2014) and A Dictionary of Varieties of English (2014).

MAGNUS HUBER is Professor of English Linguistics and the History of English at the University of Giessen. His main research interests are World Englishes, pidgins and creoles, (historical) sociolinguistics, dialectology, corpus linguistics, and historical linguistics. He has co-edited the Atlas and Survey of Pidgin and Creole Languages (2013) and a volume on The Evolution of Englishes. The Dynamic Model and Beyond (2014). He also works in the newly established field of colonial linguistics, researching contemporary documents for what they can tell us about the function, structure and development of languages in the colonial context.

DANIEL EZRA JOHNSON is a Lecturer in Language Variation and Change at Lancaster University (UK). He received his PhD in linguistics from the University of Pennsylvania with a study of the development of English in New England. His research interests include dialectology, syntactic variation, and quantitative methods.

TYLER KENDALL is Associate Professor in Linguistics at the University of Oregon. His research focuses on language variation and change, primarily in regional and
ethnic varieties of American English, using methods from variationist sociolinguistics, sociophonetics, and corpus and computational linguistics. With Valerie Fridland, he is engaged in a large-scale study of vowel production and perception in U.S. regional dialects. He is the developer of several web-based tools for sociolinguistic research, including the Sociolinguistic Archive and Analysis Project (http://slaap.lib.ncsu.edu/) and the NORM suite for vowel normalization and plotting (http://slaap.lib.ncsu.edu/tools/norm/). His recent publications include the book Speech Rate, Pause, and Sociolinguistic Variation: Studies in Corpus Sociophonetics (2013), as well as articles appearing in journals such as the Journal of Phonetics and Journal of the Acoustical Society of America.

PAUL KERSWILL is Professor of Sociolinguistics at the University of York. He previously held appointments at the University of Reading and Lancaster University. His research has focused on migration and dialect contact in Norway and Britain, including Bergen and the New Town of Milton Keynes. More recently he has collaborated with Jenny Cheshire on the emergence of Multicultural London English. His publications include work on the role of children in language change and the representation of youth language in the media. He has co-edited Dialect Change: Convergence and Divergence in European Languages (with Frans Hinskens and Peter Auer, 2005) and the Sage Handbook of Sociolinguistics (with Ruth Wodak and Barbara Johnstone, 2010). Currently he is working on language and social class in York.

ELEANOR LAWSON is a researcher in articulatory phonetics and sociophonetics at Queen Margaret University Edinburgh, and the University of Glasgow. She previously worked as a lecturer in phonetics at the Phonetics Laboratory, University of Oxford. Her research interests are rhotics, socially-conditioned articulatory variation and sound change. Her current research projects involve the analysis of Scottish rhoticity using ultrasound tongue imaging and the creation of an online ultrasound tongue imaging corpus of world-wide Englishes.

MARGARET MACLAGAN is a retired professor in phonetics and linguistics at the University of Canterbury where she currently holds an adjunct position. Her main research interests are sound change in New Zealand English and Māori and language change in Alzheimer’s disease. Recent publications include papers in the Journal of Phonetics and the Journal of the International Phonetic Association. Current research projects include the influence of New Zealand English on the sounds and rhythm of Māori.

SALLYANNE PALETHORPE is a senior researcher in both the Department of Cognitive Science and the ARC Centre of Excellence in Cognition and its Disorders at Macquarie University. Her main research interests are in acoustic phonetics and speech physiology. She has published journal articles and book chapters (with Felicity Cox) on Australian English, both past and present. She also provides an acoustic phonetic contribution to research and publications in cognitive science.

THOMAS PURNELL is Associate Professor in the English Department, University of
Wisconsin-Madison. His research and teaching examine the interface between phonetics and phonology with a focus on regional pronunciation. In particular, he is interested in the intersection of ethnically-affiliated social groups and sound systems of language.

ERIC RAIMY is the Chair of the Department of Linguistics and a member of the Department of English at the University of Wisconsin-Madison. His research is focused on modularity in grammar and the role that representations play in phonology. He is a member of the Wisconsin Englishes Project that documents and analyzes the local variation in English found in the state of Wisconsin.

JONNIE ROBINSON is Lead Curator of Spoken English at the British Library and responsible for the Library’s extensive archive of sound recordings of British accents and dialects. He has worked on two nationwide surveys of regional speech, the Survey of English Dialects and BBC Voices, and in 2010 co-curated the world’s first major exhibition on the English Language, Evolving English: One Language, Many Voices. He selects content for the Library’s online dialect archive (sounds.bl.uk) and created Sounds Familiar, an educational website that celebrates and explores regional speech in the UK. He is currently accessioning a substantial set of sound recordings (c. 15,000 voices from all over the world) made by visitors to the Library’s Evolving English exhibition and recently published Evolving English WordBank: A Glossary of Present-Day English Dialect and Slang (2015) based on an initial audit of the collection.

JOE SALMONS is the Lester W. J. ‘Smoky’ Seifert Professor of Germanic Linguistics at University of Wisconsin–Madison and is the co-founder of the Center for the Study of Upper Midwestern Cultures.
He is author of A History of German: What the Past Reveals about Today’s Language (2012, second edition in preparation), editor of Diachronica: International Journal for Historical Linguistics, and co-editor with Tom Purnell and Eric Raimy of Wisconsin Talk: Linguistic Diversity in the Badger State (2013). His work focuses on language change in the context of linguistic theory, drawing data especially from Germanic languages, including American English and heritage languages spoken in the US.

DANIEL SCHREIER is Professor in English Linguistics at the University of Zurich. His main research interests are variationist sociolinguistics, contact linguistics and English historical linguistics. Recent publications include the co-edited volumes The Lesser-Known Varieties of English (2010; 2015), Contact, Variation and Change in the History of English (2014) and Letter Writing and Language Change (2015). He is co-editor of English World-Wide: a Journal of Varieties of English.

MÁRTON SÓSKUTHY is Lecturer in Phonetics and Phonology at the University of York. His research looks at the emergence of sound patterns using computational modelling and corpus-based techniques, with special focus on English and Hungarian. He received his PhD in Linguistics from the University of Edinburgh in 2013; his thesis investigated the role of phonetic biases and systemic effect in the actuation of sound change.


CHRISTOPHER STRELLUF is Assistant Professor of English at Northwest Missouri State University. His research interests include language variation and change, dialectology, and composition pedagogy. His dissertation, ‘We have such a normal, non-accented voice: A sociophonetic study of English in Kansas City’, is available at http://catpages.nwmissouri.edu/m/cstrell/kc_speech.htm.

JANE STUART-SMITH is Professor in Phonetics and Sociolinguistics, and Director of the Glasgow University Laboratory of Phonetics (GULP) at the University of Glasgow. She has extensively researched accent variation and change in Glasgow and has been principal investigator on a number of projects funded by Leverhulme and the Economic and Social Research Council, including a long-term project on the influence of the broadcast media on language change, and an ongoing investigation of real-time change in Glaswegian across the twentieth century (Sounds of the City). Her publications include a monograph, Phonetics and Philology (2004), a co-edited volume, The Edinburgh Companion to Scots (2003), and articles on aspects of sociophonetics and sound change in a number of journals including Journal of Sociolinguistics, Journal of Phonetics, Journal of the International Phonetic Association, Laboratory Phonology and Language.

ERIK R. THOMAS is Professor of Linguistics at North Carolina State University. His main research interests are in sociophonetics, including variation in minority dialects. Recent publications include Sociophonetics: An Introduction (2011) and, co-edited with Malcah Yaeger, African American English Speakers and their Participation in Local Sound Changes: A Comparative Study (2010). He is currently working on a project on Mexican American English.

EIVIND TORGERSEN is an Associate Professor in the Department of English at Sør-Trøndelag University College.
He has worked on projects on Multicultural London English and language change in London, in particular modelling of phonological change and the use of spoken corpora in sociolinguistic research. Other research interests are in experimental phonetics and second language acquisition.

GERARD VAN HERK is Canada Research Chair in regional language and oral text and Associate Professor of Linguistics at Memorial University of Newfoundland. His main research interests are varieties of English, especially Newfoundland, African American, and Caribbean. He is the author of What is Sociolinguistics? (2012) and co-editor of Data Collection in Sociolinguistics (2013).

KEVIN WATSON is Senior Lecturer in Linguistics at the University of Canterbury, New Zealand. His main research interests centre on the sociophonetics of English, with a focus on varieties in north-west England and New Zealand. He has worked on phonological leveling, diffusion and divergence in Liverpool and two hinterland localities, and has established the Origins of Liverpool English corpus (OLIVE).

DOMINIC WATT is Senior Lecturer in Forensic Speech Science at the University of York, UK. His main research interests are forensic phonetics, sociophonetics, dialectology, and speech perception. Recent publications include Language, Borders and Identity (2014) and Language and Identities (2010), both co-edited
with Carmen Llamas, and the fifth edition of English Accents and Dialects (2012, with Arthur Hughes and Peter Trudgill). He has published articles in journals including Language Variation and Change, Journal of Language and Social Psychology, Journal of Sociolinguistics, and Language, and is currently working on a British Academy-funded project with Carmen Llamas and Tyler Kendall (University of Oregon) looking at the relationships between hesitation phenomena, conversation topic, and phonetic variation in the Accent and Identity on the Scottish/English Border (AISEB) corpus.

Preface

The history of audio recording goes back to the end of the nineteenth century when a technology was developed which allowed the preservation, albeit in poor quality, of the human voice on wax cylinders. This simple fact is of great interest to linguists as it provides an authentic record of many languages and of different varieties of English. The idea behind the present volume has been to examine these records for a representative cross-section of varieties throughout the English-speaking world.

I am grateful to the team of thirty-two scholars who came together to discuss the types of English available in audio recordings over the past hundred years and to present their results in the chapters of this book. The range of material presented is considerable and almost none of it has been subjected to linguistic scrutiny before. Hence the insights reached here are novel and hopefully useful to students and scholars who are concerned with variation and change across the anglophone world. In many cases the analysis of audio recordings has led to revisions of assumptions made about contemporary varieties before these recordings were examined. In this respect it is hoped that this book will stimulate renewed research into the roots of modern varieties of English and the paths they took in their development throughout the twentieth century.

In the preparation of this book I received great assistance from Prof. Merja Kytö, Uppsala University, who as series editor took particular care in reading the manuscript before it went to print. Helen Barton, commissioning editor for linguistics at Cambridge University Press, was, as always, a great source of assistance and encouragement and ready to answer any questions which arose in the course of the project.

Münster October 2015


1 Analysing early audio recordings

Raymond Hickey

1 Introduction

The common thread running through all the chapters in the current volume is the analysis of early audio recordings for varieties about which much is known from present-day data. These early recordings have been examined in the hope that they would shed light on the origin and background of many sound changes and developments which have been studied by linguists for decades, often by the authors of the chapters themselves. The audio material investigated, supported by relevant social and demographic data, can help unearth new information on phonetic, phonological and prosodic developments in the recent history of varieties.

Until now, audio data has not been available in large enough quantities to allow research on early pronunciation features across varieties of English. The systematic use made of the early recordings in the present volume has allowed the authors to go further back in time than has hitherto been possible. In a multifaceted manner, the studies in the book seek to provide a historical perspective on the present-day patterns of pronunciation by exploring the time depth made possible by early recordings. The authors have asked various research questions, for instance: How recent is a pronunciation feature? Which attested changes should be regarded as innovations? Which are historical continuations? A judicious use of early recordings can, despite all caveats (see below), help scholars reach conclusions solidly based on primary audio data.

2 Increasing the time depth for varieties of English

The use of early recordings as research material is not unproblematic and it is important to show an awareness of the pitfalls that may make it difficult to assess such material and compare it to later and present-day data. The necessary caveats are listed in Table 1.

Table 1. Caveats when using data from early recordings

1) The range of data for any one individual is limited. This is usually due to the length of the recording (normally only a few minutes at the most). Hence early recordings are practically only suitable for sound analysis; for grammatical analysis the recordings contain too little data.

2) Typical recordings consist of reading a set piece, as with the story of the prodigal son used by Wilhelm Doegen in his recordings of English prisoners of war in WWI. Free speech is rarely available in early recordings.

3) The quality of the recordings is generally not sufficiently good to carry out fully reliable acoustic analysis, although a certain amount is nearly always possible.

4) The available early recordings rarely show any social stratification so that a variationist analysis, in the modern sociolinguistic sense, is not possible.

3 Testing hypotheses about the development of varieties

By examining earlier recordings one can test different hypotheses about how key features of present-day varieties developed. Early recordings can also confirm overall trends in varieties of English. For instance, the rise of non-prevocalic /r/ is typical for supraregional varieties of American English in the course of the twentieth century, but a similar development is also found in Ireland. The two developments are not, of course, directly related. But in one sense there is a connection in that there was a distancing from English models of pronunciation in Ireland and from older models of pronunciation in the United States which, when broadly based on New England pronunciation, would have been non-rhotic. It is also significant that in both the United States and Ireland the increase in rhoticity in the twentieth century led to the phonetic preference for a retroflex [ɻ] which, acoustically, results in a very salient rhotic pronunciation.

The insights reached from the examination of early audio recordings can be classified according to their significance for present-day varieties. The value of the insights can be organised as a scale of decreasing importance as shown in the following table.

Table 2. Classification of insights from early recordings

Level 1: Early recordings reveal previously unattested features. This is not common, but an example would be the use of a rolled [r] attested in the recordings of Baroness Asquith and Virginia Woolf for earlier RP (Fabricius; Hickey, both this volume).

Level 2: Early recordings display combinations of features not previously attested. For instance, in early twentieth-century Irish English non-rhoticity and a monophthong in the GOAT lexical set are found together, a combination which does not occur anymore (Hickey, this volume).

Level 3: Early recordings have combinations of features not continued in a variety. In early twentieth-century Australian English a retracted START vowel and a fronted STRUT vowel co-occurred whereas these vowels converged later in the twentieth century in Australia (Cox and Palethorpe, this volume).

Level 4: Early recordings display features which help one to decide between alternatives for the development of features in a variety. For instance, in early South African English the retracted START vowel is nothing like as prominent as it is today, suggesting that this was an internal development in South Africa after initial anglophone settlement (Bekker, this volume).

Level 5: Early recordings confirm features known from later recordings and observations of present-day speakers. For instance, the early recordings of English from Tristan da Cunha document clearly the presence of non-etymological initial /h-/ which is known from studies of modern English on the island (Schreier, this volume).

Level 6: Early recordings confirm general patterns assumed for the development of varieties. For example, the speech of older speakers from Ghana analysed by Huber (this volume) shows that, for four key variables, Ghanaian English moved away from RP and more towards endonormative realisations of the variables in question, supporting the theoretical model of new variety development put forward in Schneider (2003, 2007).

4 Available early audio recordings

When determining what recordings are available for a language or dialect, one must distinguish between those which were made deliberately to record a specific variety of a language and those which were made for some other purpose, e.g. to save for posterity the voice of a famous person – an author, politician or member of royalty. The latter type of recording may incidentally capture a variety at a stage for which there are no other audio recordings.

For varieties of English virtually the only examples of the first type of recording are those made by Wilhelm Albert Doegen (1877-1967) in the early twentieth century. Doegen was a German scholar who had an interest in recording dialects and minority languages. He studied phonetics in Berlin and later in Oxford under Henry Sweet where he increased his knowledge of English and the anglophone world. He also became a member of the International Phonetic Association. Doegen’s original recordings of English dialect speakers were destroyed during World War II but shellac copies survived and in the 1990s the Humboldt University in Berlin started a project to digitalise this material.

Der Berliner Lautarchiv ‘The Berlin Sound Archive’ is a collection of early audio recordings, most from the beginning of the twentieth century, which document many languages and dialects spoken at that time. The recordings come from the Royal Prussian Phonographic Commission for which Doegen was a commissioner from 1915 onwards. Along with the Austro-German professor of English and renowned Shakespeare scholar, Alois Brandl (1855-1940), he recorded prisoners during World War I (Robinson, this volume), many of whom were speakers of English dialects.¹ Several of these recordings have been analysed for chapters in the present volume (Stuart-Smith and Lawson; Watt and Foulkes, both this volume).

¹ Many of Doegen’s recordings were acquired by the British Library and can be heard on their website in the section ‘Early Spoken Word Recordings’, see the information accessible at the following URL: http://sounds.bl.uk/Accents-and-dialects/Early-spoken-word-recordings. There is also a double CD set entitled Voices of the UK. Accents and Dialects of English which has been published by the British Library utilising many of Doegen’s recordings.

The second type – incidental recordings – is not always useful when studying dialects because the individuals recorded are generally standard speakers of their native language, or at least of supraregional varieties of that language. This is true of the incidental recordings for English in England (Fabricius, this volume) and for English in Ireland (Hickey, this volume). However, the investigation of non-vernacular varieties is of value in itself as these are subject to similar types of sociolinguistically motivated change as are vernaculars. Furthermore, these are the more standard forms of language to which vernacular varieties were related on a vertical social scale at the time of recording. The extant recordings of non-vernacular southern British English can provide insights into its form at the beginning of the twentieth century and hence help scholars track developments in standard British English in the past hundred years or so.

5 Structure of the current volume

The surveys of the early recordings presented here show a geographical range from Scotland to New Zealand, from Canada to South Africa, from Newfoundland to Tristan da Cunha. Certain varieties have been accorded greater attention than others, such as those in Britain and North America, but this is simply due to the attestational situation for them and their position in the overall arena of varieties of English.

The volume opens with a study by Jonathan Robinson in which he looks at the recordings of English prisoners of war made by the German scholar Wilhelm Doegen during WWI and housed at the British Library. These recordings are exactly one hundred years old and constitute a valuable record of English dialects at the beginning of the twentieth century. Robinson compares the dialect features captured in Doegen’s recordings with what is known of these dialects today, in view of continuous developments in vernaculars of British English.

The pronunciation norm of present-day England, Received Pronunciation, is the subject of the two following chapters. In the first, Anne Fabricius looks at the development of prevocalic /r/ and examines the evidence for a rolled /r/ among aristocratic speakers of early twentieth-century RP. In the second chapter Raymond Hickey considers the increase in aspiration for voiceless stops throughout the twentieth century by examining values in the earliest recordings of RP and by comparing these with successive recordings of English monarchs during the century.

The situation of Cockney English in the multicultural East End of London in the early twentieth century is the subject of the chapter by Paul Kerswill and Eivind Torgersen. Specifically, they consider the possible influence of Jewish varieties of English on syllable timing and voice onset time in forms of Cockney. This situation
is essentially different from that in present-day Cockney areas of London where many more languages are spoken and young language learners can choose from a feature pool frequently determined by various second-language speakers of English.

In their chapter on Merseyside Kevin Watson and Lynn Clark consider the origins of Liverpool English and examine recordings from the early twentieth century which are noticeably closer to the mid-nineteenth century when modern Liverpool English arose. They conclude that there was an Irish input for TH-stopping and probably for plosive lenition as well, though the latter is a complex issue and by no means a case of simple transfer. Additionally, the role of Lancashire English in the formative period of modern Liverpool English needs to be acknowledged.

Vernacular speech in Glasgow and the Central Belt as revealed in early recordings by Wilhelm Doegen (see above) is the topic of the chapter by Jane Stuart-Smith and Eleanor Lawson. They find confirmation of patterns of T-glottaling which are present in later recordings and in contemporary forms of speech in this central area of Scotland. The authors furthermore regard derhoticisation, in the light of the early twentieth-century recordings, as a change which has been progressing for a longer period of time than previously assumed.

The salient features of Irish English as spoken by individuals who were born in the nineteenth century and grew up before the south of Ireland became independent in 1922 form the focus of the chapter by Raymond Hickey. These persons show accents of Irish English which exhibit significant influence from British pronunciation models, e.g. in non-rhoticity and a low, open STRUT vowel, traits no longer found widely in varieties of Irish English given that these became increasingly independent of British English in the course of the twentieth century.

Evidence of American regional dialects in early recordings is considered by Matthew J.
Gordon and Christopher Strelluf in their chapter, in particular the North/Midland divide as attested by speakers born well over a century ago. The picture which emerges is, as the authors insist, mixed. There is, on the one hand, evidence of a distinction between DEW and GOOSE among some Northerners but, on the other hand, this contrast is also heard in the early recordings with some Midlanders, an unexpected finding. The authors conclude that key phonological distinctions now separating the North and the Midland arose during the course of the twentieth century though their roots go back further.

Vernacular speech in New England as evidenced by early twentieth-century recordings is scrutinised in the chapter by Daniel Ezra Johnson and David Durian. In particular, parts of the vowel system found with educated speakers from the Hanley Recordings have been given close attention. The authors found that certain vocalic features, such as short a, were not dealt with accurately by later dialectologists though the remaining low vowels were better discussed. Johnson and Durian also confirm that the Linguistic Atlas of New England was comprehensive in the range of speakers it encompassed and so provided a valid picture of variation among early speakers in the region. Their study also confirms the split pattern for short a which is still found within the broader New England area. Furthermore, the typical mergers of PALM and LOT in Western New England or LOT and THOUGHT in Eastern New England had not yet emerged in the areas studied by the authors.


Early twentieth-century English spoken in the Upper Midwest of the United States is the subject of the chapter by Thomas Purnell, Eric Raimy and Joseph Salmons. The authors look in detail at the realisations of the GOAT, LOT, THOUGHT and TRAP vowels in Upper Midwestern English, drawing on transcriptions and recordings present in archival sources. These resources support some recent views and add further nuances to our picture of Upper Midwestern English. Above all, the available archival data shows widespread variation from early on, contrary to assumptions about linear geographical spread. Upper Midwestern English emerged less by areal diffusion of features than by consolidation of particular patterns introduced as variants during settlement.

Forms of English spoken throughout the twentieth century in the Western United States are examined by Valerie Fridland and Tyler Kendall. The authors consider when and how the modern Western system began. They conclude that, based on its historical roots and modern similarity across the region, the dialect of the Western U.S. can be appropriately viewed as a koiné, a variety brought about through contact-induced change by speakers of a wide range of mutually intelligible dialects. The speech of those archival speakers examined by the authors can be interpreted as anticipating some major features of the vowel system of the modern Western United States without showing these features to any large degree, a finding common to many of the chapters of this volume. Although these archival speakers show only isolated variants that hint at what was to come, DeCamp’s work in the 1950s in San Francisco indicates that speakers born in the early part of the twentieth century were a bit farther along in the journey toward a more cohesive Western system, particularly in the low back system.

The Ex-Slave Recordings form the focus of the chapter by Erik R. Thomas who shows that a great deal of acoustic analysis can be performed on them. Other researchers demonstrated over twenty years ago how important these recordings are for morphosyntactic variants. The Ex-Slave Recordings are clearly valuable for exploring some kinds of phonetic variables as well. Even though certain types of analyses are impossible to conduct on them, other types certainly can be performed. Thomas shows in his chapter that analyses of vowel quality and prosody are quite feasible. He concludes that the methods used in his chapter for analyzing phonetic variables can be applied to early recordings of other dialects and languages as well. Thomas maintains that early recordings are indeed suitable for the techniques of acoustic analysis, despite their usually inferior quality compared to modern recordings.

Archival data on earlier Canadian English provide the basis for the chapter by Charles Boberg in which he examines the speech of seven First World War veterans, these affording a remarkable glimpse at what Canadian English may have sounded like at the end of the nineteenth century, a window on the past made possible only by the availability and analysis of archival data. While direct evidence of this crucial period in the process of consolidation and diffusion that spread Canadian English across the country is not available, Boberg’s analysis nonetheless shows that some key features of modern Canadian English have a long history: Canadian Raising, for instance, has been well established for over a century now. Other modern features appear to have arisen later, or have only recently become uniform across the country: the fronting of /uw/ and the Canadian Shift, together with the reversal of /uw/ and /æ/ in F2 space that they bring about, appear
to be late twentieth century innovations. More surprisingly, the low-back merger of /o/ and /oh/ and even the modern phonemic status and allophonic distribution of /æ/ were not always the way they are today: these defining features of modern Canadian English seem to have evolved gradually, from a dialect landscape that was once much more varied. This process may partly reflect the strong British influence, including not just common war experiences but heavy British immigration to Canada throughout the nineteenth century. Whether Canadian Raising can be ascertained in early recordings of English in Newfoundland is the question which Sandra Clarke, Paul De Decker and Gerard Van Herk address in their chapter. Through acoustic analysis of early audio recordings, their chapter shows that Canadian Raising was variably present, for both /ai/ and /au/, in a small sample of traditional Newfoundland English speakers born between 1898 and c. 1935. The authors maintain that the origins of this feature cannot be claimed to be historical: regional dialect evidence suggests that the Canadian Raising pattern is unlikely to have been inherited from the ancestors of these Newfoundland English speakers, who migrated to the island from southwest England and southeast Ireland between the late seventeenth and early nineteenth centuries. They conclude that while their acoustic analysis is grounded in samples of traditional Newfoundland English speech recorded in the 1970s and 1980s, these recordings have provided new insights into Canadian Raising as used by Newfoundland speakers born near the turn of the twentieth century. Their study indicates that Canadian Raising has existed within Newfoundland for a considerable period of time – perhaps subject to the type of ‘ebb and flow’ described by Hickey (2002) for a number of changes in the English language. As such, their study demonstrates the value to contemporary local linguistic endeavours of access to archived real-time language data.

Prosodic aspects of early recordings of English in Trinidad and Jamaica form the focus of the chapter by Shelome Gooden and Kathy-Ann Drayton. They maintain that the period between the late 1890s and the 1940s brought some changes to Caribbean Creoles: Trinidadian shows changes in prosody, mainly in the marking of prominent syllables and in phrasing in the speech of Afro/Mixed speakers. On the basis of the oldest speakers in the data sets from both Trinidad and Jamaica, the prosodic changes can be dated to a period between 1890 and 1947 so that speakers born after that period would have prosodic features that are similar to those used by contemporary speakers. The changes in Trinidadian Creole are postulated by the authors as possibly a marker of a new “Trinidadian” identity among younger Trinidadians. The authors maintain that as speakers redefine themselves in their changing ecologies and create new sociopolitically-driven identities, the linguistic shifts become a reflection of the associated changing identities (Schneider 2003). The combined results from Gooden and Drayton’s research and the research on vowel variation suggest that both Trinidadian and Jamaican are likely between the last two stages of emergence, the fourth, i.e. endonormative stabilization, and the fifth, i.e. differentiation (Schneider 2003), with the differences being determined by the local ecologies in which each variety exists.

Early twentieth century recordings from Ghana are examined by Magnus Huber in his chapter. While acknowledging that there is frequently a lack of data documenting the actual structural development of postcolonial Englishes, Huber shows that the recordings from the Ghana Broadcasting Company Sound
Archive, which he evaluated sociophonetically, represent invaluable data from the period just after Ghana’s independence in 1957. His analysis shows that the RP realisations of the four variables he examined have been replaced by newer, more native Ghanaian ones. The RP realisations have receded over time as follows: (1) (ing): [-ɪŋ] by 42%, (2) (wh): [w] by 20%, (3) (NURSE): [ɜ] by 17% and (4) (STRUT): [ʌ] by 23%. Huber postulates that if this is representative of the development of the Ghanaian English phonological system as a whole then it suggests that spoken educated Ghanaian English was closer (but not identical) to RP at the time of Ghana’s independence and that Ghanaian variants have gradually been replacing the British ones over the past 50 to 60 years. He concludes that diachronic phonological studies of early recordings of postcolonial Englishes can provide much-needed data to test and refine evolutionary models that so far have been based mainly on external language history and synchronic structural data.

Earlier South African English, especially that in the region of Johannesburg, is the subject of Ian Bekker’s chapter in which he examines the speech in early recordings to gain a window on the varieties of English prevalent during the second half of the nineteenth century (Bekker 2012) and considers the wider question of whether there are any broader generalizations or conclusions that can be drawn from his analysis. Bekker finds that the speakers in the early recordings reflect the original British input and that they contain features that were eventually ‘ironed’ out once South African English became a fully-focused variety. Assuming that a koiné developed in the Eastern Cape during the mid-nineteenth century and given Trudgill’s (2004: 23) 50-year yardstick for the focusing of such a new dialect, it would have come into its own around 1870.
He sees clear evidence for this in the retraction of the START vowel, an endogenous development, which is typical of South African English today but not of English in Australia or New Zealand. Other features of South African English, such as the centralised KIT vowel, are seen by Bekker as resulting from internal pressure in the phonological space of the raised short vowel of the historical input to South Africa, reflecting the view expressed earlier in Lass and Wright (1985). Indeed Bekker considers it possible that the South African English KIT-Split was a direct inheritance and, in fact, the initiator of the relevant chain-shift of short front vowels.

The earliest recordings of English in Tristan da Cunha form the basis for Daniel Schreier’s investigation of specific features of the dialect of this small and very remote community. In particular, he looks at the occurrence of a non-etymological initial /h-/, a feature otherwise attested robustly within the anglophone world only in Newfoundland. In his chapter Schreier considers the use of corpora from varieties of English for the analysis of earlier stages (Schreier and Trudgill 2006) and discusses different approaches and agendas (Krug and Schlüter 2013). His investigation confirms the style-sensitive nature of interviews of dialect speakers, here the relative use of non-etymological /h/ during informal conversations in familiar settings.

The development of vowels in Australian English is the centre of attention in the chapter by Felicity Cox and Sallyanne Palethorpe. Here they examine various hypotheses about the development of the open vowels in Australian English, comparing data from their Australian Ancestors corpus, which contains speech data from eight men and four women born in Australia in the period 1880 to 1899. The authors show that, in contrast to modern data, speakers in the historical database
produced significant horizontal separation between START and STRUT with STRUT more fronted but not more raised than START. They also mention the possibility that START fronted and STRUT retracted during the twentieth century, i.e. that this development was not part of the historical input to Australia. On the other hand, the comparison of reports of historical New Zealand English data with their own historical Australian English dataset suggested to Cox and Palethorpe that differences could represent true regional variation in the English accents spoken in Australia and New Zealand in the late nineteenth and early twentieth centuries.

The closing diphthongs in early recordings of New Zealand English form the focus of the chapter by Márton Sóskuthy, Jennifer Hay, Margaret Maclagan, Katie Drager and Paul Foulkes. The authors found that the examination of closing diphthongs in New Zealand English showed a certain number of diphthong-shifted variants from the beginning of the period under investigation, which were likely inherited from other varieties of English. According to the researchers, diphthong shift seems to be further incremented within New Zealand English, with FACE and GOAT leading the process and MOUTH and PRICE lagging behind. These changes also exhibit a degree of interdependence: FACE and GOAT may have undergone a parallel shift, PRICE shifting as part of a push-chain started by FACE, and FACE and MOUTH interacting with each other due to their phonetic proximity. The authors’ data also provide some qualified support for the claims in Gordon et al. (2004), Trudgill (2004) and Britain (2008) concerning the development of the sound system in New Zealand English.
6 Outlook

The wide range of contributions in the present volume documents both the amount of material available in early recordings of varieties of English and the readiness of scholars, working broadly within the paradigm of variationist sociolinguistics, to concern themselves with such recordings with the aim of adding valuable time depth to varieties whose profiles are well-known from later investigations. This aim seems to have been fulfilled in each case as the studies presented here offer novel insights not just into the earlier sound patterns of many varieties of English but also into the external social factors which determined the trajectories they have taken in the course of the twentieth century. It is hoped that the studies of this volume will stimulate continuing research into earlier recordings to encompass further data in this revealing field.

References

Bekker, Ian 2012. South African English as a late nineteenth century extraterritorial variety. English World-Wide 33.2: 127-146.

Britain, David 2008. When is a change not a change? A case study on the dialect origins of New Zealand English. Language Variation and Change 20: 187-223.


Gordon, Elizabeth, Lyle Campbell, Jennifer Hay, Margaret Maclagan, Andrea Sudbury and Peter Trudgill 2004. New Zealand English. Its Origins and Evolution. Cambridge: Cambridge University Press.

Hickey, Raymond 2002. Ebb and flow. A cautionary tale of language change. In Teresa Fanego, Belén Mendez-Naya and Elena Seoane (eds) Sounds, Words, Texts, Change. Selected Papers from the Eleventh International Conference on English Historical Linguistics (11 ICEHL). Amsterdam: John Benjamins, pp. 105-128.

Krug, Manfred and Julia Schlüter (eds) 2013. Research Methods in Language Variation and Change. Cambridge: Cambridge University Press.

Lass, Roger and Susan Wright 1985. The South African chain shift: order out of chaos? In Roger Eaton, Olga Fischer, Willem Koopman and Frederike Van der Leek (eds) Papers from the Fourth International Conference on English Historical Linguistics. Amsterdam: John Benjamins, pp. 137-161.

Schneider, Edgar W. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79.2: 233-281.

Schneider, Edgar W. 2007. Postcolonial English. Varieties around the World. Cambridge: Cambridge University Press.

Schreier, Daniel and Peter Trudgill 2006. The segmental phonology of nineteenth century Tristan da Cunha English: Convergence and local innovation. English Language and Linguistics 10: 119-141.

Trudgill, Peter 2004. New-Dialect Formation: The Inevitability of Colonial Englishes. Edinburgh: Edinburgh University Press.


2 British Library sound recordings of vernacular speech

They were lost and now they are found

Jonathan Robinson

1 British Library sound recordings of vernacular speech

The British Library (BL) has in its sound archives recordings that document spoken English over a period of more than 100 years. The recordings range from ‘performance’, such as speeches, literary productions, audio books, public talks and lectures, to more ‘naturalistic’ speech contained in radio and television broadcasts, oral history interviews and linguistic surveys. These unique collections are an especially rich resource for researchers interested in varieties of British English and are used by diverse audiences from academic linguists, language teachers and students to researchers and practitioners in the creative industries such as actors, voice coaches, authors, script-writers, journalists and the broadcast media. A number of recent initiatives have enabled the British Library to extend its resources and services by acquiring existing collections and creating new content, while simultaneously developing remote access to selected material. This chapter provides an overview of a collection of historic sound recordings – the Berliner Lautarchiv British and Commonwealth Recordings (BL shelfmark: C1315) – and evaluates their relevance to contemporary research by comparing them with data from the Survey of English Dialects and similar authoritative linguistic sources.

2 The Berliner Lautarchiv British and Commonwealth Recordings

The Berliner Lautarchiv British and Commonwealth Recordings (BLBCR) is a collection of digital audio transfers from shellac disc recordings made between 1916 and 1938 by Wilhelm Doegen, Director of the Lautabteilung an der Preußischen Staatsbibliothek (Prussian State Library Sound Department) in Berlin and its predecessor the Königlich Preußische Phonographische Kommission (Royal Prussian Phonographic Commission).
It consists of 821 digital copies of recordings that feature speakers from the British Isles and Commonwealth nations and constitutes a subset of the extensive Berliner Lautarchiv at the Humboldt-Universität in Berlin (publicus.culture.hu-berlin.de/lautarchiv). The BLBCR audio files fall into three categories: recordings in English of British and Irish Prisoners of War (POWs) held in captivity on German soil between 1916 and 1918; recordings in a variety of indigenous languages of British colonial troops made in the same circumstances; and later recordings made by Doegen, including a set of recordings in Irish made during fieldwork in Ireland in the 1920s and 1930s (see Hickey 2011: 98-99). The recordings of British and Irish POWs represent the earliest known collection of sound recordings of vernacular English speech and are the focus of this chapter.


3 Wilhelm Doegen and Alois Brandl

Wilhelm Doegen (1877-1967) was a graduate of modern languages at the Friedrich-Wilhelms-Universität in Berlin who developed an interest in the use of phonetics in language teaching as a result of meeting the British phonetician Henry Sweet while studying at Oxford in 1899. After graduating he published several volumes of teaching materials with the Odeon Recording Company in Berlin (Doegen 1909). Following the outbreak of World War I Doegen became increasingly intrigued by the opportunity afforded by the presence on German soil of unprecedented numbers of peoples from all over the world in the form of POWs. Enlisting the support of academics working in such diverse fields as philology, ethnomusicology and anthropology, Doegen and philosopher/psychologist Carl Stumpf (1848-1936) proposed the creation of a commission to coordinate a comprehensive recording programme.

The Königlich Preußische Phonographische Kommission was established in October 1915, with Stumpf as its Chairman and Doegen responsible for implementing the ambitious recording programme. The intention was to capture native German dialects, the voices of famous people, and languages, music and songs from all over the world. Between 1915 and 1918, 2,677 recordings were made in 70 POW camps across Germany: 1,022 recorded on wax cylinder by ethnomusicologist Georg Schünemann (1884-1945) and 1,650 on shellac disc by Doegen and his associates (Mahrenholz 2006). While Doegen was responsible for technical production, eminent German linguists assumed responsibility for identifying and recording individual language groups. Alois Brandl (1855-1940), Austrian-born Professor of Anglistik at the Friedrich-Wilhelms-Universität in Berlin and President of the Deutsche Shakespeare-Gesellschaft (German Shakespeare Society), directed the recording of English-speaking POWs.
Like Doegen, Brandl had met Henry Sweet in London, receiving weekly tutorials during the winter of 1879-80 to improve his English pronunciation. As he describes in his autobiography (Brandl 1936: 138), Sweet’s instruction involved exercises in articulation, transcription and accurate reproduction of transcribed speech, a process which he acknowledges informed his approach to documenting the recordings made in the POW camps (Doegen 1925: 367).

After the war Doegen secured a permanent home for the recordings in the Lautabteilung an der Preußischen Staatsbibliothek, founded in 1920. He subsequently resumed a productive collaboration with distinguished British linguists including Daniel Jones (1881-1967), Arthur Lloyd James (1884-1943) and Ida Ward (1880-1949), with whom he published a series of commercial recordings on the Odeon label, such as Examples of English Intonation (Jones 1921). He also assisted Ferdinand Wrede (1863-1934), director of the Deutscher Sprachatlas (Wrede et al. 1927-1956), with sound recordings for the survey of German dialects instigated by Georg Wenker (1852-1911).

4 The 1916-1918 recording programme

From Brandl’s own account of the POW recording programme (Doegen 1925: 362-375) it is clear he viewed the opportunity as a natural extension, in regrettable circumstances, of work he had begun before the war. He recalls several visits he and
his students – undergraduate trainee teachers – made in the early part of the twentieth century to carry out fieldwork in locations across England and Scotland, expressing particular delight in observing dialect verse and songs. In common with many of his contemporaries Brandl considered urban speech somewhat diluted and less worthy of linguistic study than rural dialects, a bias that continued well into the twentieth century as demonstrated some forty years later by the essentially rural focus of the Survey of English Dialects (Orton et al. 1962-1971). The process for selecting informants was assisted by the German military authorities who made available documentation they held on all British and Irish internees, including essential biographical details such as age, home and occupation in civilian life. Access to this data enabled Brandl to compile an initial sample of POWs that met his criteria: ‘Stammte er aus der gebildeten Schicht, aus einer größeren Stadt oder aus dem weiten Cockneygebiet des Südostens, so wurde sein Zettel von vornherein bei Seite gelegt (…) Bauernburschen aus abgelegener Gegend, Fischer aus kleinen Häfen, Schafhirten und namentlich halbe Analphabeten waren zumeist gesucht’ [‘anyone from the educated class, from larger towns or from the wider Cockney area of the South East was rejected from the outset (…) farmhands from remote locations, fishermen from small fishing villages, shepherds and especially semi-literates were prioritised’ – my translation]. Brandl’s dismissal of speech in the South East of England contrasted with his admiration for dialect speech in Lancashire and Scotland which he felt showed greater stability and continuity and explains the prominence within the collection of speakers from the north of England and Scotland. Having identified a long list of potential candidates Brandl then arranged visits to individual camps to interview speakers and assess their suitability for inclusion in the survey. 
Candidates were presented to Brandl in groups and asked simply to state their occupation and describe briefly where they lived; this was considered sufficient to eliminate ‘eine große Zahl von Sprechern, indem sich nach den ersten Worten ergab, dass sie niemals einen Dialekt sprachen oder ihn gründlich abgelegt hatten’ [‘a large number of speakers who, within a few words, revealed they had never spoken dialect or had abandoned it completely’ – my translation]. This assessment, rather superficial by modern standards, enabled Brandl to select his ‘gutes Material’. Those chosen were then made aware there was no compulsion to participate in the study, but evidently few declined. Sensitive to the controversial recording circumstances and keen to avoid contentious subjects, Brandl then spent some time interviewing individual speakers, commenting that most soon relaxed and seemed to enjoy both his interest in their speech and the opportunity for diversion from camp life. Discussions ranged from parents to school and working life, with Brandl encouraging speakers to volunteer local expressions, which he observed were particularly forthcoming when speakers reminisced about their childhood. Unfortunately, Brandl’s account is the only record of these spontaneous conversations, which served, for Brandl, as final confirmation of the authenticity of an informant’s dialect and, for the POWs, as reassurance of Brandl’s purely academic intentions prior to agreeing to make a sound recording.

The content of the subsequent sound recordings varies and includes reading passages, word lists and recitals of songs and/or folk tales. The most frequently recorded text is a recital of the Parable of the Prodigal Son (Luke chapter XV, verses 11-32) in the speaker’s own dialect. Clearly a popular device with linguists at the time, this passage also served as the principal
‘specimen’ in the Linguistic Survey of India (Grierson 1903-1928) chosen by its director, Sir George Grierson (1851-1941), as it contained ‘the three personal pronouns, most of the cases found in the declension of nouns, and the present, past and future tenses of the verb’ (Grierson 1927: 18). Although Brandl makes no reference to this precedent, he also commends the passage as containing several ‘Scheideformen’ [‘distinguishing forms’] that enable comparison between dialects.

The original shellac discs are held at the Berliner Lautarchiv along with documentation created at the time of recording. Biographical details of informants (e.g. date of birth, religion, occupation in civilian life, schooling, level of literacy) are presented on a ‘Personalbogen’ for each speaker, most of which are complete and survive intact. It is difficult to ascertain the version of the Parable used for the recordings, but Brandl describes how a text was ‘vorbereitet’ [‘prepared’], thereby implying some editing of the original. Speakers were given the text in advance and encouraged to adapt it to reflect their own local speech forms, making it even more challenging to identify the source text. These handwritten adaptations survive in most cases and were used as prompt sheets for the actual recording. They do not correspond exactly with what appears on the discs as, presumably, speakers deviated from the script due to the pressure – and indeed novelty – of being recorded.

These orthographic transcripts are accompanied by phonetic transcriptions made by Brandl using a notation scheme which he acknowledges derives primarily from Henry Sweet, although he stresses the discs themselves are the more accurate record of a given dialect. The International Phonetic Association was established in 1888, so Brandl’s transcriptions are an example of practice at a time when the International Phonetic Alphabet was not yet universally applied.
As such they are of considerable academic interest and the subject of a forthcoming doctoral thesis by Valentina Tarantelli at the University of Sheffield. The handwritten transcripts are equally intriguing as they capture laymen’s attempts to modify English orthography to convey vernacular forms and localised pronunciation. It has been surprisingly difficult to find any reference to the discs or to Brandl’s descriptions in the years immediately after the recordings were made. There are 20 dialects featured in Englische Dialekte (Brandl 1926) and a series of discs and pamphlets reviewed in The Review of English Studies (1931: 372-373) but few other leads exist. From 1949 onwards the discs were located in the German Democratic Republic and hence, perhaps understandably, disappeared from view. The discs were ‘re-discovered’ and digitised in the 1990s, and digital audio files were acquired by the BL in 2007 along with photocopies of the accompanying documentation. Given the extraordinary circumstances in which they were created and the centenary in 2014 of the outbreak of hostilities, there has been considerable interest in the collection in recent years. The recordings featured in the Radio 4 broadcast Barbed Wire Ballads (2005) and were the subject of a BBC 4 documentary, How the Edwardians Spoke (2007), which traced descendants of three speakers to ‘re-unite’ families with the recordings for the first time. An exhibition to commemorate the contribution made by Sikh soldiers to World War 1, Empire, Faith and War: The Sikhs and World War One (2014), included several recordings with Indian POWs.

5 The Survey of English Dialects


The Survey of English Dialects (SED) was a groundbreaking nationwide survey of the vernacular speech of England, undertaken by researchers based at the University of Leeds under the direction of Harold Orton (1898-1975). As the SED is well-known to linguists and dialectologists only a brief summary is provided here, but Fees (1991) gives a more detailed description of the survey and its legacy. From 1950 to 1961, a team of fieldworkers collected data in 313 mostly rural localities, initially in the form of transcribed responses to a questionnaire containing over 1,300 items, meticulously recorded and subsequently published in the Survey of English Dialects: The Basic Material, Vols. I-IV (Orton et al 1962-1971). The informants were mostly farm labourers, predominantly male and generally over 65 years old and thus, like the BLBCR informants, born in the second half of the nineteenth century. Advances in audio technology during the 1950s made it increasingly possible, and indeed desirable, to record informal conversations on site so several localities were revisited to make sound recordings with original contributors or replacements with similar profiles. The recordings vary in length and quality but generally consist of ten to twenty minutes of unscripted, spontaneous discussions of working life, domestic routine and local custom. The audio archive complements the survey’s published output and includes recordings from 288 SED localities, additional recordings from Orton’s pre-SED Northumbrian corpus and several pilot recordings from non-SED localities. The original open reel tapes and gramophone discs are held at the Brotherton Library at the University of Leeds and digital copies are available at the BL (BL shelfmark: C908). Until recently these two collections of early sound recordings of British dialects were inaccessible to all but a handful of academics. 
In 2011 the complete set of 66 recordings of British and Irish POWs reading the Parable of the Prodigal Son was made available on the Sounds website (sounds.bl.uk) alongside extracts from 288 SED recordings uploaded in 2005. Audio clips from both collections were included in the BL’s Evolving English exhibition (2010-2011) and on the accompanying Voices of the UK audio CD (British Library 2010). The Library’s Sound and Moving Image catalogue (cadensa.bl.uk) gives complete details of both collections. To assess the validity of the BLBCR recordings for present-day linguistic researchers and dialect enthusiasts I now consider a sample of lexical, phonological and grammatical variables contained within the Parable text. I compare the variants supplied by one POW with SED published data and audio files to determine how these two sets of early sound recordings corroborate existing knowledge and/or allow us to re-evaluate previous studies.

6 The dialects of Shelley and Skelmanthorpe

The following is a comparison of the BLBCR recording of John Townend (BL shelfmark: C1315/1/813) with SED data recorded in Skelmanthorpe (SED ref: 6Y31). John Townend was recorded in Güstrow POW camp on 3 July 1917. Born on 1 January 1882 in Shelley in the West Riding of Yorkshire, he attended the village school then lived and worked locally as a steam engine driver before enlisting. Shelley is situated in present-day West Yorkshire, approximately six miles south east of Huddersfield and one mile north west of Skelmanthorpe. The


two SED informants for Skelmanthorpe were aged seventy-five and seventy-six when interviewed in October 1952 and gave their occupations as weaver/miner/gravedigger/farm labourer and farmer/smallholder respectively. One of the informants, Wibsey Dyson, also made a sound recording on 20 October 1952 (BL shelfmark: C908/48 C1).

Brandl’s desire to give each POW creative licence to modify the text means the content of the Parable inevitably varies from speaker to speaker. Nonetheless there is considerable overlap with items contained in the SED questionnaire and in most cases it is possible to map up to 100 lexical items with a corresponding entry in the relevant volume of the SED Basic Material (Orton et al. 1962-1971). Table 1 compares the realisation of a selection of variables that occur in the recording of John Townend with corresponding data from Skelmanthorpe presented in the SED (Orton and Halliday 1962-1963). The variables are restricted to full content words as function words, such as pronouns and prepositions, frequently show contrasting strong and weak forms and are discussed separately below.

Table 1. A comparison of 33 variables in a recital of the Parable of the Prodigal Son by John Townend with corresponding SED data from Skelmanthorpe, Yorkshire.

Variable  BLBCR Shelley         SED Skelmanthorpe           SED ref.
ALWAYS    [ɔːlɪz, ɔːlweːz]      [ɔːlɪs, ɔːləs, ɒləs]        VII.3.17
BEGAN     [bɪgan]               [bɪgʊn]                     VII.6.23
BELLY     [bɛlɪ]                [bɛlɪ]                      VI.8.7
BROTHER   [bɹʊðə, bɾʊðə]        [bɹʊðə]                     VIII.1.5
CALF      [kɔːf]                [kaːf, kɔːf]                III.1.2
CLOTHES   [klʊəz]               [tlʊəz]                     VI.14.20
CAME      [keːm]                [kʊm, kɔːm, kʊmd]           IX.3.4
EAT       [ɛɪt]                 [ɛɪt]                       VI.5.11
FATHER    [faðə]                [faðə]                      VIII.1.1
FEET      [fɪit]                [fɪit]                      VI.10.1
FIELDS    [klɔɪzəz]             [klɒɪzəz]                   I.1.1
FOUND     [fʊn]                 [fɑːnd, fan, fʊn]           IX.3.2
GIVE      [gɪ]                  [giː]                       IX.8.2
GREAT     [gɹeːt, gɹɛt]         [gɹɛt]                      IX.1.6
HEARD     [jɛd]                 [jəd, jɛd]                  VIII.2.6
HIMSELF   [ɪzsɛn]               [ɪzsɛn]                     IX.11.4
HOME      [wɒm]                 [ʊəm]                       VIII.5.2
HOUSE     [æəs]                 [æəs]                       V.1.1
MAKE      [mak]                 [mak, mɛk]                  IX.3.6
MAN       [man]                 [man]                       VIII.1.6
MONEY     [bɹas]                [bɹas]                      VII.8.7
MORE      [mʊə]                 [mʊə]                       VII.8.13
NECK      [nɛk]                 [nɛk]                       VI.6.1
NOTHING   [nɔʊt]                [nɒʊt]                      VII.8.14
ONE       [wʊn]                 [wʊn]                       IX.8.8
OUT       [æət]                 [æət]                       IX.2.15
PUT       [pʊt]                 [pʊt]                       IX.3.3
SHOES     [ʃʊuːz]               [ʃʊuːz]                     VI.14.22
SON       [sʊn]                 [sʊn]                       VIII.1.4
TAKE      [tak]                 [tak, tɛk]                  IX.3.7
TWO       [tʊuː]                [tʊuː]                      VII.1.2
VERY      [vaɾɪ]                [vaɹɪ, vaɹə]                VIII.3.2
WORK      [waːk, wɒkɪn]         [waːk, wak, wɔkɪn]          VIII.4.8

The phonetic transcriptions in Table 1 column 2 are based on an auditory assessment of the sound recording of John Townend. To avoid potential influence from SED data the recording was transcribed prior to consulting the published responses for Skelmanthorpe. Multiple entries in columns 2 and 3 show alternative forms recorded in each source. In column 3 these either reflect pronunciations contained within an informant’s response to other prompts in the SED questionnaire or derive from the SED ‘Incidental Material’ (Orton 1962: 17) – tokens that occurred during spontaneous discussions that arose naturally between fieldworker and informant.

Table 1 shows a remarkably close correspondence between the forms supplied by John Townend and the SED informants’ responses recorded in Skelmanthorpe. Of the thirty-three variables under scrutiny twenty-four contain an identical match and six differ only in minor phonetic detail as itemised below:

(1) a. always shows deletion of word medial /w/ for both speakers but John Townend favours a voiced final consonant.


b. clothes shares the same dialectal centring diphthong (more below) albeit the SED entry shows the word initial cluster /kl/ was realised as [tl], a phenomenon noted in several SED localities in the north and Midlands.

c. closes [= ‘fields’] shares a dialectal diphthong (more below) but the SED entry indicates a more open onset but a common schwa vowel for the plural <-es> morpheme, a distinctive feature of the accent in this area then and now.

d. give is realised as northern dialect / archaic gie (see e.g. Wright 1898-1905) but John Townend uses a lax vowel where SED shows a tense vowel, a subtle distinction in all probability prompted by contrasting phonetic environment: for John Townend gie occurs pre-consonantally (gie me [gɪ mɪ] that part of your goods that belongs me) whereas for the SED informant it precedes a vowel (gie it me [gi: ɪt mɪi]).

e. naught [= ‘nothing’] differs in phonetic quality in that the SED form shows a lowered onset for the diphthong but crucially the word belongs in the THOUGHT set (Wells 1982) for both speakers – not the MOUTH set (Wells 1982) as elsewhere in the north – and, like analogous words such as daughter and brought, is realised with a localised diphthong still typical of this part of West Yorkshire.

f. very differs only in John Townend’s use of an intervocalic tapped /r/, a feature conspicuously absent from the published SED data in Yorkshire but present in many SED sound recordings, including Skelmanthorpe (more below).
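The tally reported for Table 1 can be reproduced with a short sketch. It assumes, as a simplification not stated in the text, that an ‘identical match’ means the two sources share at least one transcribed variant; the names TABLE_1 and classify are illustrative only.

```python
# Table 1 as (BLBCR Shelley variants, SED Skelmanthorpe variants) per variable.
TABLE_1 = {
    "ALWAYS": (["ɔːlɪz", "ɔːlweːz"], ["ɔːlɪs", "ɔːləs", "ɒləs"]),
    "BEGAN": (["bɪgan"], ["bɪgʊn"]),
    "BELLY": (["bɛlɪ"], ["bɛlɪ"]),
    "BROTHER": (["bɹʊðə", "bɾʊðə"], ["bɹʊðə"]),
    "CALF": (["kɔːf"], ["kaːf", "kɔːf"]),
    "CLOTHES": (["klʊəz"], ["tlʊəz"]),
    "CAME": (["keːm"], ["kʊm", "kɔːm", "kʊmd"]),
    "EAT": (["ɛɪt"], ["ɛɪt"]),
    "FATHER": (["faðə"], ["faðə"]),
    "FEET": (["fɪit"], ["fɪit"]),
    "FIELDS": (["klɔɪzəz"], ["klɒɪzəz"]),
    "FOUND": (["fʊn"], ["fɑːnd", "fan", "fʊn"]),
    "GIVE": (["gɪ"], ["giː"]),
    "GREAT": (["gɹeːt", "gɹɛt"], ["gɹɛt"]),
    "HEARD": (["jɛd"], ["jəd", "jɛd"]),
    "HIMSELF": (["ɪzsɛn"], ["ɪzsɛn"]),
    "HOME": (["wɒm"], ["ʊəm"]),
    "HOUSE": (["æəs"], ["æəs"]),
    "MAKE": (["mak"], ["mak", "mɛk"]),
    "MAN": (["man"], ["man"]),
    "MONEY": (["bɹas"], ["bɹas"]),
    "MORE": (["mʊə"], ["mʊə"]),
    "NECK": (["nɛk"], ["nɛk"]),
    "NOTHING": (["nɔʊt"], ["nɒʊt"]),
    "ONE": (["wʊn"], ["wʊn"]),
    "OUT": (["æət"], ["æət"]),
    "PUT": (["pʊt"], ["pʊt"]),
    "SHOES": (["ʃʊuːz"], ["ʃʊuːz"]),
    "SON": (["sʊn"], ["sʊn"]),
    "TAKE": (["tak"], ["tak", "tɛk"]),
    "TWO": (["tʊuː"], ["tʊuː"]),
    "VERY": (["vaɾɪ"], ["vaɹɪ", "vaɹə"]),
    "WORK": (["waːk", "wɒkɪn"], ["waːk", "wak", "wɔkɪn"]),
}


def classify(table):
    """Split variables into those sharing a variant and those that do not.

    Assumption (not from the source): a variable counts as an identical
    match when the two variant lists have at least one form in common.
    """
    matched, differing = [], []
    for variable, (blbcr, sed) in table.items():
        if set(blbcr) & set(sed):
            matched.append(variable)
        else:
            differing.append(variable)
    return matched, differing


matched, differing = classify(TABLE_1)
print(len(TABLE_1), len(matched), len(differing))
print(differing)
```

Under that assumption the sketch finds 24 variables with a shared form and 9 without: the six itemised in (1) together with began, came and home.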

Of the three remaining variables, two show morphonological differences and one contrasts phonetically. The SED past forms begun and come or comed are clearly more dialectally marked than the counterparts John Townend uses: began and came are Standard English past forms albeit his pronunciation of came [keːm] contains a distinctly ‘northern’ monophthong. The pronunciation of home shows predictable H-dropping (more below) for both speakers, but the SED informant uses the same dialectal centring diphthong noted for clothes above, whereas John Townend uses a checked vowel with a /w/ onglide. Although this phenomenon is not recorded in the SED for Skelmanthorpe at HOME (VIII.5.2) it is noted in several nearby localities, such as Golcar (SED ref: 6Y29) and Holmbridge (SED ref: 6Y30) and the response [wʊts] recorded for Skelmanthorpe at OATS (II.5.1) confirms this onglide as a genuine local feature.

A smaller subset of the same variables occurs in the SED sound recording from Skelmanthorpe, confirming the same close correspondence. The passages below are extracted from the sound recording of Wibsey Dyson published at sounds.bl.uk and include IPA transcriptions of lexical items that match entries in Table 1.

(2) a. I were very [vaɾɪ] oft first there if there were a funeral on

b. ‘well’ I said ‘I never seen a ghost in my life but I [inaudible] and I’d look at one’ [wʊn]

c. don’t make [mak] no mistake [mɪstak] when I got nearly up to it I heard [jəd] it were saying summat


d. a woman’d comed [kʊmd] out in her nightdress and and and were seeking her cat

e. I used to get up about half past four and set off to my work [waːk] I used to leave the house [æəs] at half past five and I’d to be at s… at pit at six o’clock and we worked [wʊkt] while two [tʊu:]

f. before I got to my work [wa:k] I’d gotten a right sweat on, you know, and then I’d to wade nearly knee-deep in water and I’d to work [wʊk] in water all the day through

g. no wonder at folk dying when I reckon it up prematurely conditions that they had to work [wʊk] under in them days you were seldom ever come home [ʊəm] dry you’re always [ɔːlɪz] wet to the skin and besides that you had naught [nɔʊt] for it

There is further confirmation here of a direct match with John Townend’s speech in the case of house, make, one, naught [= ‘nothing’], take and two. Furthermore, Wibsey Dyson uses a final voiced consonant in always as noted for John Townend in Table 1 and indeed a tapped /r/ in very (more below). H-dropping is confirmed on heard and home: heard contains the /j/ onglide also used by John Townend, but a different vowel; John Townend’s characteristic /w/ onglide in home is, however, again not evident here. Wibsey Dyson confirms his preference for past tense comed but shares with John Townend a distinction between the noun work [waːk] and verbal work [wʊk~wɔk~wɒk] with a back rounded vowel of varying degrees of openness.

7 The phonology of the dialects of Shelley and Skelmanthorpe

Thus far this analysis has been restricted to comparing individual lexical items with direct counterparts in each source. To test the value of the BLBCR recording further we now offer a more comprehensive analysis. Despite pre-dating by 70 years the system now widely adopted for comparing accents of English, most recordings of the Parable text include at least one member of most lexical sets (Wells 1982). In the case of John Townend only the CHOICE and NORTH sets are completely unrepresented, although some sets have only a small number of tokens. It is, however, possible to give an account of the general characteristics of John Townend’s accent and compare it with SED data from Skelmanthorpe. The following description focuses on five vowel sets considered salient in this dialect – lexical sets BATH, STRUT, FACE, GOAT and GOOSE – with additional observations on two significant consonantal features and two important connected speech processes. Table 2 indicates John Townend’s realisation of the selected phonological variables and includes all the tokens in his recital of the Parable for each variable. 
As with Table 1 only full content words are included here: function words are discussed in the section below addressing morphonological phenomena.

Table 2. A description of selected phonological variables in John Townend’s recital of the Parable of the Prodigal Son.

Feature                 Realisation     Token
STRUT                   [ʊ]             son(s), young(er), country, others, husks, such, begrudged, done, hunger, one, us, brother, come(s)
BATH                    [a]             after, master, last, asked, brass
                        [ɒ]             dancing
FACE                    [eː]            gave, day(s), away, wasted, great [first token], came, say, way, safe
                        [a]             take, make
                        [ɛ]             great [second token]
                        [ɪə]            again
                        [ɪ]             always
GOAT                    [oː]            arose, no [second token]
                        [ʊə]            so, no [first token], clothes
                        [ɔɪ]            closes
                        [ʊ]             go
                        [ɒ]             over, home
GOOSE                   [ʊuː]           who, two, food, shoes, music, do
                        [ʊɪ]            soon
initial H               [h]             high
                        [∅]             who, swineherd, husks, how, hired, here, hunger, home, heaven, hand, hither, house, heard
R                       [ɹ]             country, everything, great, begrudged, great, ran, brother [second token], brass
                        [ɾ]             very, arose, bring, ring, merry, brother, angry, friends, brother [first and third tokens]
Yorkshire assimilation  [fɛtʔ swɑːn]    the husks that he fed the swine with
                        [aʔ pɪtɪ]       his father saw him and had pity and ran to meet him
Liaison                 [∅]             he saw others at their meals but had naught hissen to eat
                        [w]             his father saw him and had pity and ran to meet him
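As a rough cross-check on Table 2, the items listed under each realisation can be counted and converted to shares. The sketch below does this for FACE and initial H only. It counts the entries as printed in the table, which is an assumption: an entry such as day(s) may stand for more than one token in the recording, so the shares are indicative at best. The names TABLE_2 and variant_shares are hypothetical.

```python
# Entries listed per realisation in Table 2 for two of the features.
TABLE_2 = {
    "FACE": {
        "eː": ["gave", "day(s)", "away", "wasted", "great (first token)",
               "came", "say", "way", "safe"],
        "a": ["take", "make"],
        "ɛ": ["great (second token)"],
        "ɪə": ["again"],
        "ɪ": ["always"],
    },
    "initial H": {
        "h": ["high"],
        "∅": ["who", "swineherd", "husks", "how", "hired", "here", "hunger",
              "home", "heaven", "hand", "hither", "house", "heard"],
    },
}


def variant_shares(variants):
    """Return each realisation's share of the entries listed for a feature."""
    total = sum(len(items) for items in variants.values())
    return {variant: len(items) / total for variant, items in variants.items()}


for feature, variants in TABLE_2.items():
    for variant, share in variant_shares(variants).items():
        print(f"{feature:10} [{variant}] {share:.0%}")
```

On this count 13 of the 14 entries for initial H show deletion, and 9 of the 14 FACE entries have the monophthong [eː].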

In common with most speakers in this part of Yorkshire then and now, the quality of John Townend’s BATH and STRUT vowels shows no evidence of the TRAP-BATH and STRUT-FOOT splits that characterise southern English accents. His pronunciation of words in the TRAP set (e.g. back, angry with [a]) and FOOT set (e.g. goods, put with [ʊ]) confirms this. Table 2 shows he is consistent in using [ʊ] with STRUT and only deviates from [a] with BATH on dancing [dɒnsɪn]. Unfortunately, dance was not included in the SED questionnaire, but there is evidence of dialects in which words with orthographic <-an-> are realised with [ɒ], although the data locate the relevant isogloss some distance to the west of Shelley (Upton and Widdowson 2006: 16-17). Furthermore, man occurs six times in the recording consistently with [a] and SED entries at MAN (VIII.1.6) and HAND (VI.7.1) show no sign of [ɒ] in Skelmanthorpe, although [mɒn] is recorded in nearby Heptonstall (SED ref: 6Y21). I can cite anecdotal evidence of this pronunciation in Barnsley (personal experience from attending a football match at Oakwell c. 2003 when a local fan within earshot repeatedly taunted the referee with the observation: you want glasses, man [mɒn]). Perhaps, however, the English Dialect Dictionary entry for dancing (Wright 1898-1905) offers most by way of clarification as a citation records a spelling for West Yorkshire that corroborates John Townend’s pronunciation: nivver let noan at lasses gooa to donsin [= ‘never let any of the girls go dancing’]. The data in Table 2 for FACE and GOAT show several variants. Local speech in much of West Yorkshire today is characterised by monophthongal realisations in these sets, with [eː ~ ɛː] and [oː ~ ɔː] respectively. John Townend clearly favours a monophthong for a small set of words in the GOAT set here and for the majority of the FACE set, but also uses broader dialectal variants. As noted in Table 1 [ʊə] is recorded for clothes and [ɔɪ] for closes [= ‘fields’]. 
All these variants are also present in the sound recording made with SED informant Wibsey Dyson:

(3) a. we’d a grave [gɾeːv] to dig and my mate [meːt] had gotten there before me

b. I says, ‘sithee, Bill what’s yond?’ he says, ‘it’s a ghost [gʊəst] it’s a ghost’ [gʊəst] he says

c. in them days [deːz] there no [nɔː] there were no [nɔː] mechanism in the pit hardly everyth… every every bit of coal [kɔɪɫ] were hand-gotten

d. before I got to my work I’d gotten a right sweat on, you know, [jənɔː] and then I’d to wade [weːd] nearly knee-deep in water


e. no [nɔː] wonder at folk [fʊək] dying when I reckon it up prematurely conditions that they had to work under in them days [deːz] you were seldom ever come home [ʊəm] dry

The diphthongal GOAT variants appear to be recessive in present-day Yorkshire dialect, although Stoddart reports don’t with [ʊə] among older speakers in nearby Sheffield (Stoddart et al. 1999: 74) and [ɔɪ] survives locally if only in fossilised phrases like fish-’oil [= ‘fish and chip shop’] (see e.g. Kellett 1994: 61) and putwoodintoil [= ‘shut the door’] (see e.g. Mitchell 1987: 20) – expressions used affectionately and/or humorously even by speakers who otherwise consciously avoid overtly dialectal features. The checked variant, go with [ʊ] is confirmed in the SED (VIII.6.1) and in the sound recording with Wibsey Dyson: I started a-working at a shilling in the day shilling in the day for going [gʊɪn] down the pit. Unlike the diphthongs this pronunciation remains a distinctive feature of the present-day local accent and is reported by Stoddart for Sheffield with go, goes and going (Stoddart et al. 1999: 74). Rather surprisingly, given its potential for variation, over was not included in the SED questionnaire, but it does occur in the SED sound recording with Wibsey Dyson, pronounced [ɔʊə]. The use of a checked vowel, however, is certainly possible locally nowadays in a restricted set of words in the GOAT set including broke(n), froze(n), only, open and spoke(n). Of the other FACE variants, make and take with [a], always with [ɪ] for the final vowel and great with [ɛ] are included in Table 1. More recent BL sound recordings confirm always is still widely pronounced with [ɪ] in a number of British dialects, while in this part of the north and Midlands make and take appear to have joined items within the FACE set that can surface with [ɛ]: words such as ain’t, came, break, gave, great (particularly in the collocation great big), laid, made, make, making, say, take(n) and taking. The final variant noted for FACE – again with [ɪə] – is not apparent in the SED published data or sound recording for Skelmanthorpe. 
However, again was elicited in a number of SED localities in response to the prompt BESIDE (IX.2.5), albeit not in Skelmanthorpe. John Townend uses again in the Parable text as a preposition meaning ‘against’, mirroring those SED informants who supplied it as a variant of the preposition BESIDE. The SED entry [əgɪən] for Thornhill (SED ref: 6Y26), the closest locality to Skelmanthorpe, offers confirmation of this as a genuine local feature.

The narrow diphthong with an extended offset noted for most items in the GOOSE set in Table 2 is confirmed by the published SED data, including at TWO (VII.1.2) and SHOE (VI.14.22) as shown in Table 1. This realisation remains distinctive of present-day speech locally and is also apparent in the SED sound recording with Wibsey Dyson, although not always with such distinctive length:

(4) a. I were very oft first there if there were a funeral [fjʊunɹəl] on

b. I used [ʊust] to get up about half past four and set off to my work I used [jʊust] to leave the house at half past five and I’d to be at s… at pit at six o’clock and we worked while two [tʊuː]

c. and then I’d to wade nearly knee-deep in water and I’d to work in water all the day through [θɾʊuː]

John Townend’s pronunciation of soon as [sʊɪn] is replicated in the SED Skelmanthorpe entry for MOON as [mʊɪn] (VII.6.3) and indeed occurs in the sound recording made with Wibsey Dyson: I says, ‘art thou late Bill or I’m soon [sʊɪn] or I’m soon [sʊɪn] and thou late?’. Table 2 includes data relating to two consonantal features: initial /h/ and the realisation of /r/. John Townend’s almost categorical deletion of initial /h/ is unsurprising as H-dropping is a well-known feature of most vernacular accents in England. SED data shows it was a particularly productive process in West Yorkshire dialects, as confirmed by several entries for Skelmanthorpe including HEAD (VI.1.1), HEDGEHOG (IV.5.5) and HELP (V.8.13) and in the corresponding sound recording: if it’d a gone ‘sh’ and waved it’s hands [andz] I should’ve very likely either had [ad] a fit or taken my hook [ʊuːk]. John Townend’s realisation of /r/ varies between an alveolar approximant [ɹ] and, as noted previously, a tapped [ɾ]. This is one of the more intriguing elements of the recording as, unlike other features discussed here, it differs quite significantly from SED published data and thus perhaps offers new insights into the dialect of the period. I have always considered this a distinctive feature of present-day speech in this part of West Yorkshire and hear it in the speech of family in Castleford and Pontefract and so I have consequently always been intrigued by its absence from SED descriptions. As Table 2 shows, John is certainly not consistent in his use and we have insufficient data to present a definitive view of the distribution of each variant, but there are nonetheless observable tendencies. Some urban dialects of the North West – notably Manchester and Liverpool – are characterised by almost categorical use of tapped /r/. John, however, uses both available variants word initially (ran with [ɹ] versus ring with [ɾ]) and within identical consonant strings (e.g. 
brass with [ɹ] but bring with [ɾ], begrudged with [ɹ] and angry with [ɾ]), but consistently uses an alveolar tap intervocalically in very, merry and arose. We might expect to find some trace, therefore, of this in the SED published data at entries like CARROT (V.7.18), PORRIDGE (V.7.1) and VERY (VIII.3.2). It is difficult to speculate why this does not occur more systematically in the SED Basic Material (Orton et al. 1962-1971), but it is certainly audible in several relevant SED sound recordings, including in Carleton (BL shelfmark: C908/9 C4) and Holmbridge (BL shelfmark: C908/47 C6) and, most importantly, in Skelmanthorpe (I were very [vaɾɪ] oft first there if there were a funeral [fjʊunɹəl] on and this particular morning […] we’d a funeral [fjʊunɹəl] to we’d a grave [gɾeːv] to dig). It is also unquestionably characteristic of present-day speech in West Yorkshire as demonstrated by more recent BL recordings in nearby Golcar (BL shelfmark:


C900/08618 – I used to go to the Methodist Chapel but my friends [fɹɛndz] went to the Baptist Chapel so I actually did a swap and went to the Baptist my parents [pɛːɾənts] were quite happy for me to do that) and Castleford (BL shelfmark: C1190/19/01 – I can remember [ɹɪmɛmbə] five ton of n… nuts being pulped on a warehouse [wɛːɾaʊs] floor and these nuts had to be turned every month) which confirm a similar distribution to John Townend and Wibsey Dyson, i.e. R-tapping is more likely to appear intervocalically. Another intriguing pronunciation noted in Table 2 is John Townend’s word final devoicing of fed and had in the utterances the husks that he fed the swine [fɛtʔ swɑːn] with and his father saw him and had pity [aʔ pɪtɪ] and ran to meet him. These are examples of Yorkshire assimilation (Wells 1982: 366), a term that describes a localised process whereby a voiced consonant assimilates to a following voiceless consonant. In certain high frequency collocations this can occur in many accents of English (e.g has to [has tu], have to [haf tu], had to [hat tu] and used to [juːst tu]) but is less restricted for many speakers in West Yorkshire, for whom e.g. goldfish [gɔːltfɪʃ] or big fish [bɪk fɪʃ] are typical pronunciations. This phenomenon is popularly associated almost exclusively with the city of Bradford, stereotyped locally as [bɹatfəd] or [bɹaʔfəd], but evidence from modern BL recordings confirms Wells’ impression of a much wider distribution across most of present-day West and South Yorkshire and extending into East and North Yorkshire, too. In the case of fed here this process is possible despite it preceding a definite article with an underlying voiced onset /ð/, as the local preference for definite article reduction (more below) allows speakers to substitute a voiceless /t/ (or its allophonic equivalent [ʔ]) for the, thereby creating the required environment for Yorkshire assimilation. 
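The process described above can be sketched as a simple rewrite rule over a broad transcription: a voiced obstruent is replaced by its voiceless counterpart when the next consonant, within or across a word boundary, is voiceless. The segment inventories and the function name below are assumptions made for illustration. The sketch treats each character as one segment and does not model definite article reduction or glottal replacement.

```python
# Voiced obstruents and their voiceless counterparts (simplified inventory).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "ʒ": "ʃ"}

# Voiceless consonants that can trigger the process (simplified inventory).
VOICELESS = set("ptkfsʃθʔ")


def yorkshire_assimilation(ipa):
    """Devoice a voiced obstruent before a following voiceless consonant.

    Assumptions: the input is a broad IPA string with one character per
    segment, and spaces mark word boundaries that the rule may cross.
    """
    segments = list(ipa)
    for i, segment in enumerate(segments):
        if segment not in DEVOICE:
            continue
        # Find the next segment, skipping any word boundary.
        j = i + 1
        while j < len(segments) and segments[j] == " ":
            j += 1
        if j < len(segments) and segments[j] in VOICELESS:
            segments[i] = DEVOICE[segment]
    return "".join(segments)


for form in ["gɔːldfɪʃ", "bɪg fɪʃ", "bɹadfəd"]:
    print(form, "->", yorkshire_assimilation(form))
```

Applied to unassimilated input forms for goldfish, big fish and Bradford, the rule gives the pronunciations cited above; a voiced obstruent before a vowel or at the end of an utterance is left unchanged.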
It is difficult to locate entries that show evidence of Yorkshire assimilation in the SED Basic Material (Orton et al 1962-1971), but there are examples in SED sound recordings in, for example, Sheffield (BL shelfmark: C908/48 C2 – if he’s a good crane driver [gʊʔ kɹɛːn dɾɑɪvə] and I’ve had one or two good ’uns uh he’ll be able to shout up to the crane man, you know, which way to go). The final entry in Table 2, categorised as liaison, captures a local process that distinguishes some speakers in West Yorkshire and other parts of the central north from accents elsewhere in England. Many speakers with non-rhotic accents, including RP, tend in connected speech to insert a /r/ sound across certain morpheme and word boundaries where an open syllable meets a syllable with a vowel onset, such as do you want Fanta or Tango? [fantəɹɔːtaŋgəʊ] or drawing [dɹɔːɹɪŋ]. This so-called intrusive /r/ is thought to result from the loss of historic postvocalic /r/ (Wells 1982: 222-227). Present-day speech in West Yorkshire is non-rhotic, albeit relatively recently in the extreme west as shown by evidence of r-coloured vowels in SED localities such as Golcar (Orton and Halliday 1962: 36) but not Skelmanthorpe. Thus intrusive /r/ is allowed in most environments, but appears to be resisted locally in the environment <-awing> and <-aw#> + V, where


many speakers favour either [∅] or insert a sandhi /w/. Both alternatives are illustrated in the recording with John Townend:

(5) a. he saw others [sɔː ʊðəz] at their meals but had naught hissen to eat

b. his father saw him [sɔːʷɪm] and had pity and ran to meet him

More recent examples can be found in several BL audio files, including in Huddersfield (BL shelfmark: C900/08561 – we were at the back of uh of the rows in assembly and all we were doing were footsie like this and she saw us [sɔːʷʊz] and had us out and we each had a stroke of the cane on the back of us hand).

8 The lexis and grammar of the dialects of Shelley and Skelmanthorpe

We now turn to a description of significant lexical and morphonological features that occur in the BLBCR recording with John Townend. Table 3 lists a set of features and gives the variant(s) demonstrated by John Townend and the relevant token(s) within the Parable text.

Table 3. A description of selected lexical and morphonological variables in John Townend’s rendering of the Parable of the Prodigal Son.

Variable                          Variant      Token
GIVE                              gie          gie me that part of your goods that belongs me
FIELD                             close        this man sent him into the closes as a swineherd; his elder son were working in the closes
NOTHING                           naught       he saw others at their meals but had naught hissen to eat
STILL (= ‘now as formerly’)       yet          when he were yet a great way off his father saw him
MONEY                             brass        who has spent all his brass in wild living
second PERSON SUBJECT PRONOUN     thou         and always done what thou telled me to do; thou never even gave me a kid; thou hast the fatted calf killed for him; thou’re always with me
second PERSON OBJECT PRONOUN      thee         father I’ve sinned again thee; I have sinned again heaven and again thee; these many years have I served thee
second PERSON POSSESSIVE PRONOUN  thy, thine   I’m no longer to be called thy son; make me one of thy hired servants; thy brother has come home; thy father has killed the fatted calf; this thy brother were dead and is alive again
                                  thine        all I have is thine
                                  your         gie me that part of your goods that belongs me
be PAST (third PERSON SING.)      were         there were a man who had two sons; when he were yet a great way off his father saw him; now his elder son were working in the closes; then he were angry and wouldn’t go in; this thy brother were dead and is alive again; he were lost and is found
get PAST PARTICIPLE               gotten       he’s gotten him back safe and sound
tell PAST                         telled       and always done what thou telled me to do
DEFINITE ARTICLE                  [ʔ]          so the father gave him his share; the young man; a great famine came over the country; this man sent him into the closes; that he fed the swine with; when the young man came to think it over; all the food they want; the young man; but the father said; bring forth the best clothes; bring hither the fatted calf; working in the closes; one of the servants; and the servant said to him; killed the fatted calf; the father came out; thou hast the fatted calf killed for him; the father said to him
                                  [ʔθ]         filled his belly with the husks; near to the house
PREPOSITION in                    [ɪ]          he began to be in want; his elder son were working in the closes; who has spent all his brass in wild living
PREPOSITION of                    [ə]          the younger of them said to his father; gie me that part of your goods that belongs me; when he found a place with a man of that country; how many hired servants of my father have all the food they want; here I’m dying of hunger; make me one of thy hired servants; he called one of the servants
PREPOSITION to                    [tu]         I should go back to my father; I’ll go to my father and say to him ‘father I’ve sinned again thee’; he arose and came to his father; the father said to him ‘dear son thou’re always with me’
                                  [ti]         younger of them said to his father; the young man said to his father; the father said to his servants ‘bring forth the best clothes’; and the servant said to him ‘thy brother has come home’; he said to his father ‘these many years have I served thee’
                                  [tə]         I should go back to my father; I’ll go to my father and say to him ‘father I’ve sinned again thee’; near to the house
PREPOSITION with                  [wɪ]         he’d’ve been glad to have filled his belly with the husks that he fed the swine with; so I might have a feast with my friends; thou’re always with me
                                  [wi]         when he found a place with a man of that country

Of the five lexical items included in Table 3, four (gie [= ‘to give’], closes [= ‘field’], naught [= ‘nothing’] and brass [= ‘money’]) are also captured in the SED Basic Material (Orton et al. 1962-1971) at IX.8.2, I.1.1, VII.8.14 and VII.8.7 respectively. Although yet [= ‘still, now as formerly’] was not an SED prompt word and does not occur in the sound recording with Wibsey Dyson, its use in this sense by the SED speaker recorded in nearby Thornhill (BL shelfmark: C908/9 C3) confirms it as a local variant: aye, well I always did a clean job and I do yet. The use of second person <th-> pronouns noted in Table 3 is widely associated with Yorkshire dialect and John Townend’s near-categorical use of thou [ða] as a subject form, thee [ðiː, ðɪ] as an object form and thy [ðɪ] and thine [ðɑːn] as possessives mirrors SED responses in Skelmanthorpe at IX.9.9, VI.14.2, IX.8.6 and VI.5.17 respectively. Recent surveys and present-day BL sound recordings suggest this feature is now recessive, although it survives among older speakers in some areas and, as with many dialect variants, remains available as an identity marker or to be used for effect in particular contexts – witness the use by Leeds band Kaiser Chiefs in their 2004 single ‘I Predict A Riot’: watching the people get lairy it’s not very pretty I tell thee. It is intriguing that John Townend uses the more widespread/standard possessive your [jʊə] initially, which is recorded alongside thy in the SED Skelmanthorpe data (IX.8.5). An editorial note for Skelmanthorpe (IX.9.5) indicates the informant judged thou to be ‘older’ than you so perhaps John Townend’s use of <th-> forms here might be performed. As the original source text for the Parable probably contained <th-> forms, this might also have influenced John Townend to retain them as indicative of Shelley dialect. 
Certainly, the regularity with which <th-> forms occur in the SED Basic Material (Orton et al. 1962-1971) is not reflected in the SED sound recordings, although there are several possible explanations for this. Firstly, as interviewees the SED informants seldom ask questions of the fieldworkers and secondly, as local familiar forms, the <th-> variants might not have been deemed appropriate by informants speaking to fieldworkers whom they presumably considered strangers or social superiors (despite the warm and lasting friendships that often developed during fieldwork). Where <th-> forms do occur in SED sound recordings they generally arise in reported speech as is the case in the sound recording with Wibsey Dyson (I says, ‘art thou late Bill or I’m soon or I’m soon and thou late?’, I says, ‘sithee, Bill, what’s yond?’ and she says, ‘hast thou seen my cat?’). Table 3 records three non-standard past verb forms captured in the Parable recording. John Townend’s consistent use of third person singular were remains a distinctive feature of speech locally as confirmed by numerous contemporary recordings and not surprisingly this is the form recorded by SED fieldwork in Skelmanthorpe (VIII.9.5), but what is striking here is John’s alternation between a strong and weak variant of were:

(6) a. there were a [wɒɾə] man who had two sons
    b. when he were [wɒ] yet a great way off his father saw him
    c. now his elder son were [wə] working in the closes
    d. then he were angry [wəɾaŋgɾɪ] and wouldn’t go in
    e. this thy brother were [wə] dead and is alive again he were [wə] lost and is found

This distinction was noted by Wright (1892: 161) but is not apparent in SED published data as the questionnaire only elicited unstressed past ‘be’ (VIII.9.5), something Petyt considers an oversight (1985: 194) as he, too, regarded this contrast as a noteworthy feature of West Yorkshire dialect. In the SED’s defence, no questionnaire can possibly provide comprehensive coverage of every aspect of variation but the extensive supplementary data provided by the SED audio files and ‘Incidental Material’ (Orton 1962: 17) plugs many gaps and the recording in Skelmanthorpe does in fact capture the two forms very clearly:

(7) a. when I got nearly up to it I heard it were [wɒ] saying summat
    b. when I got a bit gainer [= ‘nearer’] I could hear what it were [wɒ] saying
    c. a woman’d comed out in her nightdress and and and were [wə] seeking her cat
    d. in them days there no there were [wə] no mechanism in the pit hardly everyth every every bit of coal were hand-gotten [wəɾandgɒʔn]

Recent BL recordings in South Elmsall (BL shelfmark: C900/08505 – and it were [wɒ] I just forget how it were [wɒ] worded and it and it were [wə] ‘True Labour’ or ‘Socialist’ or whatever) and Castleford (BL shelfmark: C1190/19/01 – my eldest lad didn’t want to do anything else all he wanted to do were [wɒ] come into the business which he did at the age of sixteen but let me say this he’d been working since he were eight [wəɾ ɛɪt]) confirm this distinction continues to exist for some speakers locally. The SED questionnaire did not elicit past forms of either ‘get’ or ‘tell’, but the fieldworker’s notebook for Skelmanthorpe held in the Brotherton Library at the University of Leeds (LAVC/SED/2/2/6/31: 114) records telled [tɛld] in the ‘Incidental Material’ (Orton 1962: 17) and gotten occurs repeatedly in the sound recording with Wibsey Dyson: we’d a grave to dig and my mate had gotten [gɒʔn] there before me; before I got to my work I’d gotten [gɒʔn] a right sweat on, you know and in them days there no there were no mechanism in the pit hardly everyth… every bit of coal were hand-gotten [andgɒʔn]. Both forms remain present in dialects in the north of England, although telled is arguably increasingly rare. The final two features noted in Table 3 relate to morphonological processes applied in connected speech to the definite article and certain prepositions. Definite article reduction (DAR) – a contracted pronunciation of the word the – is a distinctive and relatively stable feature of speech throughout Yorkshire and some neighbouring counties. Often rendered in popular representations as if to imply people say t’ book or even omit the article altogether, DAR is in fact a complex phonetic process with speakers realising ‘the’ using a variety of allophones of /t/, most commonly a glottal stop, or, in certain phonetic environments, very


occasionally /ʔθ/. John Townend frequently uses a glottal stop in several grammatical contexts – preceding a subject noun (not many days after the young man [ʔ jʊŋ man] gathered all his belongings together), before an object noun (the husks that he fed the swine [fɛtʔ swɑːn] with) and in combination with a preposition (a great famine came over the country [ɒvəʔ kʊntɹɪ]). This articulation occurs frequently in the SED recording in Skelmanthorpe (e.g. and I were in setting the key [sɛtɪn tʔ kɛɪ] into the lock [ɪntəʔ lɒk] and I turned me round) and persists as the most common realisation today. The fricative variant John Townend uses in the phrases with the husks and to the house is worth further scrutiny. Unfortunately, there are no examples in the Parable text of ‘the’ preceding a noun with vowel onset but as noted above, John Townend deletes initial /h/ meaning both husks and house effectively have vowel onsets thus prompting his use of the variant, /ʔθ/. The definite article was consistently recorded as [t] during SED fieldwork in Skelmanthorpe, even preceding a vowel at IN THE OVEN (V.6.6) and after H-dropping at THE HEAT (VI.13.6), although [θ] is noted for both these prompts in nearby Holmbridge. There is also one example of a similar pronunciation in the sound recording with Wibsey Dyson (I used to leave the house [tθæəs] at half past five and I’d to be at s… at pit at six o’clock and we worked while two). As noted above, this articulation is increasingly rare and does not feature in many present-day BL recordings, although there are isolated examples, such as in a recording from Leeds in exactly the same phonetic environment (BL shelfmark: C900/08627 – oh the steel works [ʔ stiːlwəːks] were massive the Hunslet steelworks [θʊnslɪʔ stiːlwəːks] it were a massive place). Finally, we turn to the prepositions listed in Table 3. In the case of in, of and with John Townend produces a contracted form in which he deletes the final consonant. 
A contracted form for of is common in vernacular English worldwide and is, not surprisingly, well documented for Skelmanthorpe (e.g. at III.13.3) and in the sound recording (e.g. every bit of [ə] coal were hand-gotten) and numerous present-day recordings. The contracted form of with is also common in a number of varieties of British English and certainly persists locally. John’s articulation with a lax vowel preconsonantally is recorded in the SED for Skelmanthorpe (IV.8.6) and in the corresponding sound recording (with [wɪ] fear and trembling I went my way towards this here […] apparition in white). The deletion of the nasal consonant of in is probably less common nowadays, but it is certainly evident in the SED (V.6.6) and occurs frequently in the recording with Wibsey Dyson:

(8) a. with fear and trembling I went my way towards this here […] apparition in [ɪ] white
    b. I’d to wade nearly knee-deep in [ɪ] water and I’d to work in [ɪ] water all the day through
    c. I started a-working at a shilling in the [ɪʔ] day shilling in the [ɪʔ] day for going down the pit


Finally John shows a predictable contrast between a weak vowel for the preposition to preceding a consonant (near to [tə] the house) and a stronger vowel preceding a vowel (he arose and came to his [tu ɪz] father), but also a more marked pronunciation (and the servant said to [ti ɪm] him ‘thy brother has come home’) that is perhaps nowadays more readily associated with speakers further north, especially Scotland. There is no record of this realisation in the Skelmanthorpe data, but there are several examples in SED sound recordings in Yorkshire, although, interestingly, only in localities in the North and East Ridings with the closest example to Shelley in Rillington (BL shelfmark: C908/46 C8 – when he got to the [tɪt] far end of the field as he was mowing across there was a lad leading a Galloway over the field). However, a single token in a BBC Voices Recording in Castleford (no, not today, love, that’s all gone but uh like Andrew’s just said to [tɪ] you you’d have more walnuts and Brazil-nuts than you would filberts because people (they’re more popular) didn’t care for a lot of filberts) perhaps adds some validity to John Townend’s use.

9 The value of British Library archival sound recordings to present-day audiences

As noted above, we rely principally on surviving documentation and Brandl’s testimony to determine whether the BLBCR recordings are faithful records of the dialects they represent. Clearly an element of performance was encouraged, making it difficult to judge whether individual features capture an informant’s typical usage or reflect forms imitative of broader or older dialect speech back home. We might perhaps suspect an artificially inflated number of dialectal forms – lexical variants, non-standard grammatical constructions and archaic or localised pronunciations – chosen and rehearsed in advance, rather than indicative of widespread usage. Whether individual utterances are contrived or intuitive, one might nonetheless accept them as valid evidence of a speaker’s active and passive knowledge of his local dialect, but treat them with caution in terms of authenticity. However, a detailed analysis of the recording with John Townend bears comparison with other sources for which we have greater provenance and confirms the collection as a rich resource for linguists wishing to investigate early twentieth-century English dialects. As an internationally acclaimed survey, the SED is rightly acknowledged as a reliable source of dialect data but the analysis here of the recording with Wibsey Dyson should hopefully also raise awareness of the availability and value of SED sound recordings. The recordings of John Townend and Wibsey Dyson offer linguists insights into phenomena unrecorded in the SED published volumes but captured in contemporary written accounts – as demonstrated most notably here in the case of John Townend’s and Wibsey Dyson’s realisation of /r/ and third person were. Online access to a further 65 BLBCR recordings and 287 SED audio files at bl.uk/sounds therefore provides an essential teaching and learning resource for re-examining other less well-documented features of British dialects.
An exercise similar to that undertaken here would be informative for many other varieties of English. Equally importantly, audio collections are open to interpretation by a wider range of audiences in that non-specialists can respond much more readily to a sound recording than to a purely written description of speech. To the inexperienced researcher academic descriptions of broad twentieth-century dialect, such as those contained in the SED, can be extremely daunting as they inevitably require familiarity with IPA transcription. Hearing the actual voices of real speakers will not only support academic enquiry but also open up respected linguistic datasets for the first time to teachers, students, actors and the general public. The BL is committed to supporting Knowledge Transfer between higher education research and the general public and has enjoyed considerable success in meeting growing public demand for knowledge about English accents and dialects by presenting and interpreting its linguistic audio content to new audiences. The Voices of the UK project (Robinson et al. 2013) will develop access further by creating detailed linguistic descriptions of the dialect content at bl.uk/sounds, with the ultimate aim of making the sound recordings browsable by linguistic criteria. This will mean that in future students unfamiliar with phenomena such as, for instance, the dialectal GOAT diphthongs discussed here for West Yorkshire will be able to locate and audit authentic spoken examples to assist them with interpreting and evaluating more challenging dialect data. Finally, one should simply appreciate the unique value of these audio archives as the earliest known sound recordings of the speech of ‘ordinary’ folk.

References

Barbed Wire Ballads, Pres. Miles Kingston, BBC, UK, (tx time n.k.), 11.05.2005, BBC Radio 4, 30 mins.

BBC Voices Recording in South Milford, BBC, UK, rec. 10/11/2004, 61 mins. 31 secs. [digital audio file] British Library, C1190/19/01 (accessed 11/04/2013).

Brandl, Alois 1925. Der Anglist bei den Engländern [The English scholar among the English]. In: Doegen, Wilhelm 1925. pp. 362-375.

Brandl, Alois 1926. Englische Dialekte, vol.1. Berlin: Preußische Staatsbibliothek.

Brandl, Alois 1936. Zwischen Inn und Themse: Lebensbeobachtungen eines Anglisten: Alt-Tirol, England, Berlin. [Between the Inn and the Thames: Observations on life by an English scholar: Old Tyrol, England, Berlin] Berlin: Grote.

Doegen, Wilhelm 1909. Doegens Unterrichtshefte für die selbständige Erlernung fremder Sprachen mit Hilfe der Lautschrift und der Sprechmaschine. [Doegen’s instruction books for the independent learning of foreign languages with the help of phonetic transcription and the speech machine.] Berlin: Otto Stollberg.

Doegen, Wilhelm 1925. Unter fremden Völkern. [Among foreign peoples] Berlin: Otto Stollberg.

How the Edwardians Spoke, Prod. Alexander Briscoe, BBC, UK, 21.00, 06/07/2007, BBC4, (duration n.k.).

Evolving English: One Language, Many Voices. [public exhibition] London: British Library, 17/11/2010-03/04/2011.

Empire, Faith and War: The Sikhs and World War One. [public exhibition] London: Brunei Gallery, 09/07/2014-28/11/2014.


Fees, Craig 1991. The Imperilled Inheritance: Dialect and Folklife Studies at the University of Leeds 1946-1982. Part 1, Harold Orton and the English Dialect Survey. London, Folklore Society Library.

Foulkes, Paul and Docherty, Gerard (eds) 1999. Urban Voices: Accent Studies in the British Isles. London: Arnold.

Grierson, Sir George (ed.) 1903-1928 Linguistic Survey of India (11 volumes). Calcutta: Office of the Superintendent of Government Printing.

Hickey, Raymond 2011. The Dialects of Irish, Study of a Changing Landscape. Berlin: de Gruyter Mouton.

‘I Predict A Riot,’ Employment, Perf. Kaiser Chiefs, Prod. Stephen Street/ Stephen Harris, UK (2005) [CD: B-Unique Records, B-Unique/Polydor BUN093]. 03 mins 57secs.

Jones, Daniel Examples of English Intonation. Prod. Wilhelm Doegen, rec. 1921. (duration n.k.). [LP: Odeon]. British Library 1CL0073266.

Kellett, Arnold 1987. Dictionary of Yorkshire Dialect, Tradition and Folklore. Otley: Smith Settle.

Millennium Memory Bank Recording with Bob Hayhurst, BBC Radio Leeds, UK, rec. 23/10/1998, 40 mins. 12 secs. [digital audio file] British Library, C900/08505 C1 (accessed 20/04/2013).

Millennium Memory Bank Recording with Ivy Murray, BBC Radio Leeds, UK, rec. 31/03/1999, 47 mins. 21 secs. [digital audio file] British Library, C900/08627 C1 (accessed 14/05/2013).

Millennium Memory Bank Recording with Raymond Ellis, BBC Radio Leeds, UK, rec. 01/04/1999, 32 mins. 45 secs. [digital audio file] British Library, C900/08618 C1 (accessed 20/04/2013).

Millennium Memory Bank Recording with Vera Vant, BBC Radio Leeds, UK, rec. 25/01/1999, 71 mins. 54 secs. [digital audio file] British Library, C900/08561 C1 (accessed 20/04/2013).

Mitchell, Austin 1987. Teach Thissen Tyke. Lancaster: Dalesman.

Orton, Harold 1962. Survey of English Dialects: (A) Introduction. Leeds: E. J. Arnold.

Orton, Harold et al. (eds) 1962-1971. Survey of English Dialects: (A) Introduction; (B) Basic Material (4 volumes). Leeds: E. J. Arnold.

Orton, Harold and Halliday, Wilfred (eds) 1962-1963. Survey of English Dialects: (B) The Basic Material, Vol. 1, Parts 1-3. Leeds: E. J. Arnold.

Mahrenholz, Jürgen 2006. Ethnographic audio recordings in German prisoner of war camps during the First World War. In: Dendooven, D. and Chielens, P. (eds) World War I: Five Continents in Flanders. Ypres: Lanoo.

Petyt, K. M. 1985. Dialect and Accent in Industrial West Yorkshire. Amsterdam: John Benjamins.

Robinson, Jonathan, Herring, Jon and Holly Gilbert 2013. The British Library description of the BBC Voices Recordings Collection. In: Clive Upton and Bethan Davies (eds), pp. 138-162.

Sounds, London: British Library. Online. Available HTTP: sounds.bl.uk


Survey of English Dialects recording in Carleton, University of Leeds, UK, rec. 01/02/1963, 27 mins. 16 secs. [digital audio file] British Library, C908/9 C4 (accessed 28/04/2013).

Survey of English Dialects recording in Holmbridge, University of Leeds, UK, rec. September 1959, 08 mins. 13 secs. [digital audio file] British Library, C908/47 C6 (accessed 28/04/2013).

Survey of English Dialects Recording in Rillington, University of Leeds, UK, rec. 14/04/1955, 09 mins. 58 secs. [digital audio file] British Library, C908/46 C8 (accessed 14/05/2013).

Survey of English Dialects Recording in Sheffield, University of Leeds, UK, rec. 29/12/1952, 09 mins. 28 secs. [digital audio file] British Library, C908/48 C2 (accessed 16/05/2013).

Survey of English Dialects Recording in Skelmanthorpe, University of Leeds, UK, rec. 20/10/1952, 08 mins. 54 secs. [digital audio file] British Library, C908/41 C1 (accessed 09/04/2013).

Survey of English Dialects Recording in Thornhill, University of Leeds, UK, rec. 01/02/1963, 29 mins. 54 secs. [digital audio file] British Library, C908/9 C3 (accessed 13/05/2013).

Survey of English Dialects Response Book for Skelmanthorpe, Brotherton Library Special Collections, University of Leeds, LAVC/SED/2/2/6/31.

Stoddart, Jana, Upton, Clive and Widdowson, J. D. A. 1999. Sheffield dialect in the 1990s: revisiting the concept of NORMs. In: Paul Foulkes and Gerard Docherty (eds), pp. 73-89.

Tarantelli, Valentina forthcoming. Voice into Text. The History of Linguistic Transcription. PhD thesis: University of Sheffield.

The Review of English Studies, Vol. 7 No. 27 (July 1931). London: Sidgwick & Jackson.

Voices of the UK, British Library, UK, [audio CD] British Library, 2010, NSACD 74-75, 153 mins.

Upton, Clive and Widdowson, J. D. A. 2006. An Atlas of English Dialects. Second edition. London: Routledge.

Upton, Clive and Bethan Davies (eds) 2013. Analysing 21st Century British English: Conceptual and Methodological Aspects of the ‘Voices’ Project. London: Routledge.

Wrede, Ferdinand et al. (eds) 1927-1956. Deutscher Sprachatlas. Marburg (Lahn): N. G. Elwert.

Wright, Joseph 1892. A Grammar of the dialect of Windhill in the West Riding of Yorkshire. London: English Dialect Society.

Wright, Joseph (ed.) 1898-1905 English Dialect Dictionary, London: Henry Frowde.

3 Twentieth-century Received Pronunciation: Prevocalic /r/

Anne Fabricius

1 Introduction

Because of its dominance of the programming by the British Broadcasting Corporation for many years, Received Pronunciation is a relatively accessible variety for historical corpus collection and sociolinguistic comparison. There is now a large amount of recorded speech data available from a range of online sources, and more is constantly being added to the collections as resources permit material to be salvaged and digitized (see e.g. http://www.bbc.co.uk/archive/tv_archive.shtml?chapter=10). Since the BBC and British Library archive collections are made up of exemplars of many types of speech context and content, rather than, for example, consisting solely of autobiographical interviews or group conversations, the challenge for the discipline of sociolinguistics lies in being able to assess this historical recorded data in a sociolinguistically-sensitive way. This task requires constant attention to the circumstances of the recordings, to be able to set up systematic corpora which can provide comparable and quantifiable data sets, if the aim, as in this chapter, is to treat the material using proven sociolinguistic methods, including statistical treatment. For the purposes of this chapter, the author gathered a small corpus of fourteen speakers gleaned from the BBC archive, from recordings in a variety of settings and types of TV or radio programme, many of which consisted of personal reminiscences of various kinds, while others were documentary features, made between 1939 and 1977.¹ This small corpus consisted of just under four hours of recordings, which yielded 2511 tokens of /r/ spoken in a range of linguistic contexts. The data were analyzed auditorily and explored for patterns of co-variation between speaker profile, linguistic characteristics and time depth in the historical record.
The results show that tapped and trilled variants of /r/ (analyzed together as ‘taps’) seem to have a different social and linguistic profile to ‘labial’ variants which were found primarily in the speech of three individuals. The data suggest that these /r/ variants have different statuses, premised on different conditions in time and social space. Systematic observations of historical recordings can ultimately provide points of quantitative comparison with younger data sets which have yielded evidence of other changing features during the twentieth century. These include many features which have already been examined quantitatively, such as conditioned t-glottalling (Fabricius 2002), L-vocalization (Wells 1997), as well as vocalic variation and change most notably in the TRAP, FOOT and GOOSE vowels (Harrington et al. 2000, 2011; Fabricius 2007a; Fabricius 2007b), and the weak-syllable HAPPY-vowel (Harrington 2006). Other candidates for such investigations would be the GOAT-split (Wells 1997) and GOAT-fronting to similarity to FACE, as well as an incipient GOOSE-split which has been reported anecdotally in the South-East of England.² FACE,³ MOUTH and FLEECE, on the other hand, also provide interesting points of comparison. Creaky voice quality, smoothing of diphthongs (Hannisdal 2006) and conservative features such as apical /t/ and /s/ have also been treated sporadically in the literature on RP (for present day developments in /s/, see Levon and Holmes Elliot 2013), and could also be considered more extensively in future diachronic comparisons in data sets constructed using online resources.

¹ Thanks are due to Daniel Ezra Johnson for assistance with the statistical results presented in this chapter.

2 Background

If there is one variety of British English which is amenable to quantitative exploration through the construction of an historical sociolinguistically-sensitive corpus of early recordings, it is surely the British English variety we have come to know as Received Pronunciation. Its dominance of the airwaves in the early days of the BBC is well documented, so that historical archives such as those of the BBC and the British Library are replete with examples of spoken material. ‘Listening to the past’, in this case, is a relatively accessible activity, although the sheer volume of material means that much background research remains to be done: the recordings that can be analyzed here are only a tiny fraction of what could be considered systematically, and the analysis presented here can only be regarded as a first step in this direction. Future research of this type has a great deal of potential to detail the historical trajectories which are suggested by summaries such as Wells (1997), and to contribute to a better theoretical understanding of detailed quantitative linguistic pathways to feature obsolescence.
In addition, other historical sources from the time can provide glimpses of rich social context for these linguistic variations, or even specific metalinguistic commentary (see further below). We will make brief remarks here on a related meta-issue, that of labelling ‘the accent’, since this is a facet of the considerable historical sociolinguistic complexity of RP, although as such it is outside the major focus of the present chapter. Recently, within phonetics especially, authors have adopted the designation SSBE (Standard Southern British English) (Hudson, de Jong, McDougall, Harrison and Nolan 2007), while other writers continue to refer to (modern) RP (Fabricius 2000, 2002), and the implications of the use of various names are not irrelevant to the discussion. The latter suggests a generational continuity with older forms of RP, while SSBE makes no such tacit claim. The complexity of giving the accent ‘a name’ is increased further by the sheer scale of the speech community we are dealing with here. The population of England is probably presently 56-57 million people,⁴ and if we take as a first estimate Trudgill’s reference to around 3% of the population being ‘RP speakers’ of some kind (discussed in Trudgill 2002), we arrive at a likely population of around 1.7 million people. Clearly, empirically-grounded linguistic generalizations over that scale are difficult to make, and we can also assume that the likelihood that such a number of speakers represents one completely homogeneous spoken variety is probably small. Any analysis of RP, moreover, has to confront the inbuilt sociolinguistic ambiguity of the name (Fabricius 2000: 29-36). The name RP has become an enregistered folk concept (see Agha 2003, Fabricius and Mortensen 2013), and the variety has been formally codified in many descriptive phonetic publications since the early twentieth century (for details of the enregisterment history, see e.g. Leitner 1982, Mugglestone 2007). The result is that the name refers not only to a vernacular variety (or set of features) spoken by a primarily socially-defined group of speakers, but also to a more or less overtly codified pronunciation blueprint or norm, a ‘construct’ which is known and recognized by speakers of many other varieties of English, both within the UK and beyond, especially in former colonies. Discussions that arose in the 1990s as to whether Estuary English was ‘replacing’ RP, for instance (cf. Trudgill 2002) remind us that the philosophical question of when a variety ‘ceases to be’, historically, is complex and remains pertinent. In a sense, this type of question can be raised of any ‘enregistered’ variety, be it accent, dialect, or language.

² Conditioned by /l/ followed by the presence or absence of a morphological boundary, such that ruler₁ (‘monarch’, bimorphemic) is phonetically distinct from ruler₂ (monomorphemic, instrument used to rule lines), so that the vowel of the second of the pair is more front than the vowel of the first (Christian Uffman, pc, reported on John Wells’ phonetics blog, see http://phonetic-blog.blogspot.dk/2012/02/newly-minimal.html).

³ The author has recently observed marked FACE-monophthongization (perhaps a development in the face of GOAT-fronting which approaches FACE’s onset position and trajectory) in an adolescent female speaker aged 17 or 18, then a pupil at Marlborough College in the UK. See the segment “Charity garden party: super Sunday” and the tokens today, café, cakes, day as spoken by Bella Duncan (from 01:09 to 01:29 in the recording) at this link: http://www.marlboroughcollege.org/news/video-news/#!prettyPhoto (accessed twelfth December 2013).
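The speaker estimate given in this section (around 3% of a population of 56-57 million) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `estimate_speakers` is hypothetical, and the two population figures and the 3% share are simply the values quoted in the text, not independent data.

```python
# Back-of-the-envelope check of the RP speaker estimate quoted in the text.
# Assumptions: population of England 56-57 million; about 3% RP speakers
# (the first estimate attributed to Trudgill in the text).

def estimate_speakers(population: int, percent: int = 3) -> int:
    """Return `percent` per cent of `population`, using integer arithmetic."""
    return population * percent // 100


low = estimate_speakers(56_000_000)
high = estimate_speakers(57_000_000)

# Both bounds round to roughly 1.7 million, the figure given in the text.
print(f"{low:,} to {high:,} speakers")
```

Integer arithmetic is used so that the result does not depend on floating-point rounding; a different percentage can be passed as the second argument to see how sensitive the estimate is to the assumed share.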
Enregisterment (Agha 2003) as a process produces a labelled construct that names a way of speaking and circulates as a piece of social knowledge, so that ‘BBC English’ and ‘The Queen’s English’ (compare ‘Geordie’, ‘Scouse’ or ‘Cockney’) are culturally-defined labels that are attached in the community’s minds to linguistic features at a particular point in history, either individually (‘dropping your t’s’, for instance, as a folk term for t-glottalling) or collectively as ‘spoken styles’ (Johnstone et al. 2006). Received Pronunciation’s broadcast prominence during the emergence of British public radio (with the founding of the British Broadcasting Corporation in 1926), and for many years after, was the result of its social position as the speech variety used by people within a socially-, culturally- and economically-circumscribed dominant elite group. RP was instituted and remained unquestioned as the ‘proper’ accent in which to conduct broadcasting from the earliest years of the BBC, until its monopoly gradually broke down over the years since the 1960s. Its original claim to prominence seems to have been justified on the basis of perceptions that it had widespread acceptance, or rather, did not provoke the opposite. Leitner (1982: 98) quotes a 1929 letter from John (later Lord) Reith, at the time Managing Director of the BBC, to Robert Bridges (then the Poet Laureate), wherein he claims that the BBC’s Advisory Committee aspires to promote pronunciation forms that reflect “the type of educated English which can be broadcast without evoking any considerable degree of relevant adverse criticism”.

⁴ Based on the UK Office for National Statistics 2011 Census: Key Statistics for England and Wales, March 2011. http://www.ons.gov.uk/ons/rel/census/2011-census/key-statistics-for-local-authorities-in-england-and-wales/index.html


This is of course a claim that stems from a particular class and ideological position; related to that, Leitner (1982) describes the sociolinguistic anxieties of the ‘new man’ of the early part of the twentieth century, the socially upwardly mobile individual concerned with sociolinguistic acceptance (cf. Preston 2013 on linguistic insecurity). For academic linguists, RP has most often been described in a ‘variety-based’ way that is commensurate with the structuralist paradigm in linguistics that flourished at the same time, and that, as Leitner (1982) points out, coincided with general societal concern at the time with the establishment of a British standard pronunciation. Certain speech patterns and forms came to be considered canonical ‘RP’, while others were definitely excluded from it and indeed railed against. Trudgill (2002: 174) formulates it thus: “[w]hen it comes to employing a codified language variety, a miss is as good as a mile”. Indeed, Daniel Jones, in 1910 lecturer in Phonetics at University College London, came up against considerable opprobrium from the man who became Poet Laureate in 1913, Robert Bridges (see Collins and Mees 1999: 104-106) as to whether or not professional phoneticians had a duty to recommend certain pronunciation forms and not others. Daniel Jones himself was notably objectivist rather than prescriptivist, apart from some comments in his very earliest book-length publication, The Pronunciation of English from 1909 (Collins and Mees 2001). In other words, the variety of RP came to be treated by society as a whole as a descriptive linguistic monolith, codified in pronunciation dictionaries. This viewpoint became more and more difficult to sustain as time went on and observations of successive generations of speakers meant that the model regularly needed to be adjusted to represent what seemed to be new mainstream pronunciations; cf.
Gimson (1981) and his discussion of criteria for his updates of the English Pronouncing Dictionary, originally Jones (1917). Sensitivity to the developments that RP speech has been undergoing is also evident repeatedly in Gimson’s and Wells’ publications, and second and third editions of the Longman Pronunciation Dictionary (Wells 2000, 2008) include reports of speakers’ perceptions of their own usage of certain pronunciation variants. This ongoing process of change ultimately raises the philosophical questions of when RP might have ceased to be – a new label such as SSBE circumvents this – and how closed the boundaries are to be considered, questions that were especially exercising linguists and lay persons alike in debates on ‘Estuary English’ during the 1990s (cf. Kerswill 2007). Furthermore, the enregisterment of a variety under a new name renders invisible the sociolinguistic process of obsolescence that individual linguistic features undergo within a generational frame, thus cutting these processes off from theoretical consideration and insight. That so much energy has been spent on questions of RP’s linguistic enregisterment (for academic treatments of this, see Trudgill 2002, Wells 1997) is probably indicative of its continuing gatekeeper function, which has undoubtedly been diluted since the 1960s; RP’s place in the sociolinguistic landscape has been affected by widespread socio-economic and ideological change over the past 50 years (Coupland 2010, Coupland and Bishop 2007, Kerswill 2007, Armstrong and Mackenzie 2013). Mugglestone (2007: 254-294) describes RP’s historical transition from ‘proper’ to ‘posh’, and this alliterative characterization sums up the transformation well. It seems that a variety’s demise must depend upon the loss not only of distinctive (majority) variants that characterize the speech form, but also the loss of its ideological place


in the sociolinguistic landscape. If RP was established and codified to fulfill a social gatekeeping role, and continues to exert that pressure in certain contexts, even if its phonetic forms have undergone systematic sociolinguistic change, is it then a ‘dead accent’? The only way out of this conundrum, it seems to the present author, is to regard any accent as an assemblage of a collection of linguistic features as well as an enregistered ideological linguistic construct (a set of postulates about what the accent sounds like). Determining whether RP exists, then, is a tractable empirical question, at least from a micro-linguistic point of view, feature by feature (on the present ideological landscape of RP as expressed through one young speaker’s metalinguistic reactions, see Fabricius and Mortensen 2013). Systematically-collected quantitative speech survey data in the UK has long ignored RP, in contrast to other varieties which have been the focus of dialectological studies (The Survey of English Dialects, for example, see Robinson, this volume) and sociolinguistic studies for many years (US urban varieties, Scottish English); for a discussion of the reasons which lie behind this academic sociolinguistic gap, see Fabricius (2000). RP was not generally the subject of systematic quantitative variationist studies until around 2000, for example Fabricius (2000), Harrington et al (2000), Altendorf (2003), Hawkins and Midgley (2005) and Hannisdal (2007). Historical sociolinguistic corpora of English, such as those listed on the CORD database (see http://www.helsinki.fi/varieng/CoRD/index.html) are an important research resource, but the potential for productive spoken language corpora built with historical materials has scarcely been touched, let alone fulfilled. 
2.1 Variationist sociolinguistics and RP

In constructing and analyzing a historical corpus, even on the relatively small scale of this chapter, we are subscribing to the well-known apparent time hypothesis (Bailey 2002), the assumption current within variationist sociolinguistics that speakers’ phonological patterns and phonetic variation, assuming no great shifts, spatial or otherwise, in their lives, reflect their vernacular and thus the community grammar at the time of their early childhood. The assumption is that this vernacular variety will be evident in speech production throughout the lifespan. The motivation for different types of speech tasks in the classic sociolinguistic interview was that vernacular speech would emerge most strongly in situations where speech production was least monitored. The concomitant premise, that ‘vernacular situations’ were somehow definable and separable from other, more monitored situations, is not one which has generally held up to critical scrutiny, as has been shown in publications since Bell (1984) (see e.g. the papers in Eckert and Rickford 2001). Nonetheless, we subscribe to an assumption here that variation in certain linguistic features will be ‘under the (vernacular) radar’ for speakers, and, in the absence of evidence to the contrary, we include /r/ variation in that category. Determining whether /r/ variation was subject to overt commentary at the time of the recordings will require more extensive research on this type of data and in historical sources than has been possible to date. We make some comments in the analysis below about the types of media event that these examples of early



recordings might represent, and the relationship that these might have to vernacular speech of the time. These caveats notwithstanding, this work proceeds from a quantitative sociolinguistic point of departure. Its analyses are premised on the theoretical claim that examination of quantitative patterns of the distribution of variants reveals the embedding of the variation in a social grammar of the variety. It assumes that speakers are members of a community of some sort, although such speakers do not occupy a clearly defined geographical area; nor do they necessarily interact often with each other, so that in this case the community is widely defined as speakers with upper middle and upper class backgrounds⁵ (see Appendix 1), since our knowledge of these individuals’ actual social networks at the time is limited. We work on the premise that /r/ constitutes a linguistic variable with potential sociolinguistic significance within an indexical field (Eckert 2008), and the task at hand in this chapter is to begin to reveal a structure to that significance in the production of speech.

2.1.1 /r/ in historical RP

5 Recent work on social class in Britain within sociology such as Savage et al (2013), and especially a historically-informed sociology, promises to provide sociolinguistics with a fruitful re-connection with social class theory in the UK.


Figure 1. Consonant chart, Jones (1909: xiii)

Figure 1 above reproduces the table of English speech sounds from Daniel Jones’ The Pronunciation of English, first published in 1909. The interesting feature of this chart is the representation of the consonant /r/, which is split into two forms, the roll (or trill) and the so-called ‘fricative r’ (it is not clear precisely what this is intended to refer to). No mention is made of a ‘tapped’ r, which, as we shall see below, is an important feature of the corpus of recordings that are analyzed here. Wells (1997) dates the loss of tapped /r/ and its replacement by alveolar ɹ in intervocalic positions to the ‘early twentieth century’, and while we do not have sufficient data to corroborate this in the present corpus, where no speakers were born later than 1918, it would indeed be interesting to pursue this variant in speakers born later in the century (again, more research is needed). It is noticeable, for instance, in the recording with Lord Cromer, YM1 in this corpus (see below), that the interviewer (Paul Ferris) uses tapped /r/ far more frequently than the interviewee. Labiodental and generally fronted /r/’s are not mentioned in Jones’ treatments of RP pronunciation, but are discussed in Wells’ description of U-RP (Wells 1982:


282), along with tapped /r/. Labiodental ʋ is referred to as being ‘regarded as an upper-class affectation’ (Wells 1982: 282), an image which is corroborated by George Orwell’s representation of it in 1936 in Keep the Aspidistra Flying (cited in Foulkes and Docherty 2000), where <w> for orthographic <r> is used to represent /r/ within a consonant cluster in bwowse and poetwy, intervocalic /r/ in tewwible, and initial /r/ in wesist, in a parody of upper class speech. We turn now to a discussion of the corpus data and its analysis.

2.2 Data

In dealing with the historical material, it has been a principle of the work to select recordings and speakers from the BBC archive who can be independently established to fall within the expected social grouping of RP speakers at the time. This use of independent social criteria (contra e.g. Gimson 1981 referred to above) was adopted so as to circumvent the circularity inherent in choosing RP speakers by means of linguistic criteria alone, a problem discussed in Fabricius (2000, 2002). The group of RP speakers, for the time period covered in the present corpus, where speakers were born between 1880 and 1918, largely consisted of speakers with upper middle class or upper class backgrounds, who, according to Jones (1917: viii), came from families and social circles whose education had typically been at the English public schools. This educational criterion certainly applies in the case of the male speakers in the corpus, many of whom have preparatory and public school backgrounds, while many of the female speakers were educated by governesses rather than at school, although this is not the case for all of them (Speaker OF2, for example, had a university education). The data were selected by surveying the publicly available BBC archive corpus (http://www.bbc.co.uk/archive/). Tables 1a and 1b give details of the selected recordings. Note that OF1 and OF2 are interviewed within the same recording.

ID   Speaker                                              DOB   Recorded  Length
OF1  Baroness Asquith                                     1887  1968      00:22:11
OF2  Baroness Stocks                                      1891  1968      00:22:11
OF3  Dame Agatha Christie                                 1890  1955      00:02:46
                                                                1955      00:02:06
OF4  Bridget Monckton, eleventh Lady Ruthven of Freeland  1896  1977      00:13:37
OM1  Baron Dowding                                        1882  1968      00:36:41
OM2  Viscount Alanbrooke                                  1883  1957      00:28:54
OM3  A.P. Herbert                                         1890  1954      00:09:13
OM4  E.F.L. Wood, first Earl of Halifax                   1881  1939      00:17:21
YF1  Doris Langley-Moore                                  1902  1957      00:13:27
YF2  Lady Alexandra Naldera Curzon                        1904  1977      00:07:37
YF3  Dame Daphne du Maurier, Lady Browning DBE            1907  1971      00:12:20
YM1  Lord Cromer, GRS Baring                              1918  1964      00:08:18
YM2  Cecil Day-Lewis                                      1904  1962      00:12:52
YM3  Sir Arthur John Gielgud                              1904  1954      00:25:59
TOTAL TIME (joint OF1/OF2 recording counted once)                          3:33:22

Table 1a. Composition of the data corpus
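As an arithmetic check on the size of the corpus, the recording lengths listed in Table 1a can be summed by speaker gender and overall. The following is a minimal sketch, not part of the original analysis: it assumes that the joint recording of OF1 and OF2 is counted once and that both of OF3's short recordings are included, and the helper names `to_seconds` and `fmt` are illustrative only.

```python
from datetime import timedelta

# Recording lengths (hh:mm:ss) copied from Table 1a.
# Assumption: the shared OF1/OF2 recording is listed once;
# OF3 contributes two short recordings.
LENGTHS = {
    "female": ["00:22:11", "00:02:46", "00:02:06", "00:13:37",
               "00:13:27", "00:07:37", "00:12:20"],
    "male": ["00:36:41", "00:28:54", "00:09:13", "00:17:21",
             "00:08:18", "00:12:52", "00:25:59"],
}


def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def fmt(seconds: int) -> str:
    """Format a number of seconds as h:mm:ss."""
    return str(timedelta(seconds=seconds))


totals = {group: sum(to_seconds(item) for item in items)
          for group, items in LENGTHS.items()}
totals["all"] = totals["female"] + totals["male"]

for group, seconds in totals.items():
    print(group, fmt(seconds))
```

Under these assumptions the sums come to 1:14:04 for the female speakers, 2:19:18 for the male speakers and 3:33:22 overall, which agrees with the three hours and thirty-three minutes reported in the Methods section.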

ID   Link to recording
OF1  http://www.bbc.co.uk/archive/suffragettes/8318.shtml
OF2  http://www.bbc.co.uk/archive/suffragettes/8318.shtml
OF3  http://www.bbc.co.uk/archive/agatha_christie/12501.shtml
     http://www.bbc.co.uk/archive/agatha_christie/12503.shtml
OF4  http://www.bbc.co.uk/archive/edward_viii/12939.shtml
OM1  http://www.bbc.co.uk/archive/battleofbritain/11421.shtml
OM2  http://www.bbc.co.uk/archive/churchill/11010.shtml
OM3  http://www.bbc.co.uk/archive/churchill/11009.shtml
OM4  http://www.bbc.co.uk/archive/ww2outbreak/7933.shtml
YF1  http://www.bbc.co.uk/archive/whatwewore/5610.shtml
YF2  http://www.bbc.co.uk/archive/edward_viii/12927.shtml
YF3  http://www.bbc.co.uk/archive/writers/12222.shtml
YM1  http://www.bbc.co.uk/archive/menandmoney/6800.shtml
YM2  http://www.bbc.co.uk/archive/van_gogh/10901.shtml
YM3  http://www.bbc.co.uk/archive/hamlet/8505.shtml

Table 1b. Links to online recordings

Each individual’s social and family background was investigated as far as possible through online sources (primarily Wikipedia), to determine the extent to which speakers could be said to have a social background which would firmly place them within the upper middle or upper class, and/or a relevant educational background. Some speakers clearly came from aristocratic families (OF4, OM4, YF2, YM1, YM2). Several speakers had been pupils at public schools, had studied at Oxbridge or had undergone military officer training. Growing up in England was generally considered a requirement, but growing up abroad before education in England was also considered admissible when other background factors were taken into account (aristocratic backgrounds in India, for example). Cecil Day Lewis proved to be an interesting example, as close listening to his recording showed traces of his Anglo-Irish upbringing, which emerged in slight phonetic details (post-vocalic /r/ and clear l for syllabic l, the latter by now, in 2015, almost an historical variant; Raymond Hickey p.c.) in a small passage of a few seconds in the recording (5:31 to 5:38; in the phrases “of the highest art” and



“lower art”, “people are more important”). Day Lewis’ biography also suggests that he considered himself a British citizen rather than an Irish one. As these pronunciations were isolated, he was considered eligible to be in the corpus. Appendix 1 gives further summary biographical details for each speaker included. The recordings cover a range of situational contexts and speech registers, from personal domestic settings, as in the case of Daphne du Maurier talking about her writing career in a recording made at her private home, to Lord Cromer, Governor of the Bank of England, being interviewed in the bank about the role of the Governor vis-à-vis the government of the day. Interviews of war reminiscences feature as well, with individuals who were high-ranking military officers at the time (Hugh Dowding and Lord Alanbrooke). Baroness Asquith and Baroness Stocks are interviewed together about the history of the suffrage movement some fifty years before the recording. Several recordings are monologues on various topics, ranging from Cecil Day Lewis’s presentation of Van Gogh’s art and Doris Langley-Moore’s documentary on the history of fashion to A.P. Herbert’s reminiscences of Winston Churchill’s parliamentary career and Sir John Gielgud’s discussion of Hamlet. As we saw above, there are hints in the phonetic literature that trilled r (or, as it is called in Daniel Jones’ early work, the rolled /r/) had a special status around the turn of the twentieth century. Trilled r is very rare in the corpus of recordings, occurring only 10 times in all (while tapped /r/’s occur 333 times in all). Five of these occur in Cecil Day Lewis’ monologue on the art and life of Vincent van Gogh, and on that basis it may be that the trill has in the past had a special ‘performance-style’ status, but this idea requires more research. There is clearly the potential for a good deal of stylistic variation in the historical data due to speaker stance (Kiesling 2009) and topic. 
One small observation of the potential effect of speaker stance on pronunciation can be illustrated from the interview with Baroness Stocks and Baroness Asquith. In the last part of the interview, the interaction moves to more of a conversation between the two interviewees, who begin to discuss animatedly what types of barriers to public life for women have been removed over the years. At this point, there is a case of trilled r within a syllable onset cluster in ‘breached’, occurring at around 19 minutes into the recording, uttered by Baroness Asquith. The same speaker also utters a trilled r in one token of ‘heroism’ at 14:39 minutes into the recording, which immediately follows two medial alveolar tokens of ‘heroic’. A spectrogram of this token is shown in Figure 2 below. These cases demonstrate that there is a potential value in a close qualitative analysis of interactional details, as a supplement to larger scale quantitative work such as in the present chapter. The fine-grained moves of stance-taking on an individual level may indeed reveal interesting tendencies which need to be taken into account in a more complete analysis.


Figure 2. Waveform and spectrogram of trilled r sequence in ‘heroism’ spoken by OF1 (Baroness Asquith)

2.3 Methods

The final set of recordings having been chosen and located in the BBC Archive (http://www.bbc.co.uk/archive/), all recordings were accessed and recorded as .wav files onto a personal computer using SoundTap software (http://www.nch.com.au/soundtap/). Each sound file was then imported into ELAN (http://tla.mpi.nl/tools/tla-tools/elan/), which enabled the linking of the sound file with its transcription. For each file, tokens of /r/ were identified in four phonological positions: word-initial, medial/intervocalic, potential sites for r-sandhi (‘linking r’), and within clusters at the onset of syllables. Separate tiers for the word and its phonetic transcription were used systematically throughout all ELAN files. Phonological position within the word was noted for each case.⁶ The varying sound quality of the recordings meant that acoustic instrumental analysis of the data was judged to be potentially difficult and in some cases not possible at all. In addition, while some recordings were clearly BBC studio recordings, others were made in differing circumstances (one in a private home, for instance), and in none of the cases was it possible to reconstruct what recording equipment had been used. Since microphones, for example, have an effect on formant measurements (Foget Hansen and Pharao 2006), which are relevant for /r/ (Foulkes and Docherty 2000, Stuart-Smith et al. 2014), it was decided to avoid potential complications in using instrumental measures and to proceed with auditory analysis. The files were played back using a Dell Inspiron 1525 computer’s High Definition Audio device and listened to using Sennheiser HD212 Pro headphones. Each token of /r/ was identified and coded phonetically for

6 Preceding and following phonological elements were not noted on this run through the data.



word position and phonetic character by the author, using one of a set of phonetic possibilities:

(1) Strongly alveolar approximant ɹ, with no auditory evidence of fronting, and auditorily rounded
(2) Labialized alveolar tokens with weak rounding, midway between ɹ and ʋ
(3) Labiodental approximant tokens: ʋ
(4) Tapped r tokens: ɾ
(5) Trilled r: r

Additional minor tokens occurring in the data were as follows. Fricated /r/’s were found in initial position and in syllable onset clusters. Linking /r/’s were occasionally vocalized, as were some medial /r/’s between weak syllables. Single backed /r/’s were coded as velarized in initial position or as retroflex in linking /r/ position, where they had a quality similar to an American /r/. Some tokens were discarded because of inaudibility due to recording noise or speaker overlap. Three hours and thirty-three minutes of recordings yielded in all 2511 usable tokens of /r/, divided between the fourteen speakers as shown in Table 2 below.

Speaker  N /r/ tokens    Speaker  N /r/ tokens
OF1      150             YF1      241
OF2      97              YF2      88
OF3      72              YF3      125
OF4      112             YM1      81
OM1      293             YM2      202
OM2      311             YM3      411
OM3      106
OM4      222
Total    2511

Table 2. Number of /r/ tokens by speaker

The speaker with the smallest number of tokens was OF3 with 72, while YM3’s recording contains over 400 tokens. Male speakers dominate the corpus time-wise (2 hours 19 minutes as opposed to 1 hour 14 minutes for female speakers); recordings with female speakers were on average shorter than those with male speakers.⁷ Table 3 shows the distribution of the 2511 tokens according to phonological position. Almost half of the tokens were located in onset clusters (a category which included both word-initial and word-medial clusters). Smaller numbers were available for the other environments, but the data were extensive enough to allow statistical treatment.

7 Daphne du Maurier’s recording is an excerpt of an interview from a longer conversational documentary filmed in several locations around du Maurier’s home. The chosen excerpt is filmed in her living room, where she is sitting still, as opposed to walking around, which she does for much of the programme.


Position       Total
Initial        498
Linking r      284
Medial         624
Onset cluster  1105
Total          2511

Table 3. Tokens of /r/ by phonological position

Two interesting phonetic features within the data emerged during analysis. It was decided to test statistically for the distributions of tapped and trilled /r/, and for labial and labialized /r/, and to determine which independent factors seem to be promoting their usage. Initial cross-tabulations showed that most speakers included tapped tokens of /r/, but they were absent from the recording of Lord Halifax (recorded 1939). Labialized tokens were rarer, but, as Figure 4 below demonstrates, these were particularly prominent in one individual: again, Lord Halifax, OM4. For that reason, the statistical results on tapped and trilled /r/ below are based on 13 speakers (omitting OM4), while the statistics on labialized tokens include all 14 speakers. It was determined that a suitable modelling of the data could be obtained using mixed-effects logistic regression with speaker as a random factor (Johnson 2009). For the purposes of statistical analysis using multiple logistic regression as made available in Rbrul,⁸ the data were recoded such that examples of categories 2 and 3 were categorized as having labial character to some degree, while all other tokens were classed as ‘non-labial’. Similarly, in a second run through the data, categories 4 and 5 (taps and trills) were together recategorised as ‘taps’ (since there were few tokens of trills), while all other /r/ tokens were classified as ‘non-taps’. In that way, the tokens were reduced to a binary contrast (tap/non-tap, labial/non-labial) which could be tested statistically. The data were coded with a set of internal linguistic and external social predictors which were tested in the statistical model: gender, date of birth, date of the recording, and type of speech (whether monologue or interview). Date of birth was initially converted into a binary ‘century of birth’ factor, whether 19th (1800s) or 20th century (1900s), and then also tested as a continuous factor.
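The recoding step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used in Rbrul: the category numbers follow the five-way auditory coding in the Methods section, while the token tuples, the function names and the `exclude` parameter are hypothetical and introduced here only for the example.

```python
# Five-way auditory coding of /r/ tokens, numbered as in the Methods section.
CATEGORIES = {
    1: "alveolar approximant",
    2: "labialized alveolar",
    3: "labiodental approximant",
    4: "tap",
    5: "trill",
}


def is_labial(category: int) -> bool:
    """Categories 2 and 3 count as having labial character."""
    return category in (2, 3)


def is_tap(category: int) -> bool:
    """Categories 4 and 5 (taps and trills) are pooled as 'taps'."""
    return category in (4, 5)


# Hypothetical coded tokens: (speaker, position, category). Not corpus data.
tokens = [
    ("OM4", "initial", 3),
    ("OM4", "medial", 2),
    ("OF1", "medial", 4),
    ("OF1", "onset cluster", 5),
    ("YM3", "linking r", 1),
]


def application_rate(rows, predicate, exclude=()):
    """Share of tokens for which the binary response applies."""
    kept = [row for row in rows if row[0] not in exclude]
    return sum(predicate(row[2]) for row in kept) / len(kept)


print("labial rate, all speakers:", application_rate(tokens, is_labial))
print("tap rate, OM4 omitted:", application_rate(tokens, is_tap, exclude={"OM4"}))
```

Keeping the two binary responses as separate predicates mirrors the two separate runs through the data, and the `exclude` argument corresponds to omitting OM4 from the tap analysis only.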
‘Date of the recording’ was later recoded into a factor called ‘decade of recording’ and tested within the model. Position in the word was the only internal linguistic factor tested. ‘Speaker’ was included in the model as a random factor, a practice that brings sociolinguistic modelling better in line with other social science disciplines in treating individuals as potentially divergent from their social group (Johnson 2009: 365).

3 Results

We turn first to examine tapped and trilled /r/ in the data. As Figure 3 below shows, cross-tabulations of the data revealed a trend of decreasing tapped /r/ usage across the decades under consideration. This decrease applies across medial (intervocalic)

8 http://www.danielezrajohnson.com/rbrul.html



contexts and linking /r/ contexts most obviously, while taps and trills are almost absent from the other two environments. Taps and trills in initial position however also decrease from a very marginal rate in the 1950s. This trend, which is independent of the speakers’ dates of birth, suggests a changing ‘style of the time’ where taps and trills become more and more rare in the BBC recordings. As we shall see below, this factor does prove significant in modelling taps and trills in the post-war recordings.

Figure 3. Trends in rates of tapped and trilled /r/ by word position according to decade of recording

These trends suggested that it could be revealing to model the data according to word position and decade of recording, as well as year of birth as a continuous factor and speaker as a random factor. This model turned out to be highly significant, while factors such as gender and speech context (interview versus monologue) did not prove significant. Table 4 below shows the results for 13 speakers and 2289 tokens (omitting OM4, as noted above) for tapped and trilled /r/, recoded and categorized together as ‘taps’. The three independent factors, in order from most to least significant effect, were position in the word (p=6.26e-132), decade of recording (p=1.57e-07) and year of birth, examined as a continuous variable (p=0.000325). Speaker as a random effect was also part of the model and contributes to a strengthening of the results. In the context of this prestigious variety as it was at the time, tapped /r/ seems to have had the status of majority variant in certain word positions: the model here shows that medial and linking /r/ positions greatly favour the variant. This is, for that matter, not surprising in articulatory terms, since these are environments that always contain /r/ in intervocalic position. Initial /r/ may also


be immediately following a vowel, but as preceding environment was not coded on this run through the data, the significance of tapped /r/ within the word-initial category cannot be fully tested. The results for the factor ‘decade of recording’ show that while the 1950s data favour taps and trills, with a factor weight of 0.635, we already see a slight disfavouring of the feature in the 1960s, and a further decrease in the 1970s. The 1950s and 1960s are therefore particularly interesting decades to explore through further data. Year of birth as a continuous variable also shows that taps and trills decrease systematically through later generations, so that the purely speaker-diachronic trend follows the ‘decade of recording’ trend, and both are independently significant.

Deviance 1283.101    Df 8    Grand mean 0.15

Factor                      Log odds  Tokens (N)  Proportion of application value  Centred factor weight
DECADE OF RECORDING
  1950s                      0.556    1141        0.187                            0.635
  1960s                     -0.033    823         0.119                            0.492
  1970s                     -0.523    325         0.098                            0.372
YEAR OF BIRTH (continuous)   0.025
POSITION
  medial                     2.122    578         0.439                            0.916
  linking r                  1.234    260         0.250                            0.775
  initial                   -1.163    451         0.029                            0.238
  onset cluster             -2.193    1000        0.011                            0.1

Table 4. Mixed-effects logistic regression modelling for tapped and trilled /r/ (omitting speaker OM4), N=2289

Deviance 1118.775    Df 9    Grand mean 0.101

Factor                      Log odds  Tokens (N)  Proportion of application value  Centred factor weight
SPEECH TYPE
  Interview                  1.012    1257        0.093                            0.733
  Monologue                 -1.012    1254        0.108                            0.267
DECADE OF RECORDING
  1930s                      3.449    222         0.527                            0.969
  1950s                     -0.745    1141        0.035                            0.322
  1970s                     -0.900    325         0.154                            0.289
  1960s                     -1.803    823         0.056                            0.141
POSITION
  initial                    1.062    498         0.189                            0.743
  medial                     0.266    624         0.093                            0.566
  onset cluster             -0.076    1105        0.083                            0.481
  linking r                 -1.252    284         0.032                            0.222

Table 5. Mixed-effects logistic regression modelling for labiodental and labialized /r/, N=2511

We turn now to consider the status of labialized /r/. Table 5 above shows the results for labiodental and labialized /r/, recoded and categorized together as ‘labials’. Three independent factors proved significant in the modelling of variation in /r/. These were, in order from least to most significant, decade of recording (p=0.00805), speech type (p=0.00294) and position within the word (p=5.96e-13).
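The relationship between the log odds and the centred factor weights reported in Tables 4 and 5 can be made explicit with a short calculation. The sketch below assumes that a centred factor weight is the inverse logit of the corresponding log-odds coefficient, which is how the two scales are usually related in Rbrul-style output; the function name `factor_weight` and the row labels are illustrative only.

```python
import math


def factor_weight(log_odds: float) -> float:
    """Inverse logit: map a centred log-odds coefficient onto the 0-1 scale."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# (log odds, printed factor weight) pairs copied from the
# 'decade of recording' rows of Tables 4 (taps) and 5 (labials).
ROWS = {
    "taps, 1950s": (0.556, 0.635),
    "taps, 1960s": (-0.033, 0.492),
    "taps, 1970s": (-0.523, 0.372),
    "labials, 1930s": (3.449, 0.969),
    "labials, 1950s": (-0.745, 0.322),
    "labials, 1970s": (-0.900, 0.289),
    "labials, 1960s": (-1.803, 0.141),
}

for label, (log_odds, printed) in ROWS.items():
    computed = factor_weight(log_odds)
    print(f"{label:15s} log odds {log_odds:+.3f} -> {computed:.3f} (printed {printed:.3f})")
```

A log odds of zero corresponds to a factor weight of 0.5, so positive coefficients favour the variant and negative ones disfavour it. Applied to the medial row of Table 4, the same function gives about 0.893 for a log odds of 2.122, somewhat below the printed 0.916, so that cell may be worth checking against the original model output.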

Figure 4. Proportion of labial and non-labial variants in the corpus by individual speaker

Note that gender was not proven significant in this model either; labial variation seems to have a different social status to tapped/trilled /r/ given the profile in Figure 4, but we cannot find detailed evidence for what that status consisted of here in


terms of a possible gender dynamic. Labials (labiodentals and labialized alveolars) are predominantly produced by three individuals only, as Figure 4 illustrates. OF2 (Baroness Stocks), OM4 (Lord Halifax), and YF3 (Daphne du Maurier) are the only three speakers whose production of labials is above the average for all speakers in the corpus of just over 10%. Lord Halifax (socially speaking, from an aristocratic, upper class background) is by far the most prolific user of labials for /r/ at 52.7%. While he is the only speaker of this type in the present corpus, the BBC archive potentially holds other examples of comparable recordings which could provide a firmer basis for conclusions in future. To turn to the factors which did prove significant in the logistic model, word position, decade of recording and speech type, we can consider each in turn. Labials appear most strongly favoured in initial position, and medials, with a factor weight of 0.566, are also favoured, although more weakly. Other word positions do not emerge as favouring labial production. Decade of recording shows a strongly favouring effect only in the case of the 1930s, which is represented by the single recording of OM4 referred to above. Other decades do not favour labial production, but as Table 5 shows, this is a result which is strongly affected by the dominance of a single speaker in this limited corpus. The result for speech type shows a strong factor weight favouring ‘Interview’ as speech context, which may seem anomalous given that OM4’s recording is a monologue, but Figure 4 shows that a large number of labial tokens also occur in the interviews recorded with YF3 (Daphne du Maurier) and OF2 (Baroness Stocks). Although we cannot tell the final conclusive ‘story of labial r’ here, we do have tentative indications that labial variants are an idiosyncratic and individual feature in this corpus rather than a general sociolinguistic feature of the group, as tapped /r/’s seem to be. 
4 Conclusions

As this volume demonstrates, it is not only ongoing, present-day sociolinguistic variation and change that can now be studied empirically and quantitatively. The audible past is accessible, and with sensitive sociolinguistic treatment, can yield many insights into the detailed trajectories of variation at the level of single variables over time. How the different variants, such as different phonetic qualities for /r/, cluster by co-occurrence into varieties can then become a matter of empirical concern, and thereby inform other branches of the historical sociolinguistic and linguistic enterprise. Linguistic variants’ ‘routes to obsolescence’ are actually a relatively under-researched area in sociolinguistics, since the field tends to focus on ‘new and upcoming’ linguistic features. Systematic observations of historical recordings can ultimately also provide points of quantitative comparison with younger, more contemporary data sets. In that way, we can gain a greater time depth for studies, providing substantial empirical evidence of the changing features of RP during the twentieth century, including, as here, the quantitative profiles of various qualities of /r/. The present study has demonstrated that labial /r/ was a more peripheral and idiosyncratic feature in these recordings from the decades around and after World War 2, while tapped /r/ was more solidly socially-based, and largely generation-driven on its route towards obsolescence. Many other such variable features await diachronic comparisons of this kind.


The exemplary analysis here also serves to demonstrate the potential this type of public media-derived data can have for large-scale systematic research. The technologies that are needed, in terms of large digital memory capacities and accessible open source and freeware analytical tools, are more and more widely available. We can look forward to “listening to the past” being a more common pursuit in the future.

References

Agha, Asif 2003. The social life of cultural value. Language and Communication, 23(3-4): 231-273. doi:10.1016/S0271-5309(03)00012-0

Altendorf, Ulrike 2003. Estuary English: Levelling at the Interface of RP and South-Eastern British English. Tübingen: Narr.

Armstrong, Nigel and Ian E. Mackenzie 2013. Standardization, Ideology and Linguistics. Basingstoke: Palgrave Macmillan.

Bailey, Guy 2002. Real and apparent time. In: J. K. Chambers, Peter Trudgill and Natalie Schilling-Estes (eds) The Handbook of Language Variation and Change. Malden, MA: Blackwell, pp. 312-332.

Bell, Alan 1984. Language style as audience design. Language in Society 13: 145-204.

Collins, Beverley and Inger M. Mees 1999. The Real Professor Higgins: the Life and Career of Daniel Jones. Walter de Gruyter.

Collins, Beverley and Inger M. Mees 2001. Daniel Jones, Prescriptivist, R.I.P. English Studies 82(1): 66-73.

Coupland, Nikolas 2010. Language, ideology, media and social change. In: Karen Junod and Didier Maillat (eds) Performing the Self. SPELL: Swiss Papers in English Language and Literature 24. Tübingen: Narr, pp. 127-152.

Coupland, Nikolas and Hywel Bishop 2007. Ideologised values for British accents. Journal of Sociolinguistics 11(1): 74-93.

Cruttenden, Alan 2008. Gimson’s Pronunciation of English. Seventh edition. London: Hodder Education.

Eckert, Penelope 2008. Variation and the indexical field. Journal of Sociolinguistics, 12(4): 453-476. doi:10.1111/j.1467-9841.2008.00374.x

Eckert, Penelope and John Rickford (eds) 2001. Style and Sociolinguistic Variation. Cambridge: Cambridge University Press.

Fabricius, Anne H. 2000. T-glottalling between Stigma and Prestige: A Sociolinguistic Study of modern RP. PhD thesis, Copenhagen: Copenhagen Business School.

Fabricius, Anne H. 2002. Ongoing change in modern RP: evidence for the disappearing stigma of t-glottalling. English World-Wide 23(1): 115-136.

Fabricius, Anne H 2007a. Variation and change in the TRAP and STRUT vowels of RP: a real time comparison of five acoustic data sets. Journal of the International Phonetic Association, 37(03): 293-320. doi:10.1017/S002510030700312X

Fabricius, Anne H., 2007b. Vowel formants and angle measurements in diachronic sociophonetic studies: FOOT-fronting in RP. Proceedings of the sixteenth ICPhS, Saarbrucken, August 2007. www.icphs2007.de.



Fabricius, Anne H. and Janus Mortensen 2013. Language ideology and the notion of construct resources: a case study of modern RP. In: Tore Kristiansen and Stefan Grondelaers (eds) Language (De)standardisation in Late Modern Europe: Experimental Studies. Oslo: Novus, pp. 375-402.

Foget Hansen, Gert and Nicolai Pharao 2006. Microphones and measurements. Lund University, Centre for Languages and Literature, Dept. of Linguistics and Phonetics, Working Papers 52: 49-52.

Foulkes, Paul and Gerald J. Docherty 2000. Another chapter in the story of /r/: “Labiodental” variants in British English. Journal of Sociolinguistics, 4(1): 30-59. doi:10.1111/1467-9481.00102

Gimson, A. C 1981. The Twentyman Lecture. The pronunciation of English: Its intelligibility and acceptability. Modern Languages 62: 61-68.

Hannisdal, Bente R 2007. Variability and Change in Received Pronunciation A Study of Six Phonological Variables in the Speech of Television Newsreaders. Doctoral thesis, University of Bergen. Retrieved August 1, 2012, from https://bora.uib.no/handle/1956/2335

Harrington, Jonathan 2006. An acoustic analysis of ‘happy-tensing’ in the Queen’s Christmas broadcasts. Journal of Phonetics 34(4): 439-457.

Harrington, Jonathan, Sallyanne Palethorpe and Catherine Watson 2000. Monophthongal vowel changes in Received Pronunciation: an acoustic analysis of the Queen’s Christmas broadcasts. Journal of the International Phonetic Association, 30(1-2): 63-78.

Harrington, Jonathan, Felicitas Kleber and Ulrich Reubold 2011. The contributions of the lips and the tongue to the diachronic fronting of the high back vowels in Standard Southern British English. Journal of the International Phonetic Association, 41(2): 137-156.

Hawkins, Sarah, and Jonathan Midgley 2005. Formant frequencies of RP monophthongs in four age groups of speakers. Journal of the International Phonetic Association, 35(02): 183-199. doi:10.1017/S0025100305002124

Hudson, Toby, Gea de Jong, Kirsty McDougall, Philip Harrison and Francis Nolan 2007. F0 statistics for 100 young male speakers of Standard Southern British English. Proceedings of the Sixteenth ICPhS, Saarbrucken, August 2007. www.icphs2007.de.

Johnstone, Barbara, Jennifer Andrus and Andrew. E. Danielson 2006. Mobility, indexicality, and the enregisterment of “Pittsburghese.” Journal of English Linguistics, 34(2): 77-104. doi:10.1177/0075424206290692.

Jones, Daniel 1909. The Pronunciation of English. Cambridge: Cambridge University Press.

Jones, Daniel 1917. An English Pronouncing Dictionary. London: Dent.

Kerswill, Paul 2007. RP, standard English and the standard/non-standard relationship. In: David Britain (ed.) Language in the British Isles, second edition. Cambridge: Cambridge University Press, pp. 34-51.

Kiesling, Scott R 2009. Style as stance: stance as the explanation for patterns of sociolinguistic variation. In: A.?? Jaffe (ed) Stance: Sociolinguistic Perspectives. Oxford: Oxford University Press.

Labov, William 2001. Principles of Linguistic Change: External Factors. Oxford: Blackwell.

Leitner, Gerhard 1982. The consolidation of 'Educated Southern English' as a model in the early twentieth century. IRAL Vol XX/2: 91-107.



Levon, Erez and Sophie Holmes-Elliott 2013. East End boys and West End girls: /s/-fronting in Southeast England, University of Pennsylvania Working Papers in Linguistics: Vol. 19.2, Article 13. Available at: http://repository.upenn.edu/pwpl/vol19/iss2/13

Mugglestone, Lynda 2007. Talking Proper: the Rise of Accent as Social Symbol. Oxford: Oxford University Press.

Preston, Dennis 2013. Linguistic insecurity forty years later. Journal of English Linguistics. 41(4): 304-331.

Stuart-Smith, Jane, Eleanor Lawson and James M. Scobbie 2014. Derhoticisation in Scottish English: a sociophonetic journey. In: Chiara Celata and Silvia Calamai (eds) Advances in Sociophonetics. Amsterdam: John Benjamins.

Trudgill, Peter 2002. Sociolinguistic Variation and Change. Edinburgh: Edinburgh University Press.

Wells, J. C. 1982. Accents of English. 3 vols. Cambridge: Cambridge University Press.

Wells, J. C. 1990. Longman Pronunciation Dictionary. London: Longman.

Wells, J. C. 1997. Whatever happened to Received Pronunciation? In: Carmelo Medina Casado and Concepción Soto Palomo (eds) II Jornadas de Estudios Ingleses [second Conference of English Studies], Universidad de Jaén, Spain, pp. 19-28.

Wells, J. C. 2000. Longman Pronunciation Dictionary. Second edition. London: Longman.

Wells, J. C. 2008. Longman Pronunciation Dictionary. Third edition. London: Pearson Longman.

Appendix. Social backgrounds of speakers in the corpus

ID code  Speaker

Background details; ancestry, education. Excerpted from Wikipedia.

OF1 Baroness Asquith

Helen Violet Bonham Carter, Baroness Asquith of Yarnbury, DBE (15 April 1887 – 19 February 1969) was a British politician and diarist. She was the daughter of H. H. Asquith, Prime Minister from 1908–1916, and later became active in Liberal politics herself, being a leading opponent of appeasement, standing for Parliament and being made a life peer. http://en.wikipedia.org/wiki/Violet_Bonham_Carter

OF2 Baroness Stocks

Mary Danvers Stocks, Baroness Stocks (25 July 1891 – 6 July 1975) née Brinton, was a British writer. She was the daughter of a London doctor. … Her family was deeply involved in changes in the Victorian Era and Stocks herself was deeply involved in women's suffrage, the welfare state, and other aspects of social work. She attended St. Paul's Girls School and earned a degree in Economics in 1913 from the London School of Economics (LSE). http://en.wikipedia.org/wiki/Mary_Stocks,_Baroness_Stocks

OF3 Agatha Christie

Dame Agatha Mary Clarissa Christie, DBE (born Miller; 15 September 1890 – 12 January 1976) was an English crime writer of novels, short stories, and plays. … Born to a wealthy upper-middle-class family in Torquay, Devon, Christie served in a hospital during the First World War, before marrying and starting a family in London. … (I)n 1920, The Bodley Head press published her novel The Mysterious Affair at Styles, featuring the character of Poirot. This launched her literary career. http://en.wikipedia.org/wiki/Agatha_Christie

OF4 Bridget Monckton, eleventh Lady Ruthven of Freeland

Bridget Helen "Biddy" Monckton, eleventh Lady Ruthven of Freeland CBE (27 July 1896–17 April 1982), also known as The Countess of Carlisle between 1918 and 1947, as Lady Monckton between 1947 and 1957, as The Viscountess Monckton of Brenchley between 1957 and 1965 and as The Dowager Viscountess Monckton of Brenchley between 1965 and 1982, was a British peeress and Conservative member of the House of Lords. http://en.wikipedia.org/wiki/Bridget_Monckton,_eleventh_Lady_Ruthven_of_Freeland

OM1 Baron Dowding

Air Chief Marshal Hugh Caswall Tremenheere Dowding, first Baron Dowding GCB, GCVO, CMG (24 April 1882 – 15 February 1970) was a British officer in the Royal Air Force. He was the commander of RAF Fighter Command during the Battle of Britain….Hugh Dowding received his early education at St. Ninian's Boys' Preparatory School … Dowding was educated at Winchester College in England on a scholarship before joining the Royal Military Academy, Woolwich. He later served abroad in the Royal Garrison Artillery. http://en.wikipedia.org/wiki/Hugh_Dowding,_first_Baron_Dowding

OM2 Viscount Alanbrooke

Alan Brooke was born in 1883 at Bagnères-de-Bigorre, Hautes-Pyrénées, to a prominent Anglo-Irish family from West Ulster with a long military tradition. He was the seventh and youngest child of Sir Victor Brooke, third Baronet, of Colebrooke, Brookeborough, County Fermanagh, Ireland, and the former Alice Bellingham, second daughter of Sir Alan Bellingham, third Baronet, of Castle Bellingham in County Louth. Brooke was educated in Pau, France, where he lived until the age of 16. After graduation from the Royal Military Academy at Woolwich, Brooke was, on 24 December 1902, commissioned into the Royal Regiment of Artillery as a Second Lieutenant. http://en.wikipedia.org/wiki/Alan_Brooke,_first_Viscount_Alanbrooke

OM3 A.P. Herbert

Sir Alan Patrick Herbert CH (usually writing as A. P. Herbert or A. P. H.; 24 September 1890 – 11 November 1971) was an English humorist, novelist, playwright and law reform activist. He was an independent Member of Parliament (MP) for Oxford University for 15 years. … He was educated at Winchester College and New College, Oxford. http://en.wikipedia.org/wiki/A._P._Herbert

OM4 E.F.L.Wood, first Earl of Halifax

Edward Frederick Lindley Wood, first Earl of Halifax, KG OM GCSI GCMG GCIE TD PC (16 April 1881 – 23 December 1959), … was one of the most senior British Conservative politicians of the 1930s, during which he held several senior ministerial posts, most notably as Foreign Secretary from 1938 to 1940. During the war, he served as British Ambassador in Washington. He was son of the second Viscount Halifax. He was educated at Eton and Christ Church, Oxford, becoming a Fellow of All Souls College, Oxford, was Member of Parliament for Ripon from 1910 to 1925 when he was elevated to the peerage. http://en.wikipedia.org/wiki/E._F._L._Wood,_first_Earl_of_Halifax

YF1 Doris Langley-Moore

Doris Langley Moore … was one of the first important female fashion historians. She founded the Fashion Museum, Bath (as The Museum of Costume) in 1963. She was also a well-respected Lord Byron scholar, and author of a 1940s ballet, The Quest. Doris Langley Moore was born in 1902 in Lancashire, England. She was educated in South Africa, where her father was a newspaper editor. At the age of 18, she returned to England to study classical languages at university. http://en.wikipedia.org/wiki/Doris_Langley_Moore

YF2 Lady Alexandra Naldera Curzon

Lady Alexandra Naldera Curzon, CBE (20 March/April 1904 – 7 August 1995) was the third daughter of George Curzon, first Marquess Curzon of Kedleston and Viceroy of India, and Lord Curzon's first wife, the American mercantile heiress, formerly Mary Victoria Leiter, Mary Victoria Curzon, Baroness Curzon of Kedleston. http://en.wikipedia.org/wiki/Lady_Alexandra_Curzon

YF3 Dame Daphne du Maurier, Lady Browning DBE

Daphne du Maurier was born in London, the second of three daughters of the prominent actor-manager Sir Gerald du Maurier and actress Muriel Beaumont. … Her first novel, The Loving Spirit, was published in 1931. http://en.wikipedia.org/wiki/Daphne_du_Maurier

YM1 Lord Cromer, GRS Baring

Lieutenant-Colonel (George) Rowland Stanley Baring, third Earl of Cromer, KG GCMG MBE PC (28 July 1918 – 16 March 1991), styled Viscount Errington before 1953, was a British banker and diplomat. After serving during World War II, he was Governor of the Bank of England (1961–1966) and British Ambassador to the United States (1971–1974). The eldest son of the second Earl of Cromer and his wife Ruby Elliot-Murray-Kynynmound, he was educated at Eton and Trinity College, Cambridge, where he left after a year. http://en.wikipedia.org/wiki/Rowland_Baring,_third_Earl_of_Cromer

YM2 Cecil Day Lewis

Day Lewis was born in Ballintubbert, Athy/Stradbally border, Queen's County (now known as County Laois), Ireland. He was the son of the Reverend Frank Cecil Day Lewis (died 29 July 1937) and Kathleen Blake (née Squires; died 1906). After the death of his mother in 1906, Cecil Day Lewis was brought up in London by his father, … He was educated at Oundle School and at Wadham College, Oxford. http://en.wikipedia.org/wiki/Cecil_Day Lewis

YM3 John Gielgud

Sir Arthur John Gielgud, OM, CH (14 April 1904 – 21 May 2000) was an English actor, director, and producer. John Gielgud was born in South Kensington in London to Kate Terry-Lewis and Frank Gielgud. … Gielgud's Catholic father, Franciszek Giełgud, born in 1880, was a descendant of a Polish noble family of Lithuanian origin… After Hillside Preparatory School in Godalming, Surrey he won a scholarship to Westminster School where he started his public school life in September 1917 as a weekly boarder, and where he was elected a non-resident Kings Scholar in 1918. http://en.wikipedia.org/wiki/John_Gielgud

Hickey Twentieth-century Received Pronunciation: Stop articulation --- Page 58 of 525

4 Twentieth-century Received Pronunciation: Stop articulation

Raymond Hickey

1 Introduction

A number of early audio recordings of speakers of Received Pronunciation (RP) at the beginning of the twentieth century are available. These recordings reveal forms of RP which differ from those of today. One of the chief differences between RP then and now¹ involves voiceless stops: they show considerably less aspiration in the early recordings than they do at present, and it is this issue which is examined in the current chapter. This study exemplifies the kinds of insight which can be gained from examining early audio recordings and is similar in approach to the other chapters of this book. It also shows (in section 4) how change can be traced from the beginnings of audio recording down to the present day.

The earliest recordings of RP are of members of the royal family and some prominent cultural and literary figures in the years between the two world wars. One of these is a recording of Virginia Woolf made for a BBC broadcast on 29 April 1937. In it she talks about the words of the English language and how a contemporary writer might or might not use them. It is the only recording of her voice.

1.1 Three early twentieth-century writers in England

1.1.1 Virginia Woolf (1882-1941)

Among the foremost novelists of the early twentieth century, Woolf was born in London, the daughter of the notable academic Sir Leslie Stephen (1832-1904), and was educated by her parents in Kensington. As a member of an upper middle class family in London she would have acquired an accent typical of Received Pronunciation at the end of the nineteenth century.² The recording of Virginia Woolf confirms that this was her native accent. As she spent all her life in London and its surroundings and was active in London literary circles, it is reasonable to assume that her RP accent, evident in the extant recording, is what she spoke all her life.

¹ For a full-length description of present-day RP, see Cruttenden (2014). A short overview is given in Gimson (1984) while Ramsaran (1990) discusses some features of RP which are undergoing change at present. Kerswill (2007) also contains a short section on RP, pp. 47-49, and mentions ‘Eight changes in RP’, none of which has to do with VOT values, probably because the changes in VOT had been completed by the end of the twentieth century.

² This term was not current for this type of accent at that time. The phonetician Daniel Jones (1881-1967) is credited with establishing the term in linguistics though he was not the first to use it.


Virginia Woolf was also a central figure of the Bloomsbury Group, the loose circle of English writers and intellectuals who congregated around the area of Bloomsbury in London in the first half of the twentieth century. Members of the group such as Virginia Woolf herself, the biographer Lytton Strachey and the novelist E. M. Forster were insiders in several senses. They were English born and bred and frequently graduates of Cambridge.

1.1.2 Thomas Stearns Eliot (1888-1965)

One of the major poets of twentieth-century English, Eliot was born in St. Louis, Missouri and went through his schooling there, moving to Milton, Massachusetts when 17 and then to Cambridge to study at Harvard. Eliot left the USA for Europe in 1910 at the age of 22 but then returned to Harvard, staying until 1914, when he moved to England permanently, first to Oxford and then to London, where he settled for the rest of his life, becoming a British subject in 1927 after his conversion to the Anglican church. Eliot stated later in his life that ‘[I] wanted to burn my boats and commit myself to staying in England’. Given that he spent the first 17 years of his life in Missouri there is no reason to believe that he acquired anything but the supraregional Midwest American English accent of the late nineteenth century. However, the recordings of Eliot reading his own poetry reveal that as an adult in England he had an accent clearly recognisable as early twentieth-century Received Pronunciation.

There are a number of recordings of T. S. Eliot. The most complete set, and those of best quality, are contained on the vinyl LP published by Caedmon Records in the Caedmon Literary Series. The recordings were made in September 1955 when Eliot was in his late 60s.
1.1.3 Elizabeth Bowen (1899-1973)

Born in Dublin where she spent the first eight years of her life, Bowen moved to Hythe, Kent in 1907 with her mother and continued her education in England while maintaining her connections throughout her life with Bowen’s Court in Co. Cork, the estate house which she had inherited in Ireland. Bowen is what is labelled an ‘Anglo-Irish’ writer, born in Ireland but with strong connections to England. In her writings she dealt with Anglo-Irish themes, e.g. in the novel The Last September (1929) the country estate in Co. Cork is at the centre of the action and plot.

1.2 Eliot and Bowen and the Bloomsbury Group

Both T. S. Eliot and Elizabeth Bowen were outsiders to the Bloomsbury Group. Eliot had emigrated to England and was keen on being accepted in established literary circles in England and maintained a friendship with Virginia Woolf. Bowen also associated with the group, though not as much as Eliot did. However, she entertained friendships with established members of the English upper-middle classes of the time, such as the Oxford-educated novelist Rose Macaulay.

In sociolinguistic terms both Eliot and Bowen can be regarded as upwardly mobile outsiders seeking to become members of a literary and intellectual establishment whose members were speakers of early twentieth-century RP. An essential part of this desire to be accepted in higher English society was the adoption of an accent identical with that of insiders in that society. And in the case of Eliot, casting off whatever American accent he brought with him to England led to an over-assimilation to early twentieth-century RP in one essential feature: the non-aspiration of voiceless stops.

Before proceeding to the quantitative evaluation of this feature, Table 1 shows some of the salient features of early twentieth-century RP and how these occur in the speech of Virginia Woolf, as a central figure of the Bloomsbury Group, along with T. S. Eliot and Elizabeth Bowen as outsiders, not least because one was American and the other Irish.

Table 1. Distribution of seven key phonetic features for early twentieth-century Received Pronunciation.

Speaker           Non-rhotic   STRUT low + central   GOAT central onset   TRAP raising   AI/AU smoothing   lax happY vowel   VOT for /p,t,k/
Virginia Woolf    yes          yes                   yes                  yes            yes               yes               slight
T. S. Eliot       yes          yes                   yes                  variable       yes               yes               negligible
Elizabeth Bowen   yes          yes                   yes                  yes            yes               yes               slight

The first three of these features are uncontroversial and have remained more or less as they are down to present-day RP. TRAP-raising was reversed in the second half of the twentieth century (Upton 2008: 242; Weiner and Upton 2000). AI/AU smoothing refers to the lack of an upglide for the diphthongs of the PRICE and MOUTH lexical sets (Wells 1982: 149, 151) when occurring before a tautosyllabic /r/ as in the words fire and hour respectively. There are other features which are typical of the time, such as a fairly back realisation of the GOOSE vowel and a mid open back vowel for the THOUGHT lexical set; these are uncontroversial in the present context.

2 Voice Onset Time in the literature on varieties of English

Among the features of varieties of English which vary least among native speakers is the aspiration of voiceless stops. A lack of aspiration has been noted for forms of Scottish English (Wells 1982: 74, 112), but elsewhere in Britain, in Ireland, in North America and in the Southern Hemisphere³ native speaker varieties of English show audible aspiration of voiceless stops in the onsets of stressed syllables, e.g. pan [pʰæn], two [tʰu:], keen [kʰi:n], except where any of these follows /s/, e.g. spit [spit], stick [stik], skin [skin].

In Wells’ three-volume work on accents of English there is a single sentence on stop aspiration in RP: ‘Initially in a stressed syllable, U-RP /p, t, k/ [U-RP = ‘upper-crust’ Received Pronunciation, Wells 1982: 280] often have surprisingly little aspiration’ (Wells 1982: 282). His characterisation as ‘surprising’ is appropriate: in the context of native varieties of English across the anglophone world, a noticeable delay in Voice Onset Time⁴ (henceforth: VOT, Docherty 1992: 13-14; Harrington 2010: 125-132) is normal. Although in varieties of English in Scotland voice onset is, or at least was, markedly closer to the release of voiceless stops, this has not spread south to other forms of English. Indeed the relative delay in VOT has been analysed as characterising the border between southern Scotland and Northern England, see Docherty et al. (2011).

When discussing early twentieth-century RP Daniel Jones (1964 [1918]: 153) remarks that with ‘initial voiceless plosives … breath is heard immediately after the plosion. The sounds are then said to be aspirated’ (emphasis in original). In his discussion of individual voiceless stops Jones (1964 [1918]: 138, 141, 146) mentions that they have ‘considerable aspiration’ when in the onset of a stressed syllable, without any discussion of variation among speakers of RP (like himself). In his monograph, The Pronunciation of English [1909], originally published nine years before An Outline of English Phonetics [1918], Jones remarks that ‘when k commences a strongly stressed syllable, it is somewhat “aspirated” in Southern speech. This means that there is a slight puff of breath, i.e. a slight h-sound, immediately following the plosion and preceding the vowel’ (Jones 1956 [1909]: 74). Again there is no discussion of this issue, apart from a brief remark at the end of the same paragraph that ‘[i]n the North k is often not aspirated at all’.

There are recordings of Daniel Jones reading in his own early twentieth-century RP accent of English. Here one can recognise that his voiceless stops had smaller VOT values than those an equivalent speaker of present-day RP would have. For instance, his /t/ has a VOT value of 32ms in Tempest and 26ms in two, far less than half the value found in Queen Elizabeth’s 2014 Christmas broadcast (see below). It is perhaps safe to conclude that Jones’ awareness of the low VOT values of his RP contemporaries was not great because these values were characteristic of his own speech as well. Nonetheless, it is remarkable that between 1909 and 1918, the years of publication for his two early books on English pronunciation, Jones changed his description of voiceless stops in RP from being ‘somewhat aspirated’ to having ‘considerable aspiration’.

³ The lack of aspiration in Afrikaans English is a low-level transfer phenomenon from Afrikaans by first language speakers of that language. See Thomas (2011: 117) on VOT variation among second language speakers of English.

⁴ In the current chapter all measurements of VOT are given in milliseconds, abbreviated to ‘ms’ and set in brackets, i.e. ‘(ms)’.

2.1 Handling Voice Onset Time


VOT is a temporal aspect of speech production and a typically scalar, non-discrete phenomenon. There is no absolute measurement technique for determining its values and it is not possible to normalise it across a population of speakers as the parameters which determine it are difficult to quantify. While it is true that vowel length is the parameter which varies most between fast and slow speech styles, VOT can nonetheless vary depending on at least the following parameters (Table 2).

Table 2. Parameters influencing the values of VOT for any speaker

1) Style (free speech, text reading, word list)
2) Rate of delivery of individual speaker
3) Status of words, major lexical class (nouns, verbs) and minor lexical class (prepositions, particles, modal verbs)
4) Phonetic structure of word: monosyllabic or polysyllabic; point of articulation⁵
5) Discourse features: high prominence sites, beginning of sentence, theme of utterance, conversational interaction with an interlocutor, etc.

For an investigation like the present one, the relative values for VOT are what matter. Consider that in their section on VOT Ladefoged and Johnson (2011: 153) give a value of about 55ms for VOT with English stressed initial /p/ and a value under 20ms for English /p/ after initial /s/. These relatively high values, compared to those presented in the tables here, probably stem from measurements of words spoken on their own, i.e. in word-list style, although the authors do not mention this.

The measurements presented and discussed below were made using the phonetics software Praat (version 5.1.43) to analyse segments of speech. These segments were obtained by cutting up the available sound files with the sound-processing program Audacity (version 2.1.1). Determining VOT values involves considering two key questions on which the values depend (Thomas 2011: 117).

1) Where does the release burst of a voiceless stop set in, bearing in mind that velars can have two or more bursts? Following Cho and Ladefoged (1999), measurements for the present study were made from the last burst.

2) When does glottal pulsing (voicing) set in for F1 (first formant)? There are different approaches here (Thomas loc cit.): for instance, one in which the onset of recognisable voicing for F2 is used for determining VOT values. But for many of the poor quality recordings examined for the present study F1 could be more easily recognised and so was used. This means that the VOT values may be slightly lower than other sets gained by measuring glottal pulsing for F2 (or higher formants).

⁵ This is one parameter which can be controlled: all measurements for VOT in this study are for stops immediately preceding stressed vowels, with one or two exceptions like the word echoes with Virginia Woolf.

2.2 Voice Onset Time and point of articulation

The results here correlate with the findings of other scholars, cf. Stuart-Smith et al. (2015: 233) confirming the work of Cho and Ladefoged (1999: 208), that VOT is increasingly shorter when the point of articulation is further forward in the mouth, i.e. /k/ > /t/ > /p/ is the sequence indicating an increasing reduction in VOT. Lisker and Abramson (1964: 397) also confirm this, in the case of Korean where the distinction between unaspirated and aspirated stops is phonemic. There would also seem to be a difference, in principle, between /k/ and /t/ on the one hand and /p/ on the other. A reason for this might be that it is easier to build up pressure behind the velar and alveolar stops as the muscles involved can produce greater tension than the lips in the articulation of /p/ and hence a greater burst on plosive release for /k/ and /t/ ensues. In the quantitative evaluation of aspiration with /p/ in the historic recordings examined here the burst was less evident than with /k/ and /t/ and so it was frequently less easy to measure it because of the poor quality of the recordings.

2.2.1 Voice Onset Time: phonetic context and lexical incidence

The context for VOT examined in this study is the pre-stress, pre-vocalic position of voiceless stops. There are a number of common words in English where a sonorant is found between a voiceless stop – /p/ or /k/ – in the onset of a stressed syllable and the nucleus vowel of that syllable, e.g. praise, play; create, climb. With all speakers the VOT value in this context is higher than when a vowel immediately follows. Again across all speakers, the word time seems to have the shortest VOT value of all /tV-/ syllable onsets. As this is a very common word in English it may be that the older pronunciation of the word with a low VOT value survived longest into the twentieth century.
Even George VI, who of all the English monarchs had relatively long VOT values, has the shortest for this word. T. S. Eliot has an average of 9ms for VOT across the seven instances of this word measured in his reading of The Four Quartets. If low VOT for time was salient in early twentieth-century RP then it is understandable that Eliot, in hyperadapting to this accent, had the lowest of his VOT values for /tV-/ in precisely this word.

2.3 Increase in Voice Onset Time during the last hundred years

The hypothesis to be tested here is that VOT for varieties of English which previously had low values has been on the increase throughout the twentieth century and down to the present. In their investigation of nine Aberdeen English speakers Watt and Yurkova (2007: 1522) confirm that ‘VOT for /p/ across the entire subject group as a whole is inversely correlated with speaker age, in that older speakers in this sample show a tendency to have shorter VOT for this plosive than younger ones’. In their recent study Stuart-Smith et al. (2015) conclude that aspiration is becoming more characteristic of Scottish English in general, i.e. the delay in VOT is on the increase.⁶

For the following tables four measurements each for /p/, /t/ and /k/ were aimed at. The items are ordered in ascending length of VOT. The values for /p/ show that all speakers had very little aspiration, i.e. a negligible VOT.

2.4 Voice Onset Time in the recording of Virginia Woolf

At first hearing the speech of Virginia Woolf conveys an impression of being articulated in a closed manner towards the front of the mouth. The word which occurs repeatedly in her recording is words, which she pronounces as [wø-:dz] rather than [wɜ:dz]. She also has considerable TRAP-raising but little GOOSE-fronting and her THOUGHT vowel is not as closed as it would be in present-day RP. She also has a slightly rolled /r/⁷ in positions of discourse focus, e.g. ...she has gone aroving [əˈrəuvin].

Table 3. VOT values for Virginia Woolf (recording c 1937)

/p/        VOT (ms)   /t/        VOT (ms)   /k/           VOT (ms)
properly   7          teach      26         because       25
appear     7          tell       28         cannot        30
part       9          teaching   41         incarnadine   53
passing    10         taught     44         create        74
average    8          average    35         average       43
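
Each value in a table such as Table 3 rests on the measurement convention described in section 2.1: VOT is taken from the last release burst to the onset of voicing. A minimal Python sketch of that arithmetic follows. The function name and the burst and voicing times are hypothetical, chosen only for illustration; the chapter's own measurements were made in Praat and no raw time stamps are given.

```python
def vot_ms(burst_times_s, voicing_onset_s):
    """Return VOT in whole milliseconds.

    Assumption for this sketch: all times are in seconds from the start
    of the sound file, and VOT is measured from the LAST release burst
    that precedes the voicing onset (the convention the chapter adopts
    from Cho and Ladefoged 1999).
    """
    # Keep only bursts that occur before voicing begins.
    bursts = [t for t in burst_times_s if t <= voicing_onset_s]
    if not bursts:
        raise ValueError("no release burst precedes the voicing onset")
    return round((voicing_onset_s - max(bursts)) * 1000)


# Hypothetical velar stop with two bursts, voicing visible in F1 at 1.242 s.
print(vot_ms([1.204, 1.212], 1.242))
```

Measuring from the last rather than the first burst gives the shorter of the possible values for a multi-burst velar, which is one reason why figures obtained under different conventions are best compared only in relative terms.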

Some of the voiceless stops in the Woolf recording have VOT values (Table 3) which are similar to today’s values, e.g. the word echoes (admittedly in post-stress position) has a value over 50 ms, which shows that Woolf had a variable use of aspiration, a situation which linguistically would point to a change from earlier categorical non-aspiration to categorical aspiration of voiceless stops in RP today.

2.5 Voice Onset Time in the recordings of T. S. Eliot

There are recordings of Eliot reading all his major poems such as The Love Song of J. Alfred Prufrock, The Waste Land, Ash Wednesday and The Four Quartets. For the VOT measurements (see Table 4) these recordings were used. This means, of course, that for Eliot, as opposed to Woolf and Bowen, one is dealing with recitals of poetry rather than stretches of speech where the author is explaining something.

⁶ Such changes in VOT are found in other languages as well. Thomas (2011: 119) mentions the case of Japanese, studied by Takada and Tomimori (2006), where VOT values are increasing in parts of contemporary Japan.

⁷ See the discussion of this feature in Fabricius (this volume).


Table 4. VOT values for T. S. Eliot (recording 1955)

/p/        VOT (ms)    /t/          VOT (ms)    /k/        VOT (ms)
paper      5           time (7)⁸    9           comic      12
park       8           tea          11          cups       14
peace      9           take         11          can        15
pen        9           talk         11          candle     17
average    8           average      10          average    15

In the case of Eliot, the lack of aspiration with voiceless stops, which he hyperadapted to on his arrival in England in the early twentieth century, was idiosyncratic and obviously of no further consequence for RP. Lack of aspiration would seem not to have been a feature which Eliot brought with him to England from America. For instance, Robert Frost (1874-1963), a slightly older American contemporary of Eliot, in his 1948 recording of Stopping by Woods on a Snowy Evening has clear aspiration of voiceless stops – for example, the /k/ in kept has a VOT value of 68ms – equivalent to modern American English usage (although his speech is non-rhotic as would have been typical of late nineteenth-century New England where he moved when he was 11). Eliot’s negligible VOT remained a feature of his speech which became increasingly old-fashioned as RP speakers gradually increased VOT for voiceless stops. There was secondary gain here for the poet, though probably unconscious: Eliot’s voice was clearly recognisable and his readings of his own works remained popular and were regarded as uniquely characteristic of the man himself.

2.6 Voice Onset Time in the recordings of Elizabeth Bowen

The recording of Elizabeth Bowen, in which she discusses various techniques in novel writing, was broadcast by the BBC on 3 October 1956. Her speech is broadly early twentieth-century RP which contains features like a rolled /r/ for focussed elements in a discourse, similar to that found with Virginia Woolf, but more pronounced. She also had [ʍ] for wh- as in when [ʍen] which was already a recessive RP feature in the early twentieth century (Cruttenden 2014: 233-234; Upton 2008: 250). However, this sound would have been well represented in the Irish surroundings in which she spent a considerable amount of her time, especially in her childhood.

Table 5. VOT values for Elizabeth Bowen (recording c. 1956)

/p/         VOT (ms)    /t/            VOT (ms)    /k/           VOT (ms)
perhaps     10          at∩all         11          characters    22
people      11          temperament    21          carry         25
possible    12          towards        23          because       29
person      19          take           33          kind          62
average     13          average        22          average       37

8 In these tables a digit in brackets indicates the number of tokens measured for this word. The VOT value is then the average across these measurements.

2.7 Hyperadaption and VOT

T. S. Eliot did not participate in the variability of VOT with /k/ as did Elizabeth Bowen (see Table 5). This is typical of adult hyperadaption which does not evince the nuances of target distributions which children show when internalising a feature.

Table 6. Comparative ranges of VOT for /k/ with Virginia Woolf, T. S. Eliot and Elizabeth Bowen

         lowest/highest value    range
Woolf    25ms - 74ms             49ms
Bowen    22ms - 62ms             40ms
Eliot    12ms - 17ms             5ms
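The ranges in Table 6 follow directly from the /k/ columns of Tables 3, 4 and 5. As a minimal sketch in Python, the arithmetic can be reproduced as below. The names k_vot and vot_range are illustrative only and are not part of the original study; the values are simply copied from the tables.

```python
# /k/ VOT values (ms) copied from Tables 3 (Woolf), 5 (Bowen) and 4 (Eliot).
# The dictionary and function names are hypothetical, chosen for illustration.
k_vot = {
    "Woolf": [25, 30, 53, 74],
    "Bowen": [22, 25, 29, 62],
    "Eliot": [12, 14, 15, 17],
}


def vot_range(values):
    """Return (lowest, highest, highest - lowest) for a list of VOT values in ms."""
    lowest, highest = min(values), max(values)
    return lowest, highest, highest - lowest


ranges = {speaker: vot_range(values) for speaker, values in k_vot.items()}

for speaker, (lowest, highest, span) in ranges.items():
    print(f"{speaker}: {lowest}ms - {highest}ms, range {span}ms")
```

The narrow span for Eliot compared with Woolf and Bowen is the point made in the text about adult hyperadaption.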

Figure 1. Average VOT values for T. S. Eliot, Elizabeth Bowen and Virginia Woolf

Eliot is the most consistent in the use of very short VOT values (see Table 6 and Figure 1). For instance, in the first five lines of Burnt Norton from the Four Quartets there are seven occurrences of the word time and the VOT average for these is 9ms (see Table 4). Elizabeth Bowen has the word time twice in her recording with an average VOT of 24ms which is low by present-day standards but still nearly three times as long as Eliot’s average value.

2.7.1 Post-stress voiceless stops

The measurements of VOT discussed so far have been of voiceless stops in immediately pre-stress position. But these stops also occur in post-stress position, especially /k/ both as a final, released consonant as in pick, take, back and in a pre-vocalic context, e.g. echo (in various forms). With Virginia Woolf the /-k-/ in the word echoes has a VOT value of 64ms. Again in Eliot’s Burnt Norton, the word echo(es) occurs twice with an average VOT value of 39ms. So even in this phonotactic position, which favours longer VOT values, Eliot is lagging behind Virginia Woolf, his anchor within the RP-speaking Bloomsbury group.

3 Language change over time: the speech of English monarchs

The discussion so far has concerned the speech of three writers with a view to determining how two of them – outsiders – adapted to contemporary speech norms of a country – England – in which they were not born. The assumption was also mentioned that in the course of the twentieth century longer VOT values became typical of RP in England. For the remainder of this study the validity of this assumption is to be scrutinised, this time by examining the speech of individuals – various English monarchs – who were the most established members of the country they ruled over. The speech of English monarchs has not gone unnoticed by linguists: the broadcasts of Elizabeth II have been examined by Jonathan Harrington and his colleagues and the results have been published as Harrington, Palethorpe and Watson (2000), which looked at the Queen’s realisation of monophthongs, and Harrington (2006), which is an examination of happY tensing (Wells 1982: 257-258) in the Christmas speeches delivered by the monarch over several decades. In the following section the VOT values for a range of English monarchs are presented.
The first of these is George V (1865-1936) for whom recordings are available, namely Empire Day and Christmas Day messages, beginning in 1923 and continuing until shortly before his death. The quality of these is poor with low frequency hiss present throughout the recordings. This makes it difficult to establish where a stop is in the signal and hence where the burst of its release begins. Nonetheless, it has been possible with signal noise reduction and amplification to determine, admittedly with an undesirable margin of error, what the VOT values were in his speech (Table 7).

Table 7. VOT values for King George V (1910-1936)

/p/         VOT (ms)    /t/        VOT (ms)    /k/           VOT (ms)
past        16          time       19          counts        38
peoples     19          ties       22          carry         54
personal    18          take       26          confidence    59
average     18          average    22          average       50

The second monarch considered here is Edward VIII whose reign lasted less than a year before his younger brother George VI ascended the throne after Edward’s abdication on 11 December 1936. The recording used for VOT evaluation (see Table 8) is his abdication speech.

Table 8. VOT values for King Edward VIII (1936)

/p/           VOT (ms)    /t/         VOT (ms)    /k/        VOT (ms)
people        14          take        20          country    44
possible      21          time        21          king       54
parliament    23          entirely    28          carry      58
average       19          average     23          average    52
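The bottom row of each table is the arithmetic mean of the measured tokens, rounded to the nearest millisecond. The following is a minimal sketch in Python of that calculation for Table 8. The variable names are hypothetical, and the rounding convention is an assumption inferred from the printed averages rather than something stated in the text.

```python
# Token VOT values (ms) for Edward VIII, copied from Table 8.
# Names are illustrative; rounding to whole milliseconds is assumed.
edward_viii = {
    "p": {"people": 14, "possible": 21, "parliament": 23},
    "t": {"take": 20, "time": 21, "entirely": 28},
    "k": {"country": 44, "king": 54, "carry": 58},
}

# Mean VOT per stop, rounded to the nearest millisecond.
averages = {
    stop: round(sum(tokens.values()) / len(tokens))
    for stop, tokens in edward_viii.items()
}

for stop, value in averages.items():
    print(f"/{stop}/ average VOT: {value}ms")
```

The same pattern applies to the other tables once their token values are entered.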

George VI reigned from 1936 to 1952. He was a private figure who shied away from publicity, and there are not many recordings of his voice, the 1939 speech to the nation about the impending war with Germany being the most famous. For the measurements in Table 9 his opening speech at the Empire Exhibition in Glasgow in 1938 and the speech on victory in Europe on 8 May 1945 were consulted.

Table 9. VOT values for King George VI (1936-1952)

/p/         VOT (ms)    /t/        VOT (ms)    /k/          VOT (ms)
past        31          taking     29          king         64
possible    39          time       31          common       67
people      45          task       41          execution    73
average     36          average    33          average      68

In 1953 the current incumbent on the English throne was crowned Elizabeth II. Beginning in this year, she has regularly broadcast a Christmas message in which she explicitly addresses the English people and those in the countries of the Commonwealth. The first of these messages was broadcast in 1953 from Auckland, New Zealand and was used for the current study. Two further broadcasts were investigated in order to provide a longitudinal comparison of VOT for the current monarch: 1984 and 2014. One very early recording, broadcast from Cape Town in 1947 on the occasion of the 21st birthday of Elizabeth Windsor (the later Queen Elizabeth), was also analysed (see Table 10).

Table 10. VOT values for broadcast by Elizabeth Windsor on the occasion of her 21st birthday from Cape Town, South Africa in 1947

/p/        VOT (ms)    /t/        VOT (ms)    /k/             VOT (ms)
people     18          taken      36          kind            52
parents    23          time       39          coming          65
                       town       41          Commonwealth    74
average    21          average    39          average         64

Table 11. VOT values for Queen Elizabeth II (1953- ) in Christmas broadcast from 1953

/p/        VOT (ms)    /t/         VOT (ms)    /k/          VOT (ms)
perhaps    12          Tonga       48          of course    67
paid       24          time (2)    45          comes        78
people     23          too         52          kindness     83
average    20          average     48          average      76

In this early period Queen Elizabeth still shows much variation in VOT values (see Table 11). For instance, in the 1957 broadcast the word television occurs (stressed on the first syllable) with a VOT value of just 27ms.

Table 12. VOT values for Queen Elizabeth II (1953- ) in Christmas broadcast from 1984

/p/        VOT (ms)    /t/          VOT (ms)    /k/            VOT (ms)
partly     25          tend         51          conflict       52
peace      34          retain       52          occasion       53
parents    43          tolerance    79          comradeship    57
                                                can            63
average    34          average      61          average        56

In the broadcast used for Table 12 the Queen, at 58, had a quicker rate of delivery than as a young woman in the late 1940s and early 1950s and than as an 88-year-old woman in 2014 (see Table 13 and Figures 2, 3 and 4). This might account for the VOT values for /k/ which are lower than in the first two recordings. It is also noticeable that VOT values for /t/ when followed immediately by /r/ are considerably longer than others because the [tr-] sequence is really an affricate, cf. trust [trʌst] which had a VOT value of 91ms.

Table 13. VOT values for Queen Elizabeth II (1953- ) in Christmas broadcast from 2014

/p/           VOT (ms)    /t/        VOT (ms)    /k/             VOT (ms)
poppies       42          taken      60          Coventry        53
people (2)    43          time       73          casts           54
pioneered     65          tour       86          Commonwealth    86
                          talent     90
average       50          average    77          average         64

Table 14. Average VOT (ms) values for Christmas broadcasts by Queen Elizabeth II from 1953-2014 and 21st birthday broadcast in 1947

                             /p/    /t/    /k/
1947                         21     39     64
1953                         20     48     76
1984                         34     61     56
2014                         50     77     64
difference from 1947-2014    29     38     0
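The final row of Table 14 is the 2014 average minus the 1947 average for each stop. A minimal sketch in Python of that subtraction follows, with the averages copied from Table 14. The names averages_by_year and difference are illustrative assumptions, not the author's.

```python
# Average VOT values (ms) per broadcast year, copied from Table 14.
# Variable names are hypothetical and used only for this illustration.
averages_by_year = {
    1947: {"p": 21, "t": 39, "k": 64},
    1953: {"p": 20, "t": 48, "k": 76},
    1984: {"p": 34, "t": 61, "k": 56},
    2014: {"p": 50, "t": 77, "k": 64},
}

# Change in average VOT between the earliest and the latest recording.
difference = {
    stop: averages_by_year[2014][stop] - averages_by_year[1947][stop]
    for stop in ("p", "t", "k")
}

for stop, change in difference.items():
    print(f"/{stop}/: {change:+d}ms between 1947 and 2014")
```

The result shows the rise concentrated in /p/ and /t/, with /k/ ending where it began.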

Figure 2. Development of VOT (ms) values with /p/ (1947-2014)

Figure 3. Development of VOT (ms) values with /t/ (1947-2014)


Figure 4. Development of VOT (ms) values with /k/ (1947-2014)

3.1 Assessing the speech of the monarchs

For a span of over 90 years there are recordings of the English monarchs in which a development in the values for VOT can be recognised. This development is not a straight line from the earliest to the latest recordings as can be seen from Figure 5.

Figure 5. Average VOT (ms) values for four English monarchs from 1923 to 2014


George VI was the most ‘modern’ with relatively long VOT values overall, longer than those of his elder brother Edward VIII. One of the reasons for this might lie in George VI’s stammer which could have caused him to pronounce words with more force when he did manage to speak them, this leading to a slight lengthening of VOT. Between George V and Elizabeth II one can recognise an increase in VOT for /p/ of just over a third. But it is with /t/ that the clearest rise in VOT is to be seen. The word talent in the 2014 Christmas broadcast has a VOT of 90ms which is 64ms more than the highest VOT for any /t/-initial word in the Empire Day broadcast of her grandfather George V (take, 26ms; see Table 7). Furthermore, in Queen Elizabeth’s 2014 Christmas broadcast the keyword time has a VOT value of 73ms which is two to three times the length of the VOT values of all the others in this study, both her relatives and the authors examined.

4 Conclusion

We cannot say for certain whether late nineteenth-century speakers of RP showed the lack of voiceless stop aspiration which can be recognised in the recordings of T. S. Eliot as this could be the result of hyperadaption on his part. But recordings such as that by Virginia Woolf suggest that there was a generation at the beginning of the twentieth century for whom VOT values were on the increase starting at the velar point of articulation and proceeding forwards. Support for this interpretation is found by comparing the VOT values for /p/ and /t/ in the older and more recent recordings of Queen Elizabeth II. In the course of her lifetime the degree of aspiration for these two stops increased, pointing to the completion of a change in RP which probably began already in the late nineteenth century: the increase in aspiration for voiceless stops. This view is furthermore supported by the investigations of VOT values for Scottish English (Stuart-Smith et al. 2015; Watt and Yurkova 2007) which is moving towards the universal pattern for all native speaker varieties of English, namely significant aspiration for the whole voiceless stop series.

References

Cho, Taehong and Peter Ladefoged 1999. Variation and universals in VOT: evidence from 18 languages, Journal of Phonetics 27: 207-229.

Cruttenden, Alan 2014. Gimson’s Pronunciation of English. Eighth edition. London: Arnold.

Docherty, Gerard 1992. The Timing of Voicing in British English Obstruents. Berlin: Foris.

Docherty, Gerard 2010. Phonological innovation in contemporary spoken British English, In: Andy Kirkpatrick (ed.) The Routledge Handbook of World Englishes. London: Routledge, pp. 59-75.

Docherty, Gerard, Dominic Watt, Carmen Llamas, Damien Hall and Jennifer Nycz 2011. Variation in Voice Onset Time along the Scottish-English border, Proceedings of the Seventeenth International Congress of Phonetic Sciences, Hong Kong, August 2011, 591-594.

Gimson, A. C. 1984. The RP accent, In: Peter Trudgill (ed.) Language in the British Isles. Cambridge: Cambridge University Press, pp. 32-44.

Harrington, Jonathan, Sallyanne Palethorpe and Catherine Watson 2000. Monophthongal vowel changes in Received Pronunciation: an acoustic analysis of the Queen’s Christmas Broadcasts. Journal of the International Phonetic Association, 63-78.

Harrington, Jonathan 2006. An acoustic analysis of ‘happy-tensing’ in the Queen’s Christmas broadcasts, Journal of Phonetics 34: 439-457.

Harrington, Jonathan 2010. Phonetic Analysis of Speech Corpora. Malden, MA: Wiley-Blackwell.

Henton, Caroline 1983. Changes in the vowels of Received Pronunciation, Journal of Phonetics 11: 353-371.

Jones, Daniel 1964 [1918]. An Outline of English Phonetics. Ninth edition. Cambridge: W. Heffer.

Jones, Daniel 1956 [1909]. The Pronunciation of English. Fourth edition. Cambridge: Cambridge University Press.

Kerswill, Paul 2007. Standard and non-standard English, In: David Britain (ed.) Language in the British Isles, pp. 34-51

Ladefoged, Peter and Keith Johnson 2011. A Course in Phonetics. Sixth edition. Boston: Wadsworth.

Lisker, Leigh and Arthur S. Abramson 1964. A cross-language study of voicing in initial stop: Acoustical measurements, Word 20: 384-422.

Mahrenholz, Jürgen-Kornelius 2008. Ethnographic audio recordings in German prisoner of war camps during the First World War, In: Dominiek Dendooven and Piet Chielens (eds) World War I: Five Continents in Flanders. Ypres: Editions Lannoo, pp. 161-167.

Ramsaran, Susan 1990. RP fact and fiction, In: Susan Ramsaran (ed.) Studies in the Pronunciation of English: A Commemorative Volume in Honour of A. C. Gimson. London: Routledge, pp. 178-190.

Stuart-Smith, Jane, Tamara Rathcke, Morgan Sonderegger and Rachel Macdonald 2015. A real-time study of plosives in Glaswegian using an automatic measurement algorithm: change or age-grading?, In: Eivind Nessa Torgersen, Stian Hårstad, Brit Mæhlum and Unn Røyneland (eds) Language Variation – European Perspectives V. Amsterdam: John Benjamins, pp. 225-238.

Takada, Mieko and Nobuo Tomimori 2006. The relationship between VOT in initial voiced plosives and the phenomenon of word-medial plosives in Nigara and Shikoku, In: Yuji Kawaguchi, Susumu Zaima and Toshihiro Takagaki (eds) Spoken Language Corpus and Linguistic Informatics. Amsterdam: John Benjamins, pp. 365-379.

Thomas, Erik 2011. Sociophonetics. An Introduction. Basingstoke: Palgrave Macmillan.

Upton, Clive 2008. Received Pronunciation, In: Bernd Kortmann and Clive Upton (eds) Varieties of English 1: The British Isles. Berlin: Mouton de Gruyter, pp. 237-252.

Watt, Dominic and Jillian Yurkova 2007. Voice Onset Time and the Scottish Vowel Length Rule in Aberdeen English, Proceedings of the Sixteenth International Congress of Phonetic Sciences, pp. 1521-1524.

Weiner, Edmund and Clive Upton 2000. [hat], [hæt], and all that, English Today 61: 44-46.


5 London’s Cockney in the twentieth century

Stability or cycles of contact-driven change?

Paul Kerswill and Eivind Torgersen

1 Introduction: migration and linguistic change in London over six centuries

Recent press reports talk about a new, mixed, multicultural dialect in London’s traditional East End, apparently displacing traditional Cockney, which ends up being pushed to the edges of the city and beyond (Kerswill 2014). The press have labelled this ‘Jafaican’, while academics give it the name Multicultural London English (MLE), seeing it as one of a number of North-west European multiethnolects currently emerging in cities which have seen intense immigration in the past 30 years (Cheshire et al. 2011; Kerswill et al. 2013). We have argued that this variety, which is characterised by phonetic, morphosyntactic and discourse features, has its origins in the early 1980s, a direct result of the mixing of languages followed by generational shift to English in areas of London which have seen particularly high immigration. Whether MLE has ‘displaced’ Cockney is a moot point. First, it is actually hard to talk of it as a ‘variety’, since it contains a broad range of variation. Second, it forms a continuum with more traditional varieties of working-class speech in London which might come under the ‘Cockney’ umbrella, as well as with other sociolects in the city, including varieties close to Received Pronunciation. And third, we have argued that it contains characteristics of a Labovian vernacular, habitually spoken by a demographically defined set of speakers, while it is also a youth style containing highly salient slang items which are adopted by a broader range of speakers than its core group (Green 2014). In this chapter we ask two questions: is there any earlier evidence that migrants have influenced London’s dialect in the period for which we have recorded evidence? Relatedly, to what extent is it possible to identify any precursors of MLE? London has long received significant populations from elsewhere in the country as well as overseas.
It is well known that the pronominal forms they, them and their arrived in London from Northern England during the Middle English period (Baugh and Cable 1993: 156), to be followed, again from the north, by third person singular present-tense –s in the mid-sixteenth century (Nevalainen and Raumolin-Brunberg 2000: 305). Both these changes are thought to be the result of direct migration from the North to London. They were successful not because of force of numbers, but because of the relative wealth and status of the people who migrated. That said, numbers are important: we must assume that these early migrants were able to predominate among the circles of the small but influential merchant class. In this chapter, we will look at another, later, migrant group joining a similarly quite circumscribed network in London, the Yiddish-speaking Jews who settled in a very compact part of the East End in the 1880s. Although the evidence for it is anecdotal and often only literary, we will consider what local linguistic influence they might have had on the working-class dialect of the East End, even though they were a small minority across the city. We will contrast their situation with today’s linguistic conditions in the same part of London, where the proportion of immigrants is much higher and the number of linguistic groups far greater.

2 Population tipping points and the founder principle: the Jews of the East End

Surprisingly, since the sixteenth century there is little indication that migration has led to changes in London English – at least not to changes which are observable because they have survived later levelling. In any case, in the nineteenth century most migrants were from the southern half of England, with far fewer from other places, including overseas, with the result that the varieties in contact were relatively similar and any changes resulting from the contact therefore difficult to detect. In the Victorian era, of the non-British groups in London, the Irish were by far the largest and ‘most conspicuous’ (Inwood 1998: 413), there being 109,000 Irish-born residents according to the 1851 census. Despite the relatively large numbers of Irish people, there are no claims in the literature that either Irish English or the Irish language had any influence on London English (cf. Wells 1982: 301–334). Does this apparent lack of influence reflect a general pattern? However, later migration-induced contact did, it seems likely, affect other dialects in England, and it is instructive to examine a particular case. According to Trudgill (1998), the typologically ‘unnatural’ third-person singular –s agreement was lost in Norwich English following the immigration of large numbers of Dutch and French speakers from the Low Countries in the years after 1567. By 1579, 37 per cent of the city’s population was composed of Dutch and French speakers. The resulting contact between a proportionally large number of second-language speakers of English and the native population led to the simplification of the paradigm through the introduction of a zero variant, almost certainly aided (Trudgill argues) by the fact that, at that time, there were actually two endings in competition, –eth and the newer (northern) –s.
Trudgill suggests that, at a critical time, the three variants (zero, –eth and –s) were numerically balanced, leaving the way open for one to win out – in this case the simplest, zero. The sociolinguistic situation was presumably one of fairly intense contact between the non-native speaking incomers and the existing population, leading to the non-acquisition of the third-person ending by the children of both groups. We will be arguing (following Trudgill 1998, 2004) that the relative frequencies of individual variants and of language varieties, as determined by population sizes, are an important predictor of linguistic outcomes of contact. Of all British cities, London has probably seen the greatest inflow of people over the longest period – so the apparent lack of linguistic influences despite intense dialect and language contact is surprising. The question therefore arises: could it be the case that, despite the size of the migrant populations there, the proportions of immigrants to ‘natives’ were never large enough to lead to change? In the nineteenth century, we discover that, according to the 1851 census, 38 per cent of London’s population was composed of British and overseas migrants, reducing slightly to 34 per cent by 1891 (Inwood 1998: 412). This, of course, is practically the same proportion as that which obtained in Norwich 270 years previously; the apparent absence of any effect of this migration could be put down to the fact that the immigrants were heterogeneous and were spread unevenly throughout the city. Later in the century, the proportion of migrants dropped markedly, while the city’s population rose from 1 million in 1800 to 4.5 million by 1881 and over 7 million by 1911 (Porter 1994: 249). Migration did contribute to this rise, but improvements in public health in the second half of the century enabled natural increase to account for more than half the population growth (Inwood 1998: 416). Even so, three out of the 30 registration districts in London still had a migrant majority in 1881 (Inwood 1998: 416). Doubtless there were many more such areas at a sub-district level. In the light of our focus on relative frequencies, if we want to trace influences on London English, it is in districts such as these that we are likely to find them. By the time of the late-nineteenth century Jewish immigration, there was already an established, wealthy Jewish population in London, numbering some 46,000 in 1881. The new immigrants were refugees from Tsarist Russia and Poland, and the Jewish population rose to 140,000 by 1905 (Inwood 1998: 413), at which point the Aliens Act of that year would sharply reduce the numbers of new immigrants. Yiddish was their vernacular language (Russell and Lewis 1901: 18). There is little, if any, published research on the maintenance of Yiddish during these years, but the history of Yiddish theatre at the Pavilion in Whitechapel Road is instructive. Yiddish performances had their heyday there in the 1920s, but by 1935 the population of Yiddish speakers was so diminished that they had to cease.
Speculative reasons given are the ‘Anglicisation’ of the younger generations and migration to wealthier parts of London (All About Jewish Theatre, n.d.). Language shift was evidently rapid, encouraged by policies favouring integration – this was true even in the 4,000-pupil Jews’ Free School, where ‘the emphasis was on integration. Pupils were encouraged to discard the Yiddish language and focus on becoming little English men and women’ (Cook 2012). We can conclude, therefore, that the language ‘died’ with the demise of the first generation. Yiddish once more became a community language when new waves of refugees arrived in the 1930s, escaping persecution in Europe, and it remains so in the East End today. High proportions of Jewish people live in parts of the area today, particularly in Stamford Hill, where there is now a substantial community of Ultra-Orthodox Charedi, whose communities were founded there in the 1920s (http://www.hackney.gov.uk/hackney-the-place-diversity.htm#.UnjZ7XC-2Cd). There was, then, a linguistic if not a social discontinuity between the 1880–1905 East European immigrants and the later inter-war refugees. Because the latter did not settle in such a concentrated way, their scope to influence local varieties of English would inevitably have been much more limited. Our focus, therefore, must be firmly on Whitechapel at the turn of the twentieth century. If we are to find any Yiddish influence on London English from this period, we need information about local speech from the time when Yiddish was at its height in terms of having both adult and child speakers. This implies a window which finishes around 1900, when young immigrant, and hence Yiddish-speaking, children would have been reaching adolescence or early adulthood, and the second-generation, English-born children would be in the process of shift or be mainly Anglophone. However, as is clear from our earlier argument, we still need to know (as far as we are able) the proportions of Yiddish speakers to English speakers. The reason for this is the linguistic ‘advantage’ enjoyed by the founder population of an area: for a number of reasons, including prestige and cultural dominance, the language of the earliest inhabitants of an area stands a better chance of survival than that of later incomers. This ‘founder principle’ has been promoted by Mufwene (2001) as a means of modelling the early development of Atlantic creoles. Importantly for us, the crux of his argument is a focus on the relative proportions of speakers of the European lexifier languages in a given plantation community, emphasising the length of time speakers of one or another language group dominated numerically. The argument can be summarised as follows (cf. Mufwene 2001: 62–64). Where Europeans were in a majority and their language continued to be transmitted for a considerable time, including to members of the slave population living in proximity to them, the language (English or French) would survive among both the European and the slave populations. If Europeans were in a minority and their language was not being transmitted, then creolisation would take place. In the case of the East End just before the beginning of the twentieth century, we need to establish the proportions of Yiddish to English speakers: which group was numerically superior? Yiddish speakers were not a ‘founder’ population, but if their proportions were high enough they had the potential to swamp the local English speakers. If shift to English was rapid, we would expect second-language varieties to have formed a significant input to the resulting variety of English. The shift seems indeed to have been rapid, being well on the way to completion within one generation.
What were the social conditions, including demography, contact and ideology, which led to the shift? What kind of social integration was taking place around the turn of the twentieth century? We turn to these questions now. We are fortunate in having relatively detailed information about the distribution of the Jewish and non-Jewish population in the East End at the critical period. In 1899, George Arkell published his Jewish East London (Arkell 1899 [2012]), a street map based on a survey of dwellings across the boroughs of the East End (see Map 1 below). The darkest shading (dark blue in the original) is in the south-west quadrant and shows streets which are at least 95 per cent Jewish. These are surrounded by streets which are at least 50 per cent Jewish. The remainder of the map is largely shaded a deep red, signifying a population which is less than 5 per cent Jewish. What is striking is the extreme concentration of the then-recent Jewish immigration within a fairly compact area. The area became relatively self-contained, with 70 per cent of the population being employed locally in tailoring (Cook 2012). Despite this, there were cross-community contacts, with many gentile children being employed as a Shabbos goy (‘Sabbath non-Jew’) to light fires in Jewish households on the Sabbath. Schools, however, reflected the ethnic composition of the area, with many being close to 100 per cent Jewish (this was true not only of the Jews’ Free School). A high proportion of children additionally attended small chederim, or traditional elementary schools teaching religion and Hebrew, where the medium of instruction was almost always Yiddish (Lewis 1901: 217).


Map 1. Jewish East London (south-western portion) (Arkell 1899 [2012]). The strongly Jewish areas are indicated by darker shading in the bottom left-hand portion of the map

Despite this concentration and despite pride in religious and community-based institutions and traditions, Yiddish was quickly abandoned in favour of English, a process which was apparently complete within one or two generations, as we have seen. Ideological factors of two sorts might be underlying causes. At the time, there was a prevailing European negative attitude to Yiddish (Schmid 2002: 343). As Russell (1901: 31) comments: ‘Yiddish [is] a ‘jargon’, which mainly consists of bad German’. Little is known about contemporary Jewish and non-Jewish attitudes to Yiddish in London, but this negative ideology could well have lessened the potential for the language to act as an identity marker, thus hastening language shift. Secondly, there was clearly a collective desire to make social and economic progress in the adopted society, and this is reflected in many of Russell and Lewis’s (1901) comments about the near-complete ‘Anglicisation’ of the second generation; these are important for our argument, and we return to them below.

3 A Jewish Cockney in the early twentieth century? Testimony and circumstantial evidence

In the initial stage, perhaps up to 1900, children’s acquisition of English would have been through formal primary and secondary schooling. There would also have been English-language input from adult learners – the parents and their generation. Contact with local English speakers would have ranged from extensive to very little, depending on occupation, neighbourhood and cultural norms. Such a situation favours the growth of ethnolects (Wölck 2002), where a single ethnolinguistic group has migrated and maintains a measure of internal cohesion, allowing a distinguishable, group-based variety of the host language to emerge. In Britain today, there are British Asian ethnolects in cities such as Bradford and Sheffield, where these conditions apply (Heselwood and McChrystal 2000; Kirkham 2011; for Glasgow and London, see Stuart-Smith, Timmins and Alam 2011, Fox 2007 and Sharma 2011). We turn next to the evidence for a specifically Jewish ethnolect in East London in the early years of the last century. To our knowledge, there are no relevant contemporary recordings. There are, however, a small number of recordings of elderly speakers from this part of London who were children at the critical time. But first we will look for contemporary testimony. Russell and Lewis’s (1901) descriptions of the still-young Jewish community of the East End contain much social commentary, including reflections on religious practice, work, leisure, education, the ‘Jewish character’ and relations with non-Jews. Each author (the second a Jew) paints a largely positive picture of a successful community, aided by the much wealthier existing Jewish population. They are at pains to show that Yiddish is only really spoken by the immigrants themselves, while their children speak English. The ideology of the book is both pro-Jewish and pro-integrationist (though Lewis is at odds with Russell, whom he accuses of overestimating the Jews’ degree of integration and secularisation). The authors do not make any comments about the way English is spoken, but, even allowing for their ideological stance, we can deduce from their account that they believed there was not a distinctive ‘Jewish’ way of speaking, or at least that the community’s way of speaking English was not salient to either outsiders or insiders. Four quotations support this:

The ‘Anglicising’ process, however, cannot be said to be very widely or thoroughly effective, except in the case of the rising generation. Here the transformation effected by an English training is astonishing in its completeness. All the children who pass through an elementary school may be said to grow up into ‘English Jews’. (Russell 1901: 23–24)

It has been seen that the social isolation which preserves the Yiddish-speaking community from all the contaminating influences of intercourse with Gentiles is no longer maintained in the case of the English-born generation. The English training, and the inculcation of English habits and ideas, goes far towards robbing them of their Jewishness. They consider themselves Englishmen, and do not apparently attach any very great sanctity or importance to the racial and religious ties which bind them to their fellow-Jews who have immigrated from foreign lands. And the reality of this change is at once attested and emphasised by the cordial feelings with which English Jews are commonly regarded even by the most bitterly anti-foreign among the East End Gentiles. The barrier of social prejudice, in fact, may be said to have broken down. (Russell 1901: 140–141)

The typical Jew, of the class we mention, has certainly been thoroughly Anglicised, though he may bear a Dutch name which indicates the country from which his family came originally. (Lewis 1901: 163)


… the child brought up in England regards Yiddish with contempt. I have myself met boys who had been taught to translate Hebrew which they did not understand into Yiddish, which was equally unintelligible to them. (Lewis 1901: 219)

In sum, the Jewish immigrants are said to remain socially isolated, while their children have moved a long way towards integration socially and linguistically, decisively turning their backs on the old language. Schools followed the policy of Anglicisation, and as pointed out by the project Moving Here (n.d.), ‘The schools seem to have succeeded in this aim: an 1894 Board of Trade report describes how the children ‘enter the school Russians and Poles and emerge from it almost indistinguishable from English children’’. In relation to speech, one wonders what lies behind the author’s choice of ‘almost’ here. A century on, the Manchester-born author Howard Jacobson denies that a British Jewish accent ever existed. In a film review, he writes:

If you’re going to be funny about being Jewish, know to the bone what you are being funny about. It is not funny simply to name Jewish food. It is not funny to employ yiddishisms like bubbeleh and lobbes unless you can find the poetry in them. Least of all in a Jewish accent that hasn’t been heard since my father’s family arrived from Kamenetz Podolski in 1893, and probably not then. (Evening Standard, 12 February 2004)

Jacobson is criticising the use, in the film, of a stereotyped British Jewish accent, albeit not necessarily a London one. Such stereotypes can occasionally be encountered in films and sitcoms, such as Peter Cook’s 1960s portrayal of an East End tailor in Never Mind the Quality, Feel the Width. Although it is difficult to establish the source of the stereotyped Jewish accent Jacobson mentions, it is highly likely to have a basis in an earlier reality. However, Jacobson’s comments about the non-existence of an early British Jewish accent are not directly applicable to London, since his grandparents had settled in Manchester. So far, our evidence for the presence or absence of a London Jewish accent at the turn of the twentieth century is circumstantial and inconclusive, and favours absence. But given what we know from contemporary situations in different parts of the world, including Britain, ethnically distinct varieties of host languages are far from rare. Can the turn-of-the-century East End have been so different? Wells (1982) suggests that there is, indeed, a London Jewish accent. He writes:

Another subvariety is Jewish, characterized (at least in its stereotype) by laminal rather than apical pronunciation of /t, d/ and by the use of a velarized labio-dental approximant, a dark [ʋ], for /r/; also, often, by the use of [-ŋɡ-] in singer, etc. (Wells 1982: 303)

Following on from Wells’s comments, Foulkes and Docherty (2000: 37–38) mention the absence of orthographic <r> in Dickens’s portrayal of Jewish characters, as in tyfling for trifling, where the omission of /r/ could correspond to a labiodental [ʋ] when spoken in a labial environment (a following /f/), and they go so far as to suggest that London Jewish speech might be the origin of labiodental /r/ in British English (however, see remarks in Fabricius, this volume).


But there is an important caveat: note Wells’s hedge ‘at least in its stereotype’. This suggests that he himself has not heard these variants – and he has since confirmed this to be the case (Wells, p.c. 2013). This stereotype presumably has the same source as the comedian Peter Cook’s 1970s representation of Jewish speech, but we cannot easily tell whether the source is the early or the mid-twentieth century. Our argument suggests that the conditions were right for a Jewish ethnolect to arise shortly after 1900, and that the more diffuse immigration from the 1930s probably did not meet these conditions. Since there are no other (published) observations about a London Jewish accent, existing in the present or in the past, we need to look elsewhere for evidence.

In order to judge whether there have been contact-based influences on London English, we need an indication of the degree of stability and change during the period with which we are concerned, the twentieth century up to 1980. To do so, we turn to audio recordings of individuals who were born and raised in the East End before 1900, as well as archive recordings of people born between 1931 and the mid-1950s. Together, these will give us a picture of the stability, or otherwise, of East London vernacular vowel systems before the rapid vowel changes that set in with the appearance of MLE in the last two decades of the century.

4 Sivertsen’s Cockney Phonology: Mid-century recordings as a window on early twentieth-century East End speech

Eva Sivertsen’s Cockney Phonology (1960) provides us with some of the earliest speakers for whom we have extensive recordings. This was one of the first investigations of an urban dialect in Britain, but can really be regarded as traditional dialectology in an urban environment because of the small number of informants, the lack of any quantification of results and an unsystematic treatment of variation. Sivertsen carried out her fieldwork in 1949 and 1956–7 and it constituted the data for both her MA and PhD theses, the latter being later published as Cockney Phonology. We were able to acquire Sivertsen’s original tapes from her family in 2009. She had four named informants, all women born in 1874–1892, though she also based her study on other speakers she met in and around the social club where she carried out the fieldwork. However, only two of the four women’s data were subjected to extensive analysis. Sivertsen’s informants came from Bethnal Green, and grew up in areas which Map 1 and Baker (1998) tell us had a high proportion of Jews. The informants’ formative years would have been around 1900, exactly the time of the map. Of the two women, one talks extensively about the Jewish neighbourhoods and her own close relations with the people.

Before we consider these women’s vowel systems, we summarise Sivertsen’s broader conclusions about Cockney pronunciation. Sivertsen’s analysis is mainly auditory, and was based on tape recordings of interviews, reading lists of words and phrases, and note taking – though she also carried out a small-scale acoustic study of some vowels. The emphasis is on Cockney phonological structure, including a comparison with RP vowels, though all consonants are also discussed, in particular with reference to glottalisation and the realisation of the liquids /r/ and /l/. Neither language contact nor the multi- (or perhaps bi-) cultural nature of Bethnal Green is specifically discussed in Cockney Phonology. She does note changes in speech due to the influence of education and more standard ways of speaking. In addition, she observes the effect of the speaking situation, i.e. the interlocutor, on the use of particular phonological and morpho-syntactic forms. She is aware of up-to-date structuralist contact linguistics, however:

Some people have a better ear for dialect differences than others. Some consider it more important to approach a standard speech form than others do. The result is a conflict not only between two, but between a great number of different speech forms, such as we find it in many large urban areas today. There is interference on a large scale, but of a type which is not easily subjected to the kind of analysis proposed by Uriel Weinreich. There are erratic pronunciations, vacillations, uncertainty, lack of consistency. (Sivertsen 1960: 3)

Weinreich (1953) deals with contact, but Sivertsen does not pursue this line of enquiry further. Sivertsen was interested in speech forms ‘when the speakers are most off their guard, when they are less conscious of how they speak, in so far it is possible to make such an abstraction’ (Sivertsen 1960: 4), and in this regard she not only anticipates Labov’s later formulation of the ‘vernacular’, but also his belief in the centrality of this speech style (Labov 1966). However, in her account Sivertsen does not ascribe features to particular informants, so that what she presents is an impressionistic distillation of the data, generalised to the community as a whole. Sivertsen’s description of Cockney is, however, very detailed. For vowels, Sivertsen found shifted diphthongs in FACE (see Wells 1982), transcribing them as [ɛɪ] or [æɪ] (symbols are as in the original; the superscript ˘ indicates a ‘non-syllabic’ element). She notes considerable variation: the higher variant is found in more formal styles and with speakers who are considered less ‘rough’ (Sivertsen 1960: 57). GOAT has [œʊ] or [œ ʊ] with no indication of stylistic or social variation (1960: 88). This transcription represents a front onset, but a high-back offset, and this suggests that the fronting of the offset of this vowel had not started (Kerswill, Torgersen and Fox 2008). PRICE has [ɑɪ], which can be more or less monophthongal, with a typically unrounded onset (Sivertsen 1960: 64). MOUTH has a front, fairly open onset which may be monophthongal [aˑ] or slightly raised diphthong [ɛə] (Sivertsen 1960: 88–89). GOOSE is ‘strongly diphthongized and considerably more fronted [than RP]’ with a quality in the area of [əu] (Sivertsen 1960: 81). FOOT is generally [ʊ]. TRAP is fairly front and slightly raised, [ɛˑ] or [æ] (Sivertsen 1960: 59–60). 
There is also variation in STRUT which may be a front or central vowel, which she transcribes [ʌ], or a more retracted vowel in some contexts, in particular before /r/ and /l/ and before vowels (Sivertsen 1960: 83–84). DRESS is [e], between half-close and half-open (Sivertsen 1960: 53).

Sivertsen states that her two main informants are EE and MM (she does not give their names), with the former as the more important source. EE was born around 1890 and lived in or near Brick Lane all her life (the street runs north–south, and is located in the upper half of Map 1), having worked as a feather-curler and housewife. Looking at F1–F2 plots of the informants’ vowels will enable us to establish a base line for what Sivertsen considered representative of, or at least canonical for, the accent. The plot shown in Figure 1 presents average formant values, with Lobanov normalisation (Lobanov 1971). For the diphthongs (excluding GOOSE), only the vowel onset is represented. The plot is based on an automated analysis of over 7,500 vowel tokens. For this speaker, the sound files were subjected to forced alignment of segments using the procedure developed at the University of Pennsylvania. A visual inspection of the completed alignment was carried out, and obvious alignment errors were manually corrected. Automatic formant extraction was done using the online tool FAVE Extract (Rosenfelder et al. 2011).

EE has a typical London English vowel system: a fairly back FOOT and diphthong-shifted FACE and GOAT (i.e. with open onsets; Wells 1982: 308; Kerswill et al. 2008: 4) and PRICE (with a back onset in the same position or slightly above START). STRUT is the lowest short vowel – a conservative feature (see Trudgill 2004: 44–45; 133). Comparing these plots with Sivertsen’s transcriptions, we see that there is generally a good match. DRESS is higher than expected from Sivertsen’s description, and GOOSE is more front than she indicates. TRAP is a front vowel. For STRUT, Sivertsen allows for considerable variation on the front/back dimension; the plot suggests that EE’s vowel is towards the back of the range. Impressionistically, EE uses a somewhat careful style. There is no TH-fronting (it did not become a majority form until the latter part of the twentieth century), but there is some glottalling intervocalically in words like getting with a syllabic /n/. Glottal stops are also found word-finally before a vowel, such as right in and got into. Wells (1982: 324) states that this feature can be found in ‘educated’ London speech, but not in Received Pronunciation. However, glottal replacement of /d/ is also common in EE’s interview in couldn’t, didn’t and wouldn’t, a well-established Cockney feature.
Surprisingly, EE is a frequent user of alveolar approximants intervocalically across a word boundary (‘t-to-r’; see Clark and Watson 2011) in phrases like got it, what I and a lot of, a feature not usually described for London, but widespread in Northern England and in vernacular Dublin English (Hickey 2005: 41). She has an alveolar or labiodental /r/, but no taps.


Figure 1. Vowel system of EE, female born c. 1890 and recorded in 1956, showing mean vowel onsets and 0.5 standard deviations (Lobanov normalisation).
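The Lobanov normalisation applied to the vowel plots can be illustrated with a short sketch. This is not the authors' own script: it simply z-scores each formant within a single speaker, which is the core of the Lobanov (1971) procedure. The function name `lobanov_normalise`, the use of the sample standard deviation, the choice to compute the mean over all tokens rather than over vowel categories, and the formant values in the example are all assumptions made for illustration.

```python
from statistics import mean, stdev


def lobanov_normalise(tokens):
    """Z-score F1 and F2 within one speaker (Lobanov-style normalisation).

    tokens: list of (vowel_label, f1_hz, f2_hz) tuples, all from ONE speaker.
    Returns a list of (vowel_label, f1_z, f2_z) tuples in the same order.
    """
    f1_values = [f1 for _, f1, _ in tokens]
    f2_values = [f2 for _, _, f2 in tokens]
    # Speaker-specific mean and (sample) standard deviation per formant.
    # Assumption: some implementations use the population SD instead.
    f1_mean, f1_sd = mean(f1_values), stdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), stdev(f2_values)
    return [
        (label, (f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for label, f1, f2 in tokens
    ]


# Hypothetical formant measurements in Hz, invented for this example only.
example_tokens = [
    ("KIT", 400.0, 2100.0),
    ("TRAP", 750.0, 1800.0),
    ("START", 700.0, 1100.0),
    ("FOOT", 430.0, 1000.0),
]
normalised = lobanov_normalise(example_tokens)
```

Because each speaker's values are expressed relative to that speaker's own mean and spread, the resulting F*1 and F*2 values can be compared across speakers with different vocal tract sizes, such as EE and Mr Kent.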

As a comparison, we analysed the vowels of H. J. Kent, a man born in 1888 in the borough of Hackney, which borders onto Bethnal Green. Data for him comes from the Survey of English Dialects (SED) recordings held by the British Library and accessible online. Forced alignment was not used; 128 tokens were analysed by hand, representing all the stressed vowels found in the SED interview on the British Library website. 18 tokens were measured for the most frequent vowel (FACE), with three being measured for the least frequent (NURSE). Figure 2 shows that Mr Kent has a similar vowel system to EE’s, but has a more shifted FACE, a more back FOOT and a less back/more open STRUT, which is clearly the lowest vowel in the system. The two speakers share fairly high qualities for KIT, DRESS and TRAP.


Figure 2. Mr H. J. Kent, mean vowel onsets (Lobanov normalisation), showing 0.5 standard deviations.

A somewhat different picture emerges with Sivertsen’s other main informant, MM. She was born in 1892 and had lived at the corner of Bethnal Green Road and Brick Lane all her life. We are told that she had an Irish family background, though no further details are provided. She worked as an upholsterer and in a tea factory. Sivertsen states that her neighbours considered her to be ‘a ‘real, rough Cockney girl’, in speech and manners’ (Sivertsen 1960: 7). Figure 3 shows MM’s vowel system. Because the sound quality on her recordings was relatively poor, automatic formant tracking was not possible, so a smaller subset of her tokens was analysed by hand using Praat. A total of 194 tokens were analysed, ranging from 30 for the most frequent vowel (FACE) down to two for the least frequent (CHOICE). Monophthongs were measured at the midpoint, while diphthongs were measured at the steady state portion of the spectrogram immediately after the onset but away from the influence of preceding segments. This was about one quarter of the way into the vowel.


Figure 3. Vowel system of MM, female born in 1892 and recorded in 1956 (based on a manual analysis of vowels; Lobanov normalisation), showing 0.5 standard deviations.

Her short front vowels, KIT, DRESS and TRAP, are lower than those of EE and Mr Kent, and her STRUT is not the lowest short vowel in her system. These differences are consistent with the early stages of a mid-to-late twentieth century short-vowel shift in the London region (Torgersen and Kerswill 2004) which is not present in the systems of the latter two. Compared with EE, MM has a raised MOUTH and lowered FACE, suggesting more advanced diphthong shifting than her coeval. This is true at least for FACE, for which a shifted vowel is probably a twentieth-century innovation (see Trudgill 2004: 55–57 for evidence of this). Shifted diphthongs in MOUTH were well established in the mid-nineteenth century (Trudgill 2004: 52, citing Ellis 1889), and may in fact be a conservative feature and, therefore, ‘shifting’ a misnomer (Britain 2009). There is evidence that diphthong shifting in FACE was an ongoing process in the Southeast of England from the late nineteenth century at least until the 1950s, spreading out from London. Trudgill (2004: 51–59) summarises evidence from Ellis (1889) and the Survey of English Dialects (Orton and Tilling 1970), as well as other publications, to show that there was a gradual diffusion of this feature throughout this period. Diphthong shifting of FACE was most likely still a live process in London around the turn of the twentieth century and later when MM and EE were growing up. (In London’s inner city, diphthong shifting of all the relevant vowels is currently being reversed, as a result, we argue, of post-World War II language contact – see Kerswill et al. 2008 and Cheshire, Kerswill, Fox and Torgersen 2011.) It is possible to argue, then, that MM’s vowel system is more ‘advanced’ in two respects: it shows participation in the short-vowel shift as well as the results of continued diphthong shifting of FACE. The caveat here is that the height of the onset of this diphthong was socially sensitive and possibly subject to style shifting: some, but probably not all, of the differences between MM’s and EE’s FACE might be due to the latter’s somewhat careful speech style.

5 Cockney vowels 1930–1970

We turn now to recordings of Londoners born one or two generations after Sivertsen’s informants. The project Linguistic Innovators: The English of Adolescents in London (Economic and Social Research Council, 2004–7; see Kerswill et al. 2008) included recordings of eight elderly East Enders born in 1918–35. Figure 4 shows the vowel system of Mr MG, born in 1931.

[Vowel plot: normalised axes F*1 and F*2, each with ticks from 1.5 to -1.5; vowels plotted: TRAP, DRESS, STRUT, START, GOOSE, KIT, FOOT, LOT, CHOICE, FACE, GOAT, MOUTH, PRICE]

Figure 4. Mr MG, elderly male speaker from Hackney (b. 1931, recorded 2005).

Figure 4 shows a system similar to those of EE, MM and Mr Kent, with the low-central STRUT vowel of EE and Mr Kent and the relatively extreme diphthong shifting of MM and Mr Kent. This suggests a certain stability over a 50-year period between the birth of the former three and Mr MG. The speech of two individuals born around 1944 and 1955, respectively, recorded in their adolescent years, brings the comparison to the middle of the century. One of these is PF, a girl aged around 12, who was recorded having a lively conversation with Sivertsen. Figure 5 shows her vowel system.


Figure 5. PF, female aged 12 (recorded in 1956 by Eva Sivertsen).

PF shows many of the same characteristics: central GOOSE, back FOOT and strongly shifted MOUTH and PRICE. STRUT is still the lowest short vowel, and in this respect she is conservative in relation to MM, though TRAP has moved down to occupy almost the same space. DRESS and KIT are lower than those of EE and Mr MG, but are similar in relative height to those of MM. PF’s short vowels, then, share with MM the beginnings of participation in the Southeastern short-vowel shift noted earlier. Unlike the older speakers, FACE does not show diphthong shift. Finally, we will look at the speech of a boy aged about 13, recorded by William Labov in Southall (West London) in 1968.


Figure 6. Boy aged 14-15 (1968). Recorded in Southall by William Labov. Manual analysis.

In almost all respects, this boy’s system, shown in Figure 6, is very conservative, with a front, raised TRAP and a front-central STRUT which is the lowest of all vowels by some distance. He shows, therefore, no sign of the Southeastern short-vowel shift which we detect in MM (who was eighty years his senior) or PF. MOUTH, FACE and GOAT are very strongly diphthong-shifted. For FACE and GOAT, this is probably best interpreted as a continuation of an ongoing process. For MOUTH the position is not clear, for reasons we have just given; however, the onset of MOUTH is higher than for any of the older informants discussed here, and this suggests a (new but time-limited?) raising process.

Despite a number of uncertainties about the movement, or indeed stasis, of some of the vowels, the overall picture is one of considerable stability across nearly eighty years, with the vowel systems of speakers born in the 1880s quite closely matching those of at least some people born in the mid-1950s. This is in spite of great economic and social change, a locally high level of immigration at the beginning of the period and the start of mass immigration at the end of it. Falling within this period were the two World Wars, with their dramatic disruption of families and neighbourhoods. None of these factors, however, seems to have had any impact on the vowel systems of the working-class population of the capital. Instead, change amounted to a slow, small, Neogrammarian chain shift in the short vowels.

6 Language contact and linguistic change in the turn-of-the-century East End?


The deliberately descriptive approach we have taken so far has excluded considerations of contact and social factors. Earlier we argued that the best place to look for contact-induced change in London is in highly circumscribed, local communities in specific time periods. We therefore return to the Jewish parts of Bethnal Green of 120 years ago. In the previous section, we saw how MM seems to have a markedly ‘modern’ vowel system by comparison with her contemporaries EE and Mr Kent and, indeed, the young speaker born three generations later. MM’s vowels still fall within the envelope of a working-class London accent. However, some listeners today report that they hear something ‘foreign’ about her accent: members of audiences at academic presentations involving the Sivertsen data have commented that her pronunciation suggests either an Italian or an East European influence. Regardless of whether these observations are reliable, it is worth carrying out a closer phonetic analysis. We do this in two phonetic areas where varieties of English which have experienced substantial and prolonged language contact appear to differ from ‘inner-circle’ varieties, such as those spoken in southern England or by North American, New Zealand or Australian descendants of European settlers. The first of these areas concerns speech rhythm, as captured by the Pairwise Variability Index, or PVI, which measures the degree of stress timing in a language – in other words, whether stressed syllables in discourse tend to occur at equal intervals (Torgersen and Szakay 2012). PVI is calculated as the proportional difference between the durations of adjacent syllables in a sample of speech. The more unequal the syllables are, the higher the PVI will be, and the closer to an idealised stress timing the sample is. 
If syllables are more nearly equal, the PVI will be lower, and the sample is closer to being syllable timed (see Torgersen and Szakay 2012: 824–825 for a more detailed account of the PVI measure). As is usual practice, the PVI we use here is normalised for speech rate, and is known as nPVI (Grabe and Low 2002; Torgersen and Szakay 2012). As an example of a putatively syllable-timed language, French has a low nPVI of 43.5, while ‘British English’ (more specifically, Standard Southern British English or RP) has a score of 57.2, making it relatively stress-timed (Grabe and Low 2002: 544). In what follows, we present the nPVI for a number of varieties of English, three of which are clearly contact varieties (Māori English [Szakay 2006, 2008], Singapore English [Grabe and Low 2002] and MLE as spoken by teenagers in the multilingual London borough of Hackney). Contact varieties tend to have a greater tendency towards syllable timing than other varieties (Grabe and Low 2002), and it is this fact that will interest us. Figure 7 shows nPVI values for these speaker groups and for Standard Southern British English. For London’s East End, the figure shows data for four groups of speakers from the Linguistic Innovators project: from Hackney, there are elderly Anglos (people of white British background), young Anglos and young non-Anglos (the children of immigrants, almost all from developing countries). The latter groups were 17–19 years of age. By way of comparison with matched speakers from a London borough with low language contact, we include data from older and younger Anglo speakers from Havering. Finally, we include nPVI scores for EE, MM and the remaining two elderly informants recorded by Sivertsen, Mrs C and Mrs P.
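The nPVI calculation described above can be sketched in a few lines. The formula follows Grabe and Low (2002): for each pair of adjacent intervals, the absolute difference in duration is divided by the mean duration of the pair, these ratios are averaged, and the result is multiplied by 100. The function name `npvi` and the durations used in the example are hypothetical, and the sketch is not the measurement procedure used for the figures in this chapter; in Grabe and Low's formulation the intervals are vocalic intervals.

```python
def npvi(durations):
    """Normalised Pairwise Variability Index (after Grabe and Low 2002).

    durations: sequence of successive interval durations (any consistent
    unit, e.g. milliseconds). Dividing each pairwise difference by the
    pair's mean duration is what normalises the index for speech rate.
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    total = 0.0
    # Walk through adjacent pairs: (d1, d2), (d2, d3), ...
    for first, second in zip(durations, durations[1:]):
        total += abs(first - second) / ((first + second) / 2)
    return 100 * total / (len(durations) - 1)


# Hypothetical durations in milliseconds, invented for illustration.
even_rhythm = npvi([100, 100, 100])    # identical intervals
uneven_rhythm = npvi([50, 150, 50])    # strongly alternating intervals
```

Identical adjacent durations give an nPVI of 0, the idealised syllable-timed extreme, while strongly alternating durations push the value up towards the stress-timed end. The published scores cited above (43.5 for French, 57.2 for British English) lie between such extremes.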


Figure 7. Normalised Vocalic Pairwise Variability Index (nPVI) for thirteen speakers/speaker groups (see text for explanations and sources).

The speakers/groups in Figure 7 have been ranked by descending nPVI, with the more stress-timed voices towards the left of the figure. Contact varieties have been highlighted in grey; we provisionally take younger Hackney speakers of any ethnic background to represent MLE, and hence a contact variety, because of the multi-ethnic and multilingual nature of the communities here. The four Sivertsen informants’ bars are coloured black. Overall, we note that all but one of the non-contact varieties cluster in the left-hand half of the figure, and that all contact varieties are located on the right. Unexpectedly given their apparently homogeneous social backgrounds, Sivertsen’s informants are spread right across the spectrum, with EE’s score exceeding that of any other speakers/groups and Mrs P and Mrs C being placed well towards the syllable-timed end of the spectrum, clustering with the contact varieties. MM, however, is near the centre of the spectrum. We therefore need to square this result with our notion that MM’s spoken English might have a ‘foreign’ element to it, and that part of that impression is syllable timing. Despite their greater syllable timing, Mrs P and Mrs C do not sound ‘foreign’ in the way that MM does to some listeners.


One way of approaching this is to look at the East End as a locale which has seen waves of immigration over centuries, with the result that there has been virtually no period without a substantial number of non-Anglophone incomers in the communities. The nPVI for the elderly Hackney Anglos we interviewed in 2005 matches that of the young Anglos, whose language socialisation very clearly involves high degrees of contact with non-native English. Neither the elderly nor the young people in Havering have similar nPVI values, instead grouping with the prototypical non-contact varieties. The implication of this is as follows. The elderly Anglos from Hackney, born in the 1920s and 30s, were raised in communities which were bi- or multilingual or which had been so in the period immediately before their linguistic socialisation. For parts of the East End, especially Bethnal Green, this was the case. The language variety spoken by MM may therefore not have been atypical; low nPVI values could have been part of the local accent in Hackney, and this is reflected in the scores of all but one of the East End speakers whose nPVIs we have measured. The odd person out is EE, and perhaps it is she who is the exception, not MM. None of this is true of Havering, which was and remains relatively monolingual, and whose population continues to have high nPVI values. The difference between MM’s and EE’s nPVIs suggests the presence of intervening social factors. Before we examine these, we will pursue the phonetic differences between the two of them a little further. This time, we hypothesise an influence from Yiddish itself, specifically voice onset time (VOT). This is the duration of the audible burst in stop consonants, such as /t/, before voicing begins. In some languages, including English, initial voiceless consonants are said to be aspirated, with a relatively long duration compared to, say, Greek or Spanish, which have short VOTs for /t/. 
Yiddish, along with Dutch and a number of south German varieties, also has a notably short VOT (Iverson and Salmons 1995; Jewish Language Research Website n.d.), and this feature is a strong candidate for adoption in cases of language shift. It is also a variable characteristic of English as spoken in some British Asian communities, where transfer of this feature from Panjabi, Bengali or Sylheti appears to have taken place (Kirkham 2011, McCarthy et al. 2013). With this in mind, we analysed the VOT in an oral history interview with a Jewish East Ender, Philip Bernstein, who was born in Hackney in 1910 of Lithuanian/Russian parentage (National Archives n.d., recorded 1992). Figure 8 shows the VOT values of two reference varieties, Southern British English as spoken by university students (Docherty 1992) and standard varieties of American English (Lisker and Abramson 1964), as well as EE, MM and Philip Bernstein.


Figure 8. Voice onset time (VOT) measurements for American English, Southern British English and three East Londoners born 1890–1910.

We can see that EE groups with Southern British English, while MM lies halfway between EE and the (presumably Yiddish influenced) Mr Bernstein. This may well indicate a similar Yiddish-derived feature in her speech. MM is not, however, Jewish, but one lesson of recent studies of multiethnolects is that linguistic features can be used by speakers whose home language and heritage are not the origin of those features (Svendsen and Røyneland 2008, Cheshire et al. 2011). Our argument is, then, that East End Cockney around the turn of the last century and beyond contained at least some phonetic features which may have come about initially through transfer following language shift, subsequently becoming a permanent feature of the local variety of English, through a process of at least embryonic focussing (RH). The data is consistent with this interpretation, though in the absence of recordings taken from a representative sample of the city’s population in the relevant period it is not possible to tell if the argument matches the reality of the time.

We have not yet addressed the reasons for the differences between EE and MM in both of these features (speech rhythm and VOT). As we have already noted, these two people grew up in very similar neighbourhoods and have, so far as we can tell, very similar social histories. Could there be a difference in their social networks at a critical time in their earlier lives? Very strikingly, MM discusses at considerable length her contacts with, and attitudes to, the local Jewish population, in a way that none of the other three of Sivertsen’s informants do. MM was heavily involved with the Bethnal Green Jewish community when growing up, and still was at the time of the interview. She knows about their religion and their cooking, and comes up with a number of Yiddish words:


now we can do horseradish (.) have you ever heard of horseradish? (.) and beetroot? [Eva: yes horser er horseradish] chrein [kɹɛɪn] we call it ... mixed together [Eva: horseradish and what mixed together?] beetroot [Eva: mm oh I haven’t tried that] I’ll show you a jar of it [Eva: mm] (.) no I suppose you haven’t seen it? [Eva: no I don’t think so] you can eat it with cheese meat lamb whatever you like [Eva: oh mhm] (.) open it] go on [Eva: horseradish and beetroot] that’s it and it’s hot (.) so smell it [Eva: yes] it’s very hot [Eva: oh yes it is you take with er?] (.) you eat that with cheese [Eva: aha] or meat [Eva: mhm] (.) or anythink you like [Eva: yes] hm (.) that’s what it’s made of [Eva: mhm] (.) they call it (.) chrein [kɹɛɪn] (.) Jews call this chrein [kɹɛɪn]

She describes her job as a Shabbos goy (though she doesn’t use the term):

I like Jews Jewesses rather but I get along with them all right all me life I suppose it’s living down here in Brick Lane when I was a little about ten (.) not cos we wanted it but we used to go there like five (.) to light their fire on a Saturday (.) we light their fire (.) they call them frum [fɾʊm] a good Yiddisher person is frum [fɾʊm] you see

MM gives chrein a mainstream English pronunciation with initial [kɹ], rather than Yiddish [xr]. Frum receives the Yiddish [ʊ], which suggests first-hand knowledge of the word, if not the language. She uses a tap [ɾ] in this word, though this is characteristic of her speech more generally: /r/ in /fr/ and /θr/ clusters is usually a tap, as it is intervocalically, including linking /r/, as in after all. It is not certain to what extent [ɾ] was a normal pronunciation in London at this time, but it is likely that the retroflex [ɻ] or alveolar [ɹ] approximant was widespread in southern England in the nineteenth century, alongside weakly tapped variants: referring to Southern dialects, Ellis (1889: 23) writes ‘The one ancient character which runs more or less persistently through the modern S. div. [Southern Division] is the reverted (ʀ) or retracted (r,), the parent of the point-rise or untrilled (ro) or vocal (ɐ), which still permeates received speech’. The symbol (ro) refers to a ‘buzzed’ sound ‘not touching the palate’ (Ellis 1889: Preliminary Matter p. 85), while (ʀ) may have a ‘flap [which is] indistinct and less sharp than for (r)’ (Ellis 1889: Preliminary Matter p. 85), while it is often a retroflex sound that characteristically ‘seems to blend with the preceding vowel’ (Ellis 1889: 23). Taps may well be a conservative feature (as pointed out by Trudgill 2004: 71–72), but the vigorously articulated taps produced by MM do not sit well with what is otherwise a rather modern vowel system and marked syllable timing compared to Sivertsen’s other informants. We would speculate that this, too, is a transfer feature from Yiddish (but see Chapter X for a discussion of a tap in Received Pronunciation without any reference to contact).

7 Discussion


Cockney in the twentieth century displays considerable continuity and slow change, suggesting that its transmission has been through an unbroken chain of intergenerational transfer (Thomason and Kaufman 1988: 9–10). Labov makes the strong claim that such transmission precludes language and dialect contact, and that cases of contact must be treated separately (Labov 2007). It is clear that the social conditions in the East End around the turn of the last century would have been propitious for transfer through language shift, and the greater than usual degree (for English in southern England) of syllable timing and a short VOT might well be transfer features of this kind. The use of tapped /r/ could fall into the same category. These features could have been transmitted to non-speakers of Yiddish of Jewish, Anglo and other backgrounds. The first two features (at least the tendency towards syllable timing) may well be restricted to the East End (we lack data to state this with any certainty); if that is true, then we are probably dealing with a long-term, stable contact phenomenon, which is reinforced by successive waves of immigration and which is likely to be swamped if the migration were to cease. Vowel qualities, however, appear to be unaffected by this process, though it is highly likely that the move to greater syllable timing is partly reflected in durational changes in the vowels and the loss of some reduced forms (see Torgersen and Szakay 2012 for discussion). As we have argued elsewhere (Cheshire et al. 2011), the situation in the late twentieth century is altogether different, in two respects. First, there are now upwards of 200 languages in the mix, compared to a handful a century ago, and just two (English and Yiddish) in Bethnal Green. Secondly, immigrant populations of 30 per cent and higher are now pervasive throughout London, and not restricted to just a few wards in some boroughs as was the case 120 years ago. 
We argue that language acquisition is now characterised by group second-language learning (Winford 2003) in the context of a feature pool (Mufwene 2001: 4–6): children and adolescents are acquiring their linguistic and sociolinguistic competence in a context where often a majority of other people are not first-language speakers of English. This applies also to those whose home language is English and who are exposed to traditional varieties of London English. What the present has in common with the past is that intensive language contact leads to (potentially) long-term changes. In MM’s voice, we may just be catching a glimpse of a long-extinct, Yiddish-influenced way of speaking.

References

All About Jewish Theatre n.d. London: the end of Yiddish theatre. http://www.jewish-theatre.com/visitor/article_display.aspx?articleID=3504. Accessed 8/4/14.

Arkell, George 1899 [2012]. Jewish East London. Map reproduced by the Museum of London, 2012. Oxford: Old House.

Baker, T. F. T. 1998. Bethnal Green: Economic History, A History of the County of Middlesex, Vol. 11: Stepney, Bethnal Green (1998), 168-190. http://www.british-history.ac.uk/report.aspx?compid=22757. Date accessed 27 May 2014.


Baugh, Albert C. and Thomas Cable 1993. A History of the English Language Fourth edition. London: Routledge.

Britain, David 2009. One foot in the grave?: Dialect death, dialect contact and dialect birth in England. International Journal of the Sociology of Language 196/197: 121-155.

Cheshire, Jenny, Paul Kerswill, Susan Fox and Eivind Torgersen 2011. Contact, the feature pool and the speech community: The emergence of Multicultural London English. Journal of Sociolinguistics 15: 151-196.

Clark, Lynn and Kevin Watson 2011. Testing claims of a usage-based phonology with Liverpool English T-to-R. English Language and Linguistics 15: 523-547.

Cook, Beverley 2012. The Jewish East End, 1899. Notes accompanying a facsimile edition of George Arkell’s map ‘Jewish East London’. Oxford: Old House.

Docherty, Gerard 1992. The Timing of Voicing in British English Obstruents. Berlin: Foris.

Ellis, Alexander 1889. The Existing Phonology of English Dialects, Compared with that of West Saxon Speech. New York: Greenwood Press.

Foulkes, Paul and Gerard Docherty 2000. Another chapter in the story of /r/: ‘Labiodental’ variants in British English. Journal of Sociolinguistics 4: 30-59.

Fox, Susan 2007. The demise of Cockneys? Language change among adolescents in the ‘traditional’ East End of London. Unpublished PhD dissertation. Colchester: University of Essex.

Grabe, Esther and Ee Ling Low 2002. Durational variability in speech and the rhythm class hypothesis. In: Carlos Gussenhoven and Natasha Warner (eds), Papers in Laboratory Phonology, Vol. 7. Mouton: Berlin, pp. 515-546.

Green, Jonathon 2014. Multicultural London English: the new ‘youthspeak’. In: Julie Coleman (ed.). Global English Slang: Methodologies and Perspectives. London: Routledge, pp. 49-61.

Hansen, Gert Foget and Nicolai Pharao 2010. Prosody in the Copenhagen multiethnolect. In: Pia Quist and Bente Ailin Svendsen (eds) Multilingual Urban Scandinavia. Bristol: Multilingual Matters, pp. 79-95.

Heselwood, Barry and Louise McChrystal 2000. Gender, accent features and voicing. Leeds Working Papers in Linguistics and Phonetics 8: 45-69.

Hickey, Raymond 2005. Dublin English. Evolution and Change. Amsterdam: John Benjamins.

Inwood, Stephen 1998. A History of London. London: Macmillan.

Iverson, Gregory K. and Joseph C. Salmons 1995. Aspiration and laryngeal representation in Germanic. Phonology 12: 369-396.

Jewish Language Research Website. n.d. http://www.jewish-languages.org/yiddish.html. Accessed 7/9/14.

Kerswill, Paul 2014. The objectification of ‘Jafaican’: The discoursal embedding of Multicultural London English in the British media. In: Jannis Androutsopoulos (ed.) The Media and Sociolinguistic Change. Berlin: de Gruyter Mouton, pp. 428-455.

Kerswill, Paul, Jenny Cheshire, Susan Fox and Eivind Torgersen 2013. English as a contact language: The role of children and adolescents. In: Marianne Hundt and Daniel Schreier (eds) English as a Contact Language. Cambridge: Cambridge University Press, pp. 258-282.


Kerswill, Paul, Eivind Torgersen and Susan Fox 2008. Reversing ‘drift’: Innovation and diffusion in the London diphthong system. Language Variation and Change 20: 451-491.

Kirkham, Sam 2011. The acoustics of coronal stops in British Asian English. Proceedings of the XVII International Congress of Phonetic Sciences, pp. 1102-1105.

Labov, William 1966. The Social Stratification of English in New York City. Washington: Center for Applied Linguistics.

Labov, William 1994. Principles of Linguistic Change Vol 1: Internal Factors. Oxford: Blackwell.

Labov, William 2007. Transmission and diffusion. Language 83: 344-387.

Lewis, H. S. 1901. Another view of the question. In: Russell, C. and H. S. Lewis The Jew in London. New York: Thomas Y. Crowell, pp. 155-236.

Lisker, Leigh and Arthur S. Abramson 1964. A cross-language study of voicing in initial stops: Acoustical measurements. Word 20: 384-422.

Lobanov, B. M. 1971. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America 49: 606-608.

McCarthy, Kathleen M., Bronwen G. Evans and Merle H. Mahon 2013. Acquiring a second language in an immigrant community: The production of Sylheti and English stops and vowels by London-Bengali speakers. Journal of Phonetics 41: 344-358.

Moving Here n.d. http://www.movinghere.org.uk/galleries/histories/jewish/growing_up/growing_up.htm. Accessed 27 April 2014.

Mufwene, Salikoko 2001. The Ecology of Language Evolution. Cambridge: Cambridge University Press.

National Archives. n.d. Interview with Anna Tzelniker and Philip Bernstein. http://webarchive.nationalarchives.gov.uk/+/http://www.movinghere.org.uk/search/catalogue.asp?catphase=FullDetails&phase=&RecordID=56660&ResourceTypeID=2&Sequence=1&keywords=&fuzzy=&format=&community=&theme=&date_from=&date_to=&source=&section=&person=&PageMove=Goto&PageNo= . Accessed 26/4/14.

Nevalainen, Terttu and Helena Raumolin-Brunberg 2003. Historical Sociolinguistics. Harlow: Longman.

Nevalainen, Terttu and Helena Raumolin-Brunberg 2000. The changing role of London on the linguistic map of Tudor and Stuart England. In: Dieter Kastovsky and Arthur Mettinger (eds) The History of English in a Social Context. Berlin: Mouton de Gruyter, pp. 279-337.

Orton, Harold, and Philip M. Tilling 1970. The Survey of English Dialects, Volume 3: The East Midland Counties and East Anglia. Leeds: Arnold.

Porter, Roy 1994. London: A Social History. London: Penguin.

Rosenfelder, Ingrid, Joe Fruehwald, Keelan Evanini and Jiahong Yuan 2011. FAVE (Forced Alignment and Vowel Extraction) Program Suite. http://fave.ling.upenn.edu.

Russell, C. 1901. The Jewish question in the East End. In: C. Russell and H. S. Lewis The Jew in London. New York: Thomas Y. Crowell, pp. 1-148.

Russell, C. and H. S. Lewis 1901. The Jew in London. New York: Thomas Y. Crowell.


Sharma, Devyani 2011. Style repertoire and social change in British Asian English. Journal of Sociolinguistics 15: 464-492.

Schmid, Monica S. 2002. Persecution and identity conflicts: the case of the German Jews. In: Anna Duzsak (ed.) Us and Others. Amsterdam: John Benjamins, pp. 341-356.

Sivertsen, Eva 1960. Cockney Phonology. Oslo: Oslo University Press.

Stuart-Smith, Jane, Claire Timmins and Farhana Alam 2011. Hybridity and ethnic accents: A sociophonetic analysis of ‘Glaswasian’. In: Frans Gregersen, Jeffrey K. Parrott and Pia Quist (eds) Language Variation - European Perspectives III: Selected Papers from the fifth International Conference on Language Variation in Europe (ICLaVE 5), Copenhagen, June 2009. Amsterdam: John Benjamins, pp. 43-57.

Svendsen, Bente Ailin and Unn Røyneland 2008. Multiethnolectal facts and functions in Oslo, Norway. International Journal of Bilingualism 12: 63-83.

Szakay, Anita 2006. Rhythm and pitch as markers of ethnicity in New Zealand English. In: Paul Warren and Catherine I. Watson (eds) Proceedings of the eleventh Australian International Conference on Speech Science and Technology. University of Auckland, New Zealand, December 6-8, 2006, pp. 421-426.

Szakay, Anita 2008. Ethnic Dialect Identification in New Zealand: The Role of Prosodic Cues. Saarbrücken: VDM.

Thomason, Sarah Grey and Terrence Kaufman 1988. Language Contact, Creolization and Genetic Linguistics. Berkeley: University of California Press.

Torgersen, Eivind and Paul Kerswill 2004. Internal and external motivation in phonetic change: Dialect levelling outcomes for an English vowel shift. Journal of Sociolinguistics 8: 23-53.

Torgersen, Eivind and Anita Szakay 2012. An investigation of speech rhythm in London English. Lingua 122: 822-840.

Trudgill, Peter 1998. African American vernacular English, East Anglian dialects and Spanish persecution in the Low Countries. Folia Linguistica Historica 8: 139-148.

Trudgill, Peter 2004. Dialect Contact and New-Dialect Formation: The Inevitability of Colonial Englishes. Edinburgh: Edinburgh University Press.

Weinreich, Uriel 1953. Languages in Contact: Findings and Problems. New York: Linguistic Circle of New York.

Wells, J. C. 1982. Accents of English. 3 vols. Cambridge: Cambridge University Press.

Winford, Donald 2003. An Introduction to Contact Linguistics. Oxford: Blackwell.

Wölck, Wolfgang 2002. Ethnolects - Between bilingualism and urban dialect. In: Li Wei, Jean-Marc Dewaele and Alex Housen (eds) Opportunities and Challenges of Bilingualism. Berlin: Mouton de Gruyter, pp. 157-170.


6 The origins of Liverpool English

Kevin Watson and Lynn Clark

1 Introduction

The accent of Liverpool, in the north-west of England, is one that many non-linguists find easy to identify (see e.g. Montgomery 2007). Linguists, too, often acknowledge the distinctiveness of the variety, pointing out that it is ‘not quite like its neighbours’ (Honeybone 2007: 106). Trudgill (1999: 65), for example, uses dialect maps to paint a picture of a dialectal island which separates Liverpool and some other parts of Merseyside, the region in which Liverpool sits, from nearby localities. Many of the phonological characteristics which are diagnostic of Liverpool English are well known, both to lay speakers (see Honeybone and Watson 2013) and linguists, and include the following: (1) the absence of rhotic /r/, when other parts of the nearby county of Lancashire are rhotic, (2) TH/DH-stopping, which is common in varieties of Irish English (Hickey 2007: 326-332) but is not found in other parts of north-west England, (3) a lack of contrast between the lexical sets NURSE and SQUARE, which are both realised as a front vowel, [ɛː],1 which results in homophonous pairs of words such as hair/her and fair/fur, and (4) plosive lenition, which, like TH/DH-stopping, is not found in nearby localities but is attested in Irish Englishes (for /t/, see Hickey 2007: 322-325; 2009). The fact that TH/DH-stopping and plosive lenition are found in Irish Englishes is not trivial, since it has often been suggested that contact between Liverpool English and Irish varieties in the mid-nineteenth century is what has given the Liverpool accent, popularly called Scouse, some of its distinguishing characteristics (see Knowles 1973, Wells 1982: 371, Honeybone 2007). Honeybone (2007) considers the origins of Liverpool English in some detail, couching the development of the variety – and in particular the four phonological features mentioned above – in terms of Trudgill’s (1986, 2004) model of new dialect formation.
Honeybone (2007) uses contemporary data to shed light on these historical issues. This is an important foundation but is not without problems because, as Honeybone (2007: 122) acknowledges, it assumes that mid/late-nineteenth century Liverpool English is the same as the contemporary variety. If there have been phonological changes during this substantial period of time, as seems likely, we need additional resources to substantiate Honeybone’s trajectory of new dialect formation in Liverpool.

In this chapter we explore the origins of Liverpool English by examining Honeybone’s (2007) claims using both contemporary and historical data.

1 As we show below, the fact that these lexical sets have a front vowel in Liverpool is important, since a merger of these lexical sets to a central vowel is common in other nearby South Lancashire varieties.


Specifically, we utilise the new Origins of Liverpool English (OLIVE) corpus,2 which consists of recordings of speakers from Liverpool born between 1890 and 1994, representing over 100 years in apparent time. The two questions we ask are: (1) What did Liverpool English sound like in the late 1800s? And (2) How have some of its characteristic features developed over the last 100 years? The chapter is structured as follows. In section 2 we elaborate on the features of Liverpool English introduced above, before turning to Honeybone’s (2007) discussion of their origins in section 3. We provide an outline of the OLIVE corpus in section 4, before using the corpus to address our two main questions, in section 5.3

2 Some diagnostic features of Liverpool English

The phonetic/phonological characteristics of Liverpool English are relatively well documented (for an overview see Watson 2007b). Here we comment on the four features mentioned above, namely: (non)rhoticity, TH/DH stopping, the front NURSE-SQUARE merger and plosive lenition. It is of course not the case that each of these features is individually diagnostic of Liverpool English, but their combination, and the frequency with which they occur in the variety, is largely confined to the Liverpool area.

Knowles (1973) is the first serious linguistic study of Liverpool English, so is a good place to look for evidence of what early Liverpool English was like. Knowles (1973) collected data from spontaneous interviews and elicitation tasks from 47 informants, stratified by the social variables of age, sex, social class and religion.4 The number of speakers per cell is between 0 and 3, which is low by modern standards of work on language variation and change, but there is broadly an even split by sex (23 female and 19 male speakers) and an exact split by social class (21 working class and middle class speakers). The oldest speakers were born in 1897, but this group is represented only by two working class females. There are more speakers in the next oldest group (people born 1898-1907) but they are unevenly distributed in terms of age and social class: 1 working and 3 middle class males, and 2 working and 5 middle class females. Although Knowles’ description of the data is very detailed, we should be cautious about relying solely on these speakers for a comprehensive picture of what Liverpool English was like at the turn of the twentieth century because the cell counts are too low to generalise across the community.

2 OLIVE was created thanks to the financial support of the Economic and Social Research Council, as part of a project entitled ‘Phonological levelling, diffusion and divergence in Liverpool and its hinterland’ (RES-061-25-0458). We also gratefully acknowledge support from the University of Canterbury’s summer scholarship scheme. Thanks are also due to the North West Sound Archive for donating the recordings for our Archive subcorpus.

3 We would like to thank Raymond Hickey and Patrick Honeybone for their helpful feedback on this chapter. Any errors remain ours.

4 Knowles used the 1966 census to identify 100 informants by random sample from a working class area and a middle class area of Liverpool (Vauxhall and Aigburth, respectively, with 50 informants in each). A number of intended informants had died or moved house by the time Knowles tried to contact them which, along with some other problems, meant the total dataset included 56 informants, but this number “included Scots, Yorkshiremen and a Russian emigree who had to be excluded” (Knowles 1973: 4). The total usable number of speakers was 47. The total number of speakers born and raised in Liverpool is 42.

Perhaps a greater challenge in gleaning a more detailed picture of late nineteenth century Liverpool English from Knowles (1973) is that the data presented is not quantified using the phonological variable, in the Labovian sense of the term. Writing shortly after Labov’s (1966) groundbreaking work in New York, Knowles (1973: 1) notes that ‘the original intention was to apply some of Labov’s methods to Liverpool speech, identifying socially significant variables, and subjecting them to detailed analysis. However, it proved a major problem to identify the variables themselves, and to describe them in a simple and meaningful way. Consequently, although it is hoped that this work will be of interest to socio-linguistics, it is not intended to be a contribution to socio-linguistics as such’. It is not always clear exactly why identifying the variables (or, perhaps, variants) posed such a problem, but the decision not to fully quantify the data means that direct comparison with subsequent work is not always straightforward. Nevertheless, we attempt this here by first commenting on Knowles’ observations of the phonological variables highlighted above, before elaborating on them by, where possible, discussing findings from other more recent work.

Like Received Pronunciation and many other varieties of English in England, contemporary Liverpool English is non-rhotic. This is despite the fact that Liverpool is in close proximity to other localities in Lancashire, one of the few remaining pockets of rhotic accents in England. (Non-)rhoticity has never been studied as a variable in Liverpool English. Knowles (1973: 259) notes that ‘Scouse and RP agree…offa and offer are identical [ɒfə]’,5 Trudgill (1999: 72) writes that being r-less is one of the features which distinguishes Liverpool from central Lancashire, and Honeybone (2007: 125) observes that ‘no trace of rhoticity has been reported for any speaker of the variety’.
Without a quantitative study of rhoticity in Liverpool, however, we cannot be certain. While it seems clear that contemporary Liverpool speakers are non-rhotic, we do not know whether this was truly the case in early Liverpool English.

Although TH/DH-stopping in Liverpool English is regularly included in textbook accounts of the accent and is often identified by lay speakers (see Honeybone and Watson 2013), it has not been the focus of much published work. Wells (1982: 371), citing Knowles, notes ‘the use by some speakers of dental or alveolar stops for /θ,ð/ as in [tɾɪi] three, [tɹuːt] truth, [mʊnt(θ)] month, [dat ~ dat] that’, and Hughes, Trudgill and Watt (2012: 113) support this observation, at least for /ð/ in initial position. Stopped realisations are similar to those found in varieties of Irish English, which, as Hickey (1999) notes, are not recent innovations but have been well established since at least the seventeenth century. Knowles (1973: 323) directly connects the presence of stops for /θ, ð/ in Liverpool English to contact with Irish varieties, even explicitly making a binary distinction between ‘consonants that sound “English” [the dental fricatives]’ and ‘those that sound “Irish” [the dental stops or more occasionally affricates]’. Like the other phonological variables that Knowles describes, the data for TH-stopping is not fully quantified, but it is possible to reinterpret Knowles’ results in a way sociolinguists are now more familiar with. Knowles reports that the “Irish” types (the dental stops) are ‘virtually restricted to working class Catholics’, and he presents a table which lists realisations of these words for this speaker group (stratified by age and sex) in an elicitation task (1973: 323). There are 5 tokens of TH and just 2 tokens of DH per speaker. Since quantifying this low number of tokens is somewhat problematic, we focus here and in the remainder of this chapter only on TH. Figure 1 is a reinterpretation of Knowles’ TH data.

5 When discussing variation in the NURSE/SQUARE vowels (see discussion below), Knowles points out that one of the variants is [ɵ], where the rounding in this vowel is a vestige of r-colouring. It is not clear whether Knowles intends to imply that r-colouring was still present in Liverpool English in the recent past, or whether ‘vestige’ here implies a time period much further in the past, before Liverpool English itself would be said to exist (see section 3).

Figure 1. TH variation in working class Catholic Liverpool speakers, adapted from Knowles (1973: 324). N values are speaker counts per group. There are 5 tokens of TH words per speaker (thirteen, three, mouth, truth, month). Because of the low token numbers we make no distinction between tokens at different word positions, but acknowledge this would be desirable. Knowles’ table includes 1 male and 1 female speaker from Dublin. These speakers have been removed from this reinterpreted graph.
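The kind of reinterpretation behind Figure 1 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming each speaker contributes one realisation for each of the five TH words and that groups are defined by birth cohort and sex. The speaker records, the labels "t" and "th", and the function names are hypothetical placeholders, not Knowles’ data or the authors’ actual procedure.

```python
from collections import defaultdict

# The five TH words from the elicitation task described in the caption.
TH_WORDS = ("thirteen", "three", "mouth", "truth", "month")


def stopping_rate(tokens):
    """Proportion of one speaker's TH tokens realised as a stop ("t")."""
    return sum(1 for variant in tokens.values() if variant == "t") / len(tokens)


def group_means(speakers):
    """Mean per-speaker stopping rate and speaker count N per (cohort, sex)."""
    rates = defaultdict(list)
    for speaker in speakers:
        key = (speaker["born"], speaker["sex"])
        rates[key].append(stopping_rate(speaker["tokens"]))
    return {key: (sum(r) / len(r), len(r)) for key, r in rates.items()}


# Invented placeholder speakers, for illustration only (NOT Knowles' figures).
speakers = [
    {"born": "1898-1907", "sex": "M",
     "tokens": dict(zip(TH_WORDS, ["t", "t", "th", "th", "th"]))},
    {"born": "1898-1907", "sex": "M",
     "tokens": dict(zip(TH_WORDS, ["th"] * 5))},
    {"born": "1898-1907", "sex": "F",
     "tokens": dict(zip(TH_WORDS, ["th"] * 5))},
]

means = group_means(speakers)
for (cohort, sex), (rate, n) in sorted(means.items()):
    print(f"{cohort} {sex}: mean stopping rate {rate:.0%} (N = {n})")
```

Averaging per-speaker rates rather than pooling tokens keeps each speaker’s weight equal, which matters little here because every speaker has the same five tokens, but would matter if token counts differed.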

When discussing this data, Knowles (1973: 324) says ‘…there is no clear pattern, the frequency of the Irish forms being idiosyncratic for each person…’. The patterns are indeed far from clear cut, but they may well be masked by the low speaker and token numbers. There are some suggestive patterns which warrant further investigation. For example, there is a suggestion of a sex effect: 6 of the 11 female speakers are categorical users of [θ], compared to just 1 of the 7 male speakers. Conversely, 2 male but 0 female speakers are categorical users of a stopped variant.6 The stopped variants also appear to be changing over time, although there is not enough data to present a clear picture. For the male speakers, use of [t] decreases in speakers born from 1918 to 1947, and in the female group no speaker born in 1918 or later uses [t] at all. While TH-stopping is something of a stereotype of Liverpool English, it does not seem to have been very widespread or very stable, even in early varieties of the accent, and the tentative pattern here is that [t] was already beginning to decline early in the twentieth century.
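The counts of categorical users reported above can be turned into proportions to make the suggested sex effect easier to compare. The snippet below is a minimal sketch that uses only the speaker counts given in the text (11 female and 7 male speakers); the variable and function names are our own, and no significance test is implied.

```python
from fractions import Fraction

# (categorical users, speakers in group), taken from the counts in the text.
categorical_fricative = {"female": (6, 11), "male": (1, 7)}
categorical_stop = {"female": (0, 11), "male": (2, 7)}


def share(count, total):
    """Exact proportion of speakers in a group who are categorical users."""
    return Fraction(count, total)


for label, table in (("[θ] only", categorical_fricative),
                     ("stop only", categorical_stop)):
    for sex, (count, total) in table.items():
        print(f"{label:9s} {sex:6s} {count}/{total} = "
              f"{float(share(count, total)):.1%}")
```

With groups this small, a single speaker shifts the female share by about 9 percentage points and the male share by about 14, which is one way to see why the patterns can only be called suggestive.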

On the phonetic quality of NURSE/SQUARE, Knowles (1973: 318) writes that ‘there are a number of variants of first, girl, word etc., and – for most speakers – square, pear, swear. We have distinguished the rounded [ɵ], the RP-type [ɜ], the slightly fronter [ɜ ], the fronter still and half open [ɛ], and the “closer” [e], where the term “closer” is entirely auditory’. This suggests that the phonetic quality of NURSE/SQUARE was characterised by much variation in early Liverpool English. Knowles attempted to identify how the NURSE/SQUARE variants patterned according to different speaker groups, observing that ‘the most conservative vowel is almost certainly [ɵ]’ (Knowles 1973: 319) and that over time ‘the [ɵ] gives way to [ɜ], which is somewhat more open, front of centre, and perhaps a little “rounded”; [ɜ] is the characteristic middle class vowel…Many Vauxhall [working class] speakers have [ɛ ], but this has largely given way in the younger groups to [e]’7 (Knowles 1973: 320). As well as noting high variability, then, Knowles also presents a picture of change, where the NURSE/SQUARE vowels are becoming somewhat more front and possibly more close over time. There has been very little work published more recently on the production of NURSE/SQUARE (see De Lyon 1981 for some elaboration, and Watson and Clark 2013 for an investigation into the perception of these vowels), so we cannot say whether the changes identified by Knowles have continued. But it seems nevertheless apparent that the lack of contrast between the NURSE and SQUARE sets was likely present in Liverpool speakers born at the turn of the twentieth century, along with realisational variation.

The varied realisations of Liverpool English plosives have been examined in a considerable amount of work (as well as Knowles 1973 see De Lyon 1981, Sangster 2001, Honeybone 2001, and Watson 2006a, 2006b, 2007a). The typical attested patterns are those of plosive lenition, where plosives are regularly realised as affricates or fricatives (see Honeybone 2001, Watson 2007a). Sometimes described as a ‘cline of weakening’ (Hickey 1996: 182), lenition processes can be ordered along ‘scales’ or ‘trajectories’ such as that in Figure 2, adapted from Lass (1984: 178).

6 These individual speaker patterns are not shown in Figure 1, which presents group means.

7 Knowles switches between phonetic and phonological brackets, but we consistently use phonetic brackets.


Figure 2. A typical lenition trajectory, adapted from Lass (1984: 178)

In Figure 2 any step right-ward would count as a process of lenition. In the case of Liverpool, lenitions are attested at every step such that, for /t/, particularly in utterance-final or prepausal position, we find that but, for example, can be realised as: [bʊt, bʊts, bʊs, bʊh].8 Like TH-stopping, plosive lenition is often mentioned in textbook accounts of the variety. Wells (1982: 371), citing Knowles, notes that ‘voiceless stops sometimes lack complete closure in certain syllable-final environments, so that varieties of fricatives…result for /p, t, k/ in such words as [sneɪx] snake, [ʃɔːt] short, [dɔːtə] daughter…’, and Hughes, Trudgill and Watt (2012: 113) add that ‘/p t k/ are heavily aspirated or affricated…In final position, /p t k/ may be fully spirantised, that is realised as the homorganic fricatives [ɸ s x]’. Fricative realisations of /t/ are universal in Irish varieties (e.g. Hickey 1999 on Dublin English, Hickey 2007: 322-325 on supraregional Irish English). Knowles (1973: 324-325) does not refer to these realisations as processes of lenition but instead refers to ‘incomplete stops’ which result from a ‘lax’ articulatory setting. Nevertheless, he observes that ‘most Merseysiders use stops with incomplete closure at least sometimes, and the majority of informants use them even in the slow deliberate style of the questionnaire responses’. Knowles lists 12 speakers from Vauxhall who use ‘incomplete /t/’ at least once, in the 7 words intended to capture this feature in the elicitation tasks (namely: white, that, short, foot, sprout, thirteen, daughter; Knowles 1973: 326). Figure 3 reinterprets Knowles’ description of individual speakers and plots the use of ‘incomplete /t/’ over time, for males and females. Words which do not have /t/ in word-final position (i.e. thirteen and daughter) are excluded, leaving 5 words per speaker.
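The idea that any rightward step on a lenition trajectory counts as lenition can be made concrete with an ordered scale. The sketch below is a simplified assumption based only on the realisations of but listed in the text (stop, affricate, fricative, [h]) plus deletion as the end point. It is not a reproduction of Lass’s figure, and the category labels and function names are hypothetical.

```python
# Simplified scale, ordered from least lenited (left) to most lenited (right).
LENITION_SCALE = ("stop", "affricate", "fricative", "h", "deleted")

# Realisations of "but" listed in the text, mapped to positions on the scale.
BUT_VARIANTS = {
    "bʊt": "stop",
    "bʊts": "affricate",
    "bʊs": "fricative",
    "bʊh": "h",
}


def scale_position(category):
    """Index of a category on the scale; larger means more lenited."""
    return LENITION_SCALE.index(category)


def is_lenition(before, after):
    """True if moving from `before` to `after` is a rightward step."""
    return scale_position(after) > scale_position(before)


def most_lenited(realisations):
    """Return the realisation that sits furthest right on the scale."""
    return max(realisations, key=lambda r: scale_position(BUT_VARIANTS[r]))


print(is_lenition("stop", "fricative"))        # a rightward step
print(is_lenition("h", "affricate"))           # a leftward step, not lenition
print(most_lenited(["bʊt", "bʊs", "bʊts"]))
```

Treating the scale as a strict ordering is a modelling convenience: it makes it easy to code tokens consistently, but it says nothing about whether a given speaker or variety passes through every intermediate step.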

Figure 3. Word-final /t/ variation in working class Liverpool speakers, adapted from Knowles (1973: 326). N values are speaker counts per group.

8 We focus on /t/ in this chapter, due to space constraints, as it is /t/ which has the widest range of realisational variability. Lenition of other plosives is also found, most typically /k/ and /d/ and to a lesser extent, /p/ (see Watson 2007a).


This data shows that lenited variants of /t/ have been present in Liverpool English for a reasonably long time, even though they appear to be absent for speakers born in 1897 or earlier. Female speakers use ‘incomplete /t/’ 20% of the time if born between 1898 and 1907, and this has increased by the time the female speakers in the youngest age group are born (1938-1947), but there is only 1 speaker in this group. Male speakers lag behind the females, never reaching the 20% mark. Knowles (1973: 233-234) identifies another variant of /t/, writing: ‘there is a small class of words including get, got, bit, what, that, it, not in which the final /t/ is pronounced before another consonant, but can be elided in absolute final position’. Although labelled as /t/ elision, it is possible that this variant is actually [h] – the final step before deletion on the sort of lenition scale presented in Figure 2.9 If this is correct, it suggests another connection to Ireland, where in vernacular varieties [h] is also a variant of /t/. However, there are differences. In local Dublin English, [h] can be found in intervocalic position and word-final position following a long vowel (e.g. motorway [moːhəwe], thought [tɑːh]; Hickey 1999: 217) but in Liverpool, according to Knowles’ description of this variant, it appears only in pre-pausal position and only in a small set of lexical items, usually monosyllabic function words or high frequency content words with short vowels.

Recent work has shown that plosive lenition remains a common characteristic of contemporary Liverpool English. Sangster (2001) elicited word-initial and word-final alveolar stops from 16 female adolescents aged 16-17 from 2 social classes. She observed that there was no effect of social class, but that ‘lenition of alveolar stops is evident to some degree in all speakers of Liverpool English studied. This non-standard feature does not pattern sociolinguistically in a straightforward way…it appears to be a prominent feature of Liverpudlian’s speech generally’10 (Sangster 2001: 410). That plosive lenition is common across many speakers is confirmed by Watson (2007a), who elicited utterance-final plosives from teenage speakers born in 1985-1986. Every speaker used a lenited variant for /t/ at least some of the time. Figure 4 presents the realisations of utterance-final /t/ for male and female speakers, adapted from Watson (2007a: 181). Although male speakers use fricative variants more often overall than females, both groups are equally likely to use a lenited variant of some form (i.e. an affricate or a fricative) – canonical stop variants are very rare.

9 Knowles also writes of the final devoicing of vowels in this context, which certainly leaves open the possibility that /t/ is not elided but is realised as a period of voiceless glottal friction. More recent work (Honeybone 2001, Watson 2007a) has taken this approach, but a systematic study is needed to tease these complexities apart.

10 The term ‘Liverpudlian’ is an informal label often given to someone from Liverpool.


Figure 4. Utterance final /t/ variation in working class adolescent speakers (9 female, 7 male), adapted from Watson (2007a: 181).11 N values are token counts per group.

Watson (2006b, 2007a, 2007c) argues that the use of the [h] variant of /t/ has increased over time. Whereas Knowles reports that [h] is a possible variant for a small set of words, Watson finds that in younger speakers [h] is also found in polysyllabic words with an unstressed final syllable. That is, [h] is a possible variant of utterance final /t/ in words such as biscuit, bucket, certificate and aggregate (with a preceding schwa) but not in words like jackpot, acrobat or internet (with a preceding full vowel), where lenition to an oral fricative or affricate is more likely. This appears to be a relatively recent change, although there is no published work fully documenting its trajectory (see Clark and Watson in preparation).

In this section we have elaborated on four of the phonological features often thought to be diagnostic of Liverpool English. There are no existing reports of rhoticity in Liverpool, even in Knowles’ description of his oldest speakers. If rhoticity was ever present, it seems it had been lost in speakers born by 1897. There is stronger evidence for the presence of the other three variables we have considered. TH-stopping, the realisation of NURSE/SQUARE as a front vowel, and plosive lenition, at least as far as our focus on /t/ goes, were all clearly part of nineteenth century Liverpool English, but to varying degrees. Knowles described how the NURSE/SQUARE vowel was becoming more front and close over time, the reinterpreted (TH) data implied that the stopped variant was already in decline for speakers born in 1918, and /t/ lenition seemed to be on the rise. In the next section, we discuss the likely origins of these features, by examining Honeybone’s (2007) analysis of new dialect formation in Liverpool.

3 New dialect formation in Liverpool

11 Here and elsewhere, the fricative variant of /t/ is transcribed as [θ] to signal that it is different from the realisation of /s/ (see Honeybone 2001).


Honeybone (2007) is the first to couch the birth of Liverpool English in terms of Trudgill’s model of new dialect formation, e.g. Trudgill (2004). Situating new dialect formation firmly in the context of dialect contact, Trudgill (2004) proposes that the birth of a new dialect, at least in tabula rasa situations, is predictable, once we know about the early speakers involved, and their dialects (but see differing views, e.g. Hickey 2003). The first stage of Trudgillian new dialect formation involves the initial contact between adult speakers of different regional and social varieties in a new location. Typically, when speakers of different regional accents come together, they accommodate to each other, resulting in the levelling of more marked linguistic features (Trudgill 1986). The outcome is that even among the first generation of immigrants, their accents will be a little less ‘broad’ than they otherwise would have been (Trudgill 2004: 84). The second stage of new dialect formation predicts the behaviour of children born into this unique linguistic melting pot. Essentially, as a result of having not one but several different adult linguistic models to aim towards, children will select variants or features from different dialects, potentially creating new combinations. They may also generate ‘interdialect’ forms, i.e. features which are not present in any of the input varieties. This stage of new dialect formation is characterised by extreme variability, resulting in an unstable linguistic situation. Stability begins to emerge in the next stage, driven by the next generation of speakers who create a more focussed koine – a new dialect (Trudgill 2004:88).

Honeybone (2007) briefly outlines the social history of Liverpool in light of these stages of new dialect formation. First, citing Neal (1988: 2), Honeybone (2007) provides the population figures in the city, gleaned from census data. These are reported in Table 1. The rapid growth of the city should be clear, particularly between 1831 and 1861, when the population almost trebles.

Table 1. The population of Liverpool between 1801 and 1911, from census returns. Taken from Honeybone (2007: 115, via Neal 1988: 2).

Year   Population
1801   77,653
1811   94,376
1821   118,972
1831   165,175
1841   286,656
1851   375,955
1861   443,938
1871   493,405
1881   552,508
1891   517,980
1901   684,958
1911   746,421
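The growth described above can be checked directly against the census figures in Table 1. The following is a minimal sketch in Python; the dictionary and function names are illustrative only and are not part of the original study.

```python
# Census populations of Liverpool, copied from Table 1.
POPULATION = {
    1801: 77_653, 1811: 94_376, 1821: 118_972, 1831: 165_175,
    1841: 286_656, 1851: 375_955, 1861: 443_938, 1871: 493_405,
    1881: 552_508, 1891: 517_980, 1901: 684_958, 1911: 746_421,
}

def growth_ratio(start_year: int, end_year: int) -> float:
    """Return the end-year population as a multiple of the start-year population."""
    return POPULATION[end_year] / POPULATION[start_year]

# Growth over the thirty years singled out in the text.
ratio_1831_1861 = growth_ratio(1831, 1861)
print(f"1831-1861 growth factor: {ratio_1831_1861:.2f}")
```

The factor comes out at roughly 2.7, which is what the text describes as the population almost trebling.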

The reasons for this rapid increase are complex, but an important factor was the migration of people to Liverpool from elsewhere. Honeybone (2007), using data from Munro and Sim (2001), Neal (1988) and Knowles (1973), also reports on where the migrants to Liverpool came from, summarised in Table 2. A large part of the population of Liverpool came from Ireland as adults. An often cited reason for this is the Irish famine, which began in 1845 and forced people to leave their homes, in the direction of Liverpool (often as a port of departure for emigrant ships to the USA and Canada). But the population of people in Liverpool who were born in Ireland was already high in 1841, so the famine cannot have been the only factor. Liverpool was a major port city even by this point in time, so was an attractive destination for anyone looking to improve their economic circumstances. This will have meant that there was migration to Liverpool from other parts of England, too, and from elsewhere, adding other dialects into the melting pot.

Table 2. The proportion of Liverpool population born outside of England, from Honeybone (2007: 116).

Year   Population   % Irish-born   % Welsh-born   % Scots-born
1841   286,656      17.3
1851   375,955      22.3           4.9            3.6
1861   443,938      18.9           4.7            4.0
1871   493,405      15.6           4.3            4.1
1881   552,508      12.8           3.9            3.7
1891   517,980      9.1            3.4            2.9
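To make the scale of Irish migration more concrete, the percentages in Table 2 can be converted into approximate head counts. This is a rough sketch only: the percentages are rounded in the source, so the resulting counts are estimates, and the names used below are illustrative.

```python
# (year, total population, % Irish-born), copied from Table 2.
IRISH_BORN = [
    (1841, 286_656, 17.3),
    (1851, 375_955, 22.3),
    (1861, 443_938, 18.9),
    (1871, 493_405, 15.6),
    (1881, 552_508, 12.8),
    (1891, 517_980, 9.1),
]

def estimated_count(population: int, percent: float) -> int:
    """Estimate a head count from a total population and a rounded percentage."""
    return round(population * percent / 100)

irish_born_counts = {
    year: estimated_count(population, percent)
    for year, population, percent in IRISH_BORN
}

for year, count in irish_born_counts.items():
    print(year, count)
```

On these estimates the Irish-born population is already close to 50,000 in 1841, before the famine began, which is consistent with the point that the famine cannot have been the only factor.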

Honeybone (2007: 117), based on Knowles (1973), argues that the period 1841-1871 covers the ‘crucial period in which levelling, koineisation and new dialect formation occurred in Liverpool’. Before the nineteenth century, Honeybone claims, there was no distinct ‘Liverpool English’ but by the end of the century we see the ‘focussing of the dialect mixture and the emergence of a stable koine’ (2007: 119). This means, in Trudgill’s (2004: 112) terms, the dialect appears as a ‘stable, crystallised variety’, largely because of the ‘survival of majority forms’ (Trudgill 2004: 114). As part of this process, we expect to see ‘variant reduction’ where, as we move from the position of ‘extreme variability’ in earlier stages of new dialect formation to a more stable variety, the number of variants of a given variable are reduced, leaving perhaps just one remaining variant.

This brings us back to the four phonological features highlighted above, which are the features which Honeybone discusses in light of the predictions of Trudgill’s dialect formation model. As we noted in section 2, Honeybone (2007: 125) observes that ‘no trace of rhoticity has ever been reported for Liverpool English’. However, many of the varieties in the Liverpool melting pot in the mid-nineteenth century were rhotic – both the varieties of the speakers who came from other countries (e.g. such as Scotland and rural Ireland, although working class Dublin English was already non-rhotic at this time, Hickey pc) and those of speakers from neighbouring localities in north-west England. Honeybone (2007: 126) points out that if Trudgill’s principles of new dialect formation are correct, we might expect that Liverpool English became rhotic when it was formed, but then lost rhoticity at a subsequent period of time, in line with other varieties in England. But without some evidence which suggests that Liverpool English was once rhotic, we cannot be sure of this. With this in mind, our first question, to be discussed in section 5, is: is there any evidence of rhoticity in early Liverpool English?


The presence of TH-stopping seems at first to result somewhat straightforwardly from processes of new dialect formation, since it is a feature common in Southern Irish English varieties, and so would have likely been present in Liverpool’s dialect mixture in the nineteenth century. Honeybone (2007: 124) argues that this is an example of ‘a Southern Irish English feature winning out’ over its competing counterparts from other varieties in the mix. However, recall that Trudgill’s (2004) model predicts that in the focussing of a new dialect, it is the majority form which will win out, eradicating competing variants. This does not seem to be the case with TH-stopping. The evidence above suggested that TH-stopping was not a clear categorical variant of (TH), or even the majority form, in early Liverpool English, as it appeared to be in decline in speakers born from 1918 onwards. Of course, we were limited by a small dataset, so it may be that a different picture emerges when a larger corpus is examined. The next question we ask in section 5, then, is: Was the stopped variant of (TH) ever the majority variant in early Liverpool English?

Like TH-stopping, the origin of the lack of contrast between NURSE and SQUARE, and the realisational variability of the vowels, is also complex. South Lancashire varieties merge these sets to a central vowel like [ɜː], while Irish varieties maintain two distinct lexical sets, with a front vowel in SQUARE and, at least in Dublin, a high back vowel in NURSE (e.g. [nʊ:(ɹ)s], Hickey, p.c.) We saw above that there were many different realisations of NURSE and SQUARE in early Liverpool English, in line with Trudgill’s predictions of extreme variability in the early stages of new dialect formation, and we saw tentative evidence of the change towards a more front and closer vowel in younger speakers. This may be evidence of the beginnings of a stabilised feature in Liverpool’s koine. Or it may be, and this is the position that Honeybone (2007: 128-129) takes, that the front variant emerged as the ‘realisation of choice in Liverpool English since koineisation’. In section 5 we ask: if Knowles is correct in suggesting that the front, more close variant was a new variant in his data, when did it ‘win out’ over the other competing variants?

The final feature, plosive (particularly /t/) lenition, while present in Irish varieties, is thought not to have been borrowed wholesale from there into Liverpool English. This is mainly because the features do not pattern in exactly the same way in Irish varieties and Liverpool English, as we saw briefly in section 2 when we discussed the [h] variant of /t/. Nevertheless, contact between the varieties is thought to have been crucial in innovating plosive lenition in Liverpool English. Honeybone (2007: 132) writes that the presence of lenited forms gave Liverpool children acquiring the variety ‘a clear indication that spirantisation and affrication of at least certain stops was possible’. The Irish realisations would presumably have been minority forms in early Liverpool English, but they appear not to have been levelled out, as Trudgill’s model would predict. Instead, Honeybone argues, children extended these lenitional patterns, both in terms of their frequency of occurrence and in terms of the phonological environments in which they are likely to occur. To examine whether this is correct, we first need to corroborate observations about /t/ lenition in early Liverpool English. We do this in section 5, before asking: have the patterns been extended by younger speakers, over the last 100 years? To answer all the questions posed here, we utilise the Origins of Liverpool English corpus, which we outline in the next section.


4 The OLIVE corpus

The Origins of Liverpool English corpus – OLIVE – was created as part of the ESRC-funded project ‘Phonological levelling, diffusion and divergence in Liverpool and its hinterland’. It holds recordings of 140 speakers from three localities in north-west England: Liverpool, a major urban centre in the region, and two smaller towns: Skelmersdale and St Helens. These smaller towns are equidistant from Liverpool but the former saw much more migration from Liverpool in the 1960s when it was designated a new town. There are three age cohorts for each locality in the corpus, split into subcorpora: the ‘Archive’ subcorpus has speakers born between 1890 and 1943, the ‘Older’ subcorpus has speakers born between 1918 and 1942, and the ‘Teen’ subcorpus has speakers born between 1992 and 1994.12 OLIVE is fully time-aligned and searchable, following a similar structure to the Origins of New Zealand English (ONZE) corpus, and using the same browser-based LaBB-CAT client (see Gordon et al. 2007, Fromont and Hay 2008). When an orthographic transcript, time-aligned at the utterance level, is uploaded to OLIVE along with an audio file, LaBB-CAT performs a series of automatic processes to add further annotations. These include phonemic transcriptions and part of speech tags, taken from the CELEX database (Baayen et al. 1995). These are added to the orthographic transcript as additional layers (see Figure 5 for an example of the phonological transcription tier) and then become fully searchable. The audio data is also further time-aligned at the word and segment level (see Figure 6; the segment tier is represented in the CELEX “DISC” format, which uses one character per phoneme) and users can interact with these annotations using Praat, which can be called directly from the OLIVE client.
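The idea of layered, searchable annotations can be illustrated with a small data structure. The sketch below is not LaBB-CAT's API or storage format; the class names, the speaker labels, the timings and the DISC-style strings are all hypothetical and are there only to show how a phonemic layer with one character per phoneme makes a query such as "utterance-final /t/" simple to express.

```python
from dataclasses import dataclass, field

@dataclass
class Word:
    orthography: str   # orthographic layer
    phonemes: str      # phonemic layer, one character per phoneme (DISC-style, illustrative)
    start: float       # word onset in seconds (hypothetical values)
    end: float         # word offset in seconds (hypothetical values)

@dataclass
class Utterance:
    speaker: str
    words: list[Word] = field(default_factory=list)

def utterance_final(utterances: list[Utterance], phoneme: str) -> list[tuple[str, Word]]:
    """Return (speaker, word) pairs where the last word of an utterance ends in `phoneme`."""
    return [
        (u.speaker, u.words[-1])
        for u in utterances
        if u.words and u.words[-1].phonemes.endswith(phoneme)
    ]

# Two invented utterances; none of these values come from OLIVE.
utterances = [
    Utterance("spk_a", [Word("that", "D{t", 0.00, 0.21), Word("lot", "lQt", 0.21, 0.55)]),
    Utterance("spk_b", [Word("wall", "w$l", 1.10, 1.42), Word("down", "d6n", 1.42, 1.80)]),
]

hits = utterance_final(utterances, "t")
for speaker, word in hits:
    print(speaker, word.orthography, word.start, word.end)
```

Because each phoneme is a single character, the final segment of a word is simply the last character of its phonemic string, so no parsing of multi-character symbols is needed.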

12 While there is some overlap in years of birth in the Archive and Older subcorpora, the data is different. Materials from Archive speakers come from oral history interviews and were kindly donated to the project by the North-West Sound Archive. Materials for the Older speakers were recorded as part of the ‘Phonological levelling, diffusion and divergence in Liverpool and its hinterland’ project, and include informal interviews, following the standard sociolinguistic interview methodology, and additional reading and elicitation tasks.


Figure 5. An example of a transcript in OLIVE, showing the orthographic and phonological tiers.

Figure 6. An example of a textgrid generated automatically by OLIVE. The utterance level alignment is done manually but the word and segment levels are completed automatically.

In this paper, our analysis is based mainly on 24 speakers from the Liverpool corpus of OLIVE, with 4 female and 4 male speakers from each of the three sub-corpora: Archive (speakers born 1897-1919), Older (speakers born 1937-1955) and Teen (speakers born 1992-1994).13 Our oldest speaker was born in 1897, a couple of generations after the beginning of significant waves of migration from Ireland. In the terminology of new dialect formation, we would expect this speaker to be taking part in the early processes of koineisation, including levelling and the development of interdialect forms. Perhaps by the time the youngest speakers in the Archive subcorpus are born, and certainly by the time the speakers in the Older subcorpus are born, following the process of focussing, we should expect to see the emergence of a more stable koine.

5 Using OLIVE to understand Liverpool English in the past and in the present

In this section we take each of the phonological features in focus in this chapter in turn and begin by first describing the speech of two typical speakers from OLIVE’s Archive subcorpus, before examining a much larger dataset. M07 is a working class male who was born in Liverpool in 1897. He lived his entire life in the house in which he was born. He was recorded late in life as part of an oral history project chronicling the working lives of carters14 in the early 1900s. F05 is a working class female who also lived her whole life in Liverpool. She was born in 1905 and recorded as part of an oral history project to collect stories about the sinking of the Lusitania in 1915. Below is an orthographic transcript of an extract of these speakers’ stories, followed by a relatively broad phonetic transcription. Pauses are marked in each transcription with a vertical line.

Speaker M07

took her out on the road | with just | what with just ten hundred weight of coal on it | and when she heard, she heard the tram cars rattling | ooh there | up comes the ears | cost me five hundred quid that lot | knocked a wall down

[tʰʊk ə:ɹ aʊt ɒn ðə ɹoəd | wɪð ʤəst | wɒʔ wɪð ʤəs tʰεn ʊndɹə weɪt ə kʰoəl ɒn ɪts | ən wεn ʃi: ɜ:d ʃi: ɜ:d ðə tsɹam kʰa:z ɹatlɪn | u: ðɜ:ɹ | ʊp kʰʊmz ði e:z | kʰɒs mi: fɒɪv hʊndɹɪ kwɪd ðatl lɒts | nɒxt ə wɔ:l daʊn]

Speaker F05

my mother said to don’t don’t go away or anywhere cos the bag will be coming | and it never come and none of us thought of that | and at just half past two | quarter to three | the Echo papers come flying up the street | Lusitania gone down, all hands lost! | well the world went | berserk | the world went mad

[mi mʊðə sɛd tə dəo dəo gəo əweɪ ə εnɪwɛ: kʰəz ðə bag l bi: kʰʊmən | an ɪʔ nεvə kʰʊm an nʊn əv əz tθɔ:t əv ðats | an aʔ ʤʊst haf pas tsu: | kwɔ:tsə tə tɹi: | ðɪ εkəo pʰeɪpəz kʰʊm flaɪn ʊp ðə stɹi:tS | lu:sɪteɪniə gɒn daʊn ɔ:l hanz lɒst | wεl ə wε:ld wεnts | bəzε:k | ðə wε:ld wεnʔ mad]

13 We use a larger dataset, with a higher number of speakers, to discuss TH-stopping and /t/ lenition. We provide further information about that data when we discuss the variables in question.

14 A ‘carter’ typically drove a horse-drawn wagon carrying goods for transport.

As we discussed above, Honeybone’s (2007) predictions of new dialect formation suggest that Liverpool English may have become rhotic during the koineisation process. We can see that the speech of M07 is variably rhotic. In the words heard, cars and ears /r/ is not realised, but in there, where the variable is in a pre-pausal position, there is a clear realisation of a rhotic consonant. F05, however, shows no sign of rhoticity, in any portion of her interview. The other Archive speakers analysed in this paper show no signs of rhoticity either. A quantitative analysis of M07’s interview revealed that a non-prevocalic /r/ was realised in 14/482 tokens (3%). This is obviously a very small number, but that doesn’t mean it is trivial, since no other speaker had any realisations of /r/ in this position at all. We have no reason to think M07 has a social history which is very different from the other speakers, so it is possible that this speaker is demonstrating the last vestige of rhoticity in Liverpool English. If this is correct, then Trudgill’s model of new dialect formation would be supported. Liverpool English may have been at least partially rhotic in the mid-late nineteenth century, but then rhoticity was lost, in line with what has happened in other English localities.
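The rate reported for M07 can be reproduced from the two counts given above. The sketch below also adds a 95% Wilson score interval as a rough indication of the uncertainty around such a small proportion. The interval is an addition for illustration and is not a calculation reported in this chapter; it also treats the tokens as independent, which is a simplifying assumption for tokens drawn from a single speaker.

```python
import math

def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width

# Counts reported in the text for speaker M07: realised non-prevocalic /r/ out of all tokens.
realised, tokens = 14, 482
rate = realised / tokens
low, high = wilson_interval(realised, tokens)
print(f"rate = {rate:.1%}, approximate 95% interval = {low:.1%} to {high:.1%}")
```

The lower bound stays above zero, which fits the argument that the rate is small but not trivial when every other speaker has no realised tokens at all.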

There is also a difference between speaker M07 and F05 in the use of TH-stopping. Only F05 uses the stopped variant of (TH), in thought and three, and there is no DH-stopping for either speaker in these short extracts. To see if this pattern extended to the wider speech community, 1380 tokens of (TH) were extracted and analysed across the three subcorpora (see Figure 7).15 In line with the observations about the extracts from M07 and F05, in the Archive subcorpus overall the female speakers use a stopped variant of (TH) more often than the males. This difference has disappeared in the Older and Teen subcorpora, as the use of the stopped variant declines. In the Teen subcorpus, the stopped variant is again further reduced in frequency, but it has still not disappeared completely. The stopped variant is never particularly common, however, even for the Archive speakers – [θ] is the majority variant for both Archive and Older groups. This means that if the stopped variant did ‘win out’, during new dialect formation in the mid nineteenth century, it must have begun to recede again quite rapidly. It seems unlikely that [θ] was ever lost completely because of variant reduction during the levelling process, but that the fricative and stopped variants were used together. Rather than reducing the number of variants of this variable, then, we have an increase, which has still not fully levelled away even for speakers born in 1992-4.

Moreover, in the Teen subcorpus we see the rapid increase of TH-fronting, which was entirely absent from the other subcorpora. This is perhaps unsurprising, since TH-fronting is a feature known to have been spreading across the UK over the last few decades (see Kerswill 2003 for an overview of the geographical diffusion of this feature). The usual claim in the literature is that TH-fronting is absent from Liverpool. Indeed, this is the picture presented in Watson (2007c), which examined (TH/DH) in Liverpool teens born in 1985-6. The Teen speakers in the OLIVE corpus were born in 1992-4, and by this period, TH-fronting is firmly established in Liverpool English, adding another possible variant for (TH).16

15 The speaker counts for this analysis are as follows: 10 Archive speakers (4 female), 10 Older speakers (6 female), 19 Teen speakers (9 female). All data is from a conversational style.

Figure 7. TH variation, by corpus and sex. N values are token counts per group.

Another difference between the features used by M07 and F05 is their realisation of the NURSE and SQUARE lexical sets. Both speakers merge these phonological categories, but for M07 the merger is to a central vowel (typical of the present day realisation of these words in South Lancashire), whereas for F05 the merger is to a fronter vowel (typical of modern Liverpool English). To explore whether this pattern holds for the other Archive speakers, we extracted 11,068 vowels from OLIVE (6283 vowel tokens from 4 male speakers, 4785 vowel tokens from 4 female speakers – see Table 3).

Table 3. Numbers of Archive vowel tokens analysed across each lexical set, arranged by sex.

Lexical set   Female   Male
BATH          88       115
BOOK          58       62
DRESS         1235     1594
FLEECE        251      305
FOOT          72       158
GOOSE         149      179
KIT           516      543
LOT           437      605
NURSE         314      271
SQUARE        321      406
START         91       256
STRUT         331      462
THOUGHT       479      710
TRAP          443      617

16 We cannot say for sure that age is the only explanatory factor here (and it may not even be the main factor). The data discussed in Watson (2007c) was collected via elicitation tasks (originally for Watson 2007b), but the data presented here from OLIVE is from informal conversation, so we may be dealing with a stylistic difference, i.e. TH-fronting may well have existed in the speech of Liverpool teenagers born in 1985-6, but it was associated with a more informal style of speech. This would not have been captured in Watson (2007c). That said, the methodology in Watson (2007c) did capture other features presumably associated with informal styles, such as TH-stopping and plosive lenition.

The vowels, normalised using the Lobanov method (1971), are plotted for each Archive speaker group in Figure 8. We can clearly see that the NURSE and SQUARE lexical sets are indeed more central for men and more front for women, on average. The pattern described above from the speech of M07 and F05 is not simply an artefact of the few tokens within the transcript; rather, M07 and F05 pattern with the rest of the speech community. Figures 9 and 10 show vowel plots of male and female speakers from the Older and Teen subcorpora.17 We can see that for the Older male speakers, NURSE and SQUARE are realised in a more front position – the men seem to have caught up with the women by this point. This is continued in the Teen corpus, where again both male and female speakers use front realisations for each of these lexical sets. These results, based on a total of 1410 NURSE/SQUARE vowels, support Knowles’ (1973) early observation that these vowels were moving to a more front position. Females seem to have been in the lead, since a front variant was used for the female speakers even in the Archive corpus.
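For readers unfamiliar with it, Lobanov normalisation converts each formant value into a z-score relative to the same speaker's own mean and standard deviation for that formant, F_norm = (F - mean) / sd, so that speakers with different vocal tract sizes can be plotted in a shared space. The following is a minimal sketch, not the procedure actually run on OLIVE: the formant values and speaker labels are invented, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def lobanov(tokens):
    """Z-score F1 and F2 within each speaker.

    tokens: list of (speaker, lexical_set, f1_hz, f2_hz) tuples.
    Each speaker needs at least two tokens with differing formant values.
    """
    per_speaker = {}
    for speaker, _, f1, f2 in tokens:
        f1s, f2s = per_speaker.setdefault(speaker, ([], []))
        f1s.append(f1)
        f2s.append(f2)
    # Per-speaker mean and sample standard deviation for each formant.
    stats = {
        speaker: (mean(f1s), stdev(f1s), mean(f2s), stdev(f2s))
        for speaker, (f1s, f2s) in per_speaker.items()
    }
    normalised = []
    for speaker, lexical_set, f1, f2 in tokens:
        m1, s1, m2, s2 = stats[speaker]
        normalised.append((speaker, lexical_set, (f1 - m1) / s1, (f2 - m2) / s2))
    return normalised

# Invented formant measurements in Hz, for illustration only.
raw_tokens = [
    ("spk_a", "FLEECE", 300.0, 2300.0),
    ("spk_a", "TRAP", 750.0, 1600.0),
    ("spk_a", "NURSE", 500.0, 1500.0),
    ("spk_b", "FLEECE", 350.0, 2700.0),
    ("spk_b", "TRAP", 900.0, 1900.0),
    ("spk_b", "NURSE", 600.0, 2100.0),
]

normalised_tokens = lobanov(raw_tokens)
for speaker, lexical_set, z1, z2 in normalised_tokens:
    print(speaker, lexical_set, round(z1, 2), round(z2, 2))
```

After normalisation each speaker's values have mean 0 and standard deviation 1 on each formant, so a fronter NURSE or SQUARE vowel shows up as a higher normalised F2 regardless of the speaker's raw formant range.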

17 Relevant token counts are as follows: Older corpus: NURSE=335, SQUARE=290 [total vowel tokens=8071], Teen corpus: NURSE=67, SQUARE=83 [total vowel tokens=2267].


Figure 8. Vowel plots for female Archive speakers (F, bottom pane) and male Archive speakers (M, top pane).


Figure 9. Vowel plots for female Older speakers (F, bottom pane) and male Older speakers (M, top pane).


Figure 10. Vowel plots for female Teen speakers (F, bottom pane) and male Teen speakers (M, top pane).


Finally, there is some evidence of /t/ lenition in the extracts from M07 and F05 – both speakers produce a phonetically affricated /t/ in pre-pausal position. There are no examples of fricative variants of /t/, so in order to explore the extent to which lenition was present across the OLIVE subcorpora, we extracted and analysed 1536 tokens of utterance final /t/ from the conversational data.18 Averaged group data is presented in Figure 11.

Figure 11. Realisation of utterance final /t/, by age and sex.

We can see that /t/ lenition, at least in utterance final position, is clearly present in the Archive subcorpus, for both male and female speakers. There are, as expected, realisations of /t/ as [h], although this is more common for the male speakers than the females. When describing (what we interpreted as) this variant, Knowles restricted its occurrence to a small set of single syllable, high frequency words with short vowels. In our Archive subcorpus, [h] is attested for /t/ in the following words only: bit, but, get, got, it, lot, minute, not, street, that, what. In general these fit Knowles’ pattern – it is much more likely for [h] to appear in a monosyllabic function word than in other words, but there are some exceptions, i.e. the bisyllabic minute, and street, with a long vowel. This is the first time [h] has been documented as a possible variant of /t/ in these words in earlier Liverpool English. The [h] variant increases rapidly in the Older and Teen sub-corpora. In the Older subcorpus, [h] occurs in the following words: but, got, it, market, not, passport, that, thought, what, and the Teen speakers have [h] in: about, bit, but, Charlotte, get, got, it, lot, not, put, quiet, rabbit, that, what, yet. Again, [h] is more likely in monosyllabic function words, and it is possible in polysyllabic words with a final unstressed syllable (e.g. market, Charlotte, rabbit), as reported in the literature. But it is also attested in words with a long vowel in the final syllable (e.g. about, thought). This is different from the original environment proposed by Knowles (1973) but it is in keeping with the observations from the Archive subcorpus. One possible interpretation of this data is that while the frequency of the [h] variant has increased over time, the linguistic constraints operating on this variable may have remained relatively stable. Further work is required to test this properly, however (see Clark and Watson, in preparation).

18 Utterance final position was chosen since it is the position where the greatest range of realisational variability can be found. Speaker counts for this analysis are as follows: 10 Archive speakers (5 female), 10 Older speakers (6 female), 13 Teen speakers (7 female). All data is from a conversational style.

Importantly, the patterning of Liverpool English utterance final /t/ is unlike that of Irish English. Honeybone’s (2007) idea that Liverpool lenition is a creative act by the young, after they heard certain lenited variants in their linguistic environment during the early processes of new dialect formation, seems to be supported by data from the OLIVE corpus. There is little evidence here of a wholesale adoption of a feature from Irish varieties. Neither is there conclusive evidence of variant reduction. While there is more equal usage of the range of variants in the Archive subcorpus, every variant used by the oldest speakers is still used by the Teen speakers, albeit at different rates.

To summarise, in the last two sections we have presented data from the OLIVE corpus, the largest time-aligned corpus of Liverpool English and two varieties in its hinterland. In doing so, we have made a number of observations about Liverpool English which have not been previously documented in the literature. These are: (1) there is evidence of rhoticity in early Liverpool English, (2) the realisation of NURSE/SQUARE as a front vowel appears to be gendered, at least in earlier forms of the accent, where female speakers use the fronter, closer vowel, in line with contemporary Liverpool speakers, (3) TH-stopping is not very common, even in early Liverpool English, even though it is a stereotype of the variety, (4) TH-fronting, previously thought to be absent in Liverpool English, is now attested in younger speakers, and (5) [h] was a possible variant of utterance-final /t/ in polysyllabic words even for speakers born 100 years ago, and is not an entirely modern innovation. Until now, sufficient data was not available in large enough quantities to be able to document these characteristics. The OLIVE corpus provides such data, and allows us to push our understanding of Liverpool English further back in time than has hitherto been possible.

6 Conclusion

In this chapter, we have explored Liverpool English and contributed to the discussion of its origins by bringing historical data to bear on Honeybone’s (2007) hypotheses about the emergence of the variety. The results largely support Honeybone’s (2007) observations. Rhoticity is expected, given that it is present in some of the input varieties (e.g. Lancashire English), and the move towards the front vowel in NURSE/SQUARE indicates a degree of focussing over time, to a more stable koine. In this case it is led by the female speakers. The direct influence of Irish varieties seems to be less clear cut than is sometimes assumed. Although TH-stopping was adopted in Liverpool, it declined again rather quickly. The stopped variant did not really ‘win out’ as is sometimes claimed – other variants are continually used, usually more often. Plosive lenition, too, is complex, and seems not to have been directly borrowed from Irish varieties.

If claims about the emergence of this variety are correct, and modern Liverpool English really started to come into existence in the middle of the nineteenth century, then we have not examined the earliest Liverpool English in this chapter. But, as far as we know, no earlier recordings than those in OLIVE exist. If some are found, it will be important to see how they fit into the picture presented here.

References

Baayen, R. H., R. Piepenbrock and L. Gulikers 1995. The CELEX Lexical Database. Release 2 (CD-ROM). Philadelphia: Linguistic Data Consortium.

Clark, Lynn and Kevin Watson. In preparation. The transmission and diffusion of /t/ debuccalisation. Ms.

De Lyon, Hilary 1981. A Sociolinguistic Study of Aspects of the Liverpool Accent. Unpublished MPhil dissertation, University of Liverpool.

Fromont, Robert and Jennifer Hay 2008. ONZE Miner: the development of a browser-based research tool. Corpora 3(2): 173-193.

Gordon, Elizabeth, Margaret Maclagan and Jennifer Hay 2007. The ONZE Corpus. In: Joan C. Beal, Karen P. Corrigan and Hermann L. Moisl (eds), Creating and Digitizing Language Corpora. Volume 2: Diachronic Databases. Basingstoke: Palgrave Macmillan, pp. 82-104.

Hickey, Raymond 1996. Lenition in Irish English. Belfast Working Papers in Linguistics 13: 173-193.

Hickey, Raymond 1999. Dublin English: current changes and their motivation. In: Paul Foulkes and Gerard Docherty (eds) Urban Voices: Accent Studies in the British Isles. London: Arnold, pp. 265-281.

Hickey, Raymond 2003. How do dialects get the features they have? On the process of new dialect formation. In: Raymond Hickey (ed.) Motives for Language Change. Cambridge: Cambridge University Press, pp. 213-239.


Hickey, Raymond 2007. Irish English. History and Present-Day Forms. Cambridge: Cambridge University Press.

Hickey, Raymond 2009. Weak segments in Irish English. In: Donka Minkova (ed.) Phonological Weakness in English. From Old to Present-day English. Basingstoke: Palgrave Macmillan, pp. 116-129.

Honeybone, Patrick 2001. Lenition inhibition in Liverpool English. English Language and Linguistics 5: 213-249.

Honeybone, Patrick 2007. New-dialect formation in nineteenth century Liverpool: a brief history of Scouse. In: Anthony Grant and Clive Grey (eds) The Mersey Sound: Liverpool’s Language, People and Places. Liverpool: Open House Press, pp. 106-140.

Honeybone, Patrick and Kevin Watson 2013. Salience and the sociolinguistics of Scouse spelling: exploring the phonology of the Contemporary Humorous Localised Dialect Literature of Liverpool. English World-Wide 34, 305-340.


Hughes, Arthur, Peter Trudgill and Dominic Watt 2012. English Accents and Dialects: an Introduction to Regional Varieties of English in the British Isles. Oxon: Routledge.

Knowles, Gerald 1973. Scouse: the Urban Dialect of Liverpool. Unpublished PhD thesis. University of Leeds.

Labov, William 1966. The social stratification of English in New York City. Washington DC: Center for Applied Linguistics.

Lass, Roger 1984. Phonology. Cambridge: Cambridge University Press.

Lobanov, B. M. 1971. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America 49: 606-608.

Montgomery, Chris 2007. Northern English dialects: a perceptual approach. Unpublished PhD thesis, University of Sheffield.

Munro, Alasdair and Duncan Sim 2001. The Merseyside Scots: a Study of an Expatriate Community. Birkenhead: Liver Press.

Neal, Frank 1988. Sectarian Violence: the Liverpool Experience, 1819-1914, an Aspect of Anglo-Irish History. Manchester: Manchester University Press.

Sangster, Catherine M. 2001. Lenition of alveolar stops in Liverpool English. Journal of Sociolinguistics 5(3): 401-412.

Trudgill, Peter 1986. Dialects in Contact. Oxford: Blackwell.

Trudgill, Peter 1999. Dialects of England. Oxford: Blackwell.

Trudgill, Peter 2004. New-dialect Formation: the Inevitability of Colonial Englishes. Edinburgh: Edinburgh University Press.

Watson, Kevin 2006a. Lenition and segmental interaction: evidence from Liverpool English (and Spanish). Glossa 1: 54-71.

Watson, Kevin 2006b. Phonological resistance and innovation in the North-West of England. English Today 22(2): 55-61.

Watson, Kevin 2007a. The Phonetics and Phonology of Plosive Lenition in Liverpool English. Unpublished PhD thesis, Edge Hill University/Lancaster University.

Watson, Kevin 2007b. Liverpool English. Journal of the International Phonetic Association 37(3): 351-360.

Watson, Kevin 2007c. Is Scouse getting Scouser? Exploring phonological change in contemporary Liverpool English. In: Anthony Grant and Clive Grey (eds) The Mersey Sound: Liverpool’s Language, People and Places. Liverpool: Open House Press, pp. 215-241.

Watson, Kevin and Lynn Clark 2013. How salient is the NURSE~SQUARE merger? English Language and Linguistics 17(2): 297-323.

Wells, John C. 1982. The Accents of English. 3 vols. Cambridge: Cambridge University Press.


7 Tyneside English

Dominic Watt and Paul Foulkes

1 Introduction

Tyneside English (TE) – nicknamed ‘Geordie’ – is spoken in the conurbation around Newcastle upon Tyne, in the northeast of England. Newcastle (population c. 280,000) is the largest city in the region, and as an economic and cultural centre of gravity its influence is unrivalled in the northeast. Besides Newcastle, the conurbation includes Gateshead, a large town (population c. 200,000) on the south side of the River Tyne, and other smaller communities that extend eastward as far as the North Sea coast. Nearby towns in (historical) County Durham (e.g. Washington, Chester-le-Street), and in southern Northumberland (e.g. Ponteland, Cramlington), are sometimes also included.


[Map 1: localities marked are Ellington, Earsdon, South Shields, Newcastle upon Tyne, Heddon-on-the-Wall, Gateshead, Washington, Tunstall, Ebchester, Durham, Crook, Sunderland, Ryhope and Witton Park; the legend distinguishes Berliner Lautarchiv, Survey of English Dialects and Tyneside Linguistic Survey localities.]

Map 1. Map of northeastern England showing the places of origin of the Berliner Lautarchiv (BLA), Tyneside Linguistic Survey (TLS) and Survey of English Dialects (SED) speakers, and a selection of other locations mentioned in this chapter.

For space reasons we cannot discuss the history of Tyneside and its local vernacular in detail here. Beal (1993a, b, 2008a, b), Watt (1998, 2002), and Beal et al. (2012) offer more extensive treatments of the subject. We will, however, briefly consider relevant aspects of the variety before looking more closely at the oldest extant recordings of Tyneside English (TE) that we can identify.

2 Tyneside English

Tyneside is almost as far removed from London and the linguistically influential southeast as it is possible to be within England. Tyneside’s relative isolation even from other large urban centres in northern England is often argued to have favoured the retention of archaic pronunciations. Prominent among these is uvular [ʁ],¹ which has long excited comment. As far back as the 1720s, a Reverend Jones remarked upon the unusual ‘burred’ /r/ in Northumbrian English (Wales 2006: 100-101), while Daniel Defoe, who travelled to Newcastle and Northumberland at around the same time as Jones, also devoted journal space to a discussion of the burr. Defoe, who described it as a ‘hollow Jarring in the Throat’ and an ‘Imperfection’ of speech, noted that Tynesiders seemed proud of the burr as a badge of the ‘Antiquity of their Blood’ (1724-7[1778]: 257-258). At the end of the nineteenth century, the burr was still sufficiently frequent that Ellis, who described it as ‘the characteristic of [Northumberland] speech… though quite inessential to the dialect’ and ‘really a defect of articulation which tends to become epidemic’ (1889: 641), allotted it a special section in his On Early English Pronunciation (see further Påhlsson 1972; Howell 1987). Claims of the uniqueness, exceptional antiquity and immutability of Newcastle and Northumbrian dialects are repeated so often in popular discourse about variation in British English that they have practically become truisms. For example:

Today the only part of England where the original Anglo-Saxon language has survived to any great extent is of course the North East. Here the old language survives in a number of varieties, the most notable of which are Northumbrian and Geordie. It is from the ancient Germanic and Scandinavian language of the Angles that the unique local dialects of Northumberland and Durham primarily owe their origins. (Simpson 2009)

¹ This realisation of /r/ is found to a limited extent in other parts of the Anglophone world (rural Ireland, west Wales, north-east Scotland, and parts of South Africa; see Hickey 2004: 79; Wells 1982).

There is undoubtedly a strong sense of regional identity in the north-east of England, and this might also help to account for the survival of traditional pronunciations (for examples, see §4 below). We noted earlier that Newcastle is relatively remote from other large English cities, but we should not forget that there are two significant conurbations, Edinburgh and Glasgow, to the north of it. Wales (2006) argues that the ‘austrocentrism’ of British culture often leads observers to neglect the fact that as a linguistic barrier the Scottish border is far from impermeable (Watt et al. 2014). Indeed, according to Beal (1993b), ‘[t]he strongest influence on the dialects of Tyneside and Northumberland is undoubtedly from Lowland Scots, but this can hardly be called an outside influence given the
common origin of these dialects; it must rather be said that the continuing close relationship between Scots and Northumbrians has served to maintain and reinforce the linguistic similarities between their dialects’ (Beal 1993b: 189-190). See also Mess (1928: 34), and cf. Ellis’ lament about Sunderland speech, which ‘can hardly be said to be a dialect on account of the mixed population and influence of Scotch’ (Ellis 1889: 640). Social and economic changes across the northeast have also been held accountable for increasing ‘dilution’ of the traditional dialects, and for the ability of dialect speakers even in rural areas to code-switch between their local vernaculars and more standard-like forms of English. Wright (1898: v), looking ahead to the twentieth century, is pessimistic about the fate of ‘pure dialect speech’, blaming the greater availability of education and ‘modern facilities for intercommunication’ (Wright 1905: vii). Ellis observes that two Newcastle coal miners he interviewed ‘spoke very well’, and when asked if they talked that way while working in the pit they told him that to do so would invite scorn from workmates, but that they were allowed to speak ‘properly’ at union meetings (Ellis 1889: 650). Orton (1929, 1930, 1933) also notes that traditional Northumbrian and County Durham dialects were moving closer to Standard English, and that in the fullness of time standard forms were likely to supplant indigenous ones. Viereck (1966) echoes Orton’s views in his discussion of the speech of a group of working-class Gateshead men born as early as the 1880s. He ascribed changes in traditional TE to the combined influences of Standard English, immigration from other British regions, television and radio, and the greater availability of state education. 
At the turn of the twenty-first century, Griffiths (1999: 44-45) added ‘official contempt’ for regional vernaculars to the list of factors – ‘industrial decline, education [and] technological shifts’ – that, like Viereck, he thought responsible for the erosion of traditional dialect. Contemporary treatments (e.g. Beal et al. 2012) tend to express more nuanced perspectives on the ways in which the dialect is changing in response to a variety of intrinsic and extrinsic forces, but with respect to its phonology the overall picture is still one of convergence on a more standard-like pattern. It would not be controversial to say that the traditional features of TE that set it apart from other varieties in northern England are ever more infrequently heard. With a view to examining whether such features are present in the oldest available recordings of TE, and to what degree they occur if they are present, we turn now to consider the materials we have analysed with these aims in mind.

3 The recordings

The northeast of England is relatively well-served in terms of the availability of early recordings of its dialects. The corpus, collected from 35 locations across northern England between 1928 and 1939 by Harold Orton, a native of County Durham, contains samples from speakers who ranged in age from their twenties to their sixties at the time of recording (Rydland 1998: 12-13). In principle, this gives us the opportunity of hearing the speech of people who were born almost 150 years ago. Similarly, the fieldwork carried out for the Survey of English Dialects (Orton 1962; Orton and Halliday 1962) included interviews with sometimes quite elderly
men and women. Many were in their seventies during the first phase of fieldwork in the early 1950s; one man from Washington, then in County Durham, was born as early as 1876. The oldest residents of Newcastle and Gateshead who participated in the Tyneside Linguistic Survey (Pellowe 1976; Pellowe et al. 1972) were born in the 1890s. Later in this chapter we will make use of SED and TLS data, but these are not the focus of our analysis. Because the date that the recording was made is our criterion for selection, rather than the birth years of subjects, the oldest available recordings of TE appear to be those made in February 1916 by or for Wilhelm Doegen and Alois Brandl (see Robinson, this volume). They recorded the speech of two British prisoners-of-war (POWs) being held in camps near Berlin. The prisoners in question were Arthur Roper, a 22-year-old professional soldier born in Durham but raised in Newcastle, and William Dixon, a 24-year-old former machinist/shipbuilder from South Shields, then in County Durham (Map 1). The samples of their speech, and other recordings like them, are housed in the Berliner Lautarchiv (henceforth BLA) at Humboldt University, but since 2008 the British Library has also held digital copies available for download from their website. Doegen, Brandl and their colleagues, supported by the Royal Prussian Phonographic Commission, collected nearly 2,700 recordings in German POW camps between 1915 and 1918 (Mahrenholz 2008). Among this wealth of recorded speech and song there is regrettably little material from northeastern England; besides the Roper and Dixon samples there are only two other recordings of speakers from the northeast (Table 1).

These are also of young men from County Durham, but the locations recorded as their places of origin, Tunstall and Crook, are not in Tyneside itself (Map 1). Tunstall, then a rural village, has since been absorbed into Sunderland, a city about 11 miles (18 km) east of Newcastle and comparable to it in size. The interactive map on the British Library website inaccurately suggests that this particular Tunstall – there are several in England – is in the far south-west of County Durham; the locality marked on their map is actually Bowes. The Tunstall speaker had in 1910 moved to the neighbouring village of Ryhope, now also part of the Sunderland conurbation, and had lived there for four years until the outbreak of war. Crook is a small town about 20 miles southwest of Newcastle, but note that the Crook speaker lived between the ages of 8 and 20 in nearby Witton Park, and is labelled as coming from the latter place by Borgis (1936: 14).

Table 1. Information about the Berliner Lautarchiv (BLA) speakers.

| Origin | County (pre-1974) | Speaker name | Birth year | Recording year | Age | Place of recording |
| Newcastle (born in Durham) | Northumberland | Arthur Roper | 1894 | 1916 | 22 | Wittenberg, Germany |
| South Shields | Co. Durham | William Dixon | 1892 | 1916 | 24 | Dyrotz, Germany |
| Tunstall | Co. Durham | Arthur Clark | 1882 | 1917 | 35 | Güstrow, Germany |
| Witton Park (born in Crook) | Co. Durham | Richard Blacket | 1895 | 1917 | 22 | Güstrow, Germany |


Borgis’ treatise on language use in County Durham and Northumberland is very valuable in the current context, as it reproduces Brandl’s notes about the four BLA speakers, and provides Borgis’ own narrow (quasi-IPA) phonetic transcripts of each recording alongside a list of ways in which his transcription choices differ from Brandl’s. There then follows a dense and lengthy exposition of individual consonant and vowel features. We cannot hope here to replicate Borgis’ accomplishment in this respect, so will not attempt to focus on more than a subset of the features present in the recordings. For unknown reasons the four BLA samples cannot be located on the British Library website by searching for recordings by county. Puzzlingly, neither Durham nor Northumberland are listed among the counties represented in the archive. Recordings must instead be searched for by keyword (locality, county, or speaker’s name). One (Arthur Roper) is also available on the British Library’s Voices of the UK CD, though as per the BLA website Roper is said to come from Durham rather than Newcastle. At the time he was recorded, Roper had lived for most of his life in Newcastle – he was taken there from Durham by his parents when he was about six and remained there until he joined the army and moved to London at the age of 20 – so understandably is described by Borgis (1936: 3), following Brandl’s notes, as a speaker of Newcastle dialect. We can legitimately classify Roper, then, as a speaker of late nineteenth-century TE, and will treat him as such in the present chapter. Though only Roper and Dixon are TE speakers in the strictest sense, we include the recordings of Clark and Blacket for the purposes of comparison, and because of the scarcity and brevity of relevant samples of this age. There are of course differences between all four recordings, but we judge them to be sufficiently similar that treating them together is justified. 
The BLA recordings are all short (3-5 minutes), being readings of a passage less than 600 words in length. The text (the Parable of the Prodigal Son; Luke 15: 11-32) is reproduced at the end of this chapter (see Stuart-Smith, this volume, for an analysis of the same text). Although the British Library website notes that this particular story was used for other surveys, e.g. the Linguistic Survey of India (Grierson 1927), the wording of Doegen and Brandl’s text differs somewhat from the standard versions available today (e.g. www.biblegateway.com). There is also variation in the wording of the passage between the four speakers. It is unclear whether these differences are just the result of misreadings, or alternative wordings chosen by the speakers (e.g. the South Shields speaker says shoon (= shoes) rather than boots, and the Tunstall speaker reads man as chap, sons as lads, nothing as nowt, and friends as chums, among other changes). It is possible, of course, that Doegen and Brandl modified the wording over the two years in which they made the recordings of our four POWs. It is also likely that they instructed the speakers to read ‘broadly’ as though talking to a relative or friend from home. There are points at which the word that is presumably the one shown on the page is read but then immediately re-read as its dialect equivalent (e.g. go is first read as such but then replaced by gan by the Newcastle and Crook speakers, and the Newcastle speaker starts to read home but opts instead for yem). Doegen and Brandl may have stipulated that speakers attempt a ‘broad’ reading not just to elicit traditional forms of special interest, but also because the elevated, archaic language of the text might well have prompted the readers to adopt a more standard, ‘learned’ pronunciation of the sort they would probably have heard in church or school. It is unclear
whether the Tunstall speaker’s substitution of thou/thee for you, thy for your, and thine for yours was a choice of his own – these forms were present in the speech of County Durham in the nineteenth century (Pietsch 2005), and they would certainly be in keeping with the ‘biblical’ language used throughout the text – or a faithful rendition of the words on the page (the other BLA speakers consistently use you, your and yours in the same places). Otherwise, there are few examples of non-standard grammar evident in the recordings, but this is to be expected under the circumstances. Exceptions are the Tunstall speaker’s use of invariant come and run instead of standard came and ran, and the apparent omission by the Crook speaker of the definite article in what is presumably into the country (though the other speakers have (in)to a far country here, so what the standard equivalent would be is uncertain). Three things about the recordings are striking at first hearing. One is the surprisingly good technical quality of most of them (the exception is that for William Dixon, which suffers from high levels of non-speech noise). Doegen was the inventor of the ‘Lautapparat’ recording device, which cut grooves into wax discs as the talker spoke into a large sound-gathering horn. The originals were then copied onto shellac discs by means of a copper ‘matrix’ (Mahrenholz 2008). The Lautapparat could record only a few minutes of speech onto one disc, which accounts for the short length of the reading passage. While the sound quality in even the best of the four recordings is crackly, hissy and occasionally distorted or muffled, for all four it is sufficiently good that acoustic measurements are feasible, but inevitably less reliable than those derived from recordings made with modern equipment, and they are based on only small numbers of observations owing to the brevity of the samples. Another striking feature of the recordings is the fluency of the readings. 
According to Borgis (1936) and the accompanying notes on the British Library website, all four speakers had had a school education before joining the army, but all were in occupations (farm labourer, soldier, machinist, miner) that at the time would not have demanded a high level of literacy. Indeed, Brandl’s notes specify that Arthur Roper (Newcastle) could not read and write (Borgis 1936: 3), but this is difficult to square with the competence of his recitation, even if he was being prompted. The latter is a possibility: it might account for the voice that can be heard in the background uttering what sounds like the line of the passage Roper is about to recite. Nonetheless, we have the strong impression that even if Roper was unable to read the parable on cue, he had rehearsed it carefully beforehand. Similarly, there are indications that Blacket is being prompted by another speaker. This may have been Brandl, whose notes specify that he was in the room when Blacket was recorded (Borgis 1936: 4). In any case, although literacy rates in the United Kingdom exceeded 90% at the turn of the twentieth century (Vincent 1993), that would hardly guarantee a high level of skill at reading Bible passages written in a style redolent of early seventeenth-century English. It may be that the readers were specifically chosen by Doegen and Brandl, or their representatives, from among the body of prisoners in their respective camps on the basis of their ability at reading aloud, and they might have been permitted to practise reading the passage several times before being recorded. Indeed, it may also be the case that in spite of the often appalling conditions in which the POWs were kept (Hinz 2006; Oltmer 2006), they in general had more frequent opportunities than the average working man to practise reading
aloud, through reading to fellow prisoners to communicate news from home or simply as a way of passing the time (van Emden 2009). It is perhaps doing a disservice to the BLA speakers to set out with the assumption that they would have had difficulty reading a text aloud, but under the circumstances it does not seem unreasonable to be surprised by the facility with which they did so. Thirdly, one gets the impression that the speech in the samples is actually quite modern-sounding. As suggested above, it is possible, perhaps even probable, that the speakers erred on the side of relatively standard pronunciations while reciting the text, at least to the extent that they could style shift in any consistent way. In 1916-17, having one’s voice recorded would have been extraordinary in a way that is difficult to imagine today. Recording equipment would have been unfamiliar to practically everyone alive at the time, and the fact that the reading task was being administered by academics – whose English was likely to have been spoken in accents modelled on Received Pronunciation – could well have led to strong observer’s paradox effects (Labov 1972). One speaker (Blacket) adopts a rather shrill, ‘shouted’ style, perhaps in response to an instruction to talk loudly and clearly into the recording device. In short, we are almost certainly not hearing samples of these internees’ speech that are fully representative of their home language. But even if no such style-shifting was taking place when the speakers were recorded, one must consider the circumstances in which the prisoners were being held: the other English speakers in the POW camps were from all over the British Empire, and alongside these were prisoners from non-Anglophone countries (e.g. Russia, Romania, Italy) who were almost certainly unable to speak any English at all (Hinz 2006; Mahrenholz 2008). 
If the BLA speakers had enlisted in 1914, they would have already been mixing with speakers of other varieties of English for two or three years before their incarceration. It seems almost inconceivable that some degree of accent levelling did not occur among these speakers before and during their internment (for relevant discussion of the effects of language contact, see entries in Hickey 2010). Whether the convergence would have been towards a Received Pronunciation-like model, or something quite different, is hard to say. But since Received Pronunciation was at the time the British English accent most closely linked with clarity and suitability for communication across geographical and social divides (a reason it was favoured by the British military for use among the officer classes), it would at any rate have been available to the BLA speakers as a model of pronunciation that they could orient to when communicating with fellow prisoners, and perhaps also their captors. In the following sections we describe some of the key features to be heard in the BLA recordings. We focus in particular on the vowels of the FACE, GOAT and NURSE lexical sets (Wells 1982) by way of illustrating the value of recordings of this antiquity when looking for evidence of the progress of sound changes that appear to have taken place over the intervening period. FACE and GOAT are among the most variable vowels in spoken English, a point which is amply illustrated by the accents of the northeast of England (Watt 2000, Haddican et al. 2013). In the case of NURSE, it has been asserted quite frequently in the literature on traditional forms of TE, e.g. O’Connor (1947), Viereck (1965, 1966, 1968), Wells (1982), Hughes et al. (2012), that NURSE was once fully merged with NORTH such that, for example, shirt and short were homophonous. If the merger was indeed complete it appears to have been ‘undone’ in the intervening period, in that for most speakers
of contemporary TE pairs like shirt and short are now distinct. Using measurements of the frequencies of the lowest two formants in tokens of relevant vowels, we examine variability in the FACE and GOAT vowels and assess the degree to which these, the oldest known recordings of TE, corroborate the reports of a NURSE~NORTH merger. The BLA data are supplemented with figures drawn from the TLS and SED corpora. Firstly, however, we discuss some of the more notable segmental features to be heard in the BLA recordings, to give a flavour of the diversity of forms across the four samples and the ways in which these northeastern varieties differ from forms of British English that might be more familiar to readers.

4 Features

The four speakers are denoted as follows: Ne = Newcastle; SSh = South Shields; Tu = Tunstall; Cr = Crook. The transcriptions below were made independently of those by Borgis and Brandl in Borgis (1936), and differences between our versions and theirs are of course to be expected. Some may be the consequence of poor sound quality, particularly in the recording of the SSh speaker.

4.1 Consonants

/p t k/
These are fairly similar to contemporary pronunciations, in that they are moderately aspirated in initial positions, but are frequently realised with the local variants [ʔp ʔt ʔk] where they occur between sonorants, as in Tu yet a, pity (Docherty and Foulkes 2005). In several cases intervocalic /t/ is voiced (e.g. Ne fatted, got him; SSh but even, what he, part of) or tapped (Ne but as). Some examples of voiced intervocalic /t/ (e.g. Tu got him, SSh put a) are verging on [ɹ], i.e. they may be instances of T-to-R (Wells 1982: 370). Borgis (1936: 8) in fact has [r] (sic) for /t/ in put a and got him for Ne, SSh and Tu. The realisation of /t/ as the glottal stop [ʔ] does not occur to any extent in the BLA samples. It seems to us more likely that the infrequency of [ʔ] is indicative of the retention of the traditional TE [t] pronunciation than it is the outcome of an effort by the speakers to bring their readings more into line with the standard pronunciation. The Tu speaker spirantises /k/ to [x] in e.g. working. Borgis notes an affricated pronunciation for the Ne speaker in kill. There are, unsurprisingly, no signs of the pre-aspirated variants reported for modern Tyneside English (Docherty and Foulkes 1999).

/r/
All four speakers’ accents are fairly consistently non-rhotic, but not categorically so: e.g. Ne together, father; Tu work; Cr your hired (NB: the /h/ is not dropped). Sporadic rhoticity is not unexpected (e.g. Beal 1985: 43). Borgis indicates considerably more rhoticity than we can hear in the samples, but does not distinguish subtypes of /r/ other than the uvular /r/, which he denotes with a separate symbol; for other variants of /r/ he uses [r] throughout.
Where realised, /r/ is generally [ɹ], but the uvular /r/ noted earlier is quite frequent (Ne great, arose, ran, bring, angry, together, father (see above); SSh country, great; Cr country, there, very). The alveolar tap [ɾ] is not uncommon (Ne very, friends; Tu everything, very, merry). The tap also occurs intervocalically across word boundaries (e.g. Tu father has, here I) but in linking /r/ contexts of this kind the /r/ may also be [ɹ] (Cr for him). As in contemporary TE, /r/ liaison may be avoided altogether through use of [ʔ] (Tu share of; Cr are always). Intrusive /r/ does not occur in the context saw him for any speaker (e.g. Tu [sɑːm]), but Borgis has the unexpected [twaːr sonz] two sons for the Cr speaker. Most surprising are the two or three occasions on which /r/ sounds labiodental ([ʋ]; SSh friends, and possibly also SSh and Tu brother; note that in each case there is a preceding labial consonant). Given the poor sound quality of the SSh recording we cannot identify the pronunciation with complete certainty, however. /l/ The majority of /l/ tokens are clear (i.e. palatalised) but can on occasion be ‘darker’ (more velarised) than expected, e.g. Ne meals. Word-final /l/ may be elided in certain historically /l/-final words, e.g. Ne all. /ŋ/ [ŋ] and [n] are both present, the latter occurring in suffixes, as in e.g. belongings, working. Hunger has [ŋ] only, not [ŋɡ], for all four speakers (though on one occasion the Cr speaker has [ŋɡ] in hunger; he also has [ŋ] rather than [ŋɡ] in youngest and angry). The [ŋɡ] sequence is used by SSh in nothing himself (Borgis has [ŋk] here, and in everything, bring and also for e.g. Cr young). /h/ /h/-dropping is not consistent across the samples. Contemporary TE is said to be the only urban accent of England in which /h/ is regularly pronounced (Wells 1982: 374), although it may be absent from unstressed /h/-initial pronouns and auxiliary verbs, as in e.g. Ne begrudged him. 
Among the other speakers, /h/-dropping is more frequent, and affects /h/-initial content words, e.g. SSh hired; Tu husks; Cr heaven, house, he had (but [h]usks). Although [h] is the expected form, [j] rather than [h] occurs at the beginning of two instances of heaven for the Ne speaker. /θ ð/ There is no sign of any fronting to /f v/ of either of these consonants – this is infrequent even in contemporary TE – but the Cr speaker uses a tap in father [ˈfaɾa]. For the same speaker Borgis gives [d] in father and together, but [dđ] (IPA [dð]) in there, that, and found (i.e. the /d/ is affricated). He also has [t ????] (IPA [tθ]) for one of the Cr speaker’s productions of with. The final consonant of with is dropped by the Ne and Tu speakers in with me. /v/ The final /v/ is dropped by the Ne speaker in give, gave, have. Borgis transcribes a [v] at the end of to in to his (IPA [tɪv ɪs]) by the Tu speaker, and has both [tɪv] and


[tuv] in to his for the Ne speaker. This is presumably an outcome of the same process that gives rise to [dɪv] for do in contemporary TE. /j/ An expected form of done in TE is [djon] (or similar; see e.g. Jones 1911) but this is only heard in the Tu speaker’s sample. He has [djʊn] for done (twice), and both he and the Ne speaker have the now archaic [bjʊts] boots and [sjʊn] soon.

4.2 Vowels

FLEECE A narrow closing diphthong [ɪi] and often a wider one [ei] occur in open syllables, e.g. Tu me, but also in checked ones, e.g. Ne feet, Cr sleep. In northeastern English varieties the mapping between the vowel system and the lexicon often deviates markedly from that of Received Pronunciation and other reference accents, a point well illustrated by the membership of the FLEECE set. For example, Ne’s dying is [diːɪn], who is [wɪi] and no is [niː]; SSh has [dɪi] do and Tu [dɪid] dead. KIT Generally [ɪ], but the vowel is often raised to [i], e.g. Tu kissed, things; Cr this, killed. DRESS The vowel is typically [ɛ], but many RP DRESS-class words take other values in the BLA data, e.g. Ne [ˈmɒʁi] merry, [ˈnɪvə] never, [fɾiːnz] friends; Cr [ˈmɒni] many (cf. Scots), [ˈmʌʁi] merry. TRAP/BATH The quality may be appreciably closer than in contemporary TE; for several speakers it is [æ], e.g. in man, had, glad, last, dancing. Elsewhere it is mostly [a], but Cr gathered has [ɛ], as does Cr’s BATH item afterwards, and Tu’s after and gathered take [eː]. Words of the BATH set almost all pattern with TRAP items, but – as in modern TE – an exception is master, which for all four speakers has [ɑː] or [aː] (cf. Borgis [aː(ə)]) rather than [a] (Beal et al. 2012: 36). As in Scots, belong(ing)s, long(er), etc. often have [a] rather than [ɒ]. START In contemporary TE, START tends to be fully back, somewhat raised and often slightly rounded, but in the BLA samples it can be very front (SSh [aː], Tu [æː]; Borgis has [ɛə] for the latter, and indicates a postvocalic /r/ that we do not hear).
PALM The short [a] in the first syllable of father has already been noted. It may on occasion be closer: [æ] (Cr) and even [ɪ] (SSh) are recorded in this word.


LOT Words such as what and want, which take [ɒ] in RP, generally have [a] in the varieties represented by the BLA speakers (e.g. Ne, SSh want, Tu what), though the standard-like [ɒ] is also attested (cf. THOUGHT, below). THOUGHT Words containing pre-lateral vowels, e.g. all, often have [aː]. Ne has this vowel in called and always. Cr’s saw is [sɑː]. FOOT/STRUT As elsewhere in northern England the split between these two classes did not take place. All words in this set take /ʊ/, but the actual quality used is quite variable. Even in the 1870s Ellis had remarked upon a fudged ‘new sound… [that] adumbrates (œ)’ which he had heard in Newcastle and considered a ‘bad imitation’ of a more standard form (1889: 638). We may assume the target would have been [ʌ] (Ellis’ [Ǝ]). Ellis claims that the use of [ʌ] in Sunderland could be attributed to Scottish influence. Interestingly, the SSh and Cr speakers produce son(s) with [ʌ] on one occasion apiece, while the Ne and Cr speakers generally have [ʊ] and [ɒ] respectively (Borgis gives [o] and [ɔ]). Food [fʊd] is apparently a FOOT/STRUT item for these speakers, as are found [fʊnd] and boots [bjʊts]. GOOSE In line with its ‘partner’ vowel FLEECE, GOOSE is diphthongal in open syllables and often also in checked ones (e.g. Cr shoes), but the majority form we had anticipated was a close, back and strongly rounded [uː]. Unfortunately, there are very few examples of GOOSE items in the text with which to illustrate this; food, as noted above, patterns with FOOT/STRUT. The vowel of soon is as fronted as [y] for the Tu speaker (perhaps an interaction with the preceding yod; Borgis gives [iu]). There is a neutralisation of vowel quality between GOOSE and MOUTH as a consequence of the failure of MOUTH to break and fall to a back upgliding diphthong in the far north of England and in Scotland. Forms such as toon ‘town’ and aboot ‘about’ are one of the principal stereotypes of TE, in fact; see further MOUTH, below. 
PRICE There is some evidence in the BLA data of the context-conditioned alternation between a diphthong with a closer and fronter onset ([ɛi]) and one with a more open and central onset ([ai]); see Milroy (1995). The first variant is expected where PRICE precedes a voiceless consonant, as in sight or price, while the second occurs elsewhere (side, prize, pry, etc.). However, the BLA material is not altogether clear with respect to the alternation. Cr has a monophthongal [aː] in swineherd (cf. Ne’s [ai]) but [ai] in time (Ne has [æi]). SSh has [ɛi] and [ai] in time, and his dying is [daɪn]. MOUTH Contrary to our expectations, MOUTH-monophthonging was not especially common in the four samples. The majority forms were back upgliding diphthongs of the [aʊ] type, with first elements anywhere along a continuum between [a] and [ə], and


glides at [ʊ], [o] or [u(ː)]. More monophthongal productions included Ne [ʌuːt] out; SSh [huː] how, [hʉs] house (cf. Scots), [suːnd] sound; Tu [fuənd] found. happY As in contemporary TE, happY is a short tense close front [i] (Ne, SSh worthy, merry; Tu many, pity; Cr very, etc.). lettER It was noted above in connection with the pronunciation of father that lettER can be very open and peripheral, e.g. Ne [ˈfaða] father; SSh [ˈhʊŋɐ] hunger. This is not invariably the case, however: [ə] is the most common form across the four recordings (e.g. Ne together, never). FACE, GOAT and NURSE We turn finally to look in more detail at the FACE, GOAT and NURSE vowels, which have been shown to vary in complex ways in late twentieth-century TE (Watt 2000, 2002; Maguire 2008).

4.3 FACE and GOAT

FACE and GOAT are in some sense ‘partner’ vowels, in that the ways in which they vary in terms of their phonetic exponency and the distribution of these variants across the TE-speaking population seem to work in parallel (it should be noted that the tendency of these two vowels to shift in concert with one another is attested in numerous other varieties, including London Cockney English and, to a lesser extent, RP). In this variety, both have long close-mid monophthongal realisations ([eː] and [oː]). There is some variation in vowel height, but they are seldom as open as the [ɛː] and [ɔː] realisations found on Teesside and Yorkshire to the south, though make and take may have short open-mid or open vowels (e.g. [mak]) in Tyneside. [eː] and [oː] are today the majority variants. The traditional FACE and GOAT forms are ingliding or ‘centring’ diphthongs, which start close and peripheral and transition to schwa ([ɪə] and [ʊə], or forms approximating these values). These forms are now rather uncommon in the speech of young Tynesiders. They are also less common among women and middle-class speakers. Each vowel also has an upgliding form resembling those in RP: [eɪ oʊ].
These are believed to be relatively new to the variety; older sources do not cite them. Perhaps predictably, they are preferred by women and middle-class speakers. Lastly, GOAT features an ‘extra’ variant for which there is no analogy in FACE. This is the fronted variant [ɵː], which is very close qualitatively to the long central variant of the NURSE vowel, [ɜː]. It is this variant that eye-dialect spellings such as turtle (for total) are seeking to capture (e.g. Beal 2000). The antiquity of the [ɵː] is uncertain, but it may relate to the archaic [øː] found in parts of Northumberland (Rydland 1998). The BLA recordings contain examples of the peripheral monophthongs [eː] and [oː] but the ingliding diphthongs [ɪə] and [ʊə] are the most common forms of FACE and GOAT (compare, for example, the Ne and SSh speakers’ pronunciations of place and go, which are Ne [plɪəs], [gʊə] and [pleːs], [ɡoː] respectively). [a] in


make and take is attested. [eɪ oʊ] are absent, as is [ɵː]. This does not mean that the closing diphthongs and the fronted variant were never heard in the vicinity; Jones (1911: 184) has [o(ː)u] in a Newcastle male speaker’s pronunciations of coals, so and though, and there are indications that the BLA Tu speaker favours a slightly centralised monophthongal GOAT vowel [öː]. Thirty years later, O’Connor (1947: 8) would record the (short) fronted [ɵ] form in so, and [ɵə] in both, in Newcastle, so it is possible that GOAT fronting was already underway when the BLA recordings were made.

Figure 1. F1/F2 plot of averaged trajectories of the FACE and GOAT vowels for the four BLA speakers. The corner vowels FLEECE, GOOSE and TRAP are the grand means for all four speakers. The Hz values have been normalised using the modified Watt-Fabricius method in NORM (Thomas and Kendall 2007).


Figure 2. F1/F2 plot of averaged trajectories of the FACE and GOAT vowels for four TLS speakers (four elderly women from Gateshead, Co. Durham). The corner vowels FLEECE, GOOSE and TRAP are the grand means for all four speakers. The Hz values have been normalised using the modified Watt-Fabricius method in NORM (Thomas and Kendall 2007).

Figure 1 is a plot of the means of the first and second formants of FACE and GOAT, normalised following Watt and Fabricius 2002, for each of the four BLA speakers. The grand means for all four speakers’ corner vowels (FLEECE, GOOSE and TRAP) are included to give an indication of the height and frontness of FACE and GOAT, and the arrows indicate the averaged trajectories between the nuclei and the glides of the two vowels, the formants of which were measured at the approximate 25% and 75% duration points for each vowel. It will be noticed for FACE that the nucleus tends to be front and moderately close, with a normalised F2 close to that of FLEECE, though for the SSh speaker the FACE nucleus is relatively open. All four speakers’ FACE vowels are diphthongal, being characterised by a centralising or opening offglide at [ə ~ ɛ]. The Tu speaker has the narrowest diphthong but the nucleus-glide trajectory is comparable in direction to that of the Ne and Cr


speakers. The GOAT vowel shows some similarities to FACE in respect of the tendencies noted above, but for the Cr speaker the nucleus of GOAT appears to be fairly open, such that it is a centring and upgliding diphthong rather than a downgliding one. The trajectories of the FACE and GOAT vowels plotted in Figure 1 differ quite markedly from those shown for comparison in Figure 2. The latter are the by-speaker averages of a set of female Tyneside Linguistic Survey speakers from Gateshead. Although these women were recorded more than half a century later than the BLA speakers, they were born between 1891 and about 1900 and so can be considered their contemporaries. It is important to remember that the recordings from which the data in Figures 1 and 2 were drawn were made under very different circumstances (the TLS speakers were talking spontaneously rather than reading aloud, and a relatively modern magnetic tape recorder was used rather than Doegen’s primitive device, for example). However, the formant plot clearly confirms the auditory impression that these female talkers favour monophthongal variants more than the BLA men do. This pattern is replicated in TE data collected in the early 1990s (Watt 2000), more strongly among the older working-class speakers who were born shortly before World War II than among the young and middle-class speakers in that sample. On the assumption that the qualities of the FACE and GOAT vowels in the speech of the four TLS speakers were in the 1970s still much as they were when these women were young, it appears that the preference for the diphthongal [ɪə ʊə] variants among men, and the monophthongal [eː oː] variants among women, is a long-standing and quite stable feature of TE.
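The normalisation and trajectory averaging that underlie plots of this kind can be illustrated with a minimal Python sketch. It follows the S-centroid formulation of Watt and Fabricius (2002), in which each formant value is divided by the mean of that formant across three corner points: FLEECE, TRAP and a constructed [u′] corner whose F1 and F2 are both set to the F1 of FLEECE. It is not the modified variant implemented in NORM, nor the script used for the figures; the function names and the input layout are hypothetical and chosen for illustration only.

```python
from statistics import mean


def s_centroid(fleece, trap):
    """Return (S(F1), S(F2)) for one speaker.

    fleece, trap: (F1, F2) tuples in Hz for that speaker's FLEECE and TRAP.
    Assumption: these serve as the [i] and [a] corners of the vowel triangle.
    """
    i_f1, i_f2 = fleece
    a_f1, a_f2 = trap
    # Constructed [u'] corner: F1 and F2 both equal F1 of FLEECE.
    u_f1 = u_f2 = i_f1
    s_f1 = (i_f1 + a_f1 + u_f1) / 3
    s_f2 = (i_f2 + a_f2 + u_f2) / 3
    return s_f1, s_f2


def normalise(point, centroid):
    """Express an (F1, F2) measurement as ratios of the speaker's centroid."""
    return point[0] / centroid[0], point[1] / centroid[1]


def mean_trajectory(tokens, centroid):
    """Average the normalised onset and offset of a set of vowel tokens.

    tokens: list of ((F1, F2) at the 25% point, (F1, F2) at the 75% point).
    Returns (mean_onset, mean_offset), each a normalised (F1, F2) pair.
    """
    onsets = [normalise(onset, centroid) for onset, _ in tokens]
    offsets = [normalise(offset, centroid) for _, offset in tokens]
    mean_onset = (mean(p[0] for p in onsets), mean(p[1] for p in onsets))
    mean_offset = (mean(p[0] for p in offsets), mean(p[1] for p in offsets))
    return mean_onset, mean_offset
```

An arrow drawn from the mean onset to the mean offset then corresponds to one averaged trajectory per speaker and vowel; a rise in normalised F1 between the two points indicates an opening glide.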
4.4 NURSE

As noted earlier, it has been claimed that TE NURSE has undergone merger at some point in the past with NORTH (and by implication FORCE and THOUGHT, which like NORTH have /ɔː/; for simplicity, we hereafter refer to these three sets collectively as NORTH). Retraction and rounding of NURSE to [ɔː] from a value close to [ɜː] has been put down to the regressive coarticulatory influence of a following uvular [ʁ], i.e. the Northumbrian burr (Wells 1982: 370), which, following derhoticisation of most of the accents in northeast England, has been lost in non-prevocalic coda positions. The most rigorous critique of these claims is to be found in Maguire (2008), who exhaustively sifts the evidence that has been put forward in support of a full-scale NURSE~NORTH merger in Tyneside, Northumberland and parts of County Durham. Homophony of NURSE and NORTH items is indicated or discussed in some of the earliest scholarly treatments of the accents and dialects of the north-east of England, notably by Ellis (1889). Maguire (2008: 94) credits Ellis with the earliest reliable phonetic attestation of NURSE~NORTH homophony, but points to various non-technical sources (rhymes in dialect poetry and songs, deliberate or accidental misspellings in other forms of writing, renderings of place-names, and so forth) that would suggest that it may have been present in TE as early as the latter half of the eighteenth century. However, as Watt (1998) and Maguire (2008) point out, NURSE and NORTH items are for many modern TE speakers every bit as distinct as they are in other


British accents, so if there ever was a full merger it has subsequently been ‘undone’ again. This poses a potential problem: according to what Labov calls ‘Garde’s Principle’ (Garde 1961, cited in Labov 1994: 602), mergers cannot be reversed ‘by linguistic means’. Accepting that this axiom – ‘once a merger, always a merger’ – is literally inviolable forces us to conclude that the merger can never have been complete in the first place. Such a conclusion would be supported by the fact that hypercorrection (the pronunciation of short as [ʃøːt], for example) does not appear to occur (Wells 1982: 375); had the lexical sets in question coalesced fully, one would not expect speakers to be able to apply phonetic alternations to words from one of the original input lexical sets but not to those from the other. It is possible that varieties in which the two vowel classes were distinct (e.g. RP) served as a model such that TE speakers were provided with direct evidence concerning the original lexical set membership of individual NURSE and NORTH words, though it can be assumed that a contact-based explanation of this sort would not in Garde’s view qualify as ‘linguistic means’. Although the number of NURSE items in the BLA recordings is small, it is worth analysing them more closely to see whether they cast any light on the NURSE~NORTH merger in late nineteenth-century TE. It should be noted that the geographical extent of the merger specified by Maguire (2008: 99) would exclude Crook and possibly also Tunstall; Newcastle and South Shields are well within its limits, however. Figure 3 shows vowel plots (in raw Hz) for each of the four BLA speakers, with the individual NURSE-class words that occur in the reading passage picked out by unfilled triangles. Note that some NURSE tokens are not represented for certain speakers because of difficulties in extracting formant values. 
The grey squares indicate the F1 and F2 values for the few NORTH-class words that are found in the passage (before and two instances of saw). The ranges of values on the axes in each plot have been kept consistent for the purposes of comparison across speakers. Ellipses have been drawn around two subsets of NURSE items, based upon (a) the tendency of Middle English /ɛr/ items (<e(a)r> words like servants, etc.) to develop an open reflex (given as [ɑː] by Maguire 2008: 110) and (b) the items in <or> – work, worthy, etc. – that have converged on /ɔː/ (Maguire 2008: 111). In some informal dialect sources, <e(a)r> forms are indeed often spelled in a way that reflects the relative openness of the vowel (e.g. German as ‘Jarmin’ or learn as ‘larn’; e.g. Todd 1987).


Figure 3. F1/F2 plots (in Hz) of the vowels of NURSE and NORTH items produced by the four BLA speakers. Means for each speaker’s corner vowels FLEECE, GOOSE and TRAP are included for comparison. Ellipses are drawn around the <or> and <e(a)r> subsets of NURSE and around NORTH items.

In spite of the obvious differences between each of the four speakers with respect to the positions of the NURSE and NORTH tokens, and bearing in mind the difficulty of extracting reliable formant values from the BLA materials, it is noticeable that the <er> words do indeed seem to be more open than the <or> tokens, and in the cases of the Ne and Tu speakers are fairly well separated from them. In one <e(a)r> case (heard) the Tu speaker produces a vowel with a markedly front quality; Borgis transcribes the word as [heːd], corroborating our own transcription precisely, and reflecting the vowel’s position on Tu’s formant plot in Figure 3. It is probably to be expected that the formant values for the worthy set are mostly quite low, owing to perseverative coarticulation with the preceding /w/, though the Cr speaker seems to


have a central rather than a back vowel in both <or> and <e(a)r> words, so retraction and raising of the vowel in <or> items is not inevitable. On the basis of limited observations it is rather difficult to say whether there is qualitative overlap between NURSE and NORTH for these speakers, but for the SSh and Tu speakers it is certainly not implausible to suppose that, given a larger sample, one might find that vowels of the two sets would be hard to distinguish on the basis of their F1 and F2 values alone. For the Cr speaker, the three NORTH items are further back than even the mean of GOOSE, which places them where one would expect on the basis of the older descriptions of TE given by Ellis and others. For the Ne speaker, NORTH seems very open, but two of the three tokens in question are the word saw, the vowel of which may take a quality close to [ɐː] (cf. the Tu speaker’s productions). The scarcity of NURSE and NORTH items makes it difficult to argue definitively in favour of a NURSE~NORTH merger, although the auditory qualities of the NURSE <or>-word vowels (work, etc.) are often at or close to [ɔː] (it is possible, on the other hand, that this latter tendency represents the retention of an historical pattern rather than the outcome of a more recent merger). Regardless, given that both older and contemporary published descriptions describe the NORTH vowel in TE as [ɔː], it would presumably mean that for adult male speakers its F1 and F2 values were in the region of 400-600Hz and 700-1000Hz, respectively. This is indeed exactly what we see in the left panel in Figure 4, which shows raw F1/F2 data for NURSE and NORTH in the SED recordings of four older male speakers from Northumberland and Co. Durham. There is clear overlap of the two clouds of points, suggesting that the vowels are not distinct from one another. There are too few NURSE tokens in the SED figures (N = 15) to establish whether there is any correlation between subclass (<e(a)r>, <or>, etc.) 
and the relative height and fronting of individual tokens. However, if we compare the SED speakers’ productions with those for four TLS speakers (all elderly Gateshead women), we can see that there is much greater separation of the NURSE and NORTH vowels than is the case for the SED men. There is also a tendency for the fronter NURSE tokens to be <ir>- or <e(a)r>-class words (e.g. circle, girl, first, thirty, heard), while tokens which overlap with NORTH in respect of the range of F2 values are almost all <or> and <ur> items (e.g. work, world, church) (see further Maguire 2008: 235-243).
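One rough way to quantify the comparison described above is to count, for each NURSE subclass, the share of tokens that fall inside the F1/F2 region expected for [ɔː] in adult male speech (F1 400-600 Hz, F2 700-1000 Hz, the ranges quoted in the text). The Python sketch below does this. The function names and the dictionary layout are assumptions made for illustration, the ranges would need adjusting for female speakers, and a rectangular region is only a crude stand-in for a proper overlap measure; it is not the procedure used to draw the ellipses in the figures.

```python
# Region expected for [ɔː] in adult male speech, using the ranges quoted in the text (Hz).
F1_RANGE = (400.0, 600.0)
F2_RANGE = (700.0, 1000.0)


def in_region(f1, f2, f1_range=F1_RANGE, f2_range=F2_RANGE):
    """True if a raw (F1, F2) measurement lies inside the rectangle (bounds inclusive)."""
    return f1_range[0] <= f1 <= f1_range[1] and f2_range[0] <= f2 <= f2_range[1]


def share_in_region(tokens_by_class):
    """Map each class label to the proportion of its tokens inside the region.

    tokens_by_class: dict such as {"or": [(F1, F2), ...], "e(a)r": [...]},
    with raw formant values in Hz. Classes with no tokens are left out.
    """
    shares = {}
    for label, tokens in tokens_by_class.items():
        if not tokens:
            continue  # no proportion can be computed for an empty class
        hits = sum(1 for f1, f2 in tokens if in_region(f1, f2))
        shares[label] = hits / len(tokens)
    return shares
```

A high share for <or> and <ur> items alongside a low share for <ir> and <e(a)r> items would be consistent with the subclass pattern reported here, though with so few tokens any such proportion should be read with caution.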

Figure 4. F1/F2 plots (in Hz) of the vowels of NURSE and NORTH items produced by (left) four SED speakers (older males from Earsdon, Ebchester, Ellington and Heddon-on-the-Wall) and (right) four TLS speakers (older females from Gateshead, as per Figure 2). Grand per-group means for the corner vowels FLEECE, GOOSE and TRAP are included for comparison. Ellipses are drawn around NURSE and NORTH items.

On the basis of these strands of evidence we therefore have principled grounds for thinking that in late nineteenth-century TE the vowel of <or> items like work, worthy, etc. was indistinguishable, or only marginally distinguishable, from that of


NORTH-class words, but that the vowel of <e(a)r> or <ir> words like servants, heard, girl, etc. was either more open or more fronted – or both – than the vowels found in other NURSE-class words. Given that in contemporary TE the NURSE vowel can still be realised anywhere along a continuum from [ɔː] through [ɜː] to [ɛː] or [øː], it appears that the situation has changed rather little with respect to this variable. The [ɔː] variant is now rare, however, with central and front realisations accounting for the great majority of NURSE pronunciations in Tyneside, and indeed the northeast more widely; NURSE in accents of Teesside, the third major conurbation in northeast England after Tyneside and Wearside, is generally as front as [ɛː], for instance. In line with Maguire (2008), we conclude from this analysis that to talk of there having been a NURSE~NORTH merger in TE would be a misleading oversimplification.

5 Conclusion

While there are obvious limitations to the data analysed here, it has been possible to form an impression from the BLA recordings of what the phonology and phonetics of TE were in the late nineteenth and early twentieth centuries, and to gain from them some intriguing indications of the sociolinguistic patterns that obtained in the variety around that time. For a subset of the vowel variables examined, the SED and TLS data provide useful support by demonstrating how the production patterns of men and women born at around the same time as Doegen and Brandl’s POWs can be used as a baseline against which to compare the BLA data. The distributions of vowel tokens on the F1~F2 plane reveal a good deal about the relative stability of vowel targets within and across speakers, and within their lexical and phonemic classes. They also allow us to corroborate some of the observations of TE that were made by phoneticians and dialectologists around the time the BLA speakers were recorded.
Almost 80 years have elapsed since Borgis performed his detailed auditory analysis of the BLA recordings, and in the centenary year of the beginning of the First World War it seems fitting to return to the samples to glean new insights into the properties of these voices from the past.

Acknowledgments

We are grateful to Warren Maguire for useful discussion of some of the issues raised in this chapter, and for pointing us towards Borgis’s monograph. We also thank Joan Beal, Kerry Bossons, Karen Corrigan, Natalie Fecher, Rachel Gill, Tyler Kendall, Paul Kerswill, Carmen Llamas, Katy McGahan, Adam Mearns, Dave Parsons, Jonnie Robinson, Simon Rooks and Eivind Torgersen for their advice and assistance, as well as Raymond Hickey for feedback on several points in the chapter.

References


Beal, Joan C. 1985. Lengthening of a in Tyneside English. In: Roger Eaton, Olga Fischer, Willem Koopman and Frederike van der Leek (eds), Papers from the fourth International Conference on English Historical Linguistics, Amsterdam. Amsterdam: John Benjamins, pp. 31-44.

Beal, Joan C. 1993a. Geordie accent and grammar. In: T. Namba (ed.). The Society and Culture of England. Tokyo: Seibundo.

Beal, Joan C. 1993b. The grammar of Tyneside and Northumbrian English. In: Lesley Milroy and James Milroy (eds). Real English: The Grammar of English Dialects in the British Isles. London: Longman, pp. 187-213.

Beal, Joan C. 2000. From George Ridley to Viz: popular literature in Tyneside English. Language and Literature 9(4): 343-359.

Beal, Joan C. 2008a. English dialects in the north of England: phonology. In: Bernd Kortmann and Clive Upton (eds). Varieties of English, vol. I: The British Isles. Berlin: Mouton de Gruyter, pp. 122-144.

Beal, Joan C. 2008b. English dialects in the north of England: morphology and syntax. In: Bernd Kortmann and Clive Upton (eds). Varieties of English, vol. I: The British Isles. Berlin: Mouton de Gruyter, pp. 373-403.

Beal, Joan C., Lourdes Burbano-Elizondo and Carmen Llamas 2012. Urban North-Eastern English: Tyneside to Teesside. Edinburgh: Edinburgh University Press.

Borgis, Karl-Heinz 1936. Der Sprachgebrauch in Nord-Durham: mit Berücksichtigung von Süd-Durham und Südost-Northumberland [Speech usage in North Durham: with consideration of South Durham and South-East Northumberland]. In: Alfred Müller and Karl-Heinz Borgis (eds). Der Sprachgebrauch in den Dialektgebieten von Südost-Yorkshire und Nord-Durham [Speech Usage in the Dialect Areas of South-East Yorkshire and North Durham]. Palaestra 204. Leipzig: Mayer and Müller GmbH, pp. 1-99.

Defoe, Daniel 1724-7 [1778]. A Tour through the Island of Great Britain. Vol. 3. Eighth edition. London: Everyman’s Library. Online resource: http://bit.ly/1hswpLf [accessed 18 March 2015].

Docherty, Gerard J. and Paul Foulkes 1999. Newcastle upon Tyne and Derby: instrumental phonetics and variationist studies. In: Paul Foulkes and Gerard J. Docherty (eds). Urban Voices: Accent Studies in the British Isles. London: Arnold, pp. 47-71.

Docherty, Gerard J. and Paul Foulkes 2005. Glottal variants of (t) in the Tyneside variety of English: an acoustic profiling study. In: William Hardcastle and Janet Mackenzie Beck (eds). A Figure of Speech: A Festschrift for John Laver. London: Lawrence Erlbaum Associates, pp. 173-199.

Ellis, Alexander J. 1889. On Early English Pronunciation, vol. 5: The Existing Phonology of English Dialects, Compared with that of West Saxon Speech. London: Trübner and Co.

Garde, Paul 1961. Réflexions sur les différences phonétiques entre les langues slaves [Reflections on the phonetic differences between the Slavic languages]. Word 17: 34-62.

Grierson, George A. (ed.) 1927. Linguistic Survey of India. 11 vols. Delhi: Motilal Banarsidass.

Griffiths, Bill 1999. North-Eastern Dialect: Survey and Word-List. Newcastle upon Tyne: Centre for Northern Studies, University of Northumbria.

Haddican, William, Paul Foulkes, Vincent Hughes and Hazel Richards 2013. Interaction of social and linguistic constraints on two vowel changes in northern England. Language Variation and Change 25: 371-403.

Hickey, Raymond 2004. A Sound Atlas of Irish English. Berlin: Mouton de Gruyter.

Hickey, Raymond (ed.) 2010. The Handbook of Language Contact. Malden, MA: Wiley-Blackwell.

Hinz, Uta 2006. Gefangen im Großen Krieg: Kriegsgefangenschaft in Deutschland, 1914-1921 [Captured in the Great War: Prisoners of War in Germany, 1914-1921]. Essen: Klartext Verlag.

Howell, Robert B. 1987. Tracing the origin of uvular R in the Germanic languages. Folia Linguistica Historica 20(7/2): 317-350.

Hughes, Arthur, Peter Trudgill and Dominic Watt 2012. English Accents and Dialects: An Introduction to Social and Regional Varieties of English in the British Isles. Fifth edition. London: Hodder Education.

Jones, Daniel 1911. English: Tyneside dialect (Northumberland). Le maître phonétique 26: 184.

Labov, William 1972. Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.

Labov, William 1994. Principles of Linguistic Change, vol. 1: Internal Factors. Oxford: Blackwell.

Maguire, Warren N. 2008. What is a Merger, and can it be Reversed? The Origin, Status and Reversal of the ‘NURSE-NORTH Merger’ in Tyneside English. PhD thesis, University of Newcastle upon Tyne. Online resource: http://hdl.handle.net/10443/493 [accessed 18 March 2015].

Mahrenholz, Jürgen-Kornelius 2008. Ethnographic audio recordings in German prisoner of war camps during the First World War. In: Dominiek Dendooven and Piet Chielens (eds). World War I: Five Continents in Flanders. Yprès: Editions Lannoo, pp. 161-167.

Mess, Henry A. 1928. Industrial Tyneside: A Social Survey. London: Ernest Benn, Ltd.

Milroy, James 1995. Investigating the Scottish Vowel Length Rule in a Northumbrian dialect. Newcastle and Durham Working Papers in Linguistics 3: 187-196.

O’Connor, Joseph D. 1947. The phonetic system of a dialect of Newcastle-upon-Tyne. Le maître phonétique 87: 6.

Oltmer, Jochen (ed.) 2006. Kriegsgefangene im Europa des Ersten Weltkriegs [Prisoners of War in the Europe of the First World War]. Paderborn: Schöningh.

Orton, Harold 1929. Northumberland dialect research: First report. Proceedings of the University of Durham Philosophical Society 8: 127-135.

Orton, Harold 1930. The dialects of Northumberland. Transactions of the Yorkshire Dialect Society 5(31): 14-25.

Orton, Harold 1933. The Phonology of a South Durham Dialect: Descriptive, Historical, and Comparative. London: Kegan Paul, Trench, Trübner & Co.

Orton, Harold 1962. Survey of English Dialects (A): Introduction. Leeds: E.J. Arnold & Son.

Orton, Harold and Wilfrid J. Halliday 1962. Survey of English Dialects (B): the Basic Material, vol. 1, the Six Northern Counties and the Isle of Man. Leeds: E.J. Arnold & Son.

Påhlsson, Christer 1972. The Northumbrian Burr: a Sociolinguistic Study. Lund: Gleerup.

Pellowe, John 1976. The Tyneside Linguistic Survey: aspects of a developing methodology. In: Wolfgang Viereck (ed.). Sprachliches Handeln - Soziales Verhalten: Ein Reader zur Pragmatiklinguistik und Soziolinguistik [Speech Acts - Social Behaviour: A Reader on Pragmatic Linguistics and Sociolinguistics]. Munich: Wilhelm Fink, pp. 203-217, 365-367.

Pellowe, John, Graham Nixon, Barbara Strang and Vince McNeany 1972. A dynamic modelling of linguistic variation: the urban (Tyneside) Linguistic Survey. Lingua 30: 1-30.

Pietsch, Lukas 2005. ‘Some do and some doesn’t’: verbal concord variation in the north of the British Isles. In: Bernd Kortmann, Tanja Herrmann, Lukas Pietsch and Susanne Wagner (eds). A Comparative Grammar of English Dialects: Agreement, Gender, Relative Clauses. Berlin: Mouton de Gruyter, pp. 125-210.

Rydland, Knut 1998. The Orton Corpus: A Dictionary of Northumbrian Pronunciation, 1928-1939. Oslo: Novus Press.

Simpson, David 2009. Geordie Origins: Dialect Origins. Roots of the Region. Online resource: http://www.englandsnortheast.co.uk/GeordieOrigins.html [accessed 8 February 2014].

Thomas, Erik R. 2011. Sociophonetics: An Introduction. London: Palgrave Macmillan.

Thomas, Erik R. and Tyler Kendall 2007. NORM: The Vowel Normalization and Plotting Suite. Online resource: http://ncslaap.lib.ncsu.edu/tools/norm/ [accessed 18 March 2015].

Todd, George 1987. Todd's Geordie Words and Phrases: An Aid to Communication in Tyneside and Thereabouts. Rothbury: Butler Press.

van Emden, Richard 2009. Prisoners of the Kaiser: The Last POWs of the Great War. Barnsley: Pen & Sword Books.

Viereck, Wolfgang 1965. Specimen passages of the speech of Gateshead-on-Tyne. Le maître phonétique 123: 6-7.

Viereck, Wolfgang 1966. Phonematische Analyse des Dialekts von Gateshead-upon-Tyne/Co. Durham [Phonematic Analysis of the Dialect of Gateshead-upon-Tyne/Co. Durham]. Hamburg: Cram, de Gruyter & Co.

Viereck, Wolfgang 1968. A diachronic-structural analysis of a northern English urban dialect. Leeds Studies in English 2 (new series): 65-79.

Vincent, David 1993. Literacy and Popular Culture: England 1750-1914, second edition. Cambridge: Cambridge University Press.

Wales, Katie 2006. Northern English: a Social and Cultural History. Cambridge: Cambridge University Press.

Watt, Dominic 1998. Variation and Change in the Vowel System of Tyneside English. PhD thesis, University of Newcastle upon Tyne. Online resource: http://hdl.handle.net/10443/350 [accessed 18 March 2015].



Watt, Dominic 2000. Phonetic parallels between the close-mid vowels of Tyneside English: are they internally or externally motivated? Language Variation and Change 12(1): 69-101.

Watt, Dominic 2002. ‘I don't speak with a Geordie accent, I speak, like, the Northern accent’: contact-induced levelling in the Tyneside vowel system. Journal of Sociolinguistics 6(1): 44-63.

Watt, Dominic and Anne H. Fabricius 2002. Evaluation of a technique for improving the mapping of multiple speakers’ vowel spaces in the F1~F2 plane. Leeds Working Papers in Linguistics and Phonetics 9: 159-73. Online resource: http://www.leeds.ac.uk/arts/download/1346/watt2002 [accessed 18 March 2015].

Watt, Dominic, Carmen Llamas and Daniel Ezra Johnson 2014. Sociolinguistic variation on the Scottish-English border. In: Robert Lawson (ed.). Sociolinguistics in Scotland. London: Palgrave Macmillan, pp. 79-102.


Wright, Joseph 1898. The English Dialect Dictionary, Vol. 1 (A-C). London: Henry Frowde.

Wright, Joseph 1905. The English Dialect Grammar. Oxford: Henry Frowde. Online resource: http://bit.ly/1aHsFCw [accessed 18 March 2015].

The Parable of the Prodigal Son (Luke 15: 11-32)

The text below is based upon the text spoken by Arthur Roper, the BLA Newcastle speaker. Misread sections have been omitted or amended slightly, and Roper’s word grub has been replaced by food. The punctuation is our own. Note that the texts read by the four BLA speakers are all slightly different.

There was a man who had two sons. The younger of them said to his father, “Give me that part of your goods that belongs to me.” So the father gave him his share. Not many days afterwards the young man got all his belongings together and went away into a far country. There he wasted all that he had. When he had spent everything, a great famine came over the country and he began to be in want. Before long he had to take any work he could get, and was glad when he found a place with a man of that country. This man sent him into the fields as a swineherd. He had no place to sleep, and he saw others at their meals but had nothing himself to eat. Many a time he would have been glad to fill his belly with the husks that he fed the swine with, but even such food the master begrudged him. At last, when the young man came to think over what he had done, he said, “Ah! How many of my father’s hired servants have all the food they want, and even more than they want? And here am I, dying of hunger! High time is it that I should go back to my father. This very day I will start for home. I will go to my father and will say to him, ‘Father, I have sinned against heaven and against you. I am no longer worthy to be called your son. Make me one of your hired servants.’” And he arose and came to his father. When he was yet a great way off, his father saw him and pitied him, and ran to him and put his arms around his neck, and kissed him. And the young man said to his father, “Father, I have sinned against heaven and against you, and am no longer worthy to be called your son. Make me one of your hired servants.” But the father said to his servants, “Bring out the best clothes and put them on him, and put a ring on his hand and boots on his feet. And bring out the fatted calf and kill it, and let us eat and be merry. For this my son was dead, and is alive again. He was lost and is found.” Now, his elder son was working in the fields and as he came near the house he heard music and dancing, so he called one of the servants and asked, “What does this mean?” And the servant said to him, “Your brother has come home again, and your father has killed the fatted calf because he has got him back again safe and sound.” And then he was angry, and would not come in, so his father came out and asked him to come in. And he said to his father, “These many years have I served you, and have always done what you told me to do. But you never gave me even a kid so that I might have a feast with my friends. But as soon as my brother comes, who has spent all his money on wild living, you have the fatted calf killed for him.” And then the father said to him, “Dear son, you were always with me, and all that I have is yours. But this your brother was dead, and is alive again. He was lost and is found.”


8 Scotland – Glasgow and the Central Belt

Jane Stuart-Smith and Eleanor Lawson

1 Introduction

The vernacular of the Scottish Central Belt, stretching between Edinburgh and Glasgow, where the majority of the Scottish population live, is thought to have undergone radical changes during the course of the twentieth century (e.g. Johnston 1997; Macaulay 2014). This chapter presents results from an ongoing study of what we believe to be the earliest existing recordings of Glasgow and Central Belt Scottish English, namely those of a subset of young men from this area who were recorded in German Prisoner of War camps during World War I by Wilhelm Doegen, and which now form part of the British Library’s Berliner Lautarchiv British and Commonwealth Recordings collection. Impressionistic auditory analysis was carried out on the speech of eight speakers, with particular emphasis on T-glottalling and rhotics. The speech of three of the western Central Belt men, two speakers from Glasgow and one from the countryside outside the city, was further subjected to acoustic analysis of specific features, particularly the quality and duration of the stressed monophthongs and /ai/, and word-initial /l/; the results of the acoustic analysis are given in an accompanying paper, and are not discussed further here (see Stuart-Smith et al forthcoming).

The chapter has two aims. After discussing the materials and our methods for analysing the recordings (Sections 2 and 3), we first give a very brief phonetic sketch of early Scottish English as attested in these speakers, with a more detailed outline of T-glottalling and the realization of /r/ (Section 4). Our second aim (Section 5) is to consider the evidence for real-time change across the century, by focussing on coda /r/. In order to do this, we compare the new results from the Berliner Lautarchiv speakers with findings from previous and current projects on phonological variation and change in Glasgow and the Central Belt. These reveal that the derhoticisation of coda /r/ observed in sociolinguistic surveys since the late 1970s appears to be rather more gradual than has been supposed, though impressions of the progress of derhoticisation also differ depending on the kind of comparison that can be made. This study of coda /r/ constitutes the theoretical contribution of this chapter: our appreciation of sound change depends on the resolution of the time period with which we view it (cf. Milroy 2003; Gregersen 2014).

2 Materials and resources

This chapter focuses on what seem to be the earliest extant recordings of vernacular Scottish English. Since Aitken (1984), Scottish English has generally been considered to be a sociolinguistic continuum of varieties, ranging from Scottish Standard English (SSE), spoken by middle-class speakers, to Scots vernacular, spoken by working-class speakers (e.g. Macafee 1997; Stuart-Smith 2003; Corbett and Stuart-Smith 2012). In rural areas speakers often switch between SSE and the local dialect, but in urban areas, and particularly the Central Belt of Scotland, it is more common for speakers to drift up and down the continuum depending on context and interlocutor (Stuart-Smith 2003).

Until recently urban vernacular varieties of Scottish English have been highly stigmatized, with the result that searches for recordings made before the late 1960s have yielded very little indeed. The Sound Archive at the School of Scottish Studies, University of Edinburgh, mainly has traditional rural dialect recordings and songs. The Scottish Screen Archive of the National Library of Scotland has a substantial collection of films of different kinds, from news reports to locally-made films, recording different aspects of life in the Central Belt, but vernacular speech hardly occurs before the 1970s. Films made between the 1890s and the 1930s are silent. In the 1940s the small number of films with sound, presenting reports on events or topics in Glasgow or in the Central Belt, have a musical soundtrack and an RP-speaking commentator. For an example, see film number 0268, made by the Glasgow Corporation Housing Department: ‘Progress Report: A survey of Municipal Housing Activity in Glasgow. New council housing at Knightswood, Cranhill, Pollock and Tollcross’: http://ssa.nls.uk/film/0268. Films from the 1950s and 1960s are very similar, except that a very few now have Scottish Standard English commentators, whose accents accommodate strongly to RP (see Johnston’s comments on SSE, e.g. Johnston 1985, 1997). The situation is similar for the British Library’s Sound Archive, which has a handful of rural Scottish dialect recordings and songs for this period, with the important exception of the Berliner Lautarchiv corpus, from which the sample discussed here is drawn.
2.1 The Berliner Lautarchiv British and Commonwealth Recordings

In 2008 the British Library acquired 66 dialect recordings, digitized versions of 821 shellac discs, from a much larger sound archive held by the Humboldt University in Berlin, which had been compiled between 1915 and 1938 by the German language teacher and sound pioneer, Wilhelm Doegen (for more information about Doegen, see e.g. The Doegen Records Web Project). The recordings which are analysed here are from a series of visits made by Doegen and a Professor of English, Alois Brandl, to Prisoner of War camps in various parts of Germany between 1916 and 1917, in order to make recordings of soldiers’ languages and dialects, in the form of recitations and songs. A selection of the recordings, including all those discussed here, are openly available on the British Library’s website: http://sounds.bl.uk/Accents-and-dialects/Berliner-Lautarchiv-British-and-Commonwealth-recordings

2.2 Speaker sample for early Central Belt Scottish English (1916-17)

The Scottish recordings comprise 17 speakers, nine from northern and southern Scotland (Caithness, Aberdeen, Fife, Perth and Kinross; Ayrshire, Berwickshire), and the eight men considered here from Glasgow and the Central Belt; there is no recording from Edinburgh. The details of our speaker sample are given in Tables 1 and 2; Map 1 shows the location of their place of birth/residence before the war.

| Speaker | Provenance | Date of birth | Date of recording | Age | BL shelfmark |
|---|---|---|---|---|---|
| William Bryce (1) | Glasgow | 1891 | 21/07/1916 | 25 | C1315/1/444 |
| John Johnstone (2) | Maryhill | 1896 | 15/06/1917 | 21 | C1315/1/639 |
| William Cooper (3) | Kirkintilloch | 1889 | 10/05/1916 | 27 | C1315/1/401 |
| James Crawford (4) | Blantyre | 1884 | 26/09/1917 | 33 | C1315/1/700-701 |
| William Lothian (5) | Hamilton | 1887 | 10/05/1916 | 29 | C1315/1/398 |
| Hugh Fulton (6) | Newarthill | 1883 | 03/07/1917 | 34 | C1315/1/671 |
| Thomas Sneddon (7) | Falkirk | 1896 | 03/07/1917 | 21 | C1315/1/651-2 |
| Thomas Finnie (8) | Bathgate | 1887 | 14/06/1917 | 30 | C1315/1/634-6 |

Table 1. Provenance, age and details of the recordings for the eight Central Belt Scottish English men, whose speech was recorded in 1916 and 1917.

Map 1. Map of the Central Belt of Scotland showing the provenance of the 8 speakers from the Berliner Lautarchiv corpus. Speakers can be identified by their numbers, which are also given in Tables 1 and 2.

| Speaker | Education | Stated occupation | Mother tongue | Father’s provenance | Mother’s provenance |
|---|---|---|---|---|---|
| William Bryce (1) | public school | Soldier | "Scottish" | Glasgow | Glasgow |
| John Johnstone (2) | boarding school | Accountant | "English" | Dumfries | Stirlingshire |
| William Cooper (3) | elementary school | Tailor | "Scots" | not known | not known |
| James Crawford (4) | public school | Salesman | "Scottish" | Dundoland | Glasgow |
| William Lothian (5) | elementary school | Soldier | "Scottish" | Fife | Lanarkshire |
| Hugh Fulton (6) | public school | Vanman | "Scottish" | Newarthill | Newarthill |
| Thomas Sneddon (7) | public school | Farmer | "Scottish" | Falkirk | Falkirk |
| Thomas Finnie (8) | elementary school | Shoemaker | "Scots" | Bathgate | Slamanan |


Table 2. Education, occupation, stated dialect, and provenance of parents, for the overall sample of eight Central Belt Scottish men, from whom recordings were made in 1916 and 1917.

The sample consists of eight men, six from the western Central Belt and two from the east, Thomas Sneddon and Thomas Finnie from Falkirk and Bathgate respectively. Two are from Glasgow: William Bryce from the city of Glasgow, who indeed sounds ‘Glaswegian’, and John Johnstone from Maryhill to the northwest of the city centre. All of the recordings can be accessed via the British Library website, by searching for the British Library (BL) shelfmark given in Table 1, at the Advanced Search page of the Sound Archive. A few personal notes were collected from each speaker, and can be seen beneath the recording on the British Library webpage.

All of the men were in their twenties and thirties at the time of the recordings (average age is 29). All state that they are of Protestant religion, and have had at least elementary education (until the age of 10 or 11); they can all read and write (in ‘English’). Interestingly, all bar John Johnstone refer to their spoken language as ‘Scottish’ or ‘Scots’, but to their written language as ‘English’; Bryce and Cooper even say that English was their ‘additional language’. In terms of social background, Johnstone seems to be aspiring middle-class. He reports that he was educated at a boarding school in Glasgow, and describes himself as an accountant, having previously been a clerk in a railway company. We include him in our sample because his recording is clearly in Scots.

2.3 The text: reading/reciting a biblical passage

Various short recordings were made from each speaker, such as counting in Scots, singing songs, and reading a short passage. Here we present analyses of the men reading the passage, the Parable of the Prodigal Son (Luke chapter XV, verses 11-32). These recordings are presented as reading a text. The few photographs of the recordings in progress, as shown in Figure 1, suggest that speakers may have had a text in front of them as they spoke (Doegen is holding the text above the recording horn for the speaker). They also had a rather diverse audience.


Figure 1. Wilhelm Doegen (right) recording a speaker in the Prisoner of War camp in Wahn (near Cologne) in October 1916, with the assistance of Alois Brandl (left of speaker). The picture was sent by Doegen to Brandl in 1925 with an inscription recalling their recordings together (Wissenschaftliche Sammlungen an der Humboldt-Universität, Berlin: http://www.sammlungen.hu-berlin.de/dokumente/8247/).

However, it is clear from the actual recordings that speakers were as much reciting or recalling biblical recitation from memory as they were reading, because each of the recordings discussed here is slightly different in terms of lexis and phrasing. For example, in our eight speakers’ recordings, the ‘fatted calf’ that is killed for the feast for the prodigal son appears as: the ‘fatted calf’ [kaf] (Maryhill, Blantyre), the ‘fatted cauf’ [kɔf] (Hamilton, Newarthill, Bathgate, Falkirk), the ‘muckle fat cauf’ [kɔf] (Kirkintilloch), and the ‘fatted coo’ [kʉ] (Glasgow). That the men might have been able to recite the passage from memory is also likely given that all state that they are ‘of Protestant religion’, which in Scotland constitutes different versions of reformed Protestantism, all of which are strongly centred on biblical teaching. Their likely attendance at church Sunday School, as well as daily school, would have required rote learning of key bible passages, such as this one. Certainly all produce it fairly fluently, and with the intonation and rhythm of a recitation.

An example of the text, as recorded by Bryce from Glasgow, is given in the Appendix. The first impression of this and the other recitations, by contrast with contemporary Glaswegian (and Central Urban Scots), is the density of Scots lexis, which is reminiscent of the ‘broad’ Scots from Aitken’s model of Scottish speech (Corbett and Stuart-Smith 2012). Doegen was keen to document dialects. There is a possibility that even these speakers, who seem to have been largely vernacular speakers, also enhanced their use of Scots forms for the recitation. Reading here was not intended to elicit more formal or standard speech.



To sum up, it is clear that these recordings represent a particular style of speech which is certainly recitation, but may also include reading, for Doegen and his assistants (and possibly the others present), with likely emphasis on the production of dialect forms. As a result our inferences about the state of early Scottish English are made with caution, as are our comparisons with findings from later recordings.

3 Method

3.1 Preparation of the sound files for analysis

We used the methods and resources of an ongoing project on real-time change in Glaswegian dialect, Sounds of the City (http://soundsofthecity.arts.gla.ac.uk/), to facilitate the analysis of the Berliner Lautarchiv recordings. This project uses electronic corpus software, LABB-CAT (Fromont and Hay 2012; http://labbcat.sourceforge.net/), developed for the Origins of New Zealand English project, to store, manage, search, and partly analyse our data. The eight sound files were orthographically transcribed using Praat (Boersma and Weenink 2013), with reference to a transcription protocol for vernacular Scottish English, which had already been devised for the existing project. In Praat, time-aligned orthographic transcriptions were made for each recording, breaking the speech down into shorter utterances, usually intonational phrases which tended to align with major or minor syntactic boundaries. Transcripts and wav files were then uploaded to LABB-CAT, which allowed us to access and browse the time-aligned text/sound files, and at the same time, generate an automatic phonemic transcription. After correction and addition of any new lexical items to the LABB-CAT dictionary, the recordings were then force aligned using the HTK routine in LABB-CAT, resulting in an initial automatic segmentation of the waveform against the phonemic transcription for all the recordings. This gave us a set of text/sound files which could be electronically searched using orthographic or phonemic search strings, against other parameters such as lexical stress (generated by LABB-CAT’s link with the CELEX databases).

Given that the materials used are digitised versions of audio recordings made around a hundred years ago on shellac discs, the general quality of the recordings is poor, and some recordings have poorer sound quality than others, e.g. the Hamilton and Newarthill recordings are particularly muffled. In order to remove high-frequency hiss, recordings were lowpass filtered, removing acoustic energy above 7 kHz; thereafter noise-cancelling was carried out using Audacity (Audacity Project 2005). The filtered and noise-cancelled sound files were used for auditory transcription. The particular features reported here were at least partly determined by the difficulties imposed by the quality of the recordings. For example, fricatives were hard to hear and patterns of fricative energy can also be difficult to discern spectrally.

3.2 Auditory phonetic analysis



T-glottalling

The realization of /t/ in sites where T-glottalling (Wells 1982) is found in contemporary Scottish vernacular was transcribed by JSS, a native speaker of near-RP. All possible instances of /t/ were coded in the following contexts (see e.g. Stuart-Smith et al 2007): (1) occurring in intervocalic position (e.g. fatted), (2) in word-final position before a word beginning with a vowel (e.g. let us), and (3) in word-final position which was also phrase-final position (e.g. feet#). A range of auditory variants were identified and categorized broadly as [t], which included released and apparently unreleased denti-alveolar voiceless stops as well as a few apparently voiced plosives (e.g. in pity and but as, for some speakers), and [ʔ], which covered what sounded like glottal stops, but also included apparently completely deleted stops (especially in word-final/phrase-final position) and stops with nasal release, e.g. in hadn’t. A total of 202 stops, on average 25 per speaker, were transcribed to assess T-glottalling. Transcription was carried out from the waveform using the Praat TextEditor view; the poor sound quality meant that auditory impressions were also supplemented by inspection of the spectrogram in order to assess acoustic characteristics such as reduction in amplitude reflecting stop closure, or slowing or irregularity in periodicity reflecting creaky voicing for a glottal stop.

Rhotics

Auditory analysis of all rhotics was carried out by EL, a native speaker of Central Belt Scottish Standard English. Auditory coding of tokens of Scottish /r/ can be difficult enough, even with good audio quality (see Stuart-Smith 2007). In order to improve the reliability of auditory coding, two stages of auditory coding were undertaken using Praat: (1) a transcription tier annotation, (2) a randomised re-listening to, and recoding of, each /r/ token, using the Praat Multiple Forced Choice interface, followed by a comparison of the results of these two coding phases.

/r/ was transcribed as one of five variants (a trill, tap, approximant, derhoticised /r/, or zero /r/), or was labelled as unclear/uncategorisable. Given the quality of the recordings, more subtle categorisation of /r/, e.g. identification of devoicing or retroflexion, was not possible. Phonotactic context was also transcribed using conventional notation, including syllable position, whether /r/ was preceded or followed by a vowel or a consonant, whether word-final /r/ was immediately followed by a syllable, word or phrase boundary, and whether onset or coda /r/ occurred in a stressed or unstressed syllable. 708 tokens of onset, intervocalic and coda /r/ were identified overall (around 90 for each of the 8 speakers).

Because the audio quality of the recordings was poor, each token of /r/ plus some context (the words immediately flanking it) was extracted for randomised relistening and recategorisation. Using the Praat Multiple Forced Choice interface, the sound files were listened to again in random order and the /r/ classified by EL into one of the five variants listed above (multiple replays were allowed). 53% of the tokens of /r/ were classified as belonging to the same variant category during initial transcription and randomised replay; 5% were within one category of one another, e.g. categorised as derhoticised and no /r/, or tap and trill¹. 21% were classified as variants more than one category apart, most commonly approximant and derhoticised, or tap and derhoticised. The remaining 20% were not classified in one or both of the classification scenarios.

Agreement of less than 58% by the same auditory coder, along with 20% of tokens unclassifiable in one or other classification scenario, is in part a reflection of the poor audio quality of the recordings. The low agreement rate also reflects the complexity of coding /r/ auditorily in connected speech, where the quality of /r/ is affected by preceding and following segments. Finally, this low rate of agreement also reflects the difficulty of categorising /r/ variants in a variety of English where /r/ is weakening in postvocalic position and auditory cues have become ambiguous (see Stuart-Smith 2007).

During annotation, the spectrogram was also inspected to improve the accuracy of identification of the rhotic variants used. The spectral characteristic most frequently associated with rhoticity, lowering of the third formant, could not always be observed, due to a combination of the lower acoustic intensity of F3 during production of a rhotic and masking noise in the acoustic signal. Some spectral cues for rhoticity were observable, however; the auditory impression of a tapped /r/ often coincided with a brief, but substantial, lowering of the energy in the first and second formants on the spectrogram, i.e. corresponding to the constriction phase of the tap, see Figure 2.
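The comparison of the two coding passes described in this section (initial transcription versus randomised relistening) can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function name `compare_passes`, the label strings, and the example tokens are all hypothetical, and the set of "adjacent" category pairs is an assumption based only on the two examples the text gives (tap/trill and derhoticised/zero).

```python
from collections import Counter

# Hypothetical label set for the five variants plus an "unclear" code.
VARIANTS = {"trill", "tap", "approximant", "derhoticised", "zero"}
UNCLEAR = "unclear"

# Assumption: only the two pairs named in the text count as
# "within one category of one another".
ADJACENT = {
    frozenset({"trill", "tap"}),
    frozenset({"derhoticised", "zero"}),
}


def compare_passes(first_pass, second_pass):
    """Return the proportion of tokens in each agreement class.

    Each argument is a list of variant labels, one per /r/ token,
    in the same token order for both coding passes.
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must code the same tokens")
    if not first_pass:
        raise ValueError("no tokens to compare")

    allowed = VARIANTS | {UNCLEAR}
    counts = Counter()
    for a, b in zip(first_pass, second_pass):
        if a not in allowed or b not in allowed:
            raise ValueError(f"unknown label in pair: {a!r}, {b!r}")
        if a == UNCLEAR or b == UNCLEAR:
            counts["unclassified"] += 1   # not classified in one or both passes
        elif a == b:
            counts["same"] += 1           # identical category both times
        elif frozenset({a, b}) in ADJACENT:
            counts["adjacent"] += 1       # within one category
        else:
            counts["distant"] += 1        # more than one category apart

    total = len(first_pass)
    keys = ("same", "adjacent", "distant", "unclassified")
    return {key: counts[key] / total for key in keys}


# Invented tokens, purely to show the call; these are not the chapter's data.
pass_one = ["tap", "trill", "approximant", "derhoticised", "unclear", "tap"]
pass_two = ["tap", "tap", "derhoticised", "zero", "tap", "tap"]
for label, share in compare_passes(pass_one, pass_two).items():
    print(f"{label}: {share:.0%}")
```

Keeping the adjacency pairs in an explicit set, rather than deriving them from a linear strength scale, makes it easy to test the alternative raised in footnote 1, where approximant/derhoticised and tap/derhoticised would also count as one category apart.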

Figure 2. Thomas Sneddon from Falkirk saying “brother has”. Black ellipses show momentary interruptions of the acoustic signal, corresponding to the tapped /r/ variants at the beginning and end of “brother”.

Likewise, trilled /r/ often corresponded to multiple brief interruptions in F1 and F2 in the acoustic signal (see Figure 3).

1 It could be argued that categories such as approximant and derhoticised or tap and derhoticised are one category apart also – these differences made up the majority of the remaining tokens where there was no agreement between the two stages of classification.


Figure 3. William Bryce from Glasgow saying prepausal “faither”. Black ellipse shows two momentary interruptions in the acoustic signal, corresponding to a trilled /r/.

Conversely, a lack of these features in the acoustic signal could often be used to verify the auditory impression of a non-tapped/non-trilled approximant variant. Where a derhotic variant was identified, it often corresponded with straight F1 and F2 or a rising F2 trajectory.

4 A sketch of early Central Belt Scottish English

4.1 General impressions

Prosody

The recordings are strongly affected by the style of the speech elicitation, namely the recitation or reading of a passage. Interestingly, whilst Central Belt Scottish English shows a classic isogloss in the direction of terminal pitch patterns, which fall in the east and rise in the west (e.g. Macafee 1983: 36-37; Cruttenden 1997), in these recordings, the ends of sentences, and many clauses, show falling intonation, for all speakers, irrespective of their provenance. Cruttenden (2007) found that the Glaswegian speaker in his study also had final rises in conversation and falls when reading a text. Both Grant’s contemporary (1912) and McAllister’s later (1938) elocutionary texts for Scottish and especially western Central Belt Scottish speakers urge speakers to use Anglo-English falling pitch for declarative statements. The need for explicit models, practice routines, and both commentators’ brief and disparaging comments about local intonation indicate that Central Belt Scottish English in the first half of the twentieth century typically showed final rises. The predominant use of falling terminals by all the western Central Belt men indicates that they may have used the visual text prompt which we can see in the photograph in Figure 1, even though they ended up selecting slightly different words. It also points to the presence of distinctly different intonational models for literacy and spoken language, the former probably continuing that of (Southern) Anglo-English through the emerging Scottish Standard which was well established in education (e.g. Corbett et al 2003; Cruttenden 2007).

Vowels

Broad phonetic auditory transcription of the stressed monophthongs of the Berliner Lautarchiv recordings confirmed the presence of the following vowels: /i ɪ e ɛ a ʌ ɔ o ʉ/, with the expected distribution for Scottish English. So one vowel is found where Anglo-English has two (for TRAP/BATH, FOOT/GOOSE, and COT/CAUGHT; Wells 1982; Abercrombie 1979), but vowels are selected differently for systematic Scots/SSE lexical alternations (e.g. aff/off /a ɔ/, heid/head /i ɛ/, oot/out /ʉ ʌʉ/; see e.g. Macafee 1994). The main findings of the acoustic analysis of the three-speaker subsample are reported in Stuart-Smith et al (forthcoming), and align with those of contemporary reports from e.g. Grant (1912). /ɪ/ is retracted and lowered, /e/ is raised, /a/ is retracted, /ʉ/ is central and close, and /o ɔ/ are merged. Again, analysis of vowel duration confirms that the Scottish Vowel Length Rule operates only for /i ʉ/ and marginally for /ae/, which shows clear differences in vowel quality (Stuart-Smith et al forthcoming).

4.2 T-glottalling

T-glottalling has a particularly long history in western Central Scotland. Perhaps the earliest account of the glottal stop in the U.K. comes from Alexander Melville Bell, who noted:

“The Breath Obstructive Articulations, especially the letter T, are, in the West of Scotland pronounced without any articulative action, but with a mere glottal catch, accompanying the articulative position.” Melville Bell (1860: §137).

The use of the glottal stop for /t/ is well documented for urban Scots from the first sociolinguistic surveys (e.g. Macaulay 1977), but it probably became established as a stereotype of Glaswegian vernacular well before the turn of the twentieth century. This is clear from Bell’s observation, from the elocution manuals which observe and advise against the ‘degenerate glottal stop’ typical of the Central Belt (McAllister 1938: 71; see also Grant 1912: 30), and from contemporary observations collated by Macafee (1994: 27, n. 20), such as the quotation from one George MacDonald’s letter of 1892: ‘Strangers hurl at us as a sort of Shibboleth such sentences as “Pass the wa’er bo’’le, Mr Pa’erson!”’. The results of our auditory analysis of possible sites for T-glottalling in the Berliner Lautarchiv speakers also reveal glottal stops for /t/.


Figure 4. Distribution of glottal stops for /t/ in the full sample of the Berliner Lautarchiv speakers according to position: intervocalic e.g. fatted, word-final prevocalic, e.g. let us, and word-final prepausal, e.g. eat; N = 202.
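A positional breakdown of this kind (three positions by two variants, [t] versus [ʔ]) is typically assessed with a Pearson chi-square test of independence, which is the form of the statistic reported for this distribution (χ2 = 42.49, df = 2). The sketch below shows the arithmetic under stated assumptions: the function names are hypothetical, the cell counts are invented placeholders (the chapter does not list the per-cell counts), and the closed-form p-value holds only for exactly two degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts, e.g. one row per
    position of /t/ and one column per variant.
    """
    if not table or any(len(row) != len(table[0]) for row in table):
        raise ValueError("table must be rectangular and non-empty")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one token")

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if position and variant were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


def p_value_df2(chi2):
    """Upper-tail probability of a chi-square value with df = 2 only.

    With two degrees of freedom the distribution is exponential with
    mean 2, so the tail probability is exp(-chi2 / 2).
    """
    return math.exp(-chi2 / 2.0)


# Invented placeholder counts, NOT the chapter's data:
# rows = intervocalic, word-final prevocalic, word-final prepausal
# columns = [t], glottal stop
placeholder = [[30, 3], [25, 12], [20, 25]]
statistic, dof = chi_square_independence(placeholder)
print(f"placeholder table: chi2 = {statistic:.2f}, df = {dof}")

# The reported statistic can be checked against the stated threshold.
print("reported value significant at 0.001:", p_value_df2(42.49) < 0.001)
```

For tables with other degrees of freedom the tail probability needs the incomplete gamma function, for which a statistics library would normally be used.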

We can see from Figure 4 that glottal stops constitute around a third of the variants for /t/ even in these recitations. T-glottalling also shows clear differences in patterning according to the position of /t/ (χ² = 42.49, df = 2, p < 0.001), and the patterns are rather similar to those found in studies of T-glottalling in speech recorded in the second half of the twentieth century (e.g. Macaulay 1977; Stuart-Smith 1999). Glottal stops are least likely to occur between vowels, confirming this as the most stigmatized position, and in fact they mainly occur in the Glasgow speaker, in the word fatted. The use of glottal stops by this speaker seems to be part of a general style shift towards the vernacular which he moves into about halfway through the recording, perhaps because he could no longer sustain the more monitored recitation style.

4.3 Rhotics

/r/ in present-day vernacular Scottish English

Scottish Central Belt /r/ allophony is complex, more so in the vernacular than in Scottish Standard English, with much of this /r/ variation phonotactically conditioned. Some variation has been found to correlate with social factors such as socio-economic class and gender (Stuart-Smith 2007; Lawson et al 2014). Of particular interest is the weakening of /r/ in coda position in vernacular Scottish English, where the presence of /r/ is indicated more by qualitative changes to the preceding vowel (retraction, lowering, velarisation or pharyngealisation) than by the presence of a rhotic segment itself (e.g. Romaine 1978; Johnston 1997; Stuart-Smith 2007; Lawson et al 2014). Sociolinguistic researchers began to take serious notice of derhoticisation of /r/ in vernacular Scottish English from the late 1970s onwards; however, its appearance in the western Central Belt seems to have been much earlier and can be dated to at least 1901 from a comment by R. Trotter about ‘Glasgow-Irish’ in the Gallovidian magazine (Johnston 1997: 511). The phonetic characteristics of derhoticisation, in particular pharyngealization of the preceding vowel, suggest that this process does not result from straightforward adoption of Anglo-English nonrhoticity (see e.g. Speitel and Johnston 1983: 28; Johnston 1997: 511).

While vernacular Scottish English has seen /r/ weaken, approximant /r/ has become increasingly common in middle-class speech, replacing traditional tapped and trilled variants. Grant (1912: 35) states:

“Within recent years there has been a tendency to attenuate the force of the trill especially in final positions and before another consonant. This tendency is probably due more to imitation of Southern speakers than to a natural development in the pronunciation. The trill may be reduced (finally and before consonants) to a single tap [ɾ], or even to a fricative consonant [ɹ],”

Johnston (1985) suggests that [ɹ] in middle-class Scottish English was most likely adopted from eighteenth-century precursors to RP. On a scale of strong (more consonantal) to weak (more vocalic) rhoticity, the transition from tap/trilled /r/ to approximant /r/ in Scottish English therefore does not represent a natural process of lenition, but rather signifies the convergence of two phonetic/phonological systems, those of Scots and Scottish Standard English. The variants do not fit easily into a rhotic continuum, as Romaine (1978: 147) notes in her study of Edinburgh schoolchildren’s use of /r/, “there is nothing ‘in-between’ a [ɾ] and [ɹ]”. The Lautarchiv recordings allow us to assess to what extent approximant /r/ was prevalent in male working-class speech during the early twentieth century as well as to compare rates of weak rhoticity in this early twentieth century corpus with later twentieth and early twenty-first century corpora.

/r/ in the Berliner Lautarchiv sample

The results of the auditory analysis of all possible instances of /r/ in the Berliner Lautarchiv recordings are given in Table 3, showing the distribution of variants according to syllable position.

Phonotactic      Mean % of variant used
position         Trill    Tap    Approximant    Derhotic    No /r/

Onset /r/          7       52        42             0          0
Coda /r/           4       29        35            27          5

Table 3. Mean percentage of variants used in syllable-onset and coda position (onset n = 139; coda n = 502)
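The proportion of weak variants per syllable position can be read off Table 3 by adding the derhotic and no-/r/ columns. The short sketch below does this arithmetic on the percentages exactly as printed in the table; the variable and function names are illustrative only and are not part of the original analysis. The onset row adds up to 101 rather than 100, presumably because the published means are rounded.

```python
# Mean percentages transcribed from Table 3 (rounded values as printed).
# All names below are illustrative and not taken from the original study.
VARIANTS = ["trill", "tap", "approximant", "derhotic", "no_r"]
TABLE_3 = {
    "onset": [7, 52, 42, 0, 0],
    "coda": [4, 29, 35, 27, 5],
}
# "Weak" /r/ is defined in the text as derhoticised /r/ plus /r/ deletion.
WEAK = {"derhotic", "no_r"}


def weak_share(row):
    """Sum the percentages of the weak variants in one table row."""
    return sum(pct for variant, pct in zip(VARIANTS, row) if variant in WEAK)


for position, row in TABLE_3.items():
    print(position, "weak variants:", weak_share(row), "row total:", sum(row))
```

On these figures weak variants account for none of the onset tokens and just under a third of the coda tokens, which is the pattern described in the text.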

Weak /r/ variants (derhoticised /r/ and /r/ deletion), as we would expect, are confined to syllable-coda position. Tapped /r/ and approximant /r/ variants dominate in onset position with a small proportion of trilled /r/ also used. The Falkirk and Glasgow speakers produced higher quantities of trilled onset /r/ than other speakers in the study: 25% and 17% of their onset /r/s, respectively, were trilled. Trilled /r/ is used more frequently for singleton word-initial /r/, e.g. ring, rose, ran (37% of singleton word-initial /r/ tokens were trills), than in other phonotactic contexts. It is possible that the use of trills in this phonotactic context is a demonstration of Scots pronunciation, given the performative aspect of reciting aloud the biblical passage. We see low levels of trilling in other onset contexts, #Crv (angry, country /kʌn.tre/), CrV (brother, great, bring), C#rv (country /kʌnt.re/, everything), v#CrV (begrudged), v#rV (arose), and no instances of trilling intervocalically, Vrv (merry, very, forrit, parable). The highest rates of approximants occurred in the v#CrV (e.g. begrudged) (67%) and CrV (great, brother, bring) (52%) contexts. Tapped /r/, however, occurs at high levels in all onset contexts, including labial and velar clusters, and is particularly common after a vowel, e.g. v#rV (arose) (80%) and intervocalically (merry, very) (79%). In coda position, approximants, taps and derhotic variants dominated, with a small percentage of the strongest (trills) and weakest (/r/ deletion) variants. Again, around 20% of the Falkirk informant’s coda /r/s were trills, but this was exceptional in the speaker group.

When we considered onset and coda /r/ variants by geographical location, we found no clear east-west divide concerning either the use of the non-local approximant /r/, or the use of vernacular derhoticised variants (see Figure 5). Surprisingly high levels of weak /r/ variants (derhoticised /r/ and /r/ deletion) were found in the Lautarchiv data, even at this early date. The speaker from Falkirk stands out from others in this small corpus due to his frequent use of trilled /r/, not only in onset position, but also in coda. This speaker also produced the smallest percentage of weak /r/ variants (derhoticised /r/ and /r/ deletion) in coda position. The speaker who used the greatest quantity of weak /r/ variants was from Blantyre.


Figure 5. Percentage breakdown of /r/ variants used by the Berliner Lautarchiv sample, from 8 different locations in the Scottish Central Belt, in (a) upper, onset position (n = 144) and (b) lower, coda position (n = 552).

A comparison of the Maryhill and Glasgow speakers gives us some insight into /r/ variation conditioned by social class during the early twentieth century (see Figure 6). The Maryhill speaker reports that he was an accountant, while the Glasgow speaker was identified as having ‘been in the army since the age of 18’. We are able to see emerging patterns of use that would become well-established as the twentieth century progressed. The Maryhill speaker was found to use fewer tapped and trilled /r/s than the working-class Glasgow speaker. On the other hand, the rate of approximant /r/ use in onset position was much higher in the speech of the Maryhill speaker than for the Glasgow speaker. These results confirm observations made from the early elocutionary books and Scottish phonetic handbooks, e.g. Walker (1791); Grant (1912: 35ff.); McAllister (1938: 179), as well as much later studies of Central Belt /r/ (e.g. Lawson et al. 2008), that middle-class speakers are more likely to use approximant variants than working-class speakers, who use more traditional tapped and trilled variants of /r/.

Figure 6. Variants of /r/ by social class, comparing one Glasgow middle-class speaker (Maryhill: pale grey) with one Glasgow working-class speaker (Glasgow: dark grey), in (a) upper, onset position and (b) lower, coda position.


For coda /r/, the working-class Glasgow speaker used slightly more tapped /r/ than the middle-class Maryhill speaker. The Maryhill speaker used more approximant /r/ than the Glasgow speaker (+14%) and the Glasgow speaker used much more derhoticised /r/ than the Maryhill speaker (+22%). The fact that the lowest percentage of derhoticised coda /r/ was produced by the middle-class Maryhill speaker seems to agree with the findings of later studies which suggested that derhoticisation is a change from below (Romaine 1979; Speitel and Johnston 1983).

Weakening of /r/ in postvocalic position in English (both in present-day Scotland and historically in English, Hickey 2014) has previously been found to be associated with specific phonotactic positions, specifically unstressed syllables (Dobson 1957: §427), preconsonantal position (Walker 1791: 50) and also utterance-final position (Romaine 1978; Lawson et al 2008). Table 4 shows the percentage of tokens realised as weak variants of /r/ in the seven most common phonotactic environments where /r/ weakening occurs in the Berliner Lautarchiv recordings.

Phonotactic   % of variants in this context     Example of phonotactic context
context       that were realised as
              derhoticised or /r/-less

vr###         88                                faither, brother, maister in utterance-final position
VrC           59                                forth, years, start, served
vrC           52                                gaithered, others, faithers
Vr###         40                                share, more in utterance-final position. N.B. only 10 tokens of /r/ in this context were identified in total.
vr##C         24                                efter the; faither gied; never give
Vr#C          16                                worthy, servant, working
Vr##C         14                                mair than; start for

Table 4. Seven phonotactic environments conditioning weak /r/ in the Berliner Lautarchiv full sample. v indicates an unstressed syllable, V indicates a stressed syllable, C indicates any consonant.

The results in Table 4 confirm utterance-final position, preconsonantal position and weak stress as key internal factors that condition weak rhoticity in these early Scottish English recordings. All of these environments have also been linked to /r/-loss either in present-day Scottish English or in Anglo English in the past. One potential explanation for /r/ weakening (and loss) in utterance-final position relates to changes in the timing of the /r/ gesture where utterance-final syllable lengthening occurs. It has been found that in this key phonotactic location, the tongue-tip raising gesture can often be delayed to the point that it occurs after voicing ends and so becomes inaudible or only weakly audible, see Lawson et al. (2014). A similar tendency has been found to occur for utterance-final /l/ (see Recasens and Farnetani 1994).

5 Coda /r/ across a century of Central Belt Scottish English


Carrying out a real-time comparison of opportunistic samples is always awkward because of gross differences in context and speech style. Here, the Lautarchiv recordings are essentially performed recitations, while the more recent sociolinguistic speech samples of Central Belt Scottish English comprise spontaneous speech of different kinds, and/or read wordlists. For example, Stuart-Smith’s 1997 and 2003 datasets had speakers chatting to each other in same-sex, self-selected pairs, as well as reading word lists (e.g. Stuart-Smith et al 2007; Stuart-Smith et al 2014). The datasets for Scobbie and Lawson’s 2007 West Lothian corpus, see Scobbie, Stuart-Smith and Lawson (2008), have similar materials from the east. However, Scobbie, Lawson and Stuart-Smith’s 2007 eastern Central Belt and 2012 western Central Belt audio-ultrasound recordings are primarily in wordlist form (Lawson et al 2014). Comparison is inevitably tricky, so here we select two comparisons, which interestingly show slightly different views of stability and change in coda /r/ across the century.

Comparison 1: coda /r/ in the Berliner Lautarchiv and West Lothian 07 corpora

The first comparison is made with our West Lothian 2007 corpus (WL07), see Lawson et al 2008, which contains the same-gender dyad conversations of fourteen male High School pupils aged 12-13 from Livingston in West Lothian. Livingston is a New Town 13 miles west of Edinburgh and geographically closest to the location of the Bathgate (eastern) speaker. The school was selected because it served areas of multiple social deprivation in Livingston and therefore the young male pupils recorded can tentatively be identified as working class, like the majority of the males in the Lautarchiv study. Some adaptation was required in order to match the /r/ variant classification categories from the Lautarchiv study to the WL07 study. The WL07 analysis distinguished the auditory categories alveolar approximant and retroflex approximant, but had only one category to cover the categories derhotic and no /r/, as the rapidity of spontaneous conversation made it difficult to make the fine distinction between those two categories. For the purpose of comparing the two corpora, therefore, the variants considered were trill, tap, approximant and derhotic/no /r/. The differences between the percentages of variants used by the Lautarchiv informants and the WL07 informants are shown in Figure 7.


Figure 7. Error bars showing the mean percentage use of coda /r/ variants (+/- 1 standard deviation) used by all Lautarchiv informants (black circles, n = 552) and the eastern Central Belt WL07 informants (white circles, n = 2566).

While a small percentage of coda trilled /r/s were used by the Lautarchiv informants, only one out of the fourteen WL07 informants produced trills for coda /r/. Taps were reasonably common in coda position amongst the Lautarchiv informants (mean 25%, range 11-53%), but for the WL07 informants, coda tapping is rare (mean 6%, range 0-17%). Approximants, on the other hand, are much more common in the WL07 corpus (mean 68%, range 38-92%) than in the Lautarchiv corpus (mean 32%, range 11-53%). Perhaps the most interesting result, however – irrespective of speech style and age – is how little things have changed regarding weak variants of /r/ in the intervening one hundred years: the average percentage of weak variants in each corpus is almost the same (Lautarchiv mean 28%, range 17-38%; WL07 mean 26%, range 7-39%). This result suggests that this realization of postvocalic /r/ may have been stable over the century, and that derhoticisation is not advancing, or if it is advancing (in the west), this change is very gradual indeed. Two findings that would back up this notion are (1) Lawson et al’s (2014) finding (based on word-list data) that almost all tokens of postvocalic /r/ rated as derhoticised or /r/-less contained covert tongue-tip raising gestures, i.e. /r/ was still present at the articulatory level; and (2) the finding that Glaswegian speakers can usually distinguish between rhotic and nonrhotic minimal pairs where derhoticisation has occurred, even where derhoticisation renders the minimal pairs auditorily very similar, e.g. hut/hurt, bud/bird (Lennon 2013). These two findings imply that an obvious mechanism for driving this change forward, misinterpretation or reinterpretation of an auditorily weak variant, may not be contributing to further weakening or loss of coda /r/. Weak /r/ may, in fact, be a positionally-conditioned variant, occurring in specific predictable phonotactic positions.

Phonotactic   % of variants in this context     Example of phonotactic context
context       that were realised as
              derhoticised or /r/-less

vr###         46                                better, remember, another in utterance-final position
Vr###         46                                car, here, there in utterance-final position
vr##C         34                                another, answer, better followed by a consonant
Vr##C         32                                pure, four, here followed by another consonant
Vr#C          21                                morning, normal, recording
VrC           18                                burp, first, heard
vrC           15                                bothered, players, trainers

Table 5. Seven phonotactic environments conditioning weak /r/ in the WL07 corpus. v indicates an unstressed syllable, V indicates a stressed syllable, C indicates any consonant.
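Tables 4 and 5 can be set side by side to see how the ordering of the seven contexts differs between the two corpora. The sketch below is purely descriptive: it uses the percentages as printed, ranks the contexts within each corpus and prints the percentage-point difference for each context. The dictionary and function names are illustrative assumptions, and since the two corpora differ in token counts, speech style and speaker age, the differences should not be read as a statistical test.

```python
# Percentages of weak /r/ per phonotactic context, as printed in
# Table 4 (Berliner Lautarchiv) and Table 5 (WL07).
# Names are illustrative and not taken from the original study.
LAUTARCHIV = {"vr###": 88, "VrC": 59, "vrC": 52, "Vr###": 40,
              "vr##C": 24, "Vr#C": 16, "Vr##C": 14}
WL07 = {"vr###": 46, "Vr###": 46, "vr##C": 34, "Vr##C": 32,
        "Vr#C": 21, "VrC": 18, "vrC": 15}


def ranked(table):
    """Return the contexts ordered from highest to lowest percentage."""
    return sorted(table, key=table.get, reverse=True)


def differences(early, late):
    """Percentage-point change per context (late corpus minus early corpus)."""
    return {context: late[context] - early[context] for context in early}


diffs = differences(LAUTARCHIV, WL07)
wl07_rank = {context: i + 1 for i, context in enumerate(ranked(WL07))}

for i, context in enumerate(ranked(LAUTARCHIV), start=1):
    print(f"{context:6} Lautarchiv rank {i} ({LAUTARCHIV[context]}%)  "
          f"WL07 rank {wl07_rank[context]} ({WL07[context]}%)  "
          f"difference {diffs[context]:+d}")
```

Laid out this way, the utterance-final unstressed context (vr###) heads the Lautarchiv list and shares the highest WL07 percentage with Vr###, while the word-internal preconsonantal contexts (VrC, vrC) fall from second and third place in the Lautarchiv data to the bottom of the WL07 list.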

Table 5 shows the percentage of tokens realised as weak forms of /r/ in the seven most common phonotactic environments where /r/ weakening occurs in the WL07 corpus. If we compare these environments with those shown in Table 4 for the Berliner Lautarchiv speakers above, we find that for both corpora, utterance-final unstressed position, e.g. faither, better, brother, remember, is the most common context for weak rhoticity. Further comparison shows similarity in the contexts where weak rhoticity is observed, but also differences in the ranking, which may relate to the more general differences in derhoticization observed in Eastern and Western varieties of Central Belt Scots.

Comparison 2: coda /r/ in the Berliner Lautarchiv and Glasgow 1997/2003 corpora

Our second comparison with a different dataset, this time from the western Central Belt and with real- and apparent-time dimensions, shows a slightly different pattern, and one which indicates that change may now be progressing, albeit gradually, and in conjunction with social identity. The data are from the two Glasgow corpora collected from working-class speakers from Maryhill, in 1997 and 2003, using read wordlists (Stuart-Smith et al 2014). Interestingly, this speech elicitation task did not provoke the automatic shift to the standard as expected in classic sociolinguistic methodology. In both corpora, the adolescent speakers treated this task as an opportunity for display and performance (see Stuart-Smith et al 2007). They rattle through the lists of words, sometimes laughing and commenting on them, and most interestingly for us, they show a strong shift towards non-standard variants (though not those blocked by orthography). This performative aspect of the wordlist readings makes them in some ways rather comparable to the Lautarchiv recitations. This comparison shows that not only is weak /r/ a positionally-conditioned variant, it also has an additional element, age-grading in the form of the adolescent peak – heightened use of features undergoing change in adolescents’ speech (e.g. Tagliamonte and D’Arcy 2009). The speaker groups for this comparison are 10X: the six Berliner Lautarchiv speakers from the western Central Belt, so excluding Bathgate and Falkirk; 90M: four men born in the 1940s and recorded in the 1990s; 00M: six men born in the 1950s and recorded in the 2000s; 90Y: four adolescent boys born in the 1980s and recorded in the 1990s; and 00Y: eighteen adolescent boys born in the 1990s and recorded in the 2000s. Tokens coded as weak /r/ are shown in Figure 8 as derhotic (dark) or non-rhotic (pale) bars; in the 1997/2003 data these comprise all variants of velarized/pharyngealized vowels and all instances of vowels without audible secondary articulation.

Figure 8. Percentage of weak rhotic variants of coda /r/ in the six western Central Belt Berliner Lautarchiv speakers (10X) and in the Sounds of the City corpus data for middle-aged men and adolescent boys recorded in the 1990s and 2000s (see text for labels). Derhotic variants are dark; non-rhotic/plain vowels are light; n = 4048.

This perspective on derhoticisation extends that of the comparison with the WL07 corpus. Here we see similar evidence of the stability in derhoticisation across the century, if we compare the Berliner Lautarchiv speakers with the middle-aged men. But if we look at the adolescents from the 1990s and the 2000s, we see much more derhoticisation (cf. also Lawson et al 2014). This pattern could reflect a long period of stability in weak rhoticity (as suggested by Comparison 1), though lacking evidence for the intervening years, followed by a real-time increase in derhoticisation as indicated by the adolescent data. Alternatively, the adolescent data may reflect age-grading in the form of the adolescent peak. Our suspicion, following the careful analyses of Sankoff (e.g. 2006), is that we are probably witnessing both – a very long-term process of derhoticisation which may exist for long periods as positionally-conditioned stable variation, but which is also available to carry social-indexical meaning, and so to move into a socially-determined trajectory for change. The adolescent peak is witnessed when language changes are in progress; derhoticisation has carried covert social prestige of ‘street smarts’ since before the 1980s (Johnston 1997: 511). More recent evidence shows that derhoticisation is also promoted by indirect engagement with Anglo-English shown in television soap dramas (e.g. Stuart-Smith et al 2014). After a long period of near stability, derhoticisation may be taking off (again).

6 Concluding remarks

This chapter has presented an account of some aspects of what we believe to be the earliest extant recordings of vernacular speech of the Scottish Central Belt, from the Berliner Lautarchiv corpus. Much of what we find aligns well with contemporary observations often made perceptively, but inadvertently, by elocution and phonetic manuals.
Qualitative and quantitative comparison of the features of the speech of these men with those of later recordings is subject to the constraints of differences in recording context, style and content. Nevertheless, cautious comparison shows evidence for both stability and change in Scots vernacular across the twentieth century, and not always as we might expect. The pattern of T-glottalling according to context in the Lautarchiv speakers shows very similar constraints to those observed a hundred years later; the modest degree is probably affected by the style of the recordings – this feature may have been fairly stable, and/or increasing only very gradually over the century. Our view of derhoticisation of coda /r/ has shifted from an emphasis on change, as in the early sociolinguistic studies, to one of a much longer-term change in progress, possibly exhibiting long periods of apparent stability, but also perhaps showing some signs of acceleration in conjunction with particular social-indexical meanings for derhotic variants, towards the end of the century. Clearly, the resolution of the window through which we view variation over time affects our inference of language change (Milroy 2003). At the same time, we note that our descriptions – and so our suggestions – are necessarily partial, requiring more evidence to help fill out the patterning for the missing decades across the century. It is clear that for us to gain a better appreciation of sound change in Scottish English vernacular, we must also continue to listen to – and document – the sound heritage that constitutes a key aspect of Scotland’s past.

Appendix

Sample Text (‘The Story of the Prodigal Son’, read by William Bryce from Glasgow)

There was a man whae had twa son. The youngest of them said to his faither, “Gie me that pairt of your guids that belangs to me.” So the faither gied him his share. No mony days efterwards, the young man gaithered aw his belangings together, and went away into a far country. There he wasted aw that he had. When he had spent awthing, a great famine kam ower the country, and he began to be in want. Before lang. So he had to tak ony work he could get intae, and was gled when he fund a place wi a man of that country. This man sent him intae the fields to look efter the pigs, he had to he had nae place to sleep in. He saw others at their meals but had naething hissel to eat. Many a time he would have been gled to fill his belly with the husks that he had fed the pigs with, but even sic food his maister begrudged him. At last, when the young man kam to think ower what he had done, he said, “Ah!, how many paid servants of my faither have aw the food they want, and even mair than they want, and here am I dying of hunger. High time is it that I should gang back to my faither. This very day I will stert for hame, and I will gang to my faither and say to him, ‘Faither, I have sinned against heaven and against thee. Ah'm nae langer worthy to be cawed your chiel. Mak me wan of your paid servants.’” And he arose, and kam to his faither. 
When he was yet a great way aff, his faither saw him, and had pity, and ran to meet him, and fell on his neck, and kissed him, and the young chiel said to his faither, “Faither, Ah've sinned against heaven and against you, and am nae langer worthy to be cawed your son.” But the faither said to his servant, “Bring forth the best claes and pit them oan him, and put a ring on his haun, and shoes on his feet, and bring hither the fatted coo and kill it, and let us eat and be merry, for this ma son was deid and is aleeve again. He was loast and is found.” Noo, his elder son was working in the fields, and he kam ower to the hoose. He heard music and dancing, so he cawed wan of the servants and asked him, “What do these things mean?” And the servant said to him, “Your brother has kam hame, and your faither has killed the fatted coo because he has got him back again, and soon.” Then he was angry and wouldnae go in, so his faither kam oot, and asked him to come in, and he said to his faither. “These mony years have I served you, and ah've aways done what you tellt me to dae, but you never gave me even a kid so that I might have a feast wi ma friends, but as soon as my brother kam, who has spent aw his money on wild leeving, you have the fatted coo killed for him.” And the faither said to him, “Dear chiel, you are aways with me and all that I hae is your, but this your brother was deid and is aleeve again, he was lost and is found.”


References

Abercrombie, David 1979. The accents of Standard English in Scotland. In: A. J. Aitken and Tom McArthur (eds) The Languages of Scotland. Edinburgh: Chambers, pp. 68-84.

Aitken, A. J. 1984. Scots and English in Scotland. In: Peter Trudgill (ed.) Language in the British Isles, Cambridge: Cambridge University Press, pp. 517-532.

Audacity Project 2005. Audacity. 2.0.5 ed. Pittsburgh: Pittsburgh Carnegie Mellon University, October 21, 2013.

Bell, Alexander Melville 1860. The Elocutionary Manual: the Principles of Elocution with Exercises and Notations. Third edition. New York: N. D. C. Hodges, Edgar S. Werner.

Boersma, Paul and David Weenink 2013. Praat: doing phonetics by computer. 5.3.47. http://www.praat.org/.

Corbett, John, J. Derrick McClure and Jane Stuart-Smith 2003. Introduction: A brief history of Scots. In: John Corbett, J. Derrick McClure and Jane Stuart-Smith (eds) The Edinburgh Companion to Scots. Edinburgh: Edinburgh University Press, pp. 1-16.

Corbett, John and Jane Stuart-Smith 2012. Standard English in Scotland. In: Raymond Hickey (ed.) Standards of English: Codified Standards around the World, Cambridge: Cambridge University Press, pp. 72-95.

Cruttenden, Alan 2007. Intonational diglossia: a case study of Glasgow. Journal of the International Phonetic Association 37: 257-274.

Cruttenden, Alan 1997. Intonation. Revised Edition. Cambridge: Cambridge University Press.

Dobson, Eric 1957. English Pronunciation 1500-1700. Oxford: Clarendon Press.

Fromont, Robert and Jennifer Hay 2012. LaBB-CAT: An annotation store. University of Otago, Dunedin, New Zealand: Australasian Language Technology Workshop (ALTA), 4-6 Dec 2012. In: Proceedings 10: 113-117.

Grant, William 1912. The Pronunciation of English in Scotland. Cambridge: Cambridge University Press.

Gregersen, Frans 2014. A matter of scale only? Paper presented at Methods in Dialectology XV. University of Groningen. 11-15 August 2014.

Hickey, Raymond 2014. Vowels before /r/ in the history of English. In: Daniel Schreier, Olga Timofeeva, Anne Gardner, Alpo Honkapohja, and Simone Pfenninger (eds) Contact, Variation and Change in the History of English. Amsterdam: John Benjamins, pp. 95-110.

Johnston, Paul 1985. The rise and fall of the Morningside/Kelvinside accent. In: Manfred Görlach (ed.) Focus on Scotland. Amsterdam: John Benjamins, pp. 37-53.

Johnston, Paul 1997. Regional variation. In: Charles Jones (ed.) The Edinburgh History of the Scots Language. Edinburgh: Edinburgh University Press, pp. 433-513.

Lawson, Eleanor, James M. Scobbie and Jane Stuart-Smith 2014. A socio-articulatory study of Scottish rhoticity. In: Robert Lawson (ed.) Sociolinguistics in Scotland. London: Palgrave Macmillan, pp. 53-78.

Lawson, Eleanor, Jane Stuart-Smith and James M. Scobbie 2008. Articulatory insights into language variation and change: Preliminary findings from an ultrasound study of derhoticization in Scottish English. University of Pennsylvania Working Papers in Linguistics 14(2): 102-110.

Lennon, Robert 2013. The effect of experience in cross-dialect perception: Parsing /r/ in Glaswegian. Unpublished Masters dissertation. University of Glasgow.

Macafee, Caroline 1983. Glasgow. Varieties of English Around the World. Text series T3. Amsterdam: John Benjamins.

Macafee, Caroline 1994. Traditional Dialect in the Modern World: A Glasgow Case Study, Frankfurt: Lang.

Macafee, Caroline 1997. Ongoing change in modern Scots: The social dimension. In: Charles Jones (ed.) The Edinburgh History of the Scots Language, Edinburgh: Edinburgh University Press, pp. 514-548.

Macaulay, Ronald 1977. Language, Social Class and Education: A Glasgow Study. Edinburgh: Edinburgh University Press.

Macaulay, Ronald 2014. A short history of sociolinguistics in Scotland. In: Robert Lawson (ed.) Sociolinguistics in Scotland, London: Palgrave Macmillan, pp. 15-31.

McAllister, Anne 1938. A Year’s Course in Speech Training. London: University of London Press.

Milroy, James 2003. When is a sound change? On the role of external factors in language change. In: David Britain and Jenny Cheshire (eds) Social Dialectology: In honour of Peter Trudgill. Amsterdam: John Benjamins, pp. 209-221.

Recasens, Daniel and Edda Farnetani (eds) 1994. Spatiotemporal properties of different allophones of /l/: phonological implications: Phonologica 1992: Proceedings of the seventh International Phonology Meeting. Torino: Rosenberg & Sellier.

Romaine, Suzanne 1978. Postvocalic /r/ in Scottish English: Sound change in progress. In: Peter Trudgill (ed.) Sociolinguistic Patterns in British English, London: Hodder and Stoughton, pp. 144-158.

Sankoff, Gillian 2006. Age: Apparent time and real time. Encyclopedia of Language and Linguistics, Second Edition, Article Number: LALI: 01479

Scobbie, James M., Jane Stuart-Smith and Eleanor Lawson 2008. Looking variation and change in the mouth: developing the sociolinguistic potential of Ultrasound Tongue Imaging. ESRC Final Report. Queen Margaret University, Edinburgh.

Speitel, Hans H. and Paul Johnston 1983. A Sociolinguistic Investigation of Edinburgh Speech. Economic and Social Research Council.

Stuart-Smith, Jane, Brian José, Tamara Rathcke, Rachel Macdonald and Eleanor Lawson forthcoming. Twa son, some soldiers and a city: An acoustic phonetic investigation of real-time change over a century of Glaswegian. In: Chris Montgomery and Emma Moore (eds) A Sense of Place, Cambridge: Cambridge University Press.

Stuart-Smith, Jane, Eleanor Lawson and James M. Scobbie 2014. Derhoticisation in Scottish English: A sociophonetic journey. In: Chiara Celata and Silvia Calmai (eds) Advances in Sociophonetics. Amsterdam: John Benjamins, pp. 57-94.

Stuart-Smith, Jane 2007. A sociophonetic investigation of postvocalic /r/ in Glaswegian adolescents. In: Proceedings of the sixteenth International Congress of Phonetic Sciences. Saarbrücken, Germany: Universität des Saarlandes, pp. 1307-10.

Stuart-Smith, Jane 2003. The phonology of Modern Urban Scots. In: John Corbett, J Derrick McClure and Jane Stuart-Smith (eds) The Edinburgh Companion to Scots. Edinburgh: Edinburgh University Press, pp.110-37.

Stuart-Smith, Jane 1999. Glottals past and present: A study of T-glottalling in Glaswegian. Leeds Studies in English 30: 181-204.

Stuart-Smith, Jane, Claire Timmins and Fiona Tweedie 2007. ‘Talkin’ Jockney: Accent change in Glaswegian. Journal of Sociolinguistics 11: 221-261.

Tagliamonte, Sali and Alexandra D’Arcy 2009. Peaks beyond phonology: Adolescence, incrementation, and language change. Language 85: 58-108.

Walker, John 1791. A Critical Pronouncing Dictionary and Expositor of the English Language. London: Robinson and Cadell.

Wells, John 1982. Accents of English. Vol. 2: The British Isles. Cambridge: Cambridge University Press.


9 Early Recordings of Irish English

Raymond Hickey

1 Introduction

For English as spoken in Ireland there are audio recordings going back to the early twentieth century. The earliest recordings are generally of prominent figures in Irish society and in some cases, where enough is known about their biographies, one can be reasonably certain that their use of English was typical of the social group they belonged to. Following on from these recordings there are more which appear with the widespread use of radio and television in Ireland. The range of individuals for whom recordings are available became increasingly wide in the second half of the twentieth century. This chapter will be concerned with examining the earliest audio records of Irish English beginning before the Second World War (WWII) and will then consider some after this watershed in twentieth century history moving up into the 1960s. While WWII does not have particular relevance for the development of Irish English its end does cut the period from 1900 to 1990 exactly in two. After the 1990s new accents arose in Dublin which led to a reorientation in supraregional Irish English. These accents are not dealt with in great detail here as they have already been investigated by the present author (Hickey 1999, 2003a, 2005). But the later recordings discussed here have been examined to see if they embody the changes found in non-local Dublin English in the past few decades. This means that two separate but related questions are dealt with in this chapter.

(1) Do the recordings of those speakers born in Ireland while it was still part of the United Kingdom show that they had significantly different accents from those individuals born after independence in 1922?

(2) Do the recordings of speakers in the 1960s, 1970s and into the 1980s show that they did not have the vowel shift which began in the late 1980s and which spread rapidly among non-vernacular speakers in the 1990s and the early 2000s?

Answers to these questions can hopefully be reached by an examination of early audio recordings of Irish people. The situation is a typical one of ‘bad data’ (Labov 1994; Nevalainen 1999) given that not as many recordings are to be found as one would like. Nonetheless, a careful inspection of what audio data has been handed down can help to address the above questions and suggest answers. The earlier group of available recordings is referred to here as pre-WWII recordings which will be contrasted with Irish English documented for the post-WWII period.

2 Available recordings


Short films are available for political events in Ireland from about 1920 onwards, essentially in the run-up to independence for the twenty-six counties of the south of Ireland in 1922. These recordings were made by the media company Pathé News, founded in 1910 by the Frenchman Charles Pathé (1863-1957). But because of the limited technology of the time these films were without sound,1 i.e. they were “silent films” and were in fact shown in cinemas, usually as introductory material before a film presentation. It was not until the late 1920s that films with sound were introduced by Pathé News which continued to report on Ireland, frequently on major sporting events. Newsreels on Irish affairs with sound appear from 1934 onwards. However, the commentator was invariably an RP speaker with classical music playing in the background. Hence these newsreels cannot be used to determine accents of Irish English at the time.

Radio broadcasting began in Ireland with the founding of a station called 2RN in 1925 (broadcasting began on 1 January 1926), a precursor of Raidió Éireann ‘Irish Radio’ established in 1938. Television broadcasting began on 31 December 1961 when the organisation was called Raidió Teilifís Éireann ‘Irish Radio and Television’.2 A significant development in the recording of vernacular speech was the setting up in 1947 of a Mobile Recording Unit, used by Irish folklorists, which was equipped with a recorder with which to make acetate discs. Such discs were created using a recording lathe which cut a groove into a disc coated with a specific lacquer which allowed the audio-modulated signal to be captured.3 The cutting of the disc was done in real time and required careful preparation and execution. It was not conducive to gathering spontaneous speech as the informants thus recorded were aware of what was happening and were given signals by a technician for when to start, pause or stop.
The Mobile Recording Unit in Ireland was intended for use in recording Irish, either the speech of Irish speakers or singing in Irish. Any recording of English was incidental, e.g. when Irish speakers switched to English, which happened occasionally when a piece of oral history was recorded in Irish.

A small number of recordings are also available from the collection of prisoner of war recordings made by Wilhelm Albert Doegen (1877-1967), the German scholar interested in recording dialect speakers of various languages (see Hickey, this volume). Doegen’s sources were British prisoners captured by the German army during WWI and because of the demographic composition of the British army most of the prisoners were English, followed by Scottish, Welsh and soldiers from

1 There are a number of individuals from the beginning of the twentieth century who can be seen on film but for whom there is no sound record, e.g. the charismatic leader Michael Collins (1890-1922).

2 For England there is a similar timeline with the British Broadcasting Company, initially a private firm producing wireless sets under the leadership of the Scot John Reith, which began broadcasting in the early 1920s. With the Royal Charter of 1927 the British Broadcasting Corporation was established with regular radio programmes. Initial television broadcasting began in the early 1930s and was expanded considerably, but suspended from 1939 to 1946 because of WWII.

3 The technique of creating acetate discs is different in principle from that found in the production of vinyl discs where a master mould, prepared from an existing recording, was used to press any number of copies of this mould.


the 36th (Ulster) Division. Because very few men from the south of Ireland fought in the British army, only a few were taken prisoner and came into contact with Doegen.4

The quality of all these recordings is what can be expected for the early to mid twentieth century: the frequency range is quite narrow, there is much crackle and distortion and considerable background noise due to the limited technology of the time. This makes formant recognition in processing software difficult as the acoustic signal is not clearly distinguishable from extraneous noise in the medium and hence an acoustic analysis of vowel values is not always possible. In many cases the practice of overlaying the speech signal with music makes any attempt at acoustic analysis futile from the outset. Nonetheless, with some recordings the quality is sufficient for acoustic processing and allows the generation of charts to determine vowel values objectively (see various sections below).

3 Varieties of Irish English

Any discussion of ‘Irish English’ necessitates distinguishing various varieties which can be subsumed under that label. The first binary division which can be made is between varieties in Northern Ireland (along with the counties of Donegal, Monaghan and Cavan in the Republic of Ireland) and those in the remainder of the island of Ireland, roughly the varieties spoken south of a line running from Sligo in the north-west to Dundalk in the north-east (Map 1).

4 The situation is quite different for Irish. In the late 1920s Doegen travelled to Ireland to record dialect speakers of Irish making invaluable recordings, many of which were of individuals who were among the last speakers of their dialect of Irish. See Hickey (2011) for details.


Map 1. Main dialect division in Ireland (north, approximately one third, and south, approximately two thirds of the island)

This southern area is entirely contained in the Republic of Ireland and the influence of non-vernacular Dublin English as a model for supraregional speech has been, and still is, considerable. For the current chapter the reference ‘Irish English’ refers to supraregional English spoken in the large southern section of Ireland.5 This in turn is derived historically and at present from non-vernacular Dublin English. There are nonetheless differences between general supraregional speech in southern Ireland and non-vernacular Dublin English.

3.1 The relationship of Dublin English to Irish English

5 This form of Irish English is not explicitly codified as opposed to other major varieties of English such as American or British English and hence the term ‘standard Irish English’ is not used here (but see the discussions in Hickey ed., 2012). However, Irish people are aware of the phonetic form of supraregional English in their country and can maintain a distance to varieties of British English on the one hand and vernacular varieties of Irish English on the other.


The most noticeable feature of non-vernacular Dublin English, which is not part of supraregional southern Irish English, is SOFT-lengthening. By this is meant the occurrence of a long vowel, as in THOUGHT, in words like soft, off, frost, across, i.e. where the LOT vowel occurs before a voiceless fricative, /s/ or /f/. Before /θ/, generally realised as a dental stop in Irish English, usage varies, with cloth showing a long vowel but broth a short one. The occurrence of a long vowel before /s/ or /f/ appears to be a retention of nineteenth-century southern English English which also had this lengthening but which was later largely reversed in England (Hickey 2008).

There are additional features of Dublin English which up to recently were not found in general supraregional Irish English but which have been adopted since. For instance, the lack of a /ʍ/ ~ /w/ contrast, in words like which and witch, whale and wail, whet and wet, has spread out from Dublin in the past few decades so that this contrast is no longer found with most individuals under 40 years of age throughout Ireland. The same applies to the maintenance of a distinction between morning and mourning, born and (air-)borne, for (stressed) and four which has been abandoned completely in the past few decades in the south of Ireland (see the discussion in Hickey 2005: 50, 229 and the relevant audio data contained in Hickey 2004a).

4 Diagnostic features of Irish English

4.1 Consonants

TH/DH-stopping

This is a cover term for two types of realisations of the initial consonant of words like THIN and THIS: (i) dental stop and (ii) alveolar stop. For all supraregional forms of Irish English it is a dental stop which is found in the THIN and THIS lexical sets. An alveolar stop is characteristic of strongly vernacular forms of Irish English and is often used by supraregional speakers to deride local accents.
At the opposite extreme, many presenters on Irish television strive to produce initial fricatives in the THIN and THIS sets. In syllable-final position this is in fact common in a reading style, e.g. path [pat̪] ~ [paθ]. But syllable-initially, particularly in word-initial, pre-stress position, a fricative is unusual and so the attempts by television presenters to produce fricatives in words like think, thought; there, those are conspicuous.

T-lenition

A persistent feature of all forms of southern Irish English, and many in Northern Ireland as well, is the categorical realisation of /t/ as a fricative in positions of high sonority, i.e. intervocalically and post-vocalically before a pause, as in pity and pit. In supraregional forms of Irish English the /t/ in such words is realised


as an apico-alveolar fricative6, i.e. pity is [piṱi] and pit is [piṱ]. For local Dublin English a cline of lenition is found with lenited /t/ occurring as [h, ʔ] or Ø with a realisation as /r/ due to a T-to-R rule also an option, e.g. get up [gerup] (Hickey 2005: 41; Clark and Watson 2011). This progression is in keeping with more general lenition trajectories such as that specified by Lass (1984: 178), see Figure 1.

Figure 1. Lenition scale after Lass (1984)

L-velarisation

Syllable-final /l/ is characteristically velarised in local Dublin English, e.g. field [fi:ɫd]. This is in contrast with previous non-local forms of Dublin English and more general supraregional varieties of the rest of the country which were noted for the presence of an alveolar [l] in syllable final position, e.g. deal [di:l]. But with the changes in non-local Dublin English in the 1990s the velarised (or pharyngealised) [ɫ, lˤ] became part of the new pronunciation and hence in time replaced the syllable-final alveolar [l] of supraregional Irish English.

4.2 Long vowels and diphthongs

Traditionally, open back vowels have been characteristic of all forms of Irish English: the THOUGHT vowel was open with an open starting point found in the CHOICE diphthong. This openness along with the monophthongal quality of the FACE and GOAT vowel and a high back GOOSE vowel made Irish accents easily recognisable. The changes in these vowels which set in during the 1990s constituted a concerted movement away from traditional realisations. The movement was upwards for THOUGHT and CHOICE and towards a diphthongal realisation for the GOAT vowel. The FACE vowel now shows a slight diphthongal quality for supraregional speakers, but the starting point of the vowel is never as low as in southern English pronunciations, this higher starting point keeping it separate from the latter realisations. The raising of back vowels during the past two to three decades has exhibited a degree of gender difference: the centralised onset for GOAT, i.e. [ə], was, and is, considered primarily a feature of female speech (Hickey 2005: 88-92). The GOOSE vowel still shows quite a range of values in Ireland. The now

6 The IPA chart would seem to favour the symbol [θ̠] for an apico-alveolar fricative. This consists of the theta symbol with an underscore indicating retraction of the point of articulation. The difficulty with this transcription is that it implies that the sound is phonologically related to the TH sound of English as in think /θ-/. However, the lenited /t/ of Irish English is an allophone of /t/ hence the symbol [ṱ], first introduced in Hickey (1984), is used here as the relationship to phonological /t/ is immediately obvious.


somewhat conservative supraregional accent has a high back vowel with no noticeable fronting and in some rural accents, notably in the south-west, in Co. Cork and Co. Kerry, the vowel is further retracted. But all newer, non-local Dublin English accents partake in the GOOSE-fronting so common across the anglophone world (Henton 1983; Fridland 2008; Docherty 2010; Mesthrie 2010). Like fronting of the MOUTH diphthong onset (see next paragraph), a front vowel in the GOOSE set is typical of local Dublin English and is enregistered in the pronunciation of book [bʉ:k] with a long, fronted vowel.7 But general fronting of words in the GOOSE lexical set is prevalent outside Ireland and so it does not appear to carry the stigma of highly local features such as the centralisation of the PRICE vowel or the high back realisation of the STRUT vowel.

The main diphthongs of English, PRICE and MOUTH, have been realised in supraregional speech with a similar low central starting point, i.e. price [prais] and mouth [maut]. Local Dublin English has different realisations: the onset of PRICE is a central schwa-like vowel, i.e. [prəis], and that of MOUTH is a low front vowel in the region of [e ~ æ], i.e. [meut] ~ [mæut]. Again with the changes of the 1990s these diphthongs shifted somewhat. In the late 1980s and early 1990s a retraction of the starting point for PRICE was observable (Hickey 1999), especially before voiced segments, i.e. pride was frequently [prɑid] showing, embryonically at least, a contextually differentiated realisation similar to one half of the phenomenon known as Canadian Raising (Chambers 1973). This retracted starting point did not spread and today there are few individuals who use it. With the MOUTH vowel the situation is different: here the low front starting point of local Dublin English was adopted into the new non-local accent and from there spread to more general forms of supraregional Irish English.
If the present author’s assumption is correct that the motivation for the back vowel raising of the early 1990s was dissociation from local Dublin accents (see Hickey 2013 for a fuller discussion), then the adoption of the local front onset of MOUTH would seem unlikely. However, like GOOSE-fronting, this onset is not a salient feature of local Dublin English and is not enregistered (Hickey 2016) in the same way that the central onset of PRICE or the high back vowel in STRUT is for local Dublin English. In addition, many forms of English outside Ireland, such as southern English and general American accents, have a low front onset for MOUTH so that in non-local Dublin English this would correspond to pronunciations in prestigious extranational accents.

4.3 Short vowels

Of all the short vowels of English only one or two are diagnostic of an Irish English accent. The KIT, DRESS and TRAP vowels are unremarkable, with values similar to standard southern English English, perhaps with a somewhat lower TRAP realisation. However, very recently the process of short front vowel lowering has reached Ireland from North America (Hickey 2016) and led to change in the speech

7 The long vowel in this word is referred to in the title of Kenny (2000) where it is associated with being a ‘true’ Dubliner.


of young females in Dublin. The LOT vowel in Irish English is similar to that in southern English English and shows no signs of merger with the THOUGHT vowel as has happened in Canada and is continuing to happen across the United States. However, individual lexical items may belong to a different lexical set than in English English, e.g. caught is homophonous with cot because the former word belongs to the LOT lexical set in Irish English. The STRUT vowel has a distinctive pronunciation in Irish English: the vowel is retracted towards cardinal [v] and can be somewhat rounded, i.e. [ß] or [Ê]. Despite all the changes of Irish English in recent years, which have made it quite like general accents of American English, the retracted, slightly rounded STRUT vowel is still a clear diagnostic of all varieties of Irish English. This pronunciation established itself in the twentieth century against the more open and lower [ä] pronunciation which was found in Ireland in the earlier twentieth century (see section 5.1 below). Like the GOOSE vowel, the FOOT vowel tends to be fronted in the speech of young females but is otherwise unremarkable. The tensing of the happY vowel, which is found in North America and the Southern Hemisphere (Wells 1982: 165-166), has always been typical of local Irish English and of early twentieth-century supraregional accents going on the recordings from this period. The NURSE vowel is a central rhotacised schwa in supraregional Irish English, but in local Dublin English the vowel is raised and retracted and the following /r/ is only weakly pronounced, if at all: [nu:(-)s]. In addition, local speech in the capital shows a different vowel, a mid open vowel, in the TERM lexical set, consisting of all words in which the inherited vowel derives from Middle English /e/ before /r/: [te:(-)m]. 
Mergers found in England, such as the NURSE-NORTH merger of Tyneside (Watt and Foulkes, this volume) or the NURSE-SQUARE merger8 of Liverpool (Watson and Clark, this volume) are generally not present in southern Irish English. In local Dublin English there is, however, a merger of /e/ and /e:/ before ambisyllabic /r/, e.g. MERRY and MARY are homophonous. But the height distinction, i.e. that between MERRY/MARY on the one hand and MARRY on the other, is clearly maintained.

5 Pre-WWII recordings

The recordings of this period, which have been evaluated for the present chapter, stem from individuals born in the second half of the nineteenth century and a few from the beginning of the twentieth, but none after WWII. Their speech is compared with two further types: (1) early twentieth-century RP (see Hickey, this volume) and (2) later non-vernacular Dublin English. There is not a great number of these recordings which is why the question of their representativeness must be

8 A feature of some middle-class Dublin accents has been the merger or near merger of NURSE and SQUARE, not found elsewhere in the south of Ireland. A reason for this could be to avoid the stigma of the [e:] vowel of local Dublin English found in the TERM lexical set. It is true that the SQUARE set is different from the TERM set but using an open front vowel in the former would be acoustically reminiscent of the latter in local accents.


considered carefully. In the following some relevant information is given concerning the persons in the recordings.

5.1 Individuals in the recordings

There are two groups. The first consists of seven prominent figures, six writers and one associated with the literary scene in Dublin at the end of the nineteenth century. The second group is formed by five major political figures who were Taoiseach (prime minister) and/or president of Ireland at some stage in their lives. All individuals were from Dublin or spent key parts of their lives there (this increases the plausibility of comparisons between them).

1.1) George Bernard Shaw (1856-1950) was born and educated in Dublin, moving to London at the age of twenty. In England he was first an art and music critic and later one of the most successful playwrights of his time. He remained in England for the rest of his life, living in the vicinity of London. In one of his recordings he remarks that he had an Irish accent.

1.2) W. B. Yeats (1865-1939), the national poet of Ireland, does not match the general pattern of established middle-class Dubliners born in the late nineteenth century. His pronunciation was clearly rhotic, he had a somewhat retracted STRUT vowel and a monophthong for GOAT. With a perceived centre of gravity (Kingston 1997) further back in the mouth than other Dublin contemporaries of his, let alone Maud Gonne, his speech had a distinctly rural flavour to it. Yeats also had a distinction between WH and W (which ≠ witch). These features may have been idiosyncratic on his part, but it is known that Yeats spent his childhood summers in Sligo in the north-west of Ireland where these features would have been typical of local speech (and still are to this day, Hickey 2004a). Yeats also had variable dentalisation of /t/ before /r/, a feature of vernacular Irish English.
There is no sign of T-lenition in his speech, but the only recording of Yeats is one in which he recites poetry and his slow deliberate style of delivery would not have been conducive to T-lenition. Yeats furthermore has the Dublin feature of SOFT-lengthening, at least in the word often [o:fn̩].

1.3) Maud Gonne (1866-1953) spent the first 16 years of her life in the south-east of England and in France until her father, an army officer, was transferred to Dublin in 1882. Given this background she would have acquired late nineteenth-century RP in England and in one clear feature her speech was different from the remaining Irish people being considered here: she had BATH retraction. She also had no T-lenition (at least in the recording of her voice).

1.4) Sean O’Casey (1880-1964) was born in Dublin’s inner city at a time when this was the centre of its poor, working-class population. In his youth he was engaged in socialist politics and also in the move towards independence during the 1910s, but not on a military level. In the 1920s his first plays were performed at the Abbey Theatre, the most famous of which are the Dublin trilogy (The Shadow of the Gunman [1923], Juno and the Paycock [1924] and The Plough and the Stars


[1926]). O’Casey’s speech was quite different from that of his near contemporary James Joyce, both from Dublin and both writers concerned with that city. O’Casey spoke with a local Dublin accent, quite close to what that accent still is like today. Prominent features of his speech are a low front vowel [e:] in the TERM lexical set and a high back vowel [u:] in the NURSE lexical set. He had dentalisation of T/D before R and no retraction of A after W, both features seen in wandering [wandrin]. His pronunciation was only slightly rhotic in keeping with local Dublin English rather than as an imported feature of English-oriented accents.

1.5) James Joyce (1882-1941), the great Irish novelist, was born in Dublin and educated, from the age of 6 to 10, at Clongowes Wood College, a prestigious boarding school near Dublin, and later at schools in Dublin and at University College Dublin. He left Dublin in 1904 to live permanently in various European countries, notably France, Switzerland and Italy (in Trieste, then part of Austria-Hungary). Joyce’s speech was variably rhotic, with slight AI/AU smoothing, above all before tautosyllabic /r/. Otherwise it was less English-like. He has a monophthong GOAT vowel, no TRAP raising and only slight STRUT lowering. As with the recordings of the other individuals considered here, it is difficult to say whether he had T-lenition in his vernacular mode of speaking as the recordings are of him reading passages from his own works in a declamatory style, seen in the rolled syllable-initial /r/ and the careful accentuation of all stressed syllables.

1.6) Patrick Kavanagh (1904-1967), a major poet of the generation after Yeats, was born in rural Co Monaghan, in the dialect transition area from southern to northern Irish English. However, his speech is reminiscent of pre-WWII middle-class Dublin English. Nonetheless, he had a distinction between the NURSE and TERM vowels which could well be a remnant of his rural upbringing.
But his speech lacked the high central vowel of northern Irish English (and of Co Monaghan), both alone and as a second element of the MOUTH diphthong, i.e. he had two [tu:] and down [daun] rather than [tʉ] and [daʉn] respectively. Kavanagh also had the common historical shift of /e/ to /a/ before /r/ and the alveolarisation of [iŋ] to [in], both seen in his pronunciation of learning as [larnin].

1.7) Samuel Beckett (1906-1989) was born in Foxrock, south Dublin into a Protestant family and received his primary schooling in the city. At the age of 13 he went to the Portora Royal School in Enniskillen, Co Fermanagh (now in Northern Ireland). He went to university at Trinity College Dublin (1923-27) and then went to France. He returned to work as a lecturer in Trinity College in 1930 but resigned the following year after which he travelled to England and the continent, settling in France where he spent the remainder of his life. Beckett is the author of novels and plays, the latter being responsible for his literary fame. Beckett’s speech was typical of his south Dublin middle-class upbringing at the beginning of the twentieth century. It was non-rhotic (words [wɜ:dz]), showed a low, open STRUT vowel (come [käm]) with slight GOAT diphthongisation (home [ho:ʊm]). The short recording of him reading some of his own work – an excerpt from the novel Watt


(1953) – shows clear T-lenition, e.g. Watt [wɒṱ], not [nɒṱ], meet [mi:ṱ].9 This could be due to Beckett’s manner of delivery which, given his introverted personality, would have been far removed from the declamatory style used by Yeats and Joyce when reading their works. Because Beckett emigrated to France in the 1930s T-lenition is most likely a feature he acquired in his formative years in Dublin. One other recording of Beckett is an amateur video of his reactions to a performance of his 1983 play What Where. Despite the very poor quality it is discernible that Beckett had a WH-W distinction (What Where [wɒṱ weə]) and did not have L-velarisation.

2.1) Seán Thomas O’Kelly (1882-1966) was born and educated in Dublin. He was active in the run-up to independence in 1922, especially in nationalist journalism, and was a member of Dáil Éireann (the Irish parliament) from 1918 to 1945 when he became president10 of Ireland and remained in this position until 1959 when he was succeeded by Eamonn de Valera (see below). The recording of his reminiscences of his activity around the time of independence was made in 1963. His speech is rhotic and contains clear T-lenition, these features pointing to supraregional speech of the mid-to-late twentieth century. But his STRUT vowel is relatively low and open, a feature of earlier Irish English.

2.2) Eamonn de Valera (1882-1975) was born in New York to a Spanish-heritage father and an Irish-heritage mother. At the age of two after his father’s death he was taken to Ireland and went to school in Co. Limerick and Co. Cork, moving to Dublin at the age of 16 to go to college there. De Valera, one of the main figures of the struggle for independence in pre-1922 Ireland, was later the leader of the major political party, Fianna Fáil, and Taoiseach (prime minister) from 1937 to 1948. From 1959 to 1973 he was president of Ireland.
In keeping with supraregional Irish English of the early twentieth century de Valera had a monophthongal GOAT realisation and a moderately open and low STRUT vowel. His speech was only very slightly non-rhotic and did not display any T-lenition, at least on the many recordings of him which are available.

2.3) Seán Lemass (1899-1971) was born in Co. Dublin and educated in the city. He took part in the political struggles before independence in 1922 and was Taoiseach (prime minister) from 1959 to 1966. The brief recording which was examined here is from 1961. His speech is clearly rhotic and his STRUT vowel is not as low as that of others considered here. Nor did he have AI/AU smoothing, GOAT diphthongisation or TRAP raising, so in all, his pronunciation sounds quite like supraregional Irish English between the 1960s and the 1990s. He is the only person, born in the nineteenth century and available on recording, for whom this is the case.

9 In the recording there are one or two instances where he avoids T-lenition, e.g. with stressed habitat [ˈhæbitæt].

10 The first president of Ireland was Douglas Hyde (1860-1949) from Co. Roscommon. He was a fervent supporter of the revival of Irish and hence in his presidential address in 1938, which has been preserved on a newsreel, he spoke in Irish. There is apparently no recording of Hyde speaking English.


2.4) Jack Lynch (1917-1999) was born and educated in Cork and remained connected to that city even after becoming Taoiseach (prime minister) of Ireland. He held this office for two periods, from 1966 to 1973 and from 1977 to 1979, something which demanded residence in Dublin. Lynch’s speech was rhotic and showed T-lenition; however, his STRUT vowel was relatively low and open, recalling earlier usage.

2.5) Charles Haughey (1925-2006) was born in Castlebar, Co. Mayo but moved to Dublin in his early childhood and was educated there. He was a key political figure in Ireland in the 1980s and served as Taoiseach (prime minister) three times, from 1979 to 1981, in 1982 and from 1987 to 1992. During this time he was leader of his political party Fianna Fáil. Haughey’s speech was definitely rhotic with clear T-lenition. He had a monophthongal vowel for the GOAT set, a retracted vowel in the STRUT set (much like present-day realisations) and no L-velarisation. The slight AI/AU smoothing before voiced segments, as in time [ta:ɪm] and round [ra:ʊnd], points back to earlier pronunciation models in Ireland.

5.2 Assessing the speech of the recordings

The twelve individuals considered here had very different backgrounds and biographies. Nonetheless, there are common features in their speech. For instance, L-velarisation is not present in any of the recordings of these persons. By the early twentieth century there was a recognisable degree of such velarisation among English RP speakers, albeit not as much as many RP speakers have today. The velarisation is easily recognised, for example, in the speech of Virginia Woolf (Hickey, this volume) in a word like fields [fi:ɫdz]. Another feature is the lack of BATH-retraction, found in all individuals except Maud Gonne who grew up in a military family in Surrey. Many of the speakers, e.g.
Joyce and Yeats, have a tapped or slightly trilled /r/ in syllable-initial position, probably as part of a consciously declamatory style when reciting their own works.

Major differences are discernible when the speech of these recordings is compared with present-day Irish English (see Table 1). All the speakers from the first group lack non-prevocalic /r/ with only one exception to this, W. B. Yeats. His rhoticity makes his accent sound distinctly like modern rural varieties of the west. What nearly all speakers have is a monophthongal GOAT vowel. The combination of this monophthong with non-rhoticity, which Shaw, Joyce and Kavanagh all showed, came to be impossible after WWII.

Table 1. Distribution of seven key phonetic features for early twentieth-century supraregional Irish English. A dash ‘—’ indicates that the recording(s) in question did not provide any context where the feature could be assessed.

Speaker                    Non-rhotic       STRUT low + central  GOAT central onset  TRAP raising  AI/AU smoothing  happY tensing  T lenition
George Bernard Shaw (1.1)  yes              yes                  no                  no            slight           variable       no
W.B. Yeats (1.2)           no               no                   no                  no            no               yes            no
Maud Gonne (1.3)           yes              yes                  yes                 yes           yes              no             no
Sean O’Casey (1.4)         yes (but local)  no                   no                  no            no               no             —
James Joyce (1.5)          variable         slight               no                  no            slight           yes            no
Patrick Kavanagh (1.6)     variable         no                   no                  no            yes              yes            no
Samuel Beckett (1.7)       yes              yes                  slight              no            no               —              yes
Seán T. O’Kelly (2.1)      no               slight               no                  yes           —                yes            no
Eamonn de Valera (2.2)     no               slight               no                  yes           —                slight         no
Seán Lemass (2.3)          yes              yes                  slight              yes           —                yes            yes
Jack Lynch (2.4)           no               slight               slight              no            slight           —              —
Charles Haughey (2.5)      no               slight               slight              no            slight           —              —

The seven features listed above are those which show the greatest contrast with present-day accents of Irish English. At present (2015) all accents of English in Ireland are rhotic with the sole but important exception of local Dublin English (Hickey 2005: 40). The STRUT vowel is now retracted and somewhat rounded. The GOAT vowel is slightly diphthongised for supraregional speakers and more so with young females in this group; a monophthongal realisation, i.e. [go:8], does still exist but is now associated with local rural accents. The TRAP vowel is low and front, i.e. [træp], with retraction to [trap] now characteristic of young females who have Short Front Vowel Lowering (Hickey 2016), as in dress [dræs]. By AI/AU smoothing is meant a faint or non-existent upglide from [a/<] to a high front or back vowel, especially before voiced segments. Such smoothing is most salient before tautosyllabic /r/, and has always been characteristic of RP (Cruttenden 2014: 151) where the reflex of non-prevocalic /r/ is schwa, of course. The starting point for AU is usually further back than for AI, keeping word pairs like tire [taq] and tower [t<q] apart. As opposed to present-day forms of Irish English, many of the early recordings show a lack of happY tensing, a feature of early twentieth-century RP (Wells 1982: 257). Finally, T-lenition, a ubiquitous feature of present-day Irish English (Hickey 2007: 322-325), is only found with more recent recordings, but see the remarks on Samuel Beckett above.

5.2.1 Rhoticity

This is a complex issue in varieties of Irish English. Older supraregional Irish English has a velarised post-alveolar rhotic in syllable codas, e.g. card [k<:xd]. This realisation is similar to the non-palatal /r/ in Irish (Hickey 2011: 225-226) and may well be derived from transfer during the historical shift from Irish to English. Local accents in North Leinster (north of Dublin, centred around the town of Drogheda) frequently have a syllable-final uvular [Z] (Hickey 2004a: 77-79), e.g. square [skweZ]. Local accents in Dublin, on the other hand, tend to be non-rhotic, e.g. square [skwea]. Since the 1990s non-local accents in Dublin have shown a retroflex [5] in syllable-final position, e.g. car [k<:5], card [k<:5d], and this has spread to the supraregional speech of all post-1990s speakers. What is remarkable in the pre-WWII recordings and some of the speakers born before WWII is the degree of non-rhoticity across a variety of speakers. This is very conspicuous in the context of present-day Irish English, which is strongly rhotic; indeed the retroflex [5] has been interpreted as a reaction by lower middle class speakers to the non-rhotic nature of local Dublin English, i.e. as a dissociation strategy (Hickey 2005: 47, 66-71). But the non-rhoticity of some older speakers of Dublin English was part of a refined pronunciation of the established middle classes, which were not involved in dissociation from local Dublin English.
So when does rhoticity11 establish itself for supraregional Irish English and does it come suddenly? The answer to the first half of the question is in the decades after independence in 1922. The answer to the second half is ‘no’. Among the recordings assessed here at least two of the speakers are variably rhotic: Patrick Kavanagh and James Joyce. In the case of Kavanagh his partial rhoticity might be a retention of his rural Co Monaghan accent with his absent rhoticity an adaption to perceived Dublin English models of his time. James Joyce seems to be a clearer case as he was from Dublin and given that he left Ireland to live in continental Europe for good in 1904 it can be assumed that his accent remained unchanged after that. There is a recording of Joyce reading the Anna Livia Plurabelle section from Finnegans Wake made in Cambridge in 1929 by C. K. Ogden who had access to recording machines. A few lines into the recording, Joyce reads the following stretch of text ‘And my cold cher’s gone ashley. Fielhur? Filou! What age is at?’

11 The issue here is the rise of rhoticity which has not been the subject of much investigation. It is the loss of rhoticity which has formed the focus of many scholars, see Hay and Clendon (2012).


The two words ‘Fielhur? Filou!’ are pronounced differently with the first ending clearly in a rhotacised schwa. However, in the early recording of the Aeolus episode of Ulysses, made in 1924 in Paris, Joyce’s speech is only faintly rhotic, if at all. In the opening words ‘Mr. Chairman’ both the r’s are pronounced, a little later the word Ireland is read as [a:qlqnd] and heard as [h=:d] (the Praat spectrograms for both these words showed no lowering of the third formant). Joyce also had a linking r which he realised as a tap, e.g. far away [f<:-4-qwei]. An indication that Joyce’s pronunciation was at least partially rhotic (see Figures 2 and 3) is that he had two different realisations for the BATH and START lexical sets, e.g. pass [pa:s] and remark [rq/m<:-k].

Figure 2. Rhotic pronunciation of ‘Mister’ by James Joyce in 1924 recording (third formant depressed)


Figure 3. Non-rhotic pronunciation of ‘heard’ by James Joyce in 1924 recording (no depression of third formant)

5.2.2 STRUT

In the context of present-day Irish English, which has a retracted and somewhat rounded STRUT vowel, the pronunciations found in the pre-WWII recordings are noticeable in having a vowel which is more open and further front. This realisation seems to have disappeared in the decades following WWII for the urban middle class groups in Dublin, and smaller Irish cities like Cork, which also had the more open pronunciation previously. The retracted pronunciation has always been typical of rural Irish accents and in Dublin city the local realisation of the STRUT set is with a high back vowel (now found only in the North of England), cf. come [kum], cut [kuh], blood [blud]. It is only the latter pronunciation which has been the object of negative comment and which is enregistered in Dublin, given that the sound is found in the local pronunciation of the city’s name, i.e. [dublin]. The element of STRUT which evokes social censure is not its roundedness but its height. The raised pronunciation means that there is no FOOT-STRUT split in vernacular varieties of the capital and it is this which would appear to be noticed by non-local speakers of Dublin English. The following table shows the median values for F1 and F2 with the STRUT lexical set for a range of speakers, the oldest of whom is W.B. Yeats (the year of birth is given in the right-most column) and the youngest of whom is Mary Robinson, a speaker of supraregional Irish English (1944- ) who grew up in post-WWII Ireland. The number of tokens for the STRUT lexical set varies between 3 and 10, depending on speaker and recording.

Table 2. STRUT vowel realisations for five speakers in pre-WWII recordings and two speakers in post-WWII recordings (median values of F1 and F2), arranged (i) chronologically, (ii) by value of F1 and (iii) by value of F2.

(i) Chronological order (by year of birth)
Speaker      F1 (Hz)   F2 (Hz)   Born
Yeats        708       1316      1865
O’Casey      630       1232      1884
Shaw         801       1558      1880
Joyce        740       1405      1881
de Valera    737       1371      1882
Haughey      639       1141      1925
Robinson     661       1192      1944

(ii) F1: relative openness (higher values = lower vowel)
Speaker      F1 (Hz)   F2 (Hz)   Born
Shaw         801       1558      1880
Joyce        740       1405      1881
de Valera    737       1371      1882
Yeats        708       1316      1865
Robinson     661       1192      1944
Haughey      639       1141      1925
O’Casey      630       1232      1884

(iii) F2: relative frontness (higher values = more front vowel)
Speaker      F1 (Hz)   F2 (Hz)   Born
Shaw         801       1558      1880
Joyce        740       1405      1881
de Valera    737       1371      1882
Yeats        708       1316      1865
O’Casey      630       1232      1884
Robinson     661       1192      1944
Haughey      639       1141      1925
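The orderings in sections (ii) and (iii) of Table 2 can be checked mechanically. The following is a minimal sketch in Python, not part of the original analysis: it sorts the reported median values by F1 and by F2 and compares the top of the two rankings. Joyce's F2 is taken as 1405 Hz, the value given in sections (ii) and (iii); the names `STRUT` and `rank_by` are introduced here purely for illustration.

```python
# Median STRUT formant values in Hz, (F1, F2), as reported in Table 2.
# Joyce's F2 follows sections (ii) and (iii) of the table (1405 Hz).
STRUT = {
    "Yeats": (708, 1316),
    "O'Casey": (630, 1232),
    "Shaw": (801, 1558),
    "Joyce": (740, 1405),
    "de Valera": (737, 1371),
    "Haughey": (639, 1141),
    "Robinson": (661, 1192),
}


def rank_by(index):
    """Return speaker names sorted from highest to lowest formant value.

    index 0 ranks by F1 (higher F1 = more open vowel),
    index 1 ranks by F2 (higher F2 = more front vowel).
    """
    ordered = sorted(STRUT.items(), key=lambda item: item[1][index], reverse=True)
    return [name for name, _ in ordered]


by_openness = rank_by(0)   # ordering of section (ii)
by_frontness = rank_by(1)  # ordering of section (iii)

print("By F1:", by_openness)
print("By F2:", by_frontness)
# The first four speakers occupy the same positions in both rankings.
print("Top four identical:", by_openness[:4] == by_frontness[:4])
```

On these values the four speakers with the most open vowels (Shaw, Joyce, de Valera, Yeats) are also the four with the most front vowels, in the same order, whereas O’Casey, Robinson and Haughey change places between the two rankings.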

The most open and most front STRUT vowel is for George Bernard Shaw followed by James Joyce. From the values one can see that Joyce had an English-oriented middle-class Dublin accent while O’Casey had one closer to local forms of Dublin English but still not with the high back vowel [u] found today. De Valera’s STRUT vowel is very similar to Joyce’s. Yeats also had a fairly low front vowel which contrasts somewhat with his otherwise rural-sounding accent with [o:] in the GOAT set. But living in Dublin Yeats would have been close to vernacular speakers with a high back vowel of the STRUT set and may have been (unconsciously) avoiding any suggestion of this pronunciation by using an open, front vowel. For Charles Haughey and Mary Robinson, both politicians active in the second half of the twentieth century, a less open and further back realisation of the STRUT vowel was typical (and is still in the case of Mary Robinson). For the first four speakers in sections (ii) and (iii) there is a clear correlation between relative openness and frontness of the STRUT vowel. The only speaker born in the nineteenth century for whom this is not the case is Sean O’Casey, who had a working-class background as opposed to the remaining middle-class speakers. The values for F1 and F2 in the first four rows in each section of the above table correspond roughly to those for Standard Southern British English, see the values given in the survey study by Ferragne and Pellegrino (2010: 28). These authors use the test word ‘Hudd’ to represent the STRUT lexical set in a /hVd/ template (Ferragne and Pellegrino 2010: 3). For each region the authors had about 20 speakers in an age range from 18 to 50. The speakers of Southern Standard (British) English had the following median values: F1: 623; F2: 1370 while those from the Republic of Ireland (all from Dublin) had these values: F1: 509; F2: 1209. The low F1 value for the Dubliners would point to high realisations of the STRUT vowel. In local Dublin English there is no FOOT-STRUT split (as the authors recognize, Ferragne and Pellegrino 2010: 23) and hence there is overlap in the realisations of their words ‘Hudd’ and ‘hood’. But because they treated all speakers from Dublin as a single group it is not possible to compare their values for non-local Dublin English STRUT to those found with recent supraregional speakers like Mary Robinson in Table 2 above.

5.2.3 GOAT

Traditional rural accents of Irish English have a monophthongal realisation of the GOAT vowel, [go:8]. In conservative supraregional speech the vowel is slightly diphthongised with a low but not a centralised starting point, not unlike American accents, i.e. [gou8].
As shown in Table 1 above, the early audio recordings of Irish people reveal a general pattern of GOAT realisations with little or no diphthongisation. In this respect the speakers are reminiscent of present-day rural speakers, but their frequent lack of rhoticity and their low, central STRUT vowel are not in keeping with rural accents. In fact, the feature combinations they show are no longer found today. In the following formant contour graphs it can be recognised that for Shaw, Yeats and Joyce F1 and F2 are parallel, indicating a monophthongal GOAT realisation (the contours are not very clear for Yeats due to the poor audio quality of the recording). De Valera has slight diphthongisation while Maud Gonne (raised in a middle class family in Surrey) has a high F2 at the onset of the GOAT vowel leading to a much lower value towards the end, indicating that she had a diphthong starting at a central [q] and moving back and up towards [u], see Figure 4.
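The reasoning applied to the formant contours can be illustrated with a small Python sketch. It is a simplification that looks only at the movement of F2 between the first and last sample of the vowel, and everything in it is an assumption made for illustration: the contour values are invented (they are not measurements from the recordings), and the 200 Hz threshold, the function names and the labels are hypothetical.

```python
# Hypothetical F2 contours in Hz, sampled at equal intervals through a GOAT vowel.
# These numbers are invented for illustration; they are NOT taken from the recordings.
flat_contour = [920, 915, 905, 900]        # F2 stays level: monophthong-like
falling_contour = [1500, 1350, 1150, 950]  # F2 falls: central onset, back offglide


def f2_change(contour):
    """Return the F2 change in Hz from the first to the last sample."""
    return contour[-1] - contour[0]


def classify_goat(contour, threshold_hz=200):
    """Label a contour by the size and direction of its F2 movement.

    threshold_hz is an arbitrary cut-off chosen for this sketch only.
    """
    change = f2_change(contour)
    if abs(change) < threshold_hz:
        return "monophthong"
    if change < 0:
        return "diphthong with backing glide"
    return "diphthong with fronting glide"


flat_label = classify_goat(flat_contour)
falling_label = classify_goat(falling_contour)
print(flat_label)
print(falling_label)
```

A fuller treatment would also track F1, since a monophthong is indicated by F1 and F2 running parallel, and would normalise for speaker and recording quality.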


Figure 4. GOAT realisations for five speakers from pre-WWII recordings (frequencies are given in Hz on left)

In non-local Dublin English of the past few decades a centralised starting point arose on a broad front with lower middle-class female speakers as in go [gqu], home [hqum], road [rqud]. A centralised onset is not widespread among present-day Irish males as the gender differentiation study in Hickey (2005: 88-91) has shown. However, as part of a refined middle-class Dublin accent the centralised onset is found with some males in pre-WWII recordings and a few post-WWII ones where the individuals in question were born well before WWII. With this group the source of the [qu] diphthong in the GOAT set is early twentieth-century RP. The adoption of an RP pronunciation in present-day Ireland is not something which is pursued by young speakers so the question of how the [qu] diphthong in the GOAT set arose needs to be answered. An internal reason can be posited for this: the raising of low back vowels in the 1990s meant that the THOUGHT vowel was encroaching on the phonological space of the GOAT vowel which was only slightly diphthongised at the time. By shifting the onset to a more central position the phonetic difference between GOAT as [gqu8] and THOUGHT as [to:8] was enlarged, blocking any tendency towards merger of the two vowels, cf. Figure 5.

Figure 5. Back vowel raising in Dublin English in the 1990s

Vowel shifts may be the product of both internal and external change: an additional motivation for GOAT centralisation, this time external, is the distance which it created to older rural monophthongal realisations from which sophisticated urbanites in the capital seemingly wished to dissociate themselves, albeit unconsciously. GOAT centralisation did not in fact make Irish English very much more like RP. There are several features of the new pronunciation of Irish English from the 1990s which, if anything, made it more similar to supraregional forms of American English: (i) a strongly retroflex [5], (ii) lack of BATH retraction, (iii) T-flapping, to mention just three prominent features (Hickey 2003a).

5.2.4 The TRAP-BATH split

It is true to say that there is no TRAP-BATH split in Irish English. Words which belong to the second lexical set in this pair, such as pass, grant, staff, are not pronounced with [<:]. In fact, there is only one lexicalised instance of the latter vowel in Irish English, i.e. father, which in all supraregional forms is pronounced [f<:dQ].12 A pronunciation with a mid [a:] or front [æ:] is only characteristic of strongly local urban or rural varieties. The TRAP-BATH split, which arose in Early Modern English through lengthening and later retraction of Middle English /a/ before voiceless fricatives and variably13 before syllable codas consisting of an alveolar nasal and homorganic stop, e.g. grant, dance (Dobson 1968: 525-535), did not occur in Irish English. If there were any Irish people with this split in their speech, then this was only because these individuals adopted the southern English pronunciation with [<:]. However, none of the Irish people in Table 1 above have BATH retraction.

12 Occasionally some speakers have rather with [<:] by analogy. Other words with this vowel, e.g. lather, are rarer still.

13 Disyllabic words may block the lengthening, e.g. cancel with a short vowel, and before an alveolar nasal plus voiced stop the vowel is short, e.g. grand, band, bland, though not if the /d/ forms the onset of the following syllable, e.g. Sandra /s<:ndrq/.

The START lexical set (Wells 1982: 157-159) is different as Irish English is rhotic. The vowel here is retracted because of the /r/ following it: [st<:r8]. Indeed for Irish English speakers an [<:] automatically implies a following /r/, cf. the word lager where the Irish pronunciation is [l<:rgQ], triggered by the English pronunciation with [<:] which for Irish ears is only possible in the context of a following /r/. Given the absence of [<:] in the BATH set today it can be asked whether this was ever present in any accent of Irish English. To answer this consider the references at the beginning of the twentieth century to what was then labelled the ‘Rathmines accent’14, a term which appears to have been used to refer to a middle-class southern English accent found with some Irish people. In Act III of the play The Plough and the Stars by Sean O’Casey (see above) an anonymous person described as “a fashionably dressed, middle-aged, stout woman” makes a very brief appearance. O’Casey, who always attempted to represent speech features by manipulating the spelling of English, alters a few words as can be seen from the following extract.

Woman: For Gawd’s sake, will one of you kind men show any safe way for me to get to Wrathmines? ... I was foolish enough to visit a friend, thinking the howl thing was a joke, and now I cawn’t get a car or a tram to take me home - isn’t it awful? ( … ) Woman: And what am I gowing to do? Oh, isn’t this awful? ... I’m so different from others ... The mowment I hear a shot, my legs give way from under me - I cawn’t stir, I’m paralysed – isn’t it awful?

The spelling Gawd would imply [g>:d], a pronunciation with a long vowel which is still found when this word carries heavy stress. The spelling cawn’t would seem to imply the use of [<:] in this word. Writing Wrathmines for Rathmines may suggest the same retracted long low vowel although this is less certain. O’Casey furthermore uses <ow> for <o>, perhaps to indicate the use of a diphthong with a centralised onset as in gowing for [gquin] or mowment for [mqumqnt]. Among the early recordings of Irish English, there is one of an Irish person who had BATH-retraction. The actor and later theatre director Ria Mooney (1904-1973) was (coincidentally) born in the suburb of Rathmines and educated in Dublin, joining the Abbey Theatre in 1924. On 8 February 1926 she played the role of the poor prostitute Rosie Redmond in the premiere of The Plough and the Stars. At the beginning of the twentieth century, working class people from Dublin’s inner city slums would have spoken with a strong local Dublin accent. A not inconsiderable irony is that Mooney’s accent (available in a short recording of her reminiscing about the first performance of the play) sounds uncannily similar to an upper-crust RP accent. Her speech was non-rhotic (here [hiq]), she had BATH retraction (after [<:ftq]), a low, central STRUT vowel and no happY tensing (country [käntri]).

14 The name refers to the suburb Rathmines in south Dublin, the more affluent part of the city, and the accent was regarded as fawning on a posh English pronunciation. There is no linguistic description of this accent, but references to it go back to the beginning of the twentieth century and were found in the Irish press, e.g. The Irish Times.

5.2.5 TRAP-BATH and the gender question

BATH-retraction is the only feature which Ria Mooney had and which is not present with any of the twelve individuals being considered here (see 5.1 above), except Maud Gonne, who was English. But there may have been a tendency for women to have this feature in early twentieth-century middle class Dublin English as opposed to males. To determine whether this was so, a further four recordings were analysed, this time all of females.15

1) Helena Moloney (1883-1967) was born and raised in Dublin and joined Maud Gonne in Inghinidhe na hÉireann ‘Daughters of Ireland’, the organisation Gonne founded to unite nationalistically minded women in Ireland. Her speech was rhotic, showed SOFT-lengthening, a retracted STRUT vowel and T-lenition. It had slight GOAT diphthongisation but no BATH retraction (command [kq/ma;nd]).

2) Sheila (or Sighle) Humphreys (1899-1994) was born into a wealthy Limerick family and moved to Dublin when she was 10 continuing her education there. She was a member of Cumann na mBan (‘The Women’s Organisation’) which succeeded Maud Gonne’s Inghinidhe na hÉireann. Her speech was rhotic with T-lenition and a monophthongal realisation of the GOAT vowel along with happY tensing and only slight AI smoothing (time [ta:Im]), but no BATH retraction (after [a;ftQ]).

3) Máire Comerford (1893-1982) was born in Rathdrum near Dublin into a middle-class family. She engaged in republican activities in the 1916-1922 period and was a member of Cumann na mBan as were Helena Moloney and Sheila Humphreys. Her speech was rhotic, with slight GOAT diphthongisation and AI-smoothing but no BATH-retraction (after [aftQ]).

4) Kathleen Clarke (1878-1972) was born in Limerick and educated there. After a time in the USA (1901-1907) she returned to Ireland and lived in Dublin where she was involved in the struggles from 1916 to 1922 and was the wife of Tom Clarke, one of the main leaders of the 1916 rising. Her speech was rhotic with a monophthongal realisation of the GOAT vowel and happY tensing, but no STRUT lowering/centralisation, no AI-smoothing and no BATH retraction.

15 The recordings are relatively short, generally stretches of original voices in documentaries, e.g. Irish Women Revolutionaries, on political events in early twentieth-century Ireland.


Figure 6. BATH vowel realisations for (1) Kathleen Clarke, (2) Máire Comerford, (3) Ria Mooney and (4) Maud Gonne

It would appear from the evaluation of these recordings (see Figure 6) that BATH retraction was not a general feature of female speech in early twentieth-century Ireland. Ria Mooney is an exception in this respect, something which may have been determined by her personality and her profession as an actor. Of all the features of early twentieth-century RP this was one which definitely was not adopted into Irish English and to this day a back vowel in the BATH set is associated with a fawning attitude towards England and Englishness and is frequently the object of ridicule by Irish people, often by referring to someone as having a grand [gr<:nd] accent. For present-day Irish English the TRAP and BATH vowels can be treated as one but with different realisations determined by (i) the degree of localness of an accent and (ii) the nature of the coda following the low vowel (phonetic conditioning). The phonetic range of the TRAP-BATH vowel is between [æ] and [a] with the front vowel a strong indicator of local accents, especially when the vowel is long. The length of the vowel is usually determined by the phonetic context. The vowel is short before voiceless stops, e.g. cap, cat, back, but somewhat longer before voiced ones (much as in other varieties of English). A recognisably longer vowel can be found before sibilants and nasals or nasals followed by a sibilant.16 This means that the words gas and dance both have long low vowels, [ga:s] and [da:ns] respectively, but in RP gas belongs to the TRAP set and dance to the BATH set. Despite the phonetic conditioning which can be posited for such realisations there is still much variation in length with low vowels in Irish English, e.g. Anne which can be pronounced [æn] or [a:n]. In his treatment of lexical sets for varieties of English Wells (1982: 142-144) has an additional set, the PALM set, whose vowel for RP speakers is identical with the BATH set. In supraregional Irish English the PALM vowel is always long and often contrasts in length with a short vowel before /m/. Despite its weak functional load there are instances of minimal pairs, cf. cam(shift) [kæm] ~ calm [ka:m], Pam(ela) [pæm] ~ palm [pa:m]. The non-PALM words with a low vowel before /m/ are always realised with a short vowel, irrespective of possible contrast with a corresponding PALM word, e.g. dam [dæm], ham [hæm], spam [spæm], clam [klæm].

16 This applies to monosyllables. With words of more than one syllable a long vowel is not usual, cf. pasta with [æ]. The lexical incidence of long and short vowels is not the determining factor here in contradistinction to other varieties such as American or Canadian English (Boberg 2010: 137-143).

5.3 Summary

The examination of the pre-WWII recordings shows that non-vernacular Dublin English, if not to say supraregional Irish English in the south of Ireland as a whole, displayed characteristics of early twentieth-century RP (see Table 3). This is not surprising given that before 1922 the whole of Ireland was a colony of Britain and because of this there was more direct exposure to varieties of English English, specifically from the south-east of England, due both to the presence of many English people in Ireland and the frequent employment of Irish people in England. England and Ireland also had a common civil service and administration until after WWI.

Table 3. Features of pre-WWII recordings (present to differing degrees with different speakers)

Derived from RP17
    Non-rhoticity or low rhoticity
    Low, open STRUT vowel
    GOAT diphthongisation
    Slightly raised TRAP realisation
    Lack of happY tensing
Specifically Irish features
    Lack of BATH retraction
    T-lenition

17 The THOUGHT vowel is relatively open with speakers in the pre-WWII recordings. This is in fact in keeping with the rather open quality of this vowel in RP in England at the time, the closing of the vowel being a later development.

American English had no influence on Irish English. Before the 1920s there was no radio so there was virtually no way in which Irish people could have gained experience of accents of American English, unless they went to America or, even less frequently, had visits from relatives who had grown up in America and hence had exposure to their accents. In addition, many of the features of present-day supraregional American English, such as the strongly retroflex [5], were not typical of supraregional forms of English in North America until after WWII. If the pre-WWII recordings are representative of larger groups of speakers in Ireland of their time then it would seem that non-vernacular Irish English moved away from similar registers of English English after independence in 1922. Conversely, it would seem that the pre-1922 political union of Britain and Ireland kept non-vernacular Irish accents closer to those in Britain.18 These accents did not disappear immediately and were often characteristic of middle-class south Dublin speech, e.g. Charles Mitchel (1920-1996), the main newsreader on Irish national television from 1961 to 1984, had a pre-WWII type accent which was non-rhotic, showed a central onset for the GOAT vowel and lacked happY tensing, all features which his main successor in this post, Ann Doyle (born in Co Wexford in 1952 and newsreader from 1978 to 2011), did not have in her speech.

18 If this statement is true, then there are consequences for New Dialect Formation scenarios (Trudgill 2004) in locations as distant as Australia and New Zealand. It would imply that colonial status led to non-vernacular accents in the colonies gravitating towards British accents and this would help account for the preference of south-east English type accents in anglophone countries such as Australia and New Zealand despite considerable regional input from the British Isles (Hickey 2003b).

6 Irish English after WWII

In general the pronunciation of Irish English after WWII moved away from early England-oriented models of middle class speech. More Irish features such as rhoticity, T-lenition, STRUT retraction and slight GOAT diphthongisation became typical of middle-class speech, including that of figures of public life such as Mary Robinson (1944- ), the former president of Ireland. She was born in Ballina, Co. Mayo in the north-west of Ireland and later became professor of law in Trinity College Dublin where she had received her undergraduate education. Robinson served in the Irish Senate and in 1990 became the first woman president of Ireland. Her pronunciation shows categorical T-lenition, i.e. it is present in all contexts in which its structural description is met. This is not connected to style, e.g. T-lenition is pervasive throughout her presidential address and many speeches as United Nations High Commissioner for Human Rights, and corresponds to the style-independent T-lenition of present-day Irish English. Furthermore, Robinson’s speech shows no L-velarisation, no NORTH-raising and no R-retroflexion, though her speech is clearly rhotic. In all these features it corresponds to supraregional Irish English of the latter half of the twentieth century which is well documented, see the audio data in Hickey (2004a) for those over 30 at the time of recording (early 2000s). From the vantage point of 2016, this pronunciation could be labelled ‘conservative supraregional Irish English’ seeing that it does not display the key features of the new pronunciation which arose in the 1990s (Hickey 2005), see Table 4.

Table 4. Changes in supraregional Irish English by speaker age


Speakers      Group 1      Group 2          New feature
              (over 40)    (under 40)

Consonants
WHICH         [wit$]       [wit$]           lack of [w] # [w] distinction
MEAL          [mi:l]       [mi:1]           syllable-final [1]
SORE          [so:r]       [so:5]           syllable-final retroflex [5]

Vowels
NORTH         [n>:rt]      [no:5t]          raising of vowel
MOUTH         [maut]       [meut]           fronting of diphthong onset
GOAT          [gou8]       [gqu8]           centralisation of diphthong onset (mostly typical of females)
GOOSE         [gu:s]       [g+:s, gy:s]     greatest degree of fronting found with young females
HORSE         [ho:rs]      [ho:5s]          merger of HORSE-HOARSE sets

The vowel changes for Group 2 constitute a typical shift involving several items. The main thrust of the vowel shift of the late 1980s and 1990s in Dublin is a movement upwards of back vowels or upwards for a diphthong starting point, see Table 5.

Table 5. Main shift in vowel values in 1990s Dublin English

a) raising of back vowels
   THOUGHT    />:/  → /o:/  → /o:/
   NORTH      />:r/ → /o:r/ → /o:r/
b) raising of diphthong onset
   CHOICE     />i/  → /oi/  → /oi/

The retraction of the PRICE vowel (Hickey 2005a: 51-53), as in time [tɑɪm], did not establish itself in later supraregional speech and there are only some individuals, mostly in south Dublin, who show a retracted starting point for this diphthong. The fronting of the onset for the MOUTH vowel is a different matter: this is a local feature which is also found in non-local forms of Dublin English and which is firmly entrenched in new supraregional Irish English. Despite these changes a southern Irish English accent can still be easily recognised. Dental stops are used for interdental fricatives, especially in stressed syllable onsets, e.g. think [t̪ɪŋk]. The STRUT vowel is quite far back, a slightly centralised version of cardinal vowel [ʌ], i.e. [ʌ̈], and may be somewhat rounded, i.e. [ɔ̈]. T-lenition, the realisation of intervocalic and word-final/pre-pausal /t/ as [ṱ], e.g. cut [kʌṱ], is universal in most varieties of Irish English. There are also several negative diagnostics of Irish English: there is no systemic TRAP/BATH distinction, TH-fronting does not occur anywhere and the use of a glottal stop for intervocalic /-t-/ is only found in local Dublin English.19

7 Conclusion

The severing of political ties with England in 1922 meant that the sway of English pronunciation models over supraregional middle-class usage in Ireland receded and a more independent non-local variety arose. This is obvious in the pervasiveness of rhoticity in all varieties of Irish English (bar local Dublin English), the retraction of the STRUT vowel, the decentralisation of the GOAT diphthong onset (where present) and the style independence of T-lenition. The increasingly unique profile of supraregional Irish English could have also been motivated, at least in part, by the fact that the Irish language had reached a dangerously low threshold by the time of independence – it was spoken as a first language by less than 10% of the population20 and the remaining native speakers lived exclusively in rural areas. A clear profile for English in Ireland ensured that the linguistic identity of Irish people could be successfully transferred from the Irish to the English language. The clarity of this profile has remained to this day, despite the dynamic nature of change in supraregional Irish English, which maintains its own characteristics, making it easily identifiable as uniquely Irish.

References

Boberg, Charles 2010. The English Language in Canada. Cambridge: Cambridge University Press.

Boberg, Charles 2012. Standard Canadian English, In: Raymond Hickey (ed.) Standards of English. Codified Varieties around the World. Cambridge: Cambridge University Press, pp. 159-178.

Clark, Lynn & Kevin Watson 2011. Testing claims of a usage-based phonology with Liverpool English T-to-R, English Language and Linguistics 15.3: 523-547.

Cruttenden, Alan 2014. Gimson’s Pronunciation of English. Eighth edition. London: Arnold.

Dobson, E. J. 1968. English Pronunciation 1500-1700. Vol.1 - Survey of the Sources. Vol.2 - Phonology. Second edition. Oxford: Oxford University Press.

Docherty, Gerard 2010. Phonological innovation in contemporary spoken British English, In: Andy Kirkpatrick (ed.) The Routledge Handbook of World Englishes. London: Routledge, pp. 59-75.

Ferragne, Emmanuel and François Pellegrino 2010. Formant frequencies of vowels in 13 accents of the British Isles, Journal of the International Phonetic Association 40.1: 1-34.

19 Word-finally, i.e. in the context /-t#/, glottalisation may occur in quick speech but not in a reading style, for example.

20 The official figures of the independent Irish state were unreliable because of over-reporting of competence in Irish. See Hindley (1990: 21-42) and Punch (2008) for assessments of speaker numbers in the twentieth century.


Foster, Roy F. 2015. Vivid Faces. The Revolutionary Generation in Ireland 1890-1923. Harmondsworth: Penguin.

Fridland, Valerie 2008. Patterns of /uw/, /u/ and /ow/ fronting in Reno, Nevada, American Speech 83.4: 432-454.

Hay, Jennifer and Alhana Clendon 2012. (Non-)rhoticity: Lessons from New Zealand English, In: Terttu Nevalainen and Elizabeth Traugott (eds) The Oxford Handbook of the History of English. Oxford: Oxford University Press, pp. 761-772.

Henton, Caroline 1983. Changes in the vowels of Received Pronunciation, Journal of Phonetics 11: 353-371.

Hickey, Raymond 1984. Syllable onsets in Irish English, Word 35: 67-74.

Hickey, Raymond 1989. R-coloured vowels in Irish English, Journal of the International Phonetic Association, 44-58.

Hickey, Raymond 1996. Lenition in Irish English. Belfast Working Papers in Linguistics 13, 173-193.

Hickey, Raymond 1999. Dublin English: current changes and their motivation. In: Paul Foulkes and Gerard Docherty (eds) Urban Voices: Accent Studies in the British Isles. London: Arnold, 265-281.

Hickey, Raymond 2003a. What’s cool in Irish English? Linguistic change in contemporary Ireland, In: Hildegard L. C. Tristram (ed.) Celtic Englishes III. Heidelberg: Winter, pp. 357-373.

Hickey, Raymond 2003b. How do dialects get the features they have? On the process of new dialect formation. In: Raymond Hickey (ed.) Motives for Language Change. Cambridge: Cambridge University Press, pp. 213-239.

Hickey, Raymond 2004a. A Sound Atlas of Irish English. Berlin: Mouton de Gruyter.

Hickey, Raymond 2004b. The phonology of Irish English, In: Bernd Kortmann et al. (ed.) Handbook of Varieties of English. Volume 1: Phonology. Berlin: Mouton de Gruyter, pp. 68-97.


Hickey, Raymond 2007. Irish English. History and Present-day Forms. Cambridge: Cambridge University Press.

Hickey, Raymond 2008. Feature loss in 19th century Irish English, In: Terttu Nevalainen, Irma Taavitsainen, Päivi Pahta and Minna Korhonen (eds) The Dynamics of Linguistic Variation: Corpus Evidence on English Past and Present. Amsterdam: John Benjamins, pp. 229-243.

Hickey, Raymond 2009. Weak segments in Irish English. In: Donka Minkova (ed.) Phonological Weakness in English. From Old to Present-day English. Basingstoke: Palgrave Macmillan, pp. 116-129.

Hickey, Raymond 2011. The Dialects of Irish. Study of a Changing Landscape. Berlin: de Gruyter Mouton.

Hickey, Raymond 2013. Supraregionalisation and dissociation, In: J. K. Chambers and Natalie Schilling (eds) Handbook of Language Variation and Change. Second edition. Wiley-Blackwell, pp. 537-554.

Hickey, Raymond 2014. Vowels before /r/ in the history of English, In: Daniel Schreier, Olga Timofeeva, Anne Gardner, Alpo Honkapohja and Simone Pfenninger (eds) Contact, Variation and Change in the History of English. Amsterdam: John Benjamins, pp. 95-110.

Hickey, Raymond 2016. English in Ireland: development and varieties, In: Raymond Hickey (ed.) Sociolinguistics in Ireland. Basingstoke: Palgrave Macmillan, pp. 3-40.

Hindley, Reg 1990. The Death of the Irish Language. A Qualified Obituary. London: Routledge.

Honeybone, Patrick 2007. New-dialect formation in nineteenth century Liverpool: a brief history of Scouse. In: Anthony Grant and Clive Grey (eds) The Mersey Sound: Liverpool’s Language, People and Places. Liverpool: Open House Press, 106-140.

Kenny, David 2000. The Little Buke of Dublin: or How to be a Real Dub. Dublin: New Island Books.

Kingston, John 2007. The phonetics-phonology interface, in Paul de Lacy (ed.) The Cambridge Handbook of Phonology. Cambridge: Cambridge University Press, pp. 401-434.

Lass, Roger 1984. Phonology. Cambridge: Cambridge University Press.

Mesthrie, Rajend 2010. Socio-phonetics and social change: Deracialisation of the GOOSE vowel in South African English, Journal of Sociolinguistics 14.1: 3-33.

Nevalainen, Terttu 1999. Making the best use of ‘bad’ data: Evidence for sociolinguistic variation in Early Modern English. Neuphilologische Mitteilungen 100.4: 499-533.

Punch, Aidan 2008. Census data on the Irish language, In: Caoilfhionn Nic Pháidín and Seán Ó Cearnaigh (eds) A New View of the Irish Language. Dublin: Cois Life, pp. 43-54.

Trudgill, Peter 2004. New-Dialect Formation. The Inevitability of Colonial Englishes. Edinburgh: Edinburgh University Press.

Upton, Clive 2008. Received Pronunciation, In: Bernd Kortmann and Clive Upton (eds) Varieties of English 1: The British Isles. Berlin: Mouton de Gruyter, pp. 237-252.

Watson, Kevin 2007. Liverpool English, Journal of the International Phonetic Association 37.3: 351-360.

Wells, John C. 1982. Accents of English. 3 vols. Cambridge: Cambridge University Press.


10 Evidence of American regional dialects in early recordings

Matthew J. Gordon and Christopher Strelluf

1 Introduction

In popular conception American regional dialects belong to a bygone era when people spent their lives in a single location and had limited contact with outsiders. Today, as the story goes, we live in a highly mobile society where people from different regions are regularly in contact and where everyone is exposed to the same broadcast media, resulting in homogenization as once distinctive local speech patterns have been leveled to an unremarkable “General American” dialect (Lippi-Green 2012: 27). Researchers of American English have worked to explode this popular myth and some of the most powerful counterevidence appeared in the Atlas of North American English (ANAE) (Labov, Ash and Boberg 2006). This continent-wide survey found that regional variation was alive and well. While some traditional accent features were in decline, Labov and his colleagues found that new phonological variables had arisen and served to distinguish speakers regionally. One of the most remarkable results emerging from this research was a confirmation of dialect boundaries established a half-century ago (e.g., Kurath 1949, Kurath and McDavid 1961).

The traditional picture of American dialects posits three major regions: North, Midland, and South. The boundaries separating these dialects were initially sketched by Kurath (1949) based on materials from the Linguistic Atlas of the United States and Canada project, begun in the 1930s, and later extended to cover the eastern half of the country by various dialect geographers. Research in this paradigm was retrospective in orientation and concentrated its investigation on conservative dialect forms by surveying the speech of mostly older, rural people. Lexical variables (e.g., pail vs bucket) played a central role in these analyses though pronunciation was also studied along with some grammatical variables.
The ANAE approach to surveying American dialects differed from this earlier work in almost every way. Labov and his team targeted younger urban speakers and focused their analysis on active sound changes. Nevertheless the ANAE map of dialect regions bears a striking resemblance to the picture drawn by Kurath and other dialectologists. Thus, the boundary between the North and the Midland continues to divide people linguistically even though the components of that divide have evolved. Generations ago Northerners might have distinguished their speech from Midlanders’ by calling dragonflies “darning needles” (cf. “snake feeders”) while today they might do so by their pronunciation of the vowels of LOT and TRAP.1

1 For ease of reading, we indicate vowel classes using the lexical sets developed by Wells (1982). We have adapted that system to describe the varieties of American English we examine. We use, for example, TRAP to designate the /æ/ phoneme class which includes Wells’s BATH class. Similarly by THOUGHT we intend to include the CLOTH words as well. Moreover, we use several novel lexical sets to distinguish conditioned vowels that are of interest in ongoing American sound changes (e.g., PIN and PEN to indicate pre-nasal KIT and DRESS; POOL and BOWL to indicate pre-/l/ GOOSE and GOAT).

The persistence of these dialect boundaries provides a backdrop for the current study as we seek to provide a historical perspective on the present-day patterns. We explore the time depth of some of the phenomena studied by Labov et al. (2006) with a goal of establishing how recently they have arisen. In this way we hope to shed light on the processes constructing the contemporary dialect divisions. Has this diversity resulted from a stream of innovations that stick to established geographical lines or does the present situation represent a continuation of earlier patterns? In keeping with the theme of this volume we sought answers to our research questions in archival recordings. We have gathered evidence from various sources to paint a picture of regional speech patterns among Americans born in the last quarter of the nineteenth century. For the sake of space, we concentrate our analysis on two of the three major dialect regions, the North and the Midlands. The history of Southern dialect patterns is touched on by Thomas (this volume) and has been treated previously by, for example, Bailey (1997) and Montgomery and Eble (2004).

2 Methods

Speech samples for this study were drawn primarily from two genres of recordings. First, we sought oral history collections in each region of interest. In particular, we sought the oldest available oral histories that emphasized some aspect of local or regional culture and therefore included a high proportion of locally born interviewees. We found several collections of interviews conducted in the 1960s and 1970s that were of sufficient quality for acoustic analysis. The oldest interviewees in these collections were born in the 1870s. We draw on such oral history collections for our speech samples in Buffalo, New York, Grand Rapids, Michigan, and St. Louis, Missouri. The emergence of radio broadcasting in the United States during the first half of the twentieth century yielded a second source of data. For our Kansas City, Missouri sample, we found recordings made in the 1940s that were of sufficient quality for acoustic analysis. Speakers drawn from these radio recordings were born as early as 1878. Finally, recordings at the Center for Applied Linguistics Collection at the Library of Congress contributed one speaker to our Kansas City sample and one speaker from Indianapolis, Indiana. Our search for relevant recordings was far from exhaustive, and we were overwhelmed by the wealth of available materials in archives around the nation.2

2 The Center for Applied Linguistics American English Dialect Recordings are available through the searchable map at http://memory.loc.gov/ammem/collections/linguistics/index.html on the Library of Congress’s American Memory site. The collection is a subcomponent of the Library’s American Folklife Center, which offers numerous sources of potential linguistic interest. Recordings made specifically for dialect study include those associated with the linguistic atlas projects (see http://us.english.uga.edu) and the Dictionary of American Regional English (see http://dare.wisc.edu). Michael Montgomery (p.c.) alerted us to the existence of a series of dialect samples recorded in the 1920s by Ayres and Greet (1930), which we were not able to consult for this study.

Given the diversity of the materials we draw from and particularly the fact that the recordings were made at various times, we frame our analysis in terms of the years of birth of the recorded speakers. Our approach relies on the logic of the apparent-time hypothesis that, for example, examining the speech of an octogenarian recorded in the 1960s opens a window on linguistic patterns in the late nineteenth century. Recent research on language change across the lifespan (e.g., Sankoff and Blondeau 2007) suggests we are generally on safe ground in assuming that phonological patterns like those examined here do not often change dramatically over the course of one’s life. The use of old recordings presented an array of technical challenges. Obviously, there were limitations to recording quality based on equipment available to interviewers from the 1940s to 1970s. Preservation techniques often further degraded recordings. And, since only a few of the interviews in our samples were designed for linguistic purposes, many were unusable because of background noise and other sound quality issues. These technical problems were overcome by limiting our sample to those recordings in a given collection that were relatively free of atmospheric noise and showed generally consistent amplitudes in the interviewee’s speech.

Beyond issues of technical quality, the method also presented specifically sociolinguistic challenges. Particularly important is that people who were likely to be recorded and archived in these collections tended to be those who were regarded as important in their day, often as political leaders or civic activists. Many of the speakers we analyze come from middle and upper-class backgrounds, and they appear to be European-Americans. As such, our samples are not representative of the diversity in the communities we examine, a fact that prevents us from exploring possible connections between language change and social factors. Moreover, though we have a better gender balance from some locations, our Kansas City and St. Louis samples include data only from men.

Archive managers for each collection oversaw digitization of original recordings. With the exception of the two Center for Applied Linguistics files, all recordings were digitized from original media specifically for this study. Analog recordings were sampled at 44.1 kHz with 16-bit resolution to create uncompressed WAV files as recommended by Thomas (2011: 24-27). The files were analyzed acoustically with the University of Pennsylvania Linguistics Lab’s Forced Alignment and Vowel Extraction (FAVE) program suite (Rosenfelder, Fruehwald, Evanini and Yuan 2011). Vowels were measured at one-third of their duration between the FAVE-marked onset and offset. Evanini (2009) identified the one-third measurement as most reliable for replicating results from the hand-measured ANAE data. To make inter-speaker comparison possible, vowel measurements were normalized with FAVE’s built-in transformation based on Lobanov (1971).

Not surprisingly, vowel measurement through FAVE introduces a series of potential errors. Strelluf (2014) details these in the context of a large-scale urban dialectology study. For the purposes of the present study it is sufficient to note that some of the vowel measurements herein are less accurate than they would be if they had been measured individually by hand. However, where hand-measurement of vowels would have allowed us to study only a small proportion of vowels for each speaker, FAVE makes it theoretically possible to measure every vowel that occurs during an interview. For the seventeen speakers we study, FAVE generated 56,023 vowel measurements. This volume of data affords a check against occasional inaccuracies that might occur in machine-analysis. It also allows us much greater flexibility to explore sound changes for conditioning factors, an analysis that we could not conduct meaningfully with a smaller set of hand-measured results.
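The two measurement steps described in this section, taking a reading one third of the way through each vowel and normalizing each speaker's formants after Lobanov (1971), can be illustrated with a minimal Python sketch. It is not FAVE's own code: the function names, the token layout and the formant values are invented for illustration, the sample standard deviation is an assumption, and the sketch stops at z-scores rather than reproducing any rescaling to Hz that FAVE may apply.

```python
from statistics import mean, stdev


def one_third_point(onset, offset):
    """Return the time (s) one third of the way from vowel onset to offset."""
    return onset + (offset - onset) / 3.0


def lobanov_normalize(tokens):
    """Z-score F1 and F2 within a single speaker, after Lobanov (1971).

    `tokens` is a list of dicts with keys "set", "F1" and "F2" (Hz).
    This layout is hypothetical, and using the sample standard
    deviation is an assumption of this sketch.
    """
    stats = {}
    for formant in ("F1", "F2"):
        values = [tok[formant] for tok in tokens]
        stats[formant] = (mean(values), stdev(values))

    normalized = []
    for tok in tokens:
        z = dict(tok)  # copy so the raw measurements are left untouched
        for formant, (mu, sigma) in stats.items():
            z[formant + "_z"] = (tok[formant] - mu) / sigma
        normalized.append(z)
    return normalized


# Invented measurements for one hypothetical speaker (not data from the study).
speaker = [
    {"set": "FLEECE", "F1": 320.0, "F2": 2300.0},
    {"set": "TRAP", "F1": 760.0, "F2": 1850.0},
    {"set": "LOT", "F1": 800.0, "F2": 1300.0},
    {"set": "GOOSE", "F1": 350.0, "F2": 950.0},
]

for tok in lobanov_normalize(speaker):
    print(tok["set"], round(tok["F1_z"], 2), round(tok["F2_z"], 2))
```

Because each speaker's values are expressed relative to that speaker's own mean and spread, differences in vocal tract size between, say, a man and a woman are largely factored out before vowel positions are compared.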

Results reported here are based on normalized measurements. Vowel charts show mean F1 and F2 values by lexical set for vowels that bear primary stress as marked in the CMU [Carnegie Mellon University] dictionary (Lenzo 2010). Lexical sets exclude vowels with a following nasal consonant or liquid, except in the case of lexical sets that specifically denote a following nasal, /l/, or /r/.

3 The North

In American dialectology, the North covers a large territory from New England down to central New Jersey and westward across the Great Lakes region into the northern plains. Kurath’s (1949) original treatment of the North observed several subdivisions (see Johnson and Durian, this volume). Greater uniformity seems to obtain to the west in the vast subregion known as the Inland North, which covers western New York, Michigan, and parts of Wisconsin, along with sections of Ohio, Indiana, and Illinois.

Several phonological features played a role in originally defining the North (see also Thomas 2010: 392). Kurath and McDavid (1961: 113), drawing on the early linguistic atlas materials, note that only in the North do words like due and new retain the diphthong /iu/ (noted as DEW henceforth), which is merged with GOOSE elsewhere. Furthermore, the PRICE diphthong was commonly produced by Northerners with a centralized nucleus (e.g., [əɪ]) that differed from the low vowel heard in the Midland (Kurath and McDavid 1961: 109). Also noteworthy is the finding that the vowels of the NORTH and FORCE classes were traditionally distinct across the North (and throughout the South) while in much of the Midland they were merged (Kurath and McDavid 1961: 120). While these historical differences have largely been leveled today, the distinctiveness of the North endures due in part to the phenomenon known as the Northern Cities Shift (NCS). The NCS describes an apparently coordinated series of vowel changes represented in Figure 1. While this representation glosses over certain phonetic complications (see Gordon 2001, 2012), it serves to illustrate the main trajectories of the shifting vowels. The pattern shown in the figure is found by ANAE to occur across the entire Inland North with some extensions beyond that region (see, e.g., discussion of St. Louis below).


Figure 1. The Northern Cities Shift (after Labov 1994: 191)

Given its tremendous geographical extent, we might expect that the NCS has been spreading for quite a while, though just how long it has been active is a matter of debate. While the changes came to the attention of linguists only in the late 1960s, some elements of the shift could even then be heard in the speech of people over seventy years of age (Labov, Yaeger and Steiner 1972; Labov 1994). Gordon (2001) speculated that the NCS had been operating since at least the 1930s and later suggested a starting point a few generations prior (Gordon 2012). Labov (2010) posited a much deeper time-depth, detailing a scenario in which the sociolinguistic forces that produced the shift were set in motion in the 1820s with the building of the Erie Canal. Uncertainties about the relative ordering of the component changes in the NCS further complicate the matter. Labov (e.g., 1994) argues that the shift began with the fronting and raising of TRAP, which spurred a drag-chain reaction from LOT, which in turn dragged THOUGHT down and frontward. Gordon (2001) challenges that account, noting that the dialectological records indicate that a fronted LOT vowel has characterized Northern speech for at least a century (see also Boberg 2001). McCarthy (2011) explores the history of the NCS through older recordings. Her acoustic analysis of Chicagoans born in the 1890s finds only hints of NCS-like movement in the form of mild raising of TRAP together with some fronting of LOT and sees clearer evidence of the current pattern appearing among people born nearly twenty-five years later. Thomas (2010) finds much the same pattern (i.e., minimal TRAP raising with some LOT fronting) among speakers born around 1880 some 350 miles to the east in northern Ohio. Taken as a whole then the literature suggests that the NCS came into its full form only in the twentieth century though some movement of TRAP and LOT can be heard earlier. Still, the relative order of these changes remains an open question.

We build on this picture of the phonological history of the Northern dialect by examining evidence from two locations at roughly opposite ends of the Inland North. In the east, we sampled four natives of the Buffalo area born from 1890 to 1907.3 To the west, we analyzed four natives of Grand Rapids, Michigan born between 1878 and 1893.4 Interviews in both collections were conducted in the 1970s. In addition to representing the eastern and western parts of the dialect region, Buffalo and Grand Rapids also differ in size, especially during the relevant period. Data from the 1890 US Census show that Buffalo had more than four times the population of Grand Rapids (255,664 vs. 60,278, see Gibson 1998). In Figure 2 we plot the vowel systems of the oldest speakers from each location: a Buffalo woman born in 1890 and a Grand Rapids man born in 1878. These speakers have very similar vowel spaces, and indeed they are representative of the larger sample.

3 Buffalo interviews are maintained in the University Archives Oral History Collection at the University of Buffalo. Low-resolution audio files are available at http://digital.lib.buffalo.edu/cdm/landingpage/collection/LIB-UA014.

4 Grand Rapids interviews are maintained in the Grand Rapids Oral Histories at Grand Valley State University. Transcripts and audio are available for many speakers at http://cdm16015.contentdm.oclc.org/cdm/landingpage/collection/p15068coll1.

Figure 2. Mean productions of vowel classes for (a) Grand Rapids man, b. 1878, and (b) Buffalo woman, b. 1890.

Examining the plots in Figure 2 for evidence of traditional Northern features, we find a mixed bag. The Grand Rapids speaker appears to maintain a distinct DEW class separate from GOOSE, though the subset of /u/ items with preceding coronal consonants (the TWO class) shows some fronting toward DEW. We also plot separately the instances of /u/ followed by /l/ (POOL), which commonly resist fronting. For the Buffalo speaker the separation of DEW, TWO, and even GOOSE is much reduced, casting doubt on the presence of a separate /iu/ phoneme in her inventory. The other speakers in the sample generally pattern with DEW front of TWO which is front of GOOSE (and POOL), and they differ only in the distances between these points.

Both of the speakers in Figure 2 show some separation of PRICE and PRIDE, which represent pre-voiceless and pre-voiced (and word-final) environments respectively. The higher position (lower F1) for PRICE is consistent with the pattern known as Canadian Raising (Boberg, this volume; Clarke, De Decker and Van Herk, this volume) though the nucleus does not approach the mid range described as characteristic of the North by Kurath and McDavid (1961: 109). Some other speakers in our sample show no separation of PRICE and PRIDE, and among those who do, none has raising much beyond the 800 Hz level.

We find only the slightest indications of the expected distinction of NORTH and FORCE with these speakers. For the speaker born in 1878 the mean position of FORCE is somewhat higher and farther back than that of NORTH and we find some separation from GOAT. Naturally the mean values can obscure substantial phonetic variation. In Figure 3 we plot the values for individual tokens of the NORTH and FORCE classes for our oldest speaker. Despite some overlap with a few tokens, we see here that items from each class tend to cluster separately, though very little acoustic space separates these clusters. Some other speakers in our sample show a similar distinction consistent with the interpretation that NORTH and FORCE constitute separate phonemic classes. Nevertheless, this pattern does not hold across all speakers, as is evident in the other half of Figure 2. For the Buffalo speaker born in 1890, the NORTH and FORCE classes along with GOAT and BOWL occupy the same territory. A plot of the individual tokens, which we do not include due to space constraints, shows nearly complete overlap. Thus, for this speaker and several others like her in our sample, the phonemic contrast of NORTH and FORCE appears to have collapsed.

Figure 3. NORTH and FORCE tokens for Grand Rapids man, b. 1878

In the following section we explore in detail a feature that has come to distinguish the North from the Midland: the fronting of back vowels, particularly GOOSE and GOAT. The ANAE evidence generally shows more conservative positioning of these vowels in the North, and our data suggest this was the case historically as well. The plots in Figure 2 illustrate the pattern seen across all our speakers. Other than some fronting of the TWO class, GOOSE occupies the upper back corner of vowel space. The lack of separation between GOOSE and POOL seen here further confirms the absence of significant fronting for this phoneme. The same picture emerges for the mid back vowels where we find GOAT just barely advanced from BOWL.

We conclude our examination of the North by assessing the evidence of the NCS. In the ANAE, Labov and his colleagues give several diagnostics for measuring participation in the NCS, which they use to define the North more broadly (see 2006: 207 for a map of the relevant isoglosses). The criteria most relevant here include:


1. Raised TRAP regardless of phonological context, with F1 lower than 700 Hz (AE1 criterion).

2. Fronted LOT, with F2 greater than 1450 Hz (O2 criterion).

3. Backed STRUT relative to LOT (F2 for STRUT lower than F2 for LOT) (UD criterion).

4. Reduced distance front-to-back between DRESS and LOT (F2 for DRESS is within 375 Hz of F2 for LOT) (ED criterion).

Among our Northerners, we detect some evidence of TRAP shifting in the predicted direction. Comparing the vowel plots in Figure 2 above, for example, we see TRAP lies some 100 Hz higher (with a lower F1 value) for the Buffalo speaker than for the apparently more conservative Grand Rapids speaker. We note too that in both cases we see a stronger tendency for raising in the pre-nasal context (the PAN class). The NCS is thought to be an urban phenomenon in origin – hence the “cities” in its name – and we might wish to interpret the contrast between the speakers in Figure 2 in these terms. However, we can readily find counterevidence for the suggestion of an urban lead in the changes with examples like the Grand Rapids woman whose vowel system we plot in Figure 4. This speaker, born in 1890, has TRAP slightly higher than her Buffalo counterpart (Figure 2). Still, neither this speaker nor any other in our sample meets the ANAE criterion of a mean TRAP under 700 Hz.
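The four criteria listed above are simple threshold tests on a speaker's mean formant values, so they can be written out directly. The Python sketch below is only an illustration: the function name, the dictionary layout and the mean values are hypothetical, and "within 375 Hz" is read here as an absolute difference in F2.

```python
def ncs_criteria(means):
    """Apply the four NCS diagnostics listed above to one speaker's means.

    `means` maps a lexical set name to a (mean F1, mean F2) pair in Hz.
    The layout is an assumption of this sketch.
    """
    trap_f1 = means["TRAP"][0]
    lot_f2 = means["LOT"][1]
    strut_f2 = means["STRUT"][1]
    dress_f2 = means["DRESS"][1]
    return {
        "AE1": trap_f1 < 700,               # raised TRAP
        "O2": lot_f2 > 1450,                # fronted LOT
        "UD": strut_f2 < lot_f2,            # STRUT back of LOT
        "ED": abs(dress_f2 - lot_f2) < 375, # DRESS close to LOT in F2
    }


# Invented means for a hypothetical speaker (not data from the study).
speaker_means = {
    "TRAP": (760, 1850),
    "LOT": (800, 1480),
    "STRUT": (690, 1400),
    "DRESS": (620, 1900),
}

result = ncs_criteria(speaker_means)
for name, met in result.items():
    print(name, "met" if met else "not met")
```

With these invented values the speaker meets O2 and UD through a fronted LOT alone while failing AE1 and ED, which shows how a single vowel's position can satisfy more than one criterion.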

Figure 4. Mean productions of vowel classes for Grand Rapids woman, b. 1890

With the evidence related to LOT fronting we find a stronger case for early NCS activity. Four of our eight speakers meet the ANAE O2 criterion with mean F2 values above 1450 Hz. These four and two others meet the UD criterion, and they do so mainly by virtue of a fronted LOT. We do not find the extreme backing of STRUT heard in NCS areas today. Similarly, we see few indications that DRESS is shifting. Four of our speakers meet the ED criterion, but they achieve this largely by fronting LOT, as DRESS maintains its position as a front vowel.

For a more nuanced exploration we can look beyond the mean values to the distribution of individual tokens of the NCS vowels. Figure 5 plots the results for tokens of TRAP/PAN, LOT, and THOUGHT for the Grand Rapids speaker whose means are shown in Figure 4. We see that several of her /æ/ words appear above the 700 Hz threshold, and among these are some in which the vowel is raised before voiceless stops (e.g., at, exactly), a distinctly NCS-like pattern. This speaker seems more conservative in her pronunciation of the other vowels. A few tokens of THOUGHT are produced with some apparent lowering in LOT territory. More surprising perhaps is the very limited fronting seen with LOT. She does, however, produce some tokens very low in vowel space, and such lowering may be indicative of the NCS (see Gordon 2001).
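Looking at individual tokens in this way amounts to counting, for each following environment, how many /æ/ tokens fall above the 700 Hz line (that is, have F1 below 700 Hz). A minimal Python sketch of that tally follows. The token records, the use of ARPABET-style labels for the following segment and the function names are all assumptions made for illustration, and the F1 values are invented.

```python
VOICELESS_STOPS = {"P", "T", "K"}
NASALS = {"M", "N", "NG"}


def following_context(segment):
    """Classify the segment after the vowel (ARPABET-style label assumed)."""
    if segment in NASALS:
        return "nasal"
    if segment in VOICELESS_STOPS:
        return "voiceless stop"
    return "other"


def raised_by_context(tokens, threshold=700):
    """Return {context: (raised, total)} where raised means F1 < threshold Hz."""
    counts = {}
    for tok in tokens:
        ctx = following_context(tok["following"])
        raised, total = counts.get(ctx, (0, 0))
        counts[ctx] = (raised + int(tok["F1"] < threshold), total + 1)
    return counts


# Invented /ae/ tokens for a hypothetical speaker (not data from the study).
tokens = [
    {"word": "at", "F1": 680, "following": "T"},
    {"word": "exactly", "F1": 690, "following": "K"},
    {"word": "man", "F1": 640, "following": "N"},
    {"word": "bad", "F1": 780, "following": "D"},
    {"word": "back", "F1": 750, "following": "K"},
]

for ctx, (raised, total) in raised_by_context(tokens).items():
    print(f"{ctx}: {raised} of {total} tokens raised")
```

Separating the pre-nasal tokens from the rest matters because pre-nasal raising is widespread in American English, so it is the raised tokens before voiceless stops that point more specifically to the NCS.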

Figure 5. TRAP/PAN, LOT, and THOUGHT tokens for Grand Rapids woman, b. 1890

We see a quite different distribution of tokens in Figure 6, which plots the results for TRAP/PAN, LOT, and THOUGHT for a Buffalo man born in 1899. Several examples of /æ/ are raised beyond 700 Hz though most of these involve a following nasal. We see no clear separation between TRAP and LOT as many tokens of the latter are fronted to the F2 range of 1600 Hz. Similarly it seems THOUGHT has undergone lowering into, but just slightly back of, LOT territory. This picture suggests the speaker has taken some first steps in the NCS.


Figure 6. TRAP/PAN, LOT, and THOUGHT tokens for Buffalo man, b. 1899

In sum, our glances into the history of Northern phonology largely confirm the observations of McCarthy (2011) and Thomas (2010) that the vowel changes associated with the NCS appeared only in a nascent form in the late nineteenth century. Still, when we look beneath the surface, as we do by examining a large number of examples in Figures 5 and 6, we find indications that the wheels of the NCS were in motion at that time. By contrast we are surprised to find that some of the features described in the dialectological literature as characteristic of the North were not so consistent among our sample. This discrepancy may stem from methodological differences between our study and earlier work, with regard to data collection (e.g., unscripted conversational speech vs. elicited words in isolation) or population sampled, though such questions must remain for future research.

3 The Midland

Kurath (1949) mapped the Midland based on his observations of the eastern seaboard and the settlement patterns that carried settlers westward from that region. As sketched in traditional dialectology the Midland encompasses a vast and diverse geography, cutting across Ohio, Indiana, and Illinois in the north and skirting the southern edge of the Appalachians in the south before stretching westward across Kentucky and Tennessee and beyond. From a phonological perspective, Labov et al. (2006) define the Midland more narrowly, roughly on the lines of the traditional North Midland, with a southern boundary following the Ohio River. They describe this region as something of a “lowest common denominator of the various dialects of North America” that “does not show the homogenous character” that marks other regions (pp. 264-265). Instead, many Midland urban centers have developed unique local dialects. Even with this lack of regional homogeneity, ANAE identifies a set of diagnostics for the Midland (see pp. 263-266 for discussion):5

1. The “transitional” merger of LOT and THOUGHT, where the vowels are not produced and perceived uniformly as either the same or distinct.

2. Fronting of GOAT beyond 1200 Hz in F2.

3. Fronting of MOUTH beyond 1550 Hz in F2.

4. Fronting of STRUT beyond 1450 Hz in F2.

While most of these characteristics are shared in some way with other dialect regions, their combination is unique to the Midland, and in many cases – especially the fronting of back vowels – the Midland exhibits more extreme manifestations than other regions. More importantly, because some of these changes are moving in opposition to corresponding changes in the NCS (e.g., STRUT backs in the NCS, but fronts in the Midland), the boundary between the North and Midland marks one of the sharpest dialect divisions in North American English. ANAE describes two cities as prototypical of the broader Midland pattern: Kansas City and Columbus, and three cities as marked by locally unique dialects: St. Louis, Cincinnati, and Pittsburgh. We provide new historical data for Kansas City and Indianapolis and review findings for Columbus and central Ohio to explore the broader development of the Midland dialect. We provide new historical data for St. Louis and review findings for Pittsburgh and western Pennsylvania to explore the development of localized Midland patterns.

3.1 Kansas City

ANAE identifies Kansas City as a prototype for Midland speech, in particular for extreme fronting of back vowels. All four speakers surveyed for ANAE show F2 values for GOAT above 1200 Hz – one fronts past 1400 Hz, and another fronts beyond a central position of 1550 Hz (Labov et al. 2006: 265). All show extreme fronting of MOUTH beyond 1750 Hz, placing the nucleus well into TRAP territory (p. 267). All front STRUT beyond a central position of 1550 Hz, with one exceeding 1650 Hz. In the Midland’s transitional low back vowel merger, all Kansas City speakers are judged to have these vowels close (p. 264). Kansas City also shows a relatively high rate of conditioned merging of KIT and DRESS before nasal consonants (the PIN=PEN merger), which is a more Southern pattern but shows robust distributions in several Midland cities (p. 68).

Our Kansas City sample is drawn from 1940s radio recordings of five men born between 1878 and 1893.6 A man born in 1902 was also included, drawn from an interview recorded around 1975 and housed in the Center for Applied Linguistics Collection. Figure 7 shows vowel charts for speakers born in 1884 and 1887, who are generally representative of our Kansas City sample.

5 ANAE also describes the monophthongization of PRIDE before resonants, but not before obstruents in the Midland (Labov et al. 2006: 266). The distribution appears to be relatively limited, though, and we do not explore it here.
6 Radio recordings are housed at the Marr Sound Archive in Miller Nichols Library at the University of Missouri-Kansas City, catalogued at http://library.umkc.edu/marr.

Figure 7. Mean productions of vowel classes for Kansas City men, born (a) 1884 and (b) 1887


It is immediately clear that the ANAE-observed pattern of fronted back vowels has not developed in these speakers. GOAT occupies a conservative F2 position showing relatively little separation from BOWL. The nucleus of MOUTH is back of center in F2. For the speaker born in 1884, STRUT is just front of the Midland threshold at 1459 Hz in F2 (it is also at a central position for the 1878 Kansas Citian, who is not shown), but is back of this position for all others. TWO shows some fronting relative to GOOSE and POOL, but even here these speakers appear relatively conservative. Contrary to expectations of traditional Midland speech, Kansas Citians appear to maintain a clear separation between TWO and DEW, and between NORTH and FORCE. In short, the Kansas City sample offers little indication of the unique Midland pattern that will emerge for the fronting of back vowels, and generally denies a clear regional classification in terms of markers noted in other studies of the North-Midland dialect boundary (e.g., Thomas 2010). There is similarly little evidence of the low back vowel merger or the PIN=PEN merger among these early Kansas Citians. Most speakers produce LOT in a relatively high position pre-nasally, with LOT F1 values nearing those of THOUGHT. But the phonemes remain distinct elsewhere. One speaker, born in 1902, stands out as having an apparent PIN=PEN merger. Figure 8 displays his productions of PIN and PEN tokens.

Figure 8. PIN and PEN tokens for Kansas City man, b. 1902

His tokens of in and finished make clear incursions into PEN territory, and ended appears in the high range of PIN tokens. F1 and F2 differences are not significant in Welch’s two-sample t-test (PIN F1: 591, PEN F1: 621, p = 0.1208; PIN F2: 2044, PEN F2: 2005, p = 0.6177). Nothing in the life story that this Kansas Citian shares suggests that he spent time in areas where he would have obviously had contact with the PIN=PEN merger; he attended college at the University of Missouri and lived in Chicago during a brief stint as a professional boxer. Otherwise, he spent his entire life in Kansas City.
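The comparison of PIN and PEN formant means above relies on Welch’s two-sample t-test, which does not assume equal variances in the two classes. The following is a minimal sketch of how the test statistic and its Welch-Satterthwaite degrees of freedom can be computed. The function name `welch_t` and the token values are hypothetical and invented for illustration; they are not the measurements behind Figure 8.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard error of each mean; variance() is the sample variance.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical F1 values in Hz, invented for illustration only;
# they are NOT the tokens measured for the 1902 speaker.
pin_f1 = [570, 585, 600, 610]
pen_f1 = [600, 615, 630, 640]

t_stat, dof = welch_t(pin_f1, pen_f1)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A p-value would then be read from the t distribution with `dof` degrees of freedom; a statistics package (for example `scipy.stats.ttest_ind` with `equal_var=False`) performs both steps at once.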


This exception is noteworthy, and may point to an early innovation in the development of the Midland dialect of Kansas City. The behavior of the sample as a whole, though, supports the general conclusion that ANAE’s diagnostics for the Midland had not yet developed in Kansas City at the turn of the twentieth century.

3.2 Central Ohio

Thomas (2010) provides perhaps the most comprehensive model for the type of exploration we are conducting here in his study of the North-Midland boundary in Ohio. His acoustic data is drawn primarily from his analysis of DARE recordings of Ohioans born between 1880 and 1907. He provides individual vowel plots for two Midland speakers. One born in Dover, Ohio in 1887 shows conservative positions for back vowels, with F2 measurements consistent with those we observe in Kansas City. The other, born in Mount Vernon, Ohio, appears to be more innovative in fronting back vowels, as well as displaying a probable merger of LOT and THOUGHT in pre-/t/ contexts (2010: 391). Across all Thomas’s speakers, three Midlanders show advanced or intermediate GOAT fronting, compared with seven speakers with more conservative back productions (p. 408). Four show more advanced MOUTH fronting against six showing more conservative forms (p. 406). One is merged in LOT and THOUGHT, four are transitional, and five are distinct (p. 410).

Durian’s (2012) reanalysis of field notes from linguistic atlas interviews of speakers from Columbus seems to support these findings. Among two men, one born in 1846 and the other in 1854, he finds diphthongization (but not yet fronting) of GOAT, “beginnings of centralization/fronting” of TWO, and no MOUTH fronting (p. 89). He also notes a close realization of LOT and THOUGHT before /t/. His acoustic measurements of speakers born as early as 1896 appear to confirm the conservative positions of back vowels, though all appear to realize TWO in a front-of-center position (e.g., p. 363), and a few appear to have relatively fronter productions of GOAT (e.g., p. 379, 410). These findings hint at the future development of the Midland dialect, but clearly remain far behind the productions characteristic of the region defined in ANAE.

3.3 Indianapolis

Fogel (2008) provides data from four Indianapolis speakers born before 1940, including one speaker born as early as 1928. She describes the city as “largely prototypical of the Midland region” (p. 147). Her oldest group appears to conform to this characterization, showing fronted MOUTH and GOAT (pp. 140-141). One of her four oldest speakers has merged LOT and THOUGHT in pre-/n/ contexts, and one has merged them in pre-/l/ contexts. As such, her oldest sample shows a clear emergence of Midland dialect traits.

We sampled a single Indianapolis speaker, born around 1904, through the Center for Applied Linguistics Collection. Figure 9 shows her overall system. MOUTH remains back of a central position at 1467 Hz in F2. GOAT, at 1150 Hz in F2, is slightly back of the Midland threshold for fronting.
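The comparison of a speaker’s mean F2 values with the Midland fronting thresholds listed earlier (GOAT 1200 Hz, MOUTH 1550 Hz, STRUT 1450 Hz) can be sketched as a simple lookup. This is only an illustration, not the procedure used in ANAE or in this chapter; the names `MIDLAND_F2_THRESHOLDS` and `midland_fronting` are hypothetical, and the input values are the two means reported for the Indianapolis speaker.

```python
# F2 thresholds (Hz) for the three acoustic Midland diagnostics listed in the text.
MIDLAND_F2_THRESHOLDS = {"GOAT": 1200, "MOUTH": 1550, "STRUT": 1450}


def midland_fronting(mean_f2):
    """Map each vowel class in mean_f2 to True if its mean F2 exceeds the threshold."""
    return {
        vowel: mean_f2[vowel] > threshold
        for vowel, threshold in MIDLAND_F2_THRESHOLDS.items()
        if vowel in mean_f2
    }


# Mean F2 values reported in the text for the Indianapolis woman born around 1904.
indianapolis_1904 = {"GOAT": 1150, "MOUTH": 1467}
print(midland_fronting(indianapolis_1904))
```

Both classes fall below their thresholds, which matches the description of this speaker as not yet showing Midland fronting. Vowel classes without a reported mean (here STRUT) are simply left out of the result.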


Figure 9. Mean productions of vowel classes for Indianapolis woman, b. 1904

Acoustically, she seems to show relatively little distinction between LOT and THOUGHT, especially in F2. Figure 10 shows productions of LOT and THOUGHT tokens with a following stop. There is a great deal of overlap in vowel space, with lot, got, and copies occurring near thought, bought, and taught, and walk occurring at a relatively low position. Overall differences between these classes are significant in F1 (LOT F1: 761, THOUGHT F1: 704, p < .0033), but not F2 (LOT F2: 1189, THOUGHT F2: 1162, p = .3934). While not shown due to space considerations, a similar pattern appears for PIN and PEN, which show separation only in height.


Figure 10. LOT and THOUGHT tokens for Indianapolis woman, b. 1904

While our study of a single Indianapolis speaker can be only suggestive, it seems to support Fogel’s finding of an emergent low back vowel merger in the first half of the twentieth century. There is also a similar development toward the PIN=PEN merger, reminiscent of the pattern exhibited by our 1902 Kansas City man.

3.4 St. Louis

ANAE portrays St. Louis “as a northern enclave in the Midland area” (Labov et al. 2006: 276). St. Louisans in their survey meet roughly half of the diagnostics for the North, including the AE1, O2, UD, and ED criteria (see NCS discussion above). On the other hand, St. Louis speakers do not show the most advanced productions characteristic of the NCS, for instance TRAP becoming fronter and higher than DRESS, or STRUT backing toward THOUGHT. ANAE also finds St. Louis meeting Midland thresholds for GOAT and MOUTH fronting, though never to the extreme front positions observed in cities like Kansas City, Columbus, and Indianapolis.

St. Louis is also noted for its locally unique productions of the NORTH and FORCE classes. Rather than merging at the higher position of FORCE as happened in much of the United States, the classes are said to remain distinct and START tokens raise to the position of NORTH (the CORD=CARD merger). In some popular stereotypes of old St. Louis speech, forty is pronounced as “farty.” The merger is generally seen to be receding in St. Louis speech, but four of five St. Louis speakers in ANAE still demonstrate the pattern (Labov et al. 2006: 277-278).


We analyzed recordings of two St. Louis speakers, one born in 1912 and the other in 1915, drawn from interviews conducted in the late 1990s.7 Vowel charts for both speakers are shown in Figure 11.

7 Both interviews are part of the USS Schley Oral History Project, archived at the State Historical Society of Missouri. Collection information is available at http://shs.umsystem.edu/manuscripts/invent/4068.pdf.

Figure 11. Mean productions of vowel classes for St. Louis men, born (a) 1912 and (b) 1915

By ANAE diagnostics, the system does not appear to be obviously either Midland or Northern in character. TWO and DEW appear to be nearing merger as expected for the Midland, but POOL also seems to be fronting with GOOSE, characteristic of today’s South. The /u/ vowels are particularly front for the 1912 man. GOAT is fronter for both men than it is for our Kansas City speakers, but has not reached the 1200 Hz Midland threshold. TRAP appears to have raised in the direction of DRESS and fronted past it, but neither speaker crosses the AE1 threshold of 700 Hz. Furthermore, the higher position of PAN suggests phonetically conditioned raising, rather than the general TRAP raising characteristic of the North (cf. Labov et al. 2006: 176).

As with Buffalo and Grand Rapids speakers above, LOT shows more NCS-like innovations. Both speakers meet the ANAE O2 threshold for LOT F2 exceeding 1450 Hz – the 1912 speaker does so comfortably at 1488 Hz, the 1915 speaker just exceeds it at 1460 Hz. Both speakers also meet the ED criterion as their fronted LOT reduces distance in F2 from DRESS below 375 Hz. The 1912 speaker has a difference of 213 Hz and the 1915 speaker a difference of 317 Hz. The front position of LOT also allows both speakers to just meet the UD criterion, despite a mean STRUT F2 of 1458, which also remains front of the Midland STRUT threshold. As such, these two speakers meet three of the four NCS thresholds that St. Louis speakers surveyed in ANAE meet, only lagging in the AE1 criterion, while also reflecting some Midland developments.

In terms of TRAP raising, the 1915 speaker appears to lead, despite his trailing the 1912 speaker in measures related to LOT fronting. Figure 12 shows values for all TRAP and DRESS tokens (vowels preceding a nasal consonant, /l/, or /r/ are excluded).
While, as a whole, his TRAP productions are lower in vowel space than his DRESS productions, there is a relatively high degree of overlap in F1 between the two classes (Figure 12). Several words, e.g., bad, scrap, passed, have F1 below 700 Hz.
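The tally of NCS criteria for the two St. Louis speakers can be retraced with a short sketch. Several inputs are assumptions rather than reported measurements: DRESS F2 is derived here as LOT F2 plus the reported DRESS-LOT distance, the STRUT F2 of 1458 Hz is applied to both speakers, the TRAP F1 of 720 Hz is a hypothetical placeholder standing in for “above 700 Hz”, and UD is taken to mean that STRUT is backer than LOT (lower F2). The function name `ncs_criteria` is likewise hypothetical.

```python
def ncs_criteria(trap_f1, lot_f2, dress_f2, strut_f2):
    """Evaluate four NCS criteria on mean formant values in Hz."""
    return {
        "AE1": trap_f1 < 700,           # TRAP raised: F1 below 700 Hz
        "O2": lot_f2 > 1450,            # LOT fronted: threshold as cited in the text
        "ED": dress_f2 - lot_f2 < 375,  # DRESS and LOT within 375 Hz in F2
        "UD": strut_f2 < lot_f2,        # assumed definition: STRUT backer than LOT
    }


# LOT F2 and the DRESS-LOT distances are the values reported in the text.
# DRESS F2 is derived (LOT F2 + distance); TRAP F1 = 720 is a hypothetical
# placeholder; STRUT F2 = 1458 is assumed to hold for both speakers.
speakers = {
    "1912": ncs_criteria(trap_f1=720, lot_f2=1488, dress_f2=1488 + 213, strut_f2=1458),
    "1915": ncs_criteria(trap_f1=720, lot_f2=1460, dress_f2=1460 + 317, strut_f2=1458),
}

for year, result in speakers.items():
    met = [name for name, ok in result.items() if ok]
    print(year, met)
```

Under these assumptions each speaker meets O2, ED, and UD but not AE1, which is consistent with the count of three out of four thresholds given above.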


Figure 12. TRAP and DRESS tokens for St. Louis man, b. 1915

In terms of the St. Louis CORD=CARD merger, the vowel chart for the speaker born in 1912 in Figure 11 is clearly suggestive of the pattern. His START and NORTH means group together near THOUGHT, and his FORCE tokens are separated strongly at a relatively high and back position. Based on the vowel means, the speaker born in 1915 appears to maintain a three-way distinction in the classes. The distribution of the individual tokens he produced, shown in Figure 13, suggests an erosion of the traditional St. Louis pattern. Many NORTH tokens collocate with START tokens, as expected for a speaker with the CORD=CARD merger, but overall his NORTH tokens cluster in the range of FORCE tokens.


Figure 13. NORTH, FORCE, and START tokens for St. Louis man, b. 1915

Indeed, impressionistically, the 1915 speaker’s NORTH tokens sound much more FORCE-like than those of the 1912 speaker. As such, he may represent an early step toward the dissolution of the CORD=CARD merger as a marker of St. Louis speech.

The evidence from these speakers provides time depth to the uniqueness of the St. Louis Midland dialect. As was the case in Grand Rapids and Buffalo, by several ANAE measures NCS characteristics may already be developing at the beginning of the twentieth century. The CORD=CARD merger also seems evident, but with a possibility of emerging recession.

3.5 Pittsburgh (and western Pennsylvania)

“Pittsburghese” is popularly salient as unique among US dialects for a list of lexical, morphosyntactic, and phonological markers. These include MOUTH monophthongization (dahntahn for downtown) and the merger of KIT and FLEECE before /l/ (Stillers for Steelers) (cf. Johnstone et al. 2002). In ANAE, Pittsburgh also generally participates in Midland patterns for fronting back vowels (e.g., Labov et al. 2006: 156, 158, 159), and demonstrates a complete, rather than transitional, merger of LOT and THOUGHT (p. 287). Pittsburgh also demonstrates a locally unique lowering of STRUT, which can occupy a low-central position (p. 288).

Johnstone et al. (2002) analyze the development of MOUTH monophthongization over time by reanalyzing linguistic atlas field notes from interviews with ten Pittsburghers born between 1851 and 1896, and by impressionistically coding documentary footage of five local speakers born between 1900 and 1919. They find no evidence of monophthongization for speakers born before 1900, but extensive monophthongization for speakers born after (pp. 156-157). This suggests that this local feature was emerging near the time of interest in our study.

Evanini’s (2009) study of Erie, Pennsylvania potentially sheds light on the advanced status of the low back merger in the area around Pittsburgh. In exploring the historic North-Midland boundary between Erie and Pittsburgh, Evanini provides an acoustic analysis of one Pittsburgh woman, born in 1929, who shows a Euclidean distance of only 66 Hz between mean values for LOT and THOUGHT, suggesting the merger was already solidly in place in the city in the first half of the twentieth century (p. 137). Complicating this, however, Evanini’s presumably Northern speakers from Erie all also seem to be merged (e.g., a man born in Erie in 1912 shows a LOT/THOUGHT Euclidean difference of 78 Hz) (pp. 147-148). In archive data, he finds a man born in Erie in 1887 with a 250 Hz difference between LOT and THOUGHT, for whom the distinction between vowels seems present but tenuous (p. 152). The advanced state of this merger in a region geographically near to – but presumed to be dialectically different from – Pittsburgh suggests a longer history for the feature in the area. The low back vowel merger had likely taken hold in and spread from Pittsburgh at a time much earlier than its occurrence in other Midland cities.

Like St. Louis, Pittsburgh’s unique Midland dialect seems already to have been emerging at the start of the twentieth century.

4 Conclusion

We have explored one of the most significant dialect boundaries in American English by sampling the speech of people born on either side of the North/Midland divide well over a century ago. Our historical investigation was informed by reports in the dialectological literature of traditional patterns as well as by more recent surveys of the phonological landscape. On both these scores our results offer a mixed bag.
We see, for example, evidence of a distinction between DEW and GOOSE among some Northerners, but, contrary to expectations, we find some Midlanders with the same phonemic contrast. Certain patterns that distinguish the regions today appear in seemingly preliminary form as in the case of the NCS in both the Buffalo and Grand Rapids samples. At the same time, other currently distinctive features, such as the fronting of GOAT in the Midland, are absent among the speakers we analyzed. The general picture that emerges from our admittedly limited investigation suggests that the principal phonological distinctions separating the North and the Midland today arose over the course of the twentieth century even though some seeds of this division were germinating earlier.

Acknowledgments

We benefitted tremendously from the expertise and generous help we received from archivists and their staffs. We are grateful to Jeff Corrigan and Laura Jolley at the State Historical Society of Missouri, Chuck Haddix and Andrew Hansbrough at University of Missouri-Kansas City, Nancy Richard and Max Eckard at Grand Valley State University, and Amy Vilz, Scott Hollander, Kris Miller, and Stacy Person at University of Buffalo.

References

Ayres, Harry M. and W. Cabell Greet 1930. American speech records at Columbia University. American Speech 5(5): 333-358.

Bailey, Guy 1997. When did Southern American English begin? In Edgar W. Schneider (ed.), Englishes around the World: Studies in Honour of Manfred Görlach. Amsterdam: John Benjamins, pp. 255-275.

Boberg, Charles 2001. The phonological status of western New England. American Speech 76(1): 3-29.

Durian, David 2012. A new perspective on vowel variation across the nineteenth and twentieth centuries in Columbus, OH. Columbus, OH: The Ohio State University Ph.D. dissertation.

Evanini, Keelan 2009. The permeability of dialect boundaries: A case study of the region surrounding Erie, Pennsylvania. Philadelphia, PA: University of Pennsylvania Ph.D. dissertation.

Fogel, Deena 2008. Indianapolis, Indiana: A prototype of Midland convergence. University of Pennsylvania Working Papers in Linguistics 14(1): 135-148. URL: http://repository.upenn.edu/pwpl/vol14/iss1/11

Gibson, Campbell 1998. Population of the 100 largest cities and other urban places in the United States: 1790-1990. U.S. Census Bureau, Population Division Working Paper No. 27. URL: http://www.census.gov/population/www/documentation/twps0027/twps0027.html

Gordon, Matthew J. 2001. Small-Town Values and Big-City Vowels: A Study of the Northern Cities Shift in Michigan. vol. 84, Publication of the American Dialect Society. Durham, NC: Duke University Press.

Gordon, Matthew J. 2012. English in the United States. In: Raymond Hickey (ed.), Areal Features of the Anglophone World. Berlin: Mouton de Gruyter, pp. 109-132.

Johnstone, Barbara, Neeta Bhasin and Denise Wittkofski 2002. “Dahntahn” Pittsburgh: Monophthongal /aw/ and representations of localness in southwestern Pennsylvania. American Speech 77(2): 148-166.

Kurath, Hans 1949. A Word Geography of the Eastern United States. Ann Arbor: University of Michigan Press.

Kurath, Hans and Raven I. McDavid, Jr. 1961. The Pronunciation of English in the Atlantic States. Ann Arbor: University of Michigan Press.

Labov, William 1994. Principles of Linguistic Change, Vol. 1: Internal Factors. Malden, MA: Blackwell.

Labov, William 2010. Principles of Linguistic Change, Vol. 3: Cognitive and Cultural Factors. Oxford: Blackwell.

Labov, William, Sharon Ash and Charles Boberg 2006. Atlas of North American English: Phonetics, Phonology, and Sound Change. Berlin: Mouton de Gruyter.

Labov, William, Malcah Yaeger and Richard Steiner 1972. A Quantitative Study of Sound Change in Progress. Philadelphia: US Regional Survey.

Lenzo, Kevin 2010. CMU Pronouncing Dictionary. http://www.speech.cs.cmu.edu/cgi-bin/cmudict (17 September 2013)

Lippi-Green, Rosina 2012. English with an Accent, Second edition. London: Routledge.

Lobanov, Boris M. 1971. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America 49(2B): 606-608.

McCarthy, Corrine 2011. The Northern Cities Shift in real time: Evidence from Chicago. University of Pennsylvania Working Papers in Linguistics 15(2): 101-110.

Montgomery, Michael and Connie Eble 2004. Historical perspectives on the pen/pin merger in Southern American English. In: Anne Curzan and Kim Emmons (eds), Studies in the History of the English Language II: Conversations between Past and Present. Berlin: Mouton de Gruyter, pp. 429-449.

Rosenfelder, Ingrid, Joe Fruehwald, Keelan Evanini and Jiahong Yuan 2011. FAVE (Forced Alignment and Vowel Extraction) Program Suite. http://fave.ling.upenn.edu (17 September 2013)

Sankoff, Gillian and Hélène Blondeau 2007. Language change across the lifespan: /r/ in Montreal French. Language 83: 560-588.

Strelluf, Christopher 2014. “We have such a normal, non-accented voice”: A sociophonetic study of English in Kansas City. Columbia, MO: University of Missouri Ph.D. dissertation.

Thomas, Erik R. 2010. A longitudinal analysis of the durability of the Northern-Midland dialect boundary in Ohio. American Speech 85(4): 375-430.

Thomas, Erik R. 2011. Sociophonetics: An Introduction. London: Palgrave.

Wells, J. C. 1982. The Accents of English, Vol. 1: An Introduction. Cambridge: Cambridge University Press.


11 New England

Daniel Ezra Johnson and David Durian

1 Introduction

The six New England states, although they contain less than 5% of the population of the United States (and comprise less than 2.5% of its area), have played an outsized role in the political, economic, and cultural history of the nation. In the study of American dialects, too, a strong focus has been placed on New England. In part, this has resulted from a perception that it is the home of a great deal of linguistic diversity, considering its size. And the speech of Boston (and Eastern New England more generally) does have some characteristics - for example, the combination of non-rhoticity and the use of the “broad a” - that are fairly unique in the North American context, and recall features of some Southern British English varieties.

The early volumes of Dialect Notes contained many contributions from New England. Then, the pilot endeavor of the Linguistic Atlas of the United States and Canada (LAUSC) project was chosen to be the Linguistic Atlas of New England (LANE) (Kurath et al. 1939-1943). These volumes, modeled on contemporary European dialect atlases, turned out to be the only LAUSC product that would be published in the form of an atlas. LANE is known for the attention paid to social class and age in its sampling procedure (the oldest speakers were born before 1850), and for the use of nine fieldworkers to cover the territory, each trained in on-the-spot phonetic transcription (since recording devices were not available at the time of initial fieldwork). However, the employment of multiple fieldworkers has been criticized in the years since, especially as some of them are seen to have been less skilled than others. Put more generously, the techniques developed for impressionistically recording dialects in the field may have been better suited for the dialects of Europe, where larger phonetic differences tended to exist.
On the other hand, many of the phonetic differences among North American dialects are quite subtle, and some of the fieldworkers unfortunately fell back on conservative transcriptions of changes in progress (Labov 1963, Boberg 2001). Because of this, it is especially fortunate that some of the LANE fieldworkers, under the direction of Miles Hanley, produced a large set of aluminum disc recordings in the early 1930s, mostly by revisiting informants previously interviewed for LANE. These “Hanley Recordings” (Hanley 1936, Waterman 1974) provide the data for the studies of ten LANE speakers conducted in this chapter. Despite appeals to linguists to utilize the valuable Hanley Recordings (Purnell 2012), until now they have mostly remained in their repositories such as the Library of Congress, largely uncatalogued and unused (although see Thomas, 2001 for a notable exception). Meanwhile, without the benefit of these recordings, several studies (e.g.



Bloch 1935, Chase 1935) were produced using the original LANE transcriptions, and the LANE data was later combined with that from the Linguistic Atlas of the Middle and South Atlantic States to produce the two overall masterworks of the LAUSC tradition, A Word Geography of the Eastern United States (Kurath 1949) and The Pronunciation of English in the Atlantic States (PEAS) (Kurath and McDavid 1961). In PEAS, a mass of phonetic detail was presented alongside structural-phonological analyses that compared the vowel systems of the major East Coast dialects. Still, these summaries and later syntheses (e.g. Wetmore 1959) ultimately had to rely on the field records of LANE, which were not necessarily reliable in all cases. Labov (1963) kept the linguistic spotlight on New England with his Martha’s Vineyard study, although the variable centralization of the vowel nuclei in the PRICE and MOUTH sets has rarely been investigated in further work (but see Roberts 2007). Several sociolinguistic studies relating to Boston phonology have appeared (Parslow 1967, Laferriere 1977, 1979). More recently, variationist work has looked at rhoticity (Nagy and Irwin 2010), the evolution of the dialect boundary between Eastern and Western New England (Stanford et al. 2012), and produced useful overviews of Western New England (Boberg 2001) and overall New England phonology (Nagy and Roberts 2004). In general, these studies contrast the enduring influence of early settlement patterns, in keeping with the Doctrine of First Effective Settlement (Zelinsky 1973), with more recent changes that may result from internal factors or from the arrival of immigrants, the migration of speakers or the diffusion of locally-prestigious forms (for example, the influence of Boston and its non-rhotic speech was felt in many parts of New England and even beyond, during the nineteenth and early twentieth centuries). 
Because New England was settled so early, the Hanley Recordings do not reach particularly far back into its past, relatively speaking. Even our oldest speakers were born between 100 and 200 years after settlement (for interior and coastal regions, respectively). So we do not have the opportunity found in the Origins of New Zealand English project (Gordon et al. 2007) to hear the voices of people only a generation or two removed from the original settlements. However, the Hanley Recordings are old enough to predate some major phonological changes that have been identified in New England English. Considering the work of Johnson (1998, 2010) and Durian (2012), there is reason to believe that some of the characteristic vowel patterns of contemporary New England have actually developed comparatively recently. In particular, there are two areas of the vowel system that we will be investigating in this chapter. First, most of New England today is known for having the “nasal system”, where the TRAP/BATH1 lexical sets are tensed, raised, and potentially offgliding before all instances of /n/ and /m/ (regardless of syllable structure or grammatical status), and nowhere else. The main exception to this is

1 Throughout our discussion here, we use the keywords of Wells (1982) for each of the vowel classes we analyze, with the exception of the /uw/ class. There, we use the keywords SHOES and BOOT to represent two distinct subclasses. SHOES is used to represent /uw/ with preceding coronals (except for /r/ and /l/), while BOOT is used for all other preceding consonants.



some Eastern vestiges of “broad a” (BATH words pronounced as [a:]). However, evidence both inside and outside New England suggests that the current pattern has evolved from a more complex earlier situation, which we will illustrate and discuss.

Turning to the low vowels, modern New England is sharply divided. In Eastern New England (ENE, basically meaning Eastern Massachusetts, New Hampshire, and Maine), the lexical set PALM (along with START and any BATH words realized with broad a) is produced fairly far front, contrasting with a low, back, often rounded merged class including LOT and THOUGHT. Most of the rest of New England, like the Northern dialect area more generally (Labov et al. 2006), has merged PALM with LOT instead, with THOUGHT remaining distinct as a far back, variably high and rounded vowel. (This pattern is found in Connecticut, Rhode Island, and two small areas of Southeastern Massachusetts adjacent to the Rhode Island border. It was also found in Western Massachusetts, where there are now signs of merger between LOT/PALM and THOUGHT; merger of the three lexical sets has also generally occurred in Vermont (Boberg 2001).) These two principal patterns have been known for decades, at least since the time of PEAS, but the historical context (based on the LANE transcripts and secondary sources) again suggests that major changes have taken place. In fact, even today some elderly speakers in Southeastern New England retain a three-way contrast between PALM, LOT, and THOUGHT (Johnson 2010). Because of the irreversibility of mergers (at least on the community scale), this unmerged pattern is bound to be the original one, and it is found throughout England and in Southern Hemisphere Englishes, as well as being a known older pattern in New York and some other Eastern cities. Therefore we expect the Hanley Recordings to reveal more of this unmerged low vowel system.
Another feature of New England speech that often draws attention is the realization of the NORTH and FORCE classes. In many present-day dialects of US English, these vowel classes are merged as the FORCE vowel. Interest in the variation involving NORTH and FORCE in nineteenth century New England speech comes from the fact that, during this time period, the vowel classes were still fairly distinct at least among some speakers. Given this interest, we will spend some time discussing these vowel classes, as well.

There are several hundred Hanley Recordings of LANE speakers, but our chapter focuses on just 10 “cultured” speakers, all but one of whom were singled out for analysis (and the creation of an overall vowel “synopsis”) in the PEAS volume. Because of this selection, we hope not only to accurately analyze these vowels with acoustic analysis, and compare the phonetic and phonological patterns we find (to each other and to modern systems), but we also hope to be able to further comment on the accuracy of the work done in the LAUSC tradition.

2 Previous studies

As the primary focus of our discussion will center on short a, the low vowels, and variation involving the NORTH and FORCE vowels, we first provide some background discussion of the patterns of variation noted for each of these vowel classes in previous studies. This includes discussion of variation involving the classes both in New England, and, where relevant, elsewhere in US English.



2.1 Short a

Short a has a somewhat storied history in the New England area. In the earliest studies of New England (Kurath et al. 1939; Kurath and McDavid 1961), short a was found to differ in only one significant way in areas located within the New England area among speakers born during the nineteenth century. In parts of ENE, short a was found to be realized with two allophones: a retracted allophone [a:], which occurs in many of the tokens belonging to the BATH word class, and a non-retracted allophone [æ], which occurs in many of the tokens belonging to the TRAP word class. In Western New England (WNE), only one allophone [æ] was found to occur in short-a words, regardless of their membership in either the TRAP or BATH word classes. No other special properties, such as significant amounts of raising or fronting, were found to typify realizations of short a in the region, leading to the characterization of the short a of WNE as “flat.”

In later studies, however, both WNE and ENE were found to exhibit additional characteristics in the realization of short a, albeit somewhat different ones, depending on the study. Labov et al. (2006), investigating vowel variation in ENE, found speakers born during the twentieth century to show continued use of the BATH-TRAP division of realization, with a nasal system of raising for TRAP. That is, /æ/ is raised only when a nasal consonant follows the vowel. For WNE, they also found nasal raising to typify the systems of most speakers in their data, although they argued that some older speakers also sometimes show a continuous system of raising for TRAP. That is, tokens “occur in a more or less uninterrupted smear from mid-front or high-mid-front position on down to low central position” (Labov et al. 2006: 180).
Boberg (2001), investigating vowel variation among a larger set of speakers born throughout the twentieth century in the ANAE data, found the systems of older WNE speakers analyzed to exhibit characteristics not of the continuous system, but rather of the Northern Cities short-a system instead. Among these speakers, he claimed to find a general raising of TRAP not conditioned specifically by any particular following consonants, hence the difference from the continuous system (although his Figures 6 and 9 contradict his text by showing following nasals as the most raising environment) . Not all of these older speakers show this characteristic so robustly, however, and so he classified the NCS features of the short-a system as being “variable.” Among speakers born after mid-century, he found this general raising to be on the decline, with the youngest speakers appearing to show the nasal system, just as Labov et al. (2006) later found. Laferriere (1977), meanwhile, investigated vowel variation among a larger set of speakers born throughout the early to mid twentieth century in Boston, and there, she found ENE vowel systems exhibiting some characteristics of a continuous system, as well, but only among younger speakers. Older speakers instead showed the use of a split system, with the realization of short-a divided into allophones, as is often seen in the vowel systems of speakers from New York City or Philadelphia. Given that these speakers were born somewhat earlier than many of the ENE ANAE speakers, it is perhaps not surprising that they show these continuous systems, as studies in many areas of the United States have shown the continuous system is often found to occur in areas among older speakers before the nasal



system emerges among younger speakers (e.g. Boberg and Strassel 2000; Dinkin 2009; Durian 2012). Among nineteenth century born speakers, instrumental and impressionistic reanalyses of older data conducted since the late 1990s have found somewhat different results for ENE and WNE than the initial work of Kurath et al (1939) and Kurath and McDavid (1961), as well, calling into question the general accuracy of the LAUSC field workers for short a throughout the New England area. In ENE, Thomas (2001), conducting instrumental analysis of some speakers recorded for the Hanley discs, found some signs of raising before nasal consonants for speakers living in New Hampshire and Massachusetts. Meanwhile, in WNE, Johnson (1998), using impressionistic analysis to investigate speakers born during the 1860s and living in New Haven, CT, found speakers using a split short-a system. That is, speakers appeared to be realizing /æ/ with two allophonic variants: high, tense /æ:/, which occurs in tokens of short-a words where /æ/ occurs before front nasals, front voiceless fricatives, and, variably, before voiced stops, and low, lax /æ/, which occurs in tokens of short-a words before all other consonants. The source for this kind of split system is the historical lengthening of /æ/ before fricatives and front nasals, a process which represents an innovation going back in English to at least the seventeenth Century (Dobson 1957; Lass 1976), and possibly as far back as the fifteenth Century (Ekwall 1946; Wyld 1936). At some point later, raising before voiced stops also began, along with the introduction of additional extra-phonetic constraints, such as the open syllable and the function word constraint (Labov 2007; Ferguson 1972). 
Taken together, the results of Johnson (1998), Boberg (2001), Thomas (2001), Laferriere (1977), and Labov et al (2006) suggest that ENE and WNE systems continue to be similar to one another, and have developed along similar paths since the middle to end of the nineteenth century. The principal difference between the areas continues to be that portions of ENE are still differentiated from the rest of New England by the robust use of the /a:/ vowel for BATH class words. However, the combination of these findings also calls into question the accuracy of the LAUSC field workers. In particular, the findings of Thomas (2001) and Johnson (1998) do so, given the difference between their findings for nineteenth century born speakers and those reported for the speakers discussed in the LAUSC era publications. In addition, the twentieth century findings of Boberg (2001) and Laferriere (1977) further call the accuracy of the LAUSC fieldworkers into question: the degree of development shown by younger speakers in their data suggests that the earlier systems must have looked more like those found by Johnson (1998) and Thomas (2001). Given the different results of the Johnson (1998) and Boberg (2001) studies for WNE, and the Laferriere (1977) and Labov et al (2006) studies for ENE, several questions remain unanswered about the development and occurrence of the types of short-a systems found in WNE, and even to some extent in ENE. First, given the differences between the results of these later studies and the LAUSC era analyses, what might an instrumental analysis of actual LANE speakers reveal about the transcription accuracy of the LAUSC field workers? Is it the case that there in fact was less vowel variation happening in the data, and thus, there was simply less to report for the field workers? Or might they have possibly missed important patterns of variation occurring in their data?



Second, given that Johnson (1998) focused only on New Haven, the question of what type of short-a system or systems were to be found among nineteenth century WNE speakers more generally has remained unanswered. Third, given the difference in system types for speakers in WNE found by Johnson (1998) and Boberg (2001), the question of how the system may have changed from the type found by Johnson to the type found by Boberg also remains unaddressed. Did the system in fact change from something like the split system of Johnson (1998), or was Boberg (2001) perhaps incorrect in his diagnosis of the systems of his informants as being NCS systems rather than a continuous system? Recently, Durian (2012) has conducted a reanalysis of short-a systems as they occurred in nineteenth century English that also raises a fourth unanswered question about short-a systems in New England, particularly in light of the results of Johnson (1998), Boberg (2001), and Laferriere (1977). Durian (2012) finds, through instrumental reanalysis of short-a in twentieth century speaker vowel systems in Columbus, OH, as well as a reanalysis of nineteenth century raw impressionistic field records for Central Ohioans living near Columbus when interviewed for The Linguistic Atlas of the North Central States in 1933, that Columbus had the same kind of split short-a system that Johnson (1998) found in New Haven among nineteenth century-born speakers, and that Boberg and Strassel (2000) found among older twentieth century born speakers in nearby Cincinnati, OH. As a part of his reanalysis, Durian took a cue from Johnson (1998) and began to reexplore older studies of short-a systems as documented during the late nineteenth and early twentieth century. He found that a variety of linguists had documented split short-a systems during this time period, even though much of the research since Labov (1966) has not included reference to these older reports. 
These areas include: Ithaca, New York (Emerson 1891); Maryland; Virginia (“the Valley of Virginia”); Western Tennessee (Grandgent 1892: 271); Newark, New Jersey, eastern Nebraska, and Rhode Island (Trager 1930: 399); and even possibly a good part of the Middle Atlantic States (New York, New Jersey, and Pennsylvania), the Middle West (Ohio, Indiana, Illinois, Wisconsin, Minnesota, Iowa, and Northern Missouri), and “Further West” to the Pacific Coast (Kurath 1928a: 286). As a result, many reports since 1966 have tended to see the occurrence of split systems in US English as being limited to only the East Coast. Yet, as Durian notes, these older studies suggest (a) that split systems historically occurred in a much wider variety of locales in the United States; and (b) that these systems appear to have developed at the same time as split short-a systems were developing and being used on the East Coast in cities such as New York and Philadelphia. These older studies further confirm the findings of a growing body of more recent studies suggesting that split short-a systems can actually be found in older speaker vowel systems in many locales throughout the Eastern and Midwestern United States, as well as New Orleans. These areas include Cincinnati (Boberg and Strassel 2000), New Haven (Johnson 1998), the Hudson Valley area of New York state (Dinkin 2009); additional cities along the East Coast in the area between and surrounding Philadelphia and New York City (Ash 2002), such as Newark, DE and Trenton, Brick, and Bridgeton, NJ (among other cities); and New Orleans (Labov 2007). In these more recent studies, split systems have often been found in speakers born during the twentieth century before World War II. Speakers born since this



time period usually show either continuous or nasal systems, with speakers born since 1970 most often having nasal systems. Given the combination of older and recent findings, Durian (2012) hypothesized that the split system in US English did not develop first in New York City and then diffuse to other areas, as argued by Labov (2007), but instead that a split system was present in the other areas just as early as it was in New York, perhaps even being inherited from Southern British English, as suggested for Philadelphia by Ferguson (1972). Taking Durian’s (2012) hypothesis into account, as well as the results of Johnson (1998) for New Haven, Boberg (2001) for WNE more generally, and Laferriere (1977) for Boston, a fourth unanswered question arises: how would a deeper look at short-a systems in New England add to or change Durian’s (2012) analysis? Each of the questions detailed above will be addressed in section 5 of this paper.

2.2 The low vowels

One of the best-studied phenomena in American English is the “low back merger”, the unconditioned merger of the lexical sets LOT and THOUGHT. Note that in this analysis, THOUGHT will include CLOTH, as the two sets are united in all varieties of American English, including in these recordings. However, even though in non-rhotic pronunciations, NORTH and possibly FORCE vowels might also be identical to THOUGHT, they will not be combined; NORTH and FORCE will be analyzed separately.

The LOT-THOUGHT merger has been present for as long as we know in Western Pennsylvania, and has spread to – or developed internally in – nearby parts of Ohio, West Virginia, and Kentucky (Irons 2007). It made a sudden appearance in the early twentieth century in Northeast Pennsylvania (Herold 1990, 1997). It has largely swept the South and is in progress (see Durian 2012) in much of the Midland. Most of the West is thought to have been merged for some time, but incompletely so in San Francisco, for example (Hall-Lew 2009). Along the eastern edge of the West, the merger has also been reported in progress in states like Missouri (Gordon 2006), Iowa (Olsaker 2013), Wisconsin and Minnesota (Benson et al. 2011). The merger even seems to be incipient in places where a strong distinction recently prevailed, such as Philadelphia (Fisher et al. 2014), New York State (Dinkin 2011), and even New York City (Johnson 2010, Wong 2012, Newlin-Łukowicz 2013).

But the area of LOT-THOUGHT merger in ENE is different from all the above areas because it does not include the PALM lexical set (which shares its vowel with START, and with some BATH words for some speakers). In other words, all the reports of low back merger mentioned above are really reports of the merger of THOUGHT with an already-merged LOT/PALM vowel, although this is rarely made explicit. Indeed, compared to its merger with THOUGHT, the merger of LOT with PALM has been rather thoroughly neglected. For example, the Atlas of North American English interviews did not directly ask about this potential distinction (Labov et al. 2006: 230). Nor, to our knowledge, has there ever been a study specifically devoted to this merger in any American community. This lack of attention is surprising because while today LOT and PALM remain distinct mainly



in ENE, it was not that long ago that they were distinct in many other parts of the country.

Although the text can be vague on the matter, the PEAS synopses show that a LOT-PALM distinction is the majority pattern among cultured speakers in the Atlantic States. The phonetics of the distinction varies geographically: in Georgia and South Carolina, PALM is longer and more diphthongal (offgliding) than LOT; in North Carolina, Virginia and Maryland, the same is true, but PALM is also further back (and sometimes lower) than LOT. PALM is also longer and further back than LOT in the New York City area. By contrast, the PEAS synopses show PALM as longer and further front than LOT in most of New England. (Kurath and McDavid 1961: 31-100).

The areas where PEAS shows LOT and PALM as merged are either away from the coast, rhotic, or both: Asheville NC, Lexington VA, all of West Virginia and Pennsylvania, upstate New York (beyond the Hudson Valley), and Litchfield CT, Springfield MA and Burlington VT in Western New England. However, in the same region, Deerfield, Northampton and Pittsfield MA are shown with a distinction (PALM is further front than LOT, like in ENE). Middletown and New Haven CT, while they are framed as merged, also appear potentially distinct from the phonetic records.

Clearly the LOT-PALM merger has gained much ground since the time of the PEAS speakers. It is tempting to connect the merger to the return of rhoticity to many areas, such as the South. The most non-rhotic areas remaining in the United States are ENE, New York City, and New Orleans, and the LOT-PALM distinction is still found in all three. If this connection to rhoticity is valid, it must relate to the large number of START words that have the same vowel as PALM. The rarer PALM set itself has little to do with rhoticity; rhotic speakers could distinguish father and bother, but on the whole, they seem not to. However, a non-rhotic speaker’s vowel quality and/or length difference between cart and cot could be reinterpreted by a rhotic speaker as an allophonic effect of /r/, endangering the LOT-PALM distinction. (A wrinkle in this account is that Rhode Island has merged LOT and PALM while remaining mostly non-rhotic; see Johnson 2010.)

The situation between LOT and THOUGHT is different in that both rhotic and non-rhotic speakers can either maintain a distinction between the vowels or merge them. The PEAS synopses indicate a LOT-THOUGHT distinction everywhere except Western Pennsylvania and parts of New England. The merger is indicated in New London CT, Newport and Providence RI, but this is known to have been a fieldworker error (Moulton 1968; McDavid 1981; Johnson 2010). More reliably, the speakers from Billerica MA, Concord NH, Portland and Nobleboro ME show a clear LOT-THOUGHT merger, while the records leave some doubt about Deerfield and Plymouth MA. And despite its core Eastern location, Boston MA is shown to have the distinction.

Considering the irreversibility of mergers by linguistic means (Garde 1961) and the fact that England - the country from which most early American (and certainly most New England) settlement came - has very little sign of either the LOT-PALM or LOT-THOUGHT mergers either in modern varieties or traditional dialects, we can assume that the earliest New England speech had a three-way contrast between PALM, LOT, and THOUGHT (at least once the



PALM/START/broad-BATH category became clearly distinct from TRAP, which happened by 1700; Dobson 1957: 790).

And this three-way distinction survived in some places into the nineteenth century, as indicated explicitly by the PEAS editors for Boston, Northampton and Pittsfield, and suggested by the records for Deerfield, Plymouth, Middletown and New Haven – seven of the 17 New England speakers, born between 1847 and 1889. We have a description from a Boston/Cambridge speaker born in 1862 of the three-way pattern, roughly as [a] vs. [ɑ] vs. [ɔ] (Grandgent 1890), and a more recent self-report from a Providence speaker born in 1914: [ɑ:] vs. [ɑ] vs. [ɔ] (Moulton 1968). Note that the PALM-LOT distinction can be maintained as a difference in vowel quality, as in Boston, or one of length, as in Providence.

Johnson (2010) found six speakers along the Massachusetts-Rhode Island border, born between 1912 and 1924, who retained the three-way contrast. This represented 10% of the senior citizens interviewed. The rest of the senior citizens native to the area, along with all younger adults, exhibited either a PALM-LOT merger (in RI and two adjacent parts of MA) or a LOT-THOUGHT merger (elsewhere). There was a very sharp boundary between the two areas, which corresponded roughly to earlier settlement patterns. Anecdotal evidence suggests that most New Englanders today have merged either PALM-LOT (RI, CT, older Western MA) or LOT-THOUGHT (Eastern MA, NH, ME), if indeed they have not merged all three categories (Vermont, younger Western MA, and some younger speakers elsewhere). This and other evidence prompted the following suggestions:

The first dialects to coalesce in Massachusetts Bay and Plymouth had a more conservative back rounded LOT, not far phonetically from the new monophthongal THOUGHT. In Rhode Island… a dialect formed with a more innovative LOT, unrounded and more central, which became the short counterpart of PALM… In each area a different merger eventually took place… In the east, LOT merged with THOUGHT… In the west, PALM merged with LOT… The communities in each area were affected by one of these two mergers for internal (structural) reasons, not because of diffusion.

(Johnson 2010: 39-40)

The current acoustic study expands the geographic coverage to all of Southern New England. We first want to establish the inventory and realization of PALM, LOT, and START for our ten speakers. Do we find the two flavors of three-way distinction mentioned above? If so, where? Do speakers further west show PALM further back than LOT, as was common further south (e.g. in New York City)?

If any of the speakers have a merger (LOT-THOUGHT or PALM-LOT), can their ages tell us anything about the timing of that merger in their area? And comparing unmerged speakers, can we see evidence for phonetic approximation of the categories that would later be merged? Or does it seem more likely that the LOT-THOUGHT and LOT-PALM mergers occurred suddenly (merger-by-expansion)?

2.3 NORTH and FORCE



The distinction between the NORTH and FORCE word classes is the least common, worldwide, of the three oppositions considered here. While it has been lost in non-rhotic RP, and maintained in conservative rhotic Scottish and Irish Englishes (Hickey 2004: 73), in the United States it seems to be non-rhotic dialects that best preserve it today. In PEAS, the distinction was shown to occur everywhere in the Atlantic States except in Maryland, Pennsylvania (and adjacent parts of Ohio and West Virginia), New Jersey, the New York City area, Long Island, and the Hudson Valley. The merger was also found sporadically in WNE, but rarely in ENE (Kurath and McDavid 1961: Maps 43-44).

In studies of speakers born more recently, the merger of the two classes has been shown to be significantly on the increase, with the distinction rapidly disappearing in present-day English. In their survey of 439 speakers, Labov et al (2006) found much of the US to now show NORTH-FORCE merger or quite close near-merger, both in production and perception. According to their results in Map 8.2, the only sections of the country still showing a distinction are some areas in the South, a few isolated areas in southern Illinois and Indiana, and the northeastern portion of New England (pp. 50-52).

Given the persistence of the NORTH-FORCE distinction in portions of New England today, and the historical occurrence of the vowel classes as distinct in larger portions of New England generally in the past, we will ask, first, how the patterns diagnosed by the LAUSC fieldworkers for our ten speakers correspond to what we find in our instrumental analysis of their vowel systems. Second, what might these patterns of variation tell us about later states of this merger as it likely unfolded since the time of the LAUSC fieldwork?

3 Materials and methods

As mentioned in section 1, data for the analysis in this chapter are drawn from the Hanley Recordings, a collection of several hundred recordings of speakers made under the direction of Miles Hanley in 1933-1937. Ten speakers have been selected for this analysis because they appear both in Hanley’s recordings and in the analyses of New England vowel systems presented in both LANE (Kurath et al 1939) and PEAS (Kurath and McDavid 1961). The version of the recordings used was digitized as .wav files by archivists at the Library of Congress under the direction of Marcia Segal in 2003-2004 (American Folklife Center 2009). Permission for their use in research, provided there was no “publication” of the recordings (e.g. dissemination of the recordings or quotation of connected speech), was obtained from Ann Hoog in 2013. The quality and length of the ten recordings vary greatly. Several speakers had a number of discs recorded during their interview with Hanley, and thus, they contributed as much as forty to fifty minutes of audio for analysis. Other speakers had only a few discs recorded, and thus, only contributed around twenty minutes of audio. In addition, these direct-to-phonograph recordings all feature a significant amount of background and playing surface noise. Some recordings may also have been affected by Hanley’s apparent experimentation with different recording equipment during his field work. 
For instance, portions of some recordings feature swirling noise, typical of carbon microphones (“carbon hiss”), and turntable



speed fluctuation, likely at the time of recording, since sometimes Hanley would use his car battery to provide a portable power supply when AC current was not easily available on location (Hanley 1936). In addition, by the time the digital transfers were made in the early 2000s, it was clear some discs had been played much more heavily than others, with some now beginning to show notable degradation of the speech signal due in part to physical media playback wear issues such as groove wear and groove distortion. In most cases, however, the recordings we used for this project were of sufficient quality to use for instrumental acoustic analysis. Portions of the recordings that were not up to this quality standard were not used for the analysis. Thus, while measuring tokens, we aspired to measure vowel formants only from audio that was deemed clean and clear enough to obtain well-formed F1 and F2 tracks throughout the course of the vowel. In situations where measurements were compromised by sound quality, we found that F1 was often more affected than F2, and that F1 was most often compromised with SHOES, BOOT, and FLEECE, as well as some cases of THOUGHT and LOT. These findings resonate with previous research discussing compromised audio signals (e.g. Plichta 2004, Hansen and Pharao 2006, and Rathcke and Stuart-Smith 2014). Acoustic analysis of the digitized .wav files was conducted in Praat (Boersma and Weenink 2014), with the authors using a variable window of 8-14 LPC coefficients depending on the quality of the token. The data were orthographically transcribed and the TextGrids were aligned automatically using FAVE-align (Rosenfelder et al. 2011). After alignment, the TextGrids created by FAVE were double-checked and any misaligned boundaries were hand-corrected by the authors. After corrections were made, measurements were taken from 13 duration points throughout each vowel. 
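The proportional sampling just described, together with the normalization step described in the sentences that follow, can be sketched in Python roughly as below. This is only an illustrative sketch under stated assumptions, not the scripts used for the chapter: the function names, the input layout (a list of vowel class and Hz pairs for a single speaker and a single formant), and the choice of the sample standard deviation are all hypothetical.

```python
from statistics import mean, stdev

# Percentages of vowel duration at which formants were measured (from the text).
MEASURED_POINTS = [1, 10, 20, 25, 30, 40, 50, 60, 70, 75, 80, 90, 99]
# Subset retained for statistics and plotting (from the text).
ANALYSIS_POINTS = [25, 30, 40, 50, 60, 75]


def sample_times(onset, offset, percents=MEASURED_POINTS):
    """Return the time (in seconds) of each proportional point in a vowel."""
    duration = offset - onset
    return [onset + duration * p / 100 for p in percents]


def lobanov_params(tokens):
    """Grand mean and SD computed over word-class means, not raw tokens.

    `tokens` is a hypothetical list of (vowel_class, hz) pairs for ONE
    speaker and ONE formant. Averaging within each class first keeps
    heavily sampled classes from dominating the normalization.
    Assumption: sample standard deviation (n - 1); the text does not say.
    """
    by_class = {}
    for vowel_class, hz in tokens:
        by_class.setdefault(vowel_class, []).append(hz)
    class_means = [mean(values) for values in by_class.values()]
    return mean(class_means), stdev(class_means)


def lobanov(hz, grand_mean, sd):
    """Lobanov z-score: distance from the speaker's mean in SD units."""
    return (hz - grand_mean) / sd
```

In use, `lobanov_params` would be called once per speaker and per formant, and `lobanov` then applied to every measurement from that speaker, including tokens from the more heavily sampled classes.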
The points for measurement were as follows: 1%, 10%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 90%, and 99%. For the purposes of statistical analysis and vowel plotting, only the following points among these 13 are used: 25%, 30%, 40%, 50%, 60%, and 75%. These points were chosen because they have been shown by Jacewicz et al. (2011) to provide the most useful points across the vowel’s duration for intensive data analysis. Data across these points were normalized using the Lobanov z-score method (Lobanov 1971). For normalization purposes, measurements from 10 vowel tokens for each of the following vowel classes were obtained: FLEECE, KIT, FACE, DRESS, TRAP, MOUTH, LOT, THOUGHT, STRUT, GOAT, FORCE, FOOT, BOOT, and SHOES. Additional tokens were obtained from the vowel classes for which we provide extensive analysis in sections 5, 6, and 7: TRAP, BATH, PALM, START, LOT, THOUGHT, NORTH, and FORCE. Here, we obtained measurements from as many tokens per vowel class as possible from the audio files. To avoid skewing the normalization, the mean values of each word class were calculated first and these were the source of the grand mean and standard deviation used for the Lobanov normalization.

4 General patterns of variation in the individual speaker vowel systems

Before moving on to our in-depth analysis of short a, the low vowels, and NORTH and FORCE, we first wish to describe some of the general patterns of variation we



found in the vowel systems of our individual speakers, to provide some context for that more detailed discussion. As discussed in the introduction, only 10 of the “cultured” speakers who appear in the original LANE field records and on the Hanley discs are presented here. Geographically speaking, five of our speakers are from ENE; the other five are from WNE. In WNE, these locations include: Litchfield, Middletown, and New Haven, CT, as well as Pittsfield and Northampton, MA. In ENE, these locations include: New London, CT; Newport and Providence, RI; and Plymouth and Beverly, MA. These 10 speakers were ultimately chosen for detailed analysis of the vowel variation in their major vowel classes because they allowed detailed comparative analysis between their original impressionistic field records made during the LANE field work and the instrumental analysis of the same classes which we were able to conduct within the recorded Hanley data. This is so because impressionistic data for nine of the 10 can actually be located both within the synopses made for PEAS as well as the original maps appearing in the LANE volumes. Regarding additional social factors, such as age and sex of the speaker, the following social characteristics about the speakers should be noted: although five of the speakers are women and five are men, their geographic and generational distribution is fairly diverse. Among the WNE speakers, three of the five are men, born in 1847, 1856, and 1871. Meanwhile, the two women were both born in the 1880s. Because of the age difference between the men, we argue the younger man and two women belong to a younger generational group than the two older men. For the ENE speakers, two are men and three are women. Among the women, two were born in the 1850s and 1860s, while the third was born in the 1880s. Meanwhile, one of the two men was born in 1859 and the other was born in 1889. 
Thus, we class two of the women and one of the men in the older generation, while the others belong to the younger generation. These characteristics are summarized in Table 1. In addition, the names, dates of birth, and LANE and PEAS code numbers for these ten speakers can be seen both in Table 1 and on Map 1.



Map 1. Locations of speakers with names, codes from LANE and PEAS, and dates of birth.

Region                 Name       Sex  Born  Gen  Location         PEAS/LANE
Western New England    Partridge  M    1847  1    Pittsfield, MA   37/242.2
Western New England    Hubbard    M    1856  1    Litchfield, CT   35/16.2
Western New England    Russell    M    1871  2    Middletown, CT   30/38.3
Western New England    Clark      F    1880  2    Northampton, MA  27/226.2
Western New England    Schofield  F    1886  2    New Haven, CT    32/26.2
Eastern New England    Miner      M    1859  1    New London, CT   19/32.2
Eastern New England    Brewster   M    1889  2    Plymouth, MA     13/112.2
Eastern New England    Baker      F    1857  1    Beverly, MA      9/182.2
Eastern New England    Weeden     F    1868  1    Providence, RI   21/80.3
Eastern New England    Covell     F    1880  2    Newport, RI      17/60.2

Table 1. Regional and social characteristics of the 10 speakers

In the following analysis, individual vowel classes will be discussed, with comparisons and contrasts drawn between our instrumental analysis of the speakers based on the Hanley discs data and the original impressionistic analyses of the LANE field workers. Specifically, we compare the impressionistic data as presented in the synopses of Chapter 2 of PEAS, the summary of regional patterns provided in PEAS throughout Chapter 2, and, when relevant, additional impressionistic data observations discussed in Chapters 3 and 5 of PEAS, as well as Chapter 1 of LANE. When relevant, some discussion of other studies of vowel system variation for



speakers born around the same general time period as ours here in ENE and WNE, such as Thomas (2001) and Boberg (2001), will be included. For the sake of cross-referencing clarity, Table 1 includes the PEAS vowel synopsis numbers for each speaker. Note that the synopsis for Baker (PEAS 9) does not actually appear in PEAS even though she was assigned a number. For this reason, we created our own synopsis by hand using her raw data as plotted in LANE and PEAS, to ensure that an accurate comparison of her patterns to our own could be made. In the following, we first focus on the vowel classes showing less variation. We then move on to discussing the classes exhibiting more variation, and also those which, historically, have tended to be focused on more heavily in previous studies given the complex nature of the patterns of vowel variation found among both WNE and ENE speakers. Thus, we begin our discussion with FLEECE, KIT, FACE, DRESS, MOUTH, STRUT, GOAT, FOOT, BOOT, and SHOES, and then we will move on to discussion of TRAP, BATH, PALM, START, LOT, THOUGHT, NORTH, and FORCE. Note that we have not included vowel variation involving PRICE or CHOICE because we did not use these vowel classes for normalization. We do wish to note in passing, however, that those tokens of PRICE and CHOICE we did observe while completing our measuring of vowel formants generally conform to the patterns discussed previously for these vowels in Thomas (2001) and Kurath and McDavid (1961). One other point we wish to discuss at the outset is speaker rhoticity, a subject researchers have always been concerned with when dealing with speakers from New England. For the purposes of our analysis here, rhoticity was measured across all instances of START, FORCE, and NORTH, not counting any instances of “linking r.” The results of this analysis are shown in Table 2. 
As these results demonstrate, rhoticity shows a clear east-to-west geographic pattern, with speakers who lived furthest to the east showing the least rhoticity and those living furthest west showing the most rhoticity. A significant jump in percentage of rhoticity occurs between speakers living between Providence, RI and the Connecticut River, and those living west of the Connecticut River, in either Massachusetts (as shown by Clark and Partridge) or in Connecticut (as shown by Schofield and Hubbard).
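The per-speaker percentages in Table 2 amount to a simple proportion over eligible tokens. A minimal sketch of that arithmetic follows, assuming a hypothetical token coding in which each token records its vowel class, whether /r/ was realized, and whether it is a case of linking r; neither the data layout nor the function name comes from the chapter.

```python
# Vowel classes over which rhoticity was counted (from the text).
R_CLASSES = {"START", "FORCE", "NORTH"}


def percent_rhotic(tokens):
    """Percentage of eligible tokens realized with constricted /r/.

    `tokens` is a hypothetical list of dicts with the keys
    'vowel_class' (str), 'r_pronounced' (bool) and 'linking_r' (bool).
    Tokens outside START/FORCE/NORTH and linking-r tokens are skipped.
    Returns None when a speaker has no eligible tokens.
    """
    eligible = [
        t for t in tokens
        if t["vowel_class"] in R_CLASSES and not t["linking_r"]
    ]
    if not eligible:
        return None
    rhotic = sum(1 for t in eligible if t["r_pronounced"])
    return 100 * rhotic / len(eligible)
```

Excluding linking-r tokens before taking the proportion matters because a prevocalic /r/ in such contexts can surface even for otherwise non-rhotic speakers, which would inflate the percentage.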

Furthest East: Covell 4%, Brewster 5%, Baker 6%

Between Providence and the Connecticut River: Miner 10%, Weeden 13%, Russell 14%

West of the Connecticut River in Massachusetts: Clark 26%, Partridge 29%

West of the Connecticut River in Connecticut: Schofield 51%, Hubbard 99%

Table 2. Percent of rhoticity by speaker and by speaker’s geographic location

Moving on to the analysis of general trends in the vowel systems of the speakers, we begin by looking at the long front vowels FLEECE and FACE. For most of our speakers, FLEECE and FACE are clearly diphthongal. FACE, in particular, tends to be fairly wide among our younger speakers, while for older speakers it tends to be


242

shorter and more monophthongal. This corresponds closely to the results of previous studies on FACE in the area. FLEECE shows more variation among speakers, with our two older ENE women showing a markedly shorter FLEECE than other speakers. Our men, on the other hand, show more consistency in their realization of a longer FLEECE, regardless of age. Given the patterns among the women, this suggests our data document the end of a change in progress for FLEECE, from a shorter to a longer realization, which may have been nearing a cycle of completion in the latter half of the nineteenth century in ENE. This trend differs from PEAS, where a lengthening in ENE is not reported. Turning next to the short front vowels KIT and DRESS, all show little evidence of undergoing significant variation, other than a mild tendency toward lengthening of DRESS on occasion in environments which tend to promote lengthening, such as before /d/. The same is also generally true for the short back vowels STRUT and FOOT. This finding overlaps with the analysis of the four vowels in PEAS, and also with other speakers from the area recorded by Hanley and analyzed by Thomas (2001). Moving on next to the long back vowel diphthongs MOUTH, GOAT, BOOT, and SHOES, our data here also generally reveal similar patterns of vowel variation to those reported in earlier studies, although we also note some patterns in our data, which differ from earlier reports. For MOUTH, we tend to find our speakers look much like the description presented in PEAS. For our WNE speakers, we tend to find use of a fairly fronted form, with a strong upglide for the glide. For our ENE speakers, we find older speakers using a form somewhat further back, while the younger speakers typically use the form further front. 
A trend not noted previously, that we find in our data, is that women tend to front more often than men of comparable age, with the women in ENE showing the most fronted realizations of the nucleus of MOUTH in our data overall. In PEAS, Kurath and McDavid noted that GOAT was beginning to show a transition from being a monophthong to being a diphthong throughout the New England region, with WNE speakers leading over ENE speakers in the use of the diphthongal form. This form was noted to be both fronter and longer than the older form. They also noted a related older form, often called “short o” in the literature, was becoming archaic and was to be found still in more “homely” words in the area (Avis 1961). Our data generally concur with their appraisal, with our older speakers in both WNE and ENE generally realizing GOAT with a realization further back and more monophthongal, while younger speakers make use of a form further front and more diphthongal. Regarding “short o,” only Russell – a WNE male born 1871 – uses this form to any regular extent in our data, although others do periodically show use of it in words like “home” and “road.” In addition, our analysis notes that the nucleus of GOAT tends to be lower among the speakers in our dataset than it often is in the vowel systems of speakers born throughout much of the twentieth century throughout the United States (see for example, Labov et al. 2006; Thomas 2001; and/or Clopper et al. 2006), a finding which resonates with the data presented in PEAS. This trend is not discussed explicitly by Kurath and McDavid in their analysis, however; instead, it is simply indicated in several of the vowel system synopses made for PEAS (as well as the one we made for Baker).



One key difference for GOAT we note in our data versus the PEAS data is a stronger tendency in our data towards realizations that are front gliding rather than back gliding. This trend is found among both ENE and WNE speakers in both age groups, although more back gliding forms are found in the individual tokens of younger speakers. This suggests a pattern of change in progress in our data. Moving on to BOOT and SHOES, we find similar fronting trends to MOUTH and GOAT for some, but not all of the speakers. For SHOES, we find some fronting of the nucleus among speakers living in ENE. By far, the stronger fronting trend is found among the WNE speakers, where all of the speakers, save one male, show a SHOES that is close to the center of the speaker’s vowel system. This trend is true for both men and women, regardless of age. In ENE, however, we find fronting among only the three younger speakers, with the older two speakers showing a SHOES well towards the back of their vowel systems. In addition, we find the nucleus of SHOES to be close to the nucleus of FOOT for some older speakers, a trend which is less pronounced among the younger speakers. For BOOT, we also find tendencies towards nuclear realizations close to FOOT among speakers for some tokens, a pattern that is slightly stronger among older speakers. Regarding frontness, we find, in contrast to SHOES, that BOOT tends to be backer than SHOES for most speakers. Among the oldest speakers in both ENE and WNE, BOOT is close to the back of the speakers’ vowel systems, typically parallel to GOAT. The one exception to this trend is the older WNE female, whose system shows a fronter BOOT than other speakers of comparable age. Among the younger speakers, we find the beginning of frontward movement in both ENE and WNE, as the younger speakers typically have a fronter BOOT on average than older speakers. 
Rather interesting in this regard is the system of our younger man from ENE, who in fact shows more fronting for BOOT than his WNE counterparts of similar age, despite showing a backer SHOES realization, on average. This trend is surprising given previous reports, such as in PEAS and Thomas (2001), where /uw/ in ENE has usually been found to be backer than /uw/ in WNE. It may be possible that previous reports have missed this trend since earlier studies have not treated SHOES and BOOT as separate subclasses as we do here. We should note, however, that this trend may also simply result from a difference in this one individual’s system rather than representing true community-wide patterns of variation. More research specifically on patterns of movement targeting SHOES and BOOT as subclasses of /uw/ in older ENE data is needed before anything more conclusive can be stated about this possible trend toward change.

Turning to the low front vowel classes TRAP and BATH, the low back vowel classes PALM, START, LOT and THOUGHT, and the mid back vowel classes NORTH and FORCE, we find, like previous studies, a complex series of variation behaviors typifying the realization of these classes among our speakers. These trends are especially true in ENE, although as our discussion will also show, the patterns in WNE can at times be rather complex as well. Given the complexity of the patterns for each vowel class, we now turn to addressing them in more detailed dedicated sections. In particular, we will deal with the variation involving these vowel classes by investigating them in the following combination sets: TRAP and BATH, to be discussed in section 5; PALM/START, LOT and THOUGHT, to be discussed in section 6; and NORTH and FORCE, to be discussed in section 7.



5 TRAP and BATH

Turning now to the low vowels, we first focus on the low front vowel classes TRAP and BATH. In our data, we find different results from many of the studies reported in section 2. For one, we find split systems showing raising behavior for TRAP among many WNE informants, making their systems like those of the nineteenth century born speakers of Johnson (1998) and Tuttle (1903) in New Haven. In addition, in ENE, we find somewhat more diversity in the speaker vowel systems we have observed than has usually been reported, again with more raising before nasals for TRAP than reported in earlier studies, as well as more diversity in BATH realization than most other studies, with the exception of Laferriere’s (1977) study of speakers born somewhat later than our speakers. Finally, we also find more use of broad a for some WNE speakers than has previously been reported. We believe all of these differences have emerged in our analysis because of the more intensive methods of instrumental investigation we have applied to our data, which have allowed a more thorough investigation of the data than most previous work.

To conduct our analysis of short-a system variation in the data, we plotted all tokens of TRAP and BATH and looked at the clustering relationships for the classes. This allowed us to see which speakers more heavily used the broad a realization for BATH class words, as well as to determine how the diffusion patterns for each subclass are spread in the vowel spaces of our speakers. In addition, we coded the data for the various stages of split short-a system raising, to help us determine if a split pattern of raising could be found in our data. We did so given the results of Johnson (1998), which suggested that we might in fact find split-like systems among our speakers in at least some parts of WNE, if not also more widely throughout the area.
This involved coding the data for TRAP using a slightly adapted version of the stages for a split short-a system developed by Durian (2012), which is based on earlier observations made by Babbitt (1892) and Ferguson (1972). The stages are as follows (in all cases, a closed syllable is implied):

Stage 1: Raising before nasal codas and front voiceless fricatives
Stage 2: Raising before simple nasals, voiced stops, and /ʃ/
Stage 3: Raising before voiced fricatives

In the data, Stage 1 was coded as TRAP 1, Stage 2 was coded as TRAP 2, and all other instances were coded TRAP 0. Once we did this, we determined that the split system might be operational in our data, specifically stages 1 and 2, and so also used the criteria for determining split system occurrence provided by Durian (2012), which are based on the descriptive approach used in Labov et al. (2006) and Labov (2007). Following this method, we drew a dividing line in each speaker’s system that marks off more robustly raised tokens of TRAP from less robustly raised tokens. This line takes into consideration how close the raised tokens are to DRESS, with those close to, and above, DRESS considered more robustly raised, and those lower than this point not so. To hold the line consistently across speaker vowel systems, this meant in practice that we used .08 z units for F2 and .03 z units for F1, corresponding roughly to 1750 normalized Hz for F2 and 650 normalized Hz for F1, employing the scaling method of Labov et al. (2006). Using these measurement points, those tokens above and to the left of (.03,.08) were considered more robustly raised, while those below and to the right of this point were considered less robustly raised. Examples of the kinds of vowel plots we used for this analysis following these protocols are included here as Figures 1 and 2.
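The dividing-point criterion described above can be illustrated with a short Python sketch. It is not the authors' procedure, only one way the rule could be applied. It assumes a conventional vowel plot with both axes reversed, so that "above" means a lower F1 and "to the left" means a higher F2. The field names, the helper functions and the four example tokens are invented for illustration.

```python
from dataclasses import dataclass

# Thresholds quoted in the text, in normalized z units.
F1_LINE = 0.03
F2_LINE = 0.08


@dataclass
class Token:
    word: str        # hypothetical field names, chosen for this sketch
    trap_code: int   # 0, 1 or 2, following the TRAP 0 / TRAP 1 / TRAP 2 coding
    f1: float        # normalized F1 (z units)
    f2: float        # normalized F2 (z units)


def is_robustly_raised(token: Token) -> bool:
    """Return True if the token lies above and to the left of the dividing point.

    Assumption: "above" is read as a lower F1 and "to the left" as a higher F2,
    as on a vowel plot with reversed axes.
    """
    return token.f1 < F1_LINE and token.f2 > F2_LINE


def share_raised(tokens: list[Token], trap_code: int) -> float:
    """Proportion of tokens with the given TRAP code that count as robustly raised."""
    subset = [t for t in tokens if t.trap_code == trap_code]
    if not subset:
        return 0.0
    return sum(is_robustly_raised(t) for t in subset) / len(subset)


# Invented tokens for illustration only; these are not measurements from the study.
tokens = [
    Token("hand", 1, f1=-0.40, f2=0.90),
    Token("camp", 1, f1=0.50, f2=0.70),
    Token("man", 2, f1=-0.10, f2=0.60),
    Token("cat", 0, f1=0.80, f2=0.30),
]

for code in (0, 1, 2):
    print(f"TRAP {code}: {share_raised(tokens, code):.0%} robustly raised")
```

Comparing the share of robustly raised tokens across the TRAP 0, TRAP 1 and TRAP 2 codes is one simple way to see whether raising clusters in the environments that typify a split system.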

Figure 1. Short-a system of Schofield, an incipient split system. TRAP 1: before front nasal clusters and front voiceless fricatives. TRAP 2: before simple front nasals, voiced stops, and /ʃ/. TRAP 0: other TRAP words

Figure 2. Short-a system of Brewster, a more robust broad-a system. TRAP 1: before front nasal clusters and front voiceless fricatives. TRAP 2: before simple front nasals, voiced stops, and /ʃ/. TRAP 0: other TRAP words. BATH: words produced with broad a

In our data, two of the five ENE speakers have systems that look like they could turn directly into the nasal system found among speakers of following generational groups. This is likely because of their strong use of the BATH subclass, which typically causes the TRAP subclass to be lower and more retracted as well (Thomas 2001, Labov et al. 2006). These two speakers, Baker and Brewster, hail from the easternmost part of our study area. The other eight speakers have systems that variably suggest an eventual use of a short-a system resembling either the NYC split short-a system, a nasal system, or else possibly the continuous system of ANAE among speakers of following generational groups. According to the findings of later studies, such as Johnson (1998), Boberg (2001), Laferriere (1977), and Labov et al. (2006), all of these are in fact the exact types of systems that develop and follow on from these systems in the areas we surveyed in the vowel systems of twentieth century born speakers. Among these eight speakers, two of the five WNE speakers – Schofield and Hubbard – appear to have a system looking most like an incipient form of the split short-a system, with not just tensing, but also some raising, of TRAP being variably conditioned by environments typically found in previous studies to condition split tensing and raising – namely front nasals and front voiceless
fricatives. The other six speakers have somewhat more fluid systems, showing less robust evidence of this conditioning. The principal historical difference, then, between the areas we have included in our study here is that the easternmost area shows a system that appears to have been likely to switch more quickly to the use simply of a nasal system than the western areas, especially the westernmost area. We argue this is the case because this use of the [a] realization for BATH class words appears to have set up a generally backer and more retracted TRAP, although not for the nasal classes, as shown by the speaker systems of our two BATH system users here. This is further shown by the general tendency of these speakers to have laxer TRAP 2 tokens, on average, than the other speakers, particularly our westernmost speakers. This is also true of their TRAP 1 tokens, although much less so. As shown by Brewster’s system in Figure 2, his TRAP 1 tokens tend to be tensed (fronter) fairly often, but this tensing does not contribute to raising in any significant way. In contrast, his TRAP 2 tokens are more lax (further back). Meanwhile, the only tokens showing raising are prenasal, with the exception of one token each of ‘has’, ‘that’, and ‘catch’. Thus, the co-occurrence of a BATH subclass with a TRAP subclass seems to have had a strong impact on lowering of most tokens in the TRAP subclass for speakers living in the far eastern area of ENE, as well as on laxing a good number of these tokens.

For the TRAP-raising areas, we believe the split-like system we observe here occurs because TRAP was less constrained with regard to tensing as well as raising. Therefore, perhaps not surprisingly, variable raising as well as tensing in the area has been noted, not only in later studies (Johnson 1998; Boberg 2001; Labov et al. 2006), but also in earlier discussions (Tuttle 1903).
Here, the tensing and variable raising is found to differing degrees in the vowel systems of the eight speakers. Among all eight speakers, both TRAP 1 and TRAP 2 tokens typically show tensing more often than not, while raising is more variable. The most raising, and the most environments found to condition the variable raising, occur among the two most western speakers we mentioned above – Schofield and Hubbard. The other speakers show lesser amounts of raising and more variability with regard to the segments conditioning their raising of short a. For Schofield and Hubbard, short a raising is conditioned by nasal codas, simple nasals, voiceless fricatives, and the voiced stop /d/. Examples of tokens showing this variable raising can be seen in Schofield’s TRAP 1 and TRAP 2 tokens in Figure 1. (Her plot also shows the more pronounced tensing for these token sets.) For the other speakers (Partridge, Russell, Clark, Miner, Weeden, and Covell), raising appears to be consistently conditioned only by nasal codas and simple nasals, although most show some raising before at least some of the other consonants that also condition raising for Schofield and Hubbard – /d/ and the voiceless fricatives. The difference between the speakers here is thus not which segments condition raising, but rather how much raising they induce among speakers and how many segments induce it.

Occasionally, all eight speakers who raise TRAP also raise [æ] before voiceless stops, such as /t/ and /k/. This is a finding that has been reported previously for speakers in New York State and WNE (Labov et al. 2006; Dinkin 2009), although not among nineteenth century-born speakers. It has also been reported previously for ENE speakers living in Boston by Laferriere (1977), although again, not for nineteenth century-born
speakers. Another commonality among our speakers is what appears to be the operation of two constraints that appear to condition raising as shown by the speakers. These are the open syllable constraint for nasal words, which makes words such as ‘hammer’ and ‘planet’ lax and the function word constraint, which also makes simple nasal words like ‘and’, ‘ran’, and ‘can’ lax. Both of these findings confirm the results of Johnson’s (1998) study of New Haven short-a systems. A final commonality is that the conditioned raising of /æ/ by following environment is variable for all speakers – not all instances of /æ/ are raised in these environments, but when /æ/ is raised, and enough to be noticeably so, it occurs in the environments noted above. Generally, the finding here of nineteenth century systems in ENE, showing an incipient version of the split system, is a new one, while the finding of systems in WNE showing the incipient split system is also fairly new, having only been previously suggested for the larger New England area in two unpublished reports – Johnson (1998) and Tuttle (1903) – both studies of only New Haven. The system we have observed among most speakers in both areas is more like a split system than a continuous system or perhaps a pre-Northern Cities Shift system (Boberg 2001 suggested this could be possible), for several reasons. First, the systems show variable, yet consistent, raising for TRAP before several of the key environments noted to typify split systems in US English previously, particularly in late nineteenth century English (Grandgent 1892; Kurath 1928a; Durian 2012). These include before nasal codas, simple nasals, and the voiceless fricatives /f/, /s/ and /θ/, but not, to any great extent, segments conditioning raising that typify the NCS, such as before voiceless stops and /g/. In addition, the operation of additional constraints on raising, such as the open syllable constraint and the function word constraint, further support this view. 
The systems of these TRAP-raising WNE and ENE speakers also show a type of short-a system that would eventually become nasal like those of the “far east” ENE speakers we discussed earlier, but in these areas, the system would go through a longer “change over” phase, transforming from a split style system, to a continuous system, and then finally to the nasal system. This is, at least, what happened in other areas in the country that showed the split system historically, but now show the nasal system predominating among the youngest speakers in the present day. These areas include Cincinnati (Boberg and Strassel 2000), Columbus (Durian 2012), New Orleans (Labov 2007), cities in the Hudson Valley (Dinkin 2009), and even parts of New York City itself (Becker 2012). As discussed in section 2, the vowel systems of speakers from these portions of ENE and WNE analyzed by Boberg (2001) and Laferriere (1977), as well as Labov et al. (2006), demonstrate that this type of continuation pattern – from split to continuous to nasal system – likely occurred, given how the birth dates of the speakers in their studies line up with those of our speakers here.

Given the split-like system we find among many of the speakers, especially the WNE speakers, our findings here add to the increasing set of data from different studies (e.g. Grandgent 1892; Kurath 1928a; Trager 1930; Johnson 1998; Durian 2012) calling into question the conclusions of Labov (2007) and Dinkin (2009) regarding the occurrence of split short-a systems in areas outside the East Coast during the late nineteenth and early twentieth centuries, that is, that the NYC style split short-a system first developed in New York City and then later diffused into
other parts of the US. As discussed by Durian (2012) and as the data in studies such as Grandgent (1892), Emerson (1891), Babbitt (1896), Tuttle (1903), Kurath (1928a), Trager (1930) and Johnson (1998) show, split short-a systems have been documented in the vowel systems of a variety of speakers of US English born throughout the nineteenth century, contemporaneous to their use in New York City. The data here add to these findings given the birthdates of our informants, who show the use of similar conditioning constraints for short a raising as speakers of comparable age, as studied by Babbitt in his 1892 study of New York City speech (Babbitt 1896). In addition, our findings confirm those of Johnson (1998) and Tuttle (1903), which were the first to suggest split systems might be found historically in this part of the country, although their focus was limited to New Haven.

To close this section, we wish to note that, although most of our speakers show an overall similarity in the realization of their TRAP class, there are some notable differences between speakers which we also need to mention briefly. One of the WNE speakers – Partridge – consistently and quite frequently made use of what appears to be an imitation broad a pronunciation of “after” as “arfter”. Two of our other WNE speakers – Russell and Clark – made variable use of BATH and TRAP realizations for several BATH class words throughout their field interviews. Russell used “staff” with both vowel realizations, as well as the words “vast” with TRAP and “fast” with BATH. Clark varies her realizations of “dance” and “past/passed” between TRAP and BATH in several spots in her interview, as well.
These differences suggest that both speakers might be making use of variable pronunciations because of a possible awareness of BATH realizations for these words being perceived as prestige pronunciations, given their currency as a standard realization in Boston during these speakers’ formative years (Laferriere 1977; Grandgent 1920). Finally, one additional point we wish to discuss briefly about our speakers is that while all show raising of TRAP in certain tokens of multiple syllable words, such as “gathering” or “graduated,” as well as words of extremely short duration, such as “has,” “as,” “at,” or “have,” it should be noted that words of these types are often raised regardless of dialect. Therefore, this raising can be ruled out in a diagnosis of short-a system configuration states such as we have made here regarding the occurrence of a split system for our speakers in the preceding analysis.

6 PALM, LOT and START

Turning to our analysis of the low vowels, a series of linear mixed-effects models was applied using the R package lme4 (Bates et al. 2014) to test the differences in mean F1, mean F2, and duration between the PALM and LOT classes, and between the LOT and THOUGHT classes. (The PALM-THOUGHT difference was not tested directly because these two classes have never been known to merge in New England unless they are in a three-way merger with LOT. However, a configuration where LOT contrasts with a combined PALM/THOUGHT class has been reported in Tidewater Virginia (Kurath and McDavid 1961: 73, 76, 79-80).)

When comparing the means of vowel classes, it is important to control for the phonological environment, because otherwise imbalances in the distribution of surrounding consonants can be mistaken for word class differences. For example, the THOUGHT vowel is more often followed by /l/ than the LOT vowel, and F2 tends to be lowered by a following /l/, so a spurious difference in F2 can appear significant if the following environment is not taken into account.
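The confound described in the preceding paragraph can be made concrete with a toy Python example. All numbers are invented for illustration and are not data from the study. Both classes are given the same underlying F2, a following /l/ lowers F2 by a fixed amount, and THOUGHT is simply followed by /l/ more often than LOT.

```python
from statistics import fmean

# Invented numbers for illustration only (not data from the study).
# Both classes share the same underlying F2; a following /l/ lowers F2 by 0.5 z units.
BASE_F2 = 0.0
L_EFFECT = -0.5


def make_tokens(word_class, n_before_l, n_elsewhere):
    """Build (word_class, environment, F2) tuples for one vowel class."""
    before_l = [(word_class, "l", BASE_F2 + L_EFFECT)] * n_before_l
    elsewhere = [(word_class, "other", BASE_F2)] * n_elsewhere
    return before_l + elsewhere


# THOUGHT is followed by /l/ far more often than LOT in this toy sample.
tokens = make_tokens("THOUGHT", 6, 4) + make_tokens("LOT", 1, 9)


def mean_f2(word_class, environment=None):
    """Mean F2 for a class, optionally restricted to one following environment."""
    values = [f2 for wc, env, f2 in tokens
              if wc == word_class and (environment is None or env == environment)]
    return fmean(values)


raw_difference = mean_f2("THOUGHT") - mean_f2("LOT")
difference_before_l = mean_f2("THOUGHT", "l") - mean_f2("LOT", "l")
difference_elsewhere = mean_f2("THOUGHT", "other") - mean_f2("LOT", "other")

print(f"raw difference: {raw_difference:.2f}")
print(f"before /l/:     {difference_before_l:.2f}")
print(f"elsewhere:      {difference_elsewhere:.2f}")
```

In this toy sample the raw class means differ by a quarter of a z unit, yet the difference vanishes once tokens are compared within the same following environment. The whole raw difference comes from the imbalance in following /l/, which is the kind of spurious effect that controlling for environment is meant to remove.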

The models controlled for both preceding and following environment (whether in the same word or the adjacent words) using the same 13 categories: labial obstruents, coronal obstruents, velar obstruents, /m/, /n/, /ŋ/, /l/, /r/, /h/, front vowel, central vowel, back vowel, and pause. Another predictor controlled for whether the vowel occurred in an open or closed syllable, and whether it occurred in a final or non-final syllable. This last factor is especially important as LOT does not occur in word-final position. Finally, a random intercept was included for word, which helps avoid distortions due to differences between the particular words uttered by each speaker.

The above variables were estimated from all the speakers’ data pooled together, resulting in more stable estimates, though at the cost of assuming that phonological environment affects each word class for each speaker in the same way. For each speaker, the model estimated one coefficient for the vowel quality (F1 or F2) or duration of LOT and another for the difference between LOT and THOUGHT or PALM. The significance of these word-class differences was obtained using lmerTest::summary, which calculates p-values from lme4 models using the Satterthwaite approximation (Kuznetsova et al. 2014).
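One way to write down a model of the kind described above is given below. This is a sketch reconstructed from the prose, not the authors' own specification: the symbols are chosen here for illustration, and details such as the exact coding of the syllable predictor are assumptions.

```latex
\[
y_i \;=\; \alpha_{s(i)} \;+\; \delta_{s(i)}\,T_i
      \;+\; \beta^{\mathrm{pre}}_{p(i)} \;+\; \beta^{\mathrm{fol}}_{f(i)}
      \;+\; \gamma_{k(i)} \;+\; u_{w(i)} \;+\; \varepsilon_i ,
\qquad
u_w \sim \mathcal{N}(0,\sigma_u^2), \quad
\varepsilon_i \sim \mathcal{N}(0,\sigma^2)
\]
```

Here $y_i$ is the F1, F2, or duration of token $i$; $s(i)$ is the speaker; $T_i$ is 1 if the token belongs to the comparison class (THOUGHT or PALM) and 0 if it belongs to LOT; $\alpha_s$ is the speaker's LOT value and $\delta_s$ the speaker's word-class difference; $\beta^{\mathrm{pre}}$ and $\beta^{\mathrm{fol}}$ are the effects of the 13 preceding and following environment categories; $\gamma_k$ is the effect of syllable type; and $u_w$ is the random intercept for word. The environment and syllable terms carry no speaker index, which reflects the pooling assumption stated above.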

Table 3 shows the results for LOT vs. THOUGHT while Table 4 shows the results for LOT vs. PALM. In both cases, the coefficients are expressed relative to LOT; positive values for F1 mean the other vowel is lower than LOT, positive values for F2 mean the other vowel is further front than LOT, and positive values for duration mean the other vowel is longer than LOT. The values for F1 and F2 were taken at the midpoint (50%) of each vowel, because this achieved the best separation between vowel classes known to be distinct.



Speaker                     F1 difference   F2 difference   Euclidean distance   Duration difference
Baker (Beverly MA)          -1.50***        -0.56***        1.60                 .009 (+6%)
Brewster (Plymouth MA)      -0.38**         0.17            0.41                 .011 (+9%)
Clark (Northampton MA)      -0.71***        -0.84***        1.09                 .006 (+5%)
Covell (Newport RI)         -1.20***        -0.70***        1.39                 .032 (+23%)***
Hubbard (Litchfield CT)     -0.78***        -0.66***        1.02                 .021 (+17%)*
Miner (New London CT)       -0.41***        0.12            0.43                 .014 (+10%)
Partridge (Pittsfield MA)   -0.27*          -0.62***        0.67                 .007 (+7%)
Russell (Middletown CT)     -0.67***        -0.51***        0.84                 .026 (+20%)***
Schofield (New Haven CT)    -0.56***        -1.01***        1.16                 -.002 (-1%)
Weeden (Providence RI)      -0.61***        -0.35***        0.70                 .000 (+0%)

Table 3. F1, F2 and duration differences between LOT and THOUGHT, by speaker (*** p < .001, ** p < .01, * p < .05, unstarred differences: p > .05)

Table 3 shows that all speakers have a significant word class distinction between LOT and THOUGHT. For all ten speakers, THOUGHT is higher than LOT, with the difference ranging between 0.27 and 1.50 z-score units. And for eight of ten speakers, THOUGHT is also further back than LOT, with the difference ranging between 0.35 and 1.01 units. The other two speakers do not show a significant F2 difference. Only three of the speakers show a significant duration difference, although for eight of the ten, THOUGHT is longer than LOT. The largest duration difference, for Covell (Newport RI), is 32 msec; her THOUGHT vowel is 23% longer on average than her LOT vowel.
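The Euclidean distance column in Table 3 appears to combine the F1 and F2 differences in the usual way, as the square root of the sum of their squares. The Python sketch below checks this for two speakers, using the printed values. The function name is chosen here for illustration, and the assumption that the column was computed this way is inferred from the numbers rather than stated in the text.

```python
import math


def euclidean_distance(f1_diff: float, f2_diff: float) -> float:
    """Distance between two class means in the normalized F1-F2 plane."""
    return math.hypot(f1_diff, f2_diff)


# F1 and F2 differences (THOUGHT relative to LOT) as printed in Table 3.
table3 = {
    "Baker": (-1.50, -0.56),
    "Covell": (-1.20, -0.70),
}

for speaker, (f1_diff, f2_diff) in table3.items():
    print(speaker, f"{euclidean_distance(f1_diff, f2_diff):.2f}")
```

For these two speakers the recomputed distances round to the printed 1.60 and 1.39. For some other speakers a recomputation from the printed differences can be off by 0.01, since the printed differences are themselves rounded.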

A significant difference in any of these measures is enough to show a word-class distinction, and only Brewster lacks any of the three at the p < .001 level. Brewster, from Plymouth, and Baker, from Beverly, are both from the Eastern area that would later develop the LOT-THOUGHT merger, but Brewster (b. 1889) is considerably younger than Baker (b. 1857), who maintains a much wider contrast. So it is possible that Brewster’s smaller distinction (Euclidean distance: 0.41) reflects phonetic approximation between the classes. On the other hand, Miner also has a small distinction (ED: 0.43), and he comes from New London, where LOT and THOUGHT remain distinct to this day.

Assuming this classification is correct, then the impressionistic auditory transcriptions of the LANE fieldworkers resulted in the correct recording of a LOT-THOUGHT distinction for Baker and Brewster, though it was not systematized as such in PEAS. The merger was misattributed to Covell, Miner, and Weeden by fieldworker Rachel Harris (see also McDavid 1981, Johnson 2010: 32). A correct identification of a LOT-THOUGHT distinction was made for the five speakers further west (Kurath and McDavid 1961: 36-46, except Kurath et al. 1939-43 for Baker, who has no PEAS synopsis).

Speaker                     F1 difference   F2 difference   Euclidean distance   Duration difference
Baker (Beverly MA)          0.37**          0.47***         .60                  .025 (+18%)**
Brewster (Plymouth MA)      -0.08           0.41***         .41                  .035 (+27%)***
Clark (Northampton MA)      0.25*           0.17**          .31                  .043 (+36%)***
Covell (Newport RI)         0.09            -0.09           .13                  .057 (+39%)***
Hubbard (Litchfield CT)     -0.18           0.15            .23                  .038 (+29%)***
Miner (New London CT)       -0.17           0.17*           .24                  .039 (+28%)***
Partridge (Pittsfield MA)   0.35*           0.28***         .45                  .057 (+51%)***
Russell (Middletown CT)     -0.04           -0.04           .05                  .061 (+45%)***
Schofield (New Haven CT)    0.18            -0.01           .18                  .051 (+38%)***
Weeden (Providence RI)      -0.01           0.03            .03                  .046 (+25%)***

Table 4. F1, F2 and duration differences between LOT and PALM, by speaker (*** p < .001, ** p < .01, * p < .05, unstarred differences: p > .05)

The situation with LOT and PALM, as seen in Table 4, is rather different, but again the evidence is that all speakers distinguish these two classes. While the formant differences are smaller than for LOT-THOUGHT, the duration differences are larger. In Eastern Massachusetts, the area where LOT and PALM would remain distinct in the twentieth century, we see a difference in vowel quality, with PALM lower than LOT for Baker (by 0.37 units) and further front than LOT for both Baker and Brewster (by 0.47 and 0.41 units, respectively). Elsewhere, Clark (Northampton MA), Partridge (Pittsfield MA) and Miner (New London CT) have similar but smaller distinctions, most consistently in F2. The remaining five speakers have no significant difference in either F1 or F2. We do not see any evidence of the New York (and Southern) pattern where PALM is further back than LOT.

Nevertheless, all ten speakers show a significant duration difference between LOT and PALM, with PALM between 25 and 61 msec longer, on average. Dividing these differences by each speaker’s estimated LOT duration, we can say that PALM is between 18% and 51% longer than LOT. Interestingly, Baker, who produces the greatest distinction in quality (ED: .60), is the person with the smallest difference in duration, while the five speakers with no significant vowel quality distinction have a PALM class that is between 25% and 45% longer than the LOT class. (A similar inter-dialectal trade-off between quality and duration as cues to the same vowel contrast was recently observed by Fridland et al. 2014.)
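The percentage figures follow from dividing each duration difference by the speaker's estimated LOT duration. The LOT durations themselves are not printed, so the Python sketch below back-calculates them from the values given in Table 4 for Baker and Covell. The function names are chosen here for illustration, and the back-calculated durations are only approximate because both inputs are rounded.

```python
def percent_longer(duration_diff_s: float, lot_duration_s: float) -> float:
    """Duration difference expressed as a percentage of the LOT duration."""
    return 100.0 * duration_diff_s / lot_duration_s


def implied_lot_duration(duration_diff_s: float, percent: float) -> float:
    """LOT duration (in seconds) implied by a printed difference and percentage."""
    return duration_diff_s / (percent / 100.0)


# Duration differences (seconds) and percentages as printed in Table 4.
table4 = {
    "Baker": (0.025, 18),
    "Covell": (0.057, 39),
}

for speaker, (diff_s, pct) in table4.items():
    lot_s = implied_lot_duration(diff_s, pct)
    print(f"{speaker}: implied LOT duration of about {lot_s * 1000:.0f} msec")
```

Both speakers come out with an implied LOT duration of roughly 140 to 150 msec, which shows how a larger absolute difference translates into a larger percentage when the LOT baseline is similar.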

While these duration differences between LOT and PALM are not as large as those usually observed in languages that distinguish vowel length in their phonologies (Tsukada 2009), they are comparable to those of another Providence LANE speaker, whose PALM vowel was measured as 40% longer than her LOT vowel (Johnson 2010: 37). And recall that Moulton (1968) specifically claimed the contrast was only in quantity: LOT [ɑ] vs. PALM [ɑ:].

Looking at the LANE/PEAS data, we see a fairly similar picture. Baker shows a partially consistent distinction in terms of quality but not quantity, while Brewster shows it consistently in both dimensions: [ɒ] for LOT, [aˑ] for PALM. The three Harris informants are also shown with [a] for PALM, but this may be part of the way Harris misapprehended their vowels in the light of her own Boston-area system. Of the five western speakers, Clark and Partridge are shown with a distinction in both quality and quantity (we also observed both), and a corresponding phonemic distinction is indicated. Schofield is shown with [ɑ] for LOT and [aˑ] for PALM, and Russell has [ɑ] for LOT and [a], [a:] or [ɑ:] for PALM, but no phonemic distinction was drawn by the editors. (In our analysis, we noted the length difference more than any difference in quality.) Only Hubbard’s synopsis shows little trace of a LOT-PALM distinction – John and college overlap with palm and father. Hubbard is the most “interior” of our speakers; he is by far the most rhotic, and along with Schofield he completely lacks the broad a. So it is possible that he could have the LOT-PALM merger characteristic of other rhotic areas away from the East Coast (see section 2). Still, he shows a significant length difference, although at +29% it is smaller than that of the other four Western speakers.

LOT and PALM are largely merged today in Western Massachusetts, Connecticut, and Rhode Island. (In rhotic areas, it passes without comment, but when it is combined with non-rhoticity, this merger has become the source of humorous commentary on the Rhode Island dialect: “mock my words”, “pocking lot”, “hot attack”, etc.) However, we cannot see evidence for any phonetic approximation preceding merger among our speakers. It is true that our oldest Western speaker, Partridge (b. 1847), has the biggest duration difference, with his PALM 51% longer than his LOT. But the next two oldest speakers have much smaller differences: 29% for Hubbard (b. 1856) and 28% for Miner (b. 1859). Meanwhile, the three youngest Western speakers have duration differences in the middle of the range: 36% for Clark (b. 1880), 39% for Covell (b. 1880), and 38% for Schofield (b. 1886). Considering that Moulton also preserved the length distinction, despite being born as late as 1914, we can conclude that the loss of the LOT-PALM distinction in Western New England – and perhaps elsewhere on the East Coast – was primarily a twentieth-century phenomenon. Map 2 summarizes the configuration of these three vowel classes.
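The lack of an age trend in these figures can be checked informally. The following is a minimal sketch, not part of the original analysis: it takes the six birth years and duration differences quoted in the paragraph above and computes a plain Pearson correlation. The function name `pearson_r` and the choice of statistic are our own assumptions for illustration.

```python
from math import sqrt

# Birth year and PALM-vs-LOT duration difference (in %) for the six Western
# speakers, copied from the paragraph above.
speakers = {
    "Partridge": (1847, 51),
    "Hubbard": (1856, 29),
    "Miner": (1859, 28),
    "Clark": (1880, 36),
    "Covell": (1880, 39),
    "Schofield": (1886, 38),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


years = [year for year, _ in speakers.values()]
diffs = [diff for _, diff in speakers.values()]
r = pearson_r(years, diffs)
print(f"r = {r:.2f}")
```

With only six speakers the coefficient is no more than a rough indication, but it comes out only weakly negative, which fits the observation that the duration difference does not shrink steadily with later birth years.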


Map 2. PALM-LOT-THOUGHT configurations, centered on LOT (size of points proportional to duration)

7 NORTH and FORCE

Moving on to NORTH and FORCE, it is very difficult to identify the individual points on the crowded PEAS maps, but it seems that Clark and Hubbard may have been represented with this merger. Turning to the synopses, most of our speakers are shown as distinct, with the transcriptions [ɔ], [ɔˑ], or [ɔə] in the NORTH word horse and [ɔə], [o ə], or [oə] in the FORCE word hoarse (this is the only true minimal pair presented in the PEAS synopses, although other pairs elicited, like morning/mourning, may have been considered in the analysis). With several other words included, Clark shows a regular small distinction between [ɔˑ] and [ɔə], but Hubbard looks more merged, including a “flip-flop” (Hall-Lew 2013) between [oɚ] in forty (NORTH) and [ɔɚ] in four (FORCE).

Today, the NORTH-FORCE merger has spread considerably, taking over most of the South and consolidating its hold on New York and Western New England. However, the distinction is still produced in ENE by some non-rhotic or partially-rhotic speakers (Labov et al. 2006: 47-53; compare Maps 7.1 and 8.2). In Boston, and probably elsewhere, low NORTH realizations like [ɔ] or [ɒ] have become socially marked, initiating a merger-by-transfer into the higher FORCE word class (Laferriere 1979). (Intriguingly, speakers consider realizing “short” as [ʃoət] to be “putting the r in” (605).) This transfer has been described as a feature of current Boston mayor Marty Walsh’s “very modern take on the Boston dialect” (Baker 2013); for example, Walsh “pronounces the neighborhood he grew up in [as] ‘Dohchestah’ rather than the ‘Dawchestah’ of old” (i.e. he uses [oə] rather than [ɒ]).

We conducted a parallel analysis using mixed-effects models controlling for the preceding and following segment, syllable position, and word identity. Because the number of NORTH and FORCE tokens per speaker was smaller than for the word classes in the previous section, moderate differences in means are less statistically significant than they were there, as seen in Table 5. Nevertheless, we must conclude that evidence for the NORTH-FORCE distinction is weaker across the board.

Speaker               F1 difference   F2 difference   Euclidean distance   Duration difference
Baker (Beverly MA)    -0.26           -0.12           0.28                 0.029 (+22%)
Brewster              -0.64***        0.46**          0.78                 -0.004 (-3%)
Clark                 -0.43**         -0.10           0.44                 0.032 (+23%)
Covell (Newport RI)   -0.20           -0.11           0.23                 0.005 (+2%)
Hubbard               -0.63***        -0.07           0.63                 0.011 (+9%)
Miner                 0.18            0.12            0.22                 0.029 (+22%)
Partridge             -0.81***        -0.12           0.82                 0.021 (+15%)
Russell               -0.07           -0.15           0.17                 0.036 (+22%)**
Schofield             -0.02           -0.13           0.13                 0.008 (+6%)
Weeden                -0.34*          0.10            0.36                 0.043 (+22%)**

Table 5. F1, F2 and duration differences between NORTH and FORCE, by speaker (*** p < .001, ** p < .01, * p < .05, italics: p > .05)
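The Euclidean distance column in Table 5 appears to be the straight-line distance in the F1 × F2 plane. The following minimal sketch assumes exactly that (the table itself does not spell out the formula) and recomputes the distance from the two rounded differences for each row, in the order the rows appear in the table. The helper name `euclidean_distance` and the rounding tolerance are hypothetical choices for illustration.

```python
from math import hypot

# (F1 difference, F2 difference, reported Euclidean distance), one tuple per
# row of Table 5, in the order the rows appear there.
table5_rows = [
    (-0.26, -0.12, 0.28),
    (-0.64, 0.46, 0.78),
    (-0.43, -0.10, 0.44),
    (-0.20, -0.11, 0.23),
    (-0.63, -0.07, 0.63),
    (0.18, 0.12, 0.22),
    (-0.81, -0.12, 0.82),
    (-0.07, -0.15, 0.17),
    (-0.02, -0.13, 0.13),
    (-0.34, 0.10, 0.36),
]


def euclidean_distance(d_f1, d_f2):
    """Distance between two vowel class means, given their F1 and F2 differences."""
    return hypot(d_f1, d_f2)


# Assumed tolerance: the published differences are rounded to two decimals,
# so a recomputed distance can be off by about 0.01.
TOLERANCE = 0.015

for d_f1, d_f2, reported in table5_rows:
    recomputed = euclidean_distance(d_f1, d_f2)
    status = "ok" if abs(recomputed - reported) <= TOLERANCE else "check"
    print(f"{d_f1:+.2f} {d_f2:+.2f} reported={reported:.2f} "
          f"recomputed={recomputed:.3f} {status}")
```

Every row agrees with the reported value to within rounding, so the column adds no information beyond F1 and F2; it simply condenses the two quality differences into a single figure.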

Six speakers showed evidence of a NORTH-FORCE distinction. From the F1 model, FORCE is higher than NORTH for nine of ten speakers, with the difference reaching significance for Brewster, Clark, Hubbard, Partridge, and Weeden. The only speaker with a significant F2 distinction was Brewster, whose FORCE was – unexpectedly – further front than his NORTH. In terms of duration, we might expect a distinct FORCE, often represented as a diphthong, to be longer than NORTH, and it is for nine of ten speakers, but the difference is only significant for Russell and Weeden.



Another way to look at NORTH and FORCE is in the context of the neighboring vowel classes. In the classic pattern where NORTH and FORCE are distinct, NORTH is close or identical to THOUGHT, while at least the nucleus of FORCE is the same as that of GOAT. This can be seen in the older transcriptions with [ɔ] and [o], respectively. Unlike modern RP, where NORTH/FORCE is clearly the same as THOUGHT, in most modern American accents we find a merged nuclear quality between [ɔ] and [o]. Unless THOUGHT is very raised, it is clearly lower than the merged NORTH/FORCE vowel (even in NYC, sauce was measurably lower than source, despite speakers’ judgments; Labov et al. 1972). At the same time, the American merged NORTH/FORCE is lower than GOAT.

Of our ten New England speakers, four show the older pattern where GOAT and FORCE are similar in F1, and are higher than THOUGHT and NORTH, which are also similar in F1. These four are Brewster, Clark, Hubbard, and Partridge. The latter two are the oldest Western speakers (b. 1856 and 1847). Brewster is the youngest speaker in the sample (b. 1889), but he is Eastern and non-rhotic, the same profile of speaker who might retain a NORTH/FORCE distinction even today. Note that the distinctions shown by Clark and possibly Hubbard do not seem to match the PEAS records discussed above.

Three of our speakers, Baker, Russell, and Schofield, show the more common modern pattern where NORTH and FORCE are close together in height – indeed, possibly merged – in an intermediate position: higher than THOUGHT but lower than GOAT. While Russell and Schofield are among the younger speakers, the inclusion of Baker in this group is surprising, since she is older, non-rhotic, and has such a conservative ENE vowel system. However, Baker only produced nine NORTH tokens, which may not have been enough to accurately assess the relationship among these vowels.

For Covell and Weeden, NORTH and FORCE may be intermediate between the old and new patterns: NORTH is higher than THOUGHT and FORCE is lower than GOAT, but NORTH and FORCE are not a separate class, as they seem to be for the three speakers above. Finally, Miner’s pattern is unclear and unlike the others’ (e.g. with NORTH higher than FORCE). We should note that Miner’s recording was of even worse quality than the others, leading to more overlapping of vowel classes and distinctions and patterns that were generally less clear.

Map 3 shows the configurations of these four vowels for the ten speakers.


Map 3. THOUGHT-NORTH-FORCE-GOAT configurations, centered on NORTH (size of points proportional to duration)

8 Conclusion

Our study analyzed parts of the vowel systems of ten “cultured” informants from the Hanley Recordings for three reasons. As mentioned already, we wanted to be able to evaluate the impressionistic transcription abilities of an earlier generation of dialectologists. In this respect, we found that these pioneers were not very accurate in dealing with short a, but were generally much better at dealing with the other low vowels. The second reason was to highlight that LANE did not interview only “NORMs” (non-mobile rural older males), a practice that is sometimes turned into a criticism of traditional dialectology. On the contrary, LANE and PEAS also treated the regional pronunciation of both men and women, including educated informants with wide social networks. It is these varieties, perhaps more than the local dialects of farmers and fishermen, that have evolved into the regional accents spoken today, and this is the third reason they are of special interest.

Regarding short a, our study reinforces the view that the split pattern that today, on the East Coast, is restricted to the environs of New York City, Philadelphia, and Baltimore, was once found much more widely, including in WNE. Even ENE showed signs of the split in cases when the broad a did not occur. This finding reinforces that of Durian (2012) in casting some doubt on the diffusion account of Labov (2007), which suggests that certain other cases resembling split short a (for example, in Cincinnati and New Orleans) originated from contact with New York. While the story has by no means been fully told, it seems possible that instead, a split short a (or broad a pattern) was original to much of American colonial settlement, at least in the North and Midland (making the few cities that retain it relic areas). Our results also call into question the idea that the Northern Cities Shift began in WNE, as argued by Boberg (2001), rather than in the Inland North itself. While we agree with Boberg that short a is key to the development of the NCS, our New England speakers, like others elsewhere (e.g. Durian 2012; Becker 2010; Boberg and Strassel 2000), appear to be following a trajectory from a split to a continuous to a nasal system. The continuous system can quite easily be mistaken for the kind of everywhere-tense system that is likely to have been the precursor of the NCS, but we found no such pattern in any part of New England.

As far as the other low vowels are concerned, the phonological configuration of PALM, LOT, and THOUGHT was very different for our speakers, born mostly in the second half of the nineteenth century, than for most New Englanders born in the twentieth century. The characteristic mergers of PALM and LOT (WNE) or LOT and THOUGHT (ENE), found to be nearly ubiquitous in Johnson (2010), had not yet occurred in the areas we studied, and we still know little about how they did (there is some evidence that LOT and THOUGHT merged earlier in New Hampshire and Maine). Phonetically, though, PALM was fronter than LOT for the speakers in Eastern Massachusetts and several others. But in Rhode Island and Connecticut, the PALM-LOT distinction was one of length, as stated in Moulton (1968). This purely quantitative opposition, fairly uncharacteristic for American English, is another way in which WNE was different from New York State and the rest of the Inland North dialect area, where PALM and LOT apparently fell together much sooner (e.g. for a speaker born just before 1800 in the Hudson Valley; Labov 2010: 162).



There are 286 New England speakers in the Hanley Recordings; this has only been a partial analysis of ten cultured speakers from Southern New England. To go to the opposite extreme, the farmers, fishermen and sea captains of Northern New England could teach us much about more conservative lexical and phonological systems. We can only reiterate the appeal of Purnell (2012) to make further use of this incredibly rich resource.

References

American Folklife Center 2009. American Dialect Society Collection. AFC1984/011. Washington, DC: American Folklife Center.

Ash, Sharon 2002. The distribution of a phonemic split in the Mid-Atlantic region: Yet more on short a. University of Pennsylvania Working Papers in Linguistics 8.3: 1-11.

Babbitt, E. H. 1896. The English of the lower classes in New York City and vicinity. Dialect Notes 1: 457-464.

Baker, Billy 2013. In Walsh, students of Bostonese have found their avatah. Boston Globe, http://www.bostonglobe.com/metro/2013/11/17/bostonaccent-strong-mayor-elect walsh/AFouySIXDXFVE58IMsJGwI/story.html

Bates, Douglas, Martin Maechler, Ben Bolker and Steven Walker 2014. lme4: Linear mixed-effects models using Eigen and S4. R package version 1.1-6. http://CRAN.R-project.org/package=lme4

Becker, Kara 2010. Social conflict and social practice on the Lower East Side: A study of regional dialect features in New York City English. Ph.D. dissertation, New York University.

Bloch, Bernard 1935. The treatment of Middle English final and preconsonantal R in the present-day speech of New England. Ph.D. dissertation, Brown University.

Boberg, Charles 2001. The phonological status of Western New England. American Speech, 76.1: 3-29.

Boberg, Charles, and Stephanie Strassel 2000. Short a in Cincinnati: A change in progress. Journal of English Linguistics 28: 108-126.

Boersma, Paul, and David Weenink 2014. Praat: Doing linguistics by computer. Version 5.3.71 http://www.fon.hum.uva.nl/praat

Chase, Margaret Taft. The derivatives of Middle English short o in the speech of New England. Masters thesis, Brown University.

Clopper, Cynthia, David B. Pisoni, and Kenneth de Jong 2005. Acoustic characteristics of the vowel systems of six regional varieties of American English. Journal of the Acoustical Society of America, 118.3: 1661-1676.

Dinkin, Aaron 2009. Dialect boundaries and phonological change in Upstate New York. Ph.D. dissertation, University of Pennsylvania.

Dinkin, Aaron 2011. Weakening resistance: Progress toward the low back merger in New York State. Language Variation and Change 23.3: 315-345.

Dobson, Eric J. 1957. English Pronunciation, 1500-1700. Volume II: Phonology. Oxford: Clarendon Press.


Durian, David 2012. A new perspective on vowel variation across the nineteenth and twentieth centuries in Columbus, OH. Ph.D. dissertation, The Ohio State University.

Ekwall, Eilert 1946. American and British Pronunciation. Uppsala: American Institute of Uppsala.

Emerson, Oliver F. 1891. The Ithaca dialect: A study of present English. Dialect Notes 1: 85-173.

Ferguson, Charles 1972. ‘Short a’ in Philadelphia English. In: M. Estellie Smith (ed.), Studies in Honor of George L. Trager. The Hague: Mouton, pp. 259-274.

Fisher, Sabriya, Hilary Prichard and Betsy Sneller 2014. The apple doesn’t fall far from the tree: Incremental change in Philadelphia families. Paper presented at NWAV 43, Chicago.

Fridland, Valerie, Tyler Kendall and Charlie Farrington 2014. Durational and spectral differences in American English vowels: Dialect variation within and across regions. Journal of the Acoustical Society of America 136(1): 341-349.

Garde, Paul 1961. Reflexions sur les différences phonétiques entre les langues slaves. Word 17: 34-62.

Gordon, Matthew J. 2006. Tracking the low back merger in Missouri. In: Thomas Murray and Beth Simon (eds), Language Variation and Change in the American Midland: A New Look at ‘Heartland’ English. Philadelphia: John Benjamins, pp. 57-68.

Grandgent, Charles H. 1890. Vowel measurements. Publications of the Modern Language Association of America. Supplement to Vol. V, No. 2.

Grandgent, Charles H. 1892. ‘Haf’ and ‘haef’. Dialect Notes 1: 269-275.

Grandgent, Charles H. 1920. Fashion and the broad a. Old and New: Sundry Papers. Cambridge, MA: Harvard University Press, pp. 25-30.

Hall-Lew, Lauren 2009. Ethnicity and phonetic variation in a San Francisco neighborhood. PhD dissertation, Stanford University.

Hall-Lew, Lauren 2013. ‘Flip-flop’ and mergers-in-progress. English Language and Linguistics 17.2: 359-390.

Hanley, Miles 1936. Phonographic recording. In: Daniel Jones and Dennis B. Fry (eds) Proceedings of the Second International Congress of the Phonetic Sciences. Cambridge: Cambridge University Press, pp. 75-82.

Herold, Ruth 1990. Mechanisms of merger: The implementation and distribution of the low back merger in Eastern Pennsylvania. Ph.D. dissertation, University of Pennsylvania.

Herold, Ruth 1997. Solving the actuation problem: Merger and immigration in Eastern Pennsylvania. Language Variation and Change 9.2: 165-189.

Hickey, Raymond 2004. A Sound Atlas of Irish English. Berlin/New York: Mouton de Gruyter.

Irons, Terry Lynn 2007. On the status of low back vowels in Kentucky English: More evidence of merger. Language Variation and Change 19.2: 137-180.

Jacewicz, Ewa, Robert Allen Fox, and Joseph Salmons 2011. Cross-generational vowel change in American English. Language Variation and Change 23: 45-86.

Johnson, Daniel Ezra 1998. The tensing and laxing of short ‘a’ in New Haven, Connecticut. BA thesis, Yale University.



Johnson, Daniel Ezra 2010. Stability and Change along a Dialect Boundary: The Low Vowels of Southeastern New England. Publication of the American Dialect Society, 95. Durham, NC: Duke University Press.

Kurath, Hans 1928a. American pronunciation. S[ociety for] P[ure] E[nglish] Tract XXX: 279-297.

Kurath, Hans 1928b. The origin of the dialectal differences in spoken American English. Modern Philology 25.4: 385-395.

Kurath, Hans, and Raven I. McDavid, Jr. 1961. Pronunciation of English in the Atlantic States. Ann Arbor, MI: University of Michigan Press.

Kurath, Hans, Marcus L. Hansen, Julia Bloch, and Bernard Bloch (eds) 1939. Handbook of the Linguistic Geography of New England. Providence, RI: Brown University.

Kurath, Hans, Miles L. Hanley, Bernard Bloch, Guy S. Lowman, Jr., and Marcus L. Hansen (eds) 1939-1943. Linguistic Atlas of New England. 6 Vols. Providence, RI: Brown University.

Kuznetsova, Alexandra, Per Bruun Brockhoff and Rune Haubo Bojesen Christensen 2014. lmerTest: Tests for random and fixed effects for linear mixed effect models. R package version 2.0-6. http://CRAN.R-project.org/package=lmerTest

Labov, William 1963. The social motivation of a sound change. Word 19: 273-309.

Labov, William 1966. The Social Stratification of English in New York City. Washington, DC: Center for Applied Linguistics.

Labov, William 2007. Transmission and diffusion. Language 83.2: 344-387.

Labov, William 2010. Principles of Linguistic Change: Cognitive and Cultural Factors. Oxford: Wiley-Blackwell.

Labov, William, Malcah Yaeger and Richard Steiner 1972. A Quantitative Study of Sound Change in Progress. Philadelphia: U.S. Regional Survey.

Labov, William, Sharon Ash, and Charles Boberg 2006. The Atlas of North American English: Phonetics, Phonology, and Sound Change. Berlin: Mouton de Gruyter.

Laferriere, Martha 1977. Boston short a: Social variation as historical residue. In: Studies in Language Variation: Semantics, Syntax, Phonology, Pragmatics, Social Situations, Ethnographic Approaches (ed.) Ralph W. Fasold and Roger W. Shuy. Washington, D.C.: Georgetown University Press, pp. 100-107.

Laferriere, Martha 1979. Ethnicity in phonological variation and change. Language 55: 603-617.

Lass, Roger 1976. English Phonology and Phonological Theory. Cambridge: Cambridge University Press.

Lobanov, B. M. 1971. Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America 49.2: 606-608.

McDavid, Raven I. 1981. Low-back vowels in Providence: A note in structural dialectology. Journal of English Linguistics ??: ??-??.

Moulton, William G. 1968. Structural dialectology. Language 44.3: 451-466.

Nagy, Naomi and Patricia Irwin 2010. Boston (r): Neighbo(r)s nea(r) and fa(r). Language Variation and Change 22.2: 241-278.

Nagy, Naomi and Julie Roberts 2004. New England: Phonology. In: Edgar Schneider, Kate Burridge, Bernd Kortmann, Rajend Mesthrie and Clive Upton (eds) A Handbook of Varieties of English. Volume 1: Phonology. Berlin, NY: Mouton de Gruyter, pp. 270-281.

Newlin-Łukowicz, Luiza 2013. Is the low back merger facilitated by L1? Poster presented at NWAV 42. Pittsburgh, PA.

Parslow, Robert L. 1967. The pronunciation of English in Boston, Massachusetts: Vowels and consonants. Ph.D. dissertation, University of Michigan.

Purnell, Thomas 2012. Dialect recordings from the Hanley Collection, 1931-1937. American Speech 87.4: 511-513.

Roberts, Julie 2007. Vermont lowering? Raising some questions about (ay) and (aw) south of the Canadian border. Language Variation and Change 19.2: 181-197.

Rosenfelder, Ingrid, Josef Fruehwald, Keelan Evanini and Jiahong Yuan 2011. FAVE (Forced Alignment and Vowel Extraction) Program Suite. http://fave.ling.upenn.edu.

Thomas, Erik R. 2001. An Acoustic Analysis of Vowel Variation in New World English. Publication of the American Dialect Society 85. Durham, NC: Duke University Press.

Trager, George L. 1930. The pronunciation of ‘short a’ in American Standard English. American Speech 5: 396-400.

Trager, George L. 1940. One phonemic entity becomes two: The case of ‘short a’. American Speech 15: 255-258.

Tsukada, Kimiko 2009. An acoustic comparison of vowel length contrasts in Arabic, Japanese and Thai: durational and spectral data. International Journal of Asian Language Processing 19.4: 127-138.

Tuttle, George 1902. Phonetic notation. In: Edward Scripture (ed.), Studies from the Yale Psychology Laboratory X: 96-117.

Waterman, Margaret 1974. The Hanley tapes. Unpublished manuscript, University of Wisconsin-Madison.

Wells, J. C. 1982. Accents of English. 3 Vols. Cambridge: Cambridge University Press.

Wetmore, Thomas 1959. The Low Central and Low Back Vowels in the English of the Eastern United States. Publication of the American Dialect Society, 32. University, AL: University of Alabama Press.

Wong, Amy Wing-mei 2012. The lowering of raised-THOUGHT and the low-back distinction in New York City: Evidence from Chinese Americans. University of Pennsylvania Working Papers in Linguistics 18.2: Article 18.

Wyld, Henry Cecil 1936. A History of Modern Colloquial English. Oxford: Basil Blackwell.

Zelinsky, Wilbur 1973. The Cultural Geography of the United States. Englewood Cliffs, NJ: Prentice Hall.



12 Upper Midwestern English

Thomas Purnell, Eric Raimy, Joseph Salmons

1 Upper Midwestern English

Transcriptions and recordings made over a half-century ago can help us understand contemporary vowel changes. In this paper, we turn to archival recordings of speech in Upper Midwestern English (henceforth UME). The region of the United States referred to as the Upper Midwest stretches from Chicago and northern Illinois to the Upper Peninsula of Michigan and westward to the Dakotas (Map 1). We focus narrowly here on Wisconsin, Minnesota and Michigan’s Upper Peninsula (UP), a region split across dialect areas under various views. For example, Carver’s (1987) lexical isoglosses divide it across the “North” (Map 3.3, p. 56), Upper North (Map 3.7, p. 68) and Upper Midwest (Map 3.11, p. 83).

Ongoing Upper Midwest vowel changes include changes to the GOAT (/o/), CLOTH (/ɔ/) and TRAP (/æ/) vowels (Labov 1991; Labov, Ash and Boberg 2006: Map 14.8, p. 203; Benson, Fox and Balkman 2011). Archival data shows, in essence, the prehistory of the patterns these three vowel classes display. First, older (Allen 1973) and newer (Thomas 2001) reports claim a Scandinavian substratal monophthongal GOAT vowel. Minnesota speakers with variable monophthongization are geographically arranged, with speakers in the southeastern corner of the state more diphthongal, not closely matching Scandinavian settlement patterns. For the Low Back Merger, involving LOT and CLOTH, Labov (1966) and Labov, Ash and Boberg (2006) draw isoglosses for the merger within the region. Previous work leads us to expect more merged speakers over time with a west-to-east spread. Data from Allen’s Linguistic Atlas of the Upper Midwest (LAUM) provides evidence for neither. Finally, researchers (e.g. Labov et al. 2006) report that the Northern Cities Shift (NCS) is present in parts of the Upper Midwest (Gordon 2000: 116, 2004: 296; see also Speaker 10, Lino, MN, in Thomas 2001: 72); however, our evidence focuses on ‘short-a tensing’ of /æ/ in TRAP as a precursor to NCS features only at the southeastern edge of the region. Archival recordings of ‘Arthur the Rat’ from the Dictionary of American Regional English and the Wisconsin English Language Survey help date these changes by consonantal environment, providing evidence for pre-apical raising long before the NCS is thought to have begun (see Johnson and Durian, this volume) or reached the region.

Our findings revise the regional history of these vowels. For each, modern variants are well represented in speakers born in the last century. These are less changes spreading systematically through the region than variables widely present from early on, now emerging as regional markers in various configurations and realizations.

This chapter is structured as follows. First, we introduce archival sources on UME. Second, we examine the GOAT, CLOTH and TRAP vowel classes. Third, we provide conclusions, the most important being that older data suggests a more nuanced input to contemporary patterns for those three vowels.

Map 1. Upper Midwestern states by narrow or broad reference.

2 Archives of Upper Midwestern speech

The Upper Midwest’s history has shaped its English: diverse indigenous populations existed before the migration of Yankees (Anglo-American English speakers) from the east and heavy European immigration, especially by speakers of Germanic and Slavic languages. Later northern migration by African Americans preceded the arrival of speakers of Southeast Asian languages and Spanish (Salmons and Purnell 2010, Purnell, Raimy and Salmons 2013). Beyond vowels, UME presents interesting lexical patterns (Carver 1987, von Schneidemesser 2013) and grammatical features (Salmons and Purnell 2010). Obstruents show differences in voice onset times (Jacewicz, Fox and Lyle 2009; Litty 2014; Rodgers 2014) and final devoicing (Purnell, Salmons and Tepeli 2005; Purnell et al. 2005b).

UME is only being ‘enregistered’ today (Remlinger, von Schneidemesser and Salmons 2009; Jacewicz, Salmons and Fox 2006; Purnell, Raimy and Salmons 2009), but earlier speech here is remarkably well documented. Four main archival sets provide data (Tables 1 and 2): the Linguistic Atlas of North Central States (LANCS), the Linguistic Atlas of the Upper Midwest (LAUM), the Wisconsin English Language Survey (WELS) and the Dictionary of American Regional English (DARE). Records exist for speakers born before effective English-speaking settlement of the region, see Figure 2 (Ostergren 1997:138), with settlement generally spreading east to west (Map 2). Frederic Cassidy transcribed a LANCS speaker born in 1850, while Harold Allen transcribed a speaker from MN born in 1859. With respect to recordings, Allen’s oldest LAUM speaker from MN was born in 1864, Cassidy’s oldest WELS speaker in 1870, and the oldest Upper Midwestern (MI UP, MN, WI) DARE speaker in 1880. Other audio recordings have not been transcribed or analyzed. In Wisconsin alone, William Schereck oversaw 55 recordings in 1955,1 nine more are part of the Helene Stratman-Thomas collection,2 and the Wisconsin Historical Society has 199 interviews recorded by Malcolm Rosholt in four counties, 1953-1971.3

Map 2. Settlement patterns into Wisconsin between 1830 and 1920 (Ostergren 1997:138). By courtesy of the University of Wisconsin Press.

1 http://digital.library.wisc.edu/1711.dl/wiarchives.uw-whs-mss00332
2 http://digital.library.wisc.edu/1711.dl/wiarchives.uw-mus-mus001
3 http://digital.library.wisc.edu/1711.dl/wiarchives.uw-whs-audi00842a

Table 1. Upper Midwestern datasets. “Transcripts” refers to manual transcriptions of lexical items, not free conversation.

Source                                                     Material                        Year/s data collected
Linguistic Atlas of the North Central States (LANCS)       Transcripts only                1940, 1941
Linguistic Atlas of the Upper Midwest (LAUM)               Transcripts, Audio recordings   1947, 1948, 1956
Wisconsin English Language Survey (WELS)                   Audio recordings                1951, 1952, 1953, 1955
Dictionary of American Regional English (DARE)             Audio recordings                1965-1969

We draw first on Cassidy’s fieldnotes for Albert Marckwardt’s unpublished Linguistic Atlas of the North Central States. Field notebooks and preliminary results housed at DARE and the McDavid archive (Newberry Library) indicate vowel variability in the GOAT, CLOTH and TRAP vowels. These data are preserved solely in transcriptions, with only one data point for the Upper Peninsula, Sault Ste Marie, and the lexical maps lack Wisconsin data. Cassidy conducted fieldwork in WI, and his unpublished notebooks from 1940-1941 provide data missing on the mimeographed maps. 49 speakers were born 1850-1894, and one speaker in 1907. These may be the oldest coherent phonetic transcriptions for WI English.

A second dataset is Allen’s (1973) Linguistic Atlas of the Upper Midwest (LAUM), covering North Dakota, South Dakota, Nebraska, Iowa and Minnesota. The 34 recordings from MN recently became available as part of the Raven McDavid materials (Table 2). Like LANCS, LAUM contains mostly speakers born in the nineteenth century. We use only fieldnotes published in LAUM, with sample recordings to confirm vowel positions.


Table 2. Distribution of 277 UME speakers by birth date and gender.

Dataset  Region                        1850 to 1879  1880 to 1889  1900 to 1909  1919 to 1929  1930 to 1960  NA
                                       F    M        F    M        F    M        F    M        F    M
LANCS    Wisconsin                     11   31       3    4        --   1        --   --       --   --       --
LAUM     Minnesota                     8    17       12   14       2    8        --   --       --   --       1
WELS     Wisconsin                     3    4        10   4        8    8        6    5        2    --       1
DARE     Michigan’s Upper Peninsula    --   --       4    10       5    9        6    3        4    1        --
DARE     Minnesota                     --   --       2    7        2    2        3    3        2    --       --
DARE     Wisconsin                     --   --       13   9        6    4        9    5        2    1        2
Total                                  22   52       44   48       23   32       24   16       10   2        4

The last two datasets are the Dictionary of American Regional English (DARE, Cassidy and Hall 1985-2014) and the Wisconsin English Language Survey (WELS). Cassidy collected questionnaire data and made recordings for WELS that motivated the American Dialect Society to embark on a national dialect dictionary, leading to DARE (Cassidy 1948, Cassidy and Duckert 1953). Some speakers in these sets overlap in age with those in other datasets and extend the birth years of speakers into the twentieth century.

3 Monophthongization of GOAT

The American GOAT vowel is generally considered a relatively uniform long, back diphthong with an upglide, [ou], that is vertical or transverse in the vowel space (Kurath 1964, Hartman 1985, Labov, Ash and Boberg 2006: 12, Thomas 2001). Kurath (1964: 114-116) notes variation in the syllable position of /o/, occurring most frequently in open position and least in closed syllables with secondary stress (sailboat, notation, p. 114). Moreover, regionalisms occur most often in pre-rhotic position. Key is that UME GOAT is claimed to be more stable than elsewhere. Labov, Ash and Boberg (2006) describe Inland Upper North speakers as having higher, more back nuclei compared to the US as a whole. Thomas (2001: 28-32) found GOAT in the Inland North to resist fronting or lowering seen elsewhere. Thomas (2001) and Allen (1973) both suggest that GOAT monophthongization is connected with Scandinavian influence. If so, we expect a correlation between monophthongs and Scandinavian settlement. Ostergren (1988: 16-18, elsewhere) shows Swedish and Norwegian settlement heaviest in the east and southeast and the same for German settlement, which could also play a role, since German has long monophthongal /o:/ as well. Scandinavian settlement in Wisconsin was heaviest in the west and in Minnesota in the southeast.

Thomas (2001) includes one speaker from near the Twin Cities, Minnesota (Lino Lakes, b. 1940), whose GOAT vowel is described as consistently monophthongal. Coda environments are not controlled, however, and one might assume monophthongization across environments. For comparison, Thomas’ three Inland North speakers (near Cleveland, Ohio) exhibit rising diphthongs, with most vowel trajectories in the vowel space being transverse even if the nucleus of the vowel is somewhat centralized (Thomas’ Speakers 13-16, pp. 77-80). In studying syllable structure effects on boat, bode, beau, Purnell (2010) found northwestern Wisconsin speakers produced a horizontal trajectory in BEAU words, similar to eastern MN speakers and unlike the more transverse productions of southeastern WI speakers. One possibility is that the lower the offglide (i.e., the more horizontal the gesture), the more speakers perceive the vowel as monophthongal.

Monophthongized Minnesota /o/ was described in Allen (1973, volume 3, pp. 22-23) for words such as ago, coat and road. Allen notes that the main variants are [o] and two types of upgliding diphthongs, [oʊ] and [oo^] (“… indicates only the briefest and slightest tongue movement”, p. 23). We use this distinction in our examination of words in the GOAT class.

For Minnesota in LAUM, the monophthong appears 16% of the time in open position (BEAU class, e.g., ago) and 19% of the time when followed by a voiced coda consonant (BODE class, e.g., road), vs. 47% in closed words with a voiceless consonant (BOAT class, e.g., coat). The full diphthong occurs 51% of the time in the BEAU class and only 15% and 18% of the time for BODE and BOAT, respectively. The reduced offglide diphthong occurs most often with BODE (66%), while it appears in the BEAU and BOAT classes 33% and 35% of the time, respectively. BEAU, as an open syllable word class, allows the vowel to be fully stressed and diphthongal or partially diphthongal (84%). The voicelessness of the coda in BOAT appears to reduce the vowel duration of the syllable. Conversely, the longer BODE vowel is diphthongal 81% of the time. Nevertheless, for some speakers, e.g. Speaker 41 (Montevideo, Chippewa County, MN, shown in Figure 1), open syllables such as go are monophthongal.
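The percentages in this paragraph can be cross-checked with a few lines of Python. This is only an illustrative sketch and not part of the original analysis: the figures are those quoted above for Minnesota in LAUM, while the dictionary layout, the variant labels and the helper name `diphthongal_share` are our own assumptions.

```python
# Percentages quoted in the text for Minnesota (LAUM), by word class.
# The variant labels are hypothetical names chosen for this sketch.
LAUM_MN = {
    "BEAU": {"monophthong": 16, "reduced_offglide": 33, "full_diphthong": 51},
    "BODE": {"monophthong": 19, "reduced_offglide": 66, "full_diphthong": 15},
    "BOAT": {"monophthong": 47, "reduced_offglide": 35, "full_diphthong": 18},
}


def diphthongal_share(shares):
    """Fully plus partially diphthongal tokens, in percent."""
    return shares["reduced_offglide"] + shares["full_diphthong"]


for word_class, shares in LAUM_MN.items():
    # Each class should account for 100% of its tokens.
    total = sum(shares.values())
    print(word_class, "total:", total, "diphthongal:", diphthongal_share(shares))
```

The three variants sum to 100% in every class, and the combined diphthongal shares for BEAU and BODE match the 84% and 81% given in the text.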



Figure 1. Waveform and spectrogram of LAUM Speaker 41 (b. 1877, male, Montevideo, Chippewa County, MN) saying go with a monophthongal GOAT vowel.

Cassidy did not transcribe the speech of his Wisconsin LANCS speakers with the same nuances of slight upgliding and fronting as Allen did. Nevertheless, his notation does include offgliding (e.g., [kot], [kout], [koət] and [koət] variants for coat; Book 22.1). Comparing tokens monophthongized by open, closed-voiced and closed-voiceless syllables, monophthongization follows the same rank order as in Allen’s Minnesota data, though not the same percentages: it is least frequent in open syllables (40% of the time, ago), slightly more frequent before voiced consonants (46% in road, 54% in home), and most frequent in syllables closed by a voiceless consonant (64% in coat).

These percentages do not tell the entire story of UME words in the GOAT class. If the offglide is /o/ (Speaker 40, Washington, MN: know transcribed as [oo^]), the vowel trajectory might be fairly flat in the vowel space and near-monophthongal, whereas a /u/ or /ʊ/ offglide (Speaker 18, St. Louis, MN: know transcribed as [o<u<]) would represent the traditional, albeit fronted, raised offglide. Examining the spatial relation of the GOAT words (ago, coat, road, home, know) as transcribed by Allen for MN and Cassidy for WI we see a geographic split of the speakers who produce no [u] offglide or few (1 or 2) offglides of the five words. Map 3 displays the Upper Midwestern speakers and the number of GOAT words (N=5) with [u] offgliding.



Map 3. Geographic distribution of [u] offgliding in GOAT class words in the Upper Midwest. (Sources: LAUM and LANCS)

For MN, there are fairly discrete regions, with the northwestern MN area represented by primarily flat /o/ diphthongs, and the northeastern part of the state displaying more rising offglides. Also, for mixed offglides, mapping [u] or [ʊ] offglides yields an east-west pattern. The offgliding words in MN significantly co-vary in this mixed group with latitude (F(3,31)=5.04, p < 0.05) such that the lower (and according to the maps, more easterly) the speaker, the more the offglides are high offglides, and the further one goes to the center of MN, the more the offglides are [o].

The WI LANCS speakers display a similar pattern with a preference for monophthongization in the northern portion of the state. The [u] offglides for GOAT are found in the southern and southeastern portion of the state. In Figure 1, the distribution of speakers with monophthongal variants for the four target GOAT words shows, first, that monophthongization was present in WI, and second, that most speakers producing words with monophthongization were in the middle to upper portion of the state. A band of such speakers stretches from the middle of WI to the northwestern region of MN.

The geographic distribution of the GOAT vowel and ostensible UME monophthongization in the 1950s informs modern patterns. Moreover, while the distribution addresses a portion of the anecdotal claim that /o/ defines both speakers from Minneapolis and Milwaukee, more work is needed in southeastern WI to see if contextual variants are more monophthongal (e.g., stressed open syllable words such as sentence-initial so and no). Monophthongization was present in both MN and WI for LAUM and LANCS, respectively. It coexisted in some areas with [u] diphthongization, particularly around the Twin Cities region in MN and western WI. Some areas are distinct: for example, portions of MN closest to Lake Superior in the northeast had only [u] diphthongization while the northwest had no [u] diphthongization. The geographical variation in offgliding (or rather, monophthongization) in both states does not necessarily coincide with patterns of Scandinavian immigration in either state. While the 1890 settlement pattern for Norwegians, Swedes and Germans in the Upper Midwest was highly segregated (Ostergren 1988: 15), settlement by Scandinavians in Ostergren’s maps of the 1890 census (Fig. 1.5 and 1.6, 1988:17-18) suggests that the settlement influence on monophthongization would have to be, at best, a pan-Scandinavian one. However, the area of monophthongization also covers a region of strong German immigration (Ostergren’s Fig. 1.4, 1988:16). For example, there is less Scandinavian than German presence in the Green Bay, WI, area, which our data shows favored monophthongization (Figure 1, above). The mismatch of settlement with monophthongization suggests that the pattern is probably not a direct substratal inheritance. Rather, it raises the possibility of a reallocation (Kerswill and Trudgill 2005) of an ethno-linguistic pattern to reflect a broader regional pattern that includes both MN and WI. It is doubtful that speakers adopted forms in avoidance of Yankee or German settlers. Hence, we should consider this an example of a centripetal force where speakers adopt a coalesced local pronunciation in perceived space (Britain 2002, Preston 2002). Such centripetal action effectively increases, rather than decreases, the vernacularity of spoken forms in the region, as evidenced by the markedness of monophthongization relative to Standard American English (Allen 1973-1976, Thomas 2001). Considerable work is required to support this contention, however.
4 Merger of LOT and THOUGHT Vowels

Labov (1991) sees the LOT and THOUGHT vowels (the CAUGHT-COT merger or Low Back Merger) as critical for distinguishing American dialects. In UME, the merger is claimed to be creeping eastward (Benson, Fox and Balkman 2011) so that contemporary lowering of THOUGHT occurs without LOT fronting. Labov, Ash and Boberg (2006:64, Map 9.4) place two boundaries through MN. The first, from Labov’s 1966 telephone survey, separates WI from MN as well as the southeastern tip of MN from the rest of MN. Those to the east do not merge while those to the west do. The second isogloss shows the ANAE boundary; speakers to the north merge while those to the south do not. This directional pattern is seen as Minnesota influence on western and central Wisconsin (Benson, Fox and Balkman 2011: 301-303). We have no a priori reason to doubt any of these descriptions, but seek to understand how much has changed since LAUM and LANCS. Although merger is spreading into Wisconsin, we suggest that the region has, since the mid 1800s, maintained a ‘stable mess’ with a range of variants, rather than becoming a merged region. The variants appear in the same location and some merged variants appear in eastern MN. In short, the Upper Midwest was and remains a ‘both’ region.4

4 This claim is both challenged and reinforced by ongoing work among late “tweens” and early teens by Matt Bauer in the Minneapolis area where some students increase their degree of merger over a three-year period (Bauer, p.c.).



LAUM provides strong evidence that the merger was present but stable. Allen (1973: 20) sees it as really a low central vowel with extensive front-back variation: “[t]he variety of phonic types that may be subsumed under the rubric /a/ and then of /ɔ/ probably provides opportunity for more controversy than does any other part of the American phonetic inventory” (p. 20). Minnesota (Allen 1973: 20) has [a] for /a/ (e.g., in father). Allen (1973: 23) observes the covariation of dialect mixture (Midlands with Northern speakers) and shift of THOUGHT to LOT. However, locations can have both variants (Allen 1973: 24), suggesting that the change was underway but not completed in the 1950s. Today, this same situation is present in Wisconsin (Benson, Fox and Balkman 2011).

There are two usual assumptions here. First, the merger of THOUGHT and LOT has historically divided the region into a merged western and unmerged eastern region. Historically, the Mississippi River was seen as a barrier to the spread of the merger from MN into WI. With suburbanization of western Wisconsin coming from Minneapolis and Saint Paul, Minnesota, it is unsurprising that the change has spread eastward even into areas where TRAP and BATH raise, particularly urban centers within an hour’s drive from the border (e.g., Eau Claire; Benson, Fox and Balkman 2011: 301-303). Second, because the merger is established in the west and now appears to be moving eastward, some assume the same west-east pattern for Minnesota. A remapping of Allen’s (1973) data shows that 1950s speakers who are on the non-WI bordered periphery of the state (Lake Superior, North Dakota, Iowa) display less merging than those closer to the border. Allen notes:

The weakly rounded low-back [ɒ], which is common in central New York and in much of Pennsylvania, consistently appears also in both Northern and Midland speech areas of the UM. … Although this regional weighting in the widespread distribution of both [ɔ] and [ɒ] reflects the general situation in the eastern states, the fact that both vowels often appear in the same locality reflects the intermingling of settlers with antecedents in distinct [ɔ] and [ɒ] areas in the East. (1973: 24)

Map 4 shows the locations of the main variants in the words frost, daughter and law. Law is the most THOUGHT-like with the most downward arrows ([ɔ˕]) and black dots ([a]), while daughter is the most LOT-like with fewest squares ([ɔ]) plotted. Yet speakers producing pure THOUGHT vowels in law are geographically peripheral (north by Lake Superior, west and southeast near Iowa) and the modified THOUGHT speakers form a northeasterly band from the southwestern corner towards Minneapolis. More importantly, the pure LOT vowels, that are raised or have some other nuance, and the lowered THOUGHT vowels (presumably those THOUGHT vowels tending to be LOT vowels) are more distributed around MN with the pure LOT vowels closer to the Wisconsin border and the modified (raised) LOT vowels appearing, with one exception, in the eastern two-thirds of the state. An analysis of variance reveals no significant differences for birth date for any of these words by the THOUGHT vowel.


Map 4. Distribution of the THOUGHT vowel in frost (A), daughter (B) and law (C) from LAUM where speakers produced a non-lowered variant of /ɔ/ (gray squares), lowered /ɔ/ (downward open triangle) or a non-raised variant of /a/ (black circle).

Cassidy’s unpublished LANCS data reflect similar effects and fill out the picture (Map 5). Allen found the canonical /ɔ/ vowel more frequently in law. Likewise, in Wisconsin law has fewer lowered variants. A pocket of speakers, clustered around Milwaukee, produces a lowered variant, perhaps due to NCS lowering. When we look at frost in WI, a number of speakers produce the /ɔ/ vowel. However, the majority of lowered /ɔ/ variants occur as a cluster in the southern part of the state. Also, in many locations, two speakers use different variants.

As with GOAT variation, the archival THOUGHT data reveals unexpected patterns, namely that variants co-exist simultaneously. Moreover, the geographic distribution does not support the historical west-to-east change. Future research needs to examine whether the merger is in a holding pattern or undergoing change.

5 Breaking in the TRAP Vowel

The raising of the TRAP vowel, /æ/, has occupied much discussion about cities in the northern US, particularly for speakers born in and after the 1950s. While /æ/ raising alone certainly does not signal the full-blown Northern Cities vowel raising effect, we should expect it to be the first historical step and prevalent even in locations that do not have the full complement of NCS changes. There is a striking pattern among NCS dialects that is relevant in UME: TRAP raising occurs most before coronals and least before velars, the ‘DG’ pattern discussed in Labov, Ash and Boberg (2006). Prevelar raising of BAG (as /bejg/ or /bɛg/; Zeller 1997; Bauer and Parker 2008; Purnell 2008) provides the opposite pattern, favoring raising before velars over coronals, hence the ‘GD’ pattern. The latter appears in both states in the Upper Midwest and its relationship to the DG pattern has been unclear.

Map 5. Distribution of the THOUGHT vowel in frost (A), daughter (B) and law (C) from LANCS where speakers produced a non-lowered variant of /ɔ/ (gray squares), lowered /ɔ/ (downward open triangle), raised /a/ (upwards black triangle) or a non-raised variant of /a/ (black circle).



Historical recordings and transcriptions sharpen our picture of TRAP raising of the Northern Cities Shift (DG) and Upper Midwestern (GD) types. Our data show, first, that DG effects are apparent far earlier than usually thought in southeastern Wisconsin. Second, a raised pre-velar pattern (BAG class) emerges after a raised pre-apical pattern (BAD class), and the distributional patterns often vary by gender (Labov 2001). Finally, our evidence suggests that GD patterns reflect a change in conditioning, with a source in the DG pattern.

Labov, Ash and Boberg (2006: 181-183) give GD data (showing greater pre-velar than pre-apical raising) from a broad area, including much of Canada, the city of Erie, Pennsylvania, as well as much of Wisconsin and one speaker from Chicago. They report GD patterns particularly for Minnesota and Wisconsin, and, following Zeller 1997, treat this as a merger, where the vowel in bag rhymes with the /ej/ in vague. Pre-velar raising, exemplified by bag, is an emerging stereotype of Wisconsin. This general GD pattern is striking since post-vocalic velars in the NCS generally are claimed to be an inhibiting environment, see Labov, Ash and Boberg (2006: 183):

The reversal of the phonetic effects of /d/ and /g/ on the raising of /æ/ is unusual and unexpected, since most environmental effects are the products of the action of a uniform articulatory apparatus and operate in the same way across dialects.

Table 3 summarizes some reported patterns of TRAP raising by following consonants. Examples of DARE speakers’ partial Bark-Difference vowel spaces show that for the General American English (GAE) speaker (Figure 2) the apical and velar tokens are in the same general area of the vowel space. For the DG speaker (Figure 3), apical tokens are higher in the vowel space than velar tokens. Lastly, the GD speaker (Figure 4) presents velar tokens that begin separating from apical tokens by increasing their peripherality. All three speakers are from southeastern Wisconsin.

Table 3. Order of selected conditioning effects of following consonants on /æ/ raising.

Conditioning                 Most                    Least
GAE                          n          d, ɡ
DG – NCS generally           n          d          ɡ
DG – Small-town Michigan     θ, ð, l    n          d          ɡ
GD – Wisconsin               n          ɡ          d


Figure 2. Vowel plot for General American English (GAE) speaker, female, b. 1890, Menomonee Falls, WI.

Figure 3. Vowel plot for speaker who produces BAD-vowels over BAG-vowels (DG), male, b. 1920, Greenfield, WI.

Figure 4. Vowel plot for speaker who produces BAG-vowels over BAD-vowels (GD), female, b. 1923, Menomonee Falls, WI.



Much discussion of short-a tensing focuses on ‘leading environments’, segmental contexts showing most advanced change. Given apparently dramatic differences in phonetic conditioning of a sound change in progress, then, our question is this: Are the differences in environmental conditioning an old pattern, going back to the beginnings of /æ/ raising, or was the original phonetic conditioning the same across regions, with differences emerging later? Evidence from LANCS, LAUM, WELS and DARE points to changes that can be understood as precursors to subsequent DG and GD patterns. If a DG pattern is found among older and younger speakers, we have evidence that Wisconsin is participating in the NCS. If we find the GD pattern earliest, it could vitiate the case for Wisconsin’s inclusion in NCS. By contrast, should we find evidence for an early DG pattern later replaced by a GD pattern, that would represent prima facie evidence that southeastern Wisconsin once followed an NCS pattern, but deviated from it, perhaps as part of the phonological reinterpretation of the shift.

We begin with LAUM and LANCS data to see how much prevelar raising was transcribed in the 1940s for the generations preceding the assumed onset of short-a tensing. Table 4 shows that, for /æ/ variants in LANCS and LAUM, WI speakers who raise prefer to do so before fortis fricatives (half, ashes, glass) vs. a MN preference for /æ/ raising before lenis velar plosives (bag). The geographic distribution is important because six of the seven raising variants in MN are in the Duluth and Iron Range area (Map 6). Raising-friendly tokens in WI are observed in a diagonal band (Wisconsin River-Fox River direction, southwest to northeast) for /æ/ before a velar nasal, primarily in the word drank, but before the voiced velar stop in the contemporary shibboleth bag, little to no raising is transcribed in LANCS.



Table 4. Comparison of TRAP variants in UME. Light shading identifies WI preference for raising before fortis fricatives; dark shading identifies MN preference for raising before lenis velar plosives.

half ashes glass bag
MN WI MN WI MN WI MN WI
Ø 1 2 2 1
Non-raised variants
[a] 1 1 1
[a˔] 1 1
[æ] 48 41 48 35 12 44 8 32
[ææ ],[ææ˔] 8 3 1 42 47 7
[æ˕],[æ ] 2 3 6 2 2 1 5
Raised variants
[æ˔] 4 4 3 9 2 5 5 3
[ɛ] 1
[ɛə],[ɛɪ] 1 1
[ɛ˕] 1

Now we turn to audio recordings in WELS and DARE. For WELS, 20 of 55 speakers only read the Arthur passage and this changed slightly between WELS and DARE. Most important, Arthur lacks the voiced velar environment, and some conversations lack examples of the target vowel in this environment (e.g. MN001). In short, we make the best of bad Arthur data by examining the DARE data from southeastern WI. Fifteen DARE recordings were used, all from 1968. Speakers’ ages at the time of recording ranged from 35 to 80, with birth years 1888-1933, as in Table 6.


Map 6. Distribution of the TRAP vowel from LAUM and LANCS where speakers produced at least one raised variant of /æ/ before anterior consonants (labial or alveolar; black circles), before a velar (downward empty triangle) or before all consonant places of articulation (black triangle).

Table 5. Words from Arthur used in the present study. Some words appear more than once (e.g., back, that, etc.).

Following Consonant
Labials:  perhaps, happened, half
Apicals:  that, asked, last
Velars:   back, exactly
Nasals:   answer, aunt, answered, can’t

To understand the phonetic basis of /æ/ shifting and compare results with Northern Cities regions (e.g., Thomas 2006), tokens were taken from the “Arthur the Rat” reading passage (Table 5). Words were separated into those with /æ/ before labials, apicals, velars and nasals. Since the passage only contains voiceless coda consonants following /æ/, in some cases, additional vowels were measured from the spoken passages for discussion only. Goldstein (1976) showed that formant trajectories are often more informative to listeners than a single point of measurement in a vowel. This is particularly important for /æ/ because it is often diphthongized in Wisconsin (e.g., /æ/ > [ej], [ɛa], [æɛ], [æ a]). Thus, two measurements were taken for each token.

Overall, females from larger cities (Janesville) or areas farther southeast (Burlington) are expected to display more and earlier shift compared to males from northern (Manitowoc) or rural areas (Hustisford). First, geographic proximity to urban centers is important for NCS (Callary 1975). Second, women tend to pick up innovations faster than men (Labov 1994). Finally, the western boundary of the shift cuts roughly through the middle of southern Wisconsin, leading us to expect that the shift spread into the state from the south and east.

Evaluation of raising was made first by observing, on a speaker-by-speaker basis, the relation of the values for members of a coda group (labials, apicals, velars) to the nasals. We considered primarily the location of the first vowel quality, and secondarily the direction and length of the separation between the first and second measures.

Subjects were divided into three groups based on the relation of /æ/ to a following nasal as compared to the vowel before the other classes of sounds. Vowels in the first group of subjects, the General American English (GAE) group, show pre-nasal /æ/ higher and more front than other groups, pre-velar /æ/ not as high as nasals but back in the mouth, and pre-apical /æ/ back and lower than other environments. The second group represents a general raising of /æ/ before apicals (DG), argued to be the dominant NCS pattern as noted. The final group consists of subjects whose pattern displays either a greater forward or upward movement of the velars. The pre-velar vowel qualities should fall above or in front of the pre-nasal vowels. A strongly shifting speaker can belong to both the DG and GD groups if both their pre-velar and pre-apical vowels are shifting. Additionally, to be considered as falling into one of the shifting groups, one need only display a pattern of some subgroup raising or fronting. For example, if a speaker raised the apicals in prominent positions and kept the unstressed vowels low and more central in the mouth, they would be considered part of the DG group provided the split was fairly systematic. Finally, inclusion of vowel fronting with raising addresses the issue of peripherality in a vowel space.
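The grouping logic described above can be approximated in a short Python sketch. It is a deliberately simplified illustration, not the procedure actually used: the speaker-by-speaker evaluation also weighed trajectories and prosodic prominence, which are ignored here. The names `Onset` and `classify`, the use of a single mean onset per environment, and the DG criterion (pre-apical onset higher than pre-velar onset, following the description of the DG speaker in Figure 3) are all assumptions made for the sketch. The formant values in the example are invented for illustration and are not measurements from the DARE recordings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Onset:
    """Mean first measurement of /ae/ in one environment (any consistent unit)."""
    f1: float  # lower F1 = higher vowel
    f2: float  # higher F2 = fronter vowel


def higher(a: Onset, b: Onset) -> bool:
    return a.f1 < b.f1


def fronter(a: Onset, b: Onset) -> bool:
    return a.f2 > b.f2


def classify(nasal: Onset, apical: Onset, velar: Onset) -> set:
    """Return the set of patterns a speaker shows: {'GAE'} or a subset of {'DG', 'GD'}."""
    patterns = set()
    # DG (assumed criterion): pre-apical onset is higher than pre-velar onset.
    if higher(apical, velar):
        patterns.add("DG")
    # GD: pre-velar onset falls above or in front of the pre-nasal onset.
    if higher(velar, nasal) or fronter(velar, nasal):
        patterns.add("GD")
    # No shifting subgroup detected: treat as General American English.
    return patterns or {"GAE"}


# Hypothetical values in Hz, invented for illustration only.
nasal = Onset(f1=600, f2=2100)
print(classify(nasal, Onset(750, 1800), Onset(700, 1850)))  # GAE-like speaker
print(classify(nasal, Onset(650, 1950), Onset(720, 1850)))  # DG-like speaker
print(classify(nasal, Onset(700, 1900), Onset(580, 2150)))  # GD-like speaker
```

Returning a set rather than a single label reflects the point made above that a strongly shifting speaker can belong to both the DG and GD groups.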



Table 6. DARE subjects from southeastern Wisconsin used in the present study. Subjects are listed by general acoustic pattern.

DARE Subject Number  Observed Pattern  Gender  Birth Year  Age At Recording In 1968  Urban, Rural/Region  Dialect City
WI022                GAE               M       1888        80                        sc/SW                Janesville
WI049                GAE               F       1890        78                        sc/CE                Menomonee Falls
WI062                DG                F       1890        78                        v/SE                 Burlington
WI019                GAE               M       1892        76                        sc/SW                Janesville
WI021                GAE               F       1892        76                        sc/SW                Janesville
WI010                GAE               M       1898        70                        r/NW                 Hustisford
WI020                GAE               F       1908        60                        sc/SW                Janesville
WI011                GAE               F       1908        60                        r/NW                 Hustisford
WI018                GD                F       1914        54                        v/CW                 Jefferson
WI071                GD                F       1916        52                        sc/NE                Manitowoc
WI017                DG                M       1918        50                        v/CW                 Jefferson
WI048                DG                M       1920        48                        lc/CE                Cudahy
WI050                GD                F       1923        45                        sc/CE                Menomonee Falls
WI013                GD                F       1926        42                        v/CW                 Jefferson
WI047                GD                F       1933        35                        lc/CE                Milwaukee

NB Codes for location size based on population figures accompanying the DARE recordings (lc=large city, sc=small city, v=village, r=rural). Regions include S=south, C=central, N=north and E=east and W=west for relative arrangement of the locations.

As noted, speakers were assigned to groups based on several characteristics. That grouping is shown in the ‘Observed Pattern’ column in Table 6. First, there appears to be a difference between speakers 60 or older compared to younger ones. Second, among the older speakers, only the speaker in the southernmost location (WI062) appeared to be shifting /æ/. This pattern resembles the Northern Cities DG pattern. Moreover, both males under 60 show the more conservative DG pattern, rather than the GD pattern seen in the younger females. Above we asked whether differences across consonant environments that co-vary with vowel raising go back to the beginnings of TRAP raising, or whether the original phonetic conditioning was the same across these regions, with differences emerging later. It appears that the DG pattern entered WI initially and then the GD pattern emerged later. We see this among older speakers displaying some apical raising, particularly the speaker from Burlington (WI062). The GAE speakers’ use of prosodically-influenced raising and diphthongal realizations along with the coarticulation of vowels with following nasal and velar consonants facilitated the adoption of the DG system. That pre-apical raising was prevalent in southeastern WI prior to the WELS recordings is seen in LANCS data (Map 6).
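The age and gender breakdown just described can be tallied directly from Table 6. The following Python sketch is offered only as a convenience for checking the counts; the subject codes, patterns, genders and birth years are transcribed from Table 6, while the variable names and the `age_group` helper (with its cut-off at age 60, taken from the discussion above) are our own.

```python
from collections import Counter

RECORDING_YEAR = 1968

# (subject, observed pattern, gender, birth year), transcribed from Table 6.
SUBJECTS = [
    ("WI022", "GAE", "M", 1888), ("WI049", "GAE", "F", 1890),
    ("WI062", "DG", "F", 1890), ("WI019", "GAE", "M", 1892),
    ("WI021", "GAE", "F", 1892), ("WI010", "GAE", "M", 1898),
    ("WI020", "GAE", "F", 1908), ("WI011", "GAE", "F", 1908),
    ("WI018", "GD", "F", 1914), ("WI071", "GD", "F", 1916),
    ("WI017", "DG", "M", 1918), ("WI048", "DG", "M", 1920),
    ("WI050", "GD", "F", 1923), ("WI013", "GD", "F", 1926),
    ("WI047", "GD", "F", 1933),
]


def age_group(birth_year):
    """Split speakers at age 60 at the time of recording."""
    return "60+" if RECORDING_YEAR - birth_year >= 60 else "under 60"


# Count speakers per (age group, gender, pattern) cell.
counts = Counter(
    (age_group(birth), gender, pattern)
    for _, pattern, gender, birth in SUBJECTS
)

for cell, n in sorted(counts.items()):
    print(cell, n)
```

Among the eight speakers aged 60 or older, seven are GAE and one (WI062) is DG; among the seven younger speakers, both males are DG and all five females are GD.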

Our analysis of vowel measurements and vowel trajectories has uncovered much about DG speakers. First, long before NCS is generally thought to have emerged, we find evidence for pre-apical raising. Indeed, the oldest DG speaker, the female closest to Chicago, was born in 1890, over a half-century before one expects to find DG patterns. Her gender and geographical origins make her well positioned to show early NCS effects. In contrast, the other two DG speakers are males born about 30 years later. This suggests that the DG pattern typical of NCS is older than GD in Wisconsin. Real-time evidence, then, indicates that southeastern Wisconsin shows DG patterns earlier than they are reported for Chicago (although see Gordon and Strelluf, this volume).

One might ask whether the GD system came about from the DG system via a DG raising rule and reorientation toward the articulatory bias, the incompatibility of a low front vowel followed by a dorsal stop. Diphthongal pronunciation could be a vehicle for this. The most striking GD speaker was WI018 who had only 8.3% monophthongal tokens. Thus, WI018 is very similar to the innovator or early adopter WI062. The monophthongs highlight that vowels can be short and low (GAE) or short and raised (especially GD speakers). Even with additional tokens from free conversation, all tokens with short distances are pre-apical vowels for WI018 as are six of the eight tokens for WI047 (a GD speaker almost a generation younger than WI018). Of the two remaining vowels produced by WI047 with short trajectories, one is the shibboleth bag [bɛɡ].

6 Conclusion and implications

This chapter aims to advance understanding of the GOAT, LOT, THOUGHT and TRAP vowels in Upper Midwestern English, drawing on transcriptions and recordings of UME from archival sources. These ‘new’ resources buttress some recent views, while complicating the picture of these UME features. First, monophthongal GOAT is geographically distributed from early on, but not in ways suggestive of Scandinavian (and/or German) substrate effects. Instead, it appeared across the region and has consolidated its position to become a marker of UME, and a stereotype in Minnesota. Second, the data point toward the development of LOT/THOUGHT merger early on and across the region broadly, contrary to a widely assumed west-to-east spread. Third, TRAP raising appears far earlier in Wisconsin than earlier work has found, first with a pattern of raising preferably before apicals and later flipping to the pre-velar raising pattern that is now a stereotype of Wisconsin speech in particular. Most importantly, archival data shows widespread variation from early on, contrary to assumptions about linear geographical spread.
Upper Midwestern English is emerging less by areal diffusion of features than by consolidation of particular patterns introduced as variants during settlement, akin to ‘new dialect formation’ (cf. Salmons and Purnell 2010). The question now is how past UME variation will shape these still-emerging geographical patterns, structurally and socially.

Acknowledgements

We thank Raymond Hickey for the invitation to contribute to this volume, and Joan Houston Hall, Greg Iverson, Monica Macaulay, Luanne von Schneidemesser and Erik Thomas for discussions and comments on earlier versions. Alison Hinderliter at the Newberry Library was especially helpful in making Harold Allen’s tapes available. The usual disclaimers apply.

References

Allen, Harold B. 1973-1976. The Linguistic Atlas of the Upper Midwest. 3 vols. Minneapolis: University of Minnesota Press.

Bauer, Matthew and Frank Parker 2008. /æ/-raising in Wisconsin English. American Speech 83.4: 403-431.

Benson, Erica J., Fox, Michael J., and Balkman, Jared 2011. The bag that Scott bought: The low vowels in northwest Wisconsin. American Speech 86.3: 271-311.

Britain, David 2002. Space and Spatial Diffusion. In: J.K. Chambers, Peter Trudgill and Natalie Schilling-Estes (eds) The Handbook of Language Variation and Change, Vol. 1. Oxford: Blackwell, pp. 603-637.

Callary, Robert E. 1975. Phonological Change and the Development of an Urban Dialect in Illinois. Language in Society 4:155-169.

Carver, Craig 1987. American Regional Dialects: A Word Geography. Ann Arbor: University of Michigan Press.

Cassidy, Frederic G. 1948. On collecting American dialect. American Speech 23.3/4: 185-93.

Cassidy, Frederic G., and Audrey Duckert. 1953. A Method for Collecting Dialect. Publications of the American Dialect Society, 20.

Cassidy, Frederic G., and Joan Houston Hall 1985-2012. The Dictionary of American Regional English. Cambridge, MA: Harvard University Press.

Goldstein, Ursula G. 1976. Speaker-identifying features based on formant tracks. Journal of the Acoustical Society of America 59: 176-182.

Gordon, Matthew J. 2000. Phonological correlates of ethnic diversity: Evidence of divergence? American Speech 75.2: 115-136.

Gordon, Matthew J. 2004. New York, Philadelphia, and other northern cities: Phonology. In: Bernd Kortmann and Edgar W. Schneider (eds) A Handbook of Varieties of English, Volume 1. Berlin: Mouton de Gruyter, pp. 282-299.

Hartman, James 1985. Guide to pronunciation. In: Frederic Cassidy (ed.) Dictionary of American Regional English, Volume 1, xli-lxi.

Jacewicz, Ewa, Joseph Salmons, and Robert Fox 2006. Prosodic prominence effects on vowels in chain shifts. Language Variation and Change 18.3: 285-316.

Jacewicz, Ewa, Robert Allen Fox, and Samantha Lyle 2009. Variation in stop consonant voicing in two regional varieties of American English. Journal of the International Phonetic Association 39.3: 313-334.

Kerswill, Paul, and Peter Trudgill 2005. The birth of new dialects. In: Peter Auer, Frans Hinskens, and Paul Kerswill (eds) Dialect Change: Convergence and Divergence in European Languages. Cambridge: Cambridge University Press, pp. 196-220.

Kurath, Hans 1964. A Phonology and Prosody of Modern English. Ann Arbor: University of Michigan Press.



Labov, William 1966. The Social Stratification of English in New York City. Washington, DC: Center for Applied Linguistics.

Labov, William 1991. The three dialects of English. In: Penelope Eckert (ed.) New Ways of Analyzing Sound Change. New York: Academic Press, pp. 1-44.

Labov, William 1994. Principles of Linguistic Change. Vol. 1: Internal factors. Oxford: Blackwell.

Labov, William 2001. Principles of Linguistic Change. Vol. 2: Social factors. Oxford: Blackwell.

Labov, William, Sharon Ash, and Charles Boberg 2006. The Atlas of North American English: Phonetics, phonology and sound change. Berlin and New York: Mouton de Gruyter.

Litty, Samantha 2014. Stop. Hey. What’s that sound? Initial VOT in Wisconsin German and English. Paper presented at the twentieth Germanic Linguistics Annual Conference, Purdue University.

Ostergren, Robert C. 1988. A Community Transplanted: The trans-Atlantic experience of a Swedish immigrant settlement in the Upper Middle West, 1835-1915. Uppsala: Acta Universitatis Upsaliensis.

Ostergren, Robert C. 1997. The Euro-American settlement of Wisconsin, 1830-1920. In: Robert C. Ostergren and Thomas R. Vale (eds) Wisconsin Land and Life. Madison: University of Wisconsin Press, pp. 137-162.

Preston, Dennis 2002. Perceptual dialectology: Aims, methods, findings. In: Jan Berns and Jaap van Marle (eds) Present-day Dialectology: Problems and Findings. Berlin: Mouton, pp. 57-104.

Purnell, Thomas 2008. Pre-velar raising and phonetic conditioning: Role of labial and anterior tongue gestures. American Speech 83.4: 373-402.

Purnell, Thomas 2010. Upper Midwestern [o]: Differences between open- and closed-syllable word classes. Presentation at the American Dialect Society Annual Meeting, Baltimore, January 7-9.

Purnell, Tom, Eric Raimy, and Joseph Salmons 2009. Defining dialect, perceiving dialect and new dialect formation: Sarah Palin’s speech. Journal of English Linguistics 37.4: 331-355.

Purnell, Tom, Eric Raimy, and Joseph Salmons (eds) 2013. Wisconsin Talk: Linguistic diversity in the Badger State. Madison, WI: University of Wisconsin Press.

Purnell, Thomas C., Joseph C. Salmons and Dilara Tepeli 2005a. German substrate effects in Wisconsin English: Evidence for final fortition. American Speech 80:135-164.

Purnell, Thomas C., Joseph C. Salmons, Dilara Tepeli and Jennifer Mercer 2005b. Structured heterogeneity and change in laryngeal phonetics: Upper Midwestern final obstruents. Journal of English Linguistics 33: 307-338.

Remlinger, Kathryn, Luanne von Schneidemesser and Joseph Salmons 2009. Revised Perceptions: Changing dialect awareness in Wisconsin and the Upper Peninsula. American Speech 84: 177-191.

Rogers, Blake 2014. Energy Acoustic Measurements in the Stop Voicing Contrast. PhD Dissertation, University of Wisconsin – Madison.

Salmons, Joseph and Thomas Purnell 2010. Contact and the development of American English. In: Raymond Hickey (ed.) Handbook of Language Contact, 455-477. Oxford: Blackwell.


Thomas, Erik R. 2001. An acoustic analysis of vowel variation in New World English. Durham, NC: Duke University Press.

Thomas, Erik R. 2006. Evidence from Ohio on the evolution of /æ/. In: Thomas E. Murray and Beth Lee Simon (eds) Language Variation and Change in the American Midland. Amsterdam and Philadelphia: John Benjamins, pp. 69-89.

von Schneidemesser, Luanne 2013. Words used in Wisconsin. In: Thomas Purnell, Eric Raimy, and Joseph Salmons (eds) Wisconsin Talk: Linguistic diversity in the Badger State. Madison, WI: University of Wisconsin Press, pp. 68-81.

Zeller, Christine 1997. The investigation of a sound change in progress: /æ/ to /e/ in Midwestern American English. Journal of English Linguistics 25: 142-155.

13 Western United States

Valerie Fridland and Tyler Kendall

1 Introduction

While a number of studies (e.g. Feagin 1986, Fridland 2001, Gordon 2002, Irons 2007, Labov 1966, 1980, Thomas 2001) have treated the substantial variation within Northern varieties and Southern varieties, traditional as well as modern dialectological accounts of the pronunciation of English in the U.S. have generally considered the Western portion of the U.S. as a single dialect region, sometimes – as in the case of the “third dialect” (Labov 1991) – even extending this “Western dialect region” to cover as much as two-thirds of the geographic space of the continental U.S. One reason English in the Western U.S. has been lumped into one large undifferentiated mass may simply be the general paucity of modern studies. In comparison to work on the Eastern U.S., only a relative handful of studies have examined the phonetics and phonology of Western speakers in detail (e.g. Luthin 1987, Di Paolo and Faber 1990, Labov, Ash, and Boberg 2006, Fridland and Kendall 2012), and there has not been much investigation into the purported vowel shift occurring within the Western region, often referred to as the (Northern) California Vowel Shift (Eckert n.d.), or into the degree to which different settlement and migration patterns within the West may contribute to differences and similarities in shift realization. Our lack of knowledge about this region also stems in part from the relative paucity, for the more recently settled West, of the linguistic atlas data and early sources (such as Civil War Veteran Questionnaires) that helped document early language change in the North and South.

This chapter originates from interest in two related questions: when a Western koiné might have formed and how variable the inputs to this koiné were. We attempt to shed light on these questions by conducting an acoustic analysis of the vowel systems of speakers contained in archival recordings housed in several Western archives.
For the sake of the present chapter, we limit our focus to speakers in Northern California (CA) and Nevada (NV). Comparing our archival speakers to more recent recordings from speakers from these same places, we examine the degree to which Western speakers born in the latter half of the nineteenth century (recorded in the mid twentieth century) anticipate the vowel systems found in current research (such as Eckert n.d., Labov et al. 2006, Thomas 2011: 148, Fridland and Kendall 2012). We also consider the individual variability and evidence of other regional patterns further East that are found in these data.

2 California and Nevada settlement history

Much of our early sociolinguistic information about the areas in which our speakers lived – Northern California and Nevada – comes from Elizabeth Bright’s (1967) dissertation on California and Nevada speech. This work, along with David DeCamp’s study of San Francisco speech (1953, 1959) and Carroll and David Reed’s work more generally in the West (1972), is one of the few sources on early speech in the region. What we do know is that the (White) settlement of the Western U.S. is both more recent and more diffuse (in terms of migratory origin) compared to that of the Eastern portions of the U.S.

Originally inhabited by a large indigenous Native American population, Northern California and Nevada were both part of the Spanish (1542-1821) and Mexican territories (1821-1846), with early European settlement occurring mainly in the late 1700s in Spanish missions and presidios. However, most Spanish and Mexican exploration remained on the coast of California, with little settlement inland (Bright 1967: 11-12). As discussed by Bright, California’s seizure by the U.S. in 1846 and its growing status as a gold-mining center caused its population to explode, with much settlement coming from Eastern states such as New York and Ohio as well as directly from Britain, Germany, China and other, mainly European, countries. Early settlement centered to a large extent on San Francisco, an area that quickly became a transportation and communication hub (Bright 1967, Reed and Reed 1972). And it is this Bay Area settlement, along with the related settlement of Nevada, which we will focus on primarily in this chapter.

According to Bright (1967: 15, 224), the discovery of gold was highly significant to the development of California and Nevada speech. Unlike other patterns of settlement in the U.S. where cohesive groups settled in different areas and some aspects of the initiating groups’ speech persisted for a long time (Wolfram and Schilling-Estes 2006: 29-30), the discovery of gold in California prompted rapid migration of people from all over the U.S. (and other nations) to converge and then migrate outward to other areas (Bright 1967: 57). Thus, the persistence of dialect traits that we see in the Eastern U.S. is not replicated on the West Coast, since centers like San Francisco served more as corralling stations for other points in the West. Other aspects, such as the change in government from Spanish to Mexican and, finally, to American, also influenced the rate and type of development of local speech (such as widespread Spanish borrowings), but the effects of the gold rush era appear to be most critical in laying the foundation for more modern speech (Bright 1967: 226).

As David DeCamp writes in his early study of San Francisco speech (1953, 1959), California’s history before 1900 was very much tied to the history of San Francisco. In addition to migration from other U.S. sites, over a third of the inhabitants of the Bay Area were foreign born, primarily from Ireland and Germany, followed in later years by increasing Italian immigration. Many European immigrants came to California to escape unrest in their native lands, such as the 1848 revolution and general political turmoil in Germany and the Irish Potato Famine (1845-48, peaking in 1847-48) (DeCamp 1959: 383). As of 1890, Irish immigrants comprised the leading foreign nationality (see Hall-Lew ms for a treatment of the possible Irish influence in a part of San Francisco). Similarly, by 1900, German immigrants made up over 30% of the area’s foreign-born population. Though not as numerous, British, Italian and Chinese immigrants also had a sizable presence in San Francisco. Thus, in the late 1800s, our speakers would have found themselves predominantly surrounded by diverse dialects of English along with German, or, in some areas, with Chinese. In addition, the establishment of the transcontinental railroad in 1869, followed by expansion of the Southern Pacific and Santa Fe lines, increased the opportunity for inter-state migration from Eastern and Southern states, with over 70,000 new migrants arriving each year via railroad after its inception (Bright 1967: 21). Reed and Reed (1972) suggest that, overall, foreign settlers left little linguistic trace, an example being the relatively sparse linguistic influence despite heavy early German and Scandinavian settlement (p. 136). DeCamp too indicates that inter-state linguistic influence on the area was greater, as settlers from the East retained their original dialects, whereas English speakers from outside the United States adapted to more general forms of English in following generations. In addition, foreign-born speakers (children in particular) often acquired English and lost or minimized the use of their native languages (p. 31) or settled in ethnic enclave communities (p. 39). In contrast, internal migration from other states did not tend to create the same kind of cohesive communities or language loss experienced by the foreign-born, with economic status playing a greater role in district of settlement (p. 40). Relocated Eastern influence within San Francisco itself, however, may have had something of an ethnic enclave effect on dialect formation in urban neighborhoods, such as the heavily Irish (via New York) settlement of the Mission District. Ongoing work by Lauren Hall-Lew (ms) suggests that such neighborhoods did retain strong characteristics of Eastern settlement, resulting in lingering linguistic effects that echo New York speech characteristics.

Nevada’s population growth trailed that of California until the discovery of rich silver ore was made public in 1859, inspiring a population boom. Though intimately tied to the settlement of California, Northern Nevada’s settlement (beyond Native American and Spanish exploration) began a bit differently than that in California. Before the mining boom, most of Nevada’s European settlement came primarily from the Mormon expansion of the Utah territory in the mid 1800s. There was a large Mormon settlement in the town of Genoa (about 50 miles from present-day Reno and close to the California border) and the Mormons who settled Nevada were primarily involved in agriculture (Bright 1967: 22-23). The discovery of Nevada’s precious ore potential with the Comstock Lode in 1859 changed the nature and purpose of immigration into Nevada. Unlike the early gold rush era in California, mining in Nevada required more equipment and expertise and was, for that reason, typically run by companies from the Bay Area, an important communication and transportation center for Northern Nevada in the mid and late 1800s. As migration into Nevada shifted to reflect mining interests, so did the type of immigrants, with fewer Mormons and many more foreign-born and California-born migrants flocking into the area. Such migrants, however, were less interested in settling the area than in making money and did not establish the same long-term settlement pattern as they had in California (Bright 1967: 23). In addition, in an effort to separate Nevada from Mormon interests and the Utah territory, these non-Mormon settlers helped lead to Nevada’s establishment as a state in 1864. In 1860, about a third of Nevada’s population was foreign-born. European settlers in Nevada were mainly of German, Irish and English descent (Bright 1967: 23) and, like the Bay Area, Nevada also had a significant amount of Chinese settlement. 
In addition to mining, Chinese laborers were particularly important in the construction of the transcontinental railroad that made the area much more accessible. Beyond foreign immigration, quite a bit of inter-state migration came via both California and New York. More scattered migration from Illinois and Ohio also contributed a Midland influence (Reed and Reed 1972). Most of these settlers came for mining, though some cattle-ranching and agricultural interests spread from the Sacramento Valley (Reed and Reed 1972: 137).

In terms of what we know of early speech in the region, most linguistic atlas work has focused on lexical patterns (Bright 1967, Reed and Reed 1972), with some limited early exploration of the phonology of San Francisco and the Bay Area (DeCamp 1953, 1959). A large atlas project directed by David Reed was begun in 1952, and records of lexical term use and phonetic transcriptions of informants’ responses were made, though resulting analysis of these records has been limited. Beyond this work, not much early linguistic research was conducted in the California/Nevada area. The pattern of usage of lexical terms explored by Bright (1967) suggests that Northern California and Northern Nevada fall within the same dialect area, as isoglosses typically group the two areas, with only minor separations (pp. 120-136). This most likely reflects the patterns of settlement, with heavy transcontinental migration from East to West occurring over the Donner Pass into Northern CA and then a reversal of the migration back into Nevada after the discovery of the Comstock Lode. The primary lexical isoglosses found for both CA and NV instead reflect different migratory routes and settlement patterns in the Southern vs. Northern sections of each state, where rugged mountain geography made travel between the areas difficult (Bright 1967: 69, 121).

Reed and Reed (1972) suggest that Nevada shows more of a Midland influence (in lexical items) than California as a result of its later settlement and the flow of immigrants from California, many of whom came not just from Northern but also from Midland states at the time of the main migration flows into Nevada (p. 137). Thus, drawing on the migratory history of the area coupled with these earlier works showing traces of a Northern and Midland lexicon, we assume that the speech recordings collected from CA and NV for this project would be somewhat similar between Northern CA and Northern NV and would also reflect the predominantly Northern/Midland influences of migrants at the time of our speakers’ births.

3 Contemporary speech in the West

Since we are primarily interested in tracing the genesis of the contemporary features that characterize the CA and NV vowel system(s), we first must determine what is most defining of the Western vowel system at present and in contemporary work. William Labov’s (1991) article “The three dialects of English” articulates the generally accepted stance that U.S. regional dialects can be divided into three main regional groups based on how they are affected by various vowel shift processes. The most relevant contemporary vowel differences in terms of this regional demarcation are the Northern Cities Shift (NCS) affecting much of the North, the Southern Vowel Shift (SVS) affecting dialects across the South, and the low back merger affecting a large swathe of the continental U.S. which falls into what has been termed the “third dialect” (Labov 1991). According to The Atlas of North American English (ANAE; Labov et al. 2006), the primary distinguishing feature of the West is the low back vowel merger, the conflation of the historical /ɑ/ and /ɔ/ classes into a single vowel phoneme. While most of the West participates in the merger, the Bay Area is unusual in that it has been reported to resist merger (Labov et al. 2006), though several recent studies suggest the merger is now characteristic of San Francisco speech as well (Hall-Lew 2009, ms, Moonwomon 1987, 1991).

Beyond this merger, the California Vowel Shift (CVS), involving the lowering and/or retraction of the front lax vowels and the fronting of the high, and sometimes mid, back vowels, has been identified in some areas of California (Eckert 2008, n.d.). In addition to the general retraction of the front lax vowels, /æ/ is also documented with a pre-nasal split in which, rather than retracting, pre-nasal /æ/ raises extensively. While the changes to the front vowels are specific to the Western region, back vowel fronting and the low back merger are also associated with speech in other regions in the U.S. Back vowel fronting, for example, has a long history in Southern speech, dating as far back as the mid 1800s (Bailey 1997), and is also reported in Midwestern English (Labov et al. 2006). Similarly, the low back merger is also a feature of Eastern New England and Western Pennsylvania speech, with traces of the merger reported in early Eastern U.S. atlas data (Labov et al. 2006: Chap. 9) and found in early recordings (Thomas 2001: 63-70).

Early documentation of the phonological system in the West is quite limited. DeCamp (1959) examined the phonemes of San Francisco speech (based on 25 informants) and suggested that the semi-vowels and consonants of San Francisco resembled those of most of the rest of the U.S., differing “in only a few particulars” (p. 55). However, his description of the low back vowels, in particular the fronted unrounded /ɔ/ variants in some of his speakers, suggests early reflexes of merger may have been present in his informants (p. 60). Later studies in California such as Hinton et al. (1987) also suggest incipient but not wholesale degrees of merger in San Francisco, despite more advanced merger elsewhere in the West. Moonwomon (1987) found a more advanced merger among her younger Bay Area subjects, suggesting the merger was continuing to advance in that area. Back vowel fronting is also mentioned in work in California starting in the 1980s (Hinton et al. 1987, Luthin 1987, Fought 1991). Beyond these two features, there is very little early documentation of any of the other changes associated with the California Vowel Shift prior to the twenty-first century. A number of recent studies have begun to investigate this shift in earnest in various parts of the West and to study Western vowel phonology more generally (Nelson 2011, Podesva et al. 2013, Wassink et al. 2009), but to a large extent our knowledge of the span of the California Vowel Shift and variability within the region awaits further sociophonetic research on the Western U.S.

4 Obtaining and working with archival recordings

Our primary question here is whether any contemporary Western features such as the low back merger, the pre-nasal /æ/ split and lax front vowel retraction can be found in early Western speech as produced by speakers born in Northern California and Nevada in the late 1800s.
Undertaking this work, we sought archival recordings from the earliest possible recorded oral history projects, and, perhaps unsurprisingly, finding suitable archival recordings proved quite a task. Most of the oral history projects contained in regional archives were either recorded too recently to fit our needs or had media that were too disorganized to be useful or were just inaccessible. In some cases, relevant recordings could be found, but were on media (such as non-standard reel to reel recordings) for which transfer to digital was too costly or impossible. National archives, such as those at the U.S. Library of Congress, tend not to have local speech recordings, primarily focusing on folklore preservation or national media broadcasts.

All the recordings examined in this chapter were obtained at the Bancroft Library at the University of California, Berkeley and at the Nevada Historical Society, in Reno, NV, with the authors arranging for and funding the digitization of a number of original materials, including reel-to-reel tapes and glass records. We also obtained some recordings at the library at Stanford University and visited several other archives though we have not analyzed recordings found at these locations yet.

Again, although a number of older recordings that might have been useful were found in these locations, many were either in disrepair, not accessible due to lack of funding for equipment that could play the recordings, or were not organized in such a way that the actual media could be located (mainly a problem at historical societies that have limited staffing). Finally, another problem we encountered with mining archives for older recordings was that a number of potentially suitable recordings were found that did not contain sufficient information about or identification of the speakers involved and so had to be rejected for the project. A high quality oral history recording from the 1940s with older speakers is not very useful if no information is available about the demographics or life histories of the speakers.

Thus, in many ways, the most difficult part of this chapter proved to be finding suitable archival recordings, in large part due to inadequate interest in and funding for such projects. Many of these recordings are being lost to us as research resources decrease every year, as glass recordings break and reel to reels become damaged, when digitization would prevent such loss. But to rescue such material would require much more funding than any of these archives or libraries have. As a benefit of our work, a number of recordings that were at risk of being lost forever are now converted to digital recordings, which were made available to the source agency as we converted them. As a side-note of this project, we urge researchers to look locally toward the preservation of these rich resources and to use any additional funds they have to digitize at least small parts of local collections or else much of our rich linguistic heritage will be forever lost.

Despite these difficulties, we were able to obtain and analyze several relatively good quality recordings that were similar in type (oral histories). For the present chapter, we examine two speakers from California and two speakers from Nevada.

Our first California speaker, William Edward Colby, was born in Benicia, CA, in the Bay Area, near San Francisco, in 1875. His father, Gilbert Colby, was a “forty-niner”, arriving from New England in the gold rush era. His mother, Caroline Smith, came from Maine and traced her ancestry back to the Mayflower. The Colbys were a very successful family, with Gilbert both an assemblyman and state senator in California and William becoming a lawyer and an early founder of the Sierra Club. Our other Bay Area speaker, Frank Tracy Swett, was born in 1869 and raised in San Francisco, one of eight children of John Swett, originally from Vermont, and Mary Louise Tracy, from Connecticut. John, the father, was a California educator and politician and was credited with establishing the free public school system in California. Recordings for both of these speakers were obtained through the Bancroft Library’s oral history archives. Swett was recorded around 1964 as a part of a project by the Sierra Club to record memories of John Muir, its founder. Colby’s recording was made in 1959 but was also centered on recounting early topics related to the Sierra Club.

In Nevada, our female speaker, Emma Bowler Welsh, was recorded in 1959. Born in the 1880s, she grew up primarily in Hawthorne and Reno, Nevada, graduating from Reno High School in 1902. Her grandparents, the Curlers (originally from Vermont and of Dutch origin), were early pioneers in Nevada, settling there in 1859. Her recording recounts the history of her family. Unlike the recordings of the other three speakers we examine here, which contain interview speech and more spontaneous narratives, Bowler’s talk appears to be prepared and scripted. Our male Nevada speaker, William Thomas, was born in 1876 in Austin, Nevada. His parents were British (from Plymouth) and moved to Nevada before he was born. After a brief move to Idaho, his family returned to Austin, NV, when he was 5 or 6 and he remained there until early adulthood, subsequently moving to other parts of the West before returning to Nevada. His recording is one of a series of oral histories from this rural Nevada settlement recorded between 1959 and 1961.

One caveat of this work, and, we anticipate, of all work with “found” archival recordings from this period, is that the recordings were more difficult to work with than typical sociolinguistic interview recordings or lab-based recordings. Our recordings had tape quality degradation making formant tracking difficult and, in our quest to obtain the oldest recordings available for the oldest speakers, we necessarily worked with older speakers than we often would. The speakers examined range in age from their 70s to 95 (Swett).

5 Modern data and baseline

Before we examine the data from our archival speakers, it is useful to look first at specific examples of modern Western speakers from CA and NV. Figures 1 and 2 show two modern Northern CA speakers, Hana and Mia, young adult females (between the ages of 18 and 25) from Sebastopol, about 50 miles north of San Francisco, in Sonoma County, CA, and Lafayette, in Contra Costa County, just east of Berkeley, CA, respectively. These modern speakers come from a large and growing database of regional speakers (Fridland and Kendall 2012, Kendall and Fridland 2012). The plots for the modern speakers are based on reading passage and word list elicitation and, unfortunately, do not contain the same vowel classes and subclasses as the archival speakers. However, these modern plots depict a range of vowels of interest and, we believe, are still useful comparators. In all plots, formant values were extracted using the formant analysis tools in Praat (Boersma and Weenink 2013) and are given in raw Hz. Individual tokens are shown in gray in the background. (For the modern speakers, tokens in all caps depict words from the word list while sentence case is used for tokens from the reading passage.) Ellipses are used throughout to indicate one standard deviation around the category mean when more than two tokens are available for a given vowel class or subclass.
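
The ellipse convention just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plotting procedure actually used for the figures: the function name `class_summary` and the token values are hypothetical, and the ellipse is assumed to be axis-aligned, with half-widths equal to the sample standard deviations of F1 and F2 in raw Hz.

```python
from statistics import mean, stdev

def class_summary(tokens):
    """Summarize one vowel class from (F1, F2) measurements in raw Hz.

    Hypothetical helper: returns the category mean and, only when more
    than two tokens are available, the one-standard-deviation half-widths
    that an axis-aligned ellipse would use (an assumption of this sketch).
    """
    f1 = [t[0] for t in tokens]
    f2 = [t[1] for t in tokens]
    summary = {
        "n": len(tokens),
        "mean_f1": mean(f1),
        "mean_f2": mean(f2),
        "sd_f1": None,  # no ellipse with two or fewer tokens
        "sd_f2": None,
    }
    if len(tokens) > 2:
        summary["sd_f1"] = stdev(f1)
        summary["sd_f2"] = stdev(f2)
    return summary

# Invented measurements for a single vowel class.
example = class_summary([(700, 1200), (720, 1250), (680, 1150)])
```

With two or fewer tokens the function still returns the mean, so the class can be plotted as a point without an ellipse.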


Figure 1. Hana, young adult female from Sebastopol, CA


Figure 2. Mia, young adult female from Lafayette, CA

Looking at both systems, a number of contemporary Western speech features are readily apparent. First, both speakers show extensive low back merger. For Mia, both vowel classes overlap in low back vowel space. Hana’s low back vowels are also quite merged, but are not quite as lowered as Mia’s. Both also show a tight clustering of the tokens of these classes, including pre-liquid tokens that tend to lag in the process of merger, indicating the merger has likely gone to completion. Moving to examine their high back vowels, note that we have separated the /u/ class into post-coronal (labeled /tu/) and non-coronal tokens (labeled /u/), as coronal contexts promote fronting for most speakers while non-coronals are often less fronted. Here, we see that both these young women are engaged in prominent back vowel fronting, with only pre-lateral tokens defining the back periphery of their systems. For both speakers, post-coronal /u/ is well front of mid-central and non-coronal /u/ is also relatively fronted, though, as we would expect, somewhat less so than in coronal contexts. /o/, though still back of /ʌ/, has also moved front to some degree.
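
One simple way to put a number on the overlap of two vowel classes is the distance between their means in F1-F2 space. The sketch below is an illustrative stand-in, not a measure used in this chapter: the function name `mean_distance` and the formant values are hypothetical, and a small distance is merely suggestive of merger, since it ignores how tightly each class clusters.

```python
from math import hypot
from statistics import mean

def mean_distance(class_a, class_b):
    """Euclidean distance in Hz between two class means in F1-F2 space.

    Hypothetical helper: each argument is a list of (F1, F2) tokens.
    """
    a_f1 = mean([t[0] for t in class_a])
    a_f2 = mean([t[1] for t in class_a])
    b_f1 = mean([t[0] for t in class_b])
    b_f2 = mean([t[1] for t in class_b])
    return hypot(a_f1 - b_f1, a_f2 - b_f2)

# Invented tokens standing in for the two historical low back classes.
low_back_a = [(700, 1100), (720, 1140)]
low_back_b = [(705, 1110), (709, 1122)]
distance_hz = mean_distance(low_back_a, low_back_b)
```

Because raw Hz values differ across speakers, such distances are best compared within one speaker's system rather than across speakers.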

Both low back merger and back vowel fronting are features that define a fairly broad swathe of American regional dialects. We also want to look at the time-depth of features, such as those associated with the California Vowel Shift (CVS), that more uniquely define Western speech. The most defining CVS shift, beyond the low back merger, is the retraction of the front lax system, particularly /æ/, which also exhibits a nasal split. Looking at Mia and Hana, we see that they are clear participants in this component of the shift – both exhibit greatly retracted /æ/ classes, appearing essentially low central rather than low front in acoustic position. In addition, both speakers show a much higher and fronter pre-nasal /æ/ location, with /æɴ/ firmly in front vowel space and having moved higher than even their lax mid-vowel class /ɛ/. Clearly these speakers exhibit unequivocally differentiated nasal and non-nasal /æ/ tokens.
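
The comparisons behind a pre-nasal split can be made explicit with class means. The following is a minimal sketch under stated assumptions rather than the procedure used here: the function name `nasal_split` and all token values are hypothetical. It relies only on the standard acoustic correlates that a lower F1 indicates a higher vowel and a higher F2 a fronter one.

```python
from statistics import mean

def nasal_split(ae_oral, ae_nasal, eh):
    """Compare pre-nasal /ae/ with non-nasal /ae/ and with lax mid /eh/.

    Hypothetical helper: each argument is a list of (F1, F2) tokens in Hz.
    """
    def class_mean(tokens, index):
        return mean([t[index] for t in tokens])

    return {
        # Lower mean F1 means the pre-nasal class is higher in vowel space.
        "raised": class_mean(ae_nasal, 0) < class_mean(ae_oral, 0),
        # Higher mean F2 means the pre-nasal class is fronter.
        "fronted": class_mean(ae_nasal, 1) > class_mean(ae_oral, 1),
        # Pre-nasal /ae/ has moved above the lax mid vowel class.
        "above_eh": class_mean(ae_nasal, 0) < class_mean(eh, 0),
    }

# Invented values shaped like the pattern described for the modern speakers.
result = nasal_split(
    ae_oral=[(850, 1500), (870, 1550)],
    ae_nasal=[(600, 2100), (620, 2150)],
    eh=[(680, 1800), (700, 1850)],
)
```

A speaker for whom all three values are true would show the kind of split described above; the check says nothing about how large the differences are.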

Similarly retracted, though not as extensively, are their other front lax vowels. Notably, /i/ and /e/, in contrast to /ɪ/ and /ɛ/, define the high vowel system for these speakers. Finally, /ɑɪ/ and /ɑʊ/ fronting in these modern systems is not as advanced as we find in other dialects in the U.S., with both classes in mid-central position, as opposed to the greatly fronted position of these vowels in, for example, the South.

Figure 3. Jocelyn, young adult female from Las Vegas, NV


Figure 4. Ryan, young adult male from Reno, NV

Depicted in Figures 3 and 4 are our modern Nevada speakers, Jocelyn and Ryan, who show similar systems to those of our modern CA speakers, though perhaps not quite as advanced in some of the shifts just discussed. Both sets of speakers show similarly fronted high back vowels, with post-coronal /u/ (/tu/) realized essentially in the front vowel system, although Ryan shows a relatively conservative non-coronal /u/ position. The more noticeable coronal/non-coronal split in relative fronting suggests that Ryan is, in general, less participatory in the Western vowel features we are examining here. Likewise, while both NV speakers do show substantial overlap in their low vowel system, the overlap is not as complete in Ryan’s system and the tokens of the two low back classes are not as tightly clustered (visible through the locations of the ellipses, which do not overlap as completely), which may suggest the merger is not as complete in the Nevada area as it is in Northern CA (or that there may be a gender difference in advancement). However, both speakers show substantially overlapping low back classes. The Nevadans’ front vowels share many of the same CVS reflexes as the CA speakers, with /æ/ greatly retracted into mid-central position and a clear pre-nasal split. Pre-nasal /æ/ is not as high relative to /ɛ/ as we find in the CA systems, which may also suggest greater advancement in lax vowel retraction among the CA speakers. Again, however, for our NV speakers as well, /i/ and /e/ are located in high vowel space, with lowering in the front lax vowels apparent. So, in summary, our modern speakers have a number of identifying features that characterize the modern Western system. It also deserves mention, as a side note here, that the modern Nevada speakers (though Ryan is somewhat less advanced) show many of the features of the so-called California Vowel Shift, indicating that this shift is perhaps more ubiquitous to the larger region than acknowledged by its conventional label.

6 Archival data and analysis

We now move to look at our archival speakers to see if the features present in the speech of modern Californians and Nevadans have any presence in this earlier era. Again, our main interest in this chapter is to determine whether there are any incipient tendencies toward the modern vowel shifts affecting contemporary Western speech. The most identifiable recent shifts in Western speech are the low back vowel merger, the split and retracted /æ/ system along with front lax retraction more generally, and /u/ (but little /o/) fronting. Recalling Figures 1 – 4, we see that all of these tendencies were reflected in our modern speakers from California and Nevada.

Figures 5 – 8 depict the vowel spaces for each of the four archival speakers, two speakers, William Colby and Frank Swett, from the Bay Area (California) and two speakers, Emma Bowler and William Thomas, from Nevada. All of our speakers’ parents settled in different areas of the West – but all were areas primarily settled during the height of mining booms. Thus, our speakers were born in an era of rapid settlement and disparate language and dialect influences. In presenting these plots, we would suggest our speakers represent the second stage of dialect development on the path of the new Western koiné – the first generation of native children, a period of variability based on the variable inputs of parents from a variety of languages and dialects (Trudgill 1986, 1998) – something we will discuss more after examining the data.

Before discussing these plots, two notes are in order: First, since back vowel fronting involves advanced fronting primarily in post-coronal contexts and very little fronting in pre-liquid environments, we have separated the /u/ tokens by a post-coronal (e.g. two, news) vs. non-coronal (e.g. hoover, food) distinction as well as by pre-liquid contexts (e.g. school) as we did for the modern comparators to better examine any early tendencies driven by these contextual effects. Post-coronal tokens are depicted in the plots as /tu/, with /uʟ/ depicting pre-liquids and /u/ depicting all other /u/ class vowels. In addition, as alluded to in our discussion of the difficulties with archival recordings, these recordings were often more difficult to analyze acoustically than modern recordings conducted for sociophonetic analysis. While we have done our best through careful analysis to limit the “noise” in our measurements, there is generally a greater amount of variability in the archival figures resulting from both the speakers’ ages and recordings’ quality (e.g. Bowler’s back vowels in Figure 7). That said, consistent trends still emerge in the recordings that suggest our measurements, even if less than ideal, are still quite useful.
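The three-way coding of /u/ tokens described above can be illustrated with a minimal sketch in Python. It is not the procedure used to produce the plots in this chapter: the function name, the segment inventories and the rule that a following liquid takes precedence over a preceding coronal are all assumptions made for illustration.

```python
# Illustrative coding of /u/ tokens by phonetic context.
# The segment sets and the precedence rule are assumptions, not the chapter's protocol.
CORONALS = {"t", "d", "n", "s", "z", "ʃ", "ʒ", "tʃ", "dʒ", "θ", "ð"}
LIQUIDS = {"l", "r"}


def classify_u_token(preceding, following):
    """Return 'uL' (pre-liquid), 'tu' (post-coronal) or 'u' (all other tokens)."""
    if following in LIQUIDS:
        return "uL"
    if preceding in CORONALS:
        return "tu"
    return "u"


# Example words from the text, given as (preceding segment, following segment).
examples = {
    "two": ("t", None),
    "news": ("n", "z"),
    "food": ("f", "d"),
    "school": ("k", "l"),
}
labels = {word: classify_u_token(p, f) for word, (p, f) in examples.items()}
print(labels)
```

Checking the following segment first reflects the assumption that a pre-liquid environment inhibits fronting even after a coronal; a different precedence would simply reorder the two tests.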



Figure 5. William Colby, born 1875 in Benicia, CA



Figure 6. Frank Tracy Swett, born 1869 in San Francisco, CA



Figure 7. Emma Bowler, born 1880s in Hawthorne, NV



Figure 8. William Thomas, born 1876 in Austin, NV

Beginning with the front vowel subsystem, we find a number of variable aspects of vowel realization across the four speakers. While all four have the traditional alignment of tense and lax vowels, there is quite a range in the degree of overlap and position in front vowel tokens among the speakers. For example, Emma Bowler shows very distinct high/mid and tense/lax vowels – there is no overlap in her tokens in these classes (despite the general noise in our measurements for her) – unlike the other Nevada speaker, William Thomas, who shows a great deal of overlap in his mid tense/lax vowel pair, and even between his /ɪ/ and /e/ (FACE) classes. More similarly positioned across our speakers, the low front vowel, /æ/, is realized at a mid-low position for most of the speakers and none show a split nasal system.

Looking back at our earlier modern plots (Figures 1 – 4), we see that there is less tendency toward front vowel retraction in the archival speakers than in the modern speakers. In particular, the low front vowel is neither separated by nasal context effects, as mentioned above, nor particularly retracted. This contrast is very noticeable, for example, in a comparison between Emma Bowler, in Figure 7, and her modern comparators, like Mia (Figure 2) or Jocelyn (Figure 3). Both of these young contemporary women have an extreme nasal split and very retracted non-nasal /æ/ vowel (essentially now a low central vowel) while Bowler clearly does not. Recalling that the front tense vowels /i/ and /e/ define the high front periphery of vowel space, with /ɪ/ and /ɛ/ retracted, for most of the modern speakers, our



archival speakers for the most part still retain the more traditional relationship between /i/ and /ɪ/ and /e/ and /ɛ/. This suggests that the CVS, as it affects the front lax system, had yet to appear in the West in the speech of those born in the nineteenth century. However, our archival speakers are not without any evidence of variation in the front vowels that may have foreshadowed some of the shifts in the front system in modern speech. William Colby’s system, in Figure 5, does show both a slight tendency toward a nasal split (though with no /æ/ retraction, and with /æɴ/ lower, not higher, than his /æ/) and a relatively low /ɪ/ in relation to /e/. In general, though, there is little to suggest any real CVS tendencies among our archival speakers.

We also see a noticeable contrast between the archival speakers and the modern speakers for the back vowels. The plots for Colby, Swett, Thomas and Bowler show much less evidence of fronting generally in the /u/ class, confirming the notion that the fronting of back vowels in the West is also a fairly recent change. However, though certainly not fronted like those of our modern speakers, post-coronal tokens (represented in the plots as /tu/) are clearly realized quite a bit forward of non-coronal tokens, a pattern especially apparent in the separation from the /uʟ/ tokens that typically mark the back periphery of vowel space. As discussed above, this contextually based distribution is replicated in modern fronting, with post-coronal tokens typically showing more advanced fronting than non-coronals (Labov et al. 2006).

Thus, this pattern suggests that perhaps early reflexes of /u/ fronting actually do appear in our archival speakers – if we assume that post-coronal fronting caused originally by contextual effects then provided a model (due to misperception, e.g. a lack of perceptual co-articulatory compensation; Ohala 1993) for non-coronal tokens to front. In fact, recent work has suggested this mechanism underlies back vowel fronting in Standard Southern British English, a set of varieties in which such fronting is also attested (Harrington 2012). Similarly, modern Western speakers may have now phonologized a fronted /u/ that, for our archival speakers, was simply contextually based phonetic variation. While our archival speakers do not show either spectrographic or impressionistic evidence of onglided /u/ tokens, it is quite possible that loss of an older feature of ongliding (/ju/) after coronals, coupled with the subsequent retention of fronted position, may have provided the impetus for the contemporary shift if these tokens were perceptually recognized as defining a more fronted /u/ position generally.

In contrast to /u/, /o/ tokens in our archival plots appear back of mid-vowel space, a position similar to that found in several of our modern speakers. This is perhaps not surprising given the unevenness of this shift in the West even in modern speech and the likelihood that /o/ fronting is a subsequent shift to /u/ fronting. There is no evidence in modern U.S. dialects of regions with /o/ but not /u/ fronting. In other words, /u/ fronting appears to be a necessary precursor to shift in the mid-back class and, as such, would not be likely in speakers not participating yet in any generalized fronting of the high back system.

The merged low back vowels are the classic feature of the modern West and our modern speakers maintained very little contrast in these two classes. Early reports from data collected in the 1950s (DeCamp 1953, 1959) suggested that merger had just started making inroads into the Bay Area in several of the speakers



measured (although DeCamp notes the shift was starting in some parts of the Pacific Northwest more extensively (1959: 60), and McLarty, Kendall, and Farrington (under review) find clear evidence of merger in the speech of Oregonians born around the turn of the twentieth century), suggesting that our archival speakers most likely pre-date the arrival of this shift in the Northern CA/NV area.

Supporting this suggestion of more recent inception for the merger in this part of the West, William Colby, born in the Bay Area in the 1870s, shows very distinct low vowels. In fact, his /ɔ/ class remains very much in the mid-back system, a striking positional difference from the descended vowel in many modern speakers. Similarly, our other archival speakers show a distinction in their low back vowels, with none exhibiting any overlap in tokens and generally showing a much higher and more back /ɔ/ class than any of our modern speakers. Our archival speakers’ parents mostly stem from New England, an area where the low back merger is prevalent (cf. Thomas 2001: 63-70). However, clearly, none of our archival speakers inherited a merged system, which suggests either that their parents’ systems also lacked the merger or that contact with unmerged systems, likely a more prevalent system given the large influx of New York and Midland migrants, provided the acquisitional basis for the pattern we see evidenced here. Looking at William Thomas’ and Frank Swett’s systems, though, we do see a more backed /ɑ/ and lowered /ɔ/ than we see for the other two speakers. We can even recognize how close in distribution some tokens are between the two classes compared to Bowler and Colby’s highly distinct token distribution. Thus, though clearly distinct and not showing much evidence of actual merger, this variability among our speakers in the relationship of these two vowels certainly provides dialect input that could be taken, by the next generation, a step closer to a merged system, particularly if additional movements (such as /æ/ retraction) created a context for the backing of /ɑ/. Since it appears the merger was already making inroads into Pacific Northwest speech, contact with speakers with a merged system might have also hastened the process along in subsequent generations.
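One simple way to express how close two vowel classes are, offered here only as a sketch, is the Euclidean distance between their mean positions in F1/F2 space. This is not the method behind the plots or the token-overlap judgments in this chapter; the function names are hypothetical, and the formant values below are invented placeholders rather than measurements from any speaker.

```python
from math import dist
from statistics import mean


def class_mean(tokens):
    """Mean (F1, F2) of a list of (F1, F2) measurements in Hz."""
    return (mean(f1 for f1, _ in tokens), mean(f2 for _, f2 in tokens))


def mean_distance(class_a, class_b):
    """Euclidean distance in Hz between the mean positions of two vowel classes."""
    return dist(class_mean(class_a), class_mean(class_b))


# Invented placeholder values (Hz); not data from the archival recordings.
lot_tokens = [(700, 1200), (720, 1180)]
thought_tokens = [(600, 900), (620, 880)]

separation = mean_distance(lot_tokens, thought_tokens)
print(round(separation, 1))
```

A smaller distance indicates closer class means. It says nothing about how widely the tokens of each class are spread, so a measure of this kind would complement rather than replace inspection of the plots.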

This supports an interpretation that low back vowel merger in Northern CA and Northern NV is a recent shift, originating after the early Western settlement period represented by the pioneer families analyzed here. In addition, it also suggests that the descent of the mid back class (and perhaps unrounding) is a key distinction between the older and modern system that may have played a central role in the tendency toward merger. DeCamp’s data from the 1950s indicates a number of low unrounded /ɔ/ variants among his speakers which may suggest this change in positioning was a necessary precursor to merger.

In general, the data provided from our four speakers contribute several points that are of interest to us in our endeavor to examine the early input to a Western dialect. First, our speakers display vowel positions that clearly predate the beginnings of the major shifts we find in modern Western speakers. Comparing our archival to our modern speakers, we see little in the way of front lax vowel retraction, a split /æ/ system, fronted back vowels or low back vowel merger. At the same time, however, we find quite a bit of variability among our four speakers – speakers who come from, in some cases, similar parental dialect backgrounds but



who no doubt had extensive contact with diverse input beyond that of their parents. One speaker shows the beginnings of a separation in the nasal/non-nasal /æ/ system, one speaker shows much more separate front tense/lax classes, two speakers show much more proximal low back vowels than the other two, and all of our speakers show post-coronal /u/ fronting, but no general fronting tendencies. In other words, we find both diverse beginnings in the early Western speech patterns and, perhaps, subtle hints of what may be the foundation of later shift patterns.

7 Conclusion

We entered this project with an interest in the foundations of the Western regional accent of the U.S., a remarkably massive dialect region in terms of its geographic distribution (Labov et al. 2006). A fascinating question, of course, is when and how this modern Western system began.

Based on its historical roots and modern similarity across the region, it seems appropriate to view the dialect of the Western U.S. as a koiné, a variety brought about through contact-induced change by speakers of a wide range of mutually intelligible dialects of the same language (Kerswill 2013, Siegel 1985). Our discussion above has shown that the four archival speakers examined can be interpreted to anticipate some major features of the vowel system of the modern Western U.S. but that they do not actually show these features to any large degree. To what extent do they help us understand the similarities across the expanse of the modern West? One idea is to consider something akin to the hypothesized origin for the general (non-nasal split) raising of /æ/ in the Northern region, possibly deriving from a melting pot of short /æ/ systems that came into contact during the building of the Erie Canal (Labov 2010, Labov et al. 2006). Perhaps, in the case of the West, the groundwork for the opposite tendency, the split and retraction of /æ/, was laid during a similar period of intense dialect mixture resulting from foreign and transnational in-migration. Recalling our archival plots, we do see some variability among our speakers in how the low front system was realized. For example, we see that William Colby’s /æɴ/ is separated from his /æ/ class more than we see for our other speakers. Also, William Thomas’ low front class exhibits quite a distributed pattern of tokens along F1, a more continuous system than that of the other speakers. These two systems suggest that, early in the history of the West, a number of different possible configurations of the low front vowels were evidenced, providing variable input to the formation of a new dialect, such as that proposed by Trudgill (1986), where contact between speakers from different dialect backgrounds leads to leveling and focusing.
Similarly, Thomas’ low back vowels are also closer in proximity compared to Colby and Bowler’s systems, with an acoustically lower realization of the /ɔ/ class. Again, though it is not evidence for any of the modern shifts, this data does show that variability was a characteristic of these early systems. This general absence of Western features in the archival speakers and their individual differences actually still supports an interpretation of koineization as the process leading to the modern West. “Rapid” dialect change is a hallmark component of koineization, and between these early Western immigrants and the



time of their grandchildren much has changed. According to Trudgill’s stages in koineization and his account of the development of New Zealand English (Trudgill 1986, 1998; see also Kerswill 2013), our speakers – the first native children of immigrants – represent stage two, a stage whose hallmark is massive variability. Thus, it is the next generation, the children of our speakers, in whom we would expect to find accommodation toward a new regional dialect. As a result of exposure to multiple dialect systems at the early stage of koineization exhibited by our archival speakers, for example, re-analysis of the fronted coronal /u/ tokens or the separation of nasal tokens must have occurred and then these forms became leveled toward a less variable realization for the next generation.

Although our archival speakers show only isolated variants that hint at what was to come, DeCamp’s work in the 1950s in San Francisco indicates that speakers born in the early part of the twentieth century were a bit farther along in the journey toward a more cohesive Western system, particularly in the low back system. Largely absent in our archival speakers, CVS tendencies and the low back merger may however have been extant in subtle ways as noted in our early data. In closing, while the data we have presented here are only preliminary and are quite limited in many ways, our work suggests perhaps an incipient tendency toward both back vowel fronting and the nasal /æ/ system in isolated speakers born in Northern California and Nevada in the mid to late 1800s. While, more generally, none of the main shifts found in modern Western speakers are present in these early data, the seeds for shift, even if primarily in the form of idiolectal variation, may have been planted. Presumably, the generation following that of the speakers examined here – luckily a generation whose recordings will be less hard to come by – will shed great light on the rapid dialect leveling that occurred in the expansive West during the twentieth century.

Acknowledgments

Parts of this research have been supported by National Science Foundation grants BCS-0518264 and BCS-1123460 (PI Fridland), and BCS-1122950 (PI Kendall). We are grateful to Craig Fickle, at the University of Oregon, and Erin Golden and Sohei Okamoto at the University of Nevada, Reno, for help with the research reported here.

References

Bailey, Guy 1997. When did Southern American English begin? In: Edgar Schneider (ed.) Englishes Around the World 1. Studies in Honour of Manfred Görlach. Amsterdam: John Benjamins, pp. 255-275.

Bigham, Doug 2010. Correlation of the Low Back Vowel Merger and TRAP-Retraction. Penn Working Papers in Linguistics 15.2: 21-31.

Boersma, Paul and David Weenink 2013. Praat: Doing phonetics by computer. (Computer Program).

Bright, Elizabeth 1967. A Word Geography of California and Nevada. PhD dissertation, University of California, Berkeley.



DeCamp, David 1953. The Pronunciation of English in San Francisco. PhD dissertation, University of California, Berkeley.

DeCamp, David 1959. The pronunciation of English in San Francisco (Part II). Orbis 8: 54-77.

Di Paolo, Marianna and Alice Faber 1990. Phonation differences and the phonetic content of the tense-lax contrast in Utah English. Language Variation and Change 2: 155-204.

Eckert, Penelope n.d. Vowel shifts in Northern California and the Detroit suburbs. http://www.stanford.edu/~eckert/vowels.html (accessed 12 December 2013).

Eckert, Penelope 2008. Where do ethnolects stop? International Journal of Bilingualism 12: 25-42.

Feagin, Crawford 1986. More evidence for vowel change in the South. In: David Sankoff (ed.) Diversity and Diachrony. Philadelphia and Amsterdam: John Benjamins, pp. 83-95.

Fought, Carmen 1999. A majority sound change in a minority community: /u/-fronting in Chicano English. Journal of Sociolinguistics 3: 5-23.

Fridland, Valerie 2001. Social factors in the Southern Shift: Gender, age and class. Journal of Sociolinguistics 5: 233-253.

Fridland, Valerie and Kathryn Bartlett 2006. The social and linguistic conditioning of back vowel fronting across ethnic groups in Memphis, TN. English Language and Linguistics 10: 1-22.

Fridland, Valerie and Tyler Kendall 2012. Exploring the relationship between production and perception in the mid front vowels of U.S. English. Lingua 122.7: 779-793.

Gordon, Matthew J. 2002. Investigating chain shifts and mergers. In: J. K. Chambers, Peter Trudgill and Natalie Schilling-Estes (eds), The Handbook of Language Variation and Change. Oxford: Blackwell Publishing, pp. 244-266.

Hall-Lew, Lauren 2009. Ethnicity and Variation in San Francisco English. PhD dissertation, Stanford University.

Hall-Lew, Lauren. manuscript. “I went to school back East ... in Berkeley”: San Francisco English and San Francisco identity. http://www.lel.ed.ac.uk/~lhlew/Hall-Lew_underreview_VoxCA.pdf (accessed 12 December 2013).

Harrington, Jonathan 2012. The coarticulatory basis of diachronic high back vowel fronting. In: Maria-Josep Solé and Daniel Recasens (eds) The Initiation of Sound Change. Perception, Production, and Social Factors. Amsterdam: John Benjamins, pp. 103-122.

Hinton, Leanne, Sue Bremmer, Hazel Corcoran, Jean Learner, Herb Luthin, Birch Moonwomon, and Mary Van Clay 1987. It’s not just Valley Girls: A study of California English. Proceedings of the Annual Meeting of the Berkeley Linguistics Society 13: 117-127.

Irons, Terry 2007. On the status of low back vowels in Kentucky English: More evidence of merger. Language Variation and Change 19: 137-180.

Kendall, Tyler and Valerie Fridland 2012. Variation in perception and production of mid front vowels in the U.S. Southern Vowel Shift. Journal of Phonetics 40: 289-306.

Kendall, Tyler and Erik Thomas 2012. Vowels.R: Vowel manipulation, normalization, and plotting. R package version 1.2. http://CRAN.R-project.org/package=vowels

Kerswill, Paul 2013. Koineization. In: J.K. Chambers and Natalie Schilling (eds), The Handbook of Language Variation and Change. Second edition. Oxford: Wiley-Blackwell, pp. 519-536.

Labov, William 1966. The Social Stratification of English in New York City. Washington DC: Center for Applied Linguistics.

Labov, William 1980. Locating Language in Time and Space. New York: Academic Press.

Labov, William 1991. The three dialects of English. In: Penelope Eckert (ed.) New Ways of Analyzing Sound Change. New York: Academic Press, pp. 1-44.

Labov, William 2010. Principles of Linguistic Change, Volume III: Cognitive and Cultural Factors. Oxford: Wiley-Blackwell.

Labov, William, Sharon Ash and Charles Boberg 2006. The Atlas of North American English: Phonetics, Phonology and Sound Change. Berlin: De Gruyter.

Luthin, Herbert 1987. The story of California (ow): The coming-of-age of English in California. In: Keith M. Denning, Sharon Inkelas, Faye McNair-Knox and John Rickford (eds) Variation in Language: NWAV-XV at Stanford (Proceedings of the fifteenth annual New Ways of Analyzing Variation conference). Stanford, CA: Department of Linguistics, Stanford University, pp. 312-324.

McLarty, Jason, Tyler Kendall and Charlie Farrington. Under review. Investigating the development of the contemporary Oregonian English vowel system. In: Valerie Fridland, Betsy Evans, Tyler Kendall, and Alicia B. Wassink (eds) Speech in the West: Pacific Coast. Publication of the American Dialect Society. Durham, NC: Duke University Press.

Moonwomon, Birch 1987. Truly awesome: (ɔ) in California English. In: Keith M. Denning, Sharon Inkelas, Faye McNair-Knox and John Rickford (eds) Variation in Language: NWAV-XV at Stanford (Proceedings of the fifteenth annual New Ways of Analyzing Variation conference). Stanford, CA: Department of Linguistics, Stanford University, pp. 325-336.

Moonwomon, Birch 1991. Sound Change in San Francisco English. PhD dissertation, University of California, Berkeley.

Ohala, John J. 1993. The phonetics of sound change. In: Charles Jones (ed.), Historical Linguistics: Problems and Perspectives. London: Longman, pp. 237-278.

Nelson, Katherine 2011. A cross-generational acoustic study of the front vowels of native Oregonians. Paper presented at New Ways of Analyzing Variation 40. Washington, DC: Georgetown University.

Podesva, Robert, Jeremy Calder, Hsin-Chang Chen, Annette D’Onofrio, Isla Flores Bayer, Seung Kyung Kim, and Janneke Van Hofwegen 2013. The status of the California Vowel Shift in a non-coastal, non-urban community. Paper presented at The American Dialect Society 2013 Annual Meeting: Boston, MA.

Reed, Carroll E. and David W. Reed 1972. Problems of English speech mixture in California and Nevada. In: Lawrence M. Davis (ed.), Studies in Linguistics in Honor of Raven I. McDavid. Auburn, AL: University of Alabama Press, pp. 135-143.

Siegel, Jeff 1985. Koines and koineization. Language in Society 14: 357-378.




Thomas, Erik R. 2011. Sociophonetics: An Introduction. New York/Basingstoke, Hampshire: Palgrave Macmillan.

Trudgill, Peter 1986. Dialects in Contact. Oxford: Blackwell.

Trudgill, Peter 1998. The chaos before the order: New Zealand English and the second stage of new-dialect formation. In: Ernst-Håkon Jahr (ed.), Advances in Historical Sociolinguistics. Berlin: Mouton de Gruyter, pp. 1-11.

Wassink, Alicia, Robert Squizzero, Mike Scanlon, Rachel Schirra and Jeff Conn 2009. Effects of style and gender on fronting and raising of /æ/, /e:/ and /ε/ before /g/ in Seattle English. Paper presented at New Ways of Analyzing Variation 38. Ottawa, CA.

Wolfram, Walt and Natalie Schilling-Estes 2006. American English: Dialects and Variation, Second edition. Oxford: Blackwell.


14 Analysis of the Ex-Slave Recordings

Erik R. Thomas

1 Early recordings of African Americans

African American English (AAE) is the most heavily studied group of dialects in North America. Its origins are hotly disputed, though, and one of the frustrating reasons is that there are relatively few early audio recordings of African Americans. Racist ideologies in the early twentieth century led to an assumption that African American life was less interesting to radio listeners and less important for scholarship. Of particular interest are recordings of African Americans who had been born as slaves. The small number of known recordings of former slaves has been collected by the Library of Congress and is now publicly available at <http://memory.loc.gov/ammem/collections/voices/title.html>. Often called the “ex-slave recordings” (ESR), they include interviews from a variety of sources, including recordings by professional folklorists such as John and Alan Lomax and John Henry Faulk, recordings made by linguists Lorenzo Dow Turner, Archibald A. Hill, and Guy S. Lowman, Jr., and a few recordings made by private researchers. The recordings vary greatly in their recording media and their sound quality and thus in their usefulness for linguistic analysis. Some of the folklore recordings are of music, which is less linguistically usable than ordinary speech. However, the collection has proved important in linguistic research on the history of AAE. The most notable exposition of them is Bailey, Maynor, and Cukor-Avila (1991), which provides an overview, transcripts of many of the recordings, and linguistic analyses by ten scholars. Other analyses of the ESR include Myhill (1995), Poplack and Tagliamonte (2001), and Sutcliffe (2001, 2003).

There are a number of corpora collected later that also have recordings of elderly African Americans, some born as early as the 1880s. The Linguistic Atlas of the Middle and South Atlantic States (LAMSAS) recorded all interviews conducted after 1950, though not all of the recordings still survive. All interviews conducted for the Linguistic Atlas of the Gulf States (LAGS), made 1968-1983, were audio recorded, and a small sampling of the recordings, including those of sixteen African Americans, are publicly available from the Digital Archive of Southern Speech (<http://wakespace.lib.wfu.edu/handle/10339/37613>) in mp3 format. The Dictionary of American Regional English (DARE) conducted a survey of the entire United States from 1965 to 1970 and audio recorded about half of their subjects, including a significant number of African Americans from the South and non-Southern cities. Information about access may be found at <http://dare.wisc.edu/>. In addition, a variety of smaller-scale surveys have included audio recordings of elderly African Americans (e.g., Pederson 1965; Wolfram 1969; Butters and Nix 1986; Nichols 1986; Cukor-Avila 2001; Wolfram and Thomas 2002, among others). Most recordings from those small-scale projects are not publicly accessible because of human subject protections.


The discussion here, however, will focus on the ex-slave recordings, which predate all the other known collections of recordings of African Americans. The value of these recordings for morphosyntactic research has been demonstrated in Bailey et al. (1991), though so have the limitations of what they can tell us about earlier AAE. The main limitation is that it is impossible to state how well the individuals whose voices were preserved represent AAE of the generation born in the mid-nineteenth century. One might expect another limitation to be that they are unsuitable for acoustic phonetic analysis because of the primitive recording media and degradation of the recordings over the years. In fact, however, they can be used for various acoustic analyses, and have been by Thomas and Bailey (1998), Thomas (2001), and Thomas and Carter (2006). A discussion of what kinds of acoustic analyses are possible, and of the problems that the recordings present, will follow.

2 Problematic issues

Uncertainties about the ESR revolve around a number of issues. The most basic issue is that of how well the people in the recordings represent African Americans of their vintage. Rickford (1991: 194) raises this problem in asking, “do they represent, albeit non-statistically, the range of social types and experiences in the ex-slave population?” The ESR now include a few natives of Virginia, several from Texas, a few from Alabama, and one each from Mississippi and Louisiana, one (Charlie Smith) who had lived in multiple states, and several Gullah speakers from Georgia and South Carolina. There is also a North Carolina native, but her recording is one of the most incomprehensible ones. While this distribution includes the cotton, tobacco, and rice-producing cultures, it still leaves some gaps. For example, there are no recordings from the French part of Louisiana and no speakers came from areas with relatively sparse concentrations of slaves. There are both male and female subjects.
There are also both field hands and house servants, though it is unclear how much of a difference that distinction made to subjects’ speech. Another uncertainty is in the level of rapport the interviewers established with the subjects. As one of the interviewers, John Henry Faulk, discussed in an interview (Brewer 1991), it was difficult for white interviewers to shed their own patronizing attitudes. The interviewers also frequently had their attention diverted by the bulky recording equipment. See also Poplack and Tagliamonte (2001: 69-77). This limitation, one hopes, affects linguistic factors less than it affects the content of the interviews.

The ESR were made with several kinds of recording media. This fact should not be surprising, in that the earliest ones were made in 1932 (or possibly 1931) and the last in 1974. During the 1930s, the recordings were made with discs of various types. Those made during the 1940s mostly employed reel-to-reel magnetic tape, and the two recorded during the 1970s used cassette tapes (<http://memory.loc.gov/ammem/collections/voices/title.html>). Of these recordings, the best sound quality is perhaps to be found on some made with reel-to-reel devices. The early disc recordings tend to have poor retention of higher frequencies, especially above 2 kHz. Whether the problems with higher frequencies were due to weaknesses of the original recording equipment or degradation over the


years is unclear. However, the Library of Congress has been able to amplify the higher frequencies to enhance the recordings so that they sound more understandable. The disc recordings also exhibit occasional crackling noises. The recordings made on cassette tapes have some detectable hiss, a problem endemic to magnetic recordings of any kind but more prominent on narrow cassette tape than on wider reel-to-reel tape. Even the reel-to-reel recordings show some loss of signal components at high frequencies, though.

In spite of the problems with sound quality, the ESR are largely understandable. The only ones that are mostly incomprehensible are the three made by Roscoe Lewis in Virginia, which had very poor reception of sound above 1 kHz, though even in them some of the words can be deciphered. The recording of George Johnson (from Mississippi) is also difficult to understand, but more because of Johnson’s rapid rate of speech than because of recording problems. In the other recordings, there are indistinct words here and there, variously due to sound quality problems, mumbling by the interviewee, or unusual idioms. Nevertheless, it has been possible to produce transcripts of all but a few of the recordings. Transcripts of eleven interviews are included in Bailey et al. (1991). More recently, the Library of Congress has produced its own transcripts, including those for some recordings unknown to Bailey et al., which are available at the website given above. Neither set of transcripts is perfect. For example, in the Laura Smalley transcript in Bailey et al., one passage is rendered as “…out there, an’ brought it to the kitchen. When I was a chil’.” A clause is omitted, and what Smalley actually said was, “…out there, an’ brought it to the house, an’ ?then brought it to the kitchen. When I was a chil’.” Similarly, the Library of Congress transcript for Aunt Phoebe Boyd has Guy Lowman, one of the interviewers, saying “We’re sadly together now,” which makes little sense, in answer to a question from Boyd about Lowman’s association with the other interviewer, Archibald Hill. When one listens to the recording, it is quite clear that Lowman in fact said, “We’re traveling together now.” Rickford (1991) discusses the uncertainties he encountered in transcribing Wallace Quarterman’s Gullah. However, the presence of errors should not be taken to mean that the transcribers were sloppy. 
Transcription is time-consuming and painstaking, and some of the ex-slaves are exceptionally difficult to understand, so the transcribers are to be commended that their work is as good as it is. The main problem with the transcripts is that, as Wald (1995) asserts, transcriptions of ambiguous items in the recordings may be affected by transcribers’ expectations, which in turn can skew linguistic analyses, particularly of morphosyntactic variants such as presence or absence of –s inflections.

3 Past linguistic analyses of the ESR

Transcripts of the ESR have been essential for the linguistic analyses of the interviews, especially for morphosyntactic analyses. Indeed, much of the previous work on the ex-slave recordings has focused on morphosyntactic variables. Three of these studies, Singler (1991) and Poplack and Tagliamonte (1991, 2001), compare morphosyntactic data from the ex-slave recordings with data from other corpora. Singler examines verbal aspect and plural marking of nouns in comparison with usage in Liberian Settler English, whose speakers are descended from freed American slaves. Poplack and Tagliamonte (1991) compare verbal –s marking by the ex-slaves with that by another expatriate population descended from U.S. slaves, that of Samaná in the Dominican Republic. Poplack and Tagliamonte (2001) expand the analysis of verbal –s and add a detailed analysis of past tense marking. All three studies find a great deal of similarity in grammatical and phonological conditioning between the ex-slaves and the expatriates. The broad similarity should be expected, considering that the expatriate groups left the United States, and thus contact with AAE and other forms of American English, at roughly the same period that the ex-slaves were growing up and establishing their speaking norms. Singler, though, notes two puzzling differences: the ex-slaves, at least part of the time, use contracted forms of will, would, and have/had/has, as well as the possessive –’s, while the Liberian settlers’ descendants lack any marking of those forms.

Other articles – Mufwene (1991), Holm (1991), and Sutcliffe (2001) – discuss the ex-slave recordings in terms of creole features. Mufwene focuses on Wallace Quarterman, the only Gullah interview available to Bailey et al. (though the Library of Congress collection contains several others that have surfaced since 1991). He notes that Quarterman uses a mixture of creole forms such as remote time been and non-creole forms such as gerunds and contends that the Gullah of the 1980s is not a decreolized form of the Gullah of Quarterman’s generation. Holm discusses a longer list of constructions, such as completive done, copula absence or realization, pronominal forms, and subject/verb order, and whether they occur in the ex-slave interviews to argue that there was never a widespread creole in the United States. Instead, he states (p. 246) that “the most likely scenario is that blacks born in most parts of the American South spoke a semi-creole from the beginning.” Sutcliffe concentrates on trying to find creole markers that other researchers overlooked.

4 Analyses that cannot be performed on the ESR

Morphosyntactic studies require only that the identity of words be discerned correctly. This prerequisite is most difficult to fulfill for certain one-phone morphemes, such as final –s or –ed. The reason is that consonants ordinarily have lower amplitude than vowels and thus are more easily obscured by noise or poor recording equipment performance. Moreover, sibilants pose an additional problem because their frication noise occurs at high frequencies that early recording equipment picked up poorly. In comparison to morphosyntactic analysis, however, acoustic analysis has many more points at which debilitating problems could occur. How usable are the ESR for acoustic studies? Here, I will illustrate various kinds of acoustic techniques and how they play out with one of the recordings.
The recording to be featured here is that of Phoebe Boyd, who hailed from Dunnsville, in the Tidewater section of Virginia. This recording is one of those whose existence was made known after the publication of Bailey et al. (1991), and thus it is not included in that book. Born about 1848, Boyd was interviewed and recorded by Guy S. Lowman, Jr., and Archibald A. Hill in 1935. In fact, she may have also served as a subject for the Linguistic Atlas of the Middle and South Atlantic States (LAMSAS), as informant VA 15N, though the destruction of the personal information on LAMSAS subjects makes this possibility impossible to verify. Boyd worked as a domestic slave, but after freedom she was involved in tobacco and cotton farming. Her interview typifies the disc recordings within the ex-slave corpus in terms of its sound quality. It is divided into eight sections, each about five minutes long, that represent both sides of four discs.

There are certain kinds of analyses for which the ESR are unsuited. Most voice quality analyses, or at least comparison of voice quality between different interviews, fall into this category. It should be borne in mind that comparisons within a recording are less problematic than comparisons between different interviews. Within a recording, the equipment is constant and recording conditions are relatively constant, so even if some frequencies were captured more poorly than others, it is still possible to examine, for example, whether the speaker is creakier at some points than at others. Comparisons between recordings are where the major problems occur. Because the ESR were made with diverse recording media, their fidelity varies considerably. The equipment was quite inferior to today’s equipment as well. Most of them were recorded in subjects’ homes, which introduces the factors of background noise and possible echo, though such noise is not apparent most of the time in the ESR. Nonetheless, voice quality is among the most sensitive aspects of speech to sound fidelity. As a result, comparisons of voice quality by any particular ex-slave with that of other ex-slaves or with more recent interviews should not be attempted.

Figure 1 shows a wideband spectrogram of the phrase drive on from the Boyd interview. Two recording problems are immediately apparent. One is that there is a considerable amount of noise, visible as the static in the background behind the darker parts of the spectrogram that represent elements of Boyd’s voice. The other obvious problem is the transience, or crackling, on the recording, manifested as vertical dark marks. The noise is relatively constant and would not prevent comparisons from one part of the interview to another. As for the crackling, most of it falls at higher frequencies that are of minimal importance for most aspects of speech, and the few crackles that extend to lower frequencies could be avoided by not taking readings where they occur. However, there are other, subtler problems. An important one for 1930s-era recordings is that the equipment picked up low-frequency sound well but high-frequency sound poorly. The Library of Congress attempted to counteract this problem by enhancing the high-frequency parts of the signal and, apparently, damping the lower-frequency parts. This procedure introduced its own distortion, however, as can be seen in the narrowband power spectrum in Figure 2, taken from the first syllable of chicken as found in the Boyd interview. Ordinarily, either the fundamental frequency (F0) or other harmonics lying in the vicinity of the first formant (F1) have the greatest amplitude in the entire spectrum. However, here, the highest amplitudes occur in the vicinity of the second and third formants (F2 and F3) because of the enhancement.

Figure 1. Wideband spectrogram of the phrase drive on from the Boyd interview, illustrating the noise and transience (crackling) found in the recording.


Figure 2. Narrowband power spectrum of the vowel in the first syllable of chicken from the Boyd interview. Because of enhancement, harmonics near F2 and F3 have greater amplitudes than the lowest harmonic (F0).

One set of commonly measured voice quality features comprises the harmonics-to-noise ratio (HNR), shimmer, and jitter. The problems with recording quality make the HNR of virtually no value. Likewise, shimmer, which represents the amount of local variation in amplitude, is rendered useless by the background noise. Only jitter, which measures the amount of local variation in F0, can be reliably gauged because F0 is preserved well in the ESR. Another commonly utilized method focuses on phonation, such as breathy and creaky voicing. This method involves comparisons of the amplitudes of the lowest harmonics, which can be examined in narrowband power spectra. In breathy voicing, the lowest harmonic (i.e., F0) has a much greater amplitude than other harmonics, while for creaky voicing, the second or third harmonic will show a greater amplitude than the first, and modal voicing is intermediate. Figure 3 compares a power spectrum from the Boyd interview with one from a recording of a female voice made in a soundproof booth with modern equipment. Both are from the word got uttered in spontaneous speech. Peaks from the lowest nine harmonics can be located easily in the Boyd spectrum. However, there are two problems. First, the greater amount of noise in the Boyd recording, some of it visible as jagged spikes between harmonics, adds amplitude to the harmonics, and not necessarily in equal amounts across different frequencies. Hence, readings of their relative amplitudes are inaccurate. Second, and more damaging, the enhancement applied by the Library of Congress to the recording has the effect of reducing the amplitude of the first harmonic relative to that of higher harmonics, which distorts any assessments of phonation.

Figure 3. Comparison of narrowband power spectra, both from the vowel in utterances of the word got, from Boyd’s speech (left) and that of a woman recorded with modern equipment (right).
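
The jitter and shimmer measures mentioned above can be illustrated with a minimal sketch. It assumes one common "local" definition (the mean absolute difference between consecutive cycles, divided by the mean value), which is not necessarily the definition used for the ESR. The cycle durations and peak amplitudes below are hypothetical values, not measurements from the Boyd recording.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods_s):
    """Cycle-to-cycle variation in period (and hence in F0)."""
    return local_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation in amplitude."""
    return local_perturbation(peak_amplitudes)


# Hypothetical glottal cycle durations (seconds) and peak amplitudes.
periods = [0.0050, 0.0051, 0.0049, 0.0050]
amps = [0.80, 0.82, 0.79, 0.81]
print(f"jitter  = {local_jitter(periods):.4f}")
print(f"shimmer = {local_shimmer(amps):.4f}")
```

Because shimmer depends on cycle amplitudes, any noise added to the signal feeds directly into the second measure, whereas jitter depends only on the timing of the cycles.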

As with voice quality analyses, there are some kinds of consonantal analyses for which the ESR are poorly suited. Analyses of frication noise fall into this category. These kinds of analyses rely on spectra of the frication, most often assessing the frequency of the spectral peak and/or the “spectral moments.” The peak is the frequency at which the spectrum reaches its greatest amplitude. The spectral moments include measures of the center of gravity of the energy in the spectrum, how evenly the energy is distributed across frequencies, the way the energy is skewed across frequencies, and the degree of peakedness of the energy. Frication tends to have lower amplitude than the sound produced by vocal pulsing, making it easier for extraneous noise to drown out the frication. Frication also does not show the regular patterning of sound produced by vocal pulsing, making it difficult to distinguish frication noise from extraneous noise. Furthermore, the ESR did not capture the higher frequencies well, but these frequencies are important for analysis of frication, particularly for sibilants. The enhancement applied to the recordings did not eliminate this problem and to some extent may have exacerbated it. Figure 4 compares smoothed average power spectra of two [s] utterances, each covering about 77 ms, one by Boyd and the other by the same female speaker recorded in a soundproof booth as in Figure 3. The spectrum from the modern recording shows much better-defined peaks and valleys than the spectrum of Boyd. The modern recording also shows a decided tilt in favor of higher frequencies, with the highest amplitudes above 7000 Hz, whereas the Boyd recording shows its highest amplitudes below 4000 Hz. A large portion of the energy that the Boyd spectrum contains is due to adventitious noise in the recording introduced by the recording equipment or by deterioration of the discs. The enhancement process undoubtedly introduced some distortion as well.

Figure 4. Comparison of smoothed average power spectra of utterances of [s] from Boyd’s speech (left) and that of a woman recorded with modern equipment (right).
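
The spectral peak and the four spectral moments described above can be computed from a power spectrum as in the following sketch. It assumes the spectrum is given as parallel lists of bin frequencies and power values, and it reports kurtosis as excess kurtosis (minus 3), which is one convention among several. The three-bin spectrum is a toy example, not data from Figure 4.

```python
import math


def spectral_peak(freqs_hz, power):
    """Frequency of the bin with the greatest power."""
    best = max(range(len(power)), key=lambda i: power[i])
    return freqs_hz[best]


def spectral_moments(freqs_hz, power):
    """Return (center of gravity, standard deviation, skewness, excess kurtosis)."""
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [p / total for p in power]  # treat the spectrum as a distribution
    cog = sum(f * w for f, w in zip(freqs_hz, weights))

    def central(k):
        # k-th central moment of frequency, weighted by power
        return sum(((f - cog) ** k) * w for f, w in zip(freqs_hz, weights))

    sd = math.sqrt(central(2))
    if sd == 0:
        raise ValueError("all energy lies in a single bin")
    skewness = central(3) / sd ** 3
    kurtosis = central(4) / sd ** 4 - 3.0
    return cog, sd, skewness, kurtosis


# Toy symmetric spectrum: the center of gravity sits on the peak and skewness is zero.
freqs = [1000.0, 2000.0, 3000.0]
power = [1.0, 2.0, 1.0]
cog, sd, skew, kurt = spectral_moments(freqs, power)
print(spectral_peak(freqs, power), cog, round(sd, 1), skew, kurt)
```

Since every moment is weighted by the power in each bin, broadband noise such as that in the Boyd spectrum pulls all four values toward those of the noise itself.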


5 Analyses involving formant measurements

Nevertheless, some kinds of consonantal analyses can be performed successfully on the ESR. The consonantal analyses that can be performed most readily are those that involve formant measurements. Formants, while usually associated with vowels, play a part in nearly all consonants as well. This attribute is most easily seen for approximants such as [l], which behave like vowels in that they exhibit formant structure throughout their course. Figure 5 shows a spectrogram of the phrase the Lord will bless uttered by Boyd with an LPC formant track superimposed on it. The three [l] tokens all show wide spacing between F1 and F2, with F2 falling around 1700 Hz. This broad gap indicates that the [l]s are “clear,” that is, not velarized. In contrast, most varieties of North American English today show a great deal of velarization, even in syllable onsets. In a velarized [ɫ], F2 is quite low, usually with little or no visible gap between it and F1 in a wideband spectrogram.

Figure 5. Spectrogram of the phrase the Lord will bless with superimposed formant tracks. The large difference between the F1 and F2 frequencies for the three [l] sounds indicates that the [l]s are not velarized.


Another example of a consonantal analysis that can be performed with the ESR is assessment of rhoticity, or r-fulness. In most varieties of English, /r/ is characterized by lowering of F3. As is well known, however, the /r/ articulation can be lost, in which case a vowel sound such as schwa remains. Non-rhoticity usually occurs in pre-pausal and pre-consonantal contexts, though in AAE and old-fashioned white speech of the U.S. South, the process can be extended to positions before a vowel, such as in for a and carry. F3 tends to stay within a narrow range for most vowels (with a modest increase for high front vowels), so the kind of drop in F3 that typifies /r/ constriction is usually salient in spectrograms. As a result, F3 values that fall within the range of vowels indicate non-rhoticity, while F3 values lower than that of any vowel not adjacent to an /r/ indicate rhoticity. Figure 6 shows an example of the word mother from Boyd’s interview. It can be seen that F3 has about the same frequency in the second syllable as in the first, and impressionistically the token sounds non-rhotic. There certainly is no resonance corresponding to F3 that is close to the F2 resonance. For a rhotic pronunciation, F2 and F3 lie quite close to each other, often resembling a single, wide formant.

Figure 6. Spectrogram of the word mother with superimposed formant tracks. F3 values for the second syllable are no lower than those for the vowel in the first syllable, indicating a non-rhotic pronunciation.


Measurements of /l/ and /r/ are not the only kinds of formant analyses that can be performed with consonants. Measurements of formants at the transitions between consonants and vowels are useful for examining place of articulation, and the ESR are suitable for such analyses, even though they are more difficult than with cleaner recordings. However, the most commonly conducted formant analyses are those that are used to determine vowel quality. Vowel formant analyses certainly can be performed on the ESR. There are a few complications that arise, but there are strategies to deal with them.

The main problem for formant analysis of the ESR is the fact that higher frequencies were captured less well than lower frequencies. As noted already, the Library of Congress attempted to counteract this problem by enhancing the higher frequencies. Even before the enhancement, however, it was possible to extract formant readings, at least for F1 and F2. Figure 7 shows a sample spectrogram from one of the ESR – in this case, from the Laura Smalley recording – before the enhancement. The difference in amplitude between lower and higher frequencies is obvious. With recordings of this sort, three strategies can be effective. One strategy is to dispense with linear predictive coding (LPC) and estimate the formant values based on which harmonics have the highest amplitudes, as viewed in a narrowband power spectrum. LPC is the usual method for taking formant frequency readings today, and it is included in most spectrographic analysis packages – in fact, many students today do not know of any other method – but it is not the only method. Estimation from harmonic values can work well; see Thomas (2011: 46) for specific details. If one chooses to use LPC, there are two other strategies that can be utilized. One, effective if the drop in amplitude occurs in the range of 2-4.5 kHz, is to lower the analysis range for LPC. Thus, if one sets the upper limit of the formant readings to 4 kHz, none of the measured formant values will exceed 4 kHz. The default upper limit is usually 5 or 5.5 kHz. Of course, lowering the analysis range ordinarily requires lowering the number of LPC coefficients as well so that fewer formant readings will be taken. After all, there are fewer formants in the reduced range. The other LPC-specific method is to use different numbers of LPC coefficients for different formants. This practice is frowned upon for cleaner recordings because, in an LPC analysis, the different formant readings affect each other. Sometimes, however, especially with imperfect recordings, there is no other way to procure readings of all the desired formants. In the recording in Figure 7, for example, there is no single number of LPC coefficients that will yield good readings for all three of the lowest formants. A setting that gives a good reading for F1 will not pick up F3 or, for front vowels, F2. Conversely, in order to capture F2 and F3, the LPC coefficients have to be set so high that F1 appears to be split into two formants: i.e., two separate formant values will appear for what is in reality just one formant, F1.
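
As a rough illustration of the first, non-LPC strategy, the sketch below estimates a formant from harmonic amplitudes read off a narrowband power spectrum. The weighting scheme (an amplitude-weighted mean of the strongest harmonic and its stronger neighbor) is an assumption made here for illustration; it is not necessarily the procedure given in Thomas (2011: 46). The harmonic values and the search band are hypothetical.

```python
def estimate_formant_from_harmonics(harmonics, band_hz):
    """Estimate a formant from (frequency_hz, amplitude_db) pairs sorted by frequency.

    The strongest harmonic inside band_hz is located, and the estimate is the
    mean of that harmonic and its stronger neighbor, weighted by linear amplitude.
    """
    low, high = band_hz
    in_band = [i for i, (f, _) in enumerate(harmonics) if low <= f <= high]
    if not in_band:
        raise ValueError("no harmonics inside the search band")
    peak = max(in_band, key=lambda i: harmonics[i][1])
    neighbors = [i for i in (peak - 1, peak + 1) if 0 <= i < len(harmonics)]
    if not neighbors:
        return harmonics[peak][0]
    other = max(neighbors, key=lambda i: harmonics[i][1])
    # Convert dB to linear amplitude so that the weights are proportional.
    w_peak = 10 ** (harmonics[peak][1] / 20)
    w_other = 10 ** (harmonics[other][1] / 20)
    f_peak, f_other = harmonics[peak][0], harmonics[other][0]
    return (f_peak * w_peak + f_other * w_other) / (w_peak + w_other)


# Hypothetical harmonics of a voice with F0 = 200 Hz.
harmonics = [(200, 30.0), (400, 42.0), (600, 40.0), (800, 25.0), (1000, 20.0)]
f1_estimate = estimate_formant_from_harmonics(harmonics, band_hz=(200, 1000))
print(round(f1_estimate, 1))
```

The resolution of any such estimate is limited by the spacing of the harmonics, so it becomes coarser as F0 rises.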

Figure 7. Wideband spectrogram of a section of the Laura Smalley recording without enhancement. The amplitude drops off noticeably above 1000 Hz.

Vowel formant measurements can be used for a number of purposes. A common kind of display is an F1/F2 plot of a speaker’s entire vowel system, showing either the individual tokens or, as in Figure 8 for Boyd, the mean values of each vowel class. Plots of entire vowel systems are most useful for looking for shifts of particular vowels because the relative position of the vowels of interest can be compared with the rest of the system. For example, it can be seen in Figure 8 that Boyd shows no evidence of Southern Shift developments. The Southern Shift (e.g., Labov 1991, 1994) involves interchange of the positions of the FLEECE and KIT nuclei and of the FACE and DRESS nuclei, as well as fronting of the GOOSE/TOOT and GOAT nuclei.1 None of these mutations, with the possible exception of TOOT fronting, appears in Boyd’s speech. However, she does show the old-fashioned Virginia allophony of PRICE and PRIZE, in which the nucleus is higher before voiceless obstruents than before voiced obstruents (Kurath and McDavid 1961). It might be noted that she does not show much glide weakening of PRIZE, which characterizes the speech of younger generations of Southerners. Even though her PRIZE glide does not reach the level of her DRESS vowel, it is still strong enough to sound impressionistically like a full glide, at least to this author’s ears. Southerners who sound as if they have glide weakening or outright monophthongization of PRIZE show considerably less formant movement than Boyd does (see Thomas 2001 for examples). Certain other old features are apparent in her speech as well. She maintains distinctions between the BIN and BEN vowels and between the NORTH and FORCE classes. Many Southerners of younger generations merge each of those pairs.

Figure 8. Formant plot showing the mean values of the vowels of ex-slave Phoebe Boyd. Arrows indicate the gliding of diphthongs. Squares signify measurements 35 ms after the onset of the vowel, circles indicate measurements at the midpoint, and triangles represent measurements 35 ms before the offset.

1 The keywords used here are those created by Wells (1982), with additions when necessary. TOOT refers to instances of the GOOSE vowel that fall after a coronal consonant, such as two and do, but not before /l/. These tokens are omitted from the mean values designated as GOOSE in Figure 8.


Boyd does not show the fronting of the MOUTH/PROUD/HOW/DOWN complex that typifies Southern White speech. She shows what seems to be an archaic distribution of the diphthongs in this complex, however. The classic Virginia distribution, described, e.g., in Kurath and McDavid (1961), was a system in which the nuclei are higher before voiceless obstruents and lower before voiced consonants and word-finally. However, Lowman (1936) mentioned what was apparently an older distribution in which all tokens showed higher nuclei except those before or after /n/, as in down and now. The lowering would seem to have been related to the effects of nasality on F1 values, as discussed in Thomas (2001: 52-53). Lowman’s actual transcriptions often show higher glides associated with higher nuclei, i.e., [əu] vs. [æʊ]. Boyd’s system does not match the one that Lowman described, but it may have some properties in common. Before voiced obstruents, shown in Figure 8 as PROUD, she shows low nuclei but notably high glides. Before voiceless obstruents, designated as MOUTH, her nuclei are higher than for PROUD but her glides are not quite as high. These differences may have to do with the short durations that occur before voiceless consonants, and short durations have the effect of truncating diphthongs at their onset, offset, or both. In word-final positions, designated as HOW (based on tokens of the words how and now, the only such words in the recording), Boyd shows quite low nuclei and the lowest glides of any in the complex. Her tokens before nasals, designated as DOWN, show apparent raising of the nucleus and mild lowering of the glide. In nasal contexts, the oral F1 tends to become replaced by two nasal formants and the auditory impression often
does not seem to match what spectrograms show, creating a muddled situation. Could Boyd’s system have been the one that Lowman (1936) described? Lowman, of course, did not have access to modern acoustic techniques that could have provided greater precision for the phones he attempted to specify by ear.

Another kind of vowel formant analysis involves making a series of measurements through the course of each vowel token. This approach is used to examine the trajectory of a vowel in order to determine how diphthongal it is and, if it is a diphthong, which direction it glides. Figure 9 shows trajectories of twenty tokens of Boyd’s FACE vowel. For each trajectory, 1 indicates a point 1/10 of the duration from onset to offset, 2 a point 2/10 of the duration, and so forth. The actual onset (the 0/10 point) and offset (10/10) are not shown. Formant movement is evident for these tokens, but the dynamics are not the kind typical of diphthongs. Some of Boyd’s FACE tokens begin in the interior of the vowel envelope, move toward the perimeter, and then move back to the interior. This kind of formant movement is characteristic of consonantal transitions, with a single vocalic target approached most closely near the center of the vowel. These examples of Boyd’s FACE vowels come closest to their target when they are closest to the edge of the vowel envelope, though other vowels might have their targets in more interior positions. A true diphthong would show onset and offset values in different places, with a significant portion of the trajectory taken up by a decided rising, falling, backing, or fronting movement. A few tokens among Boyd’s FACE vowels show a steadily outward, and often upward, moving trajectory. Most of these tokens fall before a /k/, however (e.g., take, bake), and dorsal consonants such as /k/ are characterized by F2 transitions that make F2 higher at the offset than it is within the vowel. Thus, these tokens do not actually reflect a diphthongal trajectory. They illustrate the importance of taking the consonantal context into account before deciding whether a vowel can be fairly labeled as a diphthong. 
The acoustic evidence, then, corroborates the findings of Dorrill (1986), who found, based on the auditory transcriptions of the Linguistic Atlas of the Middle and South Atlantic States, that several vowels, including FACE, were more likely to be monophthongal for African Americans than for European Americans. See Thomas and Bailey (1998) for acoustic analyses showing that other ex-slaves had monophthongal or nearly monophthongal FACE and GOAT vowels.

Figure 9. Trajectories of twenty tokens of the FACE vowel as produced by Boyd.
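
The sampling scheme behind these trajectories, with measurement points at each tenth of the vowel's duration and the onset and offset themselves left out, can be sketched as follows. The onset and offset times are hypothetical, and the function name is chosen here for illustration only.

```python
def trajectory_times(onset_s, offset_s, steps=10):
    """Times at 1/steps, 2/steps, ... (steps-1)/steps of the vowel's duration.

    The onset (0/steps) and offset (steps/steps) are deliberately omitted.
    """
    duration = offset_s - onset_s
    if duration <= 0:
        raise ValueError("offset must follow onset")
    return [onset_s + duration * k / steps for k in range(1, steps)]


# Hypothetical vowel token lasting from 1.00 s to 1.20 s in the recording.
times = trajectory_times(1.00, 1.20)
for point, t in enumerate(times, start=1):
    print(point, round(t, 3))
```

Formant readings taken at these nine times give the numbered points of a trajectory. Because the points are proportional to duration, tokens of different lengths can be overlaid on one plot.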


One caveat that should be acknowledged about Boyd’s formant readings is that certain measurements may reflect some distortion. The main problem is in the F1 values for her high vowels, which appear a little greater in Figure 8 than they should be. For all but a few women, the high vowels usually exhibit F1 values lower than 500 Hz. The distortion seems to stem from the enhancement procedures employed by the Library of Congress, which appear to have damped the lower frequencies, thereby lowering F1 amplitudes and shifting the center frequencies of F1 upward. Fortunately, the positions of the vowels relative to each other are not affected, as all the high vowels and, to a lesser extent, the mid vowels are influenced in the same way. Although F3 and F4 are not shown in Figure 8, there was greater uncertainty about their measurements, especially those of F4, than about those for F1 and F2.

6 Analyses involving the fundamental frequency

The fundamental frequency, or F0, plays a minor role in certain consonantal and vocalic contrasts. In many languages, particularly in Africa and eastern Asia, it is used for lexical specification. However, its main use in Western languages is for intonation. Fortunately, F0 is well preserved in early recordings such as the ESR. In fact, it is one of the most robust phonetic properties of speech. Even when recordings fail to capture frequencies above 1 kHz or somewhat lower, vocal pulses and the lowest few harmonics are still evident and allow measurement of F0. As a result, the F0 contours that make up intonation are easily discernible. Only when the phonation becomes especially breathy or creaky does F0 become hard to measure.

Analysis of intonation requires some sort of transcription before any further procedures can be performed. At present, the standard method for transcribing intonation is the Tones and Break Indices, or ToBI, system (Beckman and Hirschberg 1994). ToBI requires construction of a textgrid with at least four tiers, including one for orthographic transcription of the speech sample and one for transcription of the various kinds of tones, including edge tones and pitch accents, that ToBI recognizes. Edge tones indicate the end of a prosodic phrase, of which there may be more than one kind (as in English), while pitch accents occur on some syllables to mark them as more prominent than syllables without a pitch accent. A sample utterance from the Boyd interview, with ToBI annotation, is shown in Figure 10. It exhibits a pattern that typifies many varieties of AAE to this day, one of high tones separated by clearly defined troughs as opposed to more gradual falls in F0. Every stressed syllable except those in the words I and out contains a pitch accent. This high density of pitch accents seems to be more common in present-day AAE than in present-day European American varieties, and its presence in the ESR suggests that it has existed in AAE for a long time.

Figure 10. An utterance from the Boyd interview with ToBI annotations. A pitch track with a scale from 80 to 270 Hz (black line) is superimposed on a narrowband spectrogram with a scale from 0 to 750 Hz. The series of H* annotations represent the pitch accents, while H- and L-H% are phrasal edge tones. Note the F0 troughs between the pitch accents.


Other kinds of analyses can be conducted once the ToBI transcription is complete. Some analyses simply involve computations of the frequency of different ToBI tonal designations. However, there are phonetic analyses that require numerical measurements. One example is the measurement of peak delay, which has to do with the position of the point of highest F0 of a tonal contour relative to the onset and offset of its host syllable. In English, stressed syllables serve as hosts, but some contours involve gradual rises that begin on the host syllable and reach their peak on a following unstressed syllable. Figure 11 shows a L+H* contour, the commonest rising contour in English, in Boyd’s speech. The host syllable is hold, but as can be seen, the peak falls on the next word, the unstressed us. The peak delay is calculated as the proportion of the duration of the host syllable at which the peak is found, which in this case, since the peak is past the end of the host syllable, is 1.09. Thus far, little work has been conducted comparing dialects of English for peak delay, but the ESR could provide invaluable evidence on early AAE if such research were to be undertaken. See Thomas (2011) for examples of other intonational metrics that can also be computed.

Figure 11. A pitch accent with a rising contour from the Boyd interview, showing the relevant features for computing the peak delay. A pitch track with a scale from 75 to 475 Hz (white line) is superimposed on a wideband spectrogram with a scale from 0 to 4000 Hz.
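
The peak delay calculation described above reduces to a single proportion, sketched below. The three time points are hypothetical values chosen so that the result matches the 1.09 reported for this token. They are not the actual measurements from the recording.

```python
def peak_delay(syllable_onset_s, syllable_offset_s, peak_time_s):
    """Position of the F0 peak as a proportion of the host syllable's duration.

    Values above 1.0 mean that the peak falls after the end of the host syllable.
    """
    duration = syllable_offset_s - syllable_onset_s
    if duration <= 0:
        raise ValueError("syllable offset must follow onset")
    return (peak_time_s - syllable_onset_s) / duration


# Hypothetical times: a 200 ms host syllable with the F0 peak 18 ms past its end.
delay = peak_delay(syllable_onset_s=2.000, syllable_offset_s=2.200, peak_time_s=2.218)
print(round(delay, 2))
```

Expressing the delay as a proportion instead of in milliseconds makes tokens with host syllables of different lengths comparable.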


7 Analyses of timing

Timing does not appear to be affected noticeably in the ESR, including the Boyd interview. Although it is impossible to know exactly how faithfully the recordings reflect the actual timing of Boyd’s utterances, and overall speeding or slowing is certainly possible, any such deviations are not evident when one listens to the recordings. It is also possible that the speed of the recordings increased or decreased between the beginning and end of each side of a disc, but quantitative measures would be required to detect such differences and they are not evident by ear, either. As a result, reasonably reliable analyses of rhythm and duration can be performed on the Boyd recording and other ESR.

Timing differences range from segmental-level durations to measures of a speaker’s overall rate of speech. On a segmental level, measurements can involve the intrinsic length of particular phonemes (e.g., Peterson and Lehiste 1960) and durational differences due to phonetic context, such as the well-known tendency for vowels to have shorter durations before voiceless obstruents than before voiced obstruents (e.g., House and Fairbanks 1953). For example, the tokens of the FLEECE vowel measured for Figure 8 have a mean duration of 86 ms, while those of the KIT vowel that were measured averaged only 46 ms. When only tokens before voiceless obstruents are included, FLEECE still shows longer durations than KIT, 57 ms to 40 ms.


At the opposite end of the timing spectrum, the overall rate of speech describes how quickly an individual speaks in general. As Kendall (2013) has shown, speech rate differs from individual to individual and dialect to dialect and thus can encode sociolinguistic meaning. One of the best measures of rate of speech is the articulation rate, which counts the number of syllables per second, excluding silent periods between utterances (Robb, Maclagan, and Chen 2004). One section of the Boyd interview has been analyzed in this way, and Boyd’s mean articulation rate for the 65 utterances with no ambiguous words is 5.1 syllables/second. This rate is about average for English, but faster rates are typically found in Romance languages (Arvaniti and Rodriquez 2013). Comparisons of articulation rate across different parts of the interview could also be conducted.
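
A minimal sketch of the articulation rate calculation follows. It assumes that each utterance has already been delimited so that its duration excludes the silent periods on either side, and it takes the mean of the per-utterance rates, which is one reasonable reading of "mean articulation rate" here. The syllable counts and durations are hypothetical.

```python
def utterance_rates(utterances):
    """Syllables per second for each (syllable_count, duration_s) pair."""
    for _, duration_s in utterances:
        if duration_s <= 0:
            raise ValueError("utterance durations must be positive")
    return [count / duration_s for count, duration_s in utterances]


def mean_articulation_rate(utterances):
    """Mean of the per-utterance rates; silences between utterances never enter."""
    rates = utterance_rates(utterances)
    return sum(rates) / len(rates)


# Hypothetical utterances: (number of syllables, duration in seconds).
sample = [(10, 2.0), (6, 1.5), (12, 2.0)]
print(utterance_rates(sample))
print(mean_articulation_rate(sample))
```

An alternative is to pool all syllables and divide by the total speaking time, which gives longer utterances more weight. The two approaches can yield slightly different figures.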

A more specialized way of examining speech rate is analysis of prosodic rhythm. Prosodic rhythm has to do with variations in the relative durations of segments. It is intended to describe how syllable-timed or stress-timed a language, dialect, or speaker is. Various methods compare durations of vocalic intervals, consonantal intervals, or both (see Thomas 2011). One commonly used method, nPVI, focuses on vocalic intervals. For each pair of adjacent vowels in an utterance, the absolute value of the difference in their durations is divided by the average of the two durations (Low, Grabe, and Nolan 2000). A sample utterance from the Boyd interview – the first part of the utterance from Figure 10 – is shown in Figure 12. In this excerpt, the nPVI calculations work as follows. The vowel in But is 89 ms long, while that in I has a duration of 70 ms. The difference of their durations, 19 ms, divided by their mean value, 79.5 ms, yields an nPVI score of 0.24. For the comparison between the vowels in I and was, nPVI=0.80; between the vowels in was and hired, nPVI=1.45; and between those in hired and out, nPVI=0.44. Because out falls before a phrase boundary and is subject to phrase-final lengthening, it could be excluded from the calculations. nPVI scores vary greatly from one pair of words to another, so a large number of them have to be taken for each speaker, after which a mean or median value can be obtained. Smaller scores indicate more syllable-timing and larger scores more stress-timing. For the section of Boyd’s interview analyzed for speech rate, there were 367 nPVI comparisons, with a mean value of 0.595 and a median value of 0.544. These figures represent a fairly stress-timed rhythm. Thomas and Carter (2006) used nPVI analyses of the ESR to argue that AAE has become more stress-timed than it once was.

Figure 12. An utterance from the Boyd interview with consonantal (C) and vocalic (V) intervals delineated for analysis of prosodic rhythm.
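The pairwise calculation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for the Boyd analysis: the function names are hypothetical, and only the But/I pair (89 ms and 70 ms) comes from the text. Scores are left unscaled, matching the values reported here, whereas the index is often published multiplied by 100.

```python
from statistics import mean, median


def pairwise_scores(vowel_durations_ms):
    """Score each pair of adjacent vowels: |d1 - d2| / mean(d1, d2)."""
    pairs = zip(vowel_durations_ms, vowel_durations_ms[1:])
    return [abs(a - b) / ((a + b) / 2) for a, b in pairs]


def summarize(scores):
    """Return the (mean, median) of a collection of pairwise scores."""
    return mean(scores), median(scores)


# The But / I pair from the text: 89 ms and 70 ms.
but_i = pairwise_scores([89, 70])
print(round(but_i[0], 2))  # 0.24
```

In practice the scores from many utterances would be pooled before calling `summarize`, and a phrase-final vowel could be dropped from each duration list beforehand to avoid the effect of phrase-final lengthening.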



8 Prospects

As can be seen, a great deal of acoustic analysis can be performed on the ex-slave recordings. Other researchers demonstrated over twenty years ago how important these recordings are for morphosyntactic variants. The ESR are clearly valuable for exploring some kinds of phonetic variables as well. Even though certain kinds of analyses are impossible to conduct on them, other kinds of analyses certainly can be performed. Analyses of vowel quality and prosody are quite doable.

The limitations of the ESR, in particular the impossibility of knowing how well the speakers in them represent African Americans born in the mid-nineteenth century, have been pointed out before. These problems, however, should not dissuade researchers from utilizing the ESR. The recordings certainly cannot be used to show that a particular feature was absent in nineteenth-century AAE, and they are not easily used to demonstrate that any feature was predominant. However, their chief value is for proving that certain features occurred in AAE at that time. They can show that some features that are present today are old and that features no longer extant occurred at one time. For vowel quality, they can be used to corroborate linguistic atlas records, which include some African American subjects from the same generation. The presence of monophthongal FACE and GOAT vowels in both the linguistic atlas records and the ESR testifies to this value. For prosody, there is no other record at all of African American features of that time. The ESR are the only window available on whether prosodic patterns found in AAE today existed in mid-nineteenth-century AAE. Evidence of this sort can shed new light on how AAE might have originated by shifting attention from the already heavily studied morphosyntactic variables to lesser known phonetic variables.

The methods illustrated here for analyzing these variables can be applied to other early recordings of other dialects and languages as well. It is my hope that other researchers will see that early recordings are indeed suitable for these techniques. It would be a shame to waste the insights that early recordings can provide just because someone considers their sound quality too poor for any modern acoustic methods.

References

Arvaniti, Amalia, and Tara Rodriquez 2013. The role of rhythm class, speaking rate, and F0 in language discrimination. Laboratory Phonology 4: 7-38.

Bailey, Guy, Natalie Maynor, and Patricia Cukor-Avila (eds) 1991. The Emergence of Black English: Text and Commentary. Creole Language Library 8. Amsterdam/Philadelphia: John Benjamins.

Beckman, Mary E., and Julia Hirschberg 1994. The ToBI annotation conventions. Online typescript. http://www.ling.ohio-state.edu/~tobi/ame_tobi/annotation_conventions.html

Brewer, Jeutonne 1991. Songs, sermons, and life-stories: The legacy of the ex-slave narratives. In: Bailey et al. (eds), pp. 155-189.

Butters, Ronald R., and Ruth A. Nix 1986. The English of Blacks in Wilmington, North Carolina. In: Michael B. Montgomery and Guy Bailey (eds), Language Variety in the South: Perspectives in Black and White. Tuscaloosa: University of Alabama Press, pp. 254-63.

Cukor-Avila, Patricia 2001. Co-existing grammars: The relationship between the evolution of African American and White Vernacular English in the South. In: Sonja L. Lanehart (ed.) Sociocultural and historical contexts of African American English. Varieties of English around the World, General Series 27. Amsterdam: John Benjamins, pp. 93-127.

Dorrill, George T. 1986. Black and White Speech in the South: Evidence from the Linguistic Atlas of the Middle and South Atlantic States. Bamberger Beiträge zur Englischen Sprachwissenschaft 19. New York: Peter Lang.

Holm, John 1991. The Atlantic creoles and the language of the ex-slave recordings. In: Bailey et al. (eds), pp. 231-248.

House, Arthur M., and Grant Fairbanks 1953. The influence of consonant environment upon the secondary acoustical characteristics of vowels. Journal of Speech and Hearing Research 5: 38-58.

Kendall, Tyler S. 2013. Speech Rate, Pause, and Sociolinguistic Variation: Studies in Corpus Sociolinguistics. Basingstoke, U.K.: Palgrave.

Kurath, Hans, and Raven I. McDavid, Jr. 1961. The Pronunciation of English in the Atlantic States. Ann Arbor: University of Michigan Press.

Labov, William 1991. The three dialects of English. In: Penelope Eckert (ed.), New Ways of Analyzing Sound Change. New York: Academic Press, pp. 1-44.

Labov, William 1994. Principles of Linguistic Change. Volume 1: Internal Factors. Language in Society 20. Oxford, U.K./ Malden, MA: Blackwell.

Low, Ee Ling, Esther Grabe and Francis Nolan 2000. Quantitative characterizations of speech rhythm: Syllable-timing in Singapore English. Language and Speech 43: 377-401.


Lowman, Guy S., Jr. 1936. The treatment of /au/ in Virginia. In: Daniel Jones and D. B. Fry (eds), Proceedings of the Second International Congress on Phonetic Sciences. Cambridge: Cambridge University Press, pp. 122-125.

Mufwene, Salikoko S. 1991. Is Gullah decreolizing? A comparison of a speech sample of the 1930s with a sample of the 1980s. In: Bailey et al. (eds), pp. 213-230.

Myhill, John 1995. The use of features of present-day AAVE in the ex-slave recordings. American Speech 70: 115-147.

Nichols, Patricia C. 1986. Prepositions in Black and White English of coastal South Carolina. In: Michael B. Montgomery and Guy Bailey (eds), Language Variety in the South: Perspectives in Black and White. Tuscaloosa: University of Alabama Press, pp. 73-84.

Pederson, Lee 1965. The Pronunciation of English in Metropolitan Chicago. Publication of the American Dialect Society 44. Tuscaloosa: University of Alabama Press.

Peterson, Gordon E., and Ilse Lehiste 1960. Duration of syllable nuclei in English. Journal of the Acoustical Society of America 32: 693-703.

Poplack, Shana, and Sali Tagliamonte 1991. There’s no tense like the present: Verbal –s inflection in early Black English. In: Bailey et al. (eds), pp. 275-324.

Poplack, Shana, and Sali Tagliamonte 2001. African American English in the Diaspora. Language in Society 30. Oxford, UK/ Malden, MA: Blackwell.

Rickford, John R. 1991. Representativeness and reliability of the ex-slave narrative materials, with special reference to Wallace Quarterman’s recording and transcript. In: Bailey et al. (eds), pp. 191-212.

Robb, Michael P., Margaret A. Maclagan, and Yang Chen 2004. Speaking rates of American and New Zealand varieties of English. Clinical Linguistics and Phonetics 18: 1-15.

Singler, John Victor 1991. Liberian Settler English and the ex-slave recordings: A comparative study. In: Bailey et al. (eds), pp. 249-274.

Sutcliffe, David 2001. The voice of the ancestors: New evidence on nineteenth-century precursors to twentieth-century African American English. In: Sonja L. Lanehart (ed.) Sociocultural and historical contexts of African American English. Varieties of English around the World, General Series 27. Amsterdam: John Benjamins, pp. 129-168.

Sutcliffe, David 2003. African American English suprasegmentals: A study of pitch patterns in the Black English of the United States. In: Ingo Plag (ed.) Phonology and Morphology of Creole Languages. Linguistische Arbeiten 478. Tübingen: Max Niemeyer Verlag, pp. 147-162.


Thomas, Erik R. 2011. Sociophonetics: An Introduction. Basingstoke, U.K./New York: Palgrave.

Thomas, Erik R., and Guy Bailey 1998. Parallels between vowel subsystems of African American Vernacular English and Caribbean creoles. Journal of Pidgin and Creole Languages 13: 267-296.



Thomas, Erik R., and Phillip M. Carter 2006. Rhythm and African American English. English World-Wide 27: 331-355.

Wald, Benji 1995. The problem of scholarly predisposition: G. Bailey, N. Maynor, and P. Cukor-Avila (eds), The Emergence of Black English: Text and Commentary. Language in Society 24: 245-257.


Wolfram, Walt 1969. A Sociolinguistic Description of Detroit Negro Speech. Washington, DC: Center for Applied Linguistics.

Wolfram, Walt, and Erik R. Thomas 2002. The Development of African American English. Language in Society 31. Oxford, UK/ Malden, MA: Blackwell.


15 Archival data on Earlier Canadian English

Charles Boberg

1 Introduction

This chapter will examine what archival data can tell us about the phonetics and phonology of earlier Canadian English. It begins with introductory remarks on the study of the English language in Canada, then proceeds to both an auditory-impressionistic and an acoustic-phonetic analysis of the speech of Canadian veterans of the First World War, recorded in the 1960s. These analyses provide data bearing on the age, origins and development of several widely-known features of modern Canadian English.

The history of the English language in Canada now goes back more than four centuries, to the establishment of Newfoundland as Britain’s first colony in the Americas, in 1583. Since that time, English has become the dominant language of modern Canada, making it one of the world’s main English-speaking countries. It is now home to over 19 million native-speakers of English, accounting for about 59 percent of its total population of over 34 million (Statistics Canada 2011). Of the remainder, 22 percent speak French, with which English shares official status, while 19 percent speak a wide array of non-official languages. These include both indigenous languages, like Cree, Inuktitut and Ojibway; and immigrant languages, like Chinese, Panjabi, Spanish, Italian, German, Tagalog and Arabic. Though French is dominant in the province of Quebec and non-official languages have achieved local dominance in certain communities and districts, English is dominant everywhere else, being the language most often spoken at home by 65 percent of the population in the country as a whole and by 82 percent outside Quebec.

The history of English in Canada, explored in detail in Boberg (2010) as well as in shorter treatments like Avis (1973) and Chambers (2006a), can be roughly divided into the following four phases.

1) Foundation (1583-1783). English is planted by fishermen, traders and colonists in Newfoundland, in small fur trading outposts around Hudson’s Bay, in Nova Scotia and, following the British conquest of New France in 1759-63, in Quebec.

2) Growth and Expansion (1783-1883). Canada’s English-speaking population receives major additions: first from an influx of about 40,000 United Empire Loyalists from the former American colonies following the American Revolution; then from a more gradual immigration of close to a million people from Ireland, England and Scotland during the early nineteenth century. At the confederation of Ontario, Quebec, New Brunswick and Nova Scotia as the autonomous Dominion of Canada in 1867, the country is two-thirds English-speaking.

3) Consolidation (1883-c. 1983). Following the opening of the trans-continental Canadian Pacific Railway, which reaches the Rocky Mountains in 1883, the western half of the country is effectively opened for agricultural settlement. This creates a land boom that attracts several million settlers from eastern Canada, the United States, Britain and Europe to the Prairie region and British Columbia. In most places, ex-Ontarians are the most numerous and influential component of this migration. Canadian English now extends “from sea to sea”.

4) Globalization and Change (c. 1983-present). As a result of changes in communications technology brought about by cable and satellite television and by the rapid development of personal computers and the internet, Canadian English gains increasing exposure to influence from abroad, particularly from American English but also from global Anglophone culture.

Given that 86 percent of Canada’s mother-tongue English-speakers now live in the five western and central provinces from British Columbia to Ontario, the key phase in the development of Canada’s current linguistic landscape was the third or ‘consolidation’ phase, of the late nineteenth and early twentieth centuries. During this period, Ontario English, firmly established during the previous phase as a fairly homogeneous dialect of North American English with significant influence from several British dialects, was transplanted, with some modifications and other influences, to western Canada, thereby producing modern Canada’s main block of English-speakers. Moreover, whereas data on the first two phases of the history of Canadian English are restricted to written sources and therefore have relatively little to say about phonology and phonetics, it is also during the third phase that the development of sound recording technology gives us our first records of what earlier Canadian English sounded like. Canadian archives, such as those established by the federal government (Library and Archives Canada, accessible at http://www.collectionscanada.gc.ca), as well as by provincial and municipal governments, by universities and museums, and by other institutions and corporations, hold many such records, from folk songs to oral histories to speeches by public figures. Unfortunately, the vast majority of this material is only accessible on-site; while digitization of much of it is now complete or underway, almost none of it has so far been made available to off-site researchers via the Web. A survey of the largest federal and provincial archives made in preparing this chapter confirmed this state of affairs at the time of writing. Perhaps partly because of this limited accessibility, very little archival material has been used in past linguistic studies of Canadian English. 
These have concentrated first on written surveys of variation in contemporary vocabulary, grammar and pronunciation (e.g., Avis 1954-1956; Scargill and Warkentyne 1972; Chambers 1998a, b), then on oral sociolinguistic surveys of modern urban speech (e.g., Clarke 1991; Woods 1991; Gregg 1992; Boberg 2004a; Tagliamonte 2006). With a few exceptions, discussions of the state of Canadian English before the mid-twentieth century, especially with regard to phonology and phonetics, have therefore been largely speculative. Among the exceptions are studies by Thomas (1991), who explores the origin of Canadian Raising in Ontario using the field records of interviews with Ontario informants made in the 1930s for American Linguistic Atlas projects; by Dollinger (2008), who investigates the development of modal auxiliaries in late eighteenth- and nineteenth-century Canadian English,


using newspapers, diaries and letters in the Corpus of Early Ontario English; by Baxter (2010), who examines the origins of the merry-marry merger in Eastern Townships Quebec English by carrying out an acoustic analysis of archival interviews with informants born between 1895 and 1915; and by Denis (forthcoming), who seeks evidence of the early development of discourse-pragmatic markers in Ontario English in two corpora of oral history made in the 1970s and 80s, in which the oldest speakers were born in 1879. The present chapter offers a contribution to this small but growing tradition of archival research on earlier Canadian English, using one of the few corpora of archival data on late-nineteenth and early-twentieth-century speech patterns that is available on-line: a set of interviews with Canadian World War I veterans, called Oral Histories of the First World War: Veterans 1914-1918. This ‘web exhibition’, presented by Library and Archives Canada in partnership with Veterans Affairs Canada and the Canadian Broadcasting Corporation (CBC), is based on a CBC radio broadcast called In Flanders Fields, aired in 1964-1965. It featured one-on-one interviews with veterans of the Canadian Expeditionary Force, in which they recall experiences relating to the Battles of Ypres, Vimy Ridge, the Somme and Passchendaele, trench warfare and the war in the air. Excerpts from the interviews lasting a few minutes each are available at: http://www.collectionscanada.gc.ca/first-world-war/interviews/index-e.html. World War I was a tragic episode in Canada’s history, as in Europe’s, but was also an important turning point in Canada’s development from a British colony into an independent nation. When the war began in 1914, Canadian participation in the British war effort, like that of the other British dominions, was taken for granted and Canadians served under British commanders as part of an integrated imperial force. 
Canada ultimately sent 620,000 personnel overseas, from a population of less than eight million; of these, almost 70,000 were killed and over 170,000 wounded, a number that does not include the thousands more who were left physically intact but afflicted by ‘shell shock’, now recognized as post-traumatic stress disorder (Granatstein and Oliver 2011: 85, 102). This appalling sacrifice is still remembered today, in long lists of casualties on the walls of every turn-of-the-century church and school in Canada and with cenotaphs and memorial buildings at the heart of almost every Canadian town, as well as in the well-known memorial poem, In Flanders Fields, written by Canadian Lt. Col. John McCrae. Canadian units served effectively in many of the war’s most important engagements, most famously at Vimy Ridge, and became known for toughness and reliability, a “matchless fighting record” (ibid.: 482). As the war dragged on, many Canadians felt the country’s military service demanded greater recognition and autonomy. By the end of the conflict, Canadian units were reorganized under a separate Canadian Corps, which was commanded for the first time by a Canadian general, Sir Arthur Currie, and the Prime Minister, Sir Robert Borden, pushed successfully for Canada to be recognized as an independent signatory to the Treaty of Versailles and as a founding member of the League of Nations. Though Canada would not attain full legislative autonomy until the Statute of Westminster in 1931, many Canadian historians regard its participation in the First World War as a kind of coming-of-age that allowed Canadians to think of themselves for the first time as an independent nation, related to but distinct from Britain. Granatstein and Oliver point out that “the efforts of the


soldiers … had allowed Borden to press for increased autonomy for Canada within the British Empire…. The colony of 1914 was gone, in its place a new nation…” (2011: 482). Morton remarks that “Vimy Ridge was a nation-building experience. For some, then and later, it symbolized the fact that the Great War was also Canada’s war of independence…” (1985: 145). One of many facets of that independence was linguistic: the fact that the English of most Canadians was by this time clearly distinct from that of most Britons contributed to Canadians’ sense of their own cultural identity. The oral histories of those who experienced the war first-hand, then, have a double significance: both as traumatic personal stories of terror and bravery; and as testimonies to a crucial phase of the development of Canadian English, part of the larger story of Canada’s evolution toward nationhood and a distinct identity. From the sociolinguistic point of view, veterans’ reminiscences are the ultimate “danger-of-death” narratives, prized by Labov (1972: 92-94) as the closest approximation in an interview setting to the ‘vernacular’, the way people speak when they are not conscious of observation, and therefore the best source of undistorted linguistic data. In terms of the history of Canadian English, they preserve a record of what the language may have sounded like in the 1890s, the decade in which most First World War soldiers were born and in which Canadian English was being transplanted from Ontario to the West. This potentially extends our view of key variables in the sound patterns of Canadian English half a century earlier than the first systematic studies of those variables in contemporary speech in the 1950s, thereby allowing us to move at least tentatively beyond speculation in our assessment of the age and origins of these features. 
This record must be interpreted with caution, of course, given that it was set down in the 1960s, some sixty or seventy years after the speech patterns of the 1890s and early 1900s would have been acquired by the veterans as children. The question naturally arises: is this a record of the 1890s, or of the 1960s, or of some hopelessly blended amalgam of these two periods and everything in-between? The ‘apparent-time hypothesis’ allows us to interpret the speech of older generations as evidence of earlier periods in the history of the language by assuming that speech patterns are largely stable over the lifetimes of individuals (Labov 1963), an assumption that has been questioned by recent research (e.g., Boberg 2004b; Sankoff and Blondeau 2007; Wagner 2012). The current consensus in this debate seems to be that many changes in progress involve a mixture of community change – implying stability over individual lifetimes – and individual change. The latter may involve older individuals in some cases becoming more conservative as they age, but in other cases advancing the rate of community change by adopting new features that they hear in the speech of younger individuals. Late adoption of new features is more likely to happen when those features are isolated elements like new words, or pronunciations of words, rather than the more abstract and systematically integrated core properties of phonology or syntax (Boberg 2004b: 265). Apparent-time data, then, are far from worthless, but must be taken, in the absence of confirmation from real-time data, as suggestive rather than conclusive evidence of earlier linguistic patterns. In dealing with the heritage of mechanically recorded speech, which only begins around the 1890s and does not become widespread until much later, we are compelled to rely to some extent on these suggestive data if we wish to know anything at all about the phonetics of the 1890s: as Labov has


famously remarked, much of historical linguistics involves “making the best use of bad data” (1994: 11).

2 Method

In an initial approach to the Oral Histories of the First World War corpus, it was immediately obvious that, of the 20 excerpts available, ten were from interviews with men who would be of little value as representatives of Canadian English. At a first impression, Curtis, Stevens and Wakeman appeared to be from England; Leckie and Mitchell from Scotland; and Uprichard from Ireland; Francoeur and Lindsay were Francophones and Lasner and Wiseman sounded like European immigrants with second-language English. While these ten speakers were excluded from further analysis on this basis, their presence is itself an interesting piece of data about Canadian English in the early twentieth century, and specifically in the experience of First World War soldiers, who would often have had their first important contacts with people from outside their own communities during their military service: though Canadian English was already the dominant type of speech in early twentieth-century Canada, it was by no means alone, and was therefore subject to possible influence from the other types of speech with which it was in contact. This is particularly true of the British dialects spoken by the most recent wave of immigrants from Britain. The western land boom following the completion of the Canadian Pacific Railway had fuelled a renewed surge of immigration to Canada, much of which continued to come from Britain, by this point more from England than from Ireland; in the last years before the war, around 150,000 British immigrants arrived every year, a significant number for a country of eight million (Boberg 2010: 89).
Many of these displaced Britons were the first to enlist when the war began, perhaps feeling a closer affinity to Britain’s cause than many native-born Canadians (especially French-Canadians: another legacy of the war was its Conscription Crisis, which pitted largely pro-conscription English-Canadians against largely anti-conscription Francophones). Granatstein and Oliver report that two thirds of the first contingent of the Canadian Expeditionary Force were British-born immigrants; this declined to over one third by 1918, still a substantial proportion (2011: 481). Canadian soldiers were therefore exposed not only to the British speech of their British commanders and comrades in Europe, but also to that of men in their own units who may have become close friends, as well as that of neighbors in the communities they came from back in Canada. Though Canadian linguistic patterns were apparently well established by the 1890s, as will be demonstrated below, the possibility of continued British influence on Canadian English during this period should not be discounted; on the other hand, exposure to these British dialects may have increased native Canadians’ consciousness of their own identity as North Americans. Even after the non-Canadian-English group was excluded, the remaining set of interviews included two (with Barnes and Dodds) that seemed to feature two speakers, so that individual identities were questionable. As these were also very short excerpts, they were likewise excluded from further analysis. This left a set of eight excerpts, two of which represented the same individual (Cooper), so the following analysis is based on the speech of seven veterans: Wallace Carroll; H.S.


Cooper; W.H. Joliffe; Vick Lewis; D.M. Marshall; J.R. McIlree; and E.S. Russenholt. Unfortunately, without a more extensive search, it was not possible to establish much background information about most of these men: a related Library and Archives Canada site provides a searchable database of the Canadian Expeditionary Force, but most of the individuals in question could not be found among its records, and even when they could, the records did not always include the attestation papers that soldiers filled out when they enlisted, containing such important data as birth date and home town. Nevertheless, one of the veterans, Russenholt, became a prominent citizen after the war, so a good deal is known about him (Manitoba Historical Society 2013). Edgar Stanford Russenholt (1890-1991) was born at Uxbridge, Ontario, northeast of Toronto, but moved with his family to Manitoba in 1899 to homestead, or farm, in the Swan River Valley. During the War, he served in the forty-fourth Battalion, the Royal Winnipeg Rifles, and received a battlefield commission. After the war he pursued a varied career at the provincial hydro-electric company, in broadcasting and as a conservation activist. Russenholt, then, perfectly represents the westward spread of Ontario English discussed above: he and his family would have been bearers of Ontario influence during the foundational period of the West. Russenholt himself would have been about nine years old when he arrived in Manitoba, with Toronto-region linguistic patterns well established. We cannot, of course, say with any certainty what happened thereafter in terms of the kind, quantity or direction of linguistic influence among his social group during his later childhood and adolescence, but it is tempting to imagine that he may have retained some of his Ontario linguistic features and acted as a model of Canadian English for those around him, who may have been less well-rooted in the Anglo-Canadian linguistic milieu.
Of the other two veterans who are analyzed acoustically below, Marshall was a lieutenant in the same battalion as Russenholt, so would likely have been a fellow Manitoban, while McIlree appears to have been a graduate of Trinity College School in Port Hope, Ontario (class of 1907), who served as a captain in the seventh Battalion of the Canadian Infantry (Trinity College 2013: 19). That Battalion was based in and recruited mostly from British Columbia, suggesting that he, like Russenholt, was an Ontarian who went west between his graduation from high school and enlisting in the army, or that his family moved west at an earlier point and sent him back to Ontario for school (Trinity is a private boarding school). As for the other men in the sample, in the absence of more data it can at least be stated, judging by their speech, that they all grew up in Canada and, judging by their participation in the First World War and the date of their interviews, that they would have been born between about 1885 and 1900; judging by their last names, they also represent what was then the large majority of English-speaking Canadians in being of British ethnic ancestry.

The speech of this small corpus was analyzed in two ways. First, all seven of the interviews were subjected to an auditory-impressionistic analysis that examined eight phonological or phonetic variables that have been well studied in previous research and have come to characterize modern Canadian English (four of these can be combined into two general processes, making six headings below):

1) Canadian Raising of /aw/ and /ay/, or MOUTH and PRICE, producing non-low vowels before voiceless obstruents (Joos 1942; Chambers 1973, 2006b; Thomas 1991; Labov, Ash and Boberg 2006: 221-222; Boberg 2008, 2010: 149-151, 156-157);

2) the Canadian Shift, or lowering and retraction, of /æ/ and /e/, or TRAP and DRESS (Clarke, Elms and Youssef 1995; Boberg 2005; Labov, Ash and Boberg 2006: 130, 219-221; Boberg 2008, 2010: 146-147, 155-157);

3) the advancement or fronting of /uw/, or GOOSE, to central or front position (Labov, Ash and Boberg 2006: 101, 153; Boberg 2008, 2010: 151-152);

4) the flapping of /t/ (Gregg 1957: 25; Scargill and Warkentyne 1972: 58; Woods 1999: 78-86; Gregg 2004: 17-40; Boberg 2010: 135-136, 157-158);

5) the deletion of [j] after coronals in new, student, tunic, etc. (Avis 1956: 48; Scargill and Warkentyne 1972: 51-52; Clarke 1993; Chambers 1998a: 17-19, 1998b: 235-244; Woods 1999: 93-96; Gregg 2004: 46-48; Clarke 2006; Boberg 2010: 134-135, 157-158);

6) the deletion of [h] in wh-words (wheel, when, while, etc.; Avis 1956: 53; Gregg 1957: 25; Scargill and Warkentyne 1972: 71; Chambers 1998a: 26; Woods 1999: 138; Gregg 2004: 49; Boberg 2010: 124-125, 157-158).

The results of this analysis are presented in Table 1, below. Though Canadian Raising and the Canadian Shift were examined impressionistically, as described above, they can be more accurately assessed with acoustic analysis, which produces objective data on formant frequencies that can be compared with other sociophonetic studies. The three longest interview excerpts – those with Marshall, McIlree and Russenholt – were therefore selected for acoustic analysis. These were downloaded, converted to .wav format, and analyzed acoustically in Praat (Version 5.3.04; Boersma and Weenink 1992-2012), using measurement techniques more or less identical to those of Labov, Ash and Boberg (2006) and Boberg (2010). A spectrogram with linear predictive coding analysis was made of each excerpt and the most prominent words (those bearing the main phrasal stress) were selected for measurement of the first and second formants (F1 and F2, indicating vowel height and advancement, respectively) near the mid-point of the main stressed vowel, usually at the maximal value of F1. This produced 324 measurements for Marshall, 294 for McIlree and 270 for Russenholt, enough in each case to give an overall view of the outline of the vowel space, as well as the position within it of most of the phonemes and some of their most important allophones. The raw formant data were then normalized, to make them more directly comparable with other studies, using the Constant Log Interval Hypothesis version of the additive point system set forth in Nearey (1978). Since there were only three speakers, their scaling factors were calculated using the group average of the Phonetics of Canadian English sample of 86 individuals analyzed in Boberg (2008). The PCE group mean of F1 and F2 values taken together was 1119 Hz, of which the natural log is 7.02 (ibid.: 134); scaling factors for the veterans analyzed here were therefore 1.03 for Marshall, 1.18 for McIlree and 1.16 for Russenholt. 
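The scaling step described above can be illustrated with a minimal sketch. It assumes that each speaker's scaling factor is the exponential of the difference between the reference group's log mean (7.02, the natural log of 1119 Hz) and the log of that speaker's own pooled F1/F2 mean. The text does not spell out this formula, so the sketch is one plausible reading of the procedure rather than the author's exact computation. The function names and the toy formant values are hypothetical and are not drawn from the recordings.

```python
import math

# Reference value reported in the text: PCE group mean of pooled F1 and F2.
PCE_GROUP_MEAN_HZ = 1119.0
PCE_GROUP_LOG_MEAN = math.log(PCE_GROUP_MEAN_HZ)  # about 7.02


def speaker_log_mean(formants_hz):
    """Natural log of the mean of a speaker's pooled F1 and F2 values (assumed)."""
    return math.log(sum(formants_hz) / len(formants_hz))


def scaling_factor(formants_hz, group_log_mean=PCE_GROUP_LOG_MEAN):
    """Assumed form: exp(group log mean - speaker log mean)."""
    return math.exp(group_log_mean - speaker_log_mean(formants_hz))


def normalize(formants_hz, factor):
    """Multiply every raw formant value by the speaker's scaling factor."""
    return [f * factor for f in formants_hz]


# Hypothetical pooled F1/F2 measurements (Hz) for one invented speaker.
toy_formants = [450.0, 1900.0, 700.0, 1500.0, 500.0, 1000.0]
toy_factor = scaling_factor(toy_formants)
toy_normalized = normalize(toy_formants, toy_factor)
```

Under this reading, a factor above 1, such as the 1.18 reported for McIlree, indicates a speaker whose raw formant values lie on average below those of the reference group.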
The normalized data on these speakers is presented in Figures 1-3, below.

3 Results

3.1 Auditory-impressionistic analysis

The main results of the auditory-impressionistic analysis of the above-described sample are shown in Table 1. Since the data on any one speaker is limited by the length of the excerpts and the frequency of occurrence of the variables in question, inter-speaker variation will be set aside, in preference for the aggregate pattern of the group, shown in the ‘total’ row on the bottom of Table 1.

Table 1. Auditory-impressionistic coding of 7 Canadian-sounding World War I veterans for 8 phonological variables (see key for explanation of variables; nd = no data).

Name              CR-aw  CR-ay  CS-æ   CS-e   Adv.-uw  Flap-t  Del.-j  Deasp.-wh
Carroll, Wallace  4/5    3/4    1/6    0/5    0/2      7/9     nd      nd
Cooper, H.S.      4/4    1/1    2/12   0/1    0/1      17/22   1/2     1/2
Joliffe, W.H.     1/1    1/1    0/10   0/4    0/1      9/13    nd      1/1
Lewis, Vick       3/3    4/4    0/7    0/2    0/1      3/3     nd      0/3
Marshall, D.M.    2/2    5/5    0/13   0/5    0/4      15/16   3/3     nd
McIlree, J.R.     3/4    5/7    8/20   0/4    1/6      2/4     0/5     2/4
Russenholt, E.S.  5/5    4/4    0/15   0/6    0/7      20/23   2/2     0/3
TOTAL (n)         22/24  23/26  11/83  0/28   1/22     74/90   6/12    4/13
%                 92%    88%    13%    0%     5%       82%     50%     31%

Key to variables:
CR-aw = Canadian Raising of /aw/ (MOUTH), n audibly raised above low position.
CR-ay = Canadian Raising of /ay/ (PRICE), n audibly raised above low position.
CS-æ = Canadian Shift of /æ/ (TRAP), n audibly retracted and lowered to [a].
CS-e = Canadian Shift of /e/ (DRESS), n audibly retracted and lowered toward [a].
Adv.-uw = Advancement of /uw/ (GOOSE), n advanced to high-central position.
Flap-t = Flapping of /t/ in city, better, forty, twenty, shelter, etc.: n flapped.
Del.-j = Deletion of [j] after coronals in new, student, tunic, etc.: n deleted.
Deasp.-wh = Deletion of [h] in wh-words (wheel, when, while, etc.): n deleted.

The first two columns show the frequency of Canadian Raising of /aw/ and /ay/ (MOUTH and PRICE). This is virtually categorical: there is no indication of inter-speaker variation, a strong confirmation of Thomas’ conclusion that “the date of origin of Canadian raising in Ontario must be pushed back as far as 1880” (1991: 162). By contrast, there is little evidence of the Canadian Shift operating at this time, suggesting it is a newer development, perhaps beginning in the mid-twentieth century: impressionistic analysis found /æ/ and /e/ (TRAP and DRESS) to be in their pre-shift positions, as low-front or lower-mid-front and upper-mid front vowels, respectively, again with no indication of inter-speaker variation. Likewise, the advancement or fronting of /uw/ (GOOSE), from its original position as a high-back vowel to its current position as a high-central or even high-front vowel, has barely begun in this corpus: most tokens of /uw/ are still in the upper-back quadrant of the vowel space.

On the other hand, Table 1 suggests that the flapping of /t/, like Canadian Raising, was already normal at the end of the nineteenth century, though it shows somewhat more variation. One of the few instances where flapping is absent where it would be expected today is the common phrase at all. In these data it is still syllabified as /ə.ˈtohl/, as in modern British English, with an aspirated syllable-initial /t/, whereas modern Canadian English has shifted to /æt.ˈohl/, with a flapped /t/. Other instances are in longer, more formal words, such as figuratively (Cooper), participated (Joliffe) or appreciated (Russenholt), or one syllable removed from post-tonic position, as in billeted (Carroll). However, as in modern Canadian English, flapping already extends to many post-nasal instances, such as twenty (though the /t/ is usually voiced, as twendy, rather than completely deleted, as twenny), and even to several instances after /l/, as in casualties and shelter (Marshall). There was also a limited amount of inter-speaker variation in flapping, as between Marshall, who shows almost categorical application, and Joliffe, who does not flap in skirted, started, or skirting, all normal flapping contexts today.

Palatal glide deletion is much more variable than flapping in these data, as it still is today, at least based on the small set of tokens that could be analyzed in this corpus. Marshall and Russenholt show glide deletion in reduced (Marshall) and knew (Russenholt), whereas McIlree retains the glide in tunic, knew (three instances) and newly. This variation matches the data from Chambers’ older participants (1998a: 19; 1998b: 242), as well as Boberg’s view that glide loss was already underway at the beginning of the twentieth century (2010: 135).
There is also some evidence in Table 1 of an early origin for the deaspiration of /hw/ in wh-words, though this evidence is again of limited quantity and reveals a good deal of variation. Lewis retains aspiration in why and wheel, as does Russenholt in while and which, whereas McIlree varies even with the same word (where). The auditory-impressionistic analysis also turned up a number of more isolated observations that nonetheless have something to say about the British influence discussed above. A number of words and pronunciations that would today be thought of as anglicisms occur in these narratives: for instance, Cooper and McIlree have lorry for truck; Cooper, Marshall and Russenholt have chap(s) for guy(s); Russenholt also uses lad for guy and pronounces schedule in the British way, with /ʃ-/ rather than /sk-/, a variant now rare in Canada (Boberg 2010: 142). Given the British-dominated military experience of the veterans, it is hardly surprising that words like lorry and chaps should have entered their vocabulary; this cannot, of course, be taken as evidence that such words would have been heard in civilian life back in Canada. Another feature that might be thought of as British is the retention of the contrast between /æ/ and /e/ (TRAP and DRESS) before intervocalic /r/, as in marry v. merry (Baxter 2010; Boberg 2010: 133), which today survives only in Montreal English and variably in Newfoundland. The speech of the veterans analyzed here, none of whom has a known connection with either of these places, suggests that the marry-merry contrast was once more common in Canada: Cooper has an unmerged quality in carried and Lewis and Russenholt in barrel(s).



Finally, we also find evidence of features that are thought of today as typically Canadian, such as pronouncing again to rhyme with gain (Cooper and McIlree) and using /æ/ (the TRAP vowel) rather than /ah/ (the PALM vowel) in ‘foreign (a) words’ (Boberg 2010: 137-140). Data on the latter are very few, but the word barrage, as in artillery barrage, does occur, unsurprisingly, in three excerpts and shows a variable pattern: Marshall and Russenholt use the unique Canadian pronunciation with /æ/, /bəˈræʒ/, found more commonly today in the similar word garage, whereas Joliffe uses the /ah/ that is standard in American English. One other token is, somewhat less expectedly, the Italian poet Dante: when McIlree describes the horrors of the Western Front as resembling a scene out of Dante’s Inferno, he pronounces Dante with /æ/.

3.2 Acoustic analysis

The results of acoustic analysis of the longest interview excerpts – those of Marshall, McIlree and Russenholt – appear in Figures 1-3, which display mean first and second formant values (F1 and F2, in Hz) for the most important vowel phonemes for each speaker, arranged in a conventional vowel chart format. In order to avoid over-crowding in the charts, some vowels, particularly those with few tokens, not essential for establishing the outlines of the vowel space and not directly related to the discussion, are omitted. Some more problematic omissions were imposed by a lack of data. One of the problems in dealing with natural speech as opposed to word lists is that the quantity of data on any given vowel is a matter of chance, as well as of distributional patterns in the language. For Canadian English, this creates a serious problem in the case of Canadian Raising of /aw/ (MOUTH), since fully-stressed tokens of the unraised allophone of /aw/, such as cow, how, loud, now and proud, are relatively rare in natural speech.
The height of the raised allophone in doubt, house, south, etc., cannot therefore be measured in relation to the unraised allophone, but must be assessed either in terms of its absolute rather than relative position, which is to some extent possible with normalized data, or in relation to other low vowels, such as /æ/ and /o/ (TRAP and LOT). In two cases, those of Marshall and McIlree, there are no data at all on unraised /aw/, but comparisons can be made with their other vowels. Another problematic gap is data on /uw/ (GOOSE) before /l/, as in cool, pool or tool, which normally resists the centralization of /uw/ and therefore anchors the high-back corner of the vowel space, an important benchmark for the centralization of /uw/ in other environments. The presence of this allophone will have to be imagined in Figures 1-3, based on the positions of /uw/ and /ow/ (GOAT). A key to the phonemic symbols used in Figures 1-3 is given in Table 2.

Table 2. Key to phonemic symbols used in vowel charts (Figures 1-3). Standard vowel class keywords from Wells (1982) are given in SMALL CAPS. Where these are inappropriate, other keywords are given in italics.

/æ/   = TRAP     /awT/ = house    /i/  = KIT        /owl/ = goal
/æN/  = sand     /ay/  = tie      /iy/ = FLEECE     /u/   = FOOT
/ahr/ = START    /ayT/ = tight    /o/  = LOT        /uw/  = GOOSE
/aw/  = cow      /e/   = DRESS    /oh/ = THOUGHT    /ʌ/   = STRUT
/awn/ = down     /ey/  = FACE     /ow/ = GOAT

Figure 1. Vowel system of D.M. Marshall

Figure 2. Vowel system of J. R. McIlree

Figure 3. Vowel system of E. S. Russenholt

The vowel systems in Figures 1-3 have a common overall outline: they typify the inverted trapezoid shape of the older Canadian English vowel system, in which /æ/ (TRAP) and /o/ (LOT) form the bottom corners, at more or less the same height (Boberg 2011). As already suggested above, there is no evidence here of the combined retraction of /æ/ and fronting of /uw/ that has reversed the relative positions of these vowels and produced the triangular shape of the modern Canadian English vowel system. Boberg measures this reversal with an ‘Index of Phonetic Innovation’ (IPI), which is calculated by subtracting the mean F2 of /uw/ from that of /æ/ (2011: 22). Conservative systems, found especially among older men on the Prairies and in Atlantic Canada, have positive IPI values, with /æ/ still further front than /uw/, the traditional orientation; innovative systems, found especially among younger women in Ontario and British Columbia, have negative IPI values, with the traditional orientation of /æ/ and /uw/ reversed in the F2 dimension. Not surprisingly, the present speakers, being older men and at least two of them from the Prairies, all have strongly positive IPI scores: 487 Hz for Marshall; 324 Hz for McIlree; and 376 Hz for Russenholt; values in line with that of the older male examined in Boberg (2011: 27), who was born in 1920. Even without any data on /uwl/, it can be seen that /æ/ is still low-front, or even lower-mid-front, and /uw/ still back of center for all three men, though for two of them, McIlree and Russenholt, it has begun to creep forward, particularly after coronal consonants, as in do(ing), shoot or you, and in the word troop(s). A similarly conservative, uncentralized position for /ow/ (GOAT), also reported as a typical feature of Canadian English (Labov, Ash and Boberg 2006: 145, 224), is also found in all three systems.
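The index described above is simple enough to sketch directly. The following is an illustrative sketch, not Boberg's own code: the function names and the toy token values are hypothetical, and only the definition (mean F2 of /æ/ minus mean F2 of /uw/) and the three reported scores come from the text.

```python
from statistics import mean


def index_of_phonetic_innovation(f2_trap_hz, f2_goose_hz):
    """Mean F2 of /ae/ (TRAP) minus mean F2 of /uw/ (GOOSE), in Hz."""
    return mean(f2_trap_hz) - mean(f2_goose_hz)


def orientation(ipi_hz):
    """Positive = traditional orientation; negative = reversed orientation."""
    if ipi_hz > 0:
        return "conservative"
    if ipi_hz < 0:
        return "innovative"
    return "neutral"  # zero is not discussed in the text; this label is an assumption


# Hypothetical F2 tokens (Hz), invented for illustration only.
toy_trap_f2 = [1750.0, 1800.0, 1820.0]
toy_goose_f2 = [1300.0, 1350.0, 1400.0]
toy_ipi = index_of_phonetic_innovation(toy_trap_f2, toy_goose_f2)

# IPI scores reported in the text for the three veterans.
reported_ipi = {"Marshall": 487, "McIlree": 324, "Russenholt": 376}
labels = {name: orientation(score) for name, score in reported_ipi.items()}
```

Because the index is a plain difference of means, it is sensitive to how many tokens of each vowel are available and to which phonetic environments they come from, which is one reason the missing pre-/l/ tokens of /uw/ matter for these speakers.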



What is more surprising about the data in Figures 1-3 is that two of the men display a ‘low-back’ distinction between /o/ and /oh/ (LOT and THOUGHT). A merger of these vowels is now regarded as a defining feature of Canadian English, first observed in the mid-twentieth century (e.g., by Joos 1942: 141 and Gregg 1957: 21-22) and confirmed as more or less universal by the 1970s (Avis 1973: 64; Boberg 2010: 127-129). This merger is exemplified here by Marshall, in Figure 1: t-tests confirm that there is no significant difference between the mean F1 or F2 measures of these vowels in his data, so they have been combined in Figure 1 as a single phoneme, /o-oh/. Some evidence, however, discussed by Boberg (2010: 128-129), points to the low-back merger being still incomplete in the mid-twentieth century, at least in some parts of the country, including the Prairies; the data from McIlree and Russenholt, in Figures 2 and 3, apparently confirm this. T-tests of the difference between the mean formant values of /o/ and /oh/ for these speakers found a significant difference in F1 (p = 0.004 for McIlree; p = 0.011 for Russenholt), but not in F2, though even F2 showed a marginally significant difference in the expected direction (F2 higher for /o/ than for /oh/) for Russenholt (p = 0.070). Unfortunately, there were no data in these excerpts on /ah/, or PALM, which is involved in a double-merger with /o/ and /oh/ in modern Canadian English, but the present data suggest that the merger of LOT and THOUGHT, now so well entrenched in Canada, must have attained that status only gradually, over the course of the twentieth century. Another important matter of phonemic inventory is the status of /æ/, which can be split into two phonemes, the TRAP and BATH sets, as in modern Standard British English, or retained as a single phoneme, TRAP, as in most modern North American English, including Canadian English, with only allophonic raising and fronting, most commonly before nasals. 
Marshall and Russenholt, the Manitobans, display this modern Canadian feature, with the more moderate version of pre-nasal fronting typical of the Prairie region (Boberg 2008: 147; 2010: 209); the excerpts unfortunately furnished too little data on /æ/ before /g/ to allow a parallel analysis of pre-velar raising, which is strongly associated with western Canada. In McIlree’s speech, however, we find a remarkable vestige of the British split /æ/. His tokens of /æ/ did not show the usual effect of a following nasal, so they were reorganized according to the British system. He produced eight tokens of BATH (afterward(s) (2); bastard; chance; (chemistry) classes; commander; giraffes; transport drivers) and 33 of TRAP (action, back, battalion, flashes, sandbag, that, etc.); a t-test of the difference between these categories found no significant difference in F1 but a 250-Hz difference in F2, significant at p = 0.008. Unlike the use of British words discussed above, this more abstract phonological feature is unlikely to have been acquired through British influence during the war; rather, it likely reflects a pattern learned in childhood, perhaps from British-immigrant parents or at McIlree’s private boarding school. In any case, since it occurs in the speech of a man who is otherwise thoroughly Canadian, as indicated by such features as full rhoticity, /t/-flapping and Canadian Raising, it suggests that the Canadian /æ/ system, even among the native population, was not always as uniform as it is today.

Finally, Canadian Raising of both /aw/ and /ay/ (MOUTH and PRICE) can be seen in all three figures, along with raising and centralization of /ahr/ (START), another characteristically Canadian vowel quality. In fact, /awT/, /ayT/ and /ahr/ are usually produced very close to each other, in lower-mid-central position, along with /ʌ/ (STRUT), forming a tight cluster of mean nuclear values distinguished phonemically by the presence and direction of post-nuclear glides (high-back, high-front, rhotic in-glide, or absence of glide). The co-occurrence of /awT/ with the other nuclei in this space leaves little question as to its raised status, regardless of the absence of data on unraised /aw/; in the one chart where unraised /aw/ does appear (Marshall, Figure 1), it strongly confirms the raising analysis, with /aw/ the lowest vowel in the system, almost 200 Hz below /awT/.

4 Conclusion

The group of seven First World War veterans analyzed here, particularly Marshall, McIlree and Russenholt, have afforded a remarkable glimpse at what Canadian English may have sounded like at the end of the nineteenth century, a window on the past made possible only by the availability and analysis of archival data. Direct evidence of this crucial period in the process of consolidation and diffusion that spread Canadian English across the country is now lost in data on contemporary speech, even among the oldest generation of speakers. The foregoing analysis shows that some modern features of Canadian English have a long history: Canadian Raising, for instance, has been well established for over a century now. Other modern features are of more recent vintage, or have only recently become uniform across the country: the fronting of /uw/ and the Canadian Shift, together with the reversal of /uw/ and /æ/ in F2 space that they bring about, appear to be late twentieth-century innovations. More surprisingly, the low-back merger of /o/ and /oh/ and even the modern phonemic status and allophonic distribution of /æ/ were not always the way they are today: these defining features of modern Canadian English seem to have evolved gradually, from a dialect landscape that was once much more varied.
This process may partly reflect the strong British influence, including not just common war experiences but heavy British immigration to Canada throughout the nineteenth century, that would cause Canadians to use anglicisms like lorry and chaps, which have vanished from Canadian speech today. As has been shown in other contributions to this volume, in the Canadian context, ‘listening to the past’ has furnished valuable insights on the present.

References

Avis, Walter S. 1954. Speech differences along the Ontario-United States border. Journal of the Canadian Linguistic Association 1(1): 13-18 (Vocabulary); 14-19 (Grammar); 2(2): 41-59 (Pronunciation).

Avis, Walter S. 1973. The English language in Canada. In: T. A. Sebeok (ed.), Current Trends in Linguistics, vol. 10: Linguistics in North America. The Hague: Mouton, pp. 40-74.

Baxter, Laura 2010. Lexical diffusion in the early stages of the merry-marry merger. University of Pennsylvania Working Papers in Linguistics 16(2), Article 3.



Boberg, Charles 2004a. Ethnic patterns in the phonetics of Montreal English. Journal of Sociolinguistics 8(4): 538-568.

Boberg, Charles 2004b. Real and apparent time in language change: Late adoption of changes in Montreal English. American Speech 79(4): 250-269.

Boberg, Charles 2005. The Canadian Shift in Montreal. Language Variation and Change 17(2): 133-154.

Boberg, Charles 2008. Regional phonetic differentiation in Standard Canadian English. Journal of English Linguistics 36(2): 129-154.

Boberg, Charles 2010. The English Language in Canada: Status, History and Comparative Analysis. Cambridge: Cambridge University Press.

Boberg, Charles 2011. Reshaping the vowel system: An index of phonetic innovation in Canadian English. University of Pennsylvania Working Papers in Linguistics 17(2): 20-29.

Boersma, Paul, and David Weenink 1992-2012. Praat: Doing Phonetics by Computer. Accessed at: www.praat.org.

Chambers, J. K. 1973. Canadian raising. Canadian Journal of Linguistics 18(2): 113-135.

Chambers, J. K. 1998a. Social embedding of changes in progress. Journal of English Linguistics 26(1): 5-36.

Chambers, J. K. 1998b. Inferring dialect from a postal questionnaire. Journal of English Linguistics 26(3): 222-246.

Chambers, J. K. 2006a. The development of Canadian English. In: Kingsley Bolton and Braj B. Kachru (eds), World Englishes: Critical Concepts in Linguistics. London: Routledge, pp. 383-395.

Chambers, J. K. 2006b. Canadian Raising: retrospect and prospect. Canadian Journal of Linguistics 51(2-3): 105-118.

Clarke, Sandra 1991. Phonological variation and recent language change in St. John’s English. In: Jenny Cheshire (ed.), English Around the World: Sociolinguistic Perspectives. Cambridge: Cambridge University Press, pp. 109-122.

Clarke, Sandra 1993. The Americanization of Canadian pronunciation: A survey of palatal glide usage. In: Sandra Clarke (ed.), Focus on Canada. Amsterdam: John Benjamins, pp. 85-108.

Clarke, Sandra 2006. Nooz or nyooz?: The complex construction of Canadian identity. Canadian Journal of Linguistics 51(2-3): 225-246.

Clarke, Sandra, Ford Elms and Amani Youssef 1995. The third dialect of English: Some Canadian evidence. Language Variation and Change 7: 209-228.

Denis, Derek. Forthcoming. The Development of Pragmatic Markers in Canadian English. Doctoral dissertation, University of Toronto.

Dollinger, Stefan 2008. New-Dialect Formation in Canada: Evidence from the English Modal Auxiliaries. Amsterdam: John Benjamins.

Granatstein, J. L., and Dean F. Oliver 2011. The Oxford Companion to Canadian Military History. Don Mills, ON: Oxford University Press.

Gregg, Robert J. 1957. Notes on the pronunciation of Canadian English as spoken in Vancouver, B. C. Journal of the Canadian Linguistic Association 3(1): 20-26.

Gregg, Robert J. 1992. The Survey of Vancouver English. American Speech 67(3): 250-267.

Gregg, Robert J. 2004. The survey of Vancouver English, 1976-1984: Methodology, planning, implementation and analysis. In: Gaelan Dodds De Wolf, Margery Fee and Janice McAlpine (eds), The Survey of Vancouver English: A Sociolinguistic Study of Urban Canadian English. Kingston, ON: Strathy Language Unit, Queen’s University, pp. 1-138.

Joos, Martin 1942. A phonological dilemma in Canadian English. Language 18: 141-144.

Labov, William 1972. Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.

Labov, William 1994. Principles of Linguistic Change: Internal Factors. Oxford: Blackwell.

Labov, William, Sharon Ash and Charles Boberg 2006. The Atlas of North American English: Phonetics, Phonology and Sound Change. Berlin: Mouton de Gruyter.

Manitoba Historical Society 2013. “Memorable Manitobans: Edgar Stanford Russenholt (1890-1991).” Retrieved from http://www.mhs.mb.ca/docs/people/russenholt_es.shtml.

Morton, Desmond 1985. A Military History of Canada. Edmonton: Hurtig Publishers.

Nearey, Terrance Michael 1978. Phonetic Feature Systems for Vowels. Bloomington, IN: Indiana University Linguistics Club.

Sankoff, Gillian, and Hélène Blondeau 2007. Language change across the lifespan: /r/ in Montreal French. Language 83(3): 560-588.

Scargill, Matthew Henry, and Henry J. Warkentyne 1972. The Survey of Canadian English: a report. English Quarterly 5(3): 47-104.

Statistics Canada 2011. Census of Population. Catalogue no. 98-314-XCB2011016. (Retrieved from http://www12.statcan.gc.ca.)

Tagliamonte, Sali A. 2006. “So cool, right?”: Canadian English entering the twenty-first century. Canadian Journal of Linguistics 51(2-3): 309-332.

Thomas, Erik R. 1991. The origin of Canadian raising in Ontario. Canadian Journal of Linguistics 36: 147-170.

Trinity College 2013. Trinity College School Record, May 1919 - January 1922. Old Boys' Service List. Retrieved from: http://archive.org/stream/trinitycollegesc2224trin/trinitycollegesc2224trin_djvu.txt

Wagner, Susan Evans 2012. Real-time evidence for age grad(ing) in late adolescence. Language Variation and Change 24: 179-202.


Woods, Howard B. 1999. The Ottawa Survey of Canadian English. Kingston, ON: Strathy Language Unit, Queen’s University.

16 Canadian Raising in Newfoundland?

Insights from early vernacular recordings

Sandra Clarke, Paul De Decker and Gerard Van Herk∗

1 Introduction

As is typical of (post-)insular varieties, the distinct features and striking linguistic retentions of Newfoundland English (NE) are to a large degree determined by the region’s complex history and long period of isolation (physical, social and political). Those same factors may also be responsible for a dearth of extremely early records of vernacular speech, as Newfoundland’s marginal status would have contributed to a lack of awareness of, or interest in, the details of its intangible culture, at least among the mainstream elite who would have determined what was recorded and archived. In what follows, we demonstrate how the recordings that do exist, combined with the inherently conservative nature of NE, permit us to glimpse an earlier stage of English and one of its variable features, ‘Canadian Raising’.

The earliest audio recordings of Newfoundland English, which date back to the early 1920s, were largely intended for radio broadcasting. Very few broadcast recordings made prior to the mid 1940s have been preserved, however. As Webb (2008: 13) notes, ‘[a] fraction of the programming of the [Newfoundland] Broadcasting Corporation was recorded, given the difficulties and expense of recording, and only a fraction of that has survived.’ These early recordings typically represent the public language of prominent speakers, among them members of the government and the clergy. They thus involve in large measure a standard, perhaps hyper-standard, morphology and syntax, particularly the ‘cultivated Anglo-Irish’ (Kirwin 1993) accent of the merchant and political class of Newfoundland’s capital and largest city, St. John’s. Some speakers of this cultivated variety even employed features borrowed from their perceptions of British Received Pronunciation, among them highly retracted vowel pronunciations in the LOT and THOUGHT lexical sets (cf.
Pringle 1985: 189-190 for a parallel in Canadian English in general, given the association of British English with erudition and refinement).

We are fortunate nonetheless that the region has a long and rich history of the documentation of folk culture, which includes recordings of vernacular speech. By way of example, two years after Newfoundland became the tenth province of Canada in 1949, the Canadian Broadcasting Corporation (CBC) introduced the weekday radio program Fishermen’s (now Fisheries) Broadcast. An important component of this program consisted of conversations with fishers from the many tiny ‘outports’ which dot the province’s coastline. In addition, from the early 1960s, various departments and units of the province’s sole university (Memorial University of Newfoundland) became involved in the preservation of cultural heritage, including the region’s conservative vernacular speech forms. Foremost among these units was the Folklore and Language Archive (MUNFLA; www.mun.ca/folklore/munfla). Its audio collection, both reel and cassette – a portion of which has now been digitized – contains many interviews and other field research conducted, in particular, by Memorial faculty and undergraduate Folklore students. The archive houses material from approximately 30,000 Newfoundlanders, or some 4% of the province’s entire population. Most of the interviewees are elderly and rural.

In short, despite the relative lateness of vernacular speech recordings in Newfoundland, existing recordings enable us to extend apparent-time analyses to speakers born as early as 1870. This time-depth compares quite favourably with that of the earliest recorded vernacular regional speech data in other parts of the English-speaking world, among them Britain (the Survey of English Dialects or SED, recorded in the 1950s; e.g. Orton and Dieth 1962, Orton et al. 1962-71), and New Zealand (the ONZE corpus, originally recorded in the 1940s; e.g. Gordon et al. 2004).

In this chapter, we utilize MUNFLA recordings to investigate the phonetic feature usually termed Canadian Raising (CR). Though it has been claimed (Trudgill 1985: 40) that this feature is not found in Newfoundland English (NE), early vernacular recordings indicate the contrary. We analyze acoustically a small sample of traditional vernacular speech, to determine what light this sheds on the origins of CR, both in NE and more generally. Prior to that, however, we provide a brief historical overview of both NE and CR, including the divergent linguistic approaches that have been offered to explain the emergence of this feature in a number of varieties of English.

∗ We extend our thanks to Philip Hiscock, Linda White and Memorial’s MUNFLA archive for allowing us access to their recorded samples of traditional Newfoundland speech. We are also grateful to Raymond Hickey for information on several points raised in this chapter.

2 Newfoundland English: A brief introduction

Within the North American context, NE is unique. To this day, it remains distinct from mainland Canadian English (CE), though it has certain features in common with CE, particularly as spoken in Canada’s neighbouring Maritime Provinces (see e.g. Clarke 2010). The unique character of NE stems from several factors not shared with most of the Canadian mainland. Among the chief of these are time of settlement and population origins, NE being a product of relatively early and extremely localized source area out-migration. Though the bulk of permanent European migration to Newfoundland occurred in the late eighteenth and early nineteenth centuries, settlement on the east coast of the island dates back to the first decades of the seventeenth century. Early European settlers came from two principal sources. Starting in the early seventeenth century, West Country merchants brought out fishery workers (initially on a seasonal basis) from the counties of Devon and Dorset, along with the border areas of such neighbouring counties as Somerset and Hampshire (Handcock 1977). From c. 1675 onwards, these merchants also engaged workers from the southeast of Ireland – principally, from within a 30-mile radius of the city of Waterford, encompassing County Waterford, southwest Wexford, south Kilkenny, and the southeastern portions of
Tipperary and Cork (Mannion 1977a: 8). Geographical isolation also played a role in the development of NE. Until the mid-twentieth century, Newfoundland’s small population was scattered in hundreds of tiny coastal fishing outports, many linked to the outside world only by boat. A cross-island railway was completed only in 1898, and a paved highway, not before 1967. Even today, a number of small outports on the island’s south coast remain without road connections. Such conditions were far from conducive to much outside input, or to large-scale dialect mixing. Other than in larger communities, the southwest English and southeast Irish populations remained largely geographically segregated, the Irish confined for the most part to the southern Avalon Peninsula in the province’s southeast corner. As a result of these various factors, the English varieties that emerged on the island of Newfoundland are characterized by their conservative nature. Linguistic conservatism is readily apparent in the data of the new online Dialect Atlas of Newfoundland and Labrador <www.dialectatlas.mun.ca>, grounded in MUNFLA recordings of rural speakers almost all of whom were born between 1871 and 1912. Its phonetic and morphological components document the regional distribution of 58 variable linguistic features in 69 coastal communities. Many of these point to the maintenance of source variety features that continue to differentiate Irish- and southwest-English-settled areas of the province. By way of example, syllable-initial /h/ deletion in lexical words is found among conservative rural NE speakers of southwest-English descent, but is largely absent from comparable ‘Irish-origin’ speakers (cf. also Clarke 2010: 47-48). These recordings, then, suggest that /h/-deletion (along with environmentally-conditioned addition of initial non-phonemic [h]) was a feature brought to Newfoundland by early settlers from southwest England. 
This is of considerable interest, relative to the claim of Wells (1982: 255) that /h/-deletion did not generally arise in England until well after the founding of the American colonies. Likewise, the online Dialect Atlas indicates that the traditional speech of English- and Irish-settled regions of the island is distinguished by the use of a ‘dark’ or velar postvocalic /l/ in the former, as opposed to a ‘clear’ or palatal pronunciation in the latter. The widespread presence in traditional NE of a velarized contoid postvocalic /l/ runs counter to the suggestion of Trudgill (1999: 237) that a dark /l/ variant did not arise in British English before the late nineteenth century (cf. Hickey 2002, who suggests, rather, an ‘ebb and flow’ in the velarization of /l/ in the history of English).

In this chapter, we utilize MUNFLA recorded data of conservative traditional speakers of NE to investigate a phonetic feature whose origins and chronology remain unresolved within the history of English. Given its iconic association with CE – despite its occurrence in a number of other varieties – this feature is generally referred to as Canadian Raising. It involves environmentally-conditioned realizations of the diphthongs /ai/ and /au/ (see e.g. Chambers 1973, 2006).

3 Origins of Canadian Raising

Though this feature had been earlier described by linguists, the term ‘Canadian Raising’ was coined by Chambers (1973), who observed that the phenomenon was not unique to CE. CR is generally viewed as involving distinct realizations of the
nuclei of the diphthongs /ai/ and /au/ when they are followed by a tautosyllabic voiceless consonant. Before a voiceless coda, the nucleus is raised to a mid schwa- or wedge-like variant. This gives rise to such contrasts as louse [lʌʊs] vs. lousy [laʊzi], and white [wʌɪt] vs. wide [waɪd] and why [waɪ].

CR has been associated with CE for at least the past century and a half. Using dialect atlas data, Thomas (1991) showed that the raising of both /ai/ and /au/ characterized the speech of residents of Ontario born as early as 1861. Yet despite its status as an iconic feature of CE, CR does not occur among all present-day speakers, nor in all areas of the country (Labov, Ash and Boberg 2006: 222, Boberg 2010: 204-205). Boberg (2008: 139-140), for example, found CR to be variable in Quebec as well as in Newfoundland, and less strong in British Columbia than in a number of other Canadian provinces.

Though documented in various locations outside Canada, CR is fairly rare in World Englishes. It has not been noted in England itself, apart from the English Fens (Britain 1997). None the less, CR is a feature of a handful of early, and often conservative, transatlantic varieties of English. These include American varieties spoken in New England (Kurath and McDavid 1961, Thomas 1991) – among them Martha’s Vineyard (Labov 1963) and Vermont (Roberts 2007) – and as far south on the US Eastern seaboard as South Carolina and Georgia (Kurath and McDavid 1961). CR also occurs in some insular varieties of the Caribbean (the Bahamas, Bermuda, Saba) and the South Atlantic (the Falklands, Tristan da Cunha, St. Helena) (Trudgill 1985, 1986). Several studies (e.g. Vance 1987, Dailey-O’Cain 1997, Moreton and Thomas 2007, Sadlier-Brown 2012) have documented the existence of CR in the northern USA, particularly in areas bordering Canada; some of these attribute its presence to more recent dialect contact with CE. Across English dialects, the CR pattern occurs considerably less frequently for the diphthong /au/ than for /ai/.

Historically, the Modern English diphthongs /ai/ and /au/ derive from the Middle English long vowels /i:/ and /u:/, respectively, as illustrated by fi:f ‘five’ and hu:s ‘house’. Their current standard English phonetic realizations are the result of the Great Vowel Shift of the fourteenth and fifteenth centuries, which affected Middle English long vowels: in a chain-shift-like development, the raising of the low and mid long vowels resulted in diphthongization of the high vowels /i:/ and /u:/. This shift yielded many possible diphthongal reflexes in various Early Modern and conservative present-day regional varieties of British English. A word like Middle English hu:s, for example, has resulted in regional British pronunciations that vary in nuclear height, all the way from a low [haʊs] pronunciation through (variably fronted) mid realizations – including [həʊs], [hɛʊs] and [hæʊs], along with monophthongal variants – through to the historically unchanged high back vowel [hu:s] realization found in strongly vernacular varieties in northern England and in Scotland.

The origins of the CR pattern in English are disputed. Two chief theories prevail, one grounded in dialect contact, the other in phonetic causation. The first of these was advanced by Peter Trudgill (Trudgill 1985, 1986), who claimed that dialect mixing was a necessary condition for the development of CR. According to Trudgill, the allophonic variants of /ai/ and /au/ associated with different (post-Great-Vowel-Shift) regional inputs in situations of dialect contact, and ensuing new dialect formation, may undergo reduction (focusing) and subsequent reallocation.


In the case of /ai/ and /au/, these variants re-align themselves with different phonetic environments. It is Trudgill’s claim that dialect mixing accounts for the presence of CR in CE. Dialect contact is also viewed as the source of the only documented case of CR in Britain, which occurs in the English Fens (Britain 1997); likewise, the development of CR in South African Cape Flats English has been attributed to dialect mixing (Finn 2004).

The second approach sees CR as a natural phonetic development, which may be independently re-innovated rather than inherited. In many varieties of English, vowels and diphthongs tend to be shorter in pre-voiceless environments than elsewhere; it has been hypothesized (e.g. Chambers 1973; cf. Thomas 2001: 36) that shorter durations may result in truncated (i.e., more raised) onsets. More recently, however, Moreton and Thomas (2007) – in an analytically sophisticated sociophonetic account of CR – have failed to uncover durational shortening for /ai/ in pre-voiceless environment. Rather, they advance an ‘Asymmetric Assimilation’ hypothesis to account for the cross-dialect phonetic differences observed among English diphthongs in pre-voiceless vs. other contexts. According to this account, when diphthongs precede a voiceless coda, their offglides tend to be peripheralized in vowel space. In addition, their nuclei tend to shorten. Shorter nuclei are more subject to the co-articulatory effects of the glide, and hence tend to rise. In other words, diphthongs in pre-voiceless position are dominated by (and assimilate to) their offglide, rather than their nucleus. A corollary of this hypothesis is that in situations of emerging CR, offglide raising precedes nucleus raising in pre-voiceless position, rather than the reverse.

Another area of contention is whether, in any given locality, CR represents a conservative feature (‘failure to lower’, that is, the retention in pre-voiceless environments of historically earlier post-Great-Vowel-Shift mid-vowel onsets), or a true innovation (whereby low vowel nuclei become raised to mid position before voiceless segments). Chambers (1973) suggests that, in CE, pre-voiceless raising was an innovation rather than a retention. Likewise, on the basis of acoustic evidence, Moreton and Thomas (2007) show that raising rather than failure to lower underlies the development of CR in the Cleveland, Ohio region between 1878 and 1977. On the other hand, and while his focus is not CR, Britain (2008) makes a convincing case that the fronted mid (rather than fully low) onsets of the /au/ diphthong in contemporary New Zealand English – typically interpreted as innovative raising – represent, rather, the retention of a conservative feature, inherited directly from nineteenth century British and Irish input varieties. Similarly, Roberts’ (2007) acoustic analysis of Vermont English shows that CR resulted from mid-vowel lowering, rather than low-vowel raising.

4 Canadian Raising: The Newfoundland situation

Despite Trudgill’s claim (1985: 140) that CR does not occur in NE, the NE literature is in fairly broad – though by no means unanimous – agreement that CR is, indeed, found in the province. On the basis of auditory evidence, Kirwin (1993: 75) concludes that Irish-origin varieties of NE display a CR-like pattern for /ai/, but not for /au/. Clarke (2010: 38) concurs. She also notes that conditioned raising for the /au/ diphthong (as well as for /ai/) occurs in many areas of the province settled
primarily or exclusively by the southwest English. This raising, however, is often not as marked in the case of /au/ as it is among mainland Canadian speakers, in that many traditional Newfoundland speakers do not have fully-lowered onsets in non-pre-voiceless environments. For some in fact, mid vowels appear to be the norm in all environments, pre-voiceless or not. This is the case, for example, in the mixed southwest English/southeast Irish-settled community of Carbonear, as documented by Paddock (1981).

The new online Dialect Atlas of Newfoundland and Labrador – also based on auditory evidence – offers further information on the /ai/ and /au/ diphthongs in the traditional speech of the island of Newfoundland. Of the 69 communities investigated, just under one-third display a CR-like patterning for /ai/, in the form of a somewhat more raised nucleus in pre-voiceless environment. This pattern is more evident among Irish-origin speakers than among those of southwest-English ancestry whose ancestors migrated to the island’s northeast coast, primarily from the counties of Dorset, south Somerset and Hampshire (see e.g. Handcock 1977: 42-43). Speakers from this latter group (along with some traditional speakers of Irish descent) tend to have raised (e.g. [ə, ɐ and ʌ]) onsets in all environments. As to /au/, the majority of the Atlas sample tends to display greater nuclear raising in pre-voiceless position, even though the degree of difference may be small. Lack of /au/ differentiation is most obvious, however, in Irish-settled portions of the island, where the usual nuclear realization is non-fully-lowered [ɐ].

The very small amount of acoustic analysis conducted to date tends to confirm that CR is a variable feature in NE. Thomas (2001: 62-63) identifies a CR pattern for both /ai/ and /au/ in the speech of a single male resident of the capital, St. John’s, born in 1934. Boberg (2008: 139-140) has analyzed acoustically the speech of six young native Newfoundlanders attending McGill University in Montreal. Though he found /ai/ raising generally among these speakers, /au/ raising in pre-voiceless environment was displayed by only half of his sample.

The situation is complicated in NE by the tendency for the nucleus of /ai/ to be variably backed and/or somewhat rounded, articulated in the mid-vowel range (typically, as [ə, ʌ or ʌ]; e.g. Clarke 2010: 39). The online Atlas indicates that almost half of the 69 communities investigated display some degree of retraction and/or rounding for white/wide words; these are located in both Irish- and southwest-English-settled regions of the province, though slightly more prevalent in the latter. In Thomas’ (2001) acoustic analysis of a single conservative St. John’s speaker, /ai/ and /oi/ are merged, so that both tie- and toy-words are pronounced with a mid central vowel. 1

Given the (largely) auditory evidence for raising in both /ai/ and /au/ contexts in traditional NE, the question of its origins obviously arises. As we have noted earlier, it is highly unlikely that CR emerged from a situation of dialect mixing. Outside a handful of mostly larger east coast communities, descendants of the original southwest English and southeast Irish settlers were for the most part geographically segregated. Even within mixed communities, network affiliations were drawn primarily along sectarian (that is, ethnic) lines. Moreover, in terms of the /ai/ and /au/ diphthongs, the original English and Irish input varieties can be assumed to have been fairly similar.

1 A rounded pronunciation of /ai/ occurs in both source varieties of NE, and was also found in earlier North American English (e.g. Miller 2010; see also Roberts 2007: 182).

Nor does it appear likely that CR was inherited directly from input varieties. To our knowledge, this pattern has never been documented in the traditional speech of the two principal source regions for NE, southwest England and southeast Ireland. In both regions, in fact, mid rather than low diphthong nuclei tend to predominate, irrespective of environment. For Ireland, Barry (1982: 103) suggests that the diphthongized phonemes, introduced in the seventeenth (as well as eighteenth) century, in all likelihood contained the centralized mid vowel /ə/, i.e., /əɪ/ and /əʊ/. Likewise, Hickey (2001, 2004) confirms that in the Waterford region of southeastern Ireland, home to most Irish out-migrants to Newfoundland, the usual onset for /ai/ is [ə], and for /au/, [ɛ] or [æ]. As to England, Britain (2008) states that regional dialect evidence from the time of Ellis (1889) right through to the SED data of the 1950s points to mid-open onsets for the /au/ diphthong in the south of England, with open [aʊ] largely restricted to the north and northwest. In their SED Basic Materials, Orton and Wakelin (1967) provide variants of /ai/ and /au/ for the southwestern counties which constituted the principal sources of migrants to Newfoundland: Dorset, Devon, Somerset, Hampshire and Wiltshire. Both pre-voiceless and other environments exhibit virtually identical nuclei, among them [ə], [æ], [a] and [ɒ]. 2

In short, the evidence appears to point to CR as an independent, phonetically-motivated pattern in NE. To investigate the origins of CR in NE, we now turn to a small but representative sample of earlier recordings of traditional vernacular speech.

5 Methodology

5.1 Sample

Our all-male, traditional, working-class sample was selected to represent the two chief dialect types found in the province, inherited from either southwest England or southeastern Ireland. Table 1 provides information on this sample. Five of the nine speakers come from southern Avalon Peninsula communities settled almost exclusively by the Irish. The remaining four represent communities of southwest English origin; three are located on the northeast coast, while the fourth is the isolated south coast community of Francois.

Within each ancestry group, two different age levels are represented. ‘Older’ speakers were born between 1898 and 1914; ‘younger’ speakers, in the 1930s, approximately one generation later. The original sample consisted of eight speakers, two in each of the four Age/Origin groups. Since this yielded an insufficient number of tokens for the ‘Older Irish’, this category was supplemented by a third speaker, a native of St. John’s born in 1914.

2 These variants are illustrated in Orton, Sanderson and Widdowson (1978), by regional maps of various lexical items containing the two diphthongs. Among these items are ice (Ph103), five (Ph106), louse (Ph150) and boughs (Ph148).


Speech samples were extracted from digitized versions of analogue recordings made between 1975 and 1984; the only exception is an informal interview with a 91-year-old, recorded in 2005 in the northeast coast community of Greenspond. Seven of the nine were obtained from Memorial University’s MUNFLA sound archive; these include several interviews originally aired on the local CBC Fisheries Broadcast radio program. Most samples consist of relatively informal interviews about folk life or issues in the fishery; a single recording (made in 1979, of a speaker from Branch, on the Irish Avalon Peninsula) involves a narrative performance at a St. John’s folk festival.

         Irish-origin                                        Southwest-English-origin
Older    Branch, born 1904, recorded 1979                    Francois, b. 1898, rec. 1975
         Tor’s Cove, b. circa 1910, rec. 1984                Greenspond, b. 1914, rec. 2005
         St. John’s, b. 1914, rec. 1980
Younger  Maddox Cove/Petty Harbour, b. c. 1930, rec. 1982    Springdale, b. c. 1930, rec. 1980
         St. Stephen’s, b. c. 1935, rec. 1982                Noggin Cove, b. 1935, rec. 1980

Table 1. The nine-speaker sample in terms of Age and Origin

5.2 Acoustic analysis

All tokens of /ai/ and /au/ were extracted from the nine speaker samples – which, despite their recording date, were of sufficiently good quality to permit formant analysis. This yielded a total of 348 tokens, 207 for /ai/ and 141 for /au/. In addition, a representative number of tokens was extracted from each speaker for both /a/ (the LOT/CLOTH/THOUGHT vowel) and /æ/ (the TRAP/BATH vowel), in order to facilitate analysis of the position of the diphthongal nuclei in vowel space.

Prior to token extraction, all analogue audio samples were digitized using the free software Audacity (available at sourceforge.net) with a minimum sampling rate of 20 kHz. This ensured resolution of the first two formant frequency values of each vowel token. First and second formant frequency measurements were performed in Praat (5.3.52) (Boersma and Weenink 2013) using standard Burg LPC formant settings at three temporal locations (20%, 50% and 80%) throughout the vowel. These positions were interpreted as the onset, midpoint and offglide, respectively. Because vowel formant values are partly dependent on the size and shape of the vocal tract, a property that varies from one individual to another, it is necessary to normalize vowel measurements before cross-speaker comparison can be made. All formant values were normalized using the Bark Difference Method as implemented on the NORM website (Thomas and Kendall 2007). This method converts F1 and F2 Hz values to a scale on which the intervals correspond with perceptually equal distances, or critical bands of human audition (Traunmüller 1990). Importantly, for our purposes, the Bark Method does not require formant data from the entire vowel system, something that was not feasible given the limited data available to us.
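
As a rough illustration of the Bark conversion described above, the following Python sketch applies Traunmüller’s (1990) Hz-to-Bark approximation. The function names and the example formant values are hypothetical. The use of F3, with the differences Z3 − Z1 (height) and Z3 − Z2 (advancement), reflects how the Bark Difference approach is commonly described rather than anything stated in this chapter, and the sketch is not the NORM implementation itself.

```python
def hz_to_bark(f_hz: float) -> float:
    """Traunmueller (1990) approximation: z = 26.81 * f / (1960 + f) - 0.53.

    The low- and high-frequency corrections that Traunmueller also
    proposes are omitted here for brevity.
    """
    if f_hz <= 0:
        raise ValueError("frequency must be positive")
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_difference(f1_hz: float, f2_hz: float, f3_hz: float) -> dict:
    """Speaker-intrinsic normalization of one measurement point.

    Assumption: height is expressed as Z3 - Z1 and advancement as
    Z3 - Z2, so no other vowels of the speaker are needed.
    """
    z1, z2, z3 = (hz_to_bark(f) for f in (f1_hz, f2_hz, f3_hz))
    return {"height": z3 - z1, "advancement": z3 - z2}


if __name__ == "__main__":
    # Hypothetical formant values (Hz) for a single vowel midpoint.
    point = bark_difference(500.0, 1500.0, 2500.0)
    print({name: round(value, 2) for name, value in point.items()})
```

Because each measurement is normalized using only its own formants, a sketch of this kind can be applied token by token, which fits the limited data situation described above.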


In this study we report the mean F1 nuclear midpoint Bark values for each diphthong in both raising and non-raising contexts, and the statistical significance of any differences determined by independent t-tests, for all four Age/Origin groups: Older English, Older Irish, Younger English, Younger Irish. Statistically significant differences between environments were interpreted as the presence of CR.

6 Results

Vowel plots for each of the four speaker groups are provided in Figures 1 to 4. In each, the vertical dimension represents the first formant (F1), or vowel height; the horizontal axis, the second formant (F2), or vowel fronting/retraction. Each Figure plots the group mean value, in Bark units, for both the /ai/ and /au/ diphthongs, contrasting pre-voiceless and ‘elsewhere’ environments. For each diphthong, two points are plotted: the durational midpoint taken at 50% of the vowel duration, and a point measured at 80% into the vowel, which, as noted above, we take to represent the nucleus and the offglide, respectively. By way of comparison, each Figure also provides group mean F1/F2 values for both /æ/ (as in TRAP/BATH) and /a/ (as in LOT/CLOTH/THOUGHT).

Figure 1. Mean values for /ai/ and /au/, Older English group


Figure 2. Mean values for /ai/ and /au/, Older Irish group


Figure 3. Mean values for /ai/ and /au/, Younger English group


Figure 4. Mean values for /ai/ and /au/, Younger Irish group

Table 2 lists numerical mean F1 values of /ai/ and /au/ for each of the four groups, in both environments investigated. As this Table shows, despite nuclear height differences in the expected direction between voiceless and ‘elsewhere’ contexts in five of the eight cases, only three of these proved statistically significant.

Diphthong  Environment   Older English  Older Irish  Younger English  Younger Irish
/ai/       Voiceless     10.01          8.79         8.24             7.22
           Elsewhere     9.50           8.21         8.29             7.25
           Significance  p = .05        –            –                –
/au/       Voiceless     9.77           9.60         9.91             7.20
           Elsewhere     9.43           8.24         7.45             7.33
           Significance  –              p < .05      p = .001         –

Table 2. Mean F1 nuclear midpoint values, in Bark units, per Age/Origin group
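
The separations between environments that are discussed in the following sections can be recomputed directly from the means in Table 2. The short Python sketch below does only this arithmetic; the variable and function names are hypothetical, and it takes a positive difference (pre-voiceless minus ‘elsewhere’) to be the ‘expected direction’, in line with the text’s description of pre-voiceless nuclei as having greater (i.e., higher) means.

```python
# Mean F1 nuclear midpoint values (Bark) copied from Table 2,
# stored as (pre-voiceless, elsewhere) per diphthong and group.
TABLE_2 = {
    "/ai/": {
        "Older English": (10.01, 9.50),
        "Older Irish": (8.79, 8.21),
        "Younger English": (8.24, 8.29),
        "Younger Irish": (7.22, 7.25),
    },
    "/au/": {
        "Older English": (9.77, 9.43),
        "Older Irish": (9.60, 8.24),
        "Younger English": (9.91, 7.45),
        "Younger Irish": (7.20, 7.33),
    },
}


def separations(table):
    """Pre-voiceless mean minus 'elsewhere' mean, rounded to 2 decimals."""
    return {
        (diphthong, group): round(voiceless - elsewhere, 2)
        for diphthong, groups in table.items()
        for group, (voiceless, elsewhere) in groups.items()
    }


def in_expected_direction(diffs):
    """Cases where the pre-voiceless mean exceeds the 'elsewhere' mean."""
    return [case for case, diff in diffs.items() if diff > 0]


if __name__ == "__main__":
    diffs = separations(TABLE_2)
    for (diphthong, group), diff in diffs.items():
        print(f"{diphthong} {group}: {diff:+.2f} Bark")
    print(len(in_expected_direction(diffs)), "of", len(diffs), "cases positive")
```

Differences of this kind describe only the size of the separation between environments; whether a separation is statistically significant depends on the token-level data, which are not reproduced in Table 2.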


6.1 Results for /ai/

Figures 1 and 2, along with Table 2, indicate that the two older groups clearly display a CR-like pattern in their articulation of /ai/, via greater (i.e., higher) nuclear means in pre-voiceless environment.3 While t-tests reveal the difference between voiceless and ‘elsewhere’ nuclear realizations of /ai/ to be significant only in the case of the Older English, the difference in mean F1 values between the two environments – obtained by subtracting the latter from the former – proved substantial for both older groups (+0.58 for the Older Irish, +0.51 for the Older English). The two younger groups, on the contrary, display almost identical mean nuclear heights in both environments, suggesting that CR is not a regular feature of their speech.

Yet as Figures 1 and 2 also indicate, the nucleus of /ai/ is mid rather than low for both older groups: the F1 mean nuclear value, for both voiceless and ‘elsewhere’ environments, is higher than that of /æ/ and /a/. This is particularly the case for the Older English. In fact, despite the presence of CR, the two older groups display the highest mean nuclear values for /ai/, in both environments. The Younger Irish, on the contrary (as shown in Figure 4) – with voiceless and ‘elsewhere’ F1 mean values of 7.22 and 7.25, respectively – display considerably lower /ai/ diphthongal midpoints than does any other group. Their nuclear mean value is lower than that of /æ/, and approaches /a/. The same tendency is also evident among the Younger English, though not as marked.

Figures 1 through 4 also provide further insight into /ai/ rounding/retraction in traditional NE. As noted earlier, both English- and Irish-origin NE has been claimed to display variably rounded and/or retracted realizations of this diphthong. These vowel plots show that typical variants of /ai/ are centralized, and not more retracted than the typical articulations of /a/, the LOT/CLOTH/THOUGHT vowel. In the case of the Younger Irish, however, there is a significant (p < .05) difference in our sample between voiceless and ‘elsewhere’ environments, with greater /ai/ retraction in the latter.

6.2 Results for /au/

Figures 1 through 4 indicate that three of the four groups investigated – all but the Younger Irish – exhibit a CR-like pattern for /au/. Mean F1 differences between pre-voiceless and ‘elsewhere’ realizations, however, proved significant for only two of these: the Older Irish and the Younger English (see Table 2).

3 To establish the presence of CR in CE, Boberg (2008: 130) requires a minimum F1 difference of 60 Hz between the two environments. The Bark equivalent, given the frequency levels in Figures 1–4, would be in the range of 0.6 to 1 Bark. This would mean that, by Boberg’s criterion, full-blown CR does not occur for /ai/ in the NE sample, though it exhibits borderline presence for both the Older English and the Older Irish. As to /au/, both the Younger English and the Older Irish meet the criterion. Note that though they do not achieve the threshold for /au/, the Older English display a separation, in the expected direction, of 0.34 Bark for this diphthong (see Table 2).


As in the case of /ai/, the Older English use raised (i.e., mid) nuclei in both environments, a pattern echoed in this case by the Older Irish. Likewise, as for /ai/, the Younger Irish display considerably lower /au/ nuclei than does any other group. Yet the Younger English group differs in that, rather than exhibiting lower diphthongal midpoints in both environments, as for /ai/, they have enhanced the CR pattern for /au/. Thus while their mean F1 pre-voiceless value is the highest of any of the four groups, their corresponding mean value in the ‘elsewhere’ environment is substantially less: as Table 2 shows, there is a full 2.46 Bark unit difference between the two contexts for the Younger English.

7 Conclusion

Through acoustic analysis, this chapter has shown that CR was variably present, for both /ai/ and /au/, in a small sample of traditional NE speakers born between 1898 and c. 1935. As we have indicated, the origins of this feature cannot be claimed to be historical: regional dialect evidence suggests that the CR pattern is unlikely to have been inherited from the ancestors of these NE speakers, who migrated to the island from southwest England and southeast Ireland between the late seventeenth and early nineteenth centuries. (That said, the addition of NE to the number of early transatlantic varieties in which CR is found (see Section 3 above) raises the intriguing possibility that a CR-like precursor may have characterized the historical British inputs into these varieties.) Nor can the origins of CR in NE be attributed to Trudgill’s hypothesis of dialect mixing, given the homogeneity of Newfoundland’s British and Irish founder populations, along with the relative separation within the province of the two founder types as a result of both geographic and sectarian divisions. The evidence suggests, then, that CR may have been an independent phonetic innovation in NE.
However, since CR is already a feature of the speech of the oldest Newfoundlanders investigated here, the general absence of earlier recordings precludes, at least for the moment, further acoustic investigation into the phonetic development of this feature in NE – notably, the applicability of Moreton and Thomas’s (2007) ‘Asymmetric Assimilation’ hypothesis. While our data also do not provide a definitive answer to the issue of whether raising or ‘failure to lower’ was involved in the development of CR in NE, they do shed some light on this issue. Among the two older groups investigated – both of whom display a CR-like pattern for /ai/ and /au/ – diphthong nuclei, whether in pre-voiceless or other environments, are not fully lowered, but are articulated in the mid range (see Figures 1 and 2). Our study shows, in other words, that a CR pattern can exist in situations other than those involving a fully lowered diphthong nucleus in non-pre-voiceless position. Indeed, comparison of the realizations of the two older groups (Figures 1 and 2) to those of the two younger groups (Figures 3 and 4) suggests that, over a generation, diphthongal nuclei continued to lower in most cases, with ensuing loss of the CR distribution. The situation among our older NE speakers appears similar to that documented by Sudbury (2001) for Falkland Islands English, for which she states (p. 67) ‘the allophonic contrast is less distinctive than in other dialects such as Canadian English, because the onsets of both diphthongs are seldom fully open.’ This may also help to clarify why CR has been claimed by some (cf. Trudgill 1985) to be absent from NE.


In conclusion, while our acoustic analysis is grounded in samples of traditional NE speech recorded only in the 1970s and 1980s, these have provided new insights into the feature of CR as used by Newfoundland speakers born near the turn of the twentieth century. When contextualized within the broader articulatory-based results of the new online Dialect Atlas of Newfoundland and Labrador, we are confident that they also provide an accurate snapshot of this phonetic feature as it existed among speakers born in the last few decades of the nineteenth century. Thus, we are able to add a chronological dimension to the investigation of CR in more recent NE. D’Arcy (2000: 47), for example, concludes from a study of preadolescent and adolescent upper-middle-class female speech in St. John’s that CR may be a fairly innovative feature in NE (yet cf. Clarke 2012: 515). The present study, however, indicates that it has existed within Newfoundland for a considerable period of time – perhaps subject to the type of ‘ebb and flow’ described by Hickey (2002) for a number of changes in the English language. As such, our study demonstrates the value to contemporary local linguistic endeavours of access to archived real-time language data.

References

Barry, Michael V. 1982. The English language in Ireland. In: Richard W. Bailey and Manfred Görlach (eds) English as a World Language. Ann Arbor: University of Michigan Press, pp. 84-133.

Boberg, Charles 2008. Regional phonetic differentiation in Standard Canadian English. Journal of English Linguistics 36: 129-154.

Boberg, Charles 2010. The English Language in Canada: Status, History and Comparative Analysis. Cambridge: Cambridge University Press.

Boersma, Paul and David Weenink 2013. Praat: doing phonetics by computer [Computer program]. Version 5.3.52. http://www.praat.org/. Accessed 15 August 2013.

Britain, David 1997. Dialect contact and phonological reallocation: ‘Canadian Raising’ in the English Fens. Language in Society 26: 15-46.

Britain, David 2008. When is a change not a change? A case study on the dialect origins of New Zealand English. Language Variation and Change 20.2: 187-223.

Chambers, J. K. 1973. Canadian Raising. Canadian Journal of Linguistics 18.2: 113-135.

Chambers, J. K. 2006. Canadian Raising: retrospect and prospect. Canadian Journal of Linguistics 51.2/3: 1051-1058.

Clarke, Sandra 2010. Newfoundland and Labrador English. Edinburgh: Edinburgh University Press.

Clarke, Sandra 2012. Phonetic change in Newfoundland English. World Englishes 31.4: 503-18. (Special issue on Canadian English, ed. Stefan Dollinger and Sandra Clarke).

Dailey-O’Cain, Jennifer 1997. Canadian Raising in a Midwestern U.S. city. Language Variation and Change 9: 107-120.

D’Arcy, Alexandra 2000. Beyond mastery: a study of dialect acquisition. M.A. thesis, Memorial University of Newfoundland.




Ellis, Alexander 1889. On Early English Pronunciation, Part V. London: Trübner.

Finn, Peter 2004. Cape Flats English: phonology. In: Edgar W. Schneider, Kate Burridge, Bernd Kortmann, Rajend Mesthrie and Clive Upton (eds) A Handbook of Varieties of English, Vol I: Phonology. Berlin: Mouton de Gruyter, pp. 964-984.

Kirwin, William J. 1993. The planting of Anglo-Irish in Newfoundland. In: Sandra Clarke (ed.) Focus on Canada. Amsterdam: John Benjamins, pp. 65-84.

Gordon, Elizabeth, Lyle Campbell, Jennifer Hay, Margaret Maclagan, Andrea Sudbury and Peter Trudgill 2004. New Zealand English: Its Origins and Evolution. Cambridge: Cambridge University Press.

Handcock, Gordon 1977. English migration to Newfoundland. In: Mannion (ed.), pp. 14-48.

Hickey, Raymond 2001. The south-east of Ireland. In: John Kirk and Dónall Ó Baoill (eds). Language Links: The Languages of Scotland and Ireland. Belfast Studies in Language, Culture and Politics 2. Belfast: Queen’s University, pp. 1-22.

Hickey, Raymond 2002. Ebb and flow: a cautionary tale of language change. In: Teresa Fanego, Belén Mendez-Naya and Elena Seoane (eds). Sounds, Words, Texts, Change. Selected Papers from the Eleventh International Conference on English Historical Linguistics (11 ICEHL). Amsterdam: John Benjamins, pp. 105-128.

Hickey, Raymond 2004. A Sound Atlas of Irish English. Berlin/New York: Mouton de Gruyter.

Kurath, Hans and Raven McDavid 1961. Pronunciation of English in the Atlantic States. Ann Arbor, MI: University of Michigan Press.

Labov, William 1963. The social motivation of a sound change. Word 19: 273-309.

Labov, William, Sharon Ash and Charles Boberg 2006. Atlas of North American English: Phonetics, Phonology and Sound Change. Berlin: Mouton de Gruyter.

Mannion, John J. 1977a. Introduction. In: Mannion (ed.), pp. 1-13.

Mannion, John J. (ed.) 1977b. The Peopling of Newfoundland. St. John’s, NL: Institute of Social and Economic Research (ISER), Memorial University.

Miller, Ben 2010. Early American accents. http://outofthiscentury.wordpress.com/2010/01/21/early-american-accents/. Accessed 8 August 2013.

Moreton, Elliott and Erik R. Thomas 2007. Origins of Canadian Raising in voiceless-coda effects: a case study in phonologization. In: Jennifer Cole and José Ignacio Hualde (eds). Papers in Laboratory Phonology 9. Berlin: Mouton de Gruyter, pp. 36-74.

Orton, Harold and Eugen Dieth 1962. Survey of English Dialects: An Introduction. Leeds: E. J. Arnold.

Orton, H., Michael V. Barry, Wilfrid J. Halliday, Philip M. Tilling and Martyn F. Wakelin (eds) 1962-71. Survey of English Dialects: The Basic Material, vols. I-IV. Leeds: E.J. Arnold.

Orton, Harold, Stewart Sanderson and John Widdowson (eds) 1978. The Linguistic Atlas of England. London: Croom Helm.

Orton, Harold and Martyn F. Wakelin (eds) 1967. Survey of English Dialects (B): The Basic Materials, vol. IV. The Southern Counties. Leeds: E.J. Arnold.




Paddock, Harold 1981. A Dialect Survey of Carbonear, Newfoundland. (Publications of the American Dialect Society, number 68). University, Alabama: University of Alabama Press.

Pringle, Ian 1985. Attitudes to Canadian English. In: Sidney Greenbaum (ed.). The English Language Today. Oxford: Pergamon, pp. 183-205.

Roberts, Julie 2007. Vermont lowering? Raising some questions about /ai/ and /au/ south of the Canadian border. Language Variation and Change 19.2: 181-197.

Sadlier-Brown, Emily 2012. Homogeneity and autonomy of Canadian Raising. World Englishes 31.4: 534-548.

Sudbury, Andrea 2001. Falkland Islands English. A southern hemisphere variety? English World-Wide 22.1: 55-80.

Thomas, Erik 1991. The origin of Canadian Raising in Ontario. Canadian Journal of Linguistics 36: 147-170.


Thomas, Erik R. and Tyler Kendall 2007. NORM: The vowel normalization and plotting suite. http://ncslaap.lib.ncsu.edu/tools/norm/. Accessed 15 August 2013.

Traunmüller, Hartmut 1990. Analytic expressions for the tonotopic frequency scale. Journal of the Acoustical Society of America 88: 97-100.

Trudgill, Peter 1985. New dialect-formation and the analysis of colonial dialects: the case of Canadian Raising. In: Henry J. Warkentyne (ed.). Papers from the Fifth International Conference on Methods in Dialectology. Victoria, British Columbia: University of Victoria, pp. 35-46.

Trudgill, Peter 1986. Dialects in Contact. Oxford: Basil Blackwell.

Trudgill, Peter 1999. A window on the past: ‘colonial lag’ and New Zealand evidence for the phonology of nineteenth-century English. American Speech 74.3: 227-239.

Vance, Timothy 1987. ‘Canadian Raising’ in some dialects of the northern United States. American Speech 62: 195-210.

Webb, Jeff A. 2008. The Voice of Newfoundland: A Social History of the Broadcasting Corporation of Newfoundland, 1939-1949. Toronto/Buffalo/London: University of Toronto Press.

Wells, J. C. 1982. Accents of English. Vol. I: Introduction. Cambridge: Cambridge University Press.




17 The Caribbean - Trinidad and Jamaica

Shelome Gooden and Kathy-Ann Drayton

1 Introduction

This chapter focuses on the analysis and description of intonational patterns in different sets of twentieth century recordings in two Caribbean English-lexicon Creoles, Jamaican and Trinidadian. These historically related restructured varieties have both been shaped by processes of language creation (Winford 2000, Schneider 2008) and are the result of European colonial expansion and West African contact in the Caribbean in the seventeenth century (Lalla and D’Costa 1990). However, the individual development of the English varieties and the related Creoles in Jamaica and Trinidad is different. Differences in the particular mechanics of the language creation process and in the local ecologies thereafter also translate to different outcomes, as the newly formed languages would not have remained static (Plag and Schramm 2006, Smith 2008). Lalla and D’Costa suggest that Caribbean Creoles developed their distinctiveness during the 40-year period between 1680 and 1720, and the roughly 200 years of West African slave presence left its mark on the grammar and the phonology (Alleyne 1980). More precisely, British presence in the Caribbean came more than 100 years after the Spanish, with the earliest being in St. Kitts in 1624. Jamaica became British in 1655 and remained under British rule until August 1, 1838. In Trinidad, the Spanish ruled from 1498 until the British came in 1797, but after 1783, French and French Creole speaking migrants from other Caribbean territories developed an indigenous French Creole, which was the lingua franca on the island for several generations before gradually being supplanted by the newly developed English and associated English Creole (Winford 1997, Ferreira and Holbrook 2001, Scott 2011). The movement toward English was driven by the immigration of English-Creole speaking workers into Trinidad from other Eastern Caribbean states such as St. Vincent and Barbados. Thus, English in Trinidad has a much shorter and perhaps more complex history than in Jamaica, the earliest reports of an English (Creole) being around 1838 (Winer 1993).

Notwithstanding historical differences, Trinidadian and Jamaican both co-exist with local Standard English varieties in similar though not identical sociolinguistic contexts (Winford 1997). The contemporary language situation in Jamaica for example, puts the Creole variety in a diglossic situation with Jamaican English, which for most speakers is accessed through formal education and writing (Devonish and Harry 2004). Similarly in Trinidad, the standard is the first language for some speakers, and is accessible through the education system (Winford 1993, Youssef and James 2004, Ferreira and Drayton ms, Deuber 2010). Rural Jamaican has been argued to be more radical and divergent from an English superstrate than Trinidadian, which is a relatively uniform intermediate Creole that is used both in rural and urban areas (Winford 1997). The nature of the contact among the different varieties in each country has led to intricate patterns of variation, further



complicated by the languages of newer migrants, especially in Trinidad.1 Some of the speakers of these Creoles have access to several social dialects that are invoked in different contexts, but the lines of demarcation between social dialects are not always clear-cut. There are often clearer differences in some aspects of the grammar than in others, and having shared lexical cognates with the local English varieties feeds linguistic ideologies of the Creoles being substandard or deviant (Irvine 2008). This holds for synchronic aspects of the language as well as for changes over time. The presence of these kinds of sociolinguistic variation makes description of the linguistic systems challenging, and at the same time the social significance of linguistic forms can change over time, presenting another challenge for researchers.

While this diachronic change occurs in all areas of the grammar, it can be argued that changes in phonology are perhaps the most salient and most complex of the linguistic sub-systems. Furthermore, given the perceived low-status of the Creoles relative to English, the phonological system is especially sensitive to sociolinguistic variation (see discussions on Jamaican Creole in Beckford-Wassink 2001, Meade 2001, Irvine 2004, Devonish and Harry 2004 and on Trinidadian in Youssef and James 2004). Le Page (1958:63) in his discussion of Jamaican offered that “the different dialects are however, sharply distinguishable by prosodic features.” In this regard, some areas of segmental phonology of Trinidadian and Jamaican are the same as in local English varieties, some Creole features are stigmatized and avoided in English speech and the prosodic properties show areas of overlap alongside areas of sharp differences.

Since prosody, like other areas of the grammar, can and will change over time, we can reasonably expect varieties like Jamaican and Trinidadian to develop and display changes in their stress and intonation systems when older and newer versions are compared. Thomason and Kaufman (1998) predict that contact-induced change can affect all aspects of the linguistic system, constrained only by sociohistorical factors. We will argue below that the prosody of Trinidadian (but not rural Jamaican) reveals some diachronic change, in just over 50 years, as a result of changes in the sociolinguistic dynamics, which include contemporary contact situations. This kind of ‘swift’ change is due precisely to the ecological context in which the variety exists. The relevant factors include contact with Indian Bhojpuri speaking immigrants in the late nineteenth and early twentieth century, as well as possible influence from Trinidadian French Creole right up to the first decade or so of the twentieth century (Winford 1972, Youssef and James 2004, Gooden, Drayton and Beckman 2009). We will also show, however, that there are other prosodic patterns brought about by contemporary contact situations and the recasting of ethnic identities.

This chapter focuses on two main issues. The dominant task is a review of the phonology of intonation in Jamaican and Trinidadian focusing on intonation, prominence and phrasing in the more recent data. We then evaluate intonation, prominence and phrasing in the older data, so that we can compare the results from

1 We do not imply that newer migrants to Jamaica have had no impact on the Creole spoken there. The evidence for any such argument is more clearly documented in Trinidad. In terms of Indian Bhojpuri speakers for example, nearly 4 times as many speakers were shipped to Trinidad (Mesthrie 1993).



both sets of rural Jamaican Creole (Jamaican) and rural and suburban Trinidadian Creole (Trinidadian). A variety of early twentieth century audio recordings spanning a period of just over 60 years are examined and are compared to more recent recordings. The aim is to see whether there are commonalities in the observed phonological patterns in each of the languages. This leads naturally to a second issue regarding the development of systems/prosodic structure in these new varieties, and might then shed light on (a) the differences in the prosodic structure in Trinidadian, that are not observed in Jamaican and (b) differences in the alignment of the F0 on stressed syllables. In examining these varieties, we also bear in mind that the developing norms of usage for the national English varieties in each country (see Wassink 2001, Irvine 2004) quite likely influenced the related Creoles.

The remainder of this introduction provides an overview of the role of prosody in language change and reviews the kinds of changes that have been documented in the segmental phonologies of the languages. We then present our theoretical framework for the prosody analysis (section 2.1), a brief description of the data collection methods for the contemporary data (section 2.2) and an overview of the observed prosodic and intonational patterns in the more recent data (section 2.3). Section 3 provides the prosodic analysis for the older data and in section 4 we summarize and discuss the results.

1.1 Prosody in language change

The phonological systems of Caribbean varieties of English are comparatively underresearched, yet are very important in achieving a thorough description of these varieties. Following Irvine (2004: 67), phonology is the aspect of the language that is not reproduced in the written texts…introduced…in the school system and it is the aspect of the language that distinguishes the educated Caribbean speaker from his/her counterpart in other parts of the ‘English-speaking’ world (paraphrased and italics added). Prosody is a kind of structure, a ‘grammar’ that has to be learned and/or interpreted correctly. It imposes a rhythmic structure on speech, signaling the divisions of utterances into interpretable parts (Beckman 1996). Intonation is then layered on top of these parts to convey a variety of discourse pragmatic meanings. Both prosodic structure and intonation work to signal information about the relatedness of constituents. Intonational features like pitch and relative prominence are distributed in a given utterance only in ways permitted by the prosodic structure (Ladd 2008). Given these factors, it is clear that change in the prosodic system of Creoles is a kind of structural change that needs to be studied alongside grammatical change if we are to fully understand how the languages developed or changed over time (Devonish 1989; Clements and Gooden 2009). In other words, prosody is an important part of telling the story of the development of creole languages, yet is very often ignored in these discussions. At the same time, the value of phonology (and we add phonetics) research to discussions on language change in Creoles has not been completely lost on the field. Earlier works by Cassidy (1961), Cassidy and LePage (1967), DeCamp (1974), and later by Singh and Muysken (1995), for example, encourage a reassessment



of theories of creole formation guided by findings from phonological research. Lesho’s (2013) dissertation and Beckford-Wassink (1999 et seq.) very clearly highlight the value of sociophonetic research to understanding variation and change in the phonological (vowel) systems of Creoles. As Lesho aptly argues, even where a Creole’s phonological system bears resemblances to its input languages, the phonetic implementation could yield significant differences, especially given the effects of social factors and ideological stances of speakers.

Recent research by Sandler et al (2010) argues that prosody and its interaction with sentence structure should be incorporated into any model of language evolution. They suggest that in a new language, prosody may be the only indicator of functions like marking constituent boundaries or relations between them, as well as marking pragmatic functions. Similarly Givón (1979) claims that prosody is vital to the development of ‘new’ languages like creoles, as it signals the relations between constituents before syntactic structures are developed. For us, this means that in (early) Creole formation, prosody played an important role in grammaticalization processes (see also discussions in Wichmann 2011). Certainly, there is no way to truly test this, as there is no access to these early speech recordings with which to evaluate intonational cues to prosodic structure. However, we can look to synchronic and diachronic realizations in different Caribbean Creoles to provide some evidence. These can then complement the existing independent analyses of historical change. Still, we must be cautious since prosodic classification of creoles at any particular stage remains challenging (Drayton 2007), because the research is still in its infancy and we cannot assume that prosodic changes proceed via the same mechanisms as grammatical changes. One could construct an argument along the lines that due to their tonal West African substrates, all Caribbean Creoles were at one time tonal and less conservative Creoles have shifted from tone specification to accent specification. Our position is that this argument might be too simple because the effect of language contact on prosodic systems is complex and the outcomes are not always reliably predictable (Gooden, Drayton and Beckman 2009). 
Moreover, the processes of change involved in Creole formation are not unique and involve universal principles and internal processes of language change that affect all languages (see discussions in Winford 2003).

1.2 Phonological variation in the vowel systems

Although this paper focuses on higher-level phonological structure, segmental phonology also sheds light on the development of Creoles. Smith and van de Vate (2006), for example, argue that ‘Suriname type’ vowel systems are to be found in Eastern Maroon varieties in Jamaica, like in Moore Town, such that there are no long vowels and there is preconsonantal monophthongization. In contrast, other varieties of Jamaican are said to preserve diphthongs and vowel length contrasts.2 Devonish and Harry (2004) argue for the presence of implosives in Jamaican and

2 Note that Beckford-Wassink (2001) provides acoustic phonetic evidence to the contrary, showing that other rural varieties of Jamaican Creole do not have a vowel length contrast, meaning that this is not just limited to Maroon varieties.



Smith and Haabo (2007) argue the same for Saramaccan. In both cases, researchers use this to support their argument for substrate influence on the phonology of the languages. Phonetic analyses of Creole phonologies are rare however, with some varieties like Jamaican receiving more treatment than others (see Lesho 2013 for a review). We highlight a few examples of sociophonetic analyses of Creole vowel phonology as the area that has received most treatment in the literature.

The vowel systems of Jamaican and Trinidadian have been the subject of sociophonetic analyses examining variation according to sociolinguistic factors like geography, gender, social class and networks of interaction. To date, Beckford-Wassink’s work (1999a, 1999b, 2001, 2006) is the most comprehensive sociophonetic analysis of the vowel system of contemporary Jamaican data. To our knowledge there is no analogous analysis of earlier Jamaican data that would be comparable to say Leung’s (2013) analysis of 1970s Trinidadian. Beckford-Wassink (2001, 2006) examined tense vowels in the BEAT, BATH, BOOT class and lax vowels in the BIT, TRAP, BOOK class in both a rural conservative Jamaican Creole variety and Jamaican English. By using a three dimensional metric (F1, F2 and duration) to analyse the degree of vowel overlap, Beckford-Wassink showed that in these cases the rural Creole variety showed partial spectral overlap, meaning that vowel length rather than spectral differences distinguished between vowel categories. The vowel quality metric was distinctive in Jamaican English, however. This lends some support to Smith and van de Vate’s (2006) hypothesis that vowel length contrasts are retained in more conservative varieties.

Leung (2013) examined variation in monophthongs in Trinidadian English with some comparison with Trinidadian English Creole data from Winford’s 1970 PhD dissertation. The three male speakers were born in 1898, 1911, and circa 1932 and therefore give samples of speech of persons born in the late nineteenth century and early twentieth century. Leung’s findings prove very interesting, revealing some shifts in the production of certain vowels over time. The NURSE lexical set, for example is produced by the informant born in 1898 as an open-mid back rounded vowel with variations [ɔː ~ oː ~ ɘː], due to its close acoustic proximity to the vowels in GOAT and NORTH. Leung points out that this is different from contemporary productions of the vowels where the acrolectal productions of NURSE involve the close-mid central unrounded [ɘː] and the Creole speaker’s open back rounded vowel [ɒ]. The other speaker from the rural area (born 1911) also displayed a large degree of variation in his vowel tokens as well. While the NORTH and NURSE vowels are not as close for the speaker born in 1911, as they are for the speaker born in 1898, Leung noted a tendency for the [+high] vowels like THOUGHT, STRUT and LOT to cluster together as did the [+low] vowels found in TRAP, BATH and START. There was a wide degree of variation shown by this speaker for example [ʌ ~ ʌ ~ ɒ ~ ɔ] for the STRUT and LOT lexical sets. The speaker from the urban area, born in 1932, exhibited a different vowel pattern from these other two, which Leung described as closer to contemporary acrolectal Trinidadian. Leung describes this speaker’s vowel set as being similar to Wells’ (1982) description of Trinidadian English. These data prove very interesting as they highlight a vowel pattern which existed at one time in Trinidad, but which no longer exists except perhaps in very old, rural speakers of more conservative forms. The current situation as Leung



shows in her examination of rural and urban speakers (recorded in 2009), is that of a stabilized system in which there are mergers and the amount of vowel variation has been reduced. Our analysis suggests that there are changes in the prosodic systems of the languages, analogous to those documented in the segmental system.

2 Theoretical background

The prosody analysis is done in the Autosegmental Metrical (AM) framework (Pierrehumbert 1980, Beckman and Pierrehumbert 1986, Ladd 1996) with ToBI-type transcriptions of intonational tunes (Beckman, Hirschberg, and Shattuck-Hufnagel 2005), and we assume a Strict Layer Hypothesis of prosodic structure (Selkirk 1984). The AM model assumes that pitch accents, which consist of either a single tone or a sequence of tones, are phonologically associated with metrically prominent syllables (Pierrehumbert 2000). We examine two major parameters of prosody, prominence and phrasing, at both the lexical and postlexical levels since both determine the prosodic property of an utterance. Prominence marking at the word level for Trinidadian (Drayton 2013) and for Jamaican (Gooden 2003) is cued by lexical stress. At the postlexical level, prominence marking is cued by the head of a phrase (marked by a nuclear pitch accent), or by a tone at the phrase edge, or by both (see Table 1). We follow Ladd’s (1996) typology in defining the intonational differences between the varieties. Although the typology was designed to describe intonational differences among languages, it is also applicable to varieties within the same type or in this case putative differences due to changes within the system. We have argued elsewhere (Gooden, Drayton and Beckman 2009) that Jamaican and Trinidadian exhibit type IV differences. That is, they show phonotactic differences, specifically in the timing of F0 patterns relative to the sequence of speech segments in an utterance. For instance, the languages show differences in the alignment of the rise-fall patterns on prominent syllables in broad focus declaratives.



H* A tone target on the accented syllable of a word or phrase, which is in the higher part of the speaker’s pitch range during production of the phrase.

L* A tone target on the accented syllable of a word or phrase, which is in the lower part of the speaker’s pitch range during production of the phrase.

< An early pitch accent diacritic indicating an F0 peak that precedes the accented syllable.

> A late pitch accent diacritic indicating an F0 peak that aligns after the accented syllable.

↑ An upstepped or elevated pitch accent diacritic indicating an unusually high F0 target.

H+L* A bitonal accent sequence consisting of a relatively high rise in F0 followed by a low F0 target on the accented syllable.

H*+L A bitonal accent sequence consisting of a high F0 target on the accented syllable followed by a relatively low F0 target.

!H* A high tone that is realised in a downstepped pitch range, that is, the pitch is still relatively high, but is measurably lower than that of the previous H* tone in the IP.

L-, H-, M- A phrasal tone which marks an intermediate level (ip) intonational boundary. L- marks a boundary with a low F0 target at the right edge of the ip; H- marks a boundary with a high F0 target at the right edge of the ip; M- marks a boundary with a mid-level F0 marking the right edge of the ip.

L%, H%, M% A phrasal tone which marks every full intonation phrase (IP) boundary. L% marks a boundary with a low F0 marking the right edge of the IP; H% marks a boundary with a high F0 marking the right edge of the IP; M% marks a boundary with a mid-level F0 marking the right edge of the IP.

Table 1. Summary of annotation conventions for Trinidadian and Jamaican intonation

2.1 Prosody of Trinidadian and Jamaican

The prosodic systems of the Caribbean English Creoles are perhaps best described as a smorgasbord, reflecting varying degrees of hybridity. This variation is due in part to the contact history of the varieties and in part to the multiple functions of prosody. The variation is also loosely aligned with a classification of more conservative to less conservative languages, shifting from lexical tone usage to no lexical tone (see discussion in Clements and Gooden 2009; Gooden, Drayton and Beckman 2009). There has been some controversy over the characterization of the prosodic systems of Trinidadian and Jamaican as having lexical tone and/or stress. Recent research on the prosody of Jamaican (Gooden 2003, 2014) and Trinidadian (Drayton 2013), however, shows that both varieties have lexical stress systems with intonationally marked prominences. This means that the F0 aligned with prominent syllables is not contrastive. This is the case for all the data we examine, including



the ultra conservative Eastern Maroon Jamaican variety.3 This is in contrast to the situation in other varieties, like Saramaccan (Good 2009), Curaçaoan Papiamentu (Remijsen and van Heuven 2005; Kouwenberg 2004; Rivera-Castillo and Pickering 2004; Rivera-Castillo 2009) and Ndjuka (Huttar and Huttar 1994), which clearly show contrasting F0 patterns that can be linked to a lexical or grammatical tone system.

Phonology of word level stress. Both Jamaican (Wells 1973; Alderete 1993; Gooden 2003; 2007) and Trinidadian Creoles (Drayton 2013) are weight sensitive systems with trochaic foot structure. Native speaker judgements of stressed syllables correlated strongly with those identified by Allsopp (1996) for Trinidadian and Cassidy and LePage (1966) for Jamaican. Main stress in Jamaican generally falls on the leftmost or only heavy syllable in a word but there is no weight-based preference for stress among CVV, CVVC, CVCC and CVC syllables. Secondary stress falls two syllables away from the main stress (in any direction). When the primary stress is on the initial syllable in a trisyllabic word, secondary stress falls on the word final heavy syllable. However, when primary stress is on the penultimate syllable, there is no secondary stress. In Trinidadian main stress also falls close to the left edge of the word, and the computation of stress takes syllable weight into account, with closed syllables and syllables with long vowels and diphthongs being heavier than open syllables. Secondary stress is found two syllables away from the main stress, and is found in words longer than three syllables, since the final syllable according to Drayton’s (2013) analysis is not typically stressed in Trinidadian. These stress analyses are also supported by acoustic analyses of the contemporary datasets, targeting words of different lengths in different prosodic positions in sentences and in isolation.

Methods. The contemporary Jamaican data are from Gooden (2003). Speakers included men and women (ages 29 - 80+ at the time of the recording, i.e. birthdate between 1973 and approximately 1918) from a rural area, Top Alston, Clarendon. The majority of speakers were basilect-dominant and reported using Jamaican Creole in most informal social settings and English in most formal settings. The data are semi-spontaneous speech elicited using a combination of an elaborated interview style and a picture-task, in addition to conversational style data using a traditional sociolinguistic style interview. The methods mimic carefully controlled laboratory speech while avoiding read speech. The elicitation protocol yielded different prosodic contexts as follows: broad focus statements; yes-no questions; wh-questions; narrow focus constructions; complex sentences with several subordinated clauses; sentences with multiple foci. Examining these different contexts was important since the realization of pitch accents is lexically specified and is therefore subject to influence from the discourse pragmatics.

Two recent acoustic phonetic studies of Trinidadian provide a considerable amount of recent data (Leung 2013, Drayton 2013). The data described here were collected by Drayton in 2006-7 and cover a wide geographical area including the areas sampled by Winford in 1970. The data collection methods are similar to those used for Jamaican. These data yielded target words in final and nonfinal position in statements and questions and in a focus condition. The complete dataset

3 We do not address the function of F0 in lexical prominence in Kramanti.



includes 16 speakers, both male and female, in conversation at their homes or a community site. Two speakers are from the suburbs of Port of Spain and two from a similar geographical location to Mayo (rural). The speakers ranged in age from 24 to 59 years at the time of the recordings (birthdate 1947-1982). Finally, speakers self-identified as African, Indian and Mixed. Although most speakers could approximate Trinidadian (Standard) English with varying degrees of success, they were primarily speakers of Trinidadian Creole.

The recent Jamaican and Trinidadian data were digitized at a sampling rate of 44.1 kHz and were analyzed in Praat (Boersma and Weenink 2010). For the intonational analysis, the data were analysed auditorily, assisted by visual inspection of the waveform and F0 contour, and were crosschecked by both authors following the initial coding of the individual datasets. For the vowel duration analyses, vowel onset and offset were marked on a wideband spectrogram at the points where F2 ceased its transition into and out of the vowel respectively (DiPaolo, Yaeger-Dror and Beckford-Wassink 2010). Details of the other measurements are given below. Statistical analyses were done in R (R Development Core Team 2010).

Acoustic cues to stress. Given that both Trinidadian and Jamaican are stress languages, lexical prominence is cued indirectly by the F0 (Fry 1958) and in both varieties this is realized as a fall (Lawton 1963, Wells 1973, Gooden 2003). Since the F0 by itself is unreliable for marking prominence, other acoustic cues signal stress, i.e. duration, intensity and vowel quality as expressed by F1/F2 differences (Arvaniti 2000). Stress in Jamaican is cued by pitch prominence and duration in addition to the reduction or deletion of unstressed syllables (Gooden 2014). In fact, in words like MAda ‘mother’ and maDA ‘female religious leader’, which are often cited as examples of lexical tone contrast, it is syllable durations rather than vowel durations that differentiate between stressed and unstressed items. For Trinidadian, Drayton (2007) reported results from an analysis of 275 words of varying lengths. A series of t-tests revealed significant differences between stressed and unstressed syllables in mean duration (p < 0.01) and in vowel quality (F1: p < 0.05; F2: p < 0.01), such that unstressed vowels tended to be reduced, but intensity ratios were not significantly different. Stressed vowels were on average 26 ms longer than unstressed vowels.
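The kind of duration comparison reported above can be sketched in Python as follows. The source does not say which variant of the t-test was used, so this minimal example assumes Welch's unequal-variance form. The duration values are invented, chosen only so that the mean difference echoes the 26 ms figure. The standard library has no t distribution, so the sketch returns the test statistic and degrees of freedom and leaves the p-value to a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test.
    Assumption: the source does not specify which t-test variant was used."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Invented vowel durations in ms; hypothetical, not Drayton's measurements.
stressed_ms = [112, 125, 98, 131, 120, 109]
unstressed_ms = [88, 95, 79, 102, 91, 84]

mean_difference = mean(stressed_ms) - mean(unstressed_ms)
t_stat, df = welch_t(stressed_ms, unstressed_ms)
print(round(mean_difference, 1), round(t_stat, 2), round(df, 1))
```

The same function could be applied to F1, F2 or intensity-ratio measurements, one comparison per acoustic cue.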

2.2 Intonational phonology

As noted earlier, the more recent Jamaican and Trinidadian data show tonal alignment differences. For example, the HL sequence observed in broad focus utterances is analysed as a binary pitch accent (H+L* or H*+L) in Jamaican, whereas in Trinidadian the H is associated with an AP boundary and the L with a unary L* pitch accent. The languages also show differences in the LH pattern associated with focused items (in non-final position). Jamaican Creole has an L+H* pitch accent on emphatic focused items, whereas Trinidadian has an L* and a very prominent H marking the AP boundary. In other words, the data suggest that Jamaican marks focus with pitch accents and pitch range expansion and Trinidadian marks it through F0 manipulation only. Additional details on the individual intonational systems are summarized just below.



To demonstrate more clearly the differences in F0 alignment between the varieties, Figure 1 shows overlaid F0 contours on normalized time4 axes for the target word alligator in non-final position in a broad focus declarative. There are a total of 9 speakers, 4 for Trinidadian and 5 for Jamaican, all from the more recent datasets. We employ normalization because individual differences may greatly affect the description of the F0 contour, since F0 is a speaker-dependent physiological characteristic. Hertz values were converted to an auditory scale, the Equivalent Rectangular Bandwidth (ERB) scale, which provides a more accurate representation of speakers’ pitch perception.
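The two normalization steps just described can be sketched in Python. This is a minimal illustration rather than the authors' procedure: the chapter does not state which ERB formula was used, so the Glasberg and Moore (1990) ERB-rate formula is assumed, and the function names, the linear interpolation and the sampling at the midpoint of each of the ten slices per syllable are likewise assumptions. The F0 values in the example are invented.

```python
import math


def hz_to_erb_rate(f_hz):
    """Convert Hz to ERB-rate (assumed formula: Glasberg and Moore 1990)."""
    return 21.4 * math.log10(0.00437 * f_hz + 1.0)


def interpolate(times, values, t):
    """Linearly interpolate values at time t (times strictly ascending)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def normalize_contour(times, f0_hz, syllable_bounds, parts=10):
    """Sample F0 (in ERB-rate) at the midpoint of `parts` equal slices per syllable."""
    contour = []
    for start, end in syllable_bounds:
        step = (end - start) / parts
        for k in range(parts):
            t = start + (k + 0.5) * step
            contour.append(hz_to_erb_rate(interpolate(times, f0_hz, t)))
    return contour


# Hypothetical F0 track (seconds, Hz) and two syllable intervals.
track_times = [0.00, 0.10, 0.20, 0.30, 0.40]
track_f0 = [210.0, 205.0, 180.0, 160.0, 155.0]
syllables = [(0.00, 0.18), (0.18, 0.40)]

points = normalize_contour(track_times, track_f0, syllables)
print(len(points))  # 10 points per syllable
```

Because every syllable contributes the same number of points regardless of its duration, contours from different speakers and repetitions can be averaged point by point.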

Figure 1. Mean interpolated F0 contours on a normalized time axis. The contours are averaged across repetitions from a portion of the sentence Him want wan || alligator and (some) yam ‘He wants an alligator and some yams’.

The highlighted portion of the graph is the pretonic and tonic syllables [lɪ gɛ]. The fall starts in the pretonic syllable and ends on the tonic syllable, demonstrating that there is a consistent fall in F0 on the stressed syllable for all the speakers in both languages.

Postlexical pitch accents in Jamaican are of at least two types: monotonal accents (e.g. H*) and bitonal accents (e.g. H+L*). Two larger prosodic constituents above the word, an intermediate phrase (ip) and an intonational phrase (IP), are marked with tones at their right edges. IPs can have H%, L% or M% tones and ips can have L- or H- (Gooden 2014). Intonational marking of focus occurs alongside syntactic means since, as with other creoles, Jamaican Creole typically marks focus syntactically (Christie 1998, Patrick 2004, Durrleman 2005, 2007). However, Gooden (2014) showed that while double focus constructions are prohibited via syntactic reorganization, they are permissible when prosodic and syntactic strategies are combined. Narrow/emphatic focus constructions are marked with L+H*, and the research so far suggests that there is no deaccenting in the post-focal domain, even when the focused item occurs early in the IP. Table 2 provides a summary of pitch accents and boundary tones.

4 The sentences were separated into syllables and each syllable subdivided into 10 equal parts.



Word level: L*, H*, H+L*, H*+L; L+H* with emphatic focus; possibly (!H+L*)

Intermediate Phrase: L-, M- or H- on medial ip or on final ip preceding IP boundary tones

Intonational Phrase:
L% on final or medial IPs in broad focus statements, emphatic focus statements and Wh-questions
H% on final IPs in yes-no questions and emphatic focus yes-no questions
M% on final IPs in yes-no questions and continuation rises

Table 2. Summary of pitch accents and boundary tones in contemporary Jamaican

In an analysis of recent Trinidadian data, Drayton (2007, 2013) recognized two larger prosodic constituents above the word: an Intonational Phrase (IP) and an intermediate-level phrase, the Accentual Phrase (AP). IPs can have H% (e.g. in yes-no questions or continuation rises) or L% edge tones. The AP consists of a word or group of words delimited by an H boundary tone at the right edge of the constituent, and has at least one L* pitch accent anchored to a stressed syllable (see Figure 2). These L* pitch accents are argued to be present in these recent data regardless of the ethnic identity of the speaker and to be the result of contact with Indo-Trinidadians, who themselves have been influenced by Trinidadian Bhojpuri.5 Support for this comes from the observation that other ‘Indian Englishes’ have L* accents on prominent syllables as well (Harnsberger 1999, Pickering and Wiltshire 2000). Of note is that the data from Afro-Trinidadians recorded in the 1970s that we have looked at do not have this pattern.

Items in focus are made prominent through F0 manipulation and duration. Consequently, as shown in Figure 2b, the F0 in the AP preceding the focused item is reduced compared to the analogous AP under broad focus in Figure 2a. The L* pitch accent on the focused item is also realized at a lower F0, and the H tone associated with the postfocal AP is realized much higher. This H tone is not obligatory, however, since it is not observed in absolute phrase-final position (see Figure 2c).


5 The shipment of people from north-east India as indentured laborers to European colonies such as those in the Caribbean resulted in the koineization of Indian languages like Bhojpuri. Like other Bhojpuri varieties, Trinidad Bhojpuri shares a lexicon with, and is related to, Hindi (Mohan 1990). Today many Indo-Trinidadians continue to have religious and cultural exposure to the Hindi language, though Trinidad Bhojpuri is no longer a viable community language.



Figure 2. F0 contours of statements: (a) broad focus, He wants a banana and yam; (b) narrow focus (non-final), He wants a banana (not mango) and yam; (c) narrow focus (final), The calabash has a banana (female Afro-Trinidadian speaker from Princes Town).

The L* pitch accent is also seen in the spontaneous speech data in Figure 3 (see also Figure 4). The H% boundary tone here marks a continuation rise and is preceded by the H tone of the final AP.

Figure 3. F0 contour showing L* H (male Afro-Trinidadian speaker from Dabadie).

Figure 4. F0 contour showing bitonal pitch accents L+H* L* H in a younger speaker (male Afro-Trinidadian speaker from Dabadie).



3 Older recordings

The F0 contours presented in this section are more directly comparable with the interview-style data we collected, although patterns seen in the elicited semi-spontaneous speech data are also referred to.

1950s Jamaican. The older Jamaican recordings were digitized (sampling rate 48,000 Hz) from David DeCamp’s (1958) original reels,6 recorded in Jamaica between November 1957 and August 1959. The time period overlaps somewhat with the data reported in Lawton (1963), who, as noted above, identified L F0 on stressed syllables in Jamaican and was the first to demonstrate that it was also important for prosodic phrasing. There is also some overlap with the data reported in Wells (1973), but the bulk of those speakers were Jamaicans residing in London. The interviews come from a wide range of parishes and include rural and urban locations as well as Maroon settlements. We focus on data from Banana Ground, on the border of Manchester and Clarendon, Trout Hall in Clarendon, and one Maroon settlement, Moore Town in Portland. These were chosen as they are all rural and so can be compared with the more recent Jamaican data described above. The Maroon settlement was chosen as it represents a more conservative form of Creole, the Jamaican Spirit Language and Kramanti (Bilby 1983, Devonish 2005), and might reasonably show prosodic features not seen in other rural varieties.

Generally, the patterns are consistent with those seen in the more recent data, although there are some differences. As yet, there are no observed differences in prosodic structure. IPs have an L% boundary tone, as in Figure 6, and a medial boundary tone M% (Figures 5, 7, 11). The M% appeared in continuation rises and also in yes-no questions7 in the more recent Top Alston data. The M% boundary is at the mid range of the speaker’s pitch range and appears most consistently in the speech of older speakers in both the older and more recent recordings.
The next level of phrasing, the intermediate phrase, has a smaller perceived degree of disjuncture than IPs, and we have marked only the clear cases. The L- phrase accent marking it is typically scaled lower than the L% boundary tones (e.g. Figure 6). Figure 5 shows the same speaker as in Figure 6, with M- phrase tones and M% boundary tones. There are several instances, however, which might be better analysed as utterance-internal IPs rather than as intermediate phrases (see Figures 7, 9). In all of these cases there is a measurable or discernible pause between phrases, but the degree of perceived disjuncture is not as small as at the ip boundary.

6 We are indebted to Peter Patrick who digitized all the original reels (5 inch Scotch and EMI reels at 7 1/2 ips) and copied DeCamp’s records on the demographic data of the speakers.

7 Due to the nature of the spontaneous speech data, the opportunities for yes-no questions are limited to non-existent, which means we cannot verify whether it also appears in this utterance type in these older data.



Figure 5. F0 contour showing syntactic marking of focus and an M% and M- in the sentence Whereas, the man ROBBED you, lit. ‘whereas, it’s rob that the man robbed you’ (Carter, male speaker from Banana Ground). Continued in Figure 6.

Figure 6. F0 contour showing phrase accents in the sentence continued from Figure 5, …. and he is gone, because after all he can't move it (Carter, male speaker from Banana Ground).

Four types of pitch accents were observed: falling (H+L*, H*+L), rising (L*+H, L+H*), high (H*) and low (L*). Figure 7 nicely illustrates the pragmatic function of pitch accents in Jamaican. The first occurrence of bury has broad focus (upper panel) and has a falling F0 (H+L*). The second occurrence is focused and has a rising F0 (upper panel). This is the familiar LH pattern seen in the focus contexts in more recent recordings, but there are alignment differences. In this case, there is an L*+H as opposed to an L+H*, so the low F0 is aligned with the initial stressed syllable of the word and the peak is realized later. The L*+H pitch accent is also seen in Figure 9 on another focused rendition of the word bury, and again in Figure 8 on the monosyllabic word dat ‘that’.



Figure 7. F0 contour showing a variety of pitch accents in the sentence Until after you bury, after you BURY the dead (upper panel), then they have the, the wake (lower panel). (Barnett, male from Trout Hall)

Figure 8. F0 contour showing focus in the sentence …This man’s house and THAT house. (Harris, male from Moore Town)



Figure 9. F0 contour showing utterance-internal IPs in the sentence After they BURY the dead, then they do the singing. (Barnett, male from Trout Hall)

Figure 10 has focus on the word wan ‘none’, which is realized as L+H*.8 In Figure 11, the word lean is focused and has the same L+H* pitch accent.

Figure 10. F0 contour …but of the land they gave the maroons, not me….they, they did not give me ANY. I (alone) should have gotten more. (Harris, male from Moore Town)

8 We can also observe a difference in the length of the [w], such that the whole syllable is more than twice as long as the phonetically identical wan ‘alone’ in the broad focus context (414 ms vs 186 ms). This is due mostly to lengthening of the initial consonant, which Sutcliffe (2003) notes as a strategy used for emphasis in Caribbean (English) Creoles more generally.



Figure 11. F0 contour showing utterance-internal IPs with both L% and M% in the sentence But however, we are called ambadasha9 (Kramanti term) (upper panel), and we are LEAN, and we will not fall (lower panel). (Harris, male from Moore Town)

1970s Trinidadian. These older data were collected by Donald Winford in 1970 for his PhD study on phonetic-phonological aspects of sociolinguistic variation in two communities (urban and rural). The original analogue recordings were digitized (sampling rate 44,000 Hz) and made available to us. The complete dataset includes 36 male speakers in conversation at their homes or workplaces: 21 speakers were from St. James, a suburb of Port of Spain, and 15 were from Mayo, a small village in the south of Trinidad. These speakers ranged in age from 35 to 80 (birth dates 1890-1935). Unlike in the contemporary data, speakers self-identified only as either African/Mixed or Indian, and the third category ‘mixed’ was not used explicitly. The implications of these self-identifications for the observed prosodic patterns (see Figure 12) are discussed further below.

9 We have marked a LH pitch pattern on the Kramanti word without indicating a starred tone as prominence could involve lexical tone.



Figure 12. F0 contour showing L* pitch accents in APs. Indo-Trinidadian from the Mayo area (rural)

This speaker shows the canonical L* pitch accent on stressed syllables, and the IP is organized in terms of APs marked by H boundary tones at their right edges. The data from Afro-Trinidadian speakers are interesting in several ways. First, they clearly show bitonal pitch accents on stressed syllables. In Figure 13, there is a rising accent (L+H*) on the word see and a falling accent (H*+L) on the word anything. Figure 14 is from the same speaker and shows an L+H* on working and !H*+L on conditions. In addition, there appear to be APs bounded by a right-aligned H tone, though these are far less frequent than in the speech of Indo-Trinidadians of the same age cohort and of younger Afro-Trinidadians (as seen in Figure 3). This is taken as evidence for earlier influence from contact with Trinidadian Bhojpuri speakers.

Figure 13. F0 contour showing H*, H*+L and L+H* pitch accents on stressed syllables (male Afro-Trinidadian from Mayo).10

10 Afro-Trinidadians from St. James also have analogous patterns (Gooden, Drayton and Beckman 2009).



Figure 14. F0 contour showing !H*+L and L* pitch accents on stressed syllables (male Afro-Trinidadian from Mayo).

4 Discussion and summary

The period between the late 1890s and the 1940s brought some changes to Caribbean Creoles. Trinidadian shows changes in prosody, mainly in the marking of prominent syllables and in phrasing, since there is a more consistent marking of APs in the speech of Afro/Mixed speakers in the more recent data. Judging from the oldest speakers in both data sets, this change would have taken place between 1890 and 1947, so that speakers born after that period would have prosodic features similar to those used by the contemporary speakers described here. As discussed earlier, contemporary Trinidadian intonation features include L*H sequences, with L* anchored to the stressed syllable of a content word and an H tone marking the boundary of an Accentual Phrase. This pattern holds for speakers across various ethnic identities and urban-rural areas. In the older data, especially among the African-identified speakers, there was a greater presence of binary pitch accents (e.g. L+H*) and H* tones, in addition to L* on stressed syllables, and a less consistent use of APs. At the same time, the fact that APs were minimally present in the speech of older Afro-Trinidadians suggests that this prosodic category had started to creep into the prosodic structure, and is now further along in the speech of younger speakers. The data from the Indo speakers more closely resembled the ubiquitous modern AP pattern of L* followed by H.

The oldest speakers from both sets of Jamaican data would have been born between the late 1890s and 1918; the youngest between the 1930s and 1970. This is roughly the same time period as in Trinidad, i.e. among speakers born after the 1930s or 1940s. Among the older speakers, there are no changes in the prosody and, as discussed earlier, the message of maintenance rather than shift is also seen in the vowel system, as rural speakers maintain ‘older’ length contrasts while other non-rural speakers make use of spectral properties (Beckford-Wassink 2001). There is not sufficient data from younger speakers in the more recent Top Alston data to tell if there are age-related changes in that cohort. Still, the data from these younger speakers are consistent with those from the older speakers in the community. One clear difference is that the M% in continuation rises is observed most consistently in the speech of the older speakers across both sets of data. Older speakers also appear to be making more use of utterance-internal IPs. Marking focus prosodically involves increased pitch range, duration and differences in pitch accent marking, a strategy used by both older and younger speakers. The older speaker from the more recent Top Alston data did not show post-focal deaccenting and we see a similar pattern in the older data as well.

The changes in Trinidadian are very possibly a marker of a new “Trinidadian” identity among younger Trinidadians (see similar discussions in Winford 1972). Winford highlighted clear ethnolinguistic differences between many older Indo-Trinidadians and older Afro-Trinidadians, including differences in their pitch and intonation patterns. Our hypothesis that a ‘big’ change is in progress in Trinidadian phonology is also supported by Leung’s (2013) work. She suggested that, compared to Winford’s 1970s results, there is now less vowel variation than earlier. The sociohistorical factors that can account for this decrease in variation are likely the same ones that drove the prosodic changes. Three factors contribute to these changes. The first is a shift in identity that resulted in a larger ‘mixed’ category among speakers born around the 1940s and later (as documented for example in the 2000 Trinidad census data).11 The census reports 40 percent Indo-Trinidadians, 37 percent Afro-Trinidadians and 20 percent of mixed ethnicities for the country. The second factor is the increasing amalgamation of formerly disparate groups and their participation in mainstream culture (see Mohan 1990 for some discussion). The third is increased contact between Afro-Trinidadians and Indo-Trinidadians, for example through marriage and internal migration and (re)settlement patterns. As speakers redefine themselves in their changing ecologies and create new sociopolitically driven identities, the linguistic shifts become a reflection of the associated changing identities (Schneider 2003). The combined results from our research and the research on vowel variation suggest that both Trinidadian and Jamaican are likely between the last two stages of emergence: the fourth, i.e. endonormative stabilization,12 and the fifth, i.e. differentiation (Schneider 2003). The differences are determined by the local ecologies in which each variety exists. So, the kind of ethnolinguistic differences seen in Trinidad are not observed in Jamaica since these rural communities are far more ethnically homogeneous (at 91.6 percent Afro-Jamaicans island wide).13

In summary, two different processes of language change have affected the varieties in the time period we examine. While we observe convergence in Trinidadian prosody, Jamaican shows retention. The changes in Trinidadian center on ethnolinguistic contact, the outcomes of which are varied and complex and may also interact with other influences on the construction of identity, such as age. As such, we must ideally seek to understand the context in which speakers use the language and how they construct their identities within these contexts.14 At the same time, ethnic groups are not static and speakers’ identities might shift due to social context (Fought 2006, 2010). This is demonstrated here through the shifting ethnic affiliation of younger Trinidadians. We see a consequent convergence of the type of intonational pitch accents in both of the major ethnic groups as well as in the speech of mixed identity speakers, and a strengthening of the AP as a prosodic category below the IP. The maintenance of the features in Jamaican is facilitated by the rural context in which speakers reside.

11 There are more interracial marriages and mixed race offspring and the categories involve more ethnic groups.

12 In fact, Schneider cites the publication of the Dictionary of Caribbean English Usage (Allsopp 1996) in support of this (p. 252).

13 Jamaican Census 2001.

14 The issue of how speakers’ ideological stances affect their perception of others is also important (Fought 2010) and perhaps especially so when accounting for changes in Creole phonologies (Wassink and Dyer 2004). However, neither of these studies was designed to directly capture this information.

References

Alderete, John 1993. The prosodic morphology of Jamaican Creole iteratives. In: Benedicto, E. (ed.) University of Massachusetts Occasional Papers 20: The UMOP in indigenous languages. Amherst: Graduate Linguistic Student Association, pp. 29-50.

Alleyne, Mervyn C. 1980. Comparative Afro-American: an historical-comparative study of English-based Afro-American dialects of the New World. Ann Arbor: Karoma.

Allsopp, Richard 1996. Dictionary of Caribbean English Usage. Oxford: Oxford University Press.

Arvaniti, Amalia 2000. The phonetics of stress in Greek. Journal of Greek Linguistics 1: 9-39.

Beckford-Wassink, Alicia 1999a. A sociophonetic analysis of Jamaican vowels. Ann Arbor: University of Michigan dissertation.

Beckford-Wassink, Alicia 1999b. Historic low prestige and seeds of change: Attitudes toward Jamaican Creole. Language in Society 28(1): 57-92.

Beckford-Wassink, Alicia 2001. Theme and variation in Jamaican vowels. Language Variation and Change 13(2): 135-159.

Beckford-Wassink, Alicia 2006. A geometric representation of spectral and temporal vowel features: Quantification of vowel overlap in three linguistic varieties. Journal of the Acoustical Society of America 119(4): 2334-2350.

Beckman, Mary 1986. Stress and Non-Stress Accent. Netherlands Phonetics Archives. Dordrecht: Foris.

Bilby, Kenneth 1983. How the older heads talk: A Jamaican Maroon spirit possession language and its relationship to the Creoles of Suriname and Sierra Leone. Nieuwe West-Indische Gids [New West Indian Guide], pp. 37-88.

Boersma, Paul and David Weenink 2010. Praat: Doing phonetics by computer. Version 5.1.43. Online: http://www.praat.org/.

Cassidy, Frederic G. and Robert B. Le Page 1967. Dictionary of Jamaican English. Cambridge: Cambridge University Press. Reprinted 1980.

Christie, Pauline 1998. Thematization in Jamaican Speech. In: Pauline Christie (ed.) History and Status of Creole Languages. UWILing Working Papers in Linguistics 3. Department of Language, Linguistics and Philosophy. University of West Indies, pp. 36-49.

Clements, Clancy and Shelome Gooden 2009. Language change in Creole languages: grammatical and prosodic considerations - An introduction. Studies in Language 33(2): 259-276.

Devonish, Hubert 2005. Kramanti. Online: http://www.mona.uwi.edu/dllp/jlu/ciel/pages/kramantiarticle.htm [Accessed July 2014].

Di Paolo, Marianna, Malcah Yaeger-Dror and Alicia Beckford Wassink 2010. Analyzing Vowels. In: Marianna Di Paolo and Malcah Yaeger-Dror (eds) Sociophonetics: A Student’s Guide. London: Routledge, pp. 87-106.

Drayton, Kathy-Ann 2007. Stress and Tone in Trinidadian English Creole. Paper presented at the Society for Pidgin and Creole Languages Conference. Amsterdam.

Drayton, Kathy-Ann 2013. The Prosodic Structure of Trinidadian English Creole. Unpublished PhD Thesis. The University of the West Indies, St. Augustine. Trinidad.

Durrleman, Stephanie 2005. Notes on the left periphery in Jamaican Creole. Generative Grammar in Geneva. Vol 4. University of Geneva.

Durrleman, Stephanie 2007. The syntax of Jamaican Creole: a cartographic perspective. PhD dissertation. University of Geneva.

Ferreira, JoAnne and Holbrook, David 2002. Are They Dying? The Case of Some French-Lexifier Creoles. La Torre - Revista de la Universidad de Puerto Rico 7(25): 367-398 (July-September).

Fought, Carmen 2006. Language and Ethnicity. New York: Cambridge University Press.

Fought, Carmen 2010. Ethnicity and Language Contact. In: Raymond Hickey (ed). Handbook of Language Contact. Malden, MA: Wiley-Blackwell, pp. 282-298.

Fry, Dennis 1958. Experiments in the perception of stress. Language and Speech 1: 126-152.

Givón, Talmy 1979. From discourse to syntax: grammar as a processing strategy. In: Talmy Givón (ed.) Discourse and Syntax. Syntax and Semantics, vol. 12. New York: Academic Press, pp. 81-111.

Gooden, Shelome 2003. The Phonetics and Phonology of Jamaican Creole reduplication. Unpublished Ph.D. Dissertation, Ohio State University.

Gooden, Shelome 2014. Aspects of the Intonational Phonology of Jamaican Creole. In: Sun-Ah Jun (ed.) Prosodic Typology II: The Phonology of Intonation and Phrasing. Oxford: Oxford University Press, pp. 273-301.

Gooden, Shelome, Kathy-Ann Drayton and Mary Beckman 2009. Tone Inventories and Tune-Text Alignments: Prosodic Variation in “Hybrid” prosodic systems. Studies in Language 33(2): 396-436.

Hall-Alleyne, Beverly 1990. The social context of African language continuities in Jamaica. International Journal of the Sociology of Language 85: 31-40.

Harnsberger, James D. 1999. The role of metrical structure in Hindi intonation. Paper presented at the 19th South Asian Linguistics Analysis Roundtable, University of Illinois, Urbana-Champaign.

Huttar, Mary and George Huttar 1994. Ndjuka. Newbury, MA: Routledge.

Irvine, Alison 1994. Dialect variation in Jamaican English: A study of the phonology of social group marking. English World-Wide 15: 55-78.




Irvine, Alison 2004. A good command of the English language: Phonological variation in the Jamaican acrolect. Journal of Pidgin and Creole Languages 19(1): 41-76.

Irvine, Alison 2008. Contrast and convergence in Standard Jamaican English: The phonological architecture of the standard in an ideologically bidialectal community. World Englishes 27(1): 9-25.

Labov, William 1966. The Social Stratification of English in New York City. Washington, DC: The Center for Applied Linguistics.

Ladd, D. Robert 1996. Intonational phonology. Cambridge: Cambridge University Press.

Lalla, Barbara and Jean D’Costa 1990. Language in exile: Three hundred years of Jamaican Creole. Tuscaloosa: University of Alabama Press.

Lawton, D. 1963. Suprasegmental phenomena in Jamaican Creole. Ph.D thesis, Department of English, University of Michigan.

Leung, Glenda-Alicia 2013. A synchronic Sociophonetic Study of Monophthongs in Trinidadian English. PhD Dissertation. University of Freiburg.

Meade, Rocky 2001. Acquisition of Jamaican phonology. LOT: Netherlands Graduate School of Linguistics.

Mesthrie, Rajend 1993. Koineization in the Bhojpuri-Hindi diaspora-with special reference to South Africa. International Journal of the Sociology of Language 99: 25-44.

Mohan, Peggy 1990. The rise and fall of Trinidad Bhojpuri. International Journal of the Sociology of Language 85: 21-30.

National Census Report 2000, Trinidad and Tobago. CARICOM Capacity Development Program.

Patrick, Peter 2004. Jamaican Creole: Morphology and syntax. In: Bernd Kortmann, Edgar W Schneider, Clive Upton, Rajend Mesthrie and Kate Burridge (eds) A Handbook of Varieties of English. Vol 2: Morphology and Syntax. Berlin, New York: Mouton de Gruyter, pp. 407-438.

Plag, Ingo and Mareile Schramm 2006. Early creole syllable structure: A cross-linguistic survey of the earliest attested varieties of Saramaccan, Sranan, St. Kitts and Jamaican. In: Parth Bhatt and Ingo Plag (eds) The Structure of Creole Words: Segmental, Syllabic and Morphological Aspects. Tübingen: Niemeyer, pp. 131-150.

R Development Core Team 2010. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Online: http://www.R-project.org/.

Sandler, Wendy, Irit Meir, Svetlana Dachkovsky, Carol Padden and Mark Aronoff 2011. The emergence of complexity in prosody and syntax. Lingua 121(13): 2014-2033.

Schneider, Edgar W. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79(2): 233-281.

Schneider, Edgar W. (ed.) 2008. Varieties of English. The Americas and the Caribbean. Mouton.

Sluijter, A. M. C. and Van Heuven, V. J. 1996. Spectral balance as an acoustic correlate of linguistic stress. Journal of the Acoustical Society of America 100(4): 2471-2485.




Smith, Norval and Marleen van de Vate 2006. Population Movements, Colonial Control and vowel systems. In: Parth Bhatt and Ingo Plag (eds) The Structure of Creole Words: Segmental, Syllabic and Morphological Aspects. Berlin: de Gruyter, pp. ??-??.

Smith, Norval and Vinije Haabo 2007. The Saramaccan implosives: Tools for linguistic archaeology? Journal of Pidgin and Creole Languages 21(1): 101-122.

Smith, Norval 2008. Creole phonology. In: Silvia Kouwenberg and John Victor Singler (eds), The Handbook of Pidgin and Creole studies. Malden, MA: Wiley-Blackwell, pp. 98-129.

Sutcliffe, David 2003. Eastern Caribbean suprasegmental systems: A comparative view with particular reference to Barbadian, Trinidadian and Guyanese. In: Aceto, Michael and Jeffrey. P. Williams (eds). Contact Englishes of the Eastern Caribbean. Amsterdam: John Benjamins, pp. 265-296.

Wells, John C. 1973. Jamaican Pronunciation in London. Oxford: Blackwell.

Wells, John C. 1982. Accents of English. Vol. 3: Beyond the British Isles. Cambridge: Cambridge University Press.

Wichmann, Anne 2011. Grammaticalisation and prosody. In: Heiko Narrog and Bernd Heine (eds) The Oxford Handbook of Grammaticalisation. Oxford: Oxford University Press, pp. 331-341.

Winford, Donald 1972. A sociolinguistic description of two communities in Trinidad. Unpublished Ph.D. dissertation, University of York.

Winford, Donald 1997. Re-examining Caribbean English Creole continua. World Englishes 16(2): 233-279.

Winford, Donald 2001. ‘Intermediate’ Creoles and degrees of change in Creole formation: The case of Bajan. In: Ingrid Neumann-Holzschuh and Edgar W. Schneider (eds). Degrees of Restructuring in Creole Languages. Amsterdam: John Benjamins, pp. 215-246.

Winer, Lise 1993. Varieties of English Around the World: Trinidad and Tobago. Amsterdam: John Benjamins.


18 Early recordings from Ghana1

A variationist approach to the phonological history of an Outer Circle variety

Magnus Huber

1 Introduction

The growing interest in New Englishes has been accompanied by an impressive number of synchronic studies on these varieties. However, diachronic investigations of postcolonial Englishes are still the exception. What studies there are mostly adopt a macro-sociolinguistic perspective and focus on the external history of (post-)colonial Englishes, with little or no reference to linguistic structure. Even Schneider’s (2007: 113-250) case studies, illustrating his Dynamic Model of the evolution of New Englishes and describing postcolonial Englishes at different stages of development, are based mainly on synchronic data.

One main reason for the lack of diachronic studies of the structural development of postcolonial Englishes in the Outer Circle is that in many cases authentic historical language data is either non-existent or has not yet been located and analyzed by linguists.2 The situation is somewhat less problematic with regard to written texts, and some systematic diachronic projects in this area have started recently, e.g. Hoffmann, Sand and Tan’s (2012) Corpus of Historical Singapore English and Biewer et al.’s (2014) Diachronic Corpus of Hong Kong English, both based on the design of the Lancaster-Oslo-Bergen Corpus (1970-1978), or Brato’s (in prep.) Historical Written Corpus of Ghanaian English (2014), modelled on the International Corpus of English. However, for many Outer Circle Englishes few if any early recordings have been located and analyzed so far.

Focussing on Ghanaian English (GhE), this paper is an attempt to redress this situation with regard to West African Englishes. Adopting a quantitative-variationist approach, I will explore some ways in which early radio broadcasts and recordings of political speeches from the 1950s and 1960s, stored in the archives of the Ghana Broadcasting Corporation, can be used to reconstruct the phonological development of GhE. This complements a previous study (Huber and Schmidt 2011), which used early popular music recordings as data for the investigation of the history of some GhE vowels. That study found substantial changes over the past 50 years and it will be interesting to see whether radio broadcasts and political speeches show similar developments.

1 A first version of this article was presented at the nineteenth International Congress of Linguists, Geneva, Switzerland, 21.07-27.07.2013.

2 More progress has been made in recent years with regard to the phonological development of Inner Circle varieties. For New Zealand English, to give an example among many others, studies have been based on the 1940s recordings of the Mobile Disc Recording Unit, e.g. Hay, Maclagan and Gordon (2008: 89-92).



In this pilot study I will present the results of an exploratory auditory analysis of the broadcasts and speeches collected, digitized and transcribed so far. After a discussion of the potentials and problems associated with early audio recordings as data, four phonological variables that show variation in present-day GhE will be analyzed: two consonant variables, (ing) and (wh), as well as two vowel variables, (NURSE) and (STRUT). The research question is whether the variation observable today was already in place in the GhE of 50-60 years ago and whether the distribution of variants changed over time. Diachronic studies based on recordings and focussing on language structure can make a significant contribution to models such as Schneider’s (2003, 2007) evolutionary model that have mainly been based on external language history and synchronic structural data.

2 English and indigenous languages in Ghana

After a period of first contacts from the 1550s to 1570s, anglophone traders regularly frequented the Gold Coast (modern Ghana) from 1632 on. Formal colonization started about two centuries later, when coastal chiefs yielded part of their jurisdiction to the British crown in the Bond of 1844. English was the language of colonial administration but the percentage of English-speaking indigenous Gold Coasters remained low until English-medium schools were established in the 1880s,3 so 1900 can be taken to be an approximate starting date for modern GhE. The Gold Coast was an exploitation colony and the number of anglophone foreigners has always been low: the 1948 Census, the last before independence, returned 4,102 British, 126 US and 10 West Indian nationals, together a mere 0.1% of a total population of 4.5 million (Government of the Gold Coast 1950: 10, 83). The percentage of anglophone foreigners remained low also after the independence of Ghana, in 1957. In 1960, there were 7,502 UK and Irish nationals (0.1% of the total population) and 1,103 from Canada, the US and the West Indies (0.02%; Census Office 1964: 102) in a population of 6.7 million. The latest census (2010) enumerated 24.7 million inhabitants, among which were 4,493 Europeans (0.02%) and 2,714 aliens from the Americas (0.01%; Ghana Statistical Service 2013: 215, 247). The vast majority of Ghanaians speak indigenous languages as L1 but English has been Ghana’s de facto official language since independence. It is used in formal contexts like the educational system, parliament, higher courts of justice or the media.

With the exception of Hausa, the ca. 80 indigenous Ghanaian languages (Lewis et al. 2013) belong to the Niger-Congo family, whose Kwa genus includes the largest languages in the South: Akan (spoken by 47.5% of Ghana’s population) with its various dialects (Akuapim, Asante, Fante, etc.), Ewe (13.9%) and Ga-Dangme (7.4%). Several Western Oti-Volta languages of the Gur genus are spoken in the North, of which Farefare (ca. 4%), Dagbani (ca. 4%) and Dagaare (ca. 3.5%) are the largest. Twi (a dialect of Akan) is the main lingua franca in southern Ghana (cf. Huber 2014: 87).

3 For more information on the history and status of English in Ghana see Huber (1999: 86-95; 2004: 842-848; 2012: 382-383).



3 The history of broadcasting in Ghana and material analyzed

Broadcasting in colonial Ghana started in 1935. A wired relay station (“Radio ZOY”) transmitted BBC Empire Service programmes and music to a small number of subscribers in Accra, the capital of the Gold Coast colony.4 In the following years, the service expanded and re-diffusion stations were opened in other cities. In 1940, a new broadcasting house with a transmitter was built in Accra and programmes in the Ghanaian languages Twi, Fante, Ga, Ewe and later Hausa were introduced in the following years. 1953 saw the establishment of the Gold Coast Broadcasting System. Up to the mid-1950s, the English-language programme mainly consisted of governmental announcements and BBC re-broadcasts. The number of local productions only rose in 1956. On independence in 1957, the station was renamed Ghana Broadcasting System and later the Ghana Broadcasting Corporation (GBC), popularly known as “Radio Ghana” (Buckley et al. 2007: 5-6; Ghana Broadcasting Corporation 2010; Peasah 2009: 3-4).

In the early years, the GBC was assisted by BBC staff, as becomes clear e.g. in a 1958 speech by the Minister for Education and Information, Kofi Baako: “We are grateful […] to the British Broadcasting Corporation for their assistance in seconding a number of their staff to help us”.5

The material analyzed for this study comes from the GBC audio archives. The archives hold one of Africa’s largest collections of gramophone and vinyl records, reel-to-reel tapes and cassettes. In 2008, digitalization of the material started in the music collection (“Gramophone Archive”, http://www.gbcghana.com/gramophone/index.html, accessed 2014-06-18) and for that reason most of the material computerized so far is music.6 However, the GBC also digitalized a small number of its original analog media from the so-called Sound Archive (spoken word) on the occasion of the 2009 centenary of the birth of Kwame Nkrumah, Ghana’s first prime minister. These recordings relate to the Nkrumah era, from the year of independence 1957 to 1966, when the prime minister was ousted in a coup.

The Nkrumah centenary audio files form the basis of the present analysis.7 The data available to me consists of 65 mp3 files of varying lengths (from 3 to 43 minutes), almost 18 hours in total, including music and ambient sound. The GBC digitalization project primarily focussed on Nkrumah’s public speeches, radio announcements and interviews. However, and crucially for the study of early GhE, the recordings also include other speakers, mostly correspondents and radio announcers who report on Nkrumah’s travels, public appearances and politics. They also contain speeches by, and interviews of, Ghanaian national and local politicians as well as other officials.

4 Part of Governor Sir Arnold Hodson’s 1935 inaugural Gold Coast broadcast is available on the BBC World Service webpage, http://www.bbc.co.uk/worldservice/specials/1122_75_years/page2.shtml > Listen to a highlight from 1935 (2014-06-15).

5 “0041(b) Opening Of New Broadcasting House BY DR. NKRUMAH.mp3”, 3:47-4:04.

6 Songs have not been used much for the investigation of earlier stages of New Englishes, but Sebastian Schmidt (University of Giessen) is currently compiling a diachronic corpus of West African popular songs in English, with the aim of analyzing the early phonology of West African Englishes.

7 I would like to thank Mr. Yaw Owusu-Addo, Director of Radio at the GBC, for making a copy of these recordings available for research purposes. Thanks also go to Mr Agyo for his assistance in the Sound Archive.

The digitalized Nkrumah recordings described above were used for the compilation of a small corpus of early educated spoken GhE, according to the following guidelines: first, only material recorded during the Nkrumah era (1957-1966) was selected for the corpus. These are the years immediately following political independence, one of the socio-political events that characterize the Nativization and Exonormative Stabilization phases in the Dynamic Model of the Evolution of New Englishes (Schneider 2003, 2007). Second, to avoid skewing the corpus too much in the direction of Nkrumah’s idiolect, I excluded all of his speeches but one (Nkrumah’s presentation of the Seven Year Development Plan in the National Assembly on 1964-03-11). I further removed all clearly non-Ghanaian speakers. Table 1 provides an overview of the number of words of individual speakers in the corpus:

Table 1. Speakers and words in the corpus of early educated spoken GhE

Reporters                                            Words
Edward Armah                                           596
Sam Morris                                             280
Ofori Debrah                                           387
Anonymous 1                                            363
Anonymous 2                                            319
Anonymous 3                                            319
Anonymous 4                                            121
Anonymous 5                                            889
Anonymous 6                                            711
Anonymous 7                                            203
Anonymous 8                                            487
Anonymous 9                                            312
Anonymous 10                                           141

Politicians and officials                            Words
Kwame Nkrumah, prime minister                          827
Kojo Botsio, Minister for Trade and Labour             151
Kofi Baako, Minister for Education and Information     896
Anonymous 11, national politician                      456
Anonymous 12, national politician                      695
Anonymous 13, national politician                    1,040
Anonymous 14, local politician                         149
James Malcolm, government official                     483
J.B. Erzuah, High Commissioner for Ghana in India      317

Total words                                         10,142

This corpus of just above 10,000 words represents the GhE of around 1960 as spoken by national and local politicians, chairmen, government officials and radio reporters in a public setting. The individual contributions (reports, speeches, interviews) appear in most cases to have been scripted or were at least based on notes. The recordings thus document the formal educated spoken English of Ghana’s political and cultural elite in the years immediately following political independence.
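The corpus size given in Table 1 can be checked by summing the per-speaker word counts. The following is a minimal sketch; the variable names are illustrative only, and the figures are simply those listed in the table for reporters and for politicians and officials.

```python
# Word counts copied from Table 1, in the order in which the speakers are listed.
reporter_words = [596, 280, 387, 363, 319, 319, 121, 889, 711, 203, 487, 312, 141]
politician_words = [827, 151, 896, 456, 695, 1040, 149, 483, 317]

reporters_total = sum(reporter_words)
politicians_total = sum(politician_words)
corpus_total = reporters_total + politicians_total

# Share of the single Nkrumah speech (827 words) in the whole corpus.
nkrumah_share = 827 / corpus_total

print(reporters_total, politicians_total, corpus_total)
print(f"Nkrumah: {nkrumah_share:.1%} of all words")
```

The two groups contribute roughly equal amounts of material, and the one Nkrumah speech retained accounts for well under a tenth of the corpus.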

At present, these are the earliest known recordings of GhE of any substantial length and thus constitute very valuable material for the study of the variety around 1960. However, some words of caution are in order at this point. Until more material has been digitalized by the GBC and transcribed for linguistic purposes, the corpus by necessity remains an unbalanced sample of early GhE and all analyses based on it can only be exploratory: at present, all speakers in the corpus are male, individual speakers’ contributions are of unequal length and they are for the most part scripted, unspontaneous monologues.

A crucial step in the compilation of the corpus was the decision whether a speaker was Ghanaian or not. Speakers’ names, if at all known, can provide a guideline here but have to be used with caution. While a clearly Ghanaian name is a relatively reliable indication that the speaker qualifies for inclusion in the corpus, names such as Sam(my) Morris or James Malcolm are ambiguous since it was, and is, not uncommon for Ghanaians to have English names. In some cases, biographical research helps in establishing the identity of such individuals, as in the case of J. B. Millar, the first director-general of GBC (1954-1960), an Englishman seconded to Ghana from the BBC, who was accordingly excluded from the corpus.

Although I was able to establish the names and sometimes the sociobiographies of a number of the speakers, there are some individuals where this background information is still lacking and where further research is required. It is these anonymous speakers that represent the most serious challenge to the representativeness of the audio material used for the present study: excluding them would reduce an already small corpus in size and internal diversity and therefore run the danger of overlooking (some aspects of) variation that was actually present in early GhE. On the other hand, leaving these speakers in the corpus runs the danger of including the English of non-Ghanaians. The problem is less acute if the individual has an identifiable Ghanaian accent, but the more serious question is how to proceed with individuals who have a British or near-British accent. A British accent alone cannot be a criterion for removing a speaker from the corpus. It may well have been the case that (some speakers’) early GhE was closer to British English (BrE) than today’s GhE, so excluding these individuals would result in making early GhE more “Ghanaian” than it actually was. A case in point is that of a Mrs Simpson, who, we are told, was the wife of the Solicitor-General of Ghana. In the 1957 recording she speaks English with an RP accent and unmistakably British intonation but on the other hand clearly identifies herself with Ghanaian culture, when sending her greetings to the women of Malaya:

[…] independence here is so recent and so real. We have so much in common: the colonial or protecting power in both cases was Britain, so that along with our special laws and customs we also share the ideas of a large part of the Commonwealth. We live in the tropics at about the same latitude. We may not have beautiful gems of islands like Penang but we can boast of lovely beaches like those on your east coast. Ghanaian women wear what is called kente cloth, a costume very similar to that worn by Malay women.8

Judging by her accent, it seems clear that Mrs Simpson acquired her English in Britain or at least in direct contact with RP speakers, but there is no clue in the recording itself whether she was British or Ghanaian. Excluding her from the corpus on the basis of her accent alone would be tantamount to data-streamlining and making early GhE more Ghanaian than perhaps it was: note that, in addition to the fact that the norm at the time around independence was still exogenous, many prominent Ghanaians were trained abroad, notably in Britain. It was the linguistic reality of the Ghana of 50 to 60 years ago that the Educated GhE speech community included a fair number of Ghanaians with a (near-)BrE accent, and to a certain degree this is still true today. Biographical and archival research is needed in cases such as that of Mrs Simpson, in order to establish whether a speaker is Ghanaian in spite of the British accent and/or name, in which case the recording should not be excluded from the analysis.9

8 “Prime minster (D.R Kwame Nkruma) arrival for commwealth”, 31:23-32:07.

The GBC Sound Archive’s index cards can be of help in identifying speakers. They provide information on the recordings, such as the languages used, the place and date of recording, reporters’ and other speakers’ names and short summaries of the recordings. The archive also has complete scripts of some broadcasts. However, all this documentation still awaits digitalization and was not available for this study. For the present purposes it was therefore decided to leave anonymous speakers in the corpus unless there were strong reasons for doubting that they were Ghanaians. Also note that since identification based on voice quality alone is difficult because of the varying quality of recordings, it may well be that some contributions listed as coming from different anonymous reporters in Table 1 (Anonymous 1-10) were actually made by the same speaker. As the present study does not consider inter-speaker variation, this does not impair the results.

Apart from the considerations regarding the identity of speakers and the representativeness of the corpus, the quality of the recordings themselves also poses a challenge for linguistic analyses. While in most cases transcribing the material proved possible, technical limitations of the original recordings (e.g. clipped sound), recording practices (ambient noise during recording, including studio recordings with background noise from neighbouring booths etc.), editing practices (voice-overs, e.g. simultaneous translations) or the deterioration of the analog media before they were digitalized (e.g. dull or distorted sound) have impaired the sound signal in a number of files. There are cases where the sound is so muffled or distorted that transcription and auditory analysis are hopeless tasks. In other cases, some passages could not be transcribed and in yet others the transcriptions can only be of a tentative nature. Tokens in such passages were ignored in the following analysis, as were tokens whose specific realization was unclear, for example because of superimposed ambient noise.

The relatively small size and particular composition of the corpus impose certain limitations on linguistic analyses. For example, the corpus is too small to yield enough tokens for a quantitative analysis of a good number of morphosyntactic variables. Also, since the corpus mostly contains scripted monologues, i.e. material that was written to be spoken, we can expect relatively few deviations from the British norm. This is because nativization in African Englishes proceeds more quickly on the level of phonetics/phonology than in morphosyntax, the latter being more exonormative in its orientation towards BrE, particularly in the written mode (Brato and Huber 2012: 181). In spite of these limitations, 10,000 words are enough for a simple variationist study of frequent variables such as phonemes. Because of the varying sound quality of the recordings, it was decided to perform an auditory analysis, as this often allows for the identification of particular realizations of variables even if the sound signal is not pristine.

9 In the particular case of Mrs Simpson it turns out that her husband came from Dundee, Scotland (Ghana Office 1957: 191), and in 1962 was appointed judge in Sarawak, North Borneo and Brunei, by the Queen (The Singapore Free Press 1961: 2). In all likelihood, therefore, Mrs Simpson was also British and was therefore excluded from the corpus.

4 The development of four phonemic variables in Ghanaian English

The following is an exploratory auditory analysis of the corpus of early educated spoken GhE. Four phonological variables were chosen for the study, based on the variation they show between more “British” and “Ghanaian” variants in present-day GhE: (ing), (wh), (NURSE) and (STRUT). These variables will be investigated for the distribution of individual variants and the results will be compared with their distribution in present-day GhE. The aim is to determine whether during the nativization of GhE since independence we can observe an increase of Ghanaian variants at the expense of a pronunciation that is closer to Standard BrE. In this study, the corpus of early educated spoken GhE is taken to reflect the pronunciation at around 1960, while the present-day GhE pronunciation is determined on the basis of structured sociolinguistic interviews mainly conducted during a students’ excursion to Ghana in 2008.10 To ensure comparability of the data, only tokens from the reading passage and word list modules were considered here, as they come nearest to the scripted monologues in the 1960 corpus. Table 2 shows the speakers’ L1, gender, age and educational background (S = secondary school leaver, U = university student / graduate) at the time of recording (2008, 2010):

Table 2. Sociobiography of speakers of present-day educated spoken GhE

          Fante          Twi            Ga             Ewe
Female 1  23, U (2010)   26, U (2008)   20, U (2008)   29, S (2008)
Female 2  22, U (2010)   18, U (2008)   24, S (2008)   32, S (2008)
Female 3  26, U (2010)   24, U (2008)   22, U (2008)   22, S (2008)
Male 1    45, U (2008)   23, U (2008)   44, S (2008)   38, U (2008)
Male 2    23, S (2008)   31, U (2008)   19, U (2008)   21, S (2008)
Male 3    24, U (2008)   21, S (2008)   48, S (2008)   23, U (2008)

4.1 Variable 1. (ing) : [iŋ] ~ [in, ĩ]

The realization of /-ɪŋ/ as [-ɪn] in post-tonic -ing (e.g. in swimming) has been described for many English accents and is “practically universal” in varieties of English world-wide (Schneider 2004: 1124).

According to Wyld (1921: 289), the alveolar nasal in [-ɪn] for -ing has a long history and at one time was common in almost all varieties in England. In the 1820s, [-ɪŋ] emerged as a spelling pronunciation and developed into a prestige variant. At the time that Wyld was writing (late 1910s), the older [-ɪn] was still common in the dialects of “the South and South Midlands, and among large sections of speakers of Received Standard English”, but the new variant [-ɪŋ] had “a vogue among the educated at least as wide as the more conservative one with -n”. At the beginning of the twenty-first century, [-ɪn] has become even more restricted geographically and socially in Britain. It is today found “in Northern and West Midland English as a stigmatised feature” (Upton 2004: 1073). With regard to present-day GhE, Huber (2004: 858) observes that “RP /-ɪŋ/ in progressives or deverbal nouns is more often than not replaced by [-ɪn], cf. morning [mɔnin], leading [lidin], the meeting [dɛ mitin]”. This is corroborated by Adjaye (2005: 193, 194), who finds that in her 1980s data from Ewe, Ga and Akan L1 speakers, /ŋ/ “is commonly pronounced by Ghanaians as […] an alveolar nasal, [n], in […] the -ing post-tonic”. Likewise, Koranteng (2006: 242) says that in her data, collected in 2001, “the ‘ing’ form is in almost all instances realized as [ɪn]”. In most varieties of English today, [-ɪn] is the informal, non-standard variant, while [-ɪŋ] is the standard pronunciation. In GhE, however, “[-ɪn] cuts across all socioeconomic and educational levels. It is therefore not necessarily a marked or stigmatized form” (Adjaye 2005: 194).

10 I would like to thank the excursion participants from the University of Giessen and their partner students at the University of Ghana for their help in collecting the data. The interviews with the female Fante speakers were conducted by George Kodie Frimpong of the University of Ghana in 2010, for which I owe him my heartfelt thanks.

Simo Bobda (2003: 19) proposes that West African [-ɪn] “was brought, or reinforced, by the eighteenth century African Americans who settled in Sierra Leone” and Adjaye (2005: 194) speculates that it “would be fair to say that in Ghana, it was the [-ɪn] version that became established from the onset”. That [-ɪn] must already have been common in early twentieth century West African English can be inferred from the corrective in Harman (1931: 127), who taught in Nigerian and Gold Coast schools in the 1920s: “Avoid replacing ŋ by n in such words as something, talking […]”.

GhE [-ɪn] can participate in the optional process of syllable-final /n/-elision with compensatory nasalization of the preceding vowel (Adjaye 2005: 191; Huber 2004: 857; a process already observed in 1960s GhE by Strevens 1965: 114), so that meeting has the following GhE pronunciations:

(1) meeting [mitiŋ ~ mitin ~ mitĩ]

For the present study, a binary variable was constructed, contrasting [ŋ]-realizations (the British standard pronunciation) with realizations from which [ŋ] is absent (the “Ghanaian” pronunciation):

(2) (ing) : [iŋ] ~ [in, ĩ]

Tokens in neutralizing contexts were ignored, i.e. when the word following -ing had an initial velar plosive /k, g/ (e.g. wedding gifts), because this can trigger assimilation of [n] to [ŋ]. Figure 1 shows the ratio of the two variants in 1960 and 2008:



Figure 1. Realization of GhE (ing), 1960 and 2008. (χ² = 103.164, df = 1, p_two-tailed < 0.001***, p_sim < 0.001***; G = 106.028, df = 1, p_two-tailed < 0.001***)11

The above figure shows a strong (42 percentage points) and highly significant increase of the Ghanaian variant [in, ĩ] in just 50 years. While in 1960 the British standard variant clearly dominated, with almost 70% of the tokens realized as [iŋ], it has become a minority variant by the beginning of the twenty-first century, accounting for only 27% of the realizations. That is, in half a century a complete reversal has taken place in the ratio of the two variants.
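The kinds of statistics reported with Figure 1 (Pearson’s chi-squared, a simulated p-value and a G-test with Williams’ correction, as described in footnote 11) can be computed from a 2×2 table of token counts. The following is a minimal sketch using only the Python standard library. The chapter gives percentages and test statistics but no raw token counts, so the table below is a hypothetical placeholder (percentages treated as counts per 100 tokens) and its output is not expected to match the published values; all function names are my own.

```python
import math
import random


def margins(table):
    """Return row totals, column totals and the grand total of a count table."""
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    return rows, cols, sum(rows)


def expected(table):
    """Expected counts if period and variant were independent."""
    rows, cols, n = margins(table)
    return [[r * c / n for c in cols] for r in rows]


def pearson_chi2(table):
    exp = expected(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))


def g_williams(table):
    """Log-likelihood ratio G divided by Williams' correction factor q."""
    rows, cols, n = margins(table)
    exp = expected(table)
    g = 2 * sum(o * math.log(o / e)
                for obs_row, exp_row in zip(table, exp)
                for o, e in zip(obs_row, exp_row) if o > 0)
    q = 1 + ((n * sum(1 / r for r in rows) - 1)
             * (n * sum(1 / c for c in cols) - 1)
             / (6 * n * (len(rows) - 1) * (len(cols) - 1)))
    return g / q


def p_value_df1(statistic):
    """Upper-tail chi-squared probability, valid for df = 1 only."""
    return math.erfc(math.sqrt(statistic / 2))


def simulated_p(table, replicates=10_000, seed=1):
    """Monte Carlo p-value: reshuffle variant labels with all margins fixed."""
    rng = random.Random(seed)
    rows, cols, _ = margins(table)
    row_labels = [i for i, total in enumerate(rows) for _ in range(total)]
    col_labels = [j for j, total in enumerate(cols) for _ in range(total)]
    observed = pearson_chi2(table)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(col_labels)
        sim = [[0] * len(cols) for _ in rows]
        for i, j in zip(row_labels, col_labels):
            sim[i][j] += 1
        if pearson_chi2(sim) >= observed - 1e-9:
            hits += 1
    return (hits + 1) / (replicates + 1)


# HYPOTHETICAL counts, not the author's data: rows are 1960 and 2008,
# columns are [iŋ] and [in, ĩ].
counts = [[69, 31],
          [27, 73]]

print("chi2 =", round(pearson_chi2(counts), 3))
print("G (Williams) =", round(g_williams(counts), 3))
print("p (df = 1) =", p_value_df1(pearson_chi2(counts)))
print("simulated p =", simulated_p(counts))

# The chi-squared value reported for Figure 1 gives a p-value far below 0.001.
print("p for reported chi2 =", p_value_df1(103.164))
```

The simulated p-value serves as a check on the asymptotic one; with 10,000 replicates its smallest attainable value is 1/10,001, which is consistent with reporting it as < 0.001.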

Nevertheless, the data suggests that the position of the British standard variant [iŋ] is and was stronger in GhE than proposed by Adjaye (2005: 194), who appears to imply that [ŋ] has been so marginal “from the onset” that it “does not have a phonemic status except in the speech of a few individuals”.12 Figure 1 reveals that [in, ĩ] only became dominant in post-independence Ghana. This is also corroborated by Sey’s (1973: 152) observation that in late 1960s GhE there was variation between [-iŋ] and [-in]: “killing, singing, etc., may be pronounced /kilin/ and /siŋin/” (my underlining). If it is legitimate to extrapolate to the past from Figure 1, the Ghanaian [in, ĩ] realization in all probability only existed as a minority or marginal variant in the educated GhE of the first half of the twentieth century.

4.2 Variable 2. (wh) : [w] ~ [hw, ʍ, hʍ]

11 Here and below: p_sim = Pearson’s chi-squared test with simulated p-value, based on 10,000 replicates; G = log-likelihood ratio test (G-test) of independence with Williams’ correction.

12 Note that 13 out of the 38 speakers in Adjaye’s study only had an elementary school education (Adjaye 2005: 30). Her results thus probably reflect a somewhat more basilectal GhE than the educated GhE represented by the speakers in the present investigation. In terms of style, Adjaye’s data is directly comparable to my 2008 data as it consists of a word list and sentences and passages to be read out loud (2005: 36-38).



The pre-aspiration [hw] and/or devoicing [ʍ, hʍ] of orthographic <wh> is generally in decline in varieties of English. On the whole, /ʍ/ or its variants tend to be restricted today to rural, non-standard or conservative varieties and older speakers. In the British Isles, /ʍ/ is attested in Scottish Standard English (Stuart-Smith 2004: 61) and in Shetland (Melchers 2004: 42). In Ireland, Hickey (2004: 78, 81, 92-93) finds /ʍ/ in conservative Ulster Scots, Popular Dublin English, the rural South-West and West and variably in supraregional Southern Irish English. In American English, “dialects have retained a historically older consonant cluster with an initial velar fricative [x] before the approximant [w], so that, unlike many mainstream varieties of English, which is not homophonous with witch” (Schneider 2004: 1086). For varieties of English in the Pacific and Australasia, Burridge (2004: 1093) similarly remarks that “[a]s in other parts of the English-speaking world the distinction between /w/ and /hw/ has virtually disappeared […]. The /hw/ cluster is preserved only for the most conservative speakers of these varieties […]”.

Pre-aspiration/devoicing of <wh> is a feature that has hitherto gone largely uncommented in African Englishes in general and GhE in particular.13 To my knowledge, Gyasi (1991: 27) is the first to mention that in GhE “[t]he voiceless glottal fricative /h/ is clearly heard in where, when, white and wheat”. In the analysis of her 1980s data, Adjaye (2005: 208) says that “wh- words [are] usually pronounced [hw-] in GhE” and Koranteng (2006: 249) finds that “in words with spelling ‘wh’ […] there is the use of [hw-] or [w] alternatively”.

Figure 2 indicates the distribution of the voiced [w] and pre-aspirated/unvoiced variants [hw, ʍ, hʍ] of <wh> in the 1960 and 2008 corpora. <who-> spellings that generally lack a /w/ like who or whole were omitted from the analysis.
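The token selection just described can be approximated with a simple orthographic filter. This is a rough sketch rather than the procedure actually used for the study: the function names and the sample sentence are invented for illustration, and excluding every <who-> spelling is a heuristic that would also drop the few words such as whoa in which a /w/ is pronounced.

```python
import re


def is_wh_token(word):
    """True for <wh> spellings, except <who-> spellings (who, whole, whose ...)."""
    w = word.lower()
    return w.startswith("wh") and not w.startswith("who")


def select_wh_tokens(text):
    """Return the candidate (wh) tokens in a transcript, in order of occurrence."""
    words = re.findall(r"[A-Za-z']+", text)
    return [w for w in words if is_wh_token(w)]


# Invented example sentence, not taken from the corpus.
sample = "Who knows when the whole white house was built, or which wheat went where?"
print(select_wh_tokens(sample))
# prints ['when', 'white', 'which', 'wheat', 'where']
```

Each selected token would then still have to be coded by ear as [w] or as one of [hw, ʍ, hʍ].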

13 Apart from GhE (Huber 2004: 861), the only other African English for which the Handbook of Varieties of English (Schneider et al. 2004) mentions the /w/-/ʍ/ distinction is White South African English, in which it is found among “some (particularly older) Cultivated speakers” (Bowerman 2004: 940).



Figure 2. Realization of GhE (wh), 1960 and 2008. (χ² = 8.849, df = 1, p_two-tailed < 0.003**, p_sim < 0.004**; G = 8.808, df = 1, p_two-tailed = 0.003**)

We can observe a very significant increase of [hw, ʍ, hʍ] from about 29% in 1960 to about 49% in 2008. As with (ing), it is the “Ghanaian” variant that has risen in the half-century following independence. Regarding its origin, Huber (2004: 861) submits that this “is another feature that could have its historical origin in Scottish [possibly nineteenth/twentieth century missionary] influence in the Gold Coast, reinforced by spelling pronunciation”. Again, while the voiceless/pre-aspirated variant may have been present in early twentieth century GhE, Figure 2 suggests that it must have been in the minority.

4.3 Variable 3. (NURSE) : [ɜ] ~ [ε]

The /ɜː/ > /ε/ substitution in the NURSE lexical set is a characteristic that clearly distinguishes GhE from other African Englishes (Huber 2004: 851), where /ɜː/ > /ε/ is more restricted: most other Englishes in Africa replace the NURSE vowel in <or, our, ur, ir> spellings with /ɔ/ or /a/ (Simo Bobda 2003: 22, 28-29), but in Ghana /ε/ is general for all spellings representing the NURSE vowel.

For the Nigerian and Gold Coast English of the 1920s, Harman (1931: 57) records the following realizations of NURSE: [ε], [ɑ], [ɒ] and [ɔ]. In the literature on GhE, /ɜː/ > /ε/ was first mentioned by Schachter (1962: 18-19), who observed this substitution in speakers with a Twi L1 background. A decade later, Sey (1973: 145-147) called the phenomenon “extremely common” and proposed possible BrE dialectal or American English influence. In her mid-1980s data, Adjaye (2005: 141-144) found the following variants: fronted [ε, ɛ, ɛ] ca. 70%, central [ɜ] ca. 21%, backed [ɔ, o] ca. 5%, [a] ca. 4%. Koranteng (2006: 161-162) has similar proportions in her 2001 data: fronted [ε] ca. 79%, central [ɜ] ca. 19%, backed [ɔ] ca. 2%.

Simo Bobda (2000b: 190) hypothesizes that “the systematic occurrence of /ε/ for RP /ɜː/ is fairly recent. It must have supplanted an earlier /ɔ/ for the graphemes <or, ur, our>, as attested by data collected from Ghanaian speakers of the old generation”. According to Simo Bobda (2003: 33-34), /ɔ/ is “fading out in Ghana as the substitute for the NURSE vowel with <or, our, ur>” and has “over a generation” almost entirely been replaced by /ε/. He judges this replacement to be a conscious divergence occasioned by Ghanaians’ belief that they speak better English than other West Africans.

Figure 3 shows the ratio of the fronted [ε] and central [ɜ] variants of (NURSE) in the 1960 and 2008 corpora:



Figure 3. Realization of GhE (NURSE), 1960 and 2008. (χ² = 8.541, df = 1, p_two-tailed = 0.003**, p_sim = 0.004**; G = 8.418, df = 1, p_two-tailed = 0.004**)

In the half-century following independence, there has been a moderate (ca. 17%) but very significant rise of the “Ghanaian” fronted variant [ε]. [ɔ], marginal in Adjaye’s (2005) and Koranteng’s (2006) data, is not attested in my corpora, neither in 1960 nor in present-day GhE. The absence of [ɔ] and rather strong presence of [ε] in the 1960 corpus call into question Simo Bobda’s (2000b, 2003) assertion that [ε] is a recent innovation, having emerged “over a generation”, i.e. from around 1970. It is true that [ε] was not the majority variant in 1960, but it still accounted for about 28% of the NURSE tokens.14 If the rise of fronted [ε] was more or less constant, the rates in Figure 3 suggest that it must already have existed at the beginning of the twentieth century.

4.4 Variable 4. (STRUT) : [ε] ~ [a] ~ [ʌ] ~ [ɔ]

The realization of the STRUT vowel is a further distinguishing feature of GhE. While other Englishes in West Africa usually replace RP /ʌ/ by /ɔ/, GhE shows a lot of variability, including fronted, central and backed variants: [ε] ~ [a] ~ [ʌ] ~ [ɔ].

That STRUT can be realized as [ε] was first noticed around 1890 by a Cape Coast missionary, who reported that missionary school pupils pronounced butter like better and Hull like hell (Kemp 1898: 179). Harman (1931: 53-54) suggested that in 1920s West African English, [ε] was a hypercorrection of the common realization of STRUT as [ɔ], with speakers “push[ing] the tongue well forward to avoid making ɔ”. 30 years later, Schachter (1962: 18) pointed out that Twi L1 speakers realize RP /ʌ/ as [ɔ]. Similarly, with reference to the GhE of the late 1960s, Sey (1973: 145, 147) described [ɔ] as “widespread” and [ε] as a minority variant, “common in the Cape Coast area”. He did not specifically mention centralized realizations, but since he described GhE as a system “of tendencies rather than specific Ghanaian usage” (6), we can assume that [ʌ] was present, too. In addition, we can deduce from Sey’s (1973: 147) observation that “/ʌ/ does not occur in L1 but the most likely substitute for it would be /a/ and not /ɔ/ or /ε(:)/” that the [a] variant must have been marginal. Gyasi (1991: 27) only mentions /ɔ/ as a possible realization of STRUT. Interestingly enough, [a] has a much stronger position in Adjaye’s (2005: 71-75) mid-1980s recordings: [ε] ca. 13%, [a] ca. 35%, [ʌ] ca. 16%, [ɔ] ca. 36%. Koranteng’s (2006: 137-145) 2001 data show the following proportions: [ε] ca. 6%, [a] ca. 18%, centralized [ʌ, ä, ə] ca. 55%, backed [ɔ, ɒ] ca. 26%. The great variation between these figures (and the ones presented in Figure 4, below) may well be due to the fact that an auditory analysis of naturally occurring speech is always subject to perceptional differences between researchers and that it is not always easy to distinguish between [a] and [ʌ].

14 Koranteng (2006: 163) also questions Simo Bobda’s recent emergence hypothesis.

Simo Bobda (2000b: 188) observes that “only a generation or so ago, Ghana had /ɔ/ for cut, just, mother, done, etc. like the rest of West Africa” and submits that the realization of the STRUT vowel as [a] or [ε] is another change that has only recently occurred in GhE, again as a result of a conscious effort on the part of Ghanaians to dissociate their accent from that of other anglophone West Africans (Simo Bobda 2003: 33-34, 2000a: 258fn).

The development of the four (STRUT) variants in the 1960 and 2008 corpora can be seen in Figure 4. Tokens were identified based on the pronunciation as suggested by the OED. This includes function words like but, does, us or some, whose citation form contains an /ʌ/, because these tend to be realized with a full vowel quality in GhE rather than a schwa, even in unstressed positions.

Figure 4. Realization of GhE (STRUT), 1960 and 2008. (χ²=174.002, df=3, p(two-tailed)<0.001***, p(sim)<0.001***; G=169.705, df=3, p(two-tailed)<0.001***)
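The χ² and G statistics quoted in the captions of Figures 3 and 4 are computed from variant-by-corpus count tables. As the token counts are not given in the text, the following is only a minimal sketch with hypothetical counts. It assumes a plain Pearson χ² without continuity correction and a standard log-likelihood-ratio G; the function names are illustrative, and the simulated p-values (p(sim)) are not reproduced.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi2(table):
    """Pearson chi-square statistic (no continuity correction: an assumption)."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))


def g_statistic(table):
    """Log-likelihood-ratio statistic G = 2 * sum(O * ln(O / E))."""
    exp = expected_counts(table)
    return 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(table, exp)
                   for o, e in zip(obs_row, exp_row) if o > 0)


def chi2_p_value(stat, df):
    """Upper-tail p-value; closed forms for df = 1 and df = 3 only."""
    tail = math.erfc(math.sqrt(stat / 2))
    if df == 1:
        return tail
    if df == 3:
        return tail + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2)
    raise ValueError("only df = 1 and df = 3 are handled in this sketch")


# HYPOTHETICAL counts (rows: corpora, columns: variants), not the chapter's data.
counts = [[10, 20], [20, 10]]
df = (len(counts) - 1) * (len(counts[0]) - 1)
print(pearson_chi2(counts), g_statistic(counts), chi2_p_value(pearson_chi2(counts), df))
```

With two variants and two corpora the table has one degree of freedom, as in Figure 3; four variants give three, as in Figure 4. Feeding the df=1 statistics reported for Figure 3 into `chi2_p_value` yields values that round to the p-values given in that caption.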



Figure 4 shows that there has been a highly significant and considerable (31%) increase of [a] at the expense of central [ʌ] and back [ɔ] realizations of STRUT from 1960 to 2008. [ε] has remained marginal and is the only variant to have seen no significant change. The findings agree with Simo Bobda’s assertion that [a] is spreading but they contradict his hypothesis that it has only recently replaced older [ɔ]. In the 1960 corpus, [ɔ] is already a minority variant (21%), while 50% of the tokens are realized as [a]. This means that the rise of [a] must have started much earlier than supposed by Simo Bobda. Based on apparent time evidence, Huber (2004: 854) proposes that it must have begun in the 1930s. Figure 4 also shows that, contra Simo Bobda, [ε] is not a particularly new variant either. In fact, the missionary’s observation, quoted above, confirms that it was already around in the late nineteenth century.15 Note also that the RP variant [ʌ], still found in about 26% of the tokens in 1960, has all but vanished in the 2008 data (ca. 3%).

5 Conclusion

Because of the lack of data, the actual structural development of postcolonial Englishes has not been studied much hitherto. The recordings from the GBC Sound Archive thus represent invaluable data from the period just after Ghana’s independence in 1957. For the first time, the availability of a sufficient amount of real-time data makes it possible to study the diachronic phonological development of GhE in a variationist framework. The comparison of the development of four phonological variables in spoken educated GhE at around 1960 and 2008 has shown that there have been considerable changes over the last half-century or so. In both the 1960 and the 2008 corpus, each of the variables has British (RP) variants but also alternative, more nativized Ghanaian ones.
The analysis has shown that in all four variables, the RP variant has receded over time: (ing) : [-ɪŋ] by 42%, (wh) : [w] by 20%, (NURSE) : [ɜ] by 17% and (STRUT) : [ʌ] by 23%. If this is representative of the development of the GhE phonological system as a whole, it suggests that spoken educated GhE was closer (but not identical) to RP at the time of Ghana’s independence and that Ghanaian variants have gradually been replacing the British ones over the past 50 to 60 years. That is, we can observe a development from a more exonormatively to a more endonormatively oriented variety. While this is what would be expected during the Nativization Phase in the evolution of a New English (following Schneider 2003, 2007), it was also shown that all variants that are characteristic of GhE today were already in existence in 1960 (even if they were not the majority variants then), in some cases considerably earlier than assumed in the literature. Simo Bobda (2000b: 190, 2003: 33-34, 2000a: 258fn) suggested that in the case of NURSE and STRUT, GhE has developed away from the pronunciation patterns of other West African Englishes since about 1970 as Ghanaians dissociated themselves linguistically from other anglophone West Africans. Attitudes towards other West Africans’ English may have furthered the spread of the characteristically Ghanaian variants, but as this study has shown, they were certainly not the reason for their emergence. If Simo Bobda’s dissociation theory is correct, a possible scenario may have been that Ghanaians selected and promoted those variants that were most dissimilar from the dominant realization in other West African Englishes.

All in all, the four variables have undergone considerable change, attesting to rapid phonological restructuring during the nativization of GhE. The growing acceptance of an endonormative model as a corollary of linguistic emancipation following political independence as well as the dissociation from other West African pronunciation practices may have played a role in this. Another factor in the spread of distinctly Ghanaian variants may have been the 1951 Accelerated Development Plan for Education, which aimed at providing free basic education for all children and led to the establishment of a large number of new primary schools. The resulting staff shortage was alleviated by employing primary school leavers as teachers, dramatically decreasing the percentage of trained teachers in primary and middle schools from over 52% to 28% (Inkoom 2012: 10). Together with the early transition from African languages to English as the medium of instruction, this may have led to an endonormative shift, the Ghanaianization of pronunciation (see also Boadi 1971: 56, Huber 2004: 844). Diachronic phonological studies of early recordings of postcolonial Englishes can provide much-needed data to test and refine evolutionary models that so far have been based mainly on external language history and synchronic structural data.

15 For reasons of space, individual factors contributing to the different realizations of GhE STRUT cannot be discussed here. Interested readers are referred to Simo Bobda (2000b: 187-189) and Huber (2004: 851-854).
Once more early spoken GhE data become available, a more sophisticated study of the linguistic and extralinguistic factors determining the choice of individual variants can be conducted, providing insights into possible restructuring processes within linguistic variables during nativization.

References

Adjaye, Sophia A. 2005. Ghanaian English Pronunciation. Lewiston: The Edwin Mellen Press.

British Broadcasting Corporation (BBC) World Service. (n.d.). Historic moments from the 1930s. http://www.bbc.co.uk/worldservice/specials/1122_75_years/page2.shtml > Listen to a highlight from 1935 (2014-06-15).

Biewer, Carolin, Tobias Bernaisch, Mike Berger and Benedikt Heller 2014. Compiling The Diachronic Corpus of Hong Kong English (DC-HKE): motivation, progress and challenges. Poster presented at the 35th ICAME Conference, Nottingham, England, 30.04.-04.05.2014.

Boadi, Lawrence A. 1971. Education and the role of English in Ghana. In: John Spencer (ed.) The English Language in West Africa. London: Longman, pp. 49-65.

Bowerman, Sean 2004. White South African English: phonology. In: Schneider et al. (eds), pp. 931-942.

Brato, Thorsten. (in prep.). Compiling a historical written corpus of Ghanaian English: Methodological and theoretical considerations.


Brato, Thorsten and Magnus Huber 2012. English in Africa. In: Raymond Hickey (ed.) Areal Features of the Anglophone World. Berlin: Mouton de Gruyter, pp. 161-185.

Buckley, Steve, Berifi Apenteng, Aly Bathily and Lumko Mtimde 2005. Ghana Broadcasting Study. Report for the Government of Ghana and the World Bank. http://siteresources.worldbank.org/INTCEERD/Resources/WBIGhanaBroadcasting.pdf (2014-06-01).

Burridge, Kate 2004. Synopsis: phonetics and phonology of English spoken in the Pacific and Australasia region. In: Schneider et al. (eds), pp. 1089-1097.

Census Office 1964. 1960 Population Census of Ghana. Vol. III: Demographic Characteristics of Local Authorities, Regions and Total Country. Accra: Census Office.

Ghana Broadcasting Corporation (GBC) 2010. About Ghana Broadcasting Corporation. http://archive.today/bfqG0 (2014-06-15).

Ghana Office 1957. Ghana Today. London: Information Section, Ghana Office.

Ghana Statistical Service 2013. 2010 Population and Housing Census. Demographic, Social, Economic and Housing Characteristics. Accra: Ghana Statistical Service. http://www.statsghana.gov.gh/docfiles/publications/2010_PHC_demographic_social_economic_housing_characteristics.pdf (2013-11-23).

Government of the Gold Coast 1950. The Gold Coast Census of Population 1948. Report and Tables. London: The Crown Agents for the Colonies.

Gyasi, Ibrahim K. 1991. Aspects of English in Ghana. English Today 26: 26-31.

Harman, H.A. 1931. The Sound of English Speech. A Handbook for African Students. London: Longmans.

Hay, Jennifer, Margaret Maclagan and Elizabeth Gordon 2008. Dialects of English. New Zealand English. Edinburgh: Edinburgh University Press.

Hickey, Raymond 2004. Irish English: Phonology. In: Schneider et al. (eds), pp. 68-97.

Hoffmann, Sebastian, Andrea Sand and Peter K. W. Tan 2012. The Corpus of Historical Singapore English. A First Pilot Study on Data from the 1950s and 1960s. Paper presented at the 33rd ICAME Conference, Leuven, Belgium, 30.05.-03.06.2012.

Huber, Magnus 1999. Atlantic English Creoles and the Lower Guinea Coast. A case against Afrogenesis. In: Magnus Huber and Mikael Parkvall (eds) Spreading the Word. The Issue of Diffusion among the Atlantic Creoles. London: University of Westminster Press, pp. 81-110.

Huber, Magnus 2004. Ghanaian English: Phonology. In: Schneider et al. (eds), pp. 842-865.

Huber, Magnus 2012. Ghanaian English. In: Bernd Kortmann and Kerstin Lunkenheimer (eds) The Mouton World Atlas of Variation in English. Berlin / New York: Mouton de Gruyter, pp. 382-393.

Huber, Magnus 2015. Stylistic and sociolinguistic variation in Schneider’s Nativization Phase. T-affrication and relativization in Ghanaian English. In: Sarah Buschfeld, Thomas Hoffmann, Magnus Huber and Alexander Kautzsch (eds) The Evolution of Englishes. The Dynamic Model and beyond. Amsterdam: John Benjamins, pp. 86-106.


Huber, Magnus and Sebastian Schmidt 2011. New ways of analysing the history of varieties of English. Early Highlife recordings from Ghana. Paper presented at the 2nd ISLE Conference, Boston, USA, 17.06.-21.06.2011.

Inkoom, Agatha 2012. Implementation of initiatives to reform the quality of education in rural Ghanaian junior high schools. PhD thesis, School of Education, Faculty of Education and Arts, Edith Cowan University, Perth, Western Australia. http://ro.ecu.edu.au/cgi/viewcontent.cgi?article=1485&context=theses (2014-07-09).

The International Corpus of English (ICE) 2014. http://ice-corpora.net/ice (2014-07-09).

Kemp, Dennis 1898. Nine Years at the Gold Coast. London: MacMillan and Co. Limited.

Koranteng, Louisa Ann 2006. Ghanaian English. A description of its sound systems and phonological features. PhD thesis, Department of English, University of Ghana.

Lewis, M. Paul, Gary F. Simons and Charles D. Fennig (eds) 2013. Ethnologue: Languages of the World. 17th edn. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com (2013-11-23).

The London-Oslo-Bergen (LOB) Corpus, original version (1970-1978), compiled by Geoffrey Leech, Lancaster University, Stig Johansson, University of Oslo (project leaders), and Knut Hofland, University of Bergen (head of computing).

Melchers, Gunnel 2004. English spoken in Orkney and Shetland. In: Schneider et al. (eds), pp. 35-46.

Peasah, Kwame Ofosuhene 2009. Knapsack algorithm. A case study of Garden City Radio (a local radio station in Kumasi). MSc thesis, Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana. http://dspace.knust.edu.gh/jspui/bitstream/123456789/746/1/KWAME OFOSUHENE PEASAH.pdf (2014-06-15).

Schachter, Paul 1962. Teaching English Pronunciation to the Twi-Speaking Student. Legon: Ghana University Press.

Schneider, Edgar W. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79(2): 233-281.

Schneider, Edgar W. 2004. Global synopsis: phonetic and phonological variation in English world-wide. In: Schneider et al. (eds), pp. 1111-1137.

Schneider, Edgar W. 2007. Postcolonial English. Varieties around the World. Cambridge: Cambridge University Press.

Schneider, Edgar W., Kate Burridge, Bernd Kortmann, Rajend Mesthrie and Clive Upton 2004. A Handbook of Varieties of English. Vol. 1: Phonology. Berlin, New York: Mouton de Gruyter.

Sey, Kofi A. 1973. Ghanaian English. An Exploratory Survey. London: Macmillan.

Simo Bobda, Augustin 2000a. Comparing some phonological features across African accents of English. English Studies 81: 249-266.

Simo Bobda, Augustin 2000b. The uniqueness of Ghanaian English pronunciation in West Africa. Studies in the Linguistic Sciences 30: 185-198.

Simo Bobda, Augustin 2003. The formation of regional and national features in African English pronunciation. An exploration of some non-interference factors. English World-Wide 24: 17-42.


The Singapore Free Press 1961. No. 16308, 1961-12-29.

Strevens, Peter 1965. Pronunciations of English in West Africa. In: Peter Strevens (ed.) Papers in Language and Language Teaching. London: Oxford University Press, pp. 110-122.

Stuart-Smith, Jane 2004. Scottish English: Phonology. In: Schneider et al. (eds), pp. 47-67.

Upton, Clive 2004. Synopsis: phonological variation in the British Isles. In: Schneider et al. (eds), pp. 1063-1074.

Wyld, Henry Cecil 1921 (1920). A History of Modern Colloquial English. Second edition. London: T. Fisher Unwin.


19 Earlier South African English

Ian Bekker

1 Introduction

This article is focused on providing, through the analysis of early recorded data, a window on past varieties of English used in South Africa, particularly by those born and raised in this country. The data and speakers analyzed for this article form part and parcel of a larger collection of recordings obtained from the sound-archives of the South African Broadcasting Corporation (SABC) in Auckland Park, Johannesburg, South Africa1. These recordings cover a range of genres, including news-broadcasts, recorded speeches, interviews, etc.2 While there are a number of recordings dated earlier than the three selected for analysis below, they were deemed unsuitable for analysis for a number of different but not mutually exclusive reasons, e.g. poor sound quality and uncertainty as to the birth-place or first-language of the subject concerned. Thus, by way of example, the earliest data available is from a 1922 recording of Sir Percy Fitzpatrick, born in 1862 in King William’s Town in the Eastern Cape (see Map 1 below), author of Jock of the Bushveld and prominent South African financier and politician3. The recording is, however, of rather poor quality and, in addition, it is clear that Fitzpatrick spent much of his early schooling career in England (to be specific, at Downside School near Bath, Somerset) and is thus not a particularly useful subject in terms of trying to gain a perspective on early varieties of South African English (SAfE). Other recordings, particularly from the 1930s and 1940s, are of interest, but it is often unclear whether the individuals concerned were British or South African born or, in some cases, L1-English or L1-Afrikaans. The choice of the three eventually selected recordings was thus based on a number of practical considerations and one theoretical assumption.
The considerations were those of a standard of sound quality high enough to allow for acoustic analysis, as well as clear evidence that the subject was South-African born and L1-English. The theoretical assumption was that which lies behind the so-called ‘apparent-time hypothesis’, common in Labovian variationist sociolinguistics as well as in traditional dialectology, and which is explained in, for example, Chambers (2003: 202-203) in the following way:

For the stages of life beyond young adulthood, our best evidence indicates that ... people’s speech preserves markers, some subtle and some blatant, that indicate where they have been. For most people, these markers include tell-tale signs of the home-dialect where they spent their childhood, the fossilized slang of a faded adolescence, and the fine adjustments of maturity. Having worked their way through these formative periods, people reach a point where the range of styles and the inventory of socially significant variants are deemed sufficient, at least subconsciously, for all practical purposes in the situations they find themselves in.

1 Particular thanks go to Retha Buys from the SABC’s sound-archive unit for providing me with the archival material and digitizing the final selection.

2 More work still needs to be done in terms of cataloguing the data collection.

3 See the Wikipedia entry on Sir Percy Fitzpatrick for more information; accessed on 26 October 2013.

Map 1. Southern Africa in 1898

Given the above assumption, it was decided to focus on three recordings, all of subjects born in South Africa in the second half of the nineteenth century and thus, in different ways, providing a window to the linguistic past of the English-speaking community of this country. These three speakers were also chosen on the basis that, impressionistically, they appear to represent three different ‘levels’ of speech in early SAfE, each individual using speech that is further from or closer to a ‘standard’ or Received Pronunciation (RP) accent of the time. This is similar to the three-way distinction provided in, for example, Lanham and Traill (1962), i.e. South African Received Pronunciation (SARP) ‘A’, SARP ‘B’ and non-SARP SAfE, echoed in turn by the three modern SAfE sociolects: Cultivated/Conservative SAfE, General/Respectable SAfE and Extreme/Broad SAfE. Naturally, the links between English-speaking South Africa and Britain were particularly strong during the historical period in question, so any analysis of early SAfE speech needs to keep this factor in mind.

Map 1 shows Southern Africa during this period. Two areas were under British control at the time: the Cape Colony (in which one of the subjects was born) and Natal (in which the other two were born). The Transvaal and the Orange Free State (now part of South Africa but with different names) were at the time independent republics, subsequently overthrown by the British and eventually incorporated into the Union of South Africa as a result of the Second Anglo-Boer War (1899-1902).

In Section 2 below the individual subjects are first introduced in broad biographical terms, and then a broad impressionistic analysis is provided, including information on consonantal features. The emphasis of the main (acoustic) analysis is, however, on the qualities of the vowels of each of the speakers. In each case details concerning the analytical procedure are given and then a full vowel-chart is provided for consideration (Figures 1-3). Each vowel-chart is then briefly analyzed in terms of what it ‘says’ about the vowel qualities of the speaker concerned. In what follows Wells’ (1982) well-known lexical sets are used as a convenient basis for description and analysis. Section 3, in contrast, is focused on a comparison between the three speakers, and Figures 4 to 7 provide a useful basis for the relevant comparisons. These place the acoustic results of all three speakers on the same plot, each plot reserved for one of Wells’ (1982) four so-called part-systems, the details of which will be explained further below. Section 4 details a number of tentative conclusions and generalizations gleaned from the above-mentioned analysis, hopefully opening up avenues for future research.

2 The individual recordings

As mentioned above, each of these subjects appears to have been differentially influenced by the standard British English of the time, with the first subject showing the least influence, the third subject the most influence, while the second subject seems to lie somewhere in-between. The various analyses below will provide more evidence for this broad assertion.

2.1 Mr. Flemming

The first recording is of one Mr. Charles Flemming (henceforth CF), who was recorded in 1958 at the age of 104. He was thus born in 1854.
From the interview it is clear that CF is the son of an 1820 Settler4 and born in South Africa in the Eastern Cape; in all likelihood in Port Elizabeth (see Map 1). Given the short duration of the recording (2 minutes) it is possible to provide the full transcription:

We didn’t stop too long in P.E. [Port Elizabeth], we shifted to Grahamstown, from Grahamstown we shifted to Bedford ... my father was an 1820 wagon builder by trade ... and of course timber they couldn’t buy in those days ... but in Bedford there was bush ... that he could go and cut his own timber to build wagons ... and carriages ... and the country was very rough yet ... I remember down in the Cape elephants were still knocking about ... even after they built the railway ... used to stop the train sometimes from going along ... once going down to P.E. ... the elephant was standing on the line and they had to stop the train until he moved on ... coz they didn’t worry about shooting them you see ... but the lions ... used to be about and ... wolves ... we had to be very careful ... lot of wolves about in those days ... I’m quite proud of a ... lot of big buildings I put up ... and schools ... I did build bridges and that was on the Cape-Natal railway ... [undecipherable] in Seaview ... the little Catholic church there, we put it up just the two of us ... just me and my son, my son was still a [ap]prentice under me ... me and my [undecipherable] built the government building [undecipherable] ... that was in the early days, we had to transport the stuff up from Ladysmith ... by wagons ... we hired transport ... we took a short cut through the Free State ... they were old transport wagons ... [undecipherable] that way too, it was the best way for them to go, but there were no roads ... sometimes I had to walk fifteen miles to my job ... you got ten bob a day [undecipherable] you were getting good pay ... and in those days we had to work from sunrise till sunset ... not like today, hey [laughter], only eight hours ... you were so dead-beat that you ... glad to have, get something to eat and get into bed.

4 The 1820 Settlers were the first permanent English-speaking settlers in South Africa. In terms of some models of the development and formation of SAfE, the Eastern Cape settlement (centered particularly around Port Elizabeth and Grahamstown – see Map 1) was the ‘birthplace’ of modern-day SAfE. See for example Lanham (1967: 105) and Lass (1995: 93). Bekker (2012), however, provides an alternative model.

On an impressionistic level, the first thing one notices is the pronunciation of the word church as [kjəts]. The word prentice is also used instead of apprentice and pronounced as [pɹɪntəs], while the town Bedford is pronounced as [bɪtfəd] instead of [bedfəd]. Voiceless stops in CF’s speech have a peculiar quality, which might be described as unaspirated5, but which some listeners have even described as ‘Indian-sounding’ i.e. retroflex. The possible influence of CF’s advanced age at the time of recording complicates this issue somewhat, however.

5 According to Ray Hickey (p.c.), there are some early twentieth-century recordings of RP-speakers (or more accurately of those who adopted RP later in life) who have unaspirated stops. An example is the poet T.S. Eliot who apparently did not aspirate his voiceless stops at all.

[Figure 1 plot: axes F2 and F1 in Lobanov-normalised units; plotted labels: BA, DR, FA, FL, FO, GOA, GOO, KI, LO, TR, ST, MO, PR, TH, early, there, KITL, KITV, KITU]

Figure 1. Mr. Flemming’s vowel chart (Lobanov-normalised)

Another consonantal feature is the use of [w] or even [β] for /v/6. Turning to the vowels, the use of an [æ]-quality in one of the tokens of elephant is quite apparent on an impressionistic level, as well as an occasional fronted quality to the BATH vowel, which in the case of two out of three instances of the word transport, has an Australian [a] or even American-like [æ] quality. The two other instances of BATH (which couldn’t be analyzed acoustically) were, however, rendered with a typically SAfE backed [ɑː]: father and after. David Britain (p.c.) also mentions that ‘two instances of the NURSE vowel, “church” and “work”, were short and quite open and, with respect to frontness, central. Trudgill reports this for Norfolk’7.

While the sound-quality of the recording was, understandably, not up to modern standards, it did allow for the acoustic analysis of 165 vowel tokens, the results of which are represented in Figure 1. In Figure 1 (as well as Figures 2 and 3) the traditionally short vowels of English are bolded and italicized, and the name of the relevant lexical set is abbreviated. Although for most of Wells’ (1982) lexical sets there was enough data to determine a mean value, there were a few lexical sets that were not represented at all (e.g. CURE and NEAR) or where there were only one or two tokens; thus in the case of SQUARE there were two measurable tokens of the word there (averaged out for the purposes of Figure 1) and in the case of NURSE there was one analyzable token of early. The remaining lexical sets were represented by a minimum of three (3) (i.e. FOOT and BATH, with three tokens of transport) to a maximum of twenty-three (23) tokens i.e. FACE. For all tokens, F1 and F2 measurements were taken using Praat (Boersma and Weenink 2013): for the traditionally short vowels at the temporal mid-point, and for the traditionally long English vowels (both monophthongal and diphthongal) at two measurement points, roughly the 25% temporal point and the 75% temporal point. The formant data for CF were also subjected to normalisation (Lobanov 1971) to render them comparable with the data from the other two subjects, both female. Turning to Figure 1, a brief perusal of this subject’s vowel-chart alerts one to the following main features:

6 According to Trudgill (2008: 190), ‘in many local varieties spoken in the southeast of England in the eighteenth and nineteenth centuries, prevocalic /v/ in items like village was replaced by /w/’. More research needs to be conducted in terms of reconstructing the regional provenance of some of the more unusual (from a SAfE perspective) features in CF’s speech. Space limitations have meant that this task could not be undertaken here.

1. the concentration of vowels in the front-low area of the vowel-space. Within this area, the close approximation of DRESS (DR in Figure 1) to TRAP (TR) is particularly noticeable;

2. Of particular interest too is the relatively fronted BATH (BA) vowel, confirming the original impressionistic analysis, and which overlaps in quality with the STRUT (ST) vowel in a very Australasian-like fashion. This, in fact, provides some evidence for the claim made in Bekker (2012) that the backing of BATH in SAfE is perhaps more the result of later developments in the formation of SAfE rather than due to inheritance from the 1820-input British dialects; this assessment should however be tempered by the observation that the un-analyzed tokens of BATH (father and after) were, impressionistically, fully-backed, thus pointing to variability in this regard in the speech of CF8;

3. Apparent too are the rather narrow PRICE (PR) and MOUTH (MO) vowels; in the case of the latter we have a somewhat fronted onset with an (unusual) slight off-glide to a lower position; and

4. CF’s KIT (KI) shows a classic SAfE KIT-Split (Wells 1982: 612-613), with clear polarization in phonetic space between KIT in high-front position (KITV in Figure 1; see note 9), KIT before tautosyllabic /l/ (KITL) and unmarked KIT (KITU).

7 Thanks also go to David Britain for pointing out the possibility of lack of aspiration with respect to the voiceless stops as well as the qualities of /v/. Comments by Andrew van der Spuy were also useful for this analysis.

8 Thus similar to Irish English which, according to Hickey (p.c.), displays lexical distribution with respect to this lexical set: thus father has a back vowel while all other instances of TRAP/BATH (including after) have a low central vowel.
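The Lobanov normalisation applied to the formant data can be sketched as a per-speaker, per-formant z-score. The block below is a minimal illustration only: the measurements are hypothetical rather than taken from the recordings, the function name is invented, and the use of the sample standard deviation is an assumption (implementations differ on this point).

```python
from statistics import mean, stdev


def lobanov_normalise(formant_values):
    """Z-score one speaker's values for one formant: (F - mean) / sd."""
    mu = mean(formant_values)
    sd = stdev(formant_values)  # sample SD; an assumption of this sketch
    return [(f - mu) / sd for f in formant_values]


# HYPOTHETICAL F1/F2 measurements in Hz for a single speaker.
f1_hz = [310.0, 450.0, 620.0, 780.0, 540.0]
f2_hz = [2250.0, 1900.0, 1650.0, 1300.0, 950.0]

# Each formant is normalised separately, and each speaker separately.
f1_norm = lobanov_normalise(f1_hz)
f2_norm = lobanov_normalise(f2_hz)

for f2, f1 in zip(f2_norm, f1_norm):
    print(f"F2 = {f2:6.2f}   F1 = {f1:6.2f}")
```

Because every speaker's values are re-expressed relative to that speaker's own mean and spread, the resulting unitless scores fall roughly between -2.5 and 2, and speakers with different vocal tract sizes can be placed on a common plot.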

2.2 Ms. Murchie

The second subject is one Ms. Murchie (henceforth MM), recorded in Durban in 1970, when the subject was 97 years old (i.e. born in 1873). It is clear from the recording that she was born in Durban, the daughter of a settler10. There is some evidence in the recording to suggest that her mother arrived in Durban in 1863. While it is clear that the subject spent most of her childhood in South Africa and received her schooling there, it is equally clear that time was also spent abroad in England, particularly in early adulthood.

9 KITV here refers to KIT in the context of a velar (kit, lick, king), after /h/ (hit) or initially (it).

10 The main Natal Settlement took place between 1848 and 1862 and comprised individuals generally of a class status higher than those who settled in the Eastern Cape; see for example Hattersley (1950) for an early history of this settlement.

[Figure 2 plot: axes F2 and F1 in Lobanov-normalised units; plotted labels: BA, CH, DR, FA, FL, FO, GOA, GOO, KI, LO, MO, NE, NU, PR, SQ, ST, TH, TR, sure, KIV, KIU, KIL]

Figure 2. Ms. Murchie’s vowel chart (Lobanov-normalised)

On an impressionistic level, MM sounds much more ‘standard’ than CF, although it is very clear that she is nonetheless South African, echoing Lanham and Traill’s (1962) later SARP ‘B’. Some features which stand out as ‘standard’ are the use of [ɪ], as opposed to [ə ~ ɤ], in words such as did and milk and a clearly unrounded GOAT nucleus. The occasional NURSE token also sounds unrounded, unlike the traditionally rounded SAfE NURSE i.e. [øː]. Less ‘respectable’ (and more typically SAfE) features, however, include a clearly fronted MOUTH vowel.

The analysis of the acoustic data generally followed the same principles as in the case of CF (formant-analysis using Praat and Lobanov normalization), although in this case there was much more data. In the case of MM, therefore, a target of 20 tokens per lexical set was aimed for, reached in all cases except NEAR (16 tokens), CHOICE (10 tokens) and CURE (one token of sure). In the case of KIT, 40 tokens were analyzed, given the special interest in the SAfE literature in the allophonic distribution of this lexical set i.e. the KIT-Split. An attempt was made not to analyze more than two tokens of the same word and this was mostly achieved, except in one or two lexical sets (e.g. SQUARE, the analysis of which had to depend on numerous cases of the word there). The overall number of tokens analyzed for MM was 369. Figure 2 provides a graphic representation of the relevant acoustic data. Some of the more obvious features of MM’s vowel-space include the following:

1. Clearly diphthongal PRICE (PR in Figure 2) and FACE (FA) vowels;
2. A fronted (and raised) MOUTH (MO) vowel, thus suggestive of Diphthong Shift (Wells 1982) and the so-called PRICE-MOUTH Crossover11;
3. A backed BATH (BA) vowel (identical in quality with LOT (LO));
4. A narrow GOAT (GOA) vowel;
5. A lack of phonetic polarization with respect to the KIT (KI) vowel12 i.e. no KIT-Split; and
6. A raised DRESS (DR) vowel.
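The Lobanov normalisation applied to the formant measurements behind these vowel charts can be sketched as follows. Each formant value is converted to a z-score using that speaker's own mean and standard deviation for the formant in question, i.e. z = (F - mean) / sd, which is why the chart axes run from roughly -2.4 to 1.8 rather than in Hz. This is a minimal illustration only: the chapter does not state which tool performed the normalisation, the function name `lobanov_normalise` is hypothetical, the use of the sample standard deviation is an assumption, and the formant values are invented for the example rather than taken from the recordings.

```python
from statistics import mean, stdev


def lobanov_normalise(tokens):
    """Z-score F1 and F2 within a single speaker (Lobanov 1971).

    tokens: list of (label, f1_hz, f2_hz) tuples, all from the SAME speaker.
    Returns a list of (label, f1_z, f2_z) tuples in the same order.
    """
    f1_values = [f1 for _, f1, _ in tokens]
    f2_values = [f2 for _, _, f2 in tokens]
    # Assumption: sample standard deviation; some tools use the population form.
    f1_mean, f1_sd = mean(f1_values), stdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), stdev(f2_values)
    return [
        (label, (f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for label, f1, f2 in tokens
    ]


# Invented formant values in Hz, for illustration only (not from the recordings).
example_tokens = [
    ("FLEECE", 350.0, 2600.0),
    ("TRAP", 700.0, 2000.0),
    ("BATH", 750.0, 1100.0),
    ("GOOSE", 380.0, 1500.0),
]

normalised = lobanov_normalise(example_tokens)
for label, f1_z, f2_z in normalised:
    print(f"{label:7s} F1={f1_z:+.2f} F2={f2_z:+.2f}")
```

Because each speaker is scaled against his or her own vowel space, normalised values of this kind can be overlaid across speakers, which is what the cross-subject comparisons in Figures 4 to 7 rely on.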

2.3 The Lady in White (Ms. Gibson)

This speaker has the most (old-fashioned) RP-like dialect of the three South-African born speakers and from the interview it is clear that she, and her family, had close ties with England. Her speech is most likely characterizable as Wells’ (1982) ‘near-RP’13.

11 A continuation of the Great Vowel Shift (GVS), whereby the nuclei of PRICE and MOUTH are first lowered and then ‘cross over’ in the sense that the traditionally front nucleus of PRICE is backed and the traditionally back nucleus of MOUTH is fronted (Wells 1982: 310). Generally more conservative (as in ‘standard’) varieties do not display this crossover, while many ‘vernacular’ varieties (e.g. Cockney, Australian English, New Zealand English etc.) do. More ‘archaic’ varieties of English (i.e. those which have not undergone the full force of the GVS) often have PRICE and MOUTH nuclei in a closer, more centralized position, as for example found on Martha’s Vineyard in Labov’s classic 1963 study on the dialect of this island.

12 See KITV, KITL and KITU in Figure 2.

13 In other words, RP is not her ‘mother-tongue’ dialect, but rather the dialect she has attempted to model. Wells (1982: 279-280) also makes a distinction between ‘U-RP’ (upper-crust RP) and ‘Mainstream RP’. ‘Conservative RP’ and ‘Advanced RP’ are, moreover, two chronological variants of U-RP. I suspect, given the highly affected nature of some aspects of LW’s speech, that her speech is modelled on Conservative RP.


[Vowel plot: axes F1 and F2, Lobanov-normalised. Plotted labels: BA, CH, CU, DR, FA, FL, FO, GOA, GOO, KI, LO, MO, NE, NU, PR, SQ, ST, TH, TR, KIV, KIU.]

Figure 3. Ms. Gibson’s vowel chart (Lobanov-normalised)

The exact date of the interview is unclear but the subject (LW henceforth) was born in Durban in April 1888 and schooled in that city as well. Her grandfather arrived in Natal in 1851 (from King’s Lynn, Norfolk) and her father appears to have joined him in 1881 (from Woolwich, London). The title ‘Lady in White’ appears to relate to her singing career and, from her own account, she seems to have achieved a fair degree of international success in this regard. Her father was a prominent public figure of the time, with, for example, ties to the Prime Minister of Natal in the late nineteenth century, and much of the interview is taken up with a description of her and her family’s ties with numerous luminaries both in South Africa and abroad. She is decidedly upper-class, an assessment clearly reflected in her speech. It should also be mentioned that the interview was of a prepared nature, thus lending itself to the use of a particularly formal style of speech: the interviewer had clearly sent the subject a set of questions before the interview and it is also clear that the subject had spent time constructing notes of some sort for herself. At some junctures in the interview it seems in fact as if she is reading.

On a consonantal level, two outstanding features of her speech are the clear, virtually constant maintenance of a /hw ~ w/ distinction (e.g. which vs. witch) and the use of an occasional (highly affected) tapped ‘r’ (e.g. in inspi[ɾ]ation), which Wells (1982: 282) characterizes as ‘typical of some varieties of U-RP’; see also Fabricius, this volume. On a purely impressionistic level her GOAT and FACE vowels stand out as being particularly front, close and narrow and her MOUTH vowel is fully-backed. Her SQUARE vowel often has an in-glide to [ə]. The PRICE vowel is generally fronted and occasionally glide-weakened and there is no evidence of a KIT-Split in her speech.

In terms of the acoustic analysis of vowel quality, the same procedure was followed as in the case of MM, though the slightly longer interview allowed for the collection of a little more data: while the same lexical sets as in the case of MM were relatively poorly represented, more tokens could be analyzed: nineteen (19) for NEAR, fifteen (15) for CHOICE and six (6) for CURE. The total number of tokens analyzed was 381. Figure 3 provides a graphic illustration of the relevant results. Some of the features which stand out and which generally confirm the broad impressionistic analysis above are as follows:

1. Very high, front and narrow FACE (FA) and GOAT (GOA) vowels;
2. A backed nucleus for MOUTH (MO), which is also glide-weakened;
3. A clear in-gliding diphthong in the case of SQUARE (SQ); and, less clearly, for NURSE (NU);
4. A raised TRAP (TR) and DRESS (DR) vowel; and
5. No evidence of a Second-Force Merger (Wells 1982: 237)14 with respect to CURE i.e. [ʊə ~ ʊɐ], not [ɔː]-like.

3 Comparative analysis

Figures 4 to 7 reconfigure the data in Figures 1 to 3, in order to allow for an easier comparison of this data. Figure 4 focuses on Wells’ (1982) Part-System A (i.e. the traditionally short English vowels: KIT, DRESS, TRAP, STRUT, LOT and FOOT); Figure 5 deals with Part-System B (FLEECE, FACE, PRICE and CHOICE), Figure 6 with Part-System C (GOOSE, GOAT and MOUTH) and Figure 7 with the traditional Part-System D glides-to-[ə] (i.e. NEAR, SQUARE, CURE, BATH, THOUGHT and NURSE15). In all these figures CF’s data is in caps, bolded and italicized (e.g. his TRAP is ‘TR(CF)’), MM’s data is in small-letters, bolded and italicized (e.g. ‘tr(mm)’), while LW’s data is in caps but not bolded or italicized (e.g. ‘TR(LW)’).

3.1 Part-System A

14 This term refers to the merger of CURE with THOUGHT-NORTH-FORCE at [oː] or [ɔː], thus, for example, [pɔː] and not the traditional [pʊə] for poor.

15 In the case of BATH and THOUGHT these terms are meant to represent the merger of the more narrowly defined lexical sets BATH, START and PALM on the one hand and THOUGHT, NORTH and FORCE on the other. There is no evidence in the data for any distinction between these different lexical sets.



Beginning with Part-System A and with FOOT (FO) in Figure 4, the first difference we note is that CF’s FOOT is far more back than either LW’s or MM’s. We also note, from Figure 1, that in CF’s case FOOT is nearly identical in quality to his KIT vowel before final, tautosyllabic /l/ (KITL), a feature CF shares with modern-day SAfE. While FOOT in both MM and LW is more front than in CF’s case, there is no evidence for an overlap with pre-/l/ KIT (Figures 2 and 3), mainly due to a lack of KIT-Split in the idiolects of MM and LW.

[Vowel plot: axes F1 and F2, Lobanov-normalised. Plotted labels: DR(CF), dr(mm), DR(LW), FO(CF), fo(mm), FO(LW), KI(CF), ki(mm), KI(LW), LO(CF), lo(mm), LO(LW), TR(CF), tr(mm), TR(LW), ST(CF), st(mm), ST(LW).]

Figure 4. Part-System A cross-subject comparison


[Vowel plot: axes F1 and F2, Lobanov-normalised. Plotted labels: CH(LW), ch(mm), FA(LW), fa(mm), FA(CF), FL(LW), fl(mm), FL(CF), PR(LW), pr(mm), PR(CF).]

Figure 5. Part-System B cross-subject comparison


[Vowel plot: axes F1 and F2, Lobanov-normalised. Plotted labels: GOA(LW), goa(mm), GOA(CF), GOO(LW), goo(mm), GOO(CF), MO(LW), mo(mm), MO(CF).]

Figure 6. Part-System C cross-subject comparison


[Vowel plot: axes F1 and F2, Lobanov-normalised. Plotted labels: NU(LW), nu(mm), early(CF), BA(LW), ba(mm), BA(CF), CU(LW), sure(mm), NE(LW), ne(mm), SQ(LW), sq(mm), there(CF), TH(LW), th(mm), TH(CF).]

Figure 7. Part-System D cross-subject comparison

With respect to LOT (LO), MM’s vowel is separate from those of the other two speakers and appears to be discernibly lower. The subjects’ STRUT (ST) vowels are virtually identical. The same applies to TRAP, which is quite raised across the various subjects, e.g. if we compare TRAP to the position of STRUT or to the nuclei of PRICE for LW and MM as illustrated in Figure 5. A useful further basis for comparison is provided by the relatively modern RP mean acoustic values given in Cruttenden (2001: 99) for TRAP in citation-form and connected speech for female speakers i.e. 1011 Hz and 1018 Hz respectively. In comparison, the two female speakers in this study (LW and MM) have mean values of 602 Hz and 710 Hz respectively. As confirmed by Britain (p.c.),16 TRAP was often rather raised in both the traditional RP of the time as well as in many rural southern English counties, a fact which no doubt accounts for the similarity in this regard across subjects.

16 Cf. Torgersen and Kerswill (2004) for the south-east of England as well as Cruttenden (2001: 111) and Wells (1982: 291) for RP.
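The size of the TRAP difference reported above can be made explicit with a small worked calculation. The sketch below simply subtracts each speaker's mean TRAP value from the two Cruttenden (2001: 99) reference values quoted in the text, on the assumption (implied by the comparison) that all four figures are F1 means, where a lower F1 corresponds to a closer, more raised vowel. The variable and function names are hypothetical, and because raw Hz values from different speakers are not normalised, the gaps should be read as a rough indication only.

```python
# Mean F1 for TRAP in Hz, as quoted in the text (assumed to be F1 means).
RP_REFERENCE_F1 = {"citation form": 1011, "connected speech": 1018}  # Cruttenden (2001: 99)
SPEAKER_F1 = {"LW": 602, "MM": 710}


def f1_gaps(reference, speakers):
    """Return {(speaker, style): reference F1 minus speaker F1}, in Hz."""
    return {
        (speaker, style): ref_f1 - speaker_f1
        for speaker, speaker_f1 in speakers.items()
        for style, ref_f1 in reference.items()
    }


gaps = f1_gaps(RP_REFERENCE_F1, SPEAKER_F1)
for (speaker, style), gap in gaps.items():
    print(f"{speaker} vs RP {style}: F1 lower by {gap} Hz")
```

On these figures both speakers sit at least 300 Hz below the modern RP reference values, with LW showing the larger gap.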



The similarity across subjects ends with DRESS, with CF having a clearly lowered quality in comparison with the other two subjects. As was the case with TRAP, DRESS had a particularly raised quality in the RP of the time as well as in dialects from the south-east of England (Torgersen and Kerswill 2004; Cruttenden 2001; Wells 1982). CF’s KIT vowel also clearly has a more central quality, pointing to the already mentioned polarization in phonetic space i.e. the KIT-Split. The KIT-Split has generally been attributed either to the influence of Afrikaans on South African English (Lanham and Macdonald 1979) or to a ‘nineteenth century vowel shift ... in which raising of /æ/ towards [ɛ] and raising of original /ɛ/ to [e] seem to have forced (most of) original /ɪ/ to centralise’ (Lass 1995: 96).

3.2 Part-System B

Turning to Figure 5, while all the subjects have relatively monophthongized FLEECE (FL) vowels it is interesting to note that CF’s FLEECE vowel is slightly closer, a particularly high-front (and monophthongal) FLEECE vowel being characteristic of modern Broad SAfE. A monophthongal FLEECE is characteristic of SAfE more generally (contrasting this Southern Hemisphere variety with the Australasian English dialects). From Figure 5 the narrow, close nature of LW’s FACE (FA) vowel is apparent, a feature echoed in modern prestigious varieties of General SAfE (Lass 1995: 99). Neither CF’s nor MM’s FACE vowels seem particularly lowered, although it would appear that CF’s FACE is somewhat glide-weakened in comparison. Turning to PRICE (PR), a prominent social variable in modern-day SAfE, with backed glide-weakened varieties carrying the least and fronted (optionally glide-weakened) varieties carrying the most prestige (Lass 1995: 99), it is interesting to note that both LW’s and MM’s PRICE vowels are relatively fronted, with LW’s PRICE showing the most fronting and most extensive glide-weakening of the two.
While CF’s PRICE is glide-weakened it is certainly not backed and, in fact, shows a somewhat centralized quality, reminiscent of Labov’s (1963) values for PRICE in Martha’s Vineyard. This reflects a variant that has not survived into modern-day SAfE and is perhaps attributable to a number of (non-south-eastern) British dialects which had not at that time undergone the full force of the Great Vowel Shift, thus retaining an [əɪ]-like quality for the PRICE vowel.17 There was no data for CHOICE (CH) for CF. In the case of the other two subjects, there is a definite glide to a higher and/or fronter position although it is difficult to account for the difference in glide direction.

17 Even for present-day English, Upton (2008: 274) claims that the ‘diphthong begins centrally, with [ɐ/ɜ/ə], for some speakers on Shetland, in Urban Scots, in Popular Dublin speech, and amongst older East Anglians.’

3.3 Part-System C

Turning to Figure 6, we note that GOOSE across the three subjects is virtually identical and, while all show some evidence for slight diphthongisation in the acoustic record, this is certainly not apparent on an impressionistic level.



Furthermore, while MM’s and CF’s GOAT begin at the same point in phonetic space, MM’s GOAT appears prone to glide-weakening. LW’s GOAT vowel is substantially different from the other two in the sense of being very much glide-weakened and having an extremely close (RP-like) nucleus: this is no doubt the Conservative RP variant mentioned by Wells (1982: 294), who also claims that it is now ‘widely considered ‘affected’, and has ceased to be fashionable among younger speakers’. Modern SAfE, however, still places positive indexical value on GOAT values with a relatively close (and generally fronted) nucleus (whether rounded or unrounded) (Lass 1995: 100). We note the relatively fronted MOUTH-variant in the case of MM (echoing many broader variants of modern-day SAfE), the fully-backed conservative MOUTH in the case of LW and, as with PRICE, a relatively centralized variant in the case of CF.

3.4 Part-System D

The relatively fronted BATH vowel in the case of CF is clearly apparent from Figure 7, although, again, this needs to be qualified by the fact that it is based on an acoustic analysis of two tokens of the same word (i.e. transport), as well as by the fact that the other (unanalyzed) BATH tokens (after, father and one token of transport) were fully-backed in CF’s recording. THOUGHT is difficult to analyze although it appears that CF has a slightly retracted variant in comparison to the other two subjects. MM seems to show some evidence for the Second-Force Merger with respect to CURE (i.e. [ɔː] and not [ʊə]), although this is based on only one token of sure. In the case of LW there is clear evidence for the old-fashioned RP [ʊə]-quality. NURSE has a fronted (and, impressionistically, occasionally unrounded) quality in the case of both MM and LW, with LW’s NURSE showing some evidence of an in-glide. Echoing Britain’s (p.c.) impressionistic analysis of CF’s speech (see section 2.1 above), the one token of early is substantially lowered.
As mentioned above, there were no NEAR tokens in CF’s speech; MM’s and LW’s NEAR are, however, substantially different. Both are in-gliding, but LW’s is substantially lowered. MM’s NEAR echoes modern SAfE values, basically [ɪə], while LW’s has an [eɐ] quality18. Both CF’s and MM’s SQUARE are based on limited data, but the difference between the three subjects does echo the modern distinction between Broad SAfE [eː], General SAfE [ɛː] and Cultivated SAfE [ɛə], with LW’s SQUARE having a clear RP-like in-glide (Wells 1982: 293).

18 In all likelihood related to the substitution of NURSE or even BATH-like qualities for NEAR in particularly refined varieties of RP; cf. Lanham (1967: 42) and Cruttenden (2001: 143).

5 Conclusion

While the features of each subject’s speech, as well as their differences, are interesting in their own right, and provide a tantalizing ‘window’ on the varieties of English prevalent during the second half of the nineteenth century, are there any broader generalizations or conclusions that can be drawn from the above analysis? I believe there are, although these are necessarily tentative and speculative, based as they are on the speech of only three subjects (one of whom ‘provides’ only limited data), but they do provide a number of possibly fruitful avenues for further research.

Beginning with CF, it seems reasonably clear that his speech is coloured by the original British input and contains features that were eventually ‘ironed’ out once SAfE became a fully-focused variety. Assuming that a koine developed in the Eastern Cape during the mid-nineteenth century and given Trudgill’s (2004: 23) 50-year yardstick for the focusing of such a new dialect, it would have come into its own around 1870, almost twenty years after CF was born, meaning that he in all probability spent his childhood in a linguistic environment characterized by a substantial degree of inter- and intra-individual variability. One provisional, though tantalizing, conclusion, relating specifically to the variability of CF’s BATH vowel, is that START-Backing (a characteristic of modern SAfE, but not Australasian English) only really completed itself during the late nineteenth century, a fact reflected by the exceptionless use of a backed BATH in the case of MM and LW and also providing some support for the hypothesis contained in Bekker (2012) i.e. that modern SAfE is, in a substantial sense, a product of late nineteenth-century developments, rather than being directly traceable to the original 1820 input.

Another tantalizing ‘lead’ relates to the fact that CF’s speech shows a fully-polarized KIT vowel, which means that this feature of modern-day SAfE is perhaps traceable to an original input into SAfE and is not an endogenous development, as for example argued for in Lass and Wright (1985; cf. Lass 1995), who claim that the centralization of KIT in SAfE is most likely the result of pressure from the other two short front vowels i.e.
TRAP and DRESS, the raising of which, in terms of Lass and Wright’s (1985) scheme, led to a front short-vowel chain shift in SAfE. We note, in this regard, CF’s substantially lowered DRESS vowel. It is possible, therefore, that the SAfE KIT-Split was a direct inheritance and, in fact, the initiator of the relevant chain-shift19.

As intimated above, almost all descriptions of SAfE point to a three-way division of this colonial variety, mainly in terms of its relationship with British-based standard varieties (i.e. the various forms of early RP). Thus the early Lanham and Traill (1962) note a still dominant exonormative orientation in the speech of many English-speaking South Africans and propose a division between SARP ‘A’, SARP ‘B’ and non-SARP SAfE, echoing later divisions of SAfE into Cultivated, General and Broad sociolects, with the first sociolect still generally modeled on British-based standard varieties. The speech of the three subjects seems to loosely reflect this trichotomy and, more importantly, many of the prestige values of modern-day SAfE (e.g. narrow and close FACE; fronted PRICE; close, front and narrow GOAT) appear to be reflected in the speech of LW, which raises the question of how much SAfE has to this day been influenced by prestige varieties of the past. We note too that raised TRAP and DRESS vowels, often associated with Broader idiolects in modern-day SAfE, are in fact present in both MM’s and LW’s speech, both of which were clearly influenced by the British standard of the time, with LW’s speech in fact having the most raised TRAP and DRESS vowels of all three subjects. It is clear, therefore, that at the time a raised DRESS and TRAP did not have the negative indexicality current in modern-day SAfE. It is, thus, perhaps worthwhile considering the possibility that raised SAfE TRAP and DRESS were not endogenous developments in SAfE, but are, in fact, (at least partly) residues of previously prestigious variants that have now, following modern varieties in the south-east of England (cf. Torgersen and Kerswill 2004), been reanalyzed on an indexical level and, in fact, show reversal in certain prestigious sub-varieties (Bekker and Eley 2007; Bekker 2009: 204); a form of colonial lag in Trudgill’s (2004) sense.

The above possibilities, however, remain speculative and naturally further research will have to be conducted in order to determine if they have any true merit. Thus, while this article has hopefully provided an interesting ‘window’ on the past of SAfE, much more work needs to be done in order to unravel the processes and forces operative in the establishment and development of this Southern Hemisphere variety of English.

19 It must be admitted that this thesis sits uncomfortably with studies such as that of Watson, Maclagan and Harrington (2000) on New Zealand English (NZE), which indicate that KIT-centralisation is an endogenous development in this variety. It would seem odd that what is an inheritance in SAfE would constitute an endogenous development in NZE. However, it is still difficult to reconcile CF’s lowered DRESS vowel and his KIT-Split with Lass and Wright’s (1985) and Lass’ (1995) endogenous account.

References

Bekker, Ian 2009. The Vowels of South African English. Unpublished PhD thesis. Potchefstroom: North-West University.

Bekker, Ian 2012. South African English as a late nineteenth century extraterritorial variety. English World-Wide 33(2): 127-146.

Bekker, Ian and Georgina Eley 2007. An acoustic analysis of White South African English (WSAfE) monophthongs. Southern African Linguistics and Applied Language Studies 25(1): 107-114.

Boersma, Paul and David Weenink 2013. Praat: doing phonetics by computer [computer program]. Version 5.3.53, retrieved 9 July 2013 from http://www.praat.org/

Chambers, J. K. 2003. Sociolinguistic Theory: Linguistic Variation and its Social Significance. Second edition. Oxford and Malden: Blackwell.

Cruttenden, Alan 2001. Gimson’s Pronunciation of English. Revised sixth edition. London: Arnold.

Hattersley, Alan F. 1950. The British Settlement of Natal: a Study in Imperial Migration. Cambridge: Cambridge University Press.

Labov, William 1963. The social motivation of a sound change. Word 19: 273-309.

Labov, William 1994. Principles of Linguistic Change, Vol. 1: Internal Factors. Oxford: Blackwell.

Lanham, Len W. 1967. The Pronunciation of South African English: a Phonetic-phonemic Introduction. Cape Town: A. A. Balkema.

Lanham, Len W. and Carol Macdonald 1979. The Standard in South African English and its Social History. Heidelberg: Julius Groos Verlag.

Lanham, Len W. and Anthony Traill 1962. South African English Pronunciation. Johannesburg: Witwatersrand University Press.

Lass, Roger 1995. South African English. In: Rajend Mesthrie (ed.) Language and Social History: Studies in South African Sociolinguistics. Cape Town: David Philip, pp. 89-106.

Lass, Roger and Susan Wright 1985. The South African chain shift: order out of chaos? In: Roger Eaton, Olga Fischer, Willem Koopman and Frederike Van der Leek (eds) Papers from the Fourth International Conference on English Historical Linguistics. Amsterdam: John Benjamins, pp. 137-161.

Lobanov, Boris M. 1971. Classification of Russian vowels spoken by different speakers. Journal of the Acoustical Society of America 49: 606-608.

Torgersen, Eivind and Paul Kerswill 2004. Internal and external motivations in phonetic change: dialect leveling outcomes for an English vowel shift. Journal of Sociolinguistics 8(1): 23-53.

Trudgill, Peter 2004. New-Dialect Formation: The Inevitability of Colonial Englishes. Edinburgh: Edinburgh University Press.

Trudgill, Peter 2008. The dialect of East Anglia: phonology. In: Bernd Kortmann and Clive Upton (eds) Varieties of English 1: The British Isles. Berlin and New York: Mouton de Gruyter, pp. 178-193.

Upton, Clive 2008. Synopsis: phonological variation in the British Isles. In: Bernd Kortmann and Clive Upton (eds) Varieties of English 1: The British Isles. Berlin and New York: Mouton de Gruyter, pp. 269-282.

Watson, Catherine, Margaret Maclagan and Jonathan Harrington 2000. Acoustic evidence for vowel change in New Zealand English. Language Variation and Change 12: 51-68.

Wells, J. C. 1982. Accents of English. 3 vols. Cambridge: Cambridge University Press.


20 Early twentieth century Tristan da Cunha h’English

Daniel Schreier

Tristan da Cunha English (TdCE) is a dialect of great interest since it is probably the smallest variety of English around the world that has undergone full nativization. TdCE is a most suitable variety for research in historical sociolinguistics, since a copious amount of dialect data is available (some speakers born as early as the mid-1870s), much of it not analyzed to date. The present chapter begins by introducing the various corpora and generally outlining their advantages and disadvantages for research on language variation and change. After discussing implications for history and evolution and presenting some new findings on the segmental phonology of earlier TdCE, it reports first results from an ongoing study on /h/ insertion in function and content words with an initial vowel (such as under, I, onion or island). TdCE is arguably the dialect of English around the world that has best preserved this archaic feature of British English, which today is found in very few locales only,1 and if it is, then only as a sporadic or remnant feature. I analyze the speech of four Tristanians (born between 1895 and 1910) to study whether or not the frequency with which they use this salient local variable is sensitive to external context. The four speakers were selected since they were recorded at roughly the same time by different interviewers (in the UK and on their native island) and in contexts with varying degrees of formality. The results suggest that even elderly and hyper-isolated speakers are sensitive to context-related constraints on variation.

1 Social, sociohistorical and sociolinguistic facts

With a population of 258 (April 2014), Tristan da Cunha is one of the smallest communities in which English is spoken natively.
Its unusual social history has been researched thoroughly and is discussed elsewhere in detail (Schreier 2002, 2003, Schreier and Trudgill 2006, Schreier and Lavarello-Schreier 2011), so the following account is kept brief. Originally discovered by the Portuguese in 1506, Tristan da Cunha was only settled in 1816, when the British admiralty formally annexed the island and dispatched a military garrison. When the garrison withdrew after only one year, some army personnel stayed behind and settled permanently: two stonemasons from Plymouth (Samuel Burnell and John Nankivel), a non-commissioned officer from Kelso, Scotland, named William Glass, as well as his wife, “the daughter of a Boer Dutchman” (Evans 1994: 245), and their two children. The population increased when shipwrecked sailors and castaways arrived soon after; in 1824, apart from the Glass family, the settlers included Richard ‘Old Dick’ Riley (from Wapping, East London), Thomas Swain (born in Hastings, Sussex) and Alexander Cotton (from Hull, Yorkshire) (Earle 1966 [1832]). The late 1820s and 1830s saw the arrival of a group of women from St Helena and three settlers from Denmark and the Netherlands, later joined by American whalers. The population then grew rapidly. By 1832, there was a total of 34 people on the island, 22 of whom were young children or adolescents.

1 Vernacular Newfoundland English is one example; see Clarke (2010) and Clarke, De Decker and Van Herk (this volume).

The second half of the nineteenth century, on the other hand, was a period of growing isolation. There were hardly any new settlers, with the exception of a weaver from Yorkshire (who left after only a few years) and two Italian sailors (Crawford 1945). This state of isolation lasted well into the twentieth century. When visiting the island in 1937, the Norwegian sociologist Peter Munch found that the Tristanians basically lived in pre-industrial conditions (Munch 1945), and Allen Crawford, the cartographer of the expedition, noted that only six out of a total of 170 Tristanians had ever left the island. This changed in April 1942, when the arrival of a British navy corps brought economic changes; a South African company employed the local workforce in the fishing industry and the traditional subsistence economy was replaced by a paid labor force economy. These social changes had sociolinguistic consequences, as the usage of local dialect features decreased somewhat (see Schreier and Lavarello-Schreier 2011).

Though sociodemographically and politically insignificant, TdCE has perhaps over-enthusiastically been referred to as the “sociolinguists’ Galapagos” (Chambers 2004: 134), and two reasons can be advanced why it is particularly suitable for studies on contact-induced language change and variationist sociolinguistics. First, it is a prime research site because the community’s founders settled under tabula rasa conditions. There was no contact with indigenous varieties since the island was practically uninhabited when the garrison arrived (the population was precisely 2: one of the people present, a boy from Menorca, left on the first occasion, the other one, an Italian sailor, drank himself to death in the soldiers’ canteen). As we can date its origins to the 1820s, TdCE is one of the youngest nativized Englishes around the world (approximately one generation or so older than New Zealand English). Second, the English input varieties to TdCE are well-known, as is the development of the local population (there is an entire genealogical tree for the island community, reprinted in Crawford 1982). We know that the feature pool included dialects from the British Isles (the founders came from the Scottish Lowlands, East Yorkshire, East London and Hastings), the United States (the most influential American resident being a whaling captain from Massachusetts), various second-language varieties (spoken by settlers with the mother tongues Danish, Dutch, Italian and (perhaps) early nineteenth century Afrikaans) and St Helenian English (StHE), a variety that has undergone some amount of restructuring and perhaps creolization (Schreier 2008). As a result, multiple contact processes were operative during the genesis and formation periods of TdCE. 
Schreier (2002, 2003) and Schreier and Trudgill (2006) argued that TdCE primarily derives from varieties of British/late eighteenth century American English and StHE and that it did not emerge in a context of prima facie language contact, excluding pidginization effects on Tristan da Cunha. Some features most likely had a vernacular British English origin, such as slight STRUT fronting, /t/ glottalling, /v ~ w/ merger, THOUGHT in cloth, nucleus fronting in MOUTH, START backing, TH fronting, FLEECE in fish, HAPPY tensing, etc. On the other hand, L2 forms had some impact on TdCE as well, and a few non-native features were also adopted (TH sibilization, i.e. dental fricatives realized as /s/, as in think, throw, etc.; cf. Schreier 2003: 211). The existence of Creole-type features in TdCE (such as high rates of consonant cluster reduction and absence of -ed past tense marking, Schreier 2005: 152; /v/ realized as [b]; lack of word order inversion in questions; copula absence with locatives and adjectivals; etc.) served as a strong indication that a creolized form of English was transplanted via (at least some of) the women who cross-migrated in the 1820s.

The question thus is how mixing between the inputs proceeded and why TdCE selected the features it did, and it is the earliest set of recordings of Tristanians (born in the late nineteenth century) that is of utmost significance here.

2 Researching Tristan da Cunha English: Taking stock

There are several corpora currently available for a historical analysis of TdCE. These include the 1961-2 BBC/UCL corpus, the Svensson/Munch corpus, the 1999/2010 Schreier corpus and, as the latest substantial addition, a set of recordings made by Scottish oral historians in 2006. All these collections have their advantages and shortcomings, mostly due to the context(s) in which they were compiled, and these are briefly described in the following.

The first ever linguistic description of TdCE comes from Zettersten (1969). Zettersten analyzed recordings made in 1961-2 by the British Broadcasting Corporation (BBC) in cooperation with University College London (UCL). At the time, the entire Tristan population was residing in Calshot, a disused army camp near Southampton, where they were housed after volcanic activity endangered the settlement and forced a full-scale evacuation. Following a major eruption on October 9 1961, the Tristanians had to leave their island within 24 hours and were first shipped to Cape Town, where many of them saw modern civilization for the first time. However, since South Africa was no longer a member of the Commonwealth (it had been expelled after the National Party instated and implemented the Apartheid regime in 1948), the Tristanians were brought to England, arriving in November 1961. Though they settled in remarkably well (many of them had jobs in local industry), the majority voted for a return to the South Atlantic the following year, when the island was declared safe.

The data compiled by UCL, in collaboration with the BBC, amount to a total of about four hours, or about 25 recordings made with approximately 20 Tristanians. Most interviewers were well-known radio journalists (e.g. René Cutforth) or linguists working for UCL (among them Jan Svartvik), all of whom were highly educated and had RP accents (the exception being the technicians, who at times contributed to the conversation but were not in charge of leading the interviews). Since they were not familiar with the Tristanians, they tended to ask rather general (at times banal or rather insensitive) questions about island life and history, occasionally revealing their ignorance about the island and its past. The context of these recordings was very formal indeed: interviewees were asked to join the interviewers in a separate room that was specially equipped for the purpose, where they had to speak into a microphone placed in front of their mouth. Other factors contributed to the stiff environment of the recordings: the interviewers wore a suit and tie, contrasting with the islanders’ traditional knitwear; there were often several people present at interviews (one or two interviewers plus a technician); and the interviews were conducted in an upper-class accent, which contrasted notably with TdCE and reinforced the social distance between interviewer and interviewee. All this bore



little resemblance to principles of sociolinguistic fieldwork developed to elicit vernacular speech in a relaxed and informal setting (Labov 1982, Schilling 2013, Schreier 2014). As a result, the recordings made by the BBC tended to be short, not yielding a lot of data: most were between six and fifteen minutes long.

The Svensson/Munch Corpus was compiled by the Swedish painter Roland Svensson (1910-2003) and the Norwegian sociologist Peter A. Munch (1908-1984). Svensson was a passionate painter of islands who travelled extensively in the Atlantic and also in the Pacific. He developed a special liking for Tristan da Cunha and visited the island several times. He had a strong ethnographic interest and collected all sorts of material on domestic life, shared the every-day experiences of the islanders and also carried out recordings (some in Calshot in 1962, many more when visiting in the 1960s and 1970s). Though he gave much room to his own reflections (at times there are lengthy monologues and musings in his recordings, to which the islanders listened politely), he was well-liked and popular with the population. Since the recordings were carried out in the Tristanians’ homes, often over a cup of tea or something stronger, the context was very relaxed and informal. Moreover, Svensson occasionally recorded couples or even asked some of the ‘old hands’ for a group interview (a real treasure chest for everybody interested in the social history of the island and the historical sociolinguistics of TdCE).

Similarly, Peter Munch was highly familiar with life on Tristan da Cunha. He was a member of the Norwegian expedition that visited the island in 1937-8, when he collected information to produce the most detailed account of social organization in the community. Munch spent World War II in a concentration camp and emigrated to North America in 1946, where he had a remarkable academic career, holding a chair in Sociology at Southern Illinois University in Carbondale. He published several books on the sociology of Tristan da Cunha and remained engaged with the community until he passed away in 1984. His recordings and notes are priceless, since he studied the island for almost half a century and began studying the community when it practically lived in pre-industrialized conditions (Munch 1945). Though the observer’s paradox (Labov 1982) can never be fully discounted, the audio data at hand come rather close to unmonitored casual speech. The recordings collected by Svensson and Munch comprise a total of 23 hours with about 50 individuals, recorded alone or in groups. Both Svensson and Munch were ‘informed outsiders’, so to speak: they both had close connections with the Tristanians and were well liked and respected on the island. In their interviews, they asked about local life, family histories, all sorts of incidents in the late nineteenth and early twentieth century (including ghost stories, mysterious sightings of missing ships, information on some of the earliest settlers on Tristan, etc.), and collected all sorts of reminiscences of islanders, immersed in the habitat of the people they spoke to. The fact that these were fruitful topics for discussion is attested both in the length of responses given and stories elicited and also in the total size of the corpus.

The Schreier Corpus consists of recordings made in 1999, 2002 and 2010. The total length of the recordings is about 50 hours with 80 Tristanians. For his PhD project, Schreier initially spent six months on Tristan in 1999. Adopting a ‘friend of a friend’ approach, he expanded his acquaintances in snowball fashion. He spent the first three months getting to know the island and carried out very few recordings, with elderly members only (see the methodological description in



Schreier 2003). Only later were younger Tristanians recorded. Great care was taken to bring up local topics and stories (ghost stories, past experiences, e.g. during the volcano years in England) and all the individuals were recorded in places where they felt comfortable. The aim was to analyze a representative sample of the population, so Tristanians of all age groups were recorded, either in group interviews (up to four speakers) or individually. More recordings were carried out in follow-up visits in 2002 and in 2010. On his last fieldwork trip, Schreier was joined by a local fieldworker, the Head of the Tourist Division, who developed a strong interest in the project and helped during the fieldwork activities with the aim of collecting a body of oral history, to be housed in the local museum. The 1999 sub-corpus was analyzed in Schreier 2003, but the recordings made in 2010 have not been studied so far.

The Scottish Corpus, finally, consists of recordings made with some 25 individuals, locals and expatriates alike. The interviewers were Scottish historians who visited the island on the SA Agulhas in September 2006 and spent three weeks on the island. They had no prior first-hand information and singled out individuals with the help of a Scottish dentist, who had a long-term relationship with Tristan as he was sent to the island on an annual basis starting in the late 1990s. They mostly worked with a semi-structured questionnaire and asked general questions about island life. Within merely three weeks, they managed to conduct 25 interviews (mostly single interviews) with a total of 25 residents, 13 of whom were locals. Interviews were carried out in local homes and mostly lasted between 45 and 60 minutes.

When comparing the various corpora, we find that there are rather substantial differences relating to context, length of interview and interview styles, informant selection and also quality of the recordings. With an interviewer-interviewee ratio of at times 3:1, the 1961 BBC corpus is formal and interviews rarely last more than 25 minutes. As for the principle of accountability (Labov 1982), sufficient material for a quantitative analysis of sociolinguistic variables is available for few speakers only. On the upside, recordings were made with a representative sample of the population, with children and also elderly members (the oldest speakers were born in 1887, 1894 and 1895) and the overall quality is good. The Svensson/Munch corpus, by contrast, is rather informal. Svensson was well acquainted with the Tristanians and brought up topics of interest to them. Interviews vary in length and were at times conducted in groups, as a result of which they are highly informal. The disadvantage is that there are only a few recordings of younger speakers and that Svensson and Munch concentrated on the ‘old hands’ (in fact, the oldest Tristanian whose speech is preserved, Granny Mary, born in 1874, was recorded in 1963, others were born in 1889, 1900 and 1902); moreover, the quality is not always good (there is a constant background hum from the tape recorder, and some tapes are worn out because they were not stored properly). The Schreier corpus (Schreier 2003) is rather informal as well (interviews on the island, in the informants’ homes, ethnographic approach and 6-month residence on the island, local topics, etc.). Some speakers were recorded twice (1999 and 2010), allowing for an (albeit modest) real-time study. The Scottish corpus, again, is somewhat more formal in that the interviewers worked in pairs and only had three weeks on the island.
However, they were briefed by a Scottish dentist (see above) who was familiar with the community and made arrangements for them beforehand.



Consequently, the four corpora differ on a number of counts: place of recording (Calshot/Southampton or South Atlantic Ocean/Tristan da Cunha), degree of formality, degree of familiarity with the population (ranging from rather high in the case of Svensson/Munch and Schreier to low in the case of the BBC reporters), length of recordings (ranging from mostly 45-60 minutes in the case of the Scottish oral historians to about 5 minutes in the BBC corpus), and overall quality of the recordings (high in the BBC corpus, as opposed to varying and sometimes very low in the Svensson recordings, where quality decreases towards the end of an interview since some tapes are worn out). Considering these differences, it is clear that the corpora should not be compared without qualifications and that the advantages and limitations of each corpus must be taken into consideration at all times (this is the reason why I myself have not worked with the extremely formal BBC corpus as of yet, because I felt it contrasted too much with the informal character of my own 1999 recordings). Analyses of change in real time, for instance, based on a comparison of speakers recorded by the BBC (1962) and Schreier (1999) would confound real change with degree of formality and compare data from different styles, which obviously influences the results.

Bearing in mind these methodological concerns, the potential of the four corpora for research on variation and change or new-dialect formation is certainly great. Together, they comprise some 140 hours of tape-recorded speech with more than 150 Tristanians born between 1874 and 1990. Moreover, two of the corpora (Svensson/Munch, Scottish) have not been subject to linguistic analysis, and neither have the 2010 Schreier recordings, so the potential of these recordings for future research on historical phonology, contact linguistics and variation and change is immense.

3 Analyzing earlier Tristan da Cunha English: General findings and a case study

The caveats outlined above make it clear that we must take care when comparing data from different corpora. Nevertheless, the material available lends itself to studies on language variation and change. For one, a replication study can be carried out since individual speakers were interviewed several times (of course, one would have to take great care to select recordings taken in similar contexts of formality, so as not to confuse real change with variation along the informal-formal axis; see above). Individuals recorded by Svensson in the 1970s and by Schreier (1999, 2010) would lend themselves well to such purposes. This would enable researchers to study the speech of single individuals over a period of time (see Rickford and McKnair-Knox 1994 for a longitudinal case study). Alternatively, in a sample study, one can select pre-specified samples of individual speakers in an apparent-time approach, a practice adopted by Schreier (2003). This is ideal for an analysis of mobility-related language change (difficult to travel to and away from, Tristan da Cunha is sociodemographically stable with a small population). The Svensson corpus allows us to extend the degree of time depth well into the nineteenth century and to identify features in the earliest-born Tristanians recorded. The oldest islander whose speech is preserved is Granny Jane, born in 1874,



recorded by the South African Broadcasting Corporation in 1954.² Unfortunately, the recording with her lasted merely 9.06 minutes and is thus too short for a quantitative analysis. Notwithstanding, it is noteworthy that Granny Jane uses several local features that have been recorded in Zettersten (1969), Schreier and Trudgill (2006), etc. This is in fact strong evidence that these forms were already present in generation three of the Tristan population (and are not innovations of the twentieth century), which is most useful for the historical reconstruction of TdCE with reference to current models of English as a world language at large or koinéization in particular. A descriptive profile of the data includes the features listed in Table 1.

Table 1. Features of Tristan da Cunha English relevant to historical reconstruction

1. /h/ insertion (seventy-/h/eight)
2. diagnostic consonant cluster reduction (CCR) in pre-vocalic environments (that would las’ us over the year)
3. central PRICE onsets (right)
4. advanced KIT centralization (sisters, six)
5. zero existentials (Ø was about 76)
6. past be leveling (they was married to my two sisters)

Therefore, a recording such as this one, as short as it may be, pushes back the time frame by 30-40 years (or a generation or two) and allows precious glimpses into mid-nineteenth century TdCE. Some of these variables were subject to a quantitative analysis in twentieth century TdCE (past be leveling in Schreier 2003, CCR in Schreier 2005).

As for the early twentieth century, Zettersten (1969) provided a detailed qualitative description and made some general assessments on differences between older and younger speakers, and Schreier and Trudgill (2006) reported their impressionistic findings from some of the oldest speakers available, comparing their findings with observations on the speech of younger members of the community. They further discussed them in order to retrace koinéization and language contact processes in the formation period of TdCE. Schreier (2003) used the apparent-time construct to study change in the twentieth century, reporting that some local variables were on the decline (e.g. third person singular present tense zero or present be-leveling), presumably as a result of increased contact with the outside world during the years 1961-3 in England following the volcano eruption on the island.

Research has also been carried out in segmental phonology, and Figure 1 provides a vowel plot for a male TdCE speaker, born in 1902 (one of two speakers analyzed so far).

Figure 1. Vowel plot for a male TdCE speaker, born 1902

2 I have no idea how this recording, which I found in the Svensson corpus in St Louis University, was made and for what purpose it was collected. I know of no research project in the 1950s and so far have had no response from the SABC to my inquiries.



Figure 1 indicates some prominent features, such as the NEAR-SQUARE merger, KIT backing and centralization, GOOSE fronting, and the merger of NORTH and FORCE; further research needs to be carried out with regard to Tristanians born in the second half of the twentieth century.

Quantitative research is possible as well, and in what follows, the focus is on shifting by interview context (which the corpora lend themselves to ideally, see above). The question is whether there are differences between single speakers in the corpora, i.e., whether or not there is evidence of context sensitivity that gives rise to differential usage rates of sociolinguistic variables that can be explained by the setting of the interview. The variable selected is /h/ insertion, which is analyzed quantitatively for some selected speakers here.

The history of /h/ variation in English has been studied intensely, by historical linguists, philologists and sociolinguists alike. As is well-known, /h/ underwent a long process of weakening in the history of English (Lass 2006) and has now become a minority feature in some regions, particularly in the English Southeast and in vernacular London English. On the other hand, there have also been reports of /h/ insertion, i.e. usage in words where it is not found etymologically, for instance in the fourteenth century Norfolk Guilds and the Paston Letters (Wyld 1925). The Linguistic Atlas of Late Medieval English (McIntosh et al. 1986) indicates that it is mainly found in texts from the East Midlands, East Anglia and the South. In the period from c. 1190-1320 the texts range from Lincolnshire or Norfolk to the southern counties but “the instability seems to be greatest in the East Midlands” (Milroy 1992: 140). This has been noted frequently in the eighteenth and nineteenth centuries, for instance by Batchelor (1809: 29): “the aspirate h… is often used improperly, and is as frequently omitted where it should be used. Give my orse some hoats has been given as an example of these opposite errors from the



Cockney dialect”, and by Walker (1791), who speaks of a “fault of the Londoners: not sounding h where it ought to be sounded, and inversely.”

As has been reported elsewhere (Zettersten 1969, Schreier 2003, Schreier and Trudgill 2006), TdCE is quite remarkable in that it is almost certainly the only variety in the world that still has frequent /h/ insertion. The feature is found elsewhere, e.g. in the Caribbean (“on the island of Abaco in the Bahamas ear is what you do with your hear (or vice versa)”, Holm 1988: 76, quoted in Childs, Reaser and Wolfram 2003: 15) and the Appalachians, but there it is restricted to very few speakers, often elderly members of the community, and a small set of lexical items (such as the letter <h>). On Tristan da Cunha, it is much more widespread, as Allan Crawford noted in the 1930s: “the tendency to add an “h” before vowels makes all islanders “highlanders”!” (reported in Crawford 1982: 49). A previous study (Schreier 2006) provided quantitative evidence that one speaker had a total of approximately 30% of /h/ insertion in words such as oil, I, egg, etc. The same source (a case study of individual variation) showed that /h/ was inserted with prepositions (out, on, after, up, over), pronouns (it, I), verbs (ask, offer, open), adjectives (other, every, old), adverbs (outside), nouns (army, area, officer), and proper names (Ernie). This makes /h/ insertion not only a frequent but also a widespread and productive process that did not seem to be restricted to certain word types, which runs counter to historical reports (where examples of /h/ insertion were mostly nouns). /h/ was thus inserted in function and content words, with an overall insertion rate of 64.7% for lexical words and 24.7% for function words.

This was used as sociohistorical evidence as to who was responsible for bringing this feature to Tristan da Cunha. Since /h/ insertion was reported extensively in London English (Walker 1791, Batchelor 1809) and Cockney in particular (Jespersen 1909), i.e. from precisely the area of origin of one of the most influential founders of the community (Richard Riley), Schreier (2010) speculated that it was a legacy of early nineteenth century London English.

Given what was said above about the different methods of data collection in the various corpora, the question addressed here is whether speakers vary in their usage of local variables when recorded in diverse settings. Of course, this in no way challenges Labov, who claimed that

One of the fundamental principles of sociolinguistic investigation might simply be stated as: There are no single-style speakers. By this we mean that every speaker will show some variation in phonological and syntactic rules according to the immediate context in which he is speaking. (Labov 2003: 234)

Indeed, it would be unexpected and counter-intuitive to find no variation, so the question is rather when, how often and to what extent the Tristanians vary between zero (Ø) and inserted /h/. Notwithstanding, one should bear in mind that the speakers, born between 1889 and 1910, were socialized in a hyper-isolated environment. These speakers grew up in a TdCE-speaking community with practically no exposure to other dialects (the exception being the missionaries sent out by the Society for the Propagation of the Gospel). Peter Munch reported that the island community was encapsulated and remarkably endocentric. The question then is to what extent selected speakers vary in their usage of a hyper-isolated local variant in the two interview modes, which allows us to address issues such as the



effects of isolation on “shifting potential”, the potential for shifting of individuals in dense and highly complex social network systems, etc. A total of four speakers met the criteria and were selected:

1. MS (female, 1895-1974), recorded by: BBC 1961, Peter Munch 1964 (TdC), Roland Svensson 1970 (TdC)

2. WR (male, 1889-1963), recorded by: BBC 1962 (Southampton), Roland Svensson 1963 (en route to TdC on a sailing boat)

3. BGG (male, 1899-1995), recorded by: BBC 1961 (Southampton), Roland Svensson 1963 (TdC), Peter Munch 1965 (TdC)

4. SG (male, 1910-1994), recorded by: BBC 1961 (Southampton), Roland Svensson 1970 (TdC)

In order to minimize the potential effect of type-token relations, only five tokens were collected for each lexical item. Moreover, as is standard in sociolinguistic interviews, the first minutes of interviews were not considered (three minutes for shorter interviews from the Svensson corpus, lasting between 14 and 20 minutes), with the exception of interviews shorter than 10 minutes. It should also be pointed out that one islander, BGG (male, 1899-1995), was born in the Cape and came to Tristan with his family in 1908, when he was aged 8. Another Tristanian, WR, was interviewed on board the sailing ship back to Tristan, shortly before arriving on the island (where he passed away three months later). If possible, a total of 100 tokens was extracted for each individual in each interview setting; based on the principle of accountability (“all occurrences of a given variant are noted, and where it has been possible to define the variable as a closed set of variants, all non-occurrences of the variant in the relative environments”; Labov 1982: 30), all instances of lexical items with initial vowels and alpha stress were selected and coded, no matter whether they had /h/ insertion or not. In some of the shorter recordings, the total could not be reached (here the minimum was 48).
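The sampling limits just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the study: the `Token` record, the function names `select_tokens` and `insertion_rate`, and the default parameter values are hypothetical, and the tokens are assumed to have been identified and coded by hand beforehand.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Token:
    """One hand-coded vowel-initial word (hypothetical record layout)."""
    time_s: float      # position in the recording, in seconds
    word: str          # orthographic form, e.g. "island"
    h_inserted: bool   # True if realized with initial /h/, False if zero


def select_tokens(tokens, skip_s=180.0, per_item_cap=5, max_tokens=100):
    """Apply the three sampling limits described in the text.

    skip_s       -- opening stretch to ignore (set to 0 for very short interviews)
    per_item_cap -- keep at most this many tokens of any one lexical item
    max_tokens   -- stop once this many tokens have been collected
    """
    seen = Counter()
    selected = []
    for tok in sorted(tokens, key=lambda t: t.time_s):
        if tok.time_s < skip_s:
            continue                      # still inside the skipped opening
        key = tok.word.lower()
        if seen[key] >= per_item_cap:
            continue                      # lexical item already at its cap
        seen[key] += 1
        selected.append(tok)
        if len(selected) == max_tokens:
            break
    return selected


def insertion_rate(selected):
    """Percentage of the selected tokens realized with /h/."""
    if not selected:
        return 0.0
    return 100.0 * sum(t.h_inserted for t in selected) / len(selected)
```

Because both inserted and zero realizations pass through the same filter, the resulting rate respects the principle of accountability: the denominator is every eligible token, not only those with /h/.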

A first look at /h/ variability indicates that the speakers vary, at times extensively, between zero (Ø) and /h/ insertion, as the following three examples show (all from the Svensson corpus), where /h/ is found before prepositions, nouns and verbs:

1. We give Ø each one so many. And then they (h)ordered potatoes, see, into the Ø island, and then they give Ø every cellar like a (h)onion size bag. (LR_1970)

2. Well Ø I got six drivers and sometime Ø I am (h)on my (h)own. (LG_1970)

3. It’s the best to place a postal (h)order when you go down to the post Ø office. (MS_1964)

With regard to variation in different recording contexts, Table 2 and Figure 2 provide the results for the four speakers who were interviewed twice (in Southampton (BBC) and the South Atlantic (TdC)).

Table 2. /h/ insertion in four TdCE speakers by place of interview (Southampton vs. TdC)


Speaker                Recorded                  /h/ insertion   /h/ zero   Total %
MS (female, b. 1895)   1961 (Southampton, BBC)   18              50         26.5 (18/68)
                       1964 (TdC)                42              58         42.0 (42/100)
WR (male, b. 1889)     1962 (Southampton, BBC)   25              75         25.0 (25/100)
                       1963 (TdC)                27              39         40.9 (27/66)
BGG (male, b. 1899)    1961 (Southampton, BBC)   7               41         14.6 (7/48)
                       1963 (TdC)                17              52         24.6 (17/69)
                       1964 (TdC)                26              67         28.0 (26/93)
SG (male, b. 1910)     1961 (Southampton, BBC)   19              32         37.3 (19/51)
                       1970 (TdC)                35              55         38.9 (35/90)

Figure 2. /h/ insertion in four TdCE speakers by place of interview (Southampton vs. TdC)
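As a cross-check on Table 2, the rates and the shift between the two settings can be recomputed from the raw counts. The following is a minimal sketch: the counts are copied from the table, while the variable and function names (`TABLE_2`, `rate`, `shifts`) and the labels "BBC" and "TdC" are introduced here purely for illustration.

```python
# Counts copied from Table 2: (speaker, setting, /h/ inserted, /h/ zero).
# "BBC" = Southampton recordings; "TdC" = recordings on or en route to Tristan.
TABLE_2 = [
    ("MS",  "BBC", 18, 50),
    ("MS",  "TdC", 42, 58),
    ("WR",  "BBC", 25, 75),
    ("WR",  "TdC", 27, 39),
    ("BGG", "BBC",  7, 41),
    ("BGG", "TdC", 17, 52),   # 1963 recording
    ("BGG", "TdC", 26, 67),   # 1964 recording
    ("SG",  "BBC", 19, 32),
    ("SG",  "TdC", 35, 55),
]


def rate(inserted, zero):
    """Insertion rate in per cent, rounded to one decimal as in the table."""
    return round(100.0 * inserted / (inserted + zero), 1)


def shifts(rows):
    """Return {speaker: [TdC rate - BBC rate, ...]} in percentage points."""
    bbc = {spk: rate(h, z) for spk, setting, h, z in rows if setting == "BBC"}
    out = {}
    for spk, setting, h, z in rows:
        if setting == "TdC":
            out.setdefault(spk, []).append(round(rate(h, z) - bbc[spk], 1))
    return out


for speaker, diffs in shifts(TABLE_2).items():
    print(speaker, diffs)
```

Run on these counts, the sketch gives shifts of 15.5 percentage points for MS, 15.9 for WR, 10.0 and 13.4 for BGG (two island recordings), and 1.6 for SG.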

The results show that all four speakers tend to have higher /h/ insertion rates when interviewed on Tristan da Cunha (or en route to the island), thus on or close to their home ground. While varying between /h/ insertion or Ø in both settings, all speakers have higher /h/ rates when interviewed by Peter Munch or Roland Svensson. This is even more remarkable since the questions and topics raised were not much different, mostly focusing on life on Tristan and the South Atlantic, reminiscences of the time before the volcanic eruption, the evacuation and also the experiences in England, which was of particular interest since the islanders decided to return home in a democratic vote held in 1962 (see Schreier and Lavarello-Schreier 2011). The total difference is remarkable: in three speakers



(MS, BGG and WR), it amounts to roughly 10 to 16 percentage points. The one exception is SG, where the increase, while still manifesting itself, is slight (+1.6 percentage points). Here it is striking that /h/ insertion in the Southampton recording is very high to start with (37.3 per cent), so this may well be one of the strongest ‘/h/ inserters’ in the entire corpus.

4 Conclusion

This chapter has given a short introduction to the sociolinguistic value of different corpora on late nineteenth century and twentieth century TdCE, all of which will be subject to further research. A major dialect study based on several corpora carries immense potential (real vs. apparent time, style shifting, individual variation), though doubts remain as to what extent data obtained under different conditions can be compared stricto sensu. Criteria related to validity, formality and authenticity have to be assessed and taken into consideration at all times. The potential of corpus comparison in historical linguistics has been dealt with in recent handbooks (in Krug and Schlüter 2013), particularly the usefulness of comparative research (qualitative vs. quantitative approaches, which depends on the amount of data available via the length of recordings, as for instance in group vs. individual interviews). The data can be used for apparent-time studies (cf. Schreier 2003), for descriptive purposes or a reconstruction of earlier forms (Schreier and Trudgill 2006), or for research on historical phonology (cf. Figure 1 above).

The stance taken here is that the existing corpora are most suitable for an analysis of effects of setting and interviewer. The findings confirm Labov’s maxim ‘there are no single-style speakers’, and it is noteworthy that all the speakers studied show sensitivity to social context. All four speakers are more /h/-ful (inserted before words such as island, on, I, Albert, etc.) when they are talking to someone they are familiar with (a Norwegian sociologist, a Swedish painter, informed outsiders both) and in a place where they feel comfortable (in their own homes or on a ship in the South Atlantic Ocean), despite the fact that the topics raised are very similar. This shows that elderly speakers of hyper-isolated dialects such as TdCE are certainly aware of social constraints on variability, perhaps even more so than one would have expected. Though three of them had never left the island or spent time outside the local community (the fourth was born in the Cape to a Tristanian father and Irish/South African mother and moved to Tristan aged 8), they all show the same context-related effects. At the risk of slight exaggeration, this is some indication that speakers in hyper-isolated communities do show remarkable awareness as to when and with whom to use local features and when not. Hyper-isolated speakers style-shift with interviewer and context, and local features are used more frequently in informal settings.

Some problems for analysis remain to be solved, such as assessing the impact of individual parameters (topic, interlocutor, etc.), the individual orientation of speakers toward island life and the outside world (which has been shown to be a crucial factor in variation), and of course ‘proactive’ variation (enregisterment, identity formation, etc.). In this sense, the findings presented here can be no more than a first step. At the same time, the potential for future research is immense. The next steps would be to look into variation within speakers recorded on Tristan under similar



contexts (by Svensson in the 1970s and Schreier in 1999/2010), which would allow for change in real-time, to look (perhaps in a more qualitative approach) at the overall presence of features in speakers born between 1876 and 1992, which would allow one to investigate the overall usage (or perhaps the disappearance) of features or to reconstruct earlier forms (perhaps even the inputs). More research on different variables is definitely needed and the corpora, if handled with care, may certainly hold the key both to change in twentieth century TdCE and the origins of the dialect in the nineteenth century.

Acknowledgments

I wish to thank Erik Thomas, North Carolina State University, for his help with producing the vowel plot, Nicole Studer-Joho, for her help with the history of /h/ in English, and Danae Perez-Inofuentes and Nicole Eberle, University of Zurich, for most valuable comments on an earlier draft version of this paper. I also wish to acknowledge the input of the editor of the volume, Raymond Hickey, whose input was most helpful, as always.

References

Batchelor, Thomas 1809. An Orthoepical Analysis of the English Language. London: Didier and Tebbett.

Chambers, J. K. 2004. Dynamic typology and vernacular universals. In: Bernd Kortmann (ed.). Dialectology Meets Typology. Berlin: Mouton de Gruyter, pp. 127-145.

Childs, Becky, Jeffrey Reaser, and Walt Wolfram 2003. Defining ethnic varieties in the Bahamas: Phonological accommodation in black and white enclave communities. In: Michael Aceto, and Jeffrey P. Williams (eds). Contact Englishes of the Eastern Caribbean. Amsterdam: John Benjamins, pp. 1-28.

Clarke, Sandra 2010. Newfoundland and Labrador English. Edinburgh: Edinburgh University Press.

Crawford, Allan 1945. I Went to Tristan. London: Allen and Unwin.

Crawford, Allan 1982. Tristan da Cunha and the Roaring Forties. London: Allen and Unwin.

Earle, Augustus 1966. Narrative of a Residence on the Island of Tristan D’Acunha in the South Atlantic Ocean (first ed. 1832). Oxford: Clarendon Press.

Evans, Dorothy 1994. Schooling in the South Atlantic Islands 1661-1992. Oswestry: Anthony Nelson.

Holm, John 1988. Pidgins and Creoles. Volume 1: Theory and Structure. Cambridge: Cambridge University Press.

Jespersen, Otto 1909-1949. A Modern English Grammar on Historical Principles, 7 vols. Copenhagen: Einar Munksgaard. [Reprinted, London: George Allen and Unwin, 1961, 1965, 1970, 1974.] (MEG).

Krug, Manfred and Julia Schlüter (eds) 2013. Research Methods in Language Variation and Change (Studies in English Language Series). Cambridge: Cambridge University Press.



Labov, William 1982. Objectivity and commitment in linguistic science. Language in Society 11: 165-201.

Labov, William. Some sociolinguistic principles. In: Christine Bratt Paulston and G. Richard Tucker (eds) Sociolinguistics: The essential readings. Oxford: Blackwell, pp. 234-50.

Lass, Roger 2006. Phonology and morphology. In: Richard Hogg, and David Denison (eds). A History of the English Language. Cambridge: Cambridge University Press, pp. 43-108.

McIntosh, Angus, Michael Louis Samuels, Michael Benskin, Margaret Laing, and Keith Williamson (eds) 1986. A Linguistic Atlas of Late Medieval English. Aberdeen: Aberdeen University Press.

Milroy, James 1992. Linguistic Variation and Change: On the Historical Sociolinguistics of English. Oxford and Cambridge MA: Blackwell.

Munch, Peter A. 1945. Sociology of Tristan da Cunha. Oslo: Det Norske Videnskaps-Akademi.

Rickford, John and Faye McKnair-Knox 1994. Addressee- and topic-influenced style shift: A quantitative sociolinguistic study. In: Biber, Douglas and Edward Finegan (eds) Sociolinguistic perspectives on register. Oxford: Oxford University Press, pp. 235-276.

Schilling, Natalie 2013. Sociolinguistic Fieldwork. Cambridge: Cambridge University Press.

Schreier, Daniel 2002. Terra incognita in the Anglophone world: Tristan da Cunha, South Atlantic Ocean. English World-Wide 23: 1-29.

Schreier, Daniel 2003. Isolation and Language Change: Sociohistorical and Contemporary Evidence from Tristan da Cunha English. Houndmills/Basingstoke and New York: Palgrave Macmillan.

Schreier, Daniel 2005. Consonant Change in English Worldwide: Synchrony Meets Diachrony. Houndmills/Basingstoke and New York: Palgrave Macmillan.

Schreier, Daniel 2006. The backyard as a dialect boundary? Individuation, linguistic heterogeneity and sociolinguistic eccentricity in a small speech community. Journal of English Linguistics 34: 26-57.

Schreier, Daniel 2008. St Helenian English: Origins, Evolution and Variation. Amsterdam: John Benjamins.

Schreier, Daniel 2010. The consequences of migration and colonialism II: Overseas varieties. In: Peter Auer, and Jürgen E. Schmidt (eds). Language and Space: An International Handbook of Linguistic Variation. Berlin and New York: Walter de Gruyter, pp. 451-467.

Schreier, Daniel 2014. Variation and Change in English: An Introduction. Berlin: Erich Schmidt Verlag.

Schreier, Daniel and Karen Lavarello-Schreier 2011. Tristan da Cunha and the Tristanians. London: Battlebridge.

Schreier, Daniel and Peter Trudgill 2006. The segmental phonology of nineteenth century Tristan da Cunha English: Convergence and local innovation. English Language and Linguistics 10: 119-141.

Walker, John 1791. A Critical Pronouncing Dictionary and Expositor of the English Language. London: G. G. and J. Robinson and T. Cadell.

Wyld, Henry Cecil 1925. A History of Modern Colloquial English. Third edition. London: T. Fisher Unwin Ltd.



Zettersten, Arne 1969. The English of Tristan da Cunha. Lund: Gleerup.


21 Open vowels in historical Australian English

Felicity Cox and Sallyanne Palethorpe

1 Introduction

In this chapter we will examine the open vowels in Australian English (henceforth AusE) through an acoustic analysis of modern and historical audio data selected to represent each end of a one hundred-year time span, from the turn of the twentieth century to the turn of the twenty-first century. Our analyses will attempt to answer some unresolved questions about the historical aspects of AusE and the changing relationships between the open vowels. We will concentrate our attention on /ɐː/¹ (which in AusE occurs in the BATH, START and PALM lexical sets) and /ɐ/ (from the STRUT lexical set). These two vowels are relatively recent additions to the English phoneme inventory (Wells 1982; Gimson 1980; MacMahon 1998; Beal 1999; Lass 2000) and they developed from different sources in Early Modern English (Beal 1999). Early Modern English is the period usually described as the two hundred years between 1500 and 1700 (Nevalainen 2006: 9) but see Görlach (1991: 9-10) for some alternative timeframes.

The vowel in the BATH, START and PALM lexical sets derives historically from a phonemic split that affected the Middle English (ME) short open front vowel, which was similar to the IPA [a]² (or the slightly closer [æ]), through a complex process that involved both lengthening and retraction. Beal (1999: 106-107) and Lass (2000: 104-105) explain that lengthening preceded retraction and occurred in the following contexts:

1. In the BATH lexical set, where the vowel occurs before the fricative codas /s, f, θ, ð/ (as in words like pass, staff, path, father) and variably before nasal plus consonant codas (/nC/) (can’t, plant), changes occurred progressively. According to Beal (1999: 108), lengthening in the context of /θ/ (bath) and /ð/ (father) began in the 1600s, extending to /s/ (pass), /f/ (staff) and /nC/ (plant) in the 1700s and 1800s.

2. In the START lexical set, where the vowel occurs before codas containing /ɹ/ (as in words like far, card, carp), the progress of change also interacted with phonetic context. The process began in environments where the vowel preceded complex codas which contained /ɹ/ followed by a voiced final consonant (as in card) possibly as early as the late 1600s and through the early 1700s (Beal 1999: 111-112). The effect further extended to include complex codas containing voiceless consonants (e.g. carp) and then to simple /ɹ/ codas (e.g. far) in the later 1700s (Beal 1999: 115). The change was contiguous with /ɹ/ dropping and involved a complex interaction between /ɹ/ weakening in certain contexts and compensatory lengthening. See also McMahon (2000: 230-285) for discussion of /ɹ/ loss which was stigmatized in the 1700s and early 1800s (Beal 1999: 165).

3. In the PALM lexical set, where the vowel precedes codas containing /lf/, /lm/ and /lv/ (as in words like half, calm, calve), the change occurred subsequent to the diphthongization of the short open front ME vowel to [au] before dark /l/ and began around 1400 (Dobson 1968a: 553). This process was combined with the loss of /l/ in complex codas. Lengthening of the vowel did not become common until later “in educated Standard English, in which the l was still pronounced until the late sixteenth century” (Dobson 1968a: 604). Variants containing a long vowel in this lexical set were considered unacceptable in the earlier period but were embraced in the 1600s (Dobson 1968a: 604-605; Beal 1999: 105).

1 In our discussion of Australian English vowels we will use the phonemic vowel symbols recommended for Australian English by Harrington, Cox and Evans (1997), Cox and Palethorpe (2007) and Cox (2012). We will also use the lexical sets devised by Wells (1982).

2 Symbols enclosed in square brackets represent IPA reference vowels or close approximation to vowel production. Symbols enclosed in slant brackets represent lower level phonemic categories.

Once lengthening had taken place in these lexical sets, retraction of the resulting vowel [aː] began to occur in the late 1700s and early 1800s. This was precisely the time of the first permanent European settlement in Australia which began in 1788 with the arrival of the first fleet of convicts transported to New South Wales. Characteristics of AusE can therefore be traced to this physical separation from England in the later part of the 1700s and early 1800s.

The lengthening and backing change to the BATH set was never completed in English and there remain exceptions such that Southern British English (henceforth SBE) has a long vowel in pass but not gas, and in can’t but not expand (Beal 1999: 105). In AusE there is variability in the choice of vowel for words containing the complex nasal coda /nC/ such as dance, plant, advance (see Oasa 1989; Bradley 1991; Horvath and Horvath 2001a). Both the short front vowel [æ] (TRAP) and the more retracted long vowel [ɐː] (BATH) are variably found in these words. In SBE the open back vowel [ɑː] is the usual form and became so, according to Lass (2000: 107), by the mid 1800s following a period of some fluctuation where the long vowel was common for a while but then lost ground in the late 1700s to become fashionable again by the mid 1800s. By the time the long vowel had taken hold in SBE, AusE was already well established. Written sources reveal that a unique AusE accent was present by the 1830s (Moore 2008: 76). Horvath and Horvath (2001b: 53) speculate that some accent variation in AusE may be explained by patterns of settlement history. Although the connection is not made explicit, their findings suggest that the long vowel [ɐː] is proportionately more likely to occur in the dance subset in those regions that were settled later in the nineteenth century (such as South Australia) whereas the short vowel [æ] occurs in older settlements like NSW and Queensland. Incidentally, New Zealand English (henceforth NZE) uses the long vowel in the dance lexical set, possibly also reflecting settlement chronology.

Based on pronunciation sources from the late 1700s and early 1800s, scholars deduce that the long vowel in the dance set was stigmatised at this time. Walker (1791: 11) gives the assessment that pronunciation of the long vowel in words like plant and answer “borders on vulgarity”. MacMahon (1998: 456) also discusses social evaluation of the varying qualities of this long open vowel, finding that when the more retracted variant first arose it was restricted to the lower social classes, but this may have occurred after the AusE accent had already taken hold.

For the remainder of this chapter, we will refer to the BATH, START and PALM lexical sets as START for simplicity as all three sets contain the same vowel /ɐː/ in AusE. As many speakers use the short vowel /æ/ in words from the dance subset these are included as examples of TRAP and therefore will only be discussed with reference to that vowel.

The second vowel under focus in this chapter is contained in the STRUT lexical set. It developed independently from the processes outlined above as a phonemic split from ME short [u] (Dobson 1968a: 585, 587), which is alternatively described in Gimson (1980: 79) as the “ME fronted back, half-close lip rounded [ʊ]”. The first change affecting this vowel was unrounding in the 1600s but there is some evidence of earlier unrounding in “barbarous speech” (Dobson 1968a: 585, 587). Gimson (1980: 111) suggests that the unrounding stage was achieved “by or during the seventeenth century in the London region, though at the time the lowering may not have been very considerable”. The vowel then lowered further to the position of cardinal 6 (the half-open unrounded [ʌ]) by the 1700s, proceeding below half-open by the end of the 1800s (Gimson 1980: 111). Subsequent further lowering and fronting to [ɐ] occurred during the twentieth century (MacMahon 1998: 457). Gimson (1980: 111) goes as far as to suggest that modern “regional speech of the London area” has an open front vowel very close to cardinal 4 and that conservative speakers of Received Pronunciation (RP) often use a more retracted vowel. Henton (1990) analysed citation form vowels for twenty male and twenty female speakers of RP and found that STRUT was “no longer a half-open back vowel” but was instead raised and centralized (Henton 1990: 211). Harrington, Palethorpe and Watson (2000) report data presented in Deterding’s (1997) analysis of RP female newsreaders’ connected speech from the 1980s, and similarly show a central vowel for STRUT, although with a great deal of variability in height. Harrington et al. (2000: 76) did not find evidence of fronting, which was consistent with the findings in Bauer (1985). Hawkins and Midgley (2005), in an apparent time study comparing citation-form data from RP speakers spanning birth years 1928 to 1981, did not show any notable change across speakers for either START or STRUT. This indicates lack of change during the middle part of the twentieth century for these vowels, calling into question Gimson’s (1980) description of RP STRUT as retracted.

Although our focus is the open vowels START and STRUT, another open vowel that has an important role to play in AusE sound changes is /æ/ (TRAP). MacMahon (1998) describes TRAP as being qualitatively variable in Britain between 1770 and 1820, ranging from cardinal 3 to cardinal 4 but raised in affected speech, and Ellis (1889: 594) indicates that TRAP in the later part of the nineteenth century was closer to cardinal 3 [ɛ]. The lowering of TRAP towards cardinal 4 [a] is considered a late twentieth century innovation in SBE (MacMahon 1998: 465; see also Wells 1982: 291-292 and Bauer 1994: 119). This lowering change is clearly documented in Hawkins and Midgley (2005: 187-188) and also in the analysis of the Queen’s changing vowels by Harrington et al. (2000).



The characteristics of present day and archival AusE speech data allow us to explore the relationship between historical and ongoing sound changes and the resulting phonological characteristics. Examining historical data may provide some insight into how the changes described above were affected by transportation of English to Australia.

2 Australian English

Historically, there has been some inconsistency in descriptions of the open vowels in AusE. Older impressionistic studies refer to AusE START and STRUT as front vowels. Dobson (1968b: 32) states that one of the main differences between AusE and SBE is in the “pronunciation of the vowel [ɑ:] in such words as dark, half, path; the Australian vowel is commonly a low front vowel [aː], the English a mid-back or even fully-back [ɑː]”. Mitchell (1958: 59) echoes these comments: “The English a … is a deep, retracted sound. The Australian a is a clear front sound”. However, Mitchell modifies this position somewhat in Mitchell and Delbridge (1965: 36). Although still ascribing front vowel status to START, it is considered to be “more than half the distance” from cardinal 5 [ɑ] to cardinal 4 [a], while STRUT is described as front and raised. For Wells (1982: 599), AusE START is “central to front …noticeably fronter than in RP” with STRUT similar to Cockney and “between half open and open, just fronter than central, unrounded” (Wells 1982: 597). Wells (1982: 599) also speculates that the fronting of STRUT is a “drag-chain consequence of the movement of /æ/ up and away from cardinal 4”. Perhaps part of the reason the open AusE vowels have been described as front vowels relates to a simple comparison with SBE (Henton 1990). The SBE START vowel is much more similar to cardinal 5 (Roach 2004: 242) than is the AusE variant. For Dobson (1968b: 38), “the prevalence of low-front [aː] in Australian English in such words may be a survival of the eighteenth-century pronunciation, not a modification of the present day Standard English sound; in other words, it may be Standard English that has changed”. His suggestion is that SBE has continued the trajectory of START backing that began in the late eighteenth century, but in AusE the change was arrested prior to retraction.

It is, however, incorrect to refer to present-day AusE START and STRUT as front vowels. Acoustic analyses provide no support for the fronted categorization (see for example, Bernard 1970c; Harrington, Cox and Evans 1997; Cox 1999, 2006; Cox and Palethorpe 2008). Analyses of modern speaker datasets collected between 1965 and 1995 show that START and STRUT fall in the open central region of the vowel space and, until very recently, occupied the lowest point of the inverted vowel triangle. In present day AusE this position has been assumed by /æ/ (see for example, Cox and Palethorpe 2008). Figure 1 illustrates the average monophthong vowel space for present-day AusE based on acoustic analysis of citation-form /hVd/ data collected from 116 young female speakers from Sydney (Cox and Palethorpe 2012).


Figure 1. Monophthong F1/F2 vowel space plot for citation-form data collected from 116 young females from Sydney between 2005 and 2009.

One interesting aspect of AusE is that the phonology supports at least two pairs of vowels that are differentiated by length in the absence of spectral cues (e.g. /ɐː/ and /ɐ/ heart/hut, /eː/ and /e/ shared/shed) (Bernard 1967, 1970b; Cochrane 1970; Harrington et al. 1997; Watson and Harrington 1999; Cox 2006). The pair that has received the most attention in the literature is START and STRUT, which have minimal spectral differences in the F1/F2 plane (see for example, Cox 2006), indicating that they have very close articulatory similarity. In an acoustic analysis of 120 adolescent speakers’ vowels, Cox (2006) showed that the mean F1 and F2 values for START and STRUT were not differentiated but the vowels were instead separated from one another by length (282 msec for START, 161 msec for STRUT). Fletcher and McVeigh (1993) confirmed the durational contrast between START and STRUT in an analysis of data from the Australian National Database of Spoken Language (ANDOSL) corpus (Vonwiller, Roger, Cleirigh and Lewis 1995), and Watson and Harrington (1999) used a set of Gaussian classification experiments to show that excluding information on vowel duration greatly reduced the ability of their models to correctly classify START and STRUT. Bernard (1967) manipulated vowel duration in two identification experiments and confirmed that length was the primary cue in signalling the distinction between this vowel pair and, in a follow-up paper, he also provided X-ray evidence for the undifferentiated articulatory position for these two vowels (Bernard 1970a). Recent work on child language acquisition has found that AusE-speaking children as young as 18 months (Chen, Xu Rattanasone and Cox 2014) and 3 years (Yuen, Cox and Demuth 2014) have the ability to successfully use duration in their speech production and perception to separate the AusE long/short vowel pairs. What is unclear is when and how the loss of spectral differentiation between START and STRUT occurred in historical AusE.

3 Chain shifting

Chain shifting is a phonetically motivated process that ensures speech sounds remain sufficiently separate from one another within the constraints of the sound system. Such changes result from the interplay between hyperarticulation (listener induced clarity) and hypoarticulation (speaker induced economy of gesture) (Lindblom 1990) and may result in drag- and push-chains where sounds can “push” and “pull” each other in order to preserve the functional economy of the phonemic vowel system (Labov 1994: 117). The short front vowels /ɪ, e, æ/ (KIT, DRESS, TRAP) in Southern Hemisphere varieties of English including AusE, NZE and South African English are traditionally considered raised relative to SBE. However, AusE TRAP has lowered over the past 40 years to its current position at the extreme open front of the vowel space (see Figure 1), and there is recent evidence that DRESS and KIT are also lowering (Cox and Palethorpe 2008). This change is in the opposite direction to the NZE short front shifts, which have shown raising of DRESS and TRAP and centralization of KIT (Watson, Maclagan and Harrington 2000). Bauer (1979: 59) suggests that Southern Hemisphere front vowel raising was precipitated by fronting of STRUT and he offered a push-chain explanation. Conversely, Wells (1982: 599) considered STRUT fronting to be a consequence of a drag-chain from TRAP. Bauer has since revised his explanation for the short front raising shift (Bauer 1992) and, more recently, Gordon et al. (2004: 265) have proposed that raised TRAP was brought to New Zealand by the early settlers from southern England and that further raising continued a trajectory of change (push-chain) that had already begun prior to New Zealand colonial settlement. Watson et al. (2000: 65) “favour a drag-chain account of front vowel raising of head and had in Australian and NZE” but reserve judgement on the possible effect of hud (STRUT). Gordon et al. (2004: 205) offer the possibility that START may have been implicated in the raising process in New Zealand after finding a correlation between impressions of START fronting and TRAP raising, although this scenario is unlikely as START and TRAP are members of separate phonological subsets, with START being a long (unchecked) vowel whereas TRAP is a short (checked) vowel. The possibility of influence from STRUT was not considered in their analysis and correlational analyses for this vowel were not undertaken. Questions remain about the nature of the vowel shifts and the timing of the various changes in both AusE and NZE.

In this chapter we will examine historical aspects of AusE through an acoustic phonetic analysis of archival sound recordings from people born in the latter part of the nineteenth century and we will compare these historical data with audio data from a group of speakers born in the late twentieth century. This will allow us to examine the relationship between the AusE open vowels at each end of a one hundred-year time span. Leitner (2004) views the period from the 1850s to 1880s as the second formative phase in AusE accent development, following the accent inception phase. The accent inception phase can never be directly examined because sound recording devices did not exist until the late nineteenth century and were certainly not in common use until the middle of the twentieth century. Most of the oldest recorded AusE speakers would have been born around 1870 (or possibly slightly earlier), around eighty years after permanent European settlement began and at a time when AusE had already been very firmly established for nearly fifty years (Moore 2008). Speakers from our historical dataset were born in the period just after the second formative phase of AusE accent development.

Work such as this hinges on the idea that the speech of any individual is a snapshot of time and place, reflecting both areal and temporal aspects of their personal history. Although it is acknowledged that dialect shift may occur throughout a person’s lifetime (Harrington, Palethorpe and Watson 2005), once an individual reaches puberty such accent change usually proceeds more slowly than changes occurring in the dialect through evolution (Labov 1994). In this project we make the assumption that the speech contained in the historical dataset should bear strong resemblance to the accent types that were current in the community around the turn of the twentieth century.

There has been surprisingly little empirical examination of Australian accent history. In contrast, New Zealand researchers have extensively examined historical NZE in the Origins and Evolution of New Zealand English Project (ONZE). The ONZE Project is based on recordings of speakers born in New Zealand in the second half of the nineteenth century made by the New Zealand National Broadcasting Service Mobile Disc Recording Unit in 1948 (Gordon et al. 2004). There is some debate about the origins of NZE pronunciation with researchers questioning the role that AusE had to play in the development of that variety (Gordon et al. 2004). Trudgill (2004: 10) argues that NZE evolved separately from AusE based on evidence from the Mobile Unit recordings and his observations that “none of these speakers sound much like Australians and most of them sound nothing like them at all”. Conversely, Gordon et al. (2004: 225, 326) comment that one of the Mobile Unit speakers, Mr George Firth who was born in Tasmania in 1875, and migrated to New Zealand in his early twenties, did not use speech patterns that were different from New Zealanders of a similar age, indicating that the two varieties may have been similar at the time. Evans and Watson (2010: 200) suggest that “whether the Australian accent had a considerable influence on the development of the New Zealand accent in the earliest years of colonisation, is a question likely to remain unanswered”. In their acoustic analysis comparing two speakers of historical AusE and NZE with their ‘modern’ counterparts, they found considerable change in the short front series of vowels for the NZE speakers compared to the AusE speakers confirming the divergence between the two accents during the twentieth century but they also showed considerable similarities for the speakers born in the late 1800s.

Many of the recent ideas relating to the development of Southern Hemisphere accent varieties are based on the ONZE Project (Gordon et al. 2004). An auditory quantitative analysis of the START vowel produced by 59 speakers born in New Zealand between 1851 and 1905 and including a total of 2,273 tokens (Gordon et al. 2004: 129-130) showed that only 1% of START vowels could be considered back vowels, whereas 52% were central and 47% were fronted variants. They conclude from this that the “earliest immigrants to New Zealand brought a relatively front START vowel with them from Britain.” (Gordon et al. 2004: 130). They also showed that males and speakers born before 1875 produced fewer fronted variants (Gordon et al. 2004: 132). These findings indicate that START arrived in New Zealand as relatively fronted but continued a trajectory of fronting during the late 1800s. Gordon et al. (2004: 139) report on a global auditory perceptual assessment of STRUT for 95 Mobile Unit speakers’ vowels made by Peter Trudgill. Analysis showed that approximately 40% of speakers “habitually use back variants that are above open-mid …. 15% use some fronted tokens ([ə]) and another 7%, ….. use some back realizations.” In addition, 20% used “somewhat fronted variants” and 15% used the open central modern type [ɐ]. They report that the fronted and lowered type became increasingly common over time, as determined through an analysis of birthdate, with those born later producing the more modern open central form. However, they do not comment on the characteristics of the oldest speakers.

In addition, Gordon et al. (2004: 91) conducted a detailed acoustic analysis of data from a subset of 10 speakers who “were chosen to be as representative of the Mobile Unit database as possible.” Five men and five women born between 1864 and 1886 from various settlement types and from both the North and South Islands were included. At least 20 tokens of each vowel were analysed including 50-70 for the front vowels and closing diphthongs in words carrying sentence stress. This acoustic analysis revealed that “STRUT is significantly closer and significantly more front than START” (Gordon et al. 2004: 139).

AusE researchers have not previously been in a strong position to contribute to the discussion of Southern Hemisphere English accent history because there has been so little analysis of historical AusE data. This chapter begins to redress this imbalance by contributing some empirical analysis to the debate. We will shed light on the similarities and differences between AusE and NZE before the turn of the twentieth century with respect to the open vowels, START and STRUT. Australia and New Zealand were settled during the retracting phase of START and the fronting phase of STRUT. Both changes were progressive and socially stigmatized at the time (MacMahon 1998). We will additionally consider the position of neighbouring vowels, in particular TRAP, with regard to chain shifting hypotheses (Labov 1994: 117).

4 Hypotheses

Based on the historical progression of change documented in previous assessments of primary written sources and in the ONZE findings (and given the close historical association between Australia and New Zealand), we expect our historical data to display the following characteristics: START will be central to front (Gordon et al. 2004: 129-130) and be separated from STRUT in both height and fronting (Gordon et al. 2004: 139). There are two competing hypotheses to explain central START in AusE: firstly that START is not fully retracted in AusE because settlement occurred mid-way through the retraction phase, and secondly that START became more fronted compared with Southern British varieties after settlement. Recall that retraction was considered the less refined production of the time.



According to Gimson (1980: 111), STRUT was lowering during the inception stage of Australian European settlement, achieving the back half-open position by the end of the 1800s. Subsequent further lowering and fronting was considered to occur during the twentieth century in SBE (MacMahon 1998: 457). Characteristics of our historical data may help to clarify aspects related to the chronology of changes to STRUT. We expect that some speakers will have back STRUT, as was found by Gordon et al. (2004: 139) for 47% of their speakers (the approximately 40% who habitually used back variants together with the 7% who used some back realizations), which would support MacMahon’s (1998: 457) view that STRUT fronting has occurred relatively recently.

These data may also be able to provide some insight into the chain shifting hypotheses that describe the interaction between STRUT and TRAP and short front vowel raising. Wells (1982: 599) suggests that fronted STRUT could be the result of a drag-chain from raised TRAP and Gordon et al. (2004: 265) propose a push-chain effect from TRAP. Watson et al. (2000: 65) do not discount the possibility of two processes occurring in a complementary manner involving a push-chain from STRUT combined with a drag-chain from KIT.

5 Method

5.1 Speakers

Seven men and three women born in the last twenty years of the nineteenth century were recorded while engaged in oral history interviews with researchers or the speakers’ family members or friends between 1962 and 1985 on reel-to-reel or cassette tape. Four of the speakers’ interviews were located on reel-to-reel tapes stored in the Department of Linguistics at Macquarie University and were discovered in 2004. These recordings were made by Alex Hood. An additional three interviews were sourced from the NSW Bicentennial project which contains recorded interviews with 200 men and women over eighty years of age who had lived in NSW between 1900 and 1930. These recordings were made in 1987 onto cassette tape and are now available from the State Library of NSW on digital media. The remaining three of the historical recordings came to us through private bequests (the Australian Ancestors corpus). All the speakers were from rural working class backgrounds from NSW. Table 1 shows the date of birth (DOB), place of birth (POB), occupation, the source of the data (corpus), and year the recordings were made if known. The identity of these speakers is protected here.

The modern data used in this analysis were taken from interviews conducted in 1998 with five year-10 schoolboys in rural NSW collected for the Australian Voices project (Cox and Palethorpe 2008).

Table 1. Details of the speakers selected for the historical analysis.

Speaker   DOB   Born            Occupation    Corpus        Recorded
Mr Ben    1880  Gulgong         shearer       Ancestors     1977
Mr Nix    1884  Wagga Wagga     roustabout    Alex Hood     1968 +
Mr Jar    1888  Coonamble       drover        Alex Hood     1968 +
Mr Mor    1893  Drake           miner         Alex Hood     1968 +
Mr Gre    1897  Guyra           farmer        Bicentennial  1987
Mr Deb    1897  Taree           clerk         Bicentennial  1987
Mr Gol    1899  Cooma           drover        Bicentennial  1987
Mrs Bru   1887  Mullengandera   farmer        Alex Hood     1968 +
Mrs Ell   1889  Rylestone       teacher       Ancestors     unknown
Mrs Dav   1890  Kundabung       house-keeper  Ancestors     before 1987

5.2 Phonetic Data

All speech sounds have been extracted from unscripted continuous speech. Our analysis is restricted to eleven of the stressed monophthongs of AusE in a variety of consonantal contexts. These vowels are exemplified by the words bead, bid, bed, bad, bard, bud, pod, board, boot, put, bird. For the present analysis we used all the monophthongs to create the vowel spaces but our statistical analysis is restricted to the vowels /ɐː/ (START) and /ɐ/ (STRUT). Table 2 shows the number of tokens analysed for each vowel per speaker.

Table 2. The number of tokens analysed per vowel per test speaker.

Speaker - historical   /ɐː/ START   /ɐ/ STRUT
Ben                    56           91
Bru                    77           144
Dav                    39           112
Deb                    63           86
Ell                    89           163
Gol                    53           120
Gre                    72           136
Jar                    26           75
Mor                    9            32
Nix                    27           42
TOTAL                  511          1001

Speaker - modern       /ɐː/ START   /ɐ/ STRUT
#15                    9            11
#19                    9            10
#20                    11           16
#21                    10           15
#41                    4            20
TOTAL                  43           72

5.3 Data extraction

All speech data were transferred to digital media prior to analysis. Note that the historical data files do not contain any acoustic information above 8000 Hz owing to limitations of the original recording equipment. We selected approximately 30 minutes of data from each of the archival oral history interviews. For the modern data there was typically only between five and ten minutes of recording per speaker, hence the reduced number of tokens relative to the historical interview data. All words included in the analysis carried sentence stress and were chosen from unrestricted consonantal contexts to ensure that the maximum number of tokens would be available for analysis.

Traditionally, the acoustic quality of a vowel is described with reference to the frequencies of the first two formants (resonance peaks). These acoustic measurements are highly correlated with articulatory data and give a representation of how the vowels are produced. Formant 1 (F1) is inversely related to vowel height so that phonetically higher vowels have lower F1 values. Formant 2 (F2) is related to articulatory fronting such that front vowels have higher F2 values. Objective data such as these allow us to make comparisons within and between speakers.

The frequencies of the first two formants were automatically tracked using ESPS/Waves (twelfth order LPC analysis with a 49 ms raised cosine window and a frame shift of 5 ms). Labelling was carried out using the Emu speech database system (Cassidy and Harrington 2001, http://emu.sourceforge.net/) with reference to wideband spectrograms and aligned waveforms. The beginning and end of each prosodically accented vowel were hand-labelled according to criteria developed for the ANDOSL corpus (Vonwiller et al. 1995) and documented in Croot and Taylor (1995). The resulting formant traces were hand-corrected where required and the frequencies of F1 and F2 at the vowel target(s) were extracted. Results will be presented in F1/F2 vowel space plots to illustrate the relative positions of the vowels within the monophthong space and will be presented in Hertz to facilitate comparison with other datasets.

For the historical data, speaker dependent normalization was carried out (Lobanov 1971) to reduce the variability resulting from speaker specific (primarily physiological) differences. Lobanov’s formula standardises each formant value against the speaker’s own mean and standard deviation for that formant. Mixed model analyses were conducted in SPSS to compare START and STRUT for F1, F2 and duration separately for the historical data set, with speaker included as a random factor. The modern speakers could not be statistically examined because an insufficient number of tokens was available for analysis. The modern data are hence provided here as a general comparison. A further problem with direct comparison between elderly and young speakers is that the physiological effects of aging make such comparisons difficult to interpret (Linville and Rens 2001).
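The speaker dependent normalization described above can be illustrated with a minimal Python sketch. It assumes the usual reading of Lobanov's procedure, in which each formant value is converted to a z-score using that speaker's mean and standard deviation across all of his or her vowel tokens. The function name, the data layout and the formant values are invented for illustration, and the use of the sample standard deviation is likewise an assumption; none of this is taken from the chapter's own analysis pipeline.

```python
from collections import defaultdict
from statistics import mean, stdev


def lobanov_normalize(tokens):
    """Return copies of the tokens with per-speaker z-scored F1 and F2.

    tokens: iterable of dicts with keys "speaker", "vowel", "f1", "f2" (Hz).
    Each speaker needs at least two tokens with differing formant values.
    """
    # Group the tokens by speaker so that statistics are speaker specific.
    by_speaker = defaultdict(list)
    for tok in tokens:
        by_speaker[tok["speaker"]].append(tok)

    normalized = []
    for toks in by_speaker.values():
        # Mean and (sample) standard deviation of each formant for this speaker.
        stats = {}
        for formant in ("f1", "f2"):
            values = [t[formant] for t in toks]
            stats[formant] = (mean(values), stdev(values))
        # z-score every token against the speaker's own statistics.
        for t in toks:
            out = dict(t)
            for formant, (mu, sd) in stats.items():
                out[formant + "_norm"] = (t[formant] - mu) / sd
            normalized.append(out)
    return normalized


# Invented formant values for two hypothetical speakers (not data from the chapter).
example_tokens = [
    {"speaker": "A", "vowel": "FLEECE", "f1": 300.0, "f2": 2200.0},
    {"speaker": "A", "vowel": "START", "f1": 750.0, "f2": 1250.0},
    {"speaker": "A", "vowel": "STRUT", "f1": 740.0, "f2": 1450.0},
    {"speaker": "A", "vowel": "GOOSE", "f1": 350.0, "f2": 1000.0},
    {"speaker": "B", "vowel": "FLEECE", "f1": 360.0, "f2": 2600.0},
    {"speaker": "B", "vowel": "START", "f1": 900.0, "f2": 1500.0},
    {"speaker": "B", "vowel": "STRUT", "f1": 880.0, "f2": 1750.0},
    {"speaker": "B", "vowel": "GOOSE", "f1": 400.0, "f2": 1200.0},
]

normalized_tokens = lobanov_normalize(example_tokens)
for tok in normalized_tokens:
    print(tok["speaker"], tok["vowel"],
          round(tok["f1_norm"], 2), round(tok["f2_norm"], 2))
```

After normalization each speaker's values have a mean of zero and unit standard deviation on both formants, so vowel positions can be compared across speakers with different vocal tract sizes.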

6 Results


For the historical data, separation between the open vowels was found to be statistically significant for F2 (F(1,9.315) = 131.813, p<.0001) and for duration (F(1,9.347) = 217.439, p<.0001). There was no F1 difference. This shows that the vowels did not differ in height but were significantly separated in the horizontal dimension of fronting/retraction, with STRUT more fronted than START. Figure 2 displays the average vowel spaces for the male and female historical speakers’ normalized vowels, clearly showing the horizontal separation for START and STRUT. The vowels also differed in duration as expected, with STRUT being 47% shorter than START. The duration results for the historical data can be seen in the bottom panel of Figure 3. Although we are unable to make direct statistical comparisons between the historical data and the modern data owing to limitations inherent in the modern dataset, the F2 means for the male speakers do give some indication of how the vowels have changed. The historical male data show mean F2 values of 1255 Hz for START and 1442 Hz for STRUT, a difference of nearly 200 Hz. The mean values of F2 for the modern speakers’ data show that START and STRUT are not differentiated to the same degree as for the historical speakers. The means are 1305 Hz for START and 1375 Hz for STRUT, giving a difference of just 70 Hz. In the modern data the vowels have converged towards a single position (see Figure 4), an effect that is well documented in previous studies of modern AusE vowels (e.g. Harrington et al. 1997; Cox 2006; Yuen et al. 2014). There is also the expected durational difference for the modern data as illustrated in the top panel of Figure 3. Note that the modern data show more extreme length variability owing to the reduced number of tokens. The modern speakers also appear to have shorter vowels, but again these results cannot be directly compared with the historical data as the small number of tokens does not provide a representative sample. The phonetically lowered TRAP, known to be a feature of modern AusE (Cox 2006), is also clearly seen in Figure 4. It is important to be aware that these modern data are from adolescents recorded in 1998 and therefore do not truly represent present-day AusE (see Figure 1 for an example of more recent data).
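The arithmetic behind the F2 comparison can be laid out in a short Python sketch. The four mean values are those reported in the paragraph above; the variable and function names are illustrative. The result is purely descriptive: as noted in the text, the historical and modern groups cannot be compared statistically.

```python
# Mean F2 values (Hz) for the male speakers, as reported in the text above.
MEAN_F2_HZ = {
    "historical": {"START": 1255, "STRUT": 1442},
    "modern": {"START": 1305, "STRUT": 1375},
}


def f2_separation(means):
    """Horizontal separation: STRUT mean F2 minus START mean F2, in Hz."""
    return means["STRUT"] - means["START"]


def f2_shift(vowel):
    """Modern mean F2 minus historical mean F2 for one vowel, in Hz.

    A positive value points to fronting, a negative value to retraction.
    """
    return MEAN_F2_HZ["modern"][vowel] - MEAN_F2_HZ["historical"][vowel]


separation = {period: f2_separation(m) for period, m in MEAN_F2_HZ.items()}
shift = {vowel: f2_shift(vowel) for vowel in ("START", "STRUT")}

print(separation)  # {'historical': 187, 'modern': 70}
print(shift)       # {'START': 50, 'STRUT': -67}
```

On these raw means the separation shrinks from 187 Hz to 70 Hz, with START about 50 Hz higher and STRUT about 67 Hz lower in F2 in the modern data, which is consistent with the convergence described in the text.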


Figure 2. Monophthong vowel space plots for the historical female and male speakers.

Figure 3. Mean durations in milliseconds for the vowels START (black bar) and STRUT (white bar) for each of the historical speakers (bottom panel) and modern speakers (top panel). Error bars represent 95% confidence intervals.

Figure 4. Monophthong vowel space for the AusE modern data extracted from citation form vowels from five adolescent males from rural NSW recorded in 1998.

The results for the historical AusE data differ from those of Gordon et al. (2004: 139), who found acoustic separation between START and STRUT in both F1 and F2. Figure 5 has been reconstructed from Figure 6.5 in Gordon et al. (2004: 109) and illustrates the vowel spaces for historical NZE data collected from five male and five female speakers born between 1864 and 1886. Figure 5 shows, in particular, the significant height difference between the open vowels that we do not see in the historical AusE data. Interestingly, the raised STRUT vowel found in Gordon et al. (2004) was also found in the continuous speech data analysed by Watson et al. (2000) in their analysis of two male and two female speakers from the New Zealand Mobile Unit recordings born between 1894 and 1899, and also in Evans and Watson (2004) for a male NZE Mobile Unit speaker. Watson et al. (2000) dismissed this finding as a vowel reduction effect. However, their data show the same STRUT raising found in Gordon et al. (2004) for historical NZE data from the same period. As this raising is not found in the AusE historical data, it could represent a difference between the AusE and NZE of the time, and one that has not been previously identified.



Figure 5. NZE historical data reconstructed from Figure 6.5 in Gordon et al. (2004: 109) illustrating the normalized vowel spaces for five males (top panel) and five females (bottom panel) born in New Zealand in 1864-1886.

7 Discussion

The suggested historical progression of change that can be found in the literature for START and STRUT led us to expect our historical AusE data to display centralized to fronted START separated in height and fronting from STRUT. The major effect that we have seen for the historical AusE data is the horizontal separation of START and STRUT, which does not occur in the modern AusE data examined here, nor has it been seen in other analyses of late twentieth century AusE vowels (see Cox and Palethorpe 2008). We did not find the expected height difference between START and STRUT that was present in Gordon et al. (2004) and Watson et al. (2000) for NZE. Gordon et al. comment that the two oldest speakers in their acoustic analysis produced the most raised STRUT vowel; however, further inspection of their individual vowel plots from Appendix 5 (Gordon et al. 2004: 329-333) shows that nearly all of the ten speakers whose data were acoustically analysed produced raised STRUT. The only two who did not exhibit raised STRUT were a female born in 1877 and a male born in 1875. Other speakers born in 1875, 1879 and 1886 also retained raised STRUT, so the data do not seem to support a lowering of STRUT over time. In both the AusE and the NZE historical datasets the production of this vowel is highly variable. Data from our individual speakers confirm that four of the ten speakers do produce some raising of STRUT but it is never as raised as TRAP, unlike much of the NZE data. The STRUT raisers are amongst the oldest speakers in our AusE group but the pattern regarding date of birth is inconsistent, as it is in the NZE data. For example, speaker Mor (born in 1893) produces the most raised STRUT whereas speaker Ben (our oldest speaker, born in 1880) does not raise STRUT. STRUT is a very short vowel, and in continuous speech it is likely to “result in a certain amount of target undershoot and therefore in a less open vocal tract” (Watson et al. 2000). This is because it requires an open jaw for its production, which is at odds with the gestural requirement of surrounding consonants. Gestural overlap has the potential to severely impact the production of STRUT (Harrington, Fletcher and Roberts 1995). Further detailed examination of phonetic and prosodic aspects of the connected speech samples in our dataset is required to truly appreciate the characteristics of this vowel and the effect of context.

We also outlined two competing hypotheses to explain the central position of START in AusE and NZE: firstly that START is not fully retracted because settlement occurred mid-way through the retraction phase, and secondly that START became fronted compared with Southern British varieties after settlement. Examination of the data shows that the historical AusE data are horizontally separated such that START appears to have a more retracted position than in modern data and that STRUT occurs in a more fronted position than in modern data (see mean formant values above for comparison). Although we have not been able to directly compare these data statistically, convergence of the two vowels must have taken place sometime during the twentieth century with START fronting and STRUT retracting. The timing of this effect requires further empirical examination with historical data selected from a range of time periods to sample the twentieth century.

We also expected some of our historical AusE speakers to have a back variant of STRUT in line with Gordon et al.’s (2004: 139) finding that nearly half of their speakers used this type. Gimson (1980: 111) states that for STRUT, a vowel more open than cardinal 6 was common by the end of the 1800s and MacMahon (1998: 457) suggests that further lowering and fronting occurred as recently as the twentieth century in SBE. Our data show the lowered and centralized STRUT as the common form for our speakers, suggesting that the lowered variant was present earlier than the twentieth century in AusE. This contrasts with NZE, where Gordon et al. (2004) report that raised and backed tokens were more common (although their acoustic data sample of historical NZE does not support this). Gimson (1980) and Wells (1982) describe the central open variant as common in Cockney and it may be that the AusE vowel is a remnant of the early Cockney or working class production. Retraction of START and fronted lowered STRUT were characteristic of working class speech in southern England in the 1800s (e.g. Beal 1999; MacMahon 1998). Written sources provide interesting accounts of forms described as “vulgar” that ultimately became the standard production.

One interesting aspect of the comparison between historical AusE and NZE is that they show similar patterns for the DRESS and TRAP vowels. Present-day NZE is characterized by the extreme raising of DRESS and TRAP and centralization of KIT. Close inspection of Figure 5 indicates some incipient retraction of KIT in the historical NZE female data. This does not occur in the AusE plots but it is also present in the elderly female NZE speaker analysed in Evans and Watson (2004: 198). Present-day differences between the two varieties of English may have their antecedents in the patterns of vowels documented here. It is clear from this and previous analyses of historical NZE (Watson et al. 2000; Evans and Watson 2004; Gordon et al. 2004) that raised TRAP was a feature of both AusE and NZE from early in their histories. Why, then, did NZE and AusE ultimately diverge with respect to the short front series?

These data may be able to provide some insight into the chain shifting hypotheses that describe the interaction between STRUT and TRAP. Wells (1982: 98) suggests that fronted STRUT could be the result of a drag-chain from raised TRAP and Gordon et al. (2000: 266) propose a push-chain effect from TRAP impacting on vowels further up the space. Watson et al. (2000: 65) indicate the possibility of both push-chain and drag-chain processes occurring in a complementary manner that involved the raising of the short front vowels to include push-chain from STRUT combined with drag-chain from KIT. Their explanation for the separate changes to KIT in AusE and NZE relates to different strategies deployed in order to maintain perceptual contrast and carry prosodic accent in the face of a crowded mid-high front vowel space. AusE achieves perceptual contrast by raising and fronting KIT, a strategy that is licensed because the only other monophthong competing for the high front position is /iː/ (from the FLEECE lexical set), which in AusE is long and incorporates the additional phonetic feature of onglide (Cox, Palethorpe and Bentink 2014). The particular characteristics of AusE FLEECE ensure that it remains maximally separated from KIT. We know from studies of NZE sound change that KIT continued to retract and lower during the twentieth century. Watson et al. (2000) also describe the advantages of this change in terms of NZE KIT facing even further mid-high front crowding from raised /e/ DRESS. One way to increase perceptual salience from other high front vowels is through backing. In NZE the prosodically accented KIT became centralized to achieve separation from the crowded mid-high front vowel space (Watson et al. 2000). We have no evidence from acoustic studies that AusE TRAP was subject to any further raising after the early 1900s, but it did begin to lower some time during the 1970s and 1980s (Cox 1999) and in the very recent past has begun to have a drag-chain effect on DRESS and KIT (Cox and Palethorpe 2008). With regard to the short front vowels, the very raised STRUT in historical NZE appears to be the major source of difference from historical AusE. This finding could add weight to an argument for STRUT influencing TRAP, the further raising of which occurred in NZE but not AusE.



We cannot overlook the possible effects of community changes on the speech of the elderly participants in the oral history interviews that form the basis of this chapter, so we must tread cautiously with suggestions that the speech data truly reflect a previous time in history. Harrington et al. (2000, 2005) and Harrington (2006), through analysis of the Queen’s Christmas broadcasts, show that an individual’s speech changes over the lifespan in response to community changes. It is impossible to tell whether the community influence could be driving some of the effects that we can see in the data plots. The backing of KIT, for instance, may not have been present in the speech of the elderly NZE speakers at a younger age. Similarly, it may be that even greater separation between START and STRUT was a feature of early AusE. Our historical speakers could be displaying the influence of the modern change towards the convergence of these two vowels. Another caution with the type of data used here is that aging can affect formant frequencies differentially. Reubold and Harrington (2015) found for a single speaker across the adult lifespan that F1 and F2 increased with age in open vowels up to 72 years but then decreased. One explanation they offer for the F1 decrease relates to the challenges the elderly face as muscles begin to atrophy, which may hamper the ability to achieve the open mouth required for the production of open vowels. Lowered F1 would have the effect of making the open vowels appear raised. Comparisons between speakers of different ages must take these potential effects into consideration. The open vowels in our dataset appeared phonetically lowered (that is, higher F1) compared to the NZE data. It is possible that the NZE speakers were suffering from the effects of aging. However, as the speakers in both our historical dataset and the NZE dataset were all in their 80s at the time of recording, we expect the effect of aging to be negligible when comparing these two sets of data.

We find support for TRAP lowering as a late twentieth century innovation. The very open quality of present-day AusE TRAP (see Figure 1) places STRUT in a precarious position. As these two short vowels carry a very high functional load, they require separation from each other so that confusion does not arise. We may expect retraction of STRUT in a further chain shift towards the back of the space to occur in the future if TRAP continues its progression of change in AusE.

8 Conclusions

In the present study we detailed an acoustic analysis of the open vowels in AusE comparing data from our Australian Ancestors corpus which contains speech data from eight men and four women born in Australia in the period 1880 to 1899. We showed that, in contrast to modern data, speakers in the historical database produced significant horizontal separation between START and STRUT with STRUT more fronted but not more raised than START. We also mentioned the possibility that START fronted and STRUT retracted during the twentieth century. This was based on some evidence from modern data showing the convergence of START and STRUT towards the centre of the space. Finally, we compared reports of historical NZE data with our own historical AusE dataset and suggested that differences could represent true regional variation in the English accents spoken in Australia and New Zealand in the late nineteenth and early twentieth centuries.

Phonetic archaeology of this kind is in its infancy in Australia and there is much that can be learned from the speech patterns of our ancestors about accent evolution and the mechanisms involved in sound change.

Acknowledgements

Thanks to David Blair, Alex Hood, Linda Buckley, Kimiko Tsukada, Anne Drayton, Jemima McDonald, Hazel Suters, John Lonergan. This work was supported by Macquarie University Research Development grant 9200900671 and ARC Discovery grant DP110102479.

References

Bauer, Laurie 1979. The second Great Vowel Shift?, Journal of the International Phonetic Association 9: 57-66.

Bauer, Laurie 1985. Tracing phonetic change in the received pronunciation of British English, Journal of Phonetics 13: 61-81.

Bauer, Laurie 1992. The second Great Vowel Shift revisited, English World-Wide 13: 253-268.

Beal, Joan C. 1999. English Pronunciation in the Eighteenth Century: Thomas Spence's ‘Grand Repository of the English Language’. Oxford: Clarendon Press.

Bernard, John R. L.-B. 1967. Length and identification of Australian English vowels, AUMLA: Journal of the Australasian Universities Language and Literature Association 27: 37-58.

Bernard, John R. L.-B. 1970a. A cine-x-ray study of some sounds of Australian English, Phonetica 21: 138-150.

Bernard, John R. L.-B. 1970b. On nucleus component durations, Language and Speech 13: 89-101.

Bernard, John R. L.-B. 1970c. Toward the acoustic specification of Australian English, Zeitschrift für Phonetik 23: 113-128.

Bradley, David 1991. /æ/ and /a:/ in Australian English, In: Jenny Cheshire (ed) English Around the World: sociolinguistic perspectives, Cambridge: Cambridge University Press, pp. 227-234.

Cassidy, Steve and Jonathan Harrington 2001. Multi-level annotation in the Emu speech database management system, Speech Communication 33: 61-77.

Chen, Hui, Nan Xu Rattanasone and Felicity Cox 2014. Perception and production of phonemic vowel length in Australian English-learning 18 month-olds, LabPhon, 25-27 July 2014, Tokyo.

Cochrane, George R. 1970. Some vowel durations in Australian English, Phonetica 22: 240-250.

Cox, Felicity 1999. Vowel change in Australian English, Phonetica 56: 1-27.

Cox, Felicity 2006. The acoustic characteristics of /hVd/ vowels in the speech of some Australian teenagers, Australian Journal of Linguistics 26: 147-179.

Cox, Felicity and Sallyanne Palethorpe 2007. Australian English, Journal of the International Phonetic Association 37: 341-350.

Cox, Felicity and Sallyanne Palethorpe 2008. Reversal of short front vowel raising in Australian English, Proceedings of Interspeech 2008, 22-26 September 2008, Brisbane, pp. 342-345.

Cox, Felicity and Sallyanne Palethorpe 2012. Standard Australian English: the sociostylistic broadness continuum, In: Raymond Hickey (ed) Standards of English: Codified Varieties Around the World, Cambridge: Cambridge University Press, pp. 294-317.

Cox, Felicity, Sallyanne Palethorpe and Samantha Bentink 2014. Phonetic archaeology and fifty years of change to Australian English /iː/, Australian Journal of Linguistics 34: 50-75.

Croot, Karen and Belinda Taylor 1995. Criteria for acoustic-phonetic segmentation and word-labelling in the Australian National Database of Spoken Language, http://andosl.anu.edu.au/andosl/general_info/aue_criteria.html.

Deterding, David 1997. The formants of monophthong vowels in Standard Southern British English pronunciation, Journal of the International Phonetic Association 27: 47-55.

Dobson, Eric J. 1968a. Australian English and the Philologist, An address delivered to a meeting of the Association on 31 July 1968 by E. J. Dobson, M.A., D.Phil. (Oxon.), B.A. (Sydney), Professor of English Language in the University of Oxford. Retrieved from http://ojs-prod.library.usyd.edu.au/index.php/ART/article/view/5465/6113.

Dobson, Eric J. 1968b. English Pronunciation 1500-1700 Volume II Phonology, London: Clarendon Press.

Ellis, Alexander J. 1889. On Early English pronunciation, London: Trübner and Co.

Evans, Zoe and Catherine Watson 2004. An acoustic comparison of Australian English and New Zealand English vowel change, Proceedings of the tenth International Conference on Speech Science and Technology, 8-10 December, Macquarie University, pp. 195-200.

Fletcher, Janet and Andrew McVeigh 1993. Segment and syllable duration in Australian English, Speech Communication 13: 355-365.

Gimson, Alfred C. 1980. An Introduction to the Pronunciation of English. Third edition. London: Edward Arnold.

Gordon, Elizabeth, Lyle Campbell, Jennifer Hay, Margaret Maclagan, Andrea Sudbury and Peter Trudgill 2004. New Zealand English: Its Origins and Evolution, Cambridge: Cambridge University Press.

Harrington, Jonathan 2006. An acoustic analysis of ‘happy-tensing’ in the Queen’s Christmas broadcasts, Journal of Phonetics 34: 439-457.

Harrington, Jonathan, Mary E. Beckman and Janet Fletcher 2000. Manner and place conflicts in the articulation of accent in Australian English. In: Michael Broe (ed), Papers in Laboratory Phonology 5, Cambridge: Cambridge University Press, pp. 40-51.


Harrington, Jonathan, Felicity Cox and Zoe Evans 1997. An acoustic phonetic study of broad, general, and cultivated Australian English vowels, Australian Journal of Linguistics 17: 155-184.

Harrington, Jonathan, Janet Fletcher and Corinne Roberts 1995. Coarticulation and the accented/unaccented distinction: evidence from jaw movement data, Journal of Phonetics 23: 305-322.

Harrington, Jonathan, Sallyanne Palethorpe and Catherine Watson 2000. Monophthongal vowel changes in Received Pronunciation: an acoustic analysis of the Queen’s Christmas broadcasts, Journal of the International Phonetic Association 30: 63-78.

Harrington, Jonathan, Sallyanne Palethorpe and Catherine I. Watson 2005. Deepening or lessening the divide between diphthongs? An analysis of the Queen's annual Christmas broadcasts, In: William J. Hardcastle and Janet Mackenzie Beck (eds) A Figure of Speech: A Festschrift for John Laver, Mahwah, N.J.: Lawrence Erlbaum Associates, pp. 227-262.

Hawkins, Sarah and Jonathan Midgley 2005. Formant frequencies of RP monophthongs in four age groups of speakers, Journal of the International Phonetic Association 35: 183-199.

Henton, Caroline 1990. One vowel’s life (and death?) across languages: the moribundity and prestige of /ʌ/, Journal of Phonetics 18: 203-227.

Horn, Wilhelm and Martin Lehnert 1954. Laut und Leben: Englische Lautgeschichte der neueren Zeit (1400-1950). Berlin: Deutscher Verlag der Wissenschaften.

Horvath, Barbara M. and Ronald J. Horvath 2001a. A multilocality study of a sound change in progress: The case of /l/ vocalization in New Zealand and Australian English, Language Variation and Change 13: 37-57.

Horvath, Barbara M. and Ronald J. Horvath 2001b. Short A in Australian English, In: David Blair and Peter Collins (eds) English in Australia, Amsterdam: John Benjamins, pp. 341-355.

Labov, William 1994. Principles of Linguistic Change. Volume 1: Internal Factors, Oxford: Blackwell.

Lass, Roger 2000. The Cambridge History of the English Language. Volume III 1476-17, Downloaded 9 July 2014 from Cambridge Histories Online, Cambridge University Press http://universitypublishingonline.org/cambridge/histories/

Leitner, Gerhard 2004. Beyond Mitchell’s views on the history of Australian English, Australian Journal of Linguistics 24: 99-125.

Lindblom, Björn 1990. Explaining phonetic variation: a sketch of the H & H theory, In: William J. Hardcastle and Alain Marchal (eds) Speech Production and Speech Modelling, Dordrecht, The Netherlands: Kluwer, pp. 403-439.

Linville, Sue E. and Jennifer Rens 2001. Vocal tract resonance analysis of aging voice using long-term average spectra, Journal of Voice 15: 323-330.

Lobanov, Boris M. 1971. Classification of Russian vowels spoken by different speakers, The Journal of the Acoustical Society of America 49: 606-608.


MacMahon, Michael K. C. 1998. Phonology, In: Suzanne Romaine (ed) The Cambridge History of the English Language, Volume IV 1776-1997, Cambridge: Cambridge University Press, pp. 373-535.

McMahon, April 2000. Lexical Phonology and the History of English, Cambridge: Cambridge University Press.

Mitchell, Alexander G. 1958. Spoken English, London: Macmillan.

Mitchell, Alexander G. and Arthur Delbridge 1965a. The Pronunciation of English in Australia, Sydney: Angus and Robertson.

Moore, Bruce 2008. Speaking our Language: The Story of Australian English. Melbourne: Oxford University Press.

Nevalainen, Terttu 2006. An Introduction to Early Modern English, Edinburgh: Edinburgh University Press.

Oasa, Hiroaki 1989. Phonology of current Adelaide English, In: Peter Collins and David Blair (eds) Australian English: The Language of a New Society, St. Lucia: University of Queensland Press, pp. 271-287.

Reubold, Ulrich and Jonathan Harrington 2015. Disassociating the effects of age from phonetic change: a longitudinal study of formant frequencies, In: Annette Gerstenberg and Anja Voeste (eds) Language Development: The Lifespan Perspective, Cambridge: Cambridge University Press, pp. 9-37.

Roach, Peter 2004. British English: Received Pronunciation, Journal of the International Phonetic Association 34: 239-245.

Trudgill, Peter 2004. New-Dialect Formation: The Inevitability of Colonial Englishes, Oxford: Oxford University Press.

Vonwiller, Julia, Inge Rogers, Chris Cleirigh and Wendy Lewis 1995. Speaker and material selection for the Australian National Database of Spoken Language, Journal of Quantitative Linguistics 2: 177-211.

Walker, John 1791. A Critical Pronouncing Dictionary, London: G. G. J. and J. Robinson, and T. Cadell, online source Eighteenth Century Collections Online, Gale Cengage Learning, accessed 1 July 2014.

Watson, Catherine I. and Jonathan Harrington 1999. Acoustic evidence for dynamic formant trajectories in Australian English vowels, Journal of the Acoustical Society of America 106: 458-468.

Watson, Catherine I., Margaret Maclagan and Jonathan Harrington 2000. Acoustic evidence for vowel change in New Zealand English, Language Variation and Change 12: 51-68.

Wells, John 1982. Accents of English. Vol 3: Beyond the British Isles, Cambridge: Cambridge University Press.

Yallop, Colin 2003. A. G. Mitchell and the development of Australian pronunciation, Australian Journal of Linguistics 23: 129-141.

Yuen, Ivan, Felicity Cox and Katherine Demuth 2014. Three-year-olds’ production of Australian English phonemic vowel length as a function of prosodic context, Journal of the Acoustical Society of America 135: 1469-1479.


22 Early New Zealand English
The closing diphthongs

Márton Sóskuthy, Jennifer Hay, Margaret Maclagan, Katie Drager and Paul Foulkes

1 Introduction

This chapter presents a detailed analysis of the changes affecting the closing diphthongs (PRICE, MOUTH, FACE and GOAT using the lexical set notation introduced in Wells 1982) in early New Zealand English (NZE). The analysis is based on a data set of over 10,000 auditorily coded vowel tokens from the Origins of New Zealand English (ONZE) corpus. We look at two different but related developments: diphthong shift (which affects the nuclei of the diphthongs) and glide weakening (which affects the offglides). Our main goals are to verify previous claims about the unfolding of these processes in NZE and to add further detail to existing descriptions. The current data set is particularly well-suited to these aims inasmuch as it contains tokens of the relevant diphthongs in a wide range of different environments from speakers with diverse backgrounds born between 1857 and 1904.

We also introduce a number of methodological innovations which help us illustrate and analyse otherwise less obvious aspects of these changes. We use two different data visualisation methods to provide a clear picture of both the phonetic details and the time-course of the relevant developments in NZE. We refer to these methods as pom-pom graphs and ribbon graphs (the motivation for these names will become clearer once the relevant figures have been introduced in Section 4). In addition, we use speaker-specific random intercepts extracted from mixed effects logistic regression models to map the relationships among the different changes (cf. Drager and Hay 2012). We argue that this method allows us to look for causal relationships among different changes that are not simply due to the confounding effects of other predictors such as time.

The structure of the chapter is as follows. Section 2 provides a general description of diphthong shift and glide weakening, followed by an overview of previous claims about these changes and a summary of the theoretical and descriptive goals of the chapter. Section 3 introduces our data set in more detail and outlines our main analytical methods. Section 4 presents our results through a combination of visualisation techniques (cf. above) and mixed effects logistic regression. Section 5 provides a summary of our findings and relates them to the issues raised in Section 2. Section 6 concludes the chapter.

2 Closing diphthongs in NZE and elsewhere

The terms ‘diphthong shift’ and ‘glide weakening’ were introduced by Wells (1982) to describe widespread changes in the history of English. Diphthong shift refers to movement of the nucleus of closing diphthongs beyond the endpoint of the Great Vowel Shift. The result is an increasingly wide trajectory of the vowel, as the nucleus generally moves further away from the offglide. To give an example, RP is traditionally described as having a high-mid [eɪ] realisation for the FACE lexical set. Varieties with diphthong shift, such as NZE or vernacular London English, may realise this vowel with a low nucleus such as [aɪ], thus increasing the phonetic distance between the nucleus and the offglide (Wells 1982: 306-310). Glide weakening is inferred to be a subsequent process in which the offglide moves so that ‘it more closely approaches’ the nucleus, thus narrowing the gap between the two (Trudgill 2004: 140).

New Zealand English exhibits both diphthong shift and glide weakening for all four of the closing diphthongs: PRICE, MOUTH, FACE and GOAT. As a result of diphthong shift, all of these diphthongs show a wider trajectory in NZE than in RP.1 Gordon et al. (2004:149) outline the relevant processes as follows: for PRICE, diphthong shift involves further backing of the nucleus from [ɐ] to [ɑ], [ɒ] or [ɔ] (RP [ɐɪ]); in the case of MOUTH, diphthong shift consists of a nucleus that is fronter than [ɐ] or higher than [a] (RP [ɐʊ]); shifted FACE has a nucleus lower than [e] (cf. RP [eɪ]); GOAT has a nucleus fronter than or lower than [o] (RP [əʊ], itself showing diphthong shift). Glide weakening yields offglides that are lower than [ɪ] and [ʊ], and thus phonetically closer to the quality of the nucleus. The similarity between the nucleus and the offglide may be strengthened through further processes such as centralisation and, occasionally, complete assimilation.

Diphthong shift is a salient feature in the history of NZE, but it is by no means restricted to this variety. Wells (1982: 256) reports diphthong shift in Cockney, local varieties in the south and midlands of England, and Australian English. Labov (1994: 208-218) suggests that diphthong shift was part of a larger set of developments referred to as the ‘Southern Shift’, which also involves a number of southern American dialects. Although the realisations of the closing diphthongs are broadly similar in these varieties, it is not clear whether they should all be regarded as examples of the same type of pattern. For instance, Kerswill et al. (2008, citing Britain 2005) argue that many varieties never developed the [aʊ] realisation that is usually considered the starting point of diphthong shift in MOUTH, and instead had a ‘preshifted quality of MOUTH as [əʊ]’ (Kerswill et al. 2008). Diphthong shift for such a dialect would consist of fronting and possibly lowering. This is quite different from diphthong shift with [aʊ] as its starting point, which consists of fronting and raising.

As outlined by Gordon et al. (2004), the British historical record indicates that at least some diphthong-shifted versions of all of the vowels are likely to have come to New Zealand with the early British settlers. Diphthong-shifted versions of each vowel did already exist in some parts of the British Isles at the relevant time, that is, around the mid nineteenth century. Notably, however, Scottish dialects do not display any diphthong shift.

1 Note that we are not claiming that RP was the starting point for diphthong shift in NZE. We are simply following Wells (1982) in using RP as a reference point. The reference transcriptions given for RP are derived from the realisations shown in Roach (2004).

Gordon et al. (2004) discuss an ‘auditory perceptual analysis’ conducted by one of the authors, Peter Trudgill. This analysis involved listening to 95 New Zealand speakers (born 1851-1904, and included in ONZE’s Mobile Unit corpus, described below), and gaining an ‘overall evaluation’ of their phonetic productions (Gordon et al. 2004: 90). These evaluations were recorded on a specially designed template and used as the basis for the auditory perceptual analyses presented in Gordon et al. (2004). Inferences were then drawn from these overall observations regarding the order of changes and possible implications. Approximate quantifications were given, but no statistical analysis was completed.

In this study, we revisit this data set and conduct a token-wise analysis of a subset of the speakers discussed by Gordon et al. (2004). By developing a quantitative profile of the distribution of variants used by each speaker, we are able to move beyond ‘overall impressions’ and ask questions relating to, for example, the conditioning environments underlying the changes the diphthongs have undergone. In providing a more detailed descriptive account of diphthong shift and glide weakening in NZE we are also able to address in detail a number of previous claims about these changes. We now briefly review some of these claims.

Trudgill (2004), drawing on the data discussed by Gordon et al., claims that diphthong shift affected the closing diphthongs in the following chronological order: MOUTH > PRICE > GOAT > FACE.2 This order is claimed to manifest itself at the level of individuals as “an implicational scale such that, for example, speakers who have shifted [GOAT] will necessarily have shifted [PRICE] but not necessarily [FACE]” (Trudgill 2004: 50). Gordon et al. (2004) do not make so strong a claim about the overall implicational relationship, although they do note it for one particular vowel pair: “Malcolm Ritchie (born 1866) is the only speaker to produce some diphthong-shifted variants of PRICE (with or without glide weakening) without also producing diphthong-shifted versions of MOUTH…” (Gordon et al. 2004: 155). A similar (but weaker) claim is made about the relationship between FACE and MOUTH (Gordon et al. 2004: 157).

In Gordon et al. (2004), glide weakening is presented as intricately linked with diphthong shift, and therefore the rates of glide weakening are not reported separately. For example, the figures for MOUTH distinguish between “speakers producing non-shifted dialect forms (~20%), those producing no diphthong shift or glide weakening (nearly 10%), and those showing some degree of diphthong shift or diphthong shift plus glide weakening.” (Gordon et al. 2004: 152). Although we do not learn much about the degree of glide weakening from such a description, we can infer that it is tightly linked to, and subsequent to, diphthong shift. While both MOUTH and PRICE have some degree of glide weakening, it is also noted that there are no observed tokens of glide-weakened FACE and GOAT. This is taken as further evidence that these two vowels lag behind in the overall shift.

2 Indeed, this is the order stated by Gordon et al. (2004: 209), and indicated by claims about the data on page 159. However, on page 149, they claim that their analysis will show that “diphthong shift began with MOUTH and PRICE, probably in that order, before spreading later to FACE and GOAT, in that order.” As the data they present, and claims later in the book, are compatible with GOAT preceding FACE, we can only assume that this is a typo or oversight on the part of the authors – two of whom, we must sheepishly point out, also appear as authors on the current chapter.

One issue that remains unclear is whether these changes are developments internal to NZE, or whether they were transplanted to New Zealand by settlers from other parts of the English-speaking world. Britain (2008) argues that the fronted realisation of MOUTH is not the result of a gradient process of diphthong shift that took place solely in NZE, but a feature that was inherited from non-standard varieties of Southern British English. He presents early dialect evidence which suggests that fronted variants were the norm in the rural English communities from which a large proportion of early New Zealand settlers came. According to his account, the fronted variant imported from these dialects was dominant in NZE from the very beginning, and eventually it replaced other variants through a process of dialect levelling. Two important implications of Britain’s hypothesis are (i) that we should find a large proportion of fronted variants in the speech of early NZE speakers, and (ii) that we should then see a gradual increase in the frequency of these variants.

Another claim that has some relevance to our current investigation is made by Trudgill (2004) in relation to the emergence of Canadian Raising-type patterns. Canadian Raising is an alternation involving PRICE and MOUTH, with central and raised variants occurring before voiceless obstruents ([əɪ] and [əʊ], respectively) and lower and more peripheral variants elsewhere. As we will see in Section 4, the PRICE vowel in early NZE shows precisely this type of alternation. Trudgill’s (2004: 88) account makes very specific predictions about the emergence of such patterns. He argues that Canadian Raising results from the reallocation of variants from different dialects (e.g. central variants in Scottish dialects and peripheral variants in southern English dialects) to different phonological environments. Moreover, the account presented in Trudgill (2004) suggests that such reallocation will take place in the late stages of dialect formation. Trudgill estimates that in NZE this late stage would be exemplified by the speech of those speakers born after approximately 1890 (Trudgill 2004: 113). Therefore, Trudgill’s account predicts that the Canadian Raising-type pattern we observe in our data set should start emerging in these later-born speakers and should be absent in the speech of the older speakers. In addition, the distribution of variants in the speech of earlier generations should be such that speakers with Scottish ancestry show more central variants than others.

In our analysis we also aim to uncover more detail about how these changes unfolded over time. Trudgill (2004: 50) reports an overall increase of diphthong shift over the time period we are investigating, with 68% of speakers born 1850-1869 displaying at least some diphthong-shifted tokens, increasing to 81% for speakers born 1870-1889. These figures would indicate that there is indeed change in progress, and that understanding the trajectory of this change may be revealing. Further, glide weakening is also reported to increase: none of the earliest-born speakers (1850-59) produced glide weakening on at least one diphthong-shifted vowel, whereas 57% of the later-born speakers (1870-1879) did so (Trudgill 2004: 141). In summary, then, the theoretical and descriptive goals of this chapter are as follows:



Firstly, we aim to address various claims in the literature:

1. diphthong shift affected closing diphthongs in NZE in the order MOUTH > PRICE > GOAT > FACE (Trudgill 2004, Gordon et al. 2004);
2. glide weakening is tightly linked to diphthong shift (Gordon et al. 2004);
3. MOUTH did not undergo diphthong shift in NZE: the majority variant in the settlers’ speech was already fronted, and the back variants disappeared through a process of levelling (Britain 2008);
4. speakers with Scottish or Ulster Scots ancestry have more raised / central realisations of PRICE and MOUTH than other speakers; reallocation of these variants may have led to Canadian Raising-type patterns in speakers born after 1890 (Trudgill 2004).

Secondly, we attempt to develop a more comprehensive descriptive account of diphthong shift by looking at:

1. the temporal unfolding of diphthong shift and glide weakening;
2. the phonetic and social conditioning of diphthong shift and glide weakening;
3. relationships across different vowels in the extent of diphthong shift and glide weakening (suggestive of chain shifts, parallel changes, etc.).

3 Methods

3.1 The ONZE corpus

The ONZE corpus at the University of Canterbury contains recordings from more than 600 speakers born between 1851 and 1987. The corpus is divided into three archives depending on the provenance of the original recordings: the Mobile Unit (MU) archive (speakers born between 1851 and 1904), the Intermediate Archive (speakers born between 1891 and 1953) and the Canterbury Corpus (speakers born between 1926 and 1987). For this analysis, speakers were drawn from the MU archive, the oldest in terms of speaker ages and recording dates. The recordings were made by the Mobile Unit of the New Zealand Broadcasting Service, which travelled around New Zealand between 1946 and 1948 recording, among other things, pioneer reminiscences. The original recordings were made on acetate discs, copied onto cassette tapes and later digitised for preservation and analysis (see Gordon, Maclagan and Hay 2007 for more details).

3.2 The current data set

The data that serve as the basis of this chapter come from a large set of auditorily coded tokens of PRICE (2943 tokens), MOUTH (2139), FACE (3638) and GOAT (3173) from 33 speakers in the ONZE corpus. In what follows, we provide a description of the data set in terms of the speakers, the main principles for auditory coding and our criteria for excluding problematic tokens.



The speakers included in the data set are relatively evenly distributed across the two sexes (15 female, 18 male). However, there was considerable variation in the number of tokens per speaker, with some speakers contributing fewer than 20 tokens per vowel and others nearly 200. Females comprise 45% of the speakers in the sample, but males produced more tokens on average than the females, so the overall proportion of tokens from females is 40%. The oldest speaker was born in 1857 and the youngest in 1904. The distribution of birth year was slightly skewed towards older speakers, with 19 born before 1877 (the mean year of birth) and 14 born in or after 1877.

The ONZE corpus includes information on the origins of the speakers’ parents, which should provide some indication of the linguistic input they received during their childhood. Table 1 is a contingency table showing the numbers of speakers with specific combinations of parents.

FATHER / MOTHER   Austr   Eng   Ire   NZ   Other   Scot   unknown   TOTAL
Austr                 0     0     0    0       0      2         0       2
Eng                   0     2     0    1       2      1         0       6
Ire                   1     1     1    0       0      0         0       3
NZ                    0     1     0    1       0      0         0       2
Other                 0     0     1    0       0      0         0       1
Scot                  0     1     1    0       0      9         1      12
unknown               0     1     0    2       1      2         1       7
TOTAL                 1     6     3    4       3     14         2      33

Table 1. The parents’ origins for the speakers in the data set.

As is clear from the table, the data set is heterogeneous in terms of where the speakers’ parents come from. Most speakers had parents with different origins from one another, and the places of origin for several are unknown or ‘other’ (i.e. not from the British Isles, Australia or New Zealand). There is, however, one clear tendency in Table 1: over half of the speakers have at least one Scottish parent, and nearly a third have parents who are both from Scotland. These distributional peculiarities will have important consequences for the set of predictors that we can include in the analysis (as explained in the next section).

The categories for the auditory coding were established after an initial impressionistic auditory analysis which sought to uncover the full range of variation in our data set. The nucleus and the offglide of each diphthong received separate numeric codes, indicating how far advanced they are in terms of diphthong shift and glide weakening, respectively. The codes ranged from (0) to (2) (or (3) in the case of MOUTH), with (0) indicating the most conservative pronunciation and (2) or (3) the most innovative. In the case of the nucleus, the (0) realisations are the only ones that would not count as ‘diphthong-shifted’ tokens according to the specifications provided by Gordon et al. (2004). Table 2 shows the coding scheme for the four diphthongs. One coder analysed the FACE and MOUTH diphthongs and the fourth author analysed PRICE and GOAT. The coders were trained by an experienced phonetician (the third author) who checked their analyses until she was satisfied. In addition, four of the speakers (around 10% of the data set) were completely re-checked (by the third author) for accuracy.

          DIPHTHONG SHIFT          GLIDE WEAKENING
          0     1     2     3      0     1      2
PRICE     ɐ     ɒ     ɒ            –     i~ɪ    e
MOUTH     ɑ     ɐ     æ     ɛ      –     ʊ      ə
FACE      e     ɛ     æ            –     i~ɪ    e
GOAT      o     ɔ     ɐ            –     ʊ      ə

Table 2. Coding scheme for the closing diphthongs. The lowest score for glide weakening, i.e. (0), indicates no offglide, resulting in a monophthongal token.
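The code ranges in Table 2 can be written down as a small lookup, which makes it easy to see which code combinations fall inside the scheme. The following is a minimal Python sketch, not the procedure used for the chapter (whose analysis was carried out in R); the names MAX_SHIFT_CODE, MAX_GLIDE_CODE and is_valid_token are hypothetical and introduced only for illustration.

```python
# Hypothetical encoding of the code ranges in Table 2 (all names are illustrative).
# Diphthong shift codes run from 0 to 2, or 0 to 3 for MOUTH.
MAX_SHIFT_CODE = {"PRICE": 2, "MOUTH": 3, "FACE": 2, "GOAT": 2}
# Glide weakening codes run from 0 to 2 for every diphthong.
MAX_GLIDE_CODE = 2


def is_valid_token(vowel: str, shift_code: int, glide_code: int) -> bool:
    """Return True if both codes fall inside the ranges defined in Table 2."""
    if vowel not in MAX_SHIFT_CODE:
        return False
    shift_ok = 0 <= shift_code <= MAX_SHIFT_CODE[vowel]
    glide_ok = 0 <= glide_code <= MAX_GLIDE_CODE
    return shift_ok and glide_ok
```

Under this sketch a MOUTH token with a diphthong shift code of 3 is accepted, whereas the same code on PRICE, FACE or GOAT is not, because those diphthongs have only three nuclear categories.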

In order to prepare the transcripts for coding, potential tokens of the four diphthongs were identified and numbered automatically in the transcripts. The coders then made their analyses by listening to the auditory recordings. A maximum of 200 tokens per speaker were analysed for each diphthong, with no more than 10 tokens of any one word per speaker in order to avoid introducing strong lexical biases. We only coded diphthongs if they satisfied the following two conditions: (i) the target word had some degree of sentence stress, and (ii) the syllable containing the diphthong had word stress. The coders were not aware of any particular research hypotheses, nor were they aware of the birth dates of the speakers they were working on. This ensured that they could not unintentionally bias the analysis in any way.

A number of tokens were excluded from the final analysis, for a variety of reasons. First, we excluded two speakers (a male and a female) since a preliminary analysis showed that they exerted an unduly large influence on the statistical models for diphthong shift in the MOUTH and the FACE vowels. This was determined by looking for clear outliers in the distributions of random intercepts from our mixed effects logistic regression models (cf. Section 4.1.2).3 The analysis reported here is therefore based on 31 speakers.

There were also a small number of tokens where the coder had provided problematic auditory codes which did not fit into the categories in Table 2 (e.g. diphthong shift codes of 4). These may have been typos or extreme values, but either possibility warranted exclusion. Tokens where the diphthong was followed by an /r/ were also discarded, as this was a relatively small group and we felt that the phonetic influence of following rhotics could interfere quite strongly with diphthong shift and glide weakening. We also excluded all tokens with a following vowel, and MOUTH tokens with a following lateral. Both these environments could also have had strong influences on diphthong shift and glide weakening, and both were very rare in the data set. The final token counts were: PRICE (2622), MOUTH (1826), FACE (3323), GOAT (2854).

3 We decided to exclude these speakers on the basis of the following assertion in Tagliamonte and Baayen (2012: 144): “Although mixed effects models can bring individual differences into the statistical model, they do not protect against distortion by atypical outliers. Model criticism is an essential part of good statistical practice, irrespective of whether a mixed-effects approach is adopted.”

3.3 Data analysis

The data were analysed using a combination of visualisation techniques, mixed-effects logistic regression modelling and pairwise correlations among random intercepts. All of the data analysis was performed using the open-source R software package (R Core Team 2013) with the lme4 (Bates et al. 2011) and the languageR (Baayen 2011) libraries. Section 4 describes all of these techniques in detail. The present section serves to introduce the main predictors that we used in our regression models.

Year of birth

This is simply the year of birth of the speakers represented as a continuous variable. Mixed effects modelling works more reliably when the numeric predictors in the model are centred around 0 and the ranges of their values are not too wide, so we centred and scaled this variable. We transformed these values back to their original form for those partial effects plots that look at the influence of year of birth, to aid interpretability. It should also be noted that we represented year of birth using restricted cubic splines with three knots (Baayen 2006) to allow for possible non-linearities in the temporal evolution of diphthong shift and glide weakening. This means that the regression models allowed for at most one turning point in the trajectory when estimating the influence of year of birth.

Sex

This binary variable represents the biological sex of the speakers with two possible values: female and male.

Sonority class of following segment

We included this categorical variable in the models since following segments often have a strong influence on changes in the realisations of closing diphthongs in English (e.g. the voicing of the following segment in Canadian Raising-type patterns).
After running preliminary models it was determined that place of articulation did not have a significant effect on diphthong shift and glide weakening, so we decided to focus on the manner of articulation and the voicing of the following consonant. As these two parameters were not independent of one another, we collapsed them into a single variable which represents the sonority class of the following segment. This variable had the following levels: word-final, voiceless obstruent, voiced obstruent, nasal and lateral. Since tokens with following rhotics were excluded from the data set, the label lateral can be used interchangeably with liquid. We use the term lateral to avoid confusion.

Scottish parent

We noted above that the set of speakers is relatively heterogeneous in terms of the parents’ origins. Therefore, we decided not to include a general predictor that covers a range of different places of origin for the parents (as was done in Gordon et al. 2004). Instead, we created a single binary predictor that partitions the data set into two relatively well-balanced portions: speakers with at least one Scottish parent (16 speakers) and speakers with no Scottish ancestry (17 speakers). This binary division is methodologically convenient, as it splits the data relatively evenly. It is also theoretically motivated, as there was a non-trivial influence of Scottish English on the formation of NZE (Trudgill, Maclagan and Lewis 2003), and Scottish English is particularly notable in the current context in lacking diphthong shift.

4 Results

The processes of diphthong shift and glide weakening will be discussed separately in Sections 4.1 and 4.2. These sections are structured similarly: we first present a general overview of the changes in the distribution of conservative and innovative variants over time, followed by a more principled investigation based on mixed-effects logistic regression. We also attempt to relate the data to the specific research questions raised in Section 2. The broader implications of our findings are discussed in Section 5.

4.1 Diphthong shift in the ONZE corpus

4.1.1 Diphthong shift as a function of time

We use two different types of graphs to give the reader a general sense of the changes affecting the nuclei of closing diphthongs. Both of these graphs visualise the proportions of different variants in specific subsets of our data. These subsets consist of tokens from speakers born within a specific time window.

The first type of graph, which we refer to as a pom-pom graph, illustrates the proportions of the different variants for the nuclei in a given time period by plotting circles in a schematic articulatory space. Each diphthong is represented by a set of circles connected by lines, where the individual circles correspond to different phonetic variants (cf. Table 2). The name of the lexical set corresponding to the diphthong appears next to the most conservative variant – the variant that is not ‘shifted’. The area of each circle is proportionate to the relative frequency of the variant that it represents. Figure 1 shows three such graphs representing different periods in the history of NZE.
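Because it is the area of each circle, rather than its radius, that tracks relative frequency, the radius has to grow with the square root of the proportion. The sketch below illustrates that relationship in Python; it is not the plotting code used for Figure 1, and the function name circle_radii, the max_radius parameter and the token counts in the usage lines are hypothetical.

```python
import math


def circle_radii(counts, max_radius=1.0):
    """Map variant counts to radii so that circle AREA is proportional
    to each variant's relative frequency.

    counts: dict mapping a variant label to its token count.
    max_radius: radius a variant would receive if it accounted for
                100% of the tokens (an illustrative scaling choice).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one token is required")
    # area = pi * r**2, so area proportional to p implies r proportional to sqrt(p)
    return {variant: max_radius * math.sqrt(n / total)
            for variant, n in counts.items()}


# Hypothetical counts for the three nuclear variants of one diphthong.
radii = circle_radii({"0": 75, "1": 20, "2": 5})
areas = {variant: math.pi * r ** 2 for variant, r in radii.items()}
```

Scaling the radius directly by the proportion would exaggerate the dominant variant, since a circle with twice the radius has four times the area.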



Figure 1. Three pom-pom graphs illustrating the proportions of different phonetic variants for PRICE, FACE, MOUTH and GOAT in NZE for speakers born in 1860-1870 (top), 1870-1880 (middle) and 1880-1900 (bottom).



The second type of graph, which we refer to as a ribbon graph, also plots the proportions of variants in different time windows, but it provides a more fine-grained view of the temporal evolution of the diphthongs. In order to create these graphs, we first calculated the proportions of nuclear variants for each diphthong within successive overlapping time windows (this is analogous to the method of moving averages in statistics). A separate plot is created for each diphthong, where the x-axis represents the centre of the time window for a given set of proportions, which are plotted along the y-axis. The variants themselves are represented by horizontal bands of different shades of grey, where the darkest band indicates the most conservative variant. Figure 2 shows four ribbon graphs representing the four diphthongs.
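The moving-window calculation behind a ribbon graph can be sketched as follows. This is an illustration in Python rather than the code used to produce Figure 2; the chapter does not state the window width or step size, so the defaults below (a 10-year window advanced one year at a time) and the function name windowed_proportions are assumptions.

```python
from collections import Counter


def windowed_proportions(tokens, start, stop, width=10, step=1, codes=(0, 1, 2)):
    """Proportion of each variant code within successive overlapping windows.

    tokens: iterable of (birth_year, code) pairs, one per vowel token.
    start, stop: first window begins at `start`; no window extends past `stop`.
    width, step: window width and step in years (assumed values, not
                 taken from the chapter).
    Returns a list of (window_centre, {code: proportion}) pairs.
    """
    tokens = list(tokens)
    rows = []
    left = start
    while left + width <= stop:
        right = left + width
        # Tokens from speakers born in the half-open interval [left, right).
        window = [code for year, code in tokens if left <= year < right]
        if window:  # skip windows that contain no tokens
            counts = Counter(window)
            proportions = {c: counts[c] / len(window) for c in codes}
            rows.append((left + width / 2, proportions))
        left += step
    return rows
```

Each returned row supplies one vertical slice of the graph: the window centre gives the position on the x-axis, and the proportions, which sum to 1, give the heights of the stacked bands.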



Figure 2. Ribbon graphs illustrating changes in the proportions of different phonetic variants for PRICE, MOUTH, FACE and GOAT in NZE for speakers born between 1865 and 1900. Darker colours indicate more conservative variants.

Both the pom-pom graphs and the ribbon graphs demonstrate the overall increase in the proportion of innovative variants over time. However, there are a number of interesting differences in the patterns of change shown by the individual diphthongs. PRICE and MOUTH show no clear increase in the frequency of innovative variants until around 1880-1885 (and, in fact, MOUTH appears to exhibit a slight decrease in the initial period). Even after 1880, the size of the increase is relatively small. This is particularly easy to see in the ribbon graphs, but it is also discernible in the pom-pom graphs, where the main changes affecting PRICE and MOUTH only become apparent in the last panel. Note that at the beginning, the most frequent variant is the low central vowel [ɐ] for both PRICE and MOUTH. Therefore, it appears that the starting point at the beginning of the observed time period is the same nuclear variant for both lexical sets. The innovative variants are in the back region of the vowel space in the case of PRICE, and in the front region in the case of MOUTH. Since the frequency of the innovative variants increases at the expense of conservative variants, it is relatively easy to interpret these shifts in variant proportions as parallel processes of backing for PRICE and fronting for MOUTH (see Section 4.1.2 for more detail).

The changes affecting FACE and GOAT are rather different. Both diphthongs show a gradual decrease in the frequency of the most conservative variant, and a more sudden increase in the frequency of the most innovative variant. Importantly, the changes are already underway in the oldest speakers, which is in stark contrast with the changes observed for PRICE and MOUTH. It should also be noted that the phonetic interpretation of the changes affecting FACE and GOAT is less straightforward than it is for the other two diphthongs. The time course of the shifts is very similar, but the directions of the changes are somewhat different: FACE mainly lowers, while GOAT undergoes both fronting and lowering at the same time.

Viewed this way, the data tell a different story about the chronology of the changes than that reported in Gordon et al. (2004) or Trudgill (2004) (MOUTH > PRICE > GOAT > FACE). The order displayed by our data depends on whether one is focusing on (i) the actual proportions of variants that can be defined as ‘diphthong-shifted’ or (ii) the rates at which these proportions change. Neither ordering matches that put forth by previous authors.

Let us first look at the data using approach (i), that is, looking at the proportions of different variants. If we take at face value the definition of diphthong shifted variants from Gordon et al., then MOUTH has the most ‘shifted’ tokens and, indeed, is already shifted at the beginning of the observable time period. However, it is not undergoing change over the time period we analyse – not until the very end of it. FACE and GOAT contain fewer conservative variants than PRICE, and are changing from the very beginning. The rate of change for FACE seems to be slightly faster than for GOAT. PRICE displays the most ‘non-shifted’ variants, and remains relatively stable until towards the end of the period. At all stages of the shift, it appears the most conservative. In terms of the appearance of diphthong shifted variants (as defined by Gordon et al.), the chronology appears to be MOUTH > FACE > GOAT > PRICE.

However, if we did not bring the theoretical construct of ‘shifted’ versus ‘non-shifted’ tokens to the analysis, then the trajectories of change themselves (as per (ii) above) would seem to suggest a different order. Specifically, it looks like FACE and GOAT move early, and MOUTH and PRICE move late. Overall, it appears that NZE may have inherited substantial numbers of diphthong-shifted variants of MOUTH, which formed the input to the dialect. Within NZE, then, changes subsequently took place in FACE and GOAT, which were then followed by MOUTH and PRICE.

The fact that diphthong-shifted variants of MOUTH are dominant even for our oldest speakers seems to support Britain’s (2008) claim that MOUTH inherited diphthong shift from other English varieties. However, what we see in NZE is not a simple case of levelling: the majority variant for the onset of the MOUTH diphthong in early NZE speakers seems to be central [ɐ], but this is not the variant that eventually came to dominate in NZE (Maclagan, Gordon and Lewis 1999). Figure 2 suggests that the more fronted variants started to replace central [ɐ] through a slow and gradual process shown by speakers born after 1875. It seems much more likely that this was a gradient shift internal to NZE rather than a case of levelling.

4.1.2 The phonetic and social conditioning of diphthong shift

This section presents results from mixed-effects logistic regression models for each of the four diphthongs. The models investigate both the time course and the phonetic/social conditioning of the changes. This will allow us to develop a more accurate description of diphthong shift and also to test specific hypotheses about it. The section is structured as follows. We first provide a brief outline of our regression models and the procedure we used to build and evaluate them. We also highlight those factors that proved significant for all four of the diphthongs. This is followed by a discussion of the specific results for each lexical set. Finally, we attempt to provide some insight into the causal relationships among the four different changes using random intercepts from the regression models.

We built separate mixed-effects logistic regression models for each vowel. The dependent variables for these models were binary factors that indicate whether the nuclear variant for a given token is conservative or innovative. We defined the conservative and innovative groups in a way that minimises the difference between the numbers of tokens in each of them (cf. Gordon et al. 2004 for a similar approach). This classification was based on pragmatic considerations: logistic regression is most interpretable when there is a dependent variable with only two levels, and works best when these two levels are both well-represented in the data set. Thus, in the case of PRICE, tokens with a nucleus coded as (0) were classified as conservative and those with a code of (1)–(2) as innovative. For all other vowels, the conservative group included codes (0)–(1), and the innovative group all higher codes.

The independent variables have already been described in Section 3.3. All of our models were initially fit using the same set of main effects and interactions. These are outlined below:

1. year of birth (represented as restricted cubic splines with three knots to allow for non-linearities; Baayen 2008: 176-179)
2. sex
3. the sonority class of the following segment
4. whether the speaker had at least one Scottish parent
5. the interaction between sex and year of birth
6. the interaction between sonority class and year of birth
7. the interaction between Scottish parent and year of birth
8. the interaction between Scottish parent and sonority class

In addition to the independent variables listed above, the models also included random intercepts for the speakers and the words.
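The binary coding of the dependent variable described above can be illustrated with a short sketch. It is only an illustration: the function name, the dictionary and the sample tokens are hypothetical and are not taken from the scripts used for the ONZE analysis.

```python
# Illustrative sketch only: names and sample data are hypothetical.
# Highest nucleus code that still counts as conservative for each lexical set:
# PRICE: (0) conservative, (1)-(2) innovative; other vowels: (0)-(1) conservative.
CONSERVATIVE_MAX = {"PRICE": 0, "MOUTH": 1, "FACE": 1, "GOAT": 1}


def is_innovative(lexical_set: str, code: int) -> bool:
    """Return True if a nucleus code counts as innovative for the lexical set."""
    return code > CONSERVATIVE_MAX[lexical_set]


# Made-up tokens as (lexical set, nucleus code) pairs.
tokens = [("PRICE", 0), ("PRICE", 1), ("PRICE", 2),
          ("GOAT", 1), ("GOAT", 2), ("FACE", 0)]
labels = [is_innovative(vowel, code) for vowel, code in tokens]
print(labels)  # [False, True, True, False, True, False]
```

The resulting True/False values correspond to the two levels of the binary factor that serves as the dependent variable in each model.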



The models were built using backwards stepwise regression. This means that we started with full models including all the independent variables listed above, and then gradually eliminated those variables that did not improve the model fit significantly. All model comparisons were performed using chi-square difference tests with α = 0.05 as the significance threshold. In cases where sonority class was significant, further investigation was conducted in order to determine whether there were grounds for collapsing this factor down into a smaller number of contrasts. That is, were all five codes for sonority class justified, or was its significance driven by a simpler contrast (e.g. voiced versus voiceless)?
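A single elimination step of this kind can be sketched as a chi-square difference (likelihood ratio) test between a fuller model and a reduced model. The sketch assumes that the log-likelihoods of both models are already available; the function names and the numerical values are hypothetical, and the chi-square tail probability is computed with a closed-form series for integer degrees of freedom rather than with any particular statistics package.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    total, term = 0.0, 1.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)  # builds x^(k-1) / (1*3*...*(2k-1))
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2 * x / math.pi) * math.exp(-half) * total)


def keep_term(loglik_full: float, loglik_reduced: float,
              df_diff: int, alpha: float = 0.05):
    """Return (keep, statistic, p): keep is True if dropping the term hurts the fit."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    p = chi2_sf(stat, df_diff)
    return p < alpha, stat, p


# Hypothetical log-likelihoods for a model with and without one predictor.
keep, stat, p = keep_term(loglik_full=-1200.0, loglik_reduced=-1203.5, df_diff=1)
print(keep, stat)  # True 7.0
```

In a backwards stepwise procedure this comparison is repeated, removing at each step a predictor whose removal does not significantly worsen the fit, until no such predictor remains.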

Although the set of significant predictor variables that were retained in the models was different for each of the vowels, there were some general trends that held across the entire data set. Importantly, all of the models retained year of birth as a significant predictor: as predicted, speakers born later show a higher proportion of innovative variants. This provides further support for the somewhat informal observation in the previous section that diphthong shift was progressing in all four of the lexical sets in the period between 1860 and 1895. Moreover, three of the four lexical sets (PRICE, FACE and GOAT) showed some amount of phonological conditioning, with certain environments favouring diphthong shift, although the relevant environments were different in each case (see below). The same three lexical sets also showed a significant effect of the parents’ origin: in each case, speakers with at least one Scottish parent were less likely to produce innovative variants than the rest of the speakers. This is consistent with the likely inputs given to the speakers, as Scottish English does not have diphthong shift (Gordon 2004: 154).

It should also be noted that none of the models retained sex, the interaction between sex and year of birth, the interaction between the following environment and the parents’ origin or the interaction between year of birth and the parents’ origin.

In what follows, we provide a brief outline of the specific results for each of the vowels. ‘Positive’ and ‘negative’ effects should be interpreted with respect to the probability of innovative variants. Thus, a ‘positive’ effect refers to an increase in the probability of innovative variants.

PRICE

The regression model for PRICE retained the following effects: the main effect of year of birth (positive), the sonority class of the following environment (see below), whether the speaker had a Scottish parent (negative for Scottish parents) and an interaction between year of birth and the following environment (see below).

A more careful investigation of the partial effects associated with the different sonority classes suggests that the influence of the following environment depends mainly on a contrast between following voiceless obstruents versus all other possible following environments. This was confirmed by a model comparison using Akaike’s Information Criterion (AIC), which showed that a model with a simple contrast between following voiceless obstruents and other environments is preferable to a model that includes contrasts among all possible environments.
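The logic of an AIC-based comparison can be sketched as follows. AIC is defined as 2k − 2 ln L, where k is the number of estimated parameters and L the maximised likelihood; the model with the lower AIC is preferred. The log-likelihoods and parameter counts below are invented purely for illustration and do not reproduce the values from the PRICE models.

```python
def aic(loglik: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * loglik


# Hypothetical fits: a model with all sonority classes versus a model with a
# single binary contrast (voiceless obstruent vs. everything else).
aic_full = aic(loglik=-1500.0, n_params=9)
aic_binary = aic(loglik=-1501.2, n_params=6)

preferred = "binary" if aic_binary < aic_full else "full"
print(aic_full, preferred)  # 3018.0 binary
```

In this invented case the simpler model fits slightly worse in terms of raw likelihood, but the penalty for its three fewer parameters is smaller, so it receives the lower AIC.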

The partial effects graph in Figure 3 shows the interaction from the model between the following environment and year of birth.



Figure 3. The interaction between year of birth and whether the following environment is voiceless or not. Voicelessness has a negative effect among older speakers, but this effect gradually disappears as year of birth increases.

As can be seen in Table 2 and Figure 1, the main difference between conservative (0) and innovative (1, 2) variants for PRICE is one of frontness: conservative variants are central, while innovative variants are back. We have already referred to this pattern in Section 2: the centralisation of the PRICE vowel in voiceless contexts mirrors Canadian Raising-type patterns observed in a wide range of other English varieties (cf. Trudgill 2004: 51). What is interesting is the development of this pattern: the influence of voicelessness is quite strong in the oldest speakers, but it appears to diminish rather quickly, and we see no voicing effect from birth dates around 1880 onwards. As Section 5 will show, this has important consequences for Trudgill’s account of the emergence of Canadian Raising-type patterns outlined in Section 2.

MOUTH

Only a single predictor was retained in the final model for MOUTH: year of birth (positive). Model comparison between the final model with a non-linear main effect for year of birth and a more streamlined model with a simple linear main effect showed that the non-linear component of the model is not justified. In other words, the proportion of innovative variants showed a steady increase in the period under investigation; while there may be a slight decrease in innovative variants around 1875, it is not large enough to warrant using the more complex, non-linear model.

FACE

Three predictors were retained in the final model for FACE: year of birth (positive), the sonority class of the following environment and Scottish parents (negative). Using the same procedure as for PRICE, it can be shown that the influence of the following environment is mainly carried by a distinction between word-final position and other environments: word-final tokens show a higher proportion of innovative variants than the rest of the tokens. Moreover, the non-linear component of the predictor representing year of birth is not justified: a model comparison based on AIC suggests that a model with a simple linear main effect for year of birth is preferable. It appears that the proportion of innovative variants for FACE rose steadily in the second half of the nineteenth century.

GOAT

The following predictors were retained for GOAT: year of birth (positive), the sonority class of the following environment, Scottish parents (negative) and an interaction between year of birth and following sonority class. A model comparison reveals that the effect of following sonority class rests mainly on a distinction between following laterals and other environments. Specifically, GOAT is less affected by diphthong shift when followed by a lateral [l] – although this effect only holds among the younger speakers, as shown by Figure 4.

Figure 4. The interaction between year of birth and whether the following sound is a lateral or not.

Note that the distinction between innovative and conservative variants for GOAT is based on a difference in backness: innovative variants are central, while conservative ones are back (cf. Figure 1). This implies that following laterals act against a process of fronting that affects the GOAT nucleus in other environments. This is a familiar phenomenon which is often reported for dialects where the GOOSE and the GOAT lexical sets are undergoing fronting (see e.g. Labov 1994: 332, Haddican et al. 2013). However, the case of GOAT in NZE seems special in that pre-lateral tokens of GOAT do not simply block fronting, but appear to actively push the nucleus of the diphthong towards the back region of the vowel space.



Causal relationships among different lexical sets

We have now looked at each of the diphthongs separately and found a number of factors that hinder or facilitate diphthong shift. We were also interested in investigating whether the progress of diphthong shift in one vowel is related to its progress in a different vowel, and whether such relationships can be argued to involve an element of causation. We can test for relationships across vowels by treating each speaker as a single data point, and creating an index for their overall realization of each vowel by averaging their diphthong shift codes (cf. Table 2). If the speaker-specific averages are significantly correlated across two vowels, we might infer that the two vowels are not independent with respect to diphthong shift (this type of methodology and argumentation is used by Gordon et al. 2004 and Hay and Sudbury 2005 for different variables).
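The per-speaker index and the two kinds of correlation coefficient can be sketched in a few lines. This is a minimal illustration with invented tokens for four hypothetical speakers; the helper names are assumptions, and the coefficients are computed by hand rather than with the R functions used for the actual analysis.

```python
from collections import defaultdict
import math


def speaker_means(tokens):
    """tokens: iterable of (speaker, code) pairs -> {speaker: mean code}."""
    sums = defaultdict(lambda: [0, 0])
    for speaker, code in tokens:
        sums[speaker][0] += code
        sums[speaker][1] += 1
    return {s: total / n for s, (total, n) in sums.items()}


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Invented diphthong shift codes for four hypothetical speakers.
price = [("s1", 0), ("s1", 0), ("s2", 1), ("s2", 0),
         ("s3", 1), ("s3", 2), ("s4", 2), ("s4", 2)]
face = [("s1", 1), ("s1", 0), ("s2", 1), ("s2", 2),
        ("s3", 2), ("s3", 2), ("s4", 3), ("s4", 3)]

price_idx, face_idx = speaker_means(price), speaker_means(face)
speakers = sorted(set(price_idx) & set(face_idx))
xs = [price_idx[s] for s in speakers]
ys = [face_idx[s] for s in speakers]
print(round(pearson(xs, ys), 3), round(spearman(xs, ys), 3))
```

Each speaker contributes exactly one point per vowel pair, which is what makes the speaker, rather than the token, the unit of analysis in this kind of comparison.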

Investigation of the degree of association between the different vowels reveals high degrees of correlation. Speakers who are more innovative with one vowel are also likely to be more innovative with another. These significant correlations are shown in Figure 5.

Figure 5. Top-right: scatterplots of per-speaker average diphthong shift indices (shown along the x and y axes) for different pairs of vowels (each dot represents a speaker); bottom-left: parametric (r = Pearson) and non-parametric (rs = Spearman) correlation coefficients based on the data shown in the scatterplots. The cells along the diagonal plot the distribution of average diphthong shift indices for a given vowel.



Figure 5 shows the correlations among the speaker-specific average diphthong shift indices for the different vowels (this figure was created using pairscor.fnc, available in the languageR library). Each row and each column corresponds to a specific lexical set. For instance, the first row and the first column both represent PRICE, while the second row and the second column both represent FACE. The lower left half of the rectangle shows parametric and non-parametric correlation coefficients and significance values at the intersections of different columns and rows. The upper right half of the rectangle plots the speaker-specific averages against each other for different pairs of vowels. In the following discussion, we refer to the individual panels by specifying their column and row numbers, in that order. For instance, Figure 5:(3,2) refers to the scatterplot showing the relationship between GOAT and FACE (third column, second row).

It is not possible to infer directly any causal relationship from this figure, for reasons that will be explained below. However, it is nonetheless useful for assessing the claim that there is an implicational scale between the variables (see Section 2). For convenience, the hypothesized scale is repeated here: MOUTH > PRICE > GOAT > FACE.

Inspection of the raw data reveals no evidence for the implicational scale proposed by Trudgill. There are speakers, for example, with shifted GOAT but very little shifted PRICE. Figure 5:(3,1) gives mean values for each speaker. Inspection of this plot shows that, while there is a relationship between the two, some speakers are more advanced in GOAT, and some in PRICE. And, indeed, as already noted, the apparent overall degree of shifting of GOAT appears to be higher. The relationships between the other vowels tell a similar story.

Unfortunately, attempting to infer any causality from the correlations in Figure 5 runs into a problem in the case of the current data set (and indeed, probably most others where sound changes are occurring concurrently). It would in fact be very surprising to see no overall correlation between diphthong shift in two different vowels given that the proportion of innovative variants is strongly and positively correlated with year of birth for all the vowels (as shown by the logistic regression models described above). In other words, any relationship between two vowels might be simply due to the confounding effect of year of birth (and possibly other speaker-related variables, such as the parents’ origin). In order to avoid this issue, we use the random intercepts for the speakers from the models built for each vowel. These random intercepts correspond to individual-level variation beyond what the model can predict on the basis of fixed effects such as year of birth and Scottish parents. The variation represented by the fixed effects cannot have a confounding effect on the correlations among the random intercepts for different vowels, since their influence has already been accounted for by the regression models. Therefore, a significant correlation between the random intercepts for two vowels indicates either a causal relationship or the influence of an additional hidden variable that affects both vowels, and was not included in the regression model. We will adopt the first interpretation in presenting our results, but we acknowledge the possibility that some of our findings are due to unknown confounds.
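The confounding problem described above can be illustrated with simulated data. In the sketch below, two invented "vowel" measures both depend on year of birth but are otherwise unrelated. Their raw correlation is high, whereas the correlation between what is left after year of birth has been regressed out is close to zero. Note the simplifying assumption: ordinary least-squares residuals are used here as a rough stand-in for speaker random intercepts, which in the actual analysis come from mixed-effects logistic regression models. All names and numbers are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(xs, ys):
    """Residuals of ys after fitting a least-squares line on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(1)  # fixed seed so the simulation is repeatable
years = [rng.uniform(1860, 1895) for _ in range(300)]
# Both simulated indices rise with year of birth; their noise is independent.
vowel_a = [0.05 * (y - 1860) + rng.gauss(0, 0.3) for y in years]
vowel_b = [0.05 * (y - 1860) + rng.gauss(0, 0.3) for y in years]

raw = pearson(vowel_a, vowel_b)
adjusted = pearson(residuals(years, vowel_a), residuals(years, vowel_b))
print(round(raw, 2), round(adjusted, 2))
```

A correlation that survives this kind of adjustment cannot be attributed to the shared predictor, which is the reasoning behind comparing random intercepts rather than raw speaker averages.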



Figure 6. Top-right: scatterplots of random intercepts from our statistical models (shown along the x and y axes) for different pairs of vowels (each dot represents a speaker); bottom-left: parametric (r = Pearson) and non-parametric (rs = Spearman) correlation coefficients based on the data shown in the scatterplots. The cells along the diagonal plot the distribution of random intercepts for a given model.



Figure 6 shows the correlations among the random intercepts for the different vowels. The figure reveals four significant correlations: PRICE-FACE (1,2), FACE-MOUTH (2,3), FACE-GOAT (2,4) and GOAT-MOUTH (3,4) (all significant by a non-parametric correlation test). It should be noted, however, that the significance values associated with the correlations between all pairs except FACE-MOUTH are relatively high, and these correlations become non-significant if we control for multiple comparisons using a Bonferroni correction (which lowers the significance threshold to 0.05 / 6 ≈ 0.008). Moreover, even without correction, the correlation between GOAT and MOUTH does not survive the removal of the speaker with the lowest random intercept for GOAT (cf. Figure 6:(4,3)), who appears to be an outlier (but the non-linear correlation between FACE and GOAT does survive; cf. Figure 6:(4,2)).

While the results based on the random intercepts are far from unequivocal, we can make a few tentative observations. First of all, the strong and robust correlation between FACE and MOUTH suggests that there is indeed a causal relationship between diphthong shift in these two vowels. Although this finding is somewhat perplexing, an inspection of Figure 1 shows that the majority of the innovative variants for these two diphthongs occupy the same spot in phonetic space. This may well be related to the fact that they are not independent of each other in terms of diphthong shift, but we were unable to find any further indications as to the exact source of this relationship.
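The Bonferroni threshold mentioned above follows directly from the number of vowel pairs: four lexical sets yield six pairwise comparisons. A minimal sketch (the p-values passed to the helper are invented for illustration):

```python
from itertools import combinations

vowels = ["PRICE", "FACE", "MOUTH", "GOAT"]
pairs = list(combinations(vowels, 2))   # all unordered vowel pairs
alpha = 0.05
threshold = alpha / len(pairs)          # Bonferroni-corrected threshold

print(len(pairs), round(threshold, 3))  # 6 0.008


def survives_bonferroni(p: float, n_tests: int, alpha: float = 0.05) -> bool:
    """True if a p-value stays significant after Bonferroni correction."""
    return p < alpha / n_tests


# Hypothetical p-values: the first survives correction, the second does not.
print(survives_bonferroni(0.001, len(pairs)), survives_bonferroni(0.03, len(pairs)))
```

A p-value such as 0.03 would count as significant at the uncorrected 0.05 level but not at the corrected level, which is the situation described for all pairs other than FACE-MOUTH.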

While the correlation between MOUTH and GOAT is likely due to a single outlier, the correlation between PRICE and FACE might be indicative of a chain shift. Given the observation in Section 4.1.1 that FACE likely started shifting before PRICE, this appears to be a push chain, where the movement of the FACE nucleus into a lower and more back position forces the PRICE nucleus to shift further back. Furthermore, the correlation between FACE and GOAT could point to a parallel shift in these two diphthongs, which would not be surprising given that these two vowels have evolved in tandem throughout a good portion of the history of English (MacMahon 1998).

4.2 Glide weakening in the ONZE corpus

4.2.1 An overview of glide weakening

Table 3 shows the counts and relative proportions of different offglide realisations for each lexical set. Recall that (0) stands for complete nucleus-offglide identity leading to a monophthongal realisation, (1) for no glide weakening, and (2) for a glide weakened [e] (PRICE and FACE) or [ə] (MOUTH and GOAT) offglide.

            (0)            (1)            (2)
         no.     %      no.     %      no.     %
PRICE     21    1%     2108   80%      493   19%
MOUTH     13    1%     1552   85%      261   14%
FACE     214    7%     3068   92%       41    1%
GOAT      87    3%     2714   95%       53    2%



Table 3. The counts and the relative proportions of glide weakening variants for the four different lexical sets.

The figures in this table suggest that the current data set provides limited opportunities for the investigation of glide weakening. This is because the proportion of innovative pronunciations which show glide weakening is very small (especially for FACE and GOAT). The scarcity of innovative tokens (compared to the high number of conservative tokens) makes the results of logistic regression models somewhat unreliable, and also makes pom-pom and ribbon plots quite unilluminating. Therefore, we limit ourselves to a fairly general discussion related to the counts of different variants in this section. The next section presents regression models for PRICE and MOUTH but not for FACE and GOAT.

As can be seen in Table 3, PRICE appears to be the furthest ahead in terms of glide weakening, while the other diphthongs only show a small proportion of innovative variants. This is somewhat surprising given that PRICE appeared to be the most conservative lexical set in terms of diphthong shift, with a high proportion of unshifted variants and a trajectory that showed little change until the end of the period under investigation (cf. Figure 2).

Interestingly, only FACE and GOAT have a non-trivial proportion of (0) variants, that is, variants that do not have an offglide at all. This is likely a result of the fact that these lexical sets have monophthongal realisations in some of the dialects that served as the input to NZE. For our purposes, the most important varieties with monophthongal FACE and GOAT are the ones spoken in Scotland, as this is one of the regions that contributed the highest number of settlers in New Zealand. The panels in Table 4 show that the majority of (0) glide realisations for FACE and GOAT come from speakers with at least one Scottish parent.

FACE            (0)   (1, 2)          GOAT            (0)   (1, 2)
SCOTTISH        173     1530          SCOTTISH         58     1314
OTHER            41     1579          OTHER            29     1453
χ2 = 71.32, df = 1; p < 0.001         χ2 = 6.23, df = 1; p = 0.0126

Table 4. The numbers of (0) offglide variants versus other variants for FACE and GOAT in speakers with at least one Scottish parent versus speakers with no Scottish parents.

The results of the chi-squared tests reported in the tables confirm that these differences are significant.

4.2.2 The phonetic and social conditioning of glide weakening

In order to examine the phonetic and social conditioning of glide weakening in the PRICE and MOUTH vowels, we built mixed-effects logistic regression models along the same lines as in Section 4.1.2. The initial predictors in the model and the procedure for eliminating non-significant variables were the same. The dependent variable was once again a binary factor that divided the glide realisations into innovative and conservative subgroups. Variant (2) was defined as innovative and variant (1) as conservative. As the small number of monophthongal (0) forms may have indicated dialectal variants, they were excluded from the analysis.

Both resultant models retained only a single predictor: the sonority class of the following environment. A number of model comparisons showed that the influence of the environment cannot be reduced to a binary distinction between a single environment and all other environments. For PRICE, the most favouring following environment is laterals (28% glide weakened), followed by nasals, voiced obstruents, voiceless obstruents and pauses (15%), in that order. For MOUTH, the most favouring following environment is voiceless obstruents (28% glide weakened), followed by voiced obstruents, pauses and nasals (6%), in that order.

Thus, while following environment affects these vowels, it does not do so in the same way. The pattern observed for PRICE has been found in a variety of dialects with following voiced obstruents and sonorants favouring glide weakening (e.g. Tristan da Cunha, Schreier and Trudgill 2006; Ann Arbor, US, Dailey-O’Cain 1997; Falkland Islands, Britain and Sudbury 2008). What is probably most important about the results of these models is that year of birth did not significantly influence the proportion of innovative variants. This suggests that glide weakening was not, in fact, a change in progress in the second half of the nineteenth century, but instead represents a residue of the input dialects (at least in the case of PRICE and MOUTH).

4.2.3 Causal relationship between diphthong shift and glide weakening

Since Gordon et al. (2004) describe diphthong shift and glide weakening as processes that were causally related, we may expect to find a strong correlation between them.

Inspection of the raw data (not presented graphically here) certainly confirms the interpretation that non-shifted tokens are unlikely to contain glide-weakening. 0% of the non-shifted MOUTH tokens display glide weakening, and only 6% of the non-shifted PRICE tokens do. This compares with much higher rates of glide weakening amongst shifted tokens (15% for MOUTH, and 36% for PRICE). This supports an analysis in which glide-weakening is a process that largely operates on diphthong-shifted tokens. But this hypothesis raises the question of whether there is also a tighter relationship between glide weakening and diphthong shifting at the level of individual tokens. That is, do tokens that show greater degrees of diphthong shift also show greater degrees of glide-weakening? Inspection of the raw data shows quite clearly that the opposite is true. The highest rates of glide-weakening are on the moderately-shifted tokens.

Another way to approach the same question is to look at the relationship between diphthong shift and glide weakening at the level of speakers rather than specific tokens. In other words, do speakers who show greater degrees of diphthong shift also show greater propensity for glide weakening? Using the methodology presented in Section 4.1.2, we compared the random intercepts for speakers from the diphthong shift and glide weakening models built for PRICE and for MOUTH. Figure 7 shows the results of this comparison. PRICE is shown on the left and MOUTH is shown on the right. In neither case does the correlation approach significance (p > .3 in both cases). Although the line through the PRICE data shows an upward trend, this appears to be due to a single data point at the top right of the graph. In other words, the individual who displays the most diphthong shift for PRICE also displays the most glide weakening, but there is no overall correlation in the data.

While it seems plausible, then, that diphthong-shift provides the appropriate environment for glide-weakening to take place, the relationship ends there. Subsequent to this facilitating initial state, diphthong shift and glide weakening seem to operate as separate processes, at least in the case of PRICE and MOUTH. And in the case of the other lexical sets, we do not have sufficient glide-weakened tokens to observe any reliable relationship. We also note that there is no strong relationship across the vowels between the two processes, as PRICE is the last vowel to move in terms of diphthong shift, but shows the strongest evidence of glide weakening.



Figure 7. The relationship between speaker-specific random intercepts for diphthong shift and glide weakening in the PRICE lexical set (top) and MOUTH lexical set (bottom). The lines show a non-parametric scatterplot smoother (lowess) fit through the data.

We also checked to see whether there was a relationship between glide weakening in PRICE and MOUTH. Are speakers who have a high degree of glide weakening in PRICE also more likely to have glide weakening in MOUTH? The answer is no. There is absolutely no correlation between the random effects for these two processes (p > .9).



5 Discussion and summary

Let us briefly summarise the results for diphthong shift and glide weakening in the four lexical sets under investigation. We found that the proportion of variants showing diphthong shift increased over the time period investigated in all four of the lexical sets. This is supported by the results of our logistic regression models as well. The trajectories of the changes in the different vowels suggest that FACE and GOAT were the first to change, followed by PRICE and MOUTH, although substantial numbers of diphthong shifted MOUTH vowels were present from the very beginning. On the other hand, glide weakening in PRICE and MOUTH did not advance during this period. The small number of tokens with weakened glides in the remaining diphthongs, FACE and GOAT, makes it impossible to make reliable observations about the progress of glide weakening in these lexical sets.

As for phonological conditioners, diphthong shift seems to be sensitive to a range of different phonological environments depending on the vowel. Thus, following voiceless obstruents blocked the backing of the nucleus in PRICE in older speakers, and following laterals developed a blocking effect on the fronting of GOAT in younger speakers. Moreover, FACE showed higher proportions of innovative variants in word-final environments throughout the period. MOUTH did not show any phonological conditioning.

The social conditioning of diphthong shift seems to be relatively simple. We did not find any effect of sex, but the parents’ origin had a significant influence on diphthong shift in three of the four vowels. Specifically, speakers with at least one Scottish parent produced a greater number of monophthongal tokens as well as fewer innovative variants than the rest of the speakers.

We found a number of suggestive correlations across the different lexical sets, which may indicate causal relationships. Thus, PRICE and FACE appear to be undergoing a chain shift whereby PRICE is forced to move by FACE. There is also some indication that the shifts in FACE and GOAT are occurring in parallel. Finally, we found that MOUTH and FACE are strongly correlated in terms of diphthong shift, which may be related to the fact that their nuclei are moving towards the same point in phonetic space. Unfortunately, the current data set does not provide any further indications as to the source of this correlation. Although non-diphthong-shifted variants seldom attract glide-weakening, the data set did not provide any evidence that diphthong shift and glide weakening are further related beyond that.

The data provide partial support for Britain’s (2008) claim that diphthong shift in MOUTH was inherited from English varieties and became dominant through a process of levelling: we found a much higher number of diphthong shifted variants for MOUTH than for the other vowels, and these variants were present even in the speech of the oldest speakers. However, we do not see levelling in favour of the majority variant: the change seems to have been further incremented in NZE.

Trudgill’s account of the emergence of Canadian raising-type patterns through reallocation is partly compatible with our findings: speakers with Scottish ancestry showed more central realisations of PRICE (but not MOUTH) than the rest of the speakers. However, the data clearly contradict Trudgill’s prediction that Canadian raising should emerge in later generations (i.e. in speakers born after 1890). While we observed a pattern of raising, the pattern was only present in our oldest speakers, and disappeared for those born after 1880. There are two different ways of interpreting this result: (i) reallocation takes place in an earlier stage than suggested by Trudgill, or (ii) reallocation is not the source of Canadian raising-type patterns. Since we do not have data from speakers born before 1855, we cannot determine which of these two interpretations is correct.

6 Conclusion

Overall, our findings suggest the following general story for the evolution of closing diphthongs in NZE. All four vowels show a certain number of diphthong shifted variants from the beginning of the period under investigation, which were likely inherited from other varieties of English. Diphthong shift seems to be further incremented within NZE, with FACE and GOAT leading the process and MOUTH and PRICE lagging behind. These changes also show a degree of interdependence: FACE and GOAT may have undergone a parallel shift, PRICE may have shifted as part of a push chain started by FACE, and FACE and MOUTH may have interacted with each other due to their phonetic proximity. As for glide weakening, it seems that diphthong shift creates relevant environments for this process, but it does not affect it beyond that. The data also provide some support for the claims in Gordon et al. (2004), Trudgill (2004) and Britain (2008), but all of these accounts need further refinements to capture the full set of facts related to diphthong shift and glide weakening.

Acknowledgements

This research was made possible by a Rutherford Discovery grant, and a Leverhulme Visiting Professorship awarded to Jennifer Hay. The paper uses data from the Mobile Unit Corpus, which was collected by the Mobile Disc recording Unit of the NZ Broadcasting Service. The work done by members of the Origins of New Zealand English Project (ONZE) in preparing the data, making transcripts and obtaining background information is also acknowledged. We would like to thank Maria Timmermans, who performed some of the auditory analysis described here. As ever, we are grateful to Robert Fromont for his ongoing development and support of LaBB-CAT – the corpus analysis tool that houses the ONZE corpora.

References

Baayen, R. Harald 2008. Analyzing Linguistic Data: A Practical Introduction to Statistics Using R. Cambridge: Cambridge University Press.

Baayen, R. Harald 2011. languageR: Data sets and functions with “Analyzing Linguistic Data: A practical introduction to statistics”. R package version 1.4.

Bates, Douglas, Martin Maechler and Ben Bolker 2011. lme4: Linear mixed-effects models using S4 classes. R package version 0.999375-42.

Britain, David 2005. Where did New Zealand English come from? In: Allan Bell, Ray Harlow, and Donna Starks (eds), Languages of New Zealand. Wellington: Victoria University Press, pp. 156-193.



Britain, David 2008. When is a change not a change? A case study on the dialect origins of New Zealand English. Language Variation and Change 20:187-223.

Britain, David and Andrea Sudbury 2008. What can the Falkland Islands tell us about diphthong shift? Essex Research Reports in Linguistics.

Drager, Katie, and Jennifer Hay 2012. Exploiting random intercepts: Two case studies in sociophonetics. Language Variation and Change 24:59-78.

Dailey-O’Cain, Jennifer 1997. Canadian raising in a midwestern US city. Language Variation and Change, 9(1): 107-120.

Gordon, Elizabeth, Lyle Campbell, Jennifer Hay, Margaret Maclagan, Andrea Sudbury and Peter Trudgill 2004. New Zealand English: its Origins and Evolution. Cambridge: Cambridge University Press.

Gordon, Elizabeth, Margaret A. Maclagan and Jennifer Hay 2007. The ONZE Corpus. In: Joan C. Beal, Karen P. Corrigan and Hermann Moisl (eds). Models and Methods in the Handling of Unconventional Digital Corpora: Volume 2, Diachronic Corpora. Basingstoke, Hampshire: Palgrave Macmillan, pp. 82-104.

Haddican, Bill, Paul Foulkes, Vincent Hughes and Hazel Richards 2013. Interaction of social and linguistic constraints on two vowel changes in northern England. Language Variation and Change 25(3): 371-403.

Hay, Jennifer and Andrea Sudbury 2005. How rhoticity became /r/-sandhi? Language 81:799-823.

Kerswill, Paul, Eivind N. Torgersen and Susan Fox 2008. Reversing “drift”: Innovation and diffusion in the London diphthong system. Language Variation and Change 20: 451-491.

Labov, William 1994. Principles of Linguistic Change. Vol. 1: Internal Factors. Oxford: Blackwell.

Maclagan, Margaret A., Elizabeth Gordon and Gillian Lewis 1999. Women and sound change: conservative and innovative behaviour by the same speakers. Language Variation and Change 11(1): 19-41.

MacMahon, Michael K. C. 1998. Phonology. In: Suzanne Romaine (ed.) The Cambridge History of the English Language, Vol. IV: 1776-1997. Cambridge: Cambridge University Press, pp. 375-535.

R Core Team 2013. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.

Roach, Peter 2004. British English: Received Pronunciation. Journal of the International Phonetic Association 34(2): 239-245.

Schreier, Daniel and Peter Trudgill 2006. The segmental phonology of nineteenth-century Tristan da Cunha English: convergence and local innovation. English Language and Linguistics 10(1):119-141.

Tagliamonte, Sali A. and R. Harald Baayen 2012. Models, forests, and trees of York English: Was/were variation as a case study for statistical practice. Language Variation and Change 24: 135-178.

Trudgill, Peter 2004. New Dialect Formation: The Inevitability of Colonial Englishes. Edinburgh: Edinburgh University Press.

Trudgill, Peter, Margaret Maclagan and Gillian Lewis 2003. Linguistic archaeology: the Scottish input to New Zealand English phonology. Journal of English Linguistics 31: 103-124.





23 The development of recording technology

Raymond Hickey

The credit for developing the earliest device for recording sound goes to the Frenchman Édouard-Léon Scott de Martinville (1817-1879). His device, the phonautograph, patented in France in 1857, converted sound waves into mechanical modulations which were traced onto paper blackened with soot so that the tracings were easily visible. Scott de Martinville’s concern was to study the acoustics of sound waves, not to reverse the process and reproduce sound from the patterns his device generated. This step was left to Thomas Edison (1847-1931), the great American inventor, who in 1877 developed his phonograph (later known as a gramophone), a device for mechanically recording an acoustic signal on a physical medium. The working principle of this device was to channel the acoustic signal through a megaphone-like horn, at the narrow end of which was attached a stylus. The stylus etched a pattern corresponding to the modulating acoustic input onto a rotating cylinder, covered with tinfoil in the earliest model and made of wax in later ones. The cylinder could then be used to play back the signal by using a stylus and horn to reverse the process. The movements of the stylus as it travelled through the patterning on the cylinder were converted by a diaphragm into an acoustic signal, which was then released through the horn of the device (Figure 1).

Figure 1. An early cylindrical phonograph from around 1900.

Wax cylinders could be reused for a recording by smoothing the surface so that it could accept new patterning during a later recording session. In early recording work for dialectological purposes wax cylinders were frequently reprocessed after a fieldworker had taken notes from a recording. This meant, of course, that many audio recordings did not survive.1

The next development in recording technology was the gramophone record, in which recorded sound was stored on a flat disc of shellac, later of polyvinyl chloride, simply called ‘vinyl’. This development is mainly associated with the German-American Emile Berliner (1851-1929), who founded several companies for the commercial production and sale of records. Linguistic material was however collected on very early gramophone records (see Figure 2). In Germany in the 1930s a large-scale project was begun which collected samples of all the dialects of the then Third Reich, stretching from the Rhineland in the west to Prussia in the east. This project, with the name Lautdenkmal reichsdeutscher Dialekte ‘Audio monument of imperial German dialects’, was shaped by the Nazi ideology of the time; indeed, the completed set of some 300 gramophone records, with a player in a specially designed wooden cabinet, was presented to Adolf Hitler on his 48th birthday on 20 April 1937. The politically unacceptable thinking behind the project, especially that of one of its main members, Hermann Neef (1904-1950), led to its neglect for many years. There have been a number of attempts to use the material for objective dialect analysis in the past two decades and many recordings have since been digitised (see the information at www.lautdenkmal.de).

1 Some recordings were retained, such as that of the Austrian Emperor Franz Joseph I (1830-1916) recorded in Bad Ischl, Upper Austria (near Salzburg) on 2 August 1903. This recording and other early audio material is available from the Phonogrammarchiv ‘phonogramme archive’ of the Institute for Audio-visual Research and Documentation of the Austrian Academy of Sciences, Vienna.


Figure 2. Portable mechanical gramophone player from the 1930s.

A different principle is at work where an analogue audio signal is recorded on a wire by moving the wire across a recording head which induces magnetic patterns in it. These patterns reflect the modulations of the audio signal and so the wire functions as a storage medium for sound. Although the principle of magnetic recording had been laid down between the 1870s and the 1890s by the American engineer Oberlin Smith (1840-1926) and the Danish engineer Valdemar Poulsen (1869-1942), it was not until 1928 that the Austro-German engineer Fritz Pfleumer (1881-1945) produced the first magnetic tape recorder. This used a long plastic (originally paper) band which was coated with magnetisable material (ferric oxide) and stored on a reel.

In the context of the present volume the most important early data source from England is the set of sound recordings made for the Survey of English Dialects. The fieldworkers of this project collected material over an eleven-year period from 1950 to 1961. The informants were generally non-mobile, older, rural males (so-called NORMs) who, in the opinion of the survey’s manager Harold Orton (1898-1975) of the University of Leeds, best represented their dialect area. With advances in recording technology, several of the dialect locations were revisited and some of the original informants (or comparable individuals) were recorded. This continued until 1974 and resulted in a collection of gramophone records and audio tapes (from reel-to-reel recorders, see Figure 3), some 287 of which are available on the website of the British Library as audio clips which can be listened to online (see http://sounds.bl.uk/accents-and-dialects/survey-of-english-dialects).


Figure 3. Early reel-to-reel magnetic tape recorder.

The breakthrough which led to modern recording technology was the development of digital recording, in which an acoustic signal (or video signal for filming devices) is translated into a digital signal consisting of sequences of binary data with only two values, 0 and 1. Physically these are levels below and above a certain threshold, which the digital recording device interprets as 0 and 1. The device can then recreate the analogue signal by converting the digital data and outputting it via a loudspeaker (or a computer display for a visual representation of the sound pattern). The mechanics of digital encoding had been developed in the nineteenth century and were used in early telegraphy, telecommunications and the transmission of Morse code. The latter system consisted of alternating sequences of clicks, tones or lights of two types, a long and a short signal. The system was indeed binary, but it did not capture sound; it used an encoding which could be understood by a receiver who knew how the code worked. The digitisation of sound came with pulse-code modulation, which samples the audio signal a fixed number of times per second and converts each sample to a digital value using a certain bit depth: the larger the bit depth, the better the quality of the digitised audio signal. A patent for this system was taken out by the English scientist Alec Reeves (1902-1971) and after WWII the first digital recorders were developed. During the 1970s commercial digital recorders became available, and in 1979 the digital optical storage disc, or simply CD (for Compact Disc), was developed. Since then digital recording and storage have become commonplace in audio recording. For speech recording the great advantage of digital recording is its high quality and its freedom from background noise caused by the recording device. Noise picked up with the speech signal can often be successfully filtered out, usually by processing the recording with appropriate software afterwards. Digital recordings can furthermore be displayed and edited as spectrograms or waveforms. The compact size of digital recorders (Figure 4) makes them very suitable for linguistic fieldwork.
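The sampling and quantisation steps of pulse-code modulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any particular recorder: the function names (`pcm_encode`, `pcm_decode`, `max_quantisation_error`), the 440 Hz test tone and the 8000 Hz sampling rate are all hypothetical choices made for the example.

```python
import math


def pcm_encode(signal, duration_s, sample_rate, bit_depth):
    """Sample a continuous signal and quantise each sample to a signed integer."""
    max_code = 2 ** (bit_depth - 1) - 1          # largest code, e.g. 32767 for 16 bits
    n_samples = round(duration_s * sample_rate)  # number of samples taken
    codes = []
    for n in range(n_samples):
        t = n / sample_rate                      # time of the n-th sample in seconds
        x = max(-1.0, min(1.0, signal(t)))       # clip to the representable range
        codes.append(round(x * max_code))        # nearest available integer level
    return codes


def pcm_decode(codes, bit_depth):
    """Map integer codes back to amplitudes between -1.0 and 1.0."""
    max_code = 2 ** (bit_depth - 1) - 1
    return [c / max_code for c in codes]


def tone(t, freq_hz=440.0):
    """Illustrative input signal: a pure sine tone (an assumption of this sketch)."""
    return math.sin(2 * math.pi * freq_hz * t)


def max_quantisation_error(bit_depth, sample_rate=8000, duration_s=0.01):
    """Largest difference between the original samples and the decoded ones."""
    codes = pcm_encode(tone, duration_s, sample_rate, bit_depth)
    decoded = pcm_decode(codes, bit_depth)
    originals = [tone(n / sample_rate) for n in range(len(codes))]
    return max(abs(a - b) for a, b in zip(originals, decoded))


if __name__ == "__main__":
    for bits in (8, 16, 24):
        print(bits, "bits: max error", max_quantisation_error(bits))
```

Running the script prints the largest rounding error for each bit depth, which shrinks as the bit depth grows. The storage cost rises correspondingly: at 24 bits and 96 kHz, the settings quoted for the recorder in Figure 4, one second of single-channel audio amounts to 96,000 samples of 3 bytes each, i.e. 288,000 bytes before any compression.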



Figure 4. Modern hand-held digital recorder (Roland Edirol R09HR, 24bit/96kHz WAVE/MP3 recorder – used by author for fieldwork).

As the 2000s progressed CDs became less favoured as storage media, and small removable hard disks, USB flash drives and solid-state drives appeared as alternative storage forms for digital data, as they all have the advantage of allowing easy deletion and (re-)storing of data. The encapsulation of sound data in digital form led to the development of processing software which allows not only the basic editing of primary data but also its spectral and formant analysis. The premier software tool at the moment is Praat, developed by Paul Boersma and David Weenink at the University of Amsterdam (see Boersma and Weenink 2013 and the website http://www.praat.org/; the current version is 6.0.11 [January 2016]). This software has been used in the processing of audio data in nearly every chapter in the current volume. The availability of such software has enabled the whole research field of sociophonetics (Johnson 2008; Thomas 2011; Di Paolo and Yaeger-Dror, eds, 2011; Celata and Calamai, eds, 2014) in which phonetic data is subjected to detailed analysis within a sociolinguistic framework. The numerical data gained from software such as Praat is often taken as input to a statistical package which is capable of graphically representing the input data. Most commonly used here are statistical packages written in R, an open-source, high-level programming language developed by an international team (see the website https://www.r-project.org/) who are part of the R Project for Statistical Computing. The combination of digital sound data, audio processing software and statistical computing software has established itself as a standard set of tools for sociophonetic work in the 2010s and will likely remain so, with further developments in technology and software adding processing power and allowing linguists to refine the types of acoustic analysis possible in years to come.

References

Baayen, R. H. 2008. Analyzing Linguistic Data: A Practical Introduction to Statistics using R. Cambridge: Cambridge University Press.

Boersma, Paul and David Weenink 2013. Praat: doing phonetics by computer. Version 5.3.47.

Celata, Chiara and Silvia Calamai (eds) 2014. Advances in Sociophonetics. Amsterdam: John Benjamins.

Di Paolo, Marianna and Malcah Yaeger-Dror (eds) 2011. Sociophonetics. A Student’s Guide. London: Routledge.

Gries, Stefan T. 2009. Quantitative Corpus Linguistics with R: A Practical Introduction. London: Routledge.

Gries, Stefan T. 2013. Statistics for Linguistics with R: A Practical Introduction. Berlin: de Gruyter Mouton.

Johnson, Keith 2008. Quantitative Methods in Linguistics. Malden, MA: Wiley-Blackwell.

Levshina, Natalia 2015. How to do Linguistics with R: Data exploration and statistical analysis. Amsterdam: John Benjamins.

Thomas, Erik 2011. Sociophonetics. An Introduction. Basingstoke: Palgrave Macmillan.



