EXPRESSIVE LANGUAGE IN EARLY ADOLESCENCE 1

Tell me a fairy tale: an automated approach to analyze expressive language

development in adolescents

Breno Pedroni (a), Fabio T. Rocha (a), Marina L. Puglisi (b), Sandra M. Aluisio (c) and Sabine Pompeia (a)

a Departamento de Psicobiologia, Universidade Federal de São Paulo, São Paulo, Brazil

b Faculdade de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil

c Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos,

Brazil

Author Note

Breno Pedroni https://orcid.org/0000-0001-9418-1436

Fabio T. Rocha https://orcid.org/0000-0003-3013-4117

Marina L. Puglisi https://orcid.org/0000-0003-0652-8837

Sandra M. Aluisio https://orcid.org/0000-0001-5108-2630

Sabine Pompeia https://orcid.org/0000-0002-5208-8003

Declaration of interest: none

Correspondence concerning this article should be addressed to Sabine Pompeia,

Departamento de Psicobiologia, Universidade Federal de São Paulo, Rua Napoleão de

Barros, 925, CEP 04024002, São Paulo, SP, Brazil. Email:

[email protected]


Abstract

Expressive oral language development (EOLD) in adolescence is understudied and usually

involves a small set of language parameters. To explore this development in more detail, we

used automated language analyses of transcripts of narrative speech (fairy-tale retellings) of

257 Portuguese-speaking 9-15-year-olds. We grouped multiple linguistic parameters into

word/lexical, sentence/grammar and inter-sentence/cohesion category levels entered as

multivariate factors in general linear models considering the effects of development, sex and

parental schooling. We found substantial EOLD at all levels, with negligible effects (except

in one level) of sex and parental schooling (a measure of participants’ socioeconomic status).

Age (months), pubertal status and school grade generally showed comparable EOLD effects,

suggesting sexual maturation and participants’ grade are not more determinant factors than

age. In summary, automated scoring of linguistic features of transcribed fairy-tale retellings is

a sensitive, inexpensive, and ecological way of identifying EOLD in adolescence which

minimizes socioeconomic effects on oral language.

Keywords: adolescence, development, retell, narrative, cognition, automatic language

analysis


Tell me a fairy tale: an automated approach to analyze expressive language

development in adolescents

Assessment of language skills is predominantly carried out during early childhood

when these abilities improve more substantially and can therefore be affected by a host of

factors that affect development (Black et al., 2017). Less attention has been paid to the

development of these abilities during early adolescence, a phase marked by changes in

physical/brain structure and function, and new life experiences (changes in social roles, home

and school environments) (Paus, 2005), which are partly due to the pubertal transition

(Goddings et al., 2019; Gozdas et al., 2019; Vijayakumar et al., 2021). These brain alterations

are known to also include cortical and subcortical areas that are related to cognitive functions

involved in the use of language (Sakai, 2005). All these changes can impact oral language

skills (see Berman, 2007; Nippold et al., 2014; Ricketts et al., 2020), as well as other

cognitive and social abilities that support language (Paus, 2005) as adolescents mature, such

as declarative memory (associated with lexicon/semantic knowledge) and procedural memory

(involved in the subconscious understanding of grammar and syntax) (Ullman, 2004). These

are protracted processes which the present study aimed to investigate in a non-clinical sample

of early adolescents.

In the field of cognitive sciences, assessment of language skills in adolescence is

commonly restricted to the determination of receptive knowledge of word meaning (lexicon

and semantics). This is done mainly by using the Vocabulary subtest of one of the many

versions of the Wechsler intelligence scales, such as the Wechsler Abbreviated Scale of

Intelligence (WASI; Wechsler, 1999). However, this type of task does not capture other

receptive and expressive language abilities that are also known to improve during

adolescence, such as how well this vocabulary knowledge is employed and used to connect


words into a sentence, or how increasingly more sophisticated syntactic constructions and

cohesive devices are used to better tie meaningful words and sentences together (Nippold,

1993; Berman, 2007). In contrast, within the field of speech and language therapy,

assessment of language goes well beyond determining people’s receptive vocabulary. One of

the most common comprehensive tools for evaluating expressive and receptive language

skills is probably the Test of Language Development-Intermediate (TOLD-I:5; Hammill &

Newcomer, 2020). This test battery includes six subtests, the first three of which assess

vocabulary in different ways using spoken words, while the other three involve generating a

more complex sentence from the combination of simpler ones, forming sentences from a

randomly ordered string of words, and the ability to recognize ungrammatical spoken

sentences. Although the TOLD-I is a much more comprehensive tool in terms of language

assessment than tools that simply determine people’s lexical knowledge (vocabulary), it does

not tap into the ability to spontaneously express and connect complex ideas. This can be done

with a great variety of measures at this age (Petersen et al., 2008), such as the Test of

Narrative Language (TNL-2; Gillam & Pearson, 2004), another norm-based battery that

assesses the ability to comprehend and produce three types of stories (a script of a familiar

day-to-day occurrence, a personal, and a fictional narrative cued by pictures), or even oral

description of pictures (e.g., Ardila & Rosselli, 1996; Martins et al., 2007), such as the cookie

theft task of the Boston Diagnostic Aphasia Examination (BDAE; Goodglass et al., 2001).

Although the latter may be more suitable to collect adolescents’ naturalistic speech samples,

the current scoring system focuses on a narrow set of factors such as speech rate (e.g.,

Martins et al., 2007) and/or individual linguistic features (e.g., story content; macrostructural

characteristics like character, setting, and initiating events; and microstructural elements

(metrics) such as the use of pronouns, causal adverbial clauses and cohesive adequacy) (see

Petersen et al., 2008). Data extracted from these speech samples could be further explored to


provide a more comprehensive view of adolescents’ expressive oral language abilities and

developmental trajectories as a whole, instead of being analyzed at individual linguistic

levels.

All the above-mentioned language assessment tools also share some disadvantages

that we aimed to avoid. Namely, they can take a long time to administer, involve

time-consuming manual scoring and fees because they are not open access, and have usually

only been culturally and linguistically adapted for use in a few countries. Moreover, these

tests were generally developed to detect intellectual disability or language impairment, and

not to determine how expressive oral language develops in healthy adolescents. An

alternative way to avoid such limitations is the use of language sample analysis (LSA), a tool

that is more commonly employed in the field of linguistics. This type of analysis consists of

obtaining speech samples in real-world conversational contexts, which are essential for

academic and social activities (Miller et al., 2016; Spencer et al., 2017). These samples are

then transcribed to allow the extraction of different types of linguistic elements/metrics and

speech characteristics. This can be done free of charge, automatically and in different

languages and provides a more ecological language sampling and analysis (Heilmann &

Nockert, 2010; Johnston, 2006; Miller et al., 2016) compared to simply defining words,

building sentences or even determining ratings of story grammar from speech samples (see

Schneider et al., 2006).

LSA has been consistently used with children for many years, either to assess

language development (Heilmann & Miller, 2010), or for clinical and diagnostic purposes

(Schneider & Dubé, 2005; Westerveld, 2011), but standardized and age-appropriate protocols

for adolescents are still lacking (Spencer et al., 2012). Furthermore, most published studies

used only a small set of metrics/elements from oral (e.g., Nippold et al., 2017) and written

samples without providing a theoretical explanation for the selection of the analyzed


linguistic elements. This leads to a limited understanding of what is expected in terms of

expressive oral language development in this age range (Hill et al., 2021). Despite this, based

on the available literature, Berman (2007) and Nippold (1993, 2000, 2004) claim that, during

adolescence, there seems to be a notable improvement in the use of the lexical (increased use

of more abstract words and figurative language) and syntactic aspects of oral language (e.g.,

longer sentences, expanded noun and verb phrases and a better use of embedded clauses and

passive voices). Improvement is also found in other metalinguistic and non-literal uses of

language, which all reflect later language development (the transition process from being

fluent to becoming proficient in language use, which often takes place during school-age

and adolescent years) (Berman, 2007).

Most of these studies in adolescents were based on microlinguistic dimensions

(within-sentence formulation of a verbal expression at the lexical or grammatical level: see

Barker et al., 2020). Macrolinguistic elements (between-sentence and higher-order

formulation and/or conceptual organization: Barker et al., 2020) are particularly difficult to

score manually and, therefore, have been less explored in prior studies on development,

which rarely used automatic scoring, as done here. One of these macrolinguistic features is

cohesion, a linguistic mechanism that represents an objective characteristic of discourse

(Graesser & McNamara, 2011; Graesser, McNamara & Kulikowich, 2004), which considers

not only explicit connective and cohesive cues and devices that organize and tie together

ideas and information across sentences, but also word overlaps (Crossley et al., 2019;

Graesser et al., 2004).

Despite the generalized improvement in language use across adolescence described

above, there are many variations in results from study to study. Hill et al. (2021), for instance,

did not find age effects in adolescents in syntactic (measured by mean length of utterance and

clausal density) and lexical elements (number of different words, also referred to as types).


There are also contradictory findings regarding improvement in cohesion in adolescence. For

example, Nippold et al. (1992) showed age-related increases in the use of cohesive devices,

but the authors only assessed adverbial conjunctions (such as “moreover” and “furthermore”).

Ciccia and Turkstra (2002), in turn, considered only cohesive markers in T-units (which

consist of a main clause plus all subordinate clauses or non-clausal structures attached to it)

(Cannizzaro & Coelho, 2013) classified as forming complete, incomplete or erroneous ties

and found no age effects, like Hill et al. (2021). Even though the parameter assessments in

these studies measure cohesion, they only reflect a very limited aspect of cohesion.

Therefore, cohesion needs to be explored in more detail, as is done here using automated

analyses, as described later. These conflicting findings are not surprising because studies

varied considerably regarding many factors beyond the limited number of studied linguistic

elements, such as differences in sample size and studied age ranges. We also call attention to

differences in the nature of the task used to elicit the language samples among prior studies

(e.g., narrative retellings, persuasive or descriptive discourses and conversations), which can

capture distinct language skills (Frizelle et al., 2018; Nippold et al., 2005, 2014, 2017;

Westerveld, 2011). All these types of tasks involve the formulation and connection of ideas

using the appropriate vocabulary and grammar, while also requiring speakers to keep track of

the information that is being conveyed. However, they can differ in many respects, as

described next. Skills needed for conversations, for example, may be less cognitively and

linguistically challenging than narratives in that they do not require speakers to sustain a

cohesive, complex and engaging monolog (Byrd et al., 2012). Indeed, in conversations,

monitoring the listener’s comprehension and adjusting the output accordingly is usually done

in a more superficial way (e.g., shortening sentences and simplifying syntax) compared to

narratives of other types (Heilmann & Miller, 2010; Frizelle et al., 2018). Furthermore,

conversations, persuasive discourse, and narratives that require people to describe something


personal or an individual’s point of view have no a priori fixed structure. This leads to wide

interpersonal variations in the size and complexity of speech samples (since, for example, one

person can tell a story about their summer vacation in detail, while another might not be so

thorough simply because they did not experience anything particularly interesting). This

diversity potentially introduces much inter-individual variation in aspects other than language

itself, which is diminished when stories are retold because they have a general plot and a

specific and pre-established structure.

Additionally, the type of story to be retold can influence the analyses of expressive

oral language. Asking people to retell a story that they have just learned from listening to an

audio track, or observing pictures (see Hill et al., 2021), can bias results when analyzing

developmental effects because children/adolescents of different ages can interpret/understand

stories and memorize them at different levels of proficiency. It may be more appropriate to

ask children/adolescents of different ages to retell a famous fairy tale or fable, learned in

early childhood, as this allows between-age comparisons within the same general story

framework. Despite being highly familiar, fairy-tale retellings require complex skills because

these stories have sophisticated meaning and events that must be organized into a coherent

summary (Nippold et al., 2014). This type of retell is also believed to elicit a higher syntactic

complexity than most of the other narrative tasks and is particularly sensitive to

developmental changes in children (Westerveld & Vidler, 2016) and adolescents (Nippold et

al., 2014, 2017). Hence, fairy-tale retellings offer various advantages compared to other oral

speech samples in studies of expressive oral language development in early adolescents. This

can be carried out for free, in any language, and provides an ecological way to elicit

syntactic, semantic, and other expressive language abilities.

In this study, we therefore chose to investigate the development of expressive oral

language in adolescents using oral retells of fairy tales. Transcripts of these stories were used


to extract a wide range of micro- and macrolinguistic elements using automated linguistic

analyses, which is feasible due to the progress of disciplines such as computational linguistics

and discourse processing (Graesser et al., 2004). An example of an open access

computational linguistic tool for this purpose is Coh-Metrix 3.0 (Graesser et al., 2004;

Graesser et al., 2011; McNamara et al., 2010; http://cohmetrix.com/), which returns 108

metrics of written samples of language discourse (Graesser & McNamara, 2011; Graesser et

al., 2011). Although the number of available metrics varies between languages, similar tools

are also available in Romance languages, such as Spanish (Coh-Metrix-Esp: Quispesaravia et

al., 2016) and Brazilian Portuguese (NILC-Metrix, which provides 200 metrics: Leal et al.,

2021), which is the native language of the individuals who were assessed in this study.

In sum, most previous studies on expressive oral language development during

adolescence focused on specific language elements/metrics that differ among studies and do

not represent a comprehensive measure of oral language. To obtain a clearer picture of

oral language development across early adolescence, it is important to analyze speech

samples considering metrics that tap the different constructs/areas of language. One type of

classification is the six-level theoretical framework described in Graesser &

McNamara (2011; see also Graesser et al., 2011, 2014): words (lexical or semantic level),

syntax (or grammar; sentence level), textbase (connection between sentences, such as

cohesion), situation model, genre/rhetorical structure and pragmatic communication. The

latter three levels were not explored in this study as they mainly apply to comparisons

between different types of narrative (see Graesser et al., 2011) and we only used fairy-tale

retellings.

This seemingly simple approach for classifying elements into language categories

can, however, introduce some difficulties in comparing data across studies because many

metrics could potentially be classified into more than one category. An example is lexical


diversity measures, such as the Honoré index, which considers type/token ratios. It can, at

first glance, be classified at the word (lexical) level; but it can also easily fit into the textbase

level, considering the importance of lexical diversity for story cohesion (see Graesser et al.,

2011).

Taking into account all the above considerations, the aim of the present

cross-sectional study was to explore which types of oral language abilities develop during the

age range in which most people go through the pubertal transition (9-15 years old) (Dorn et

al., 2006), which we will call early adolescence. To do so, we obtained speech samples of

fairy-tale retellings from 267 adolescents in order to extract 200 micro- and macrolinguistic

language elements using free automated software (NILC-Metrix;

http://fw.nilc.icmc.usp.br:23380/nilcmetrix). Because there is no consensus on which metrics

to analyze in adolescents (many are expected to reach a plateau by adolescence; some are

redundant; and many are classified into different language constructs across studies), a panel

of experts selected a subgroup of metrics based on previous studies (see details in the

Methods section and in the Supplementary File) and reclassified them into language

categories. A data-driven approach was further used to reduce the elements and categories.

We then evaluated which of three developmental measures that can influence language had a

greater effect on these language subgroups: (a) chronological age, because being older allows

more time and experience to acquire vocabulary and better master abstract lexicon and

syntactic aspects of language (see Berman, 2007), which explains differences in performance

between older and younger individuals in the same school grade (Peña, 2020); (b) pubertal

status, which could have specific effects on performance because age of pubertal onset and

trajectory vary considerably among individuals (Dorn et al., 2006) and these factors are

known to impact the maturational advancement of brain structure/function that likely support

language and many other related aspects of cognition (Goddings et al., 2019; Gozdas et al.,


2019; Vijayakumar et al., 2021); and (c) school grade, which can impose cumulative

academic challenges in terms of use of language (Tomblin & Nippold, 2014) that might lead

those in more advanced grades to perform better.

We also investigated whether there are sex and parental schooling effects on the

development of expressive oral language skills. There are conflicting data on sex differences

in language abilities and, when they are found, effect sizes are small (see Newman et al.,

2008; Xia, 2013) and more often reflect a female advantage (e.g., Snowling & Hulme, 2020).

If this were found to be related to brain changes induced by puberty, this could be explained

by the fact that puberty begins earlier in girls (Dorn et al., 2006; Goddings et al., 2019).

Regarding parental schooling, this measure was selected as a proxy for socioeconomic status

(SES) because it is strongly associated with factors such as family income, nutrition, overall

health, education and cognitive stimulation (see Reynolds et al., 2017). The home literacy

environment (e.g., Puglisi et al., 2017; Spencer et al., 2017; Snowling & Hulme, 2020) and

poverty-related factors (Black et al., 2017) are all related to parental schooling and are known

to influence the development of many cognitive skills (e.g., Farah, 2017; Von Stumm &

Plomin, 2015). Proficiency in receptive (Jacobsen et al., 2013) and expressive (see

Gardner-Neblett & Iruka, 2015) vocabulary in early infancy, for instance, has been found to

be related to SES (e.g., Alt et al., 2016; Reynolds et al., 2017). However, studies of the effects

of SES on other aspects of oral language are uncommon.

We stress that this study was mainly exploratory, in that it was designed to pinpoint

sets of micro- and macrolinguistic automated metrics obtained from oral fairy-tale retellings

and organized into language category levels that could be sensitive to change across

adolescent maturation. Nonetheless, based on the abovementioned literature, we hypothesized

that there would be improvement across the tested ages, although we refrained from making

predictions about which specific discourse category level would be most affected. We also


had no specific predictions as to which developmental marker (chronological age, pubertal

status or school grade) would better indicate these effects, as all of them, theoretically, are

related to language and discourse skill improvement. We also expected there might be

(minimal) sex differences and a positive association of language performance with parental

schooling, but in which language level this would become apparent was unclear because sex

and SES have been mainly explored at the lexical (vocabulary) level only. Lastly,

performance in the vocabulary subtest from the WASI was included as a control measure as it

is one of the very few language assessment tools that is standardized for adolescents in

Brazil, where the study took place. Performance in this task (raw scores) was expected to be

higher in older adolescents; similar between sexes; and lower in participants whose parents

had fewer years of schooling (see Von Stumm & Plomin, 2015).

Methods

Participants

This study tested a convenience sample of 267 native Portuguese-speaking

adolescents (9-15 years old) from the city of São Paulo, Brazil. Exclusion criteria were being

a student with special needs, having been held back in school, and/or using daily medication,

all of which aimed to avoid testing participants with clinical/mental chronic health conditions

that could have an impact on language. Due to a lack of similar prior studies with

adolescents in the literature and the exploratory nature of this study, sample size estimations

were not carried out.

Procedure

This cross-sectional study was approved by the Ethics Committee of the Universidade

Federal de São Paulo (CAAE: 56284216.7.0000.5505) and all participants and their parents

provided signed informed assent and consent, respectively. Participants were recruited at their


own schools, which also consented to be involved in the study. Students were shown a

four-minute video about the study and, if interested in taking part, took home written forms to

be filled in by their legal guardians (consent forms and questionnaires about demographics,

health and behavior). The assessments were administered by trained research assistants and

were performed individually, in a room set apart from the class at the participants’ own

schools.

All measures described here were obtained from a single session that included other

tests and questionnaires that will not be addressed here (part of the same sample was used in

other publications by the last author and colleagues). The test session began with the

WASI Vocabulary subtest, but the remaining measures were presented in one of four random

orders, which included the Pubertal Development Scale (PDS), used to assess self-rated

pubertal status (see below) and oral fairy-tale retelling from memory, which was recorded and

later transcribed. The participants were awarded a “Science Partner” certificate, and their

families were provided with a written report about their results.

Measures

Demographics

Guardians provided information on biological sex (gender identity data was not

collected), date of birth (from which age in months was determined), school grade and the

number of years of formal schooling of both parents/guardians. Years of parental schooling

was averaged across both parents/guardians when both were present in the participants’ lives

and used as a proxy of SES, since it is strongly and positively correlated with parental income

and, therefore, associated with general family SES (see Farah, 2017; Jacobsen et al., 2013;

Reynolds et al., 2017; Spencer et al., 2017).
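
The two derived demographic variables described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, age is assumed to be counted in completed months up to the date of testing, and a single reported value is assumed to be used when only one parent/guardian was present.

```python
from datetime import date


def age_in_months(birth, tested):
    """Completed months between date of birth and the test date (assumed convention)."""
    months = (tested.year - birth.year) * 12 + (tested.month - birth.month)
    if tested.day < birth.day:
        months -= 1  # the current month has not been completed yet
    return months


def parental_schooling(years_a, years_b=None):
    """Mean years of schooling across the parents/guardians who were reported.

    Using the single available value when only one parent/guardian is present
    is an assumption made for this sketch.
    """
    reported = [y for y in (years_a, years_b) if y is not None]
    return sum(reported) / len(reported)


example_age = age_in_months(date(2008, 3, 15), date(2018, 3, 14))
example_ses = parental_schooling(11, 15)
```

Keeping age in months rather than in years preserves the finer-grained variation between participants in the same school grade.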

Pubertal status


Self-rated pubertal status was assessed with the Pubertal Development Scale (PDS)

(Carskadon & Acebo, 1993), adapted for use in Brazil by Pompeia et al. (2019). It included

five questions that enquire about rapid height growth, growth of body hair and skin changes

(such as the presence of pimples) for both sexes, deepening of voice and growth of facial hair

for boys and breast growth and occurrence of menarche for girls. Each question can be

answered on a 5-point scale, from not yet started (1) to seems complete (4); 0 was used to indicate

that responders did not know the answer, and menarche is rated as either yes (4) or no (1).

The average score indicates pubertal status, which ranges from 1 to 4 (see Pompeia et al.,

2019).
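
The scoring rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and answers of 0 ("does not know") are assumed to be left out of the average, which the description above does not specify.

```python
def menarche_rating(has_occurred):
    """Menarche is rated as yes (4) or no (1)."""
    return 4 if has_occurred else 1


def pds_mean_score(ratings):
    """Mean of the answered PDS items (1-4), or None if no item was answered.

    Ratings of 0 ("does not know") are skipped; this handling is an
    assumption of the sketch, not a rule taken from the scale itself.
    """
    answered = [r for r in ratings if r != 0]
    if any(r not in (1, 2, 3, 4) for r in answered):
        raise ValueError("PDS ratings must be 0 (unknown) or 1 to 4")
    if not answered:
        return None
    return sum(answered) / len(answered)


# Made-up answers: height growth, body hair, skin changes, plus two sex-specific items.
girl_items = [3, 2, 2, 3, menarche_rating(True)]
boy_items = [2, 0, 1, 2, 1]  # the 0 marks an item the respondent could not answer
girl_score = pds_mean_score(girl_items)
boy_score = pds_mean_score(boy_items)
```

Because the score is an average over items, it stays on the same 1 to 4 range for both sexes even though the sex-specific items differ.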

Vocabulary

The Vocabulary subtest of the WASI battery (Wechsler, 1999), adapted for local use

(Heck et al., 2009a; Yates et al., 2006) was used, which assesses knowledge of 38 orally

presented words that participants are asked to define. Scores per word range from zero

(cannot define the word) to two (the definition is correct and complete), following criteria in the

test manual. Raw scores were used because we intended to confirm the improvement in

performance across adolescence and not participants’ performance relative to others of the

same age. Possible scores range from 0 to 76.

Oral retelling of fairy tale

To ensure the participants knew the fairy tale they were supposed to retell from

memory, there were two alternative stories to choose from (Little Red Riding Hood or Jack

and the Beanstalk¹). A picture alluding to the main characters of these tales was first

presented (a girl in a red cape standing on a field covered in snow, with a forest in the

background; a boy standing on a plant above the clouds, under a blue sky, with no additional

details pertaining to the tales, such as the presence of a wolf, granny, giant or castle). The

participants were then asked to name the illustrated fairy tales (to make sure that they knew

them) and to choose the one they knew best or preferred to retell from memory, including a

beginning, middle and an ending. The examiner did not interrupt the participants while they

were retelling the stories unless their version was too brief, in which case they were asked to

provide more details. At the end of the retell, participants were asked to provide an

alternative ending from their own imagination, which, for the present purposes, served to

increase the size of the language sample. All speech samples were audio recorded.

¹ These stories were selected based on a pilot study of the best-known fairy tales at the tested age. Thirty adolescents first wrote down the name of all fairy tales they could remember, and then selected which, among a pool of 25 different fairy tales, they could retell from memory. Little Red Riding Hood was the most mentioned (observed in 44.8% of the answers) and most voted (89.7% of the votes) story. The alternative story selected to match this one had a similar structure but a male protagonist instead because it has been shown that sex/gender identification may be an important issue at the tested age (see Maccoby, 2002). It was the second most voted for (55.2% of the votes); however, it was only mentioned three times when recalled from memory.

Transcription

Retold stories were transcribed from the audio recordings according to the criteria

proposed by Gago (2002) and Marcuschi (1986), following spelling and segmentation rules

of the Portuguese language. Transcriptions were also marked for disfluencies, discourse

markers and unrelated comments, which were all removed in order to generate a “clean”

transcription. All transcriptions were checked by two people to ensure errors were minimized.

Extraction of the linguistic elements/metrics from the narratives

Linguistic elements/metrics were obtained from the clean transcriptions using the

NILC-Metrix software (Leal et al., 2021), which is based on Coh-Metrix (Graesser et al.,

2004; Graesser et al., 2011; McNamara et al., 2010) and generates 200 language metrics,

including lexical and syntactic information, readability and psycholinguistic measures, and

metrics of referential and semantic cohesion, all of which are explained in detail, in

Portuguese, on the NILC-Metrix website (http://fw.nilc.icmc.usp.br:23380/nilcmetrix).

Selection of language elements/metrics and their classification into language categories


The selection of which oral language aspects/levels and metrics would be analyzed

was a multistep procedure detailed in the Supplementary File.

Briefly, this involved various sessions during which all the authors (specialists in

linguistics, speech and language therapy, computational linguistics and cognitive sciences)

considered the characteristics of all 200 metrics returned by NILC-Metrix. The 47 selected

metrics (elements) were then organized by consensus into language categories/levels that

made sense theoretically following some suggestions in the literature. There were six

linguistic category levels: Lexical complexity (six metrics) and Grammatical/syntactic

complexity (16 metrics), which mapped onto the first two of the six levels of the

multilevel theoretical framework proposed by Graesser & McNamara (2011) and Graesser et

al. (2011, 2014). The third level according to their framework, textbase, comprised two

categories in the current study (Referential Cohesion, with 10 metrics; Semantic Cohesion,

with four metrics) because these are the only types of cohesion devices returned from the

NILC-Metrix in Brazilian Portuguese and different types of cohesion seem to fall into various

different categories in principal component analyses (e.g., Graesser et al., 2011). We also

added two further categories: General Complexity/Readability (seven metrics) and

Psycholinguistic Complexity (four metrics), based on Santos et al. (2020) and dos Santos et

al. (2017), because it was unclear in which of the prior categories these types of metrics should

be allocated. The number of metrics was then reduced to include only the broad elements

(e.g., verb diversity instead of gerund verb ratio, inflected verb ratio, infinitive verb ratio, and

so on) that are expected to improve as adolescents age, while also avoiding redundancies.

When redundancies were detected (e.g., Brunet and Honoré indexes: both are a form of

type/token ratio metric), we prioritized the most reliable/used metrics in the literature, and/or

selected the metric that captured a broader use of language (in the former example, the

Honoré index, since besides having all characteristics of the Brunet index, it also accounts for


words that occur only once). Initially, 47 metrics were selected (see Table S1 in the

Supplementary File). The number of metrics and levels were further reduced following the

analytic plan described below, mainly to avoid type I errors in the statistical analyses.
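
The redundancy example above (Brunet versus Honoré) can be illustrated with a minimal Python sketch based on the textbook definitions of the two indexes: Brunet's W = N^(V^-0.165) and Honoré's R = 100 log(N) / (1 - V1/V), where N is the number of tokens, V the number of distinct words and V1 the number of words used only once. The function name and the lowercasing/whitespace tokenization are hypothetical, and NILC-Metrix's own implementation may differ in tokenization and scaling; this is not the code used in the study.

```python
import math
from collections import Counter


def lexical_richness(tokens):
    """Return (Brunet's W, Honoré's R) for a list of word tokens."""
    counts = Counter(t.lower() for t in tokens)
    n_tokens = sum(counts.values())   # N: total number of words
    n_types = len(counts)             # V: number of distinct words
    n_hapax = sum(1 for c in counts.values() if c == 1)  # V1: words used once

    # Brunet's W depends only on N and V (a type/token-style measure).
    brunet_w = n_tokens ** (n_types ** -0.165)

    # Honoré's R additionally uses V1; it is undefined when every
    # distinct word occurs exactly once (V1 == V).
    if n_hapax == n_types:
        honore_r = float("inf")
    else:
        honore_r = 100 * math.log(n_tokens) / (1 - n_hapax / n_types)
    return brunet_w, honore_r


# Toy example (hypothetical sentence, not from the study's transcripts)
w, r = lexical_richness("the wolf ate the grandmother and the wolf slept".split())
```

Because only Honoré's R uses V1, two narratives with the same N and V but different numbers of once-only words receive the same Brunet value and different Honoré values.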

Analytic plan

Descriptive statistics and the inferential analyses were all carried out using IBM SPSS

Statistics version 21 (IBM Corp., Armonk, NY, USA). There was no imputation of missing

data. After the definition of all six language category levels (1 – Lexical Complexity; 2 –

Grammatical/Syntactic Complexity; 3 – Referential Cohesion; 4 – Semantic Cohesion; 5 –

Readability; and 6 – Psycholinguistics), we initially explored the pattern of Pearson linear

correlations between and within elements of different language categories (see Tables 2S-7S

in the Supplementary File; the correlation patterns between all 47 metrics can be found at

https://osf.io/2u6vw/). This was done to: (a) ensure the levels were adequate (i.e., element

correlations were higher within than between category levels); and (b) propose possible

changes in the number of categories, reallocation of metrics to other categories and/or

exclusion of elements that had no sizable associations with any other elements or correlated

too highly (r > .90, to avoid multicollinearity: see Grice & Iwasaki, 2008).
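
The correlation screening described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical data layout (one list of values per metric and a mapping from metric to category); the function names are ours, and the study itself used SPSS together with expert judgment rather than this code.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _mean(values):
    return sum(values) / len(values) if values else float("nan")


def screen_metrics(data, categories, too_high=0.90):
    """data: {metric: [value per participant]}; categories: {metric: category}.

    Returns the mean |r| within categories, the mean |r| between
    categories, and the metric pairs whose |r| exceeds `too_high`.
    """
    within, between, collinear = [], [], []
    for m1, m2 in combinations(data, 2):
        r = pearson_r(data[m1], data[m2])
        same_category = categories[m1] == categories[m2]
        (within if same_category else between).append(abs(r))
        if abs(r) > too_high:
            collinear.append((m1, m2, round(r, 3)))
    return _mean(within), _mean(between), collinear
```

A category structure is supported when the first returned value is clearly larger than the second; pairs in the third value are candidates for dropping one of the two metrics.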

After changes in the category levels and the reduction of metrics based on the above

data-driven method (described in the Results section), data were screened for outliers with

boxplots. This was followed by an examination of the adequacy of the narratives that

contained outliers. If the same participant presented outlying data in five or more of our

selection of metrics and the narrative was not minimally organized in a logical way, all data

of the participant was removed.
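
The first, quantitative step of this outlier screening can be sketched as below. The sketch assumes the conventional boxplot rule (values more than 1.5 interquartile ranges beyond the quartiles) and Python's "inclusive" quartile method, which may differ slightly from the hinges SPSS uses; the function names and data layout are hypothetical. It only flags participants for review, since the study also required a manual examination of the narrative before excluding anyone.

```python
from statistics import quantiles


def boxplot_outlier_counts(data, k=1.5):
    """data: {metric: [value per participant]}.

    Returns, for each participant, the number of metrics on which
    their value lies outside the boxplot whiskers.
    """
    n_participants = len(next(iter(data.values())))
    counts = [0] * n_participants
    for values in data.values():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        for i, v in enumerate(values):
            if v < low or v > high:
                counts[i] += 1
    return counts


def flag_for_review(data, min_metrics=5):
    """Indices of participants outlying on `min_metrics` or more metrics."""
    counts = boxplot_outlier_counts(data)
    return [i for i, c in enumerate(counts) if c >= min_metrics]
```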

We then ran multivariate General Linear Models (mGLM), with outcome/dependent

variables being the selected metrics for their respective language category level (see Table 2).

Sex and fairy tale (Little Red Riding Hood or Jack and the Beanstalk) were entered as

categorical predictors (two levels each), and chronological age (in months) and mean parental


years of schooling, as continuous predictors. This approach was used to determine how a set

of quantitative dependent variables in each category could be combined linearly to maximize

(see Grice & Iwasaki, 2008) the identification of oral retelling performance development

according to chronological age, controlling for the effects of sex and parental schooling, as

well as the effects of these latter two variables themselves. Pillai’s Trace was used to report the effects

as this reflects the complete set of linear combinations generated from the analysis (Grice &

Iwasaki, 2008). The level of significance adopted was 5%. We did not test for normality in

each dependent variable because mGLM (or MANOVA) are robust towards univariate

non-normality (Allen & Bennett, 2008; Allen et al., 2014) and because this does not

guarantee multivariate normality. mGLM are also very robust against violations of

homogeneity assumptions when group sizes are larger than 30 (Allen & Bennett, 2008;

Allen et al., 2014) so this was also not controlled for. However, the distribution of residuals

was visually inspected and no indication of substantial deviations from normality of the

multivariate distribution was detected. Follow-up univariate General Linear Models (uGLM)

were run to identify the individual metrics that showed the largest effects. We also ran the

same analyses replacing age with pubertal status (scores in the PDS) and then with school grade.

The WASI Vocabulary raw scores were also analyzed using a uGLM, with age (in

months), sex (categorical) and mean parental schooling (continuous) as factors, and then

again replacing chronological age with pubertal status and then school grade. Partial eta squared

(ηp²) values between .0588 and .1379 were considered medium, and values of .138 or over, large effect sizes.

We will focus our analyses on results that were statistically significant and reached medium

to large effect size based on these rules of thumb since they are likely the most robust and

useful findings in practical terms (see Lakens, 2013; Richardson, 2011).
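
These rules of thumb can be written as a small helper. The cut-offs are those stated above; the function name is hypothetical and the snippet is only an illustration of how the labels in the tables follow from the ηp² values.

```python
def classify_eta_p2(eta_p2):
    """Label a partial eta squared value using the cut-offs adopted in the text."""
    if eta_p2 >= 0.138:
        return "large"
    if eta_p2 >= 0.0588:
        return "medium"
    return "small"


# Example with a value on each side of the two cut-offs
labels = [classify_eta_p2(v) for v in (0.030, 0.100, 0.200)]
# labels == ["small", "medium", "large"]
```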

Results


All anonymized transcripts and the anonymized database can be found at the Open

Science Framework website (https://osf.io/2u6vw/).

Selection of linguistic levels and metrics

The pattern of correlations between the 47 initially selected metrics (see Table 2S-7S

in the Supplementary File and https://osf.io/2u6vw/) indicated that some of these categories

should be merged. For example, various psycholinguistic metrics and the Honoré index were

associated with elements in the Lexical Complexity level, reflecting similarity at the word

(lexical, semantic) level, while other readability metrics, such as the Flesch readability ease,

were more related with the Grammatical/Syntactic category. These associations made perfect

sense so some categories were merged, and the metrics that had a low correlation with others

were then removed or reallocated to other category levels containing elements that they

correlated more highly with. These procedures – described in detail in the Supplementary File

– resulted in the selection of three linguistic category levels (Lexical Complexity, at the word

level; Grammatical/Syntactic Complexity, at the sentence level; Cohesion, at the between

sentences level) containing a total of 18 metrics (four to seven per level: see Table 1).


Table 1

Language levels considered in the present study and their respective elements (metrics) extracted by the software NILC-Metrix from the

transcriptions of the oral fairy-tale retelling narratives

Note. The steps involved in the selection of these metrics and their classification into these language levels are described in the text and in detail

in the Supplementary File. Based on the literature, upward arrows indicate that higher values represent higher text complexity (directly

proportional), whereas downward arrows indicate that higher values represent lower text complexity (inversely proportional).


Ten participant narratives were excluded after they were identified as outliers in the boxplots of at

least five of our 18 selected metrics. Most of these outliers were

identified as having produced such a disconnected and unclear narrative that, not surprisingly,

they gave rise to a different pattern of results.

Descriptive statistics of the retell analyses

After the exclusion of the outliers, 257 participants remained in the sample (150 girls,

107 boys). Their characteristics are illustrated in Table 2 according to age in (completed)

years. The sample mean age was M = 156.7, SD = 21.1 months (range 9-15 years), and

school grade varied from fourth (elementary) to first grade of high school, which corresponds

to fourth grade (elementary) to tenth (high school–sophomore year) in the United States. The

average number of years of parental schooling was M = 10.6, SD = 3.7, but highly variable,

ranging from 0 to 19.5. Pubertal status (PDS scores) ranged from 1 to 4 (M = 2.5, SD = 0.7).

Narratives were, on average, 315 words long (SD = 198.7; range 41 to 1535). Some

were very short but we opted not to exclude them if their content was coherent. The Little

Red Riding Hood tale was chosen to be retold by the great majority (n = 223) of participants,

while Jack and the Beanstalk was chosen by only 13% (n = 34) of the sample and

disproportionately by boys (67% of those 34 subjects were boys, while only 42% of the

whole sample were boys), χ²(1) = 12.10, p < .001 (Table 2). This led us to exclude this factor

from the statistical models reported next because it was unreliable and there was no reason

for us to believe language skill would differ according to the stories, which were of the same

genre.


Table 2

Demographic characteristics of the sample per age (in completed years) and sex, and the number of participants who chose each fairy tale to

retell

Note. PDS = Pubertal Development Scale score; PS = Parental schooling (average number of years of formal education), a proxy for

socioeconomic status. This Table only considers participants included in the analyses, after 10 participants were excluded. Age in months

(continuous) were converted to categorical age variables (in years) for illustrative purposes, but statistical analyses were carried out using age in

months. The fairy tale factor was not considered because it was unreliable (see text).


Table 3

Results of all multivariate and follow-up univariate general linear models pertaining to fairy-tale retell and univariate model of the vocabulary

test, considering the effects of age, sex and parental schooling


Note. PS = Parental schooling. Bold results represent significant effects with medium to large effect sizes (ηp² ≥ .058). Regular font represents

significant effects with small effect sizes. Light gray results indicate non-significant effects. For effects of sex, girls are the reference group. Total

N = 257 (150 girls; 107 boys).


Multivariate and univariate analyses with age as the developmental marker

In the mGLM that included the developmental marker age (Table 3), chronological

age had a positive and significant multivariate effect on all three predefined language levels,

with Pillai’s Trace indicating significant effects and medium to large

effect sizes: Lexical Complexity, F(7, 247) = 4.94; p < .001; ηp² = .123;

Grammatical/Syntactic Complexity, F(7, 247) = 9.47; p < .001; ηp² = .212; Cohesion, F(4,

250) = 13.56; p < .001; ηp² = .178.
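
The reported effect sizes can be cross-checked from the F statistics and their degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to two of the multivariate results above; the function name is hypothetical and this is not the SPSS routine used in the study.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    ss_ratio = f_value * df_effect
    return ss_ratio / (ss_ratio + df_error)


# Multivariate age effects reported in the text
lexical = round(partial_eta_squared(4.94, 7, 247), 3)      # 0.123
grammatical = round(partial_eta_squared(9.47, 7, 247), 3)  # 0.212
```

The same identity applies to the follow-up univariate models, where df_effect = 1.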

Follow-up univariate analyses showed that, regarding Lexical Complexity, there was

an age-related improvement with a medium effect size indicating that older participants used

words that are acquired at older ages: Age of acquisition, F(1, 253) = 21.21; p < .001; ηp² =

.077. The other metrics in this level, such as content word diversity and mean word

concreteness, also significantly improved with age, but these effects only reached small effect

sizes.

Univariate analyses on the Grammatical/Syntactic level also showed improvement

with age in all but one of the metrics (noun phrase length). Medium or larger effect sizes

were (from strongest to weakest): Prepositions per sentence, F(1, 253) = 53.30; p < .001; ηp²

= .174; words per sentence, F(1, 253) = 18.42; p < .001; ηp² = .068; and Yngve’s index, F(1,

253) = 17.14; p < .001; ηp² = .063.

For the Cohesion category, there was a significant increase (higher scores) with age in

three of the selected metrics, but only two reached a medium effect size: Argument overlaps

in all pairs of sentences, F(1, 253) = 22.61; p < .001; ηp² = .082; and semantic similarity

between all pairs of sentences, F(1, 253) = 18.57; p < .001; ηp² = .068.

Sex effects were found only for Lexical Complexity: Pillai’s Trace, F(7, 247) = 4.42; p <

.001; ηp² = .111. Follow-up analyses showed a medium-sized sex effect favoring girls only

on mean word concreteness, F(1, 253) = 16.87; p < .001; ηp² = .063, with girls having used less


concrete words and, therefore, more abstract words than boys; however, mean differences

were negligible in practical terms (Mgirls = 4.68; Mboys = 4.76).

Parental schooling effects were found in the multivariate analyses only for the

Cohesion level: Pillai’s Trace, F(4, 250) = 3.51; p = .008; ηp² = .053, but this was a small

effect only observed as a combination of the individual metrics, as none of the univariate

analyses reached anywhere near a medium effect size.

Performance in the WASI Vocabulary subtest was better in older participants, F(1,

253) = 73.96; p < .001; ηp² = .226 (large effect size), with mean scores ranging from 18 at 9 years of age to 62

at 15 years of age. It was not affected by sex, F(1, 253) = 2.41; p = .122,

although it was positively related with parental schooling, F(1, 253) = 6.54; p = .011; ηp² =

.025 (small effect size).

Multivariate and univariate analyses with pubertal status and school grade as the

developmental markers

The pattern of effects was almost exactly the same for the oral retell task and the Vocabulary

subtest in the multivariate and univariate follow-up models in which age was replaced with

pubertal status and school grade, although effects were slightly larger using age (see Table 4).

The direction of effects with all developmental markers was also the same, except in the case

of sex effects in the WASI Vocabulary subtest: there were no sex effects when considering

chronological age but, when the developmental variable was the PDS score (that is, when

boys and girls are matched by pubertal status), girls presented a vocabulary knowledge

disadvantage, which will be detailed in the Discussion. Hence, to avoid redundant

descriptions of results, we will focus henceforth on the models with age (see Tables 8S and

9S in the Supplementary File for detailed results regarding pubertal status and school grade,

respectively).


Table 4

Comparison of significant effects of the three developmental markers (age, pubertal status measured with the Pubertal Development Scale – PDS,

and school grade), after multivariate and follow-up univariate general linear models pertaining to fairy-tale retell and univariate model of the

Vocabulary test, according to effect sizes of significant effects (Large = L; Medium = M; Small = S; - = non-significant).


Note. All significant developmental effects were in the same direction (e.g. improved with age, PDS and grade) *except in respect to sex effects

(girls were at a disadvantage in their performance in the WASI Vocabulary subtest when the PDS was the developmental variable). Effect sizes

were classified following Lakens’ (2013) and Richardson’s (2011) rules of thumb: S = small (ηp² ≤ .0587), M = medium (.0588 ≤ ηp² ≤ .1379),

and L = large (ηp² ≥ .138). Detailed results can be found in Table 3 for chronological age, and Tables 8S (PDS) and 9S (school grade) of the

Supplementary File.


Discussion

The main objective of this study was to identify, in an exploratory manner, which

elements of expressive oral language, assessed by fairy-tale retellings, develop during

adolescence. Overall, we confirmed our predictions that there would be

developmentally-related improvements in oral language skills during this age range, which

we found to occur quite similarly in respect of all three analyzed developmental markers

(age, pubertal status and school grade) and language category levels (Lexical Complexity,

Grammatical/Syntactic Complexity and Cohesion) with medium to large effect sizes. We also

corroborated that sex differences favoring girls do occur (but only at the lexical level, which

did not seem to be pubertally driven) and, contrary to what was expected, found only subtle

effects (small effect size) of parental schooling, which occurred at only one language level

(Cohesion). In the following paragraphs we address the adequacy of the linguistic elements

analyzed in each category, followed by a more detailed discussion of the development of

expressive and receptive (Vocabulary) language and the possible practical applications and

implications of the results.

Selection of language elements and categories

Firstly, we address the appropriateness of our rationale and method of selecting

language elements (from the initial pool of 200 returned by NILC-Metrix) and classifying

them into language category levels. This was undertaken because: (a) reducing the number

of dependent variables avoids inflation of type I errors; (b) there is a lack of clear guidelines

regarding in which language category level many of these elements should be classified; and

(c) it is unclear in the literature which NILC-Metrix elements are most likely to improve in

early adolescents’ oral retell due to the fact that prior studies have mainly used only a limited

number of metrics and only very few considered cohesion. Our panel of experts initially

excluded around 75% (N=163) of the elements that were either too narrow/specific and/or


redundant classifications of language, and grouped the remaining variables (n = 47) into six

language categories based on prior studies (Graesser & McNamara, 2011; Graesser et al.,

2004, 2011, 2014; dos Santos et al., 2017; Santos et al., 2020). We then used a data-driven

approach based on the pattern of intercorrelations among these 47 metrics to refine the

selection of elements and levels and reach a minimum set of multiple variables that were

more highly correlated within than between language levels². This resulted in the analyses of

18 metrics classified into three language category levels (four to seven metrics per category).

This final three-level classification (Lexical Complexity, Grammatical/Syntactic Complexity

and Cohesion) is congruent with the first three (words, syntax and textbase) of the six levels of the

multilevel theoretical framework proposed in Graesser & McNamara (2011) and Graesser et

al. (2011, 2014). The other three levels of their framework (situation model, genre/rhetorical

structure and pragmatic communication) were not considered here because they rely on

comparisons of language elements between different types of narratives (see Graesser et al.,

2011) and here we only analyzed fairy-tale retellings.

Variables in the psycholinguistic level (i.e., word concreteness and familiarity) were

associated with other elements at the lexical level so these levels were merged, including the

lexical complexity measures (e.g., the Honoré index), which, contrary to what was proposed by Graesser et al. (2011),

did not seem to index cohesion, but rather associated more highly with other selected lexical

measures. For instance, the Honoré index, which indicates lexical richness by considering

the total number of different words used in the narratives, was more strongly correlated with

the other lexical elements (Pearson’s r values over 0.15 for four out of the six elements of

lexical complexity) than with the cohesion markers (Pearson’s r values below 0.13 for all four

2 This way it would be possible to determine all possible linear combinations of effects of all elements at each language category level, which is important for the application of multivariate general linear models (see Grice & Iwasaki, 2008). Differently, if values of the metrics in each level had been largely unassociated, the results of the multivariate analyses would have simply reflected the sum of univariate effects and not their linear combination (Grice & Iwasaki, 2008). Hence, by combining correlated metrics in each statistical model we maximized the chance of finding language categories effects and not spurious results or type I errors.


selected elements; the correlation patterns between all 47 metrics are available at

https://osf.io/2u6vw/). Classic measures of readability were largely redundant and the only

one that remained within the final 18 metrics (Flesch readability ease) correlated more highly,

and was merged with, elements at the syntactic level. Finally, our proposal to separate

referential from semantic cohesion measures did not seem necessary as they correlated, so

these levels were also merged. Based on this set of 18 variables and three categories levels,

we analyzed developmental, sex and parental schooling effects.

Age effects

Regarding development in terms of Lexical Complexity, age effects reached medium

effect sizes with the multivariate analyses approach, showing that older participants used

more complex words measured in various ways. The largest univariate age effect

was on the age of acquisition metric (the average value of the age at which individuals usually

learn the content words present in a narrative, implemented as described by dos Santos et al.,

2017). The mean age of acquisition value of the content words used by all adolescents in their

narratives was approximately 3 (see Table 3), which corresponds to the vocabulary of 5-6-year-olds

according to the NILC-Metrix manual. However, this seemingly low value and

the fact that no other lexical element was individually as strongly associated with age might

be a reflection of the thematic scope of the fairy tales used here (Little Red Riding Hood and

Jack and the Beanstalk), which can be retold without resorting to vocabulary acquired later in

life (see more on this issue below). Yet, despite the simplicity of the retell task, there was still

a significant and medium effect size multivariate increase in lexical aspects of the narratives.

The strongest age effect at the multivariate level was found for the category level of

Grammatical/Syntactic Complexity. Older adolescents produced longer and more complex

sentences, indicating that they were producing more densely packed sentences with

embedded clauses in the form of subordination and/or coordination, and even passive-voice

constructions, dispensing with the need to separate them into different and less connected

sentences. This corroborates findings from studies using a small set of metrics analyzed

individually (see Berman, 2007; Nippold et al., 2014) and supports what Berman (2007)

suggests as key for later language development. Berman (2007) claims that school aged

individuals do not necessarily learn new syntactic constructions, which they have known

since preschool age, but rather employ these constructions to form more complex sentences.

This was confirmed by a medium sized univariate age effect on the use of embedded clauses

per sentence (Roark et al., 2007) as measured with Yngve’s index (Yngve, 1960). This

index yields a higher value when sentences branch out to the right in a syntactic tree (a

characteristic of both Portuguese and English languages; Fraser et al., 2015), representing a

more syntactically complex discourse. Interestingly, the increase in size and complexity of

sentences can also be explained by the linguistic element with the highest effects in this and

other levels: an age-related increase in the number of prepositions per sentence, which

explained 17% (adjusted R²) of the variance in age. Prepositions (e.g., in, up, down, of, from,

after, until, between) are function words used to specify time and space, duration, direction

and place, as well as to express relations of possession, company, quality, causality, and other

semantically related associations. Hence, syntactically, prepositions can be used to specify

either the verb itself, as a case marker, or the whole sentence, as part of an adverbial adjunct.

The use of prepositions as verb case markers, as adverbial adjuncts or to form nominal

complements indicates better linguistic performance and richer communicative behavior,

allowing the speaker to be more accurate about the content in their discourse, while also

using the same words, such as transitive verbs, to create different specific meanings. For

example, “walking in the forest”, “walking through the forest”, “walking by the forest” and

“walking round the forest” all have different meanings, representing events that differ in

relation to their space condition. This also endorses a particular type of language


development observed during adolescence, described by Berman (2007) as “new word-forms

and expressions serving to elaborate familiar constructions, while new constructions (…)

trigger novel uses of earlier acquired lexical items”.

As for the Cohesion category,³ positive age effects were the second largest at the

multivariate level (after the Grammatical/Syntactic level) and two of its elements reached

medium effect sizes: (a) semantic similarity between all pairs of sentences (Landauer &

Dumais, 1997; Graesser et al., 2004; Graesser & McNamara, 2011), which indicated that

older adolescents used words that are more closely related to each other in meaning across

succeeding sentences, therefore producing a more semantically interconnected narrative; and

(b) argument overlaps considering all sentences, which shows increases in the number of

sentences that share common nouns and/or pronouns (McNamara et al., 2010; Graesser &

McNamara, 2011). Thus, as adolescents grow older, they also seem to make more efficient

use of cohesive devices and strategies to connect ideas and events in their narrative. There are

few studies that evaluate cohesion in terms of spoken discourse, and most refer to second

language acquisition and use (e.g., Crossley et al., 2019) which are not directly comparable to

ours. There are a few exceptions, such as the study of Nippold et al. (1992) who, like us,

showed age-related improvement in the use of cohesive devices (only adverbial conjunctions)

in adolescence. In contrast, Ciccia & Turkstra (2002) and Hill et al. (2021) did not find

developmental effects in cohesion abilities at this age. This might have a number of

explanations, such as smaller sample sizes and different age ranges, but is most likely the

3 In the present study, we considered that increases in semantic and referential cohesion metrics indicated better use of language because the cohesion metrics scores were positively correlated with other metrics that had a clear interpretation and improved with age (e.g., Yngve’s index, prepositions per sentence and words per sentence). However, it is still unclear in the literature if higher or lower cohesion should be interpreted as a more complex or adequate text/narrative. This depends on many issues. For example, people with a good understanding of a certain topic and/or who master a language can build and understand narratives that require inferences, without explicitly referring to ideas, sequences of events and their relationships in a narrative. This would lead to low cohesion scores (see Graesser et al., 2004; Graesser & McNamara, 2011; McNamara et al., 2010), even though a discourse that demands more inferences is considered to be more complex. On the other hand, narratives with more repetitions of semantic and referential elements, which leads to higher cohesion scores, are not necessarily clear or informative.


result of the use of a limited number of metrics that assess cohesion in both studies,

unlike our approach.

Sex effects

As to the effect of sex, found at the multivariate level only for Lexical Complexity, the only variable that was sensitive to

sex was mean concreteness of content words (Lexical Complexity level), showing that girls

used on average more abstract content words than boys (with a medium effect size), an

indicator of a higher level of lexical knowledge (Berman, 2007). Other studies using different

types of measures have also shown that girls have an advantage over boys in the use of

language (Snowling & Hulme, 2020). However, we did not find sex differences at the

Grammatical/Syntactic level, which have been shown in other studies that investigated

different aspects of language (see Xia, 2013), nor sex differences in cohesion, which, to our

knowledge, have not been investigated before at this age. Another interesting sex-related

finding is that boys were more prone to select the story with the male character (Jack and the

Beanstalk), which indicates that, in early adolescence, there may be different preferences for

telling stories with protagonists that participants identify with in terms of

sex/gender (see Maccoby, 2002). This was the reason we provided alternative stories with a

male and female main character.

Parental schooling effects (socioeconomic status)

As to parental schooling, lower values were related with lower Cohesion, an effect found

only at the multivariate level and with a small effect size, which could indicate a mild

difficulty in interconnecting sentences. We failed to find other studies that investigated this issue

in expressive oral language development during adolescence to corroborate our results.

Contrary to what was expected, we found no parental schooling effect at the

Lexical/Semantic levels in the retell task, even though proxies of SES during childhood are

often shown to be related to receptive language abilities (Jacobsen et al., 2013; Reynolds et


al., 2017; Spencer et al., 2012), and were indeed associated here with the WASI Vocabulary

task scores. As to the Grammatical/Syntactic category level, although Spencer et al. (2012)

showed that low SES was related to lower mean length of utterances in a narrative task, we

did not confirm this using different language elements at this language level. The absence of

parental schooling effects at the word (Lexical Complexity) and sentence

(Grammatical/Syntactic Complexity) levels can be attributed to the fact that people can tell

stories, particularly simple ones, such as fairy tales, without necessarily having to use their

complete repertoire of words, nor use more complex sentences beyond the verified

improvement with age. We do not believe this is a weakness of using oral fairy-tale

retelling. On the contrary, a lack of parental schooling effects on Lexical and

Grammatical/Syntactic metrics may allow the differentiation of underprivileged children who

have language/developmental disorders from those whose language is less developed due to

inadequate home and school language environments, which is difficult to do when using

other classic and normed tasks that show SES effects (e.g., Vocabulary tasks). Furthermore,

when using a retell task of a simple story, it is possible to detect potential negative SES effects

by looking at cohesion.

Comparison across developmental indicators (age, pubertal status and school grade)

In terms of which developmental marker was more sensitive to improvement in

language across early adolescence, we found that age measured in months, self-assessed

pubertal status (PDS) and school grade had similar effects (albeit slightly larger for

chronological age), indicating that none of these factors is specifically related to language

development during adolescence. According to Peña (2020), even students in the same school

grade who are exposed to the same academic demands may show different academic

achievement depending on their “relative age”. In other words, higher academic scores were

systematically found for older compared to younger classmates, suggesting that even within


individuals who differ in age by a year or less, having lived longer allows for more time to

acquire language abilities (e.g., through socialization, reading books, watching television).

Stated differently, experience gained with age (measured in months) seems to be a more

important factor for expressive language development in adolescence, which has also been

shown for receptive knowledge (Kovács et al., 2022). Entering puberty earlier has also been

associated with better working memory (Kovács et al., 2022) and academic outcomes in

longitudinal studies with large numbers of adolescents (Torvik et al., 2021), independently of

age, which has been explained as being due to the earlier maturation of brain systems that are

associated with language and academic skills in individuals who are more pubertally

advanced (Goddings et al., 2019; Gozdas et al., 2019; Vijayakumar et al., 2021). However, in

the present study, pubertal status measured with PDS scores, like school grade, did not account

for expressive and receptive language development better than age. This suggests that gains in

oral expression reflect experience accumulated with the passage of time rather than academic

progression or sexual development (the latter effect is subtle and probably requires very large

samples to be identified, which indicates that it may be of little importance in practical

terms). Notwithstanding, contrasting effects in the analyses using age and the PDS

provided interesting results regarding developmental language differences between boys and

girls, which had a different pattern of effects for expressive and receptive language. In both

the analyses that considered age and the PDS, girls, who mature sexually one to two years

earlier than boys (Dorn et al., 2006), had an advantage on the lexical complexity level. This

suggests that the advantage is not related to sexual maturity, which was matched across sexes in the

analysis with the PDS, and is more likely due to gender-related cultural aspects. For example,

adult females tend to use more adverbs and adjectives, which could offer more opportunities

to use abstract words of psychological or social content (see Newman et al., 2008; Xia,

2013). On the other hand, girls performed worse than boys in terms of receptive

vocabulary (WASI) in the analysis with the PDS. This was probably the result of the fact that

they were younger than boys of the same pubertal status, confirming the importance of age in

gaining knowledge about word meanings (as found by Kovács et al., 2022), whereas no sex

difference was observed in the analysis that considered age, corroborating the findings of

other studies (e.g., Toivainen et al., 2017). In other words, receptive vocabulary improves

with time in both sexes across early adolescence, whereas girls tend to express themselves

verbally using more complex words than boys, irrespective of their age and pubertal status, a

finding that must be confirmed in future investigations.

Implications and applications of the results

The practical implications of the results of the present study in respect of furthering

the understanding of the development of expressive oral language in early adolescence are:

(a) fairy-tale retell is a useful type of narrative for this purpose that, unlike most language

assessment tools, is not copyrighted and is much less time consuming to carry out; (b) from

the transcribed retold stories it is possible to automatically extract a great number of metrics

that indicate expressive oral abilities in various languages (English, Portuguese, Spanish)

using free computer software such as Coh-Metrix and NILC-Metrix that avoids errors and

time-consuming subjective manual assessments of narratives, making the results more

reliable (see Graesser et al., 2011; Miller et al., 2016) and easier to replicate (Heilmann &

Miller, 2010; Treviso et al., 2018); (c) it is feasible and simple to use various language metrics

grouped into language categories that correspond to the use of language at the word, sentence

and inter-sentence levels, which facilitates comparisons across studies, especially those that

analyze a limited number of different metrics; (d) all main category levels of oral language

skills seem to improve with age in early adolescence, even when a simple narrative is

involved; (e) clear developmental effects can be found according to chronological age

(preferably measured in months), pubertal status or school grade; the two latter factors may

have differential effects if samples are very large, but these are probably of little practical

relevance; (f) since preposition diversity was the metric that presented the largest

effect size for all developmental measures, it might be interesting to investigate this

improvement further in future studies; this seems to be the case for the Portuguese language,

but other languages might differ in this respect, which should also be further explored;

(g) girls have better performance at the expressive (but not receptive) lexical level and this

does not seem to be related to the fact that their pubertal transition takes place earlier than

that of boys, a finding that indicates that fairy-tale retell might be a means of picking up

possible culturally gender-driven female lexical advantages; (h) in studies with early

adolescents, it may be important to consider the fairy tale to be told regarding gender roles

(see Maccoby, 2002) as those who chose to tell the Jack and the Beanstalk story were

disproportionately male; (i) although the lack of parental schooling effects at the lexical and

syntactic levels might seem a limitation, it can be a great advantage for teasing apart

language/developmental difficulties from inadequate schooling in individuals from families

with low SES, while still showing these effects on cohesion; (j) samples such as the ones

obtained here can be used as described by Miller et al. (2016) to identify Brazilian Portuguese

speaking adolescents who have expressive language difficulties (e.g., specific language

impairment, attention deficits, autism spectrum condition). To this end, the scores of these

individuals in the various elements obtained from transcribed orally retold stories can be

compared to those of a pool of peers from our sample (same age or grade, irrespective of sex)

to determine whether their scores are significantly below (in standard deviations) those found

for typically developing individuals. Miller et al. (2016), however, did so considering only a

few metrics (i.e., number of words, utterances, ratio of the total number of clauses to the total

number of communication units). The categories we proposed here increase the possibilities

of identifying different types of, or more extensive, language impairment; and, finally, (k)

future research might benefit from studying how the expressive oral use of language, as

analyzed here, is associated with other cognitive abilities and functions, such as declarative

and procedural memory (Ullman, 2004), social cognition and executive functions (see

Matthews, Biney, & Abbot-Smith, 2018).
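
The peer comparison described in item (j) can be sketched as follows. This is a minimal illustration in Python, not the procedure of Miller et al. (2016) nor the analysis used in the present study: the metric names, the peer values and the cutoff of -1.5 standard deviations are hypothetical assumptions chosen only for the example.

```python
from statistics import mean, stdev


def z_scores_against_peers(individual, peers):
    """Express each metric of one individual as a z-score relative to a peer pool.

    individual: dict mapping metric name -> score
    peers: list of dicts with the same metric names (e.g., same age or grade)
    """
    z = {}
    for metric, score in individual.items():
        peer_scores = [peer[metric] for peer in peers]
        sd = stdev(peer_scores)  # sample standard deviation of the peer pool
        # Guard against a peer pool with no variability on this metric
        z[metric] = (score - mean(peer_scores)) / sd if sd > 0 else 0.0
    return z


def metrics_below_cutoff(z, cutoff=-1.5):
    """List metrics on which the individual falls at or below the cutoff (assumed value)."""
    return sorted(name for name, value in z.items() if value <= cutoff)


# Hypothetical peer pool and individual; all values are made up for illustration.
peers = [
    {"words_per_sentence": 8, "distinct_prepositions": 4},
    {"words_per_sentence": 10, "distinct_prepositions": 6},
    {"words_per_sentence": 12, "distinct_prepositions": 8},
]
individual = {"words_per_sentence": 6, "distinct_prepositions": 6}

z = z_scores_against_peers(individual, peers)
low = metrics_below_cutoff(z)
print(z, low)
```

In practice the peer pool would be restricted to participants of the same age or grade, and the flagged metrics could then be summarized by category level (word, sentence, inter-sentence) to indicate the type or extent of difficulty.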

Study limitations

The limitations of our study are as follows. Firstly, in order to tell a fairy tale,

individuals, including adolescents, do not have to use all of their linguistic abilities due to the

simplicity of the story plot and the limited number of characters and relevant events.

Therefore, it is possible that different effects would have been found had we used a

different type of oral task, such as a more complex narrative theme, an argumentative or

expository discourse (see Nippold et al., 2005), stream of consciousness (Newman et al.,

2008), or even written texts, which recruit a greater use of formal/complex lexical knowledge

and syntactic structures (Bourdin & Fayol, 1994). These alternative language tasks can,

however, introduce analytic difficulties such as wide inter-age and interindividual variations

in the structure and depth of content of the narratives, problems that we wanted to avoid. We

also did not assess utterances, story grammar, qualitative aspects of the retold story,

disfluencies, factors such as the use of proverbs, metaphors and other non-literal uses of

language, nor metalinguistic awareness, or pragmatic abilities (Matthews et al., 2018), all of

which are known to also improve at the tested age (see Berman, 2007) and which could have

been differently sensitive to developmental, sex and SES effects. Pubertal status was assessed

by self-report using the PDS scale so it is possible that higher effects would have been found

if gold-standard Tanner ratings evaluated by clinicians or measurements of sex hormone

concentrations had been used (see Dorn et al., 2006). Furthermore, our selection and

reduction of all 200 primary metrics was subjective (even though based on previous studies),

meaning that different researchers could have selected a different set of linguistic elements

and organized them in a distinct manner, which could lead to other results. Notwithstanding,

the selection was made by professionals from various fields related to language development

(cognitive psychology, speech and language therapy, linguistics and natural language

processing) and data were analyzed only after confirming the appropriateness of the selection

with their pattern of inter-correlations. A further possible limitation was the fact that we only

worked with the set of metrics returned by NILC-Metrix, which lacks, for instance, measures of

cohesion that are not of the referential and semantic types (see Graesser et al., 2004, 2011).

Additionally, the language spoken by adolescents likely influences the development of the

language elements, so cross-language studies are needed to determine universal versus

language-specific spoken narrative abilities in adolescents. Future studies must explore this and

the usefulness of our approach for other types of populations (e.g., different ages, clinical

groups). Lastly, it was not possible to determine sample size a priori due to the innovative

and exploratory approach that was used, so it cannot be excluded that more substantial gender

and SES effects could have been found if the sample had been larger. However, we believe

this was not a shortcoming of this study because the sample was large enough to pick up

developmental effects (medium to large effect sizes) and none of our models had fewer than

10 participants per metric, which is a rough estimate of the minimum number per variable in

many types of exploratory analyses such as ours (e.g., Osborne & Fitzpatrick, 2012;

VanVoorhis & Morgan, 2007).
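
The rule of thumb mentioned above reduces to simple arithmetic, sketched below in Python. The sample size of 257 is the full sample reported in the abstract (individual models may have included fewer participants); the number of metrics passed to the function is a hypothetical value used only for illustration.

```python
def participants_per_metric(n_participants, n_metrics):
    """Ratio of participants to metrics entered in one multivariate model."""
    if n_metrics <= 0:
        raise ValueError("n_metrics must be positive")
    return n_participants / n_metrics


def max_metrics(n_participants, min_per_metric=10):
    """Largest number of metrics that still respects the minimum ratio."""
    return n_participants // min_per_metric


N = 257              # full sample size reported in this study
HYPOTHETICAL_K = 20  # illustrative number of metrics in one model (assumption)

ratio = participants_per_metric(N, HYPOTHETICAL_K)
limit = max_metrics(N)
print(f"{ratio:.2f} participants per metric; at most {limit} metrics at 10 per metric")
```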

Conclusion

Using various automatically generated language elements from transcribed samples of

an oral fairy-tale retelling task, combined at the multivariate level considering three types of

language skills (lexical, grammatical and cohesion), we found that: (a) it is possible to clearly

show that as early adolescents age, their vocabulary and their ability to build and interrelate

sentences increase, and that this is not specific to their pubertal status or academic level; (b)

sex effects can be found, but only at the lexical level, and it is likely that these are not

biologically driven; and (c) parental schooling affects only cohesion.

Acknowledgements

The authors would like to thank the researchers who aided in the data collection, retell

transcription and analyses, all the families that participated in the study, as well as all school

staff who took part in the recruitment process.

Funding

This work was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo

(FAPESP, grant numbers 2016/14750-0, 2020/01091-3); Coordenação de Aperfeiçoamento

de Pessoal de Nível Superior (CAPES, finance code 001); Conselho Nacional de

Desenvolvimento Científico e Tecnológico (CNPq, fellowship awarded to author SP:

301899/2019-3); and Associação Fundo de Incentivo à Pesquisa (AFIP).

REFERENCES

Allen, P., & Bennett, K. (2008). SPSS for the health & behavioural sciences. Thomson.

Allen, P., Bennett, K., & Heritage, B. (2014). SPSS statistics version 22: A practical guide.

Cengage Learning Australia.

Alt, M., Arizmendi, G. D., & DiLallo, J. N. (2016). The role of socioeconomic status in the

narrative story retells of school-aged English language learners. Language, speech,

and hearing services in schools, 47(4), 313-323.

Ardila, A., & Rosselli, M. (1996). Spontaneous language production and aging: sex and

educational effects. International Journal of Neuroscience, 87(1-2), 71-78.

Barker, M. S., Nelson, N. L., & Robinson, G. A. (2020). Idea formulation for spoken

language production: the interface of cognition and language. Journal of the

International Neuropsychological Society, 26(2), 226-240.

Berman, R. A. (2007). Developing linguistic knowledge and language use across

adolescence. In E. Hoff & M. Shatz (Eds.), Blackwell handbook of language

development (pp. 347–367). Blackwell Publishing.

Black, M. M., Walker, S. P., Fernald, L. C., Andersen, C. T., DiGirolamo, A. M., Lu, C., ... &

Lancet Early Childhood Development Series Steering Committee. (2017). Early

childhood development coming of age: science through the life course. The

Lancet, 389(10064), 77-90.

Bourdin, B., & Fayol, M. (1994). Is written language production more difficult than oral

language production? A working memory approach. International journal of

psychology, 29(5), 591-620.

Byrd, C. T., Logan, K. J., & Gillam, R. B. (2012). Speech disfluency in school-age children’s

conversational and narrative discourse. Language Speech and Hearing Services in

Schools, 43(2), 153-163.

Cannizzaro, M. S., & Coelho, C. A. (2013). Analysis of narrative discourse structure as an

ecologically relevant measure of executive function in adults. Journal of

psycholinguistic research, 42(6), 527-549.

Carskadon, M. A., & Acebo, C. (1993). A self-administered rating scale for pubertal

development. Journal of Adolescent Health, 14(3), 190-195.

Ciccia, A. H., & Turkstra, L. S. (2002). Cohesion, communication burden, and response

adequacy in adolescent conversations. International Journal of Speech-Language

Pathology, 4(1), 1–8.

Crossley, S. A., Kyle, K., & Dascalu, M. (2019). The Tool for the Automatic Analysis of

Cohesion 2.0: Integrating semantic similarity and text overlap. Behavior research

methods, 51(1), 14-27.

Dorn, L. D., Dahl, R. E., Woodward, H. R., & Biro, F. (2006). Defining the boundaries of

early adolescence: A user's guide to assessing pubertal status and pubertal timing in

research with adolescents. Applied Developmental Science, 10(1), 30-56.

Farah, M. J. (2017). The neuroscience of socioeconomic status: correlates, causes, and

consequences. Neuron, 96(1), 56-71.

Fraser, K. C., Meltzer, J. A., & Rudzicz, F. (2015). Linguistic features differentiate

Alzheimer’s from controls in narrative speech. Journal of Alzheimer’s Disease, 49(2),

407-422.

Frizelle, P., Thompson, P. A., McDonald, D., & Bishop, D. V. (2018). Growth in syntactic

complexity between four years and adulthood: evidence from a narrative task. Journal

of Child Language, 45(5), 1174-1197.

Gago, P. C. (2002). Questões de transcrição em análise da conversa. Veredas-Revista de

Estudos Linguísticos, 6(2), 89-113.

Gardner-Neblett, N., & Iruka, I. U. (2015). Oral narrative skills: Explaining the

language-emergent literacy link by race/ethnicity and SES. Developmental

psychology, 51(7), 889-904.

Gazzola, M., Leal, S., Pedroni, B., Theoto Rocha, F., Pompéia, S., & Aluísio, S. (2022). Text

complexity of open educational resources in Portuguese: mixing written and spoken

registers in a multi-task approach. Language Resources and Evaluation, 56, 621-650.

Gillam, R. B., & Pearson, N. A. (2004). TNL: Test of narrative language. Austin, TX: Pro-ed.

Goddings, A. L., Beltz, A., Peper, J. S., Crone, E. A., & Braams, B. R. (2019). Understanding

the role of puberty in structural and functional development of the adolescent brain.

Journal of Research on Adolescence, 29(1), 32-53.

Goodglass, H., Kaplan, E., & Weintraub, S. (2001). BDAE: The Boston Diagnostic Aphasia

Examination. Philadelphia, PA: Lippincott Williams & Wilkins.

Gozdas, E., Holland, S. K., Altaye, M., & CMIND Authorship Consortium. (2019).

Developmental changes in functional brain networks from birth through

adolescence. Human brain mapping, 40(5), 1434-1444.

Graesser, A. C., & McNamara, D. S. (2011). Computational analyses of multilevel discourse

comprehension. Topics in cognitive science, 3(2), 371-398.

Graesser, A. C., McNamara, D. S., Cai, Z., Conley, M., Li, H., & Pennebaker, J. (2014).

Coh-Metrix measures text characteristics at multiple levels of language and discourse.

The Elementary School Journal, 115(2), 210-229.

Graesser, A. C., McNamara, D. S., & Kulikowich, J. M. (2011). Coh-Metrix: Providing

multilevel analyses of text characteristics. Educational researcher, 40(5), 223-234.

Graesser, A. C., McNamara, D. S., Louwerse, M. M., & Cai, Z. (2004). Coh-Metrix: Analysis

of text on cohesion and language. Behavior research methods, instruments, &

computers, 36(2), 193-202.

Grice, J. W., & Iwasaki, M. (2008). A Truly Multivariate Approach to MANOVA. Applied

Multivariate Research, 12(3), 199-226.

Hammill, D. D., & Newcomer, P. L. (2020). TOLD-I: 5: Test of Language Development.

Intermediate. Pro-ed.

Heck, V. S., Yates, D. B., Poggere, L. C., Tosi, S. D., Bandeira, D. R., & Trentini, C. (2009).

Validação dos subtestes verbais da versão de adaptação da WASI. Avaliação

Psicológica: Interamerican Journal of Psychological Assessment, 8(1), 33-42.

Heilmann, J. J., Miller, J. F., & Nockerts, A. (2010). Using language sample databases.

Language, Speech and Hearing Services in Schools, 41(1), 84-95.

Heilmann, J., Nockerts, A., & Miller, J. F. (2010). Language sampling: Does the length of the

transcript matter? Language, Speech and Hearing Services in Schools, 41(4),

393-404.

Hill, E., Claessen, M., Whitworth, A., & Boyes, M. (2021). Profiling variability and

development of spoken discourse in mainstream adolescents. Clinical linguistics &

phonetics, 35(2), 117-137.

Jacobsen, G. M., Moraes, A. L., Wagner, F., & Trentini, C. M. (2013). Qual é a participação

de fatores socioeconômicos na inteligência de crianças? Revista Neuropsicologia

Latinoamericana, 5(4), 32-39.

Johnston, J. (2006). Thinking about child language: Research to practice. Thinking

Publications.

Kovács, I., Kovács, K., Gerván, P., Utczás, K., Oláh, G., Tróznai, Z., ... & Gombos, F. (2022).

Ultrasonic bone age fractionates cognitive abilities in adolescence. Scientific reports,

12(1), 1-14.

Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a

practical primer for t-tests and ANOVAs. Frontiers in psychology, 4, 863.

Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic

analysis theory of acquisition, induction, and representation of knowledge.

Psychological review, 104(2), 211.

Leal, S. E., Duran, M. S., Scarton, C. E., Hartmann, N. S., & Aluísio, S. M. (2021).

NILC-Metrix: assessing the complexity of written and spoken language in Brazilian

Portuguese. arXiv preprint arXiv:2201.03445.

Maccoby, E. E. (2002). Gender and group process: A developmental perspective. Current

directions in psychological science, 11(2), 54-58.

Marcuschi, L. A. (1986). Análise da conversação. Série Princípios. São Paulo, Brazil: Ática.

Martins, I. P., Vieira, R., Loureiro, C., & Santos, M. E. (2007). Speech rate and fluency in

children and adolescents. Child Neuropsychology, 13(4), 319-332.

Matthews, D., Biney, H., & Abbot-Smith, K. (2018). Individual differences in children’s

pragmatic ability: A review of associations with formal language, social cognition,

and executive functions. Language Learning and Development, 14(3), 186-223.

McNamara, D. S., Louwerse, M. M., McCarthy, P. M., & Graesser, A. C. (2010).

Coh-Metrix: Capturing linguistic features of cohesion. Discourse Processes, 47(4),

292-330.

Miller, J. F., Andriacchi, K., & Nockerts, A. (2016). Using language sample analysis to

assess spoken language production in adolescents. Language Speech and Hearing

Services in Schools, 47(2), 99–112.

Newman, M. L., Groom, C. J., Handelman, L. D., & Pennebaker, J. W. (2008). Gender

differences in language use: An analysis of 14,000 text samples. Discourse processes,

45(3), 211-236.

Nippold, M. A. (1993). Developmental markers in adolescent language: Syntax, semantics,

and pragmatics. Language, Speech, and Hearing Services in Schools, 24(1), 21-28.

Nippold, M. A. (2000). Language development during the adolescent years: Aspects of

pragmatics, syntax, and semantics. Topics in language disorders, 20(2), 15-28.

Nippold, M. A. (2004). Research on later language development: International perspectives.

Language development across childhood and adolescence, 3, 1-8.

Nippold, M. A., Frantz-Kaspar, M. W., Cramond, P. M., Kirk, C., Hayward-Mayhew, C., &

MacKinnon, M. (2014). Conversational and narrative speaking in adolescents:

Examining the use of complex syntax. Journal of Speech, Language, and Hearing

Research, 53(3), 876-886.

Nippold, M., Hesketh, L., Duthie, J., & Mansfield, T. (2005). Conversational versus

expository discourse: A study of syntactic development in children, adolescents, and

adults. Journal of Speech, Language, and Hearing Research, 48, 1048–1064.

Nippold, M. A., Schwarz, I. E., & Undlin, R. A. (1992). Use and understanding of adverbial

conjuncts: A developmental study of adolescents and young adults. Journal of

Speech, Language, and Hearing Research, 35(1), 108-118.

Nippold, M. A., Vigeland, L. M., Frantz-Kaspar, M. W., & Ward-Lonergan, J. M. (2017).

Language sampling with adolescents: Building a normative database with fables.

American Journal of Speech-Language Pathology, 26(3), 908-920.

Osborne, J. W., & Fitzpatrick, D. C. (2012). Replication analysis in exploratory factor

analysis: What it is and why it makes your analysis better. Practical assessment,

research, and evaluation, 17(1), 15.

Pakhomov, S., Chacon, D., Wicklund, M., & Gundel, J. (2011). Computerized assessment of

syntactic complexity in Alzheimer’s disease: A case study of Iris Murdoch’s writing.

Behavior research methods, 43(1), 136-144.

Paus, T. (2005). Mapping brain maturation and cognitive development during adolescence.

Trends in cognitive sciences, 9(2), 60-68.

Peña, P. A. (2020). Relative age and investment in human capital. Economics of Education

Review, 78, 102039.

Petersen, D. B., Gillam, S. L., & Gillam, R. B. (2008). Emerging procedures in narrative

assessment: The index of narrative complexity. Topics in language disorders, 28(2),

115-130.

Pompéia, S., Zanini, G. D. A. V., Freitas, R. S. D., Inacio, L. M. C., Silva, F. C. D., Souza, G.

R. D., ... & Cogo-Moreira, H. (2019). Adapted version of the Pubertal Development

Scale for use in Brazil. Revista de Saude Publica, 53(56), 1-12.

Quispesaravia, A., Perez, W., Cabezudo, M. S., & Alva-Manchego, F. (2016, May).

Coh-Metrix-Esp: A complexity analysis tool for documents written in Spanish. In

Proceedings of the Tenth International Conference on Language Resources and

Evaluation (LREC'16) (pp. 4694-4698).

Reynolds, S. A., Andersen, C., Behrman, J., Singh, A., Stein, A. D., Benny, L., ... & Fernald,

L. C. (2017). Disparities in children’s vocabulary and height in relation to household

wealth and parental schooling: A longitudinal study in four low-and middle-income

countries. SSM-Population Health, 3, 767-786.

Richardson, J. T. (2011). Eta squared and partial eta squared as measures of effect size in

educational research. Educational research review, 6(2), 135-147.

Ricketts, J., Lervåg, A., Dawson, N., Taylor, L. A., & Hulme, C. (2020). Reading and oral

vocabulary development in early adolescence. Scientific Studies of Reading, 24(5),

380-396.

Roark, B., Mitchell, M., & Hollingshead, K. (2007). Syntactic complexity measures for

detecting mild cognitive impairment. In Biological, translational, and clinical

language processing (pp. 1-8).

Sakai, K. L. (2005). Language acquisition and brain development. Science, 310(5749),

815-819.

dos Santos, L. B., Duran, M. S., Hartmann, N. S., Candido, A., Paetzold, G. H., & Aluisio, S.

M. (2017). A lightweight regression method to infer psycholinguistic properties for

Brazilian Portuguese. In International conference on text, speech, and dialogue (pp.

281-289). Springer, Cham.

Santos, R. L. S., Wick-Pedro, G., Leal, S., Vale, O. A., Pardo, T. A. S., Bontcheva, K., &

Scarton, C. (2020). Measuring the impact of readability features in fake news

detection. In Proceedings of the 12th language resources and evaluation conference

(pp. 1404-1413).

Schneider, P., & Dubé, R. V. (2005). Story presentation effects on children’s retell content.

American Journal of Speech-Language Pathology, 14(1), 52-60.

Schneider, P., Hayward, D., & Dubé, R. V. (2006). Storytelling from pictures using the

Edmonton narrative norms instrument. Journal of speech language pathology and

audiology, 30(4), 224-238.

Snowling, M. J., & Hulme, C. (2020). Annual Research Review: Reading disorders

revisited–the critical importance of oral language. Journal of Child Psychology and

Psychiatry, 62(5), 635-653.

Spencer, S., Clegg, J., Stackhouse, J., & Rush, R. (2017). Contribution of spoken language

and socio‐economic background to adolescents’ educational achievement at age 16

years. International Journal of Language & Communication Disorders, 52(2),

184-196.

Spencer, S., Clegg, J., & Stackhouse, J. (2012). Language and disadvantage: a comparison of

the language abilities of adolescents from two different socioeconomic areas.

International Journal of Language & Communication Disorders, 47(3), 274-284.

Toivainen, T., Papageorgiou, K. A., Tosto, M. G., & Kovas, Y. (2017). Sex differences in

non-verbal and verbal abilities in childhood and adolescence. Intelligence, 64, 81-88.

Tomblin, J. B., & Nippold, M. A. (Eds.). (2014). Understanding individual differences in

language development across the school years. New York, USA: Psychology Press.

Torvik, F. A., Flatø, M., McAdams, T. A., Colman, I., Silventoinen, K., & Stoltenberg, C.

(2021). Early Puberty Is Associated With Higher Academic Achievement in Boys and

Girls and Partially Explains Academic Sex Differences. Journal of Adolescent Health,

69, 503-510.

Treviso, M. V., Santos, L. B. D., Shulby, C., Hübner, L. C., Mansur, L. L., & Aluísio, S. M.

(2018). Detecting mild cognitive impairment in narratives in Brazilian Portuguese:

first steps towards a fully automated system. Letras de Hoje, 53, 48-58.

Ullman, M. T. (2004). Contributions of memory circuits to language: The

declarative/procedural model. Cognition, 92(1-2), 231-270.

VanVoorhis, C. W., & Morgan, B. L. (2007). Understanding power and rules of thumb for

determining sample sizes. Tutorials in quantitative methods for psychology, 3(2),

43-50.

Vijayakumar, N., Youssef, G. J., Allen, N. B., Anderson, V., Efron, D., Hazell, P., ... & Silk,

T. (2021). A longitudinal analysis of puberty-related cortical development.

Neuroimage, 228, 117684.

Von Stumm, S., & Plomin, R. (2015). Socioeconomic status and the growth of intelligence

from infancy through adolescence. Intelligence, 48, 30-36.

Wechsler, D. (1999). Wechsler Abbreviated Scale of Intelligence. San Antonio, USA: The

Psychological Corporation.

Westerveld, M. (2011). Sampling and analysis of children’s spontaneous language. Acquiring

Knowledge in Speech, Language and Hearing, 13(2), 63-67.

Westerveld, M. F., & Vidler, K. (2016). Spoken language samples of Australian children in

conversation, narration and exposition. International Journal of Speech-Language

Pathology, 18(3), 288-298.

Xia, X. (2013). Gender differences in using language. Theory and practice in language

studies, 3(8), 1485-1489.

Yates, D. B., Trentini, C. M., Tosi, S. D., Corrêa, S. K., Poggere, L. C., & Valli, F. (2006).

Apresentação da escala de inteligência Wechsler abreviada (WASI). Avaliação

Psicológica, 5(2), 227-233.

Yngve, V. H. (1960). A model and an hypothesis for language structure. Proceedings of the

American philosophical society, 104(5), 444-466.