A Comparative Study of Syntactic Complexity in the English Writing of Chinese English Majors and Native-Speaker Students

 2022-04-07 20:37:57

Total thesis length: 63,544 characters

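Abstract

In the field of second language (L2) writing, syntactic complexity is an important research construct, and its relationship with L2 learners’ writing quality, language proficiency and language development is the focus of many L2 writing studies. Research on syntactic complexity first emerged in Western countries in the middle and later decades of the last century. Over the past few decades, scholars in the field have continued to deepen their understanding of syntactic complexity and its measurement.

Scholars in China have now recognized the importance of using syntactic complexity to evaluate students’ English writing. However, existing studies that take English majors as their subjects are clearly insufficient. English majors are generally regarded as a representative group of English learners in China, and studying the syntactic complexity of their writing can reflect the quality and effectiveness of English teaching in China. Moreover, few existing studies use the level of native-speaker students as an evaluation baseline, which leaves a research gap and inspired the approach of this study.

This thesis explores the syntactic complexity of the English writing of Chinese English majors and native-speaker students. The author drew 240 argumentative essays from the Ten-thousand English Compositions of Chinese Learners corpus (V1.1) (Xu 2015) and the LOCNESS corpus (Granger 1996), and used the 14 measures of the L2 Syntactic Complexity Analyzer to examine the writing features of upper-year and lower-year Chinese English majors and of native-speaker students, and whether their syntactic complexity differs in each dimension.

After the data were processed, organized and compared in SPSS 22.0, the study found that the English writing of Chinese English majors and that of native-speaker students each have distinctive features, and that significant differences exist in every dimension. Compared with native-speaker students, Chinese English majors produced fewer subordinate structures, coordinate phrases, T-units, verb phrases, complex nominals and clauses. At the same time, the non-significant differences found on some measures indicate that the syntactic complexity of the two groups’ writing is similar to a certain degree. In addition, the study shows that, on most measures, the syntactic complexity of upper-year English majors’ writing is closer to the native-speaker level than that of lower-year students.

This study reveals the actual gap in syntactic complexity between the English writing of Chinese English majors and that of native-speaker students. It aims to clarify the current state of English majors’ writing in China and to prompt attention and reflection in teaching. Its results and analysis also offer a point of reference for future studies.

Keywords: syntactic complexity; Chinese English majors; second language writing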

Table of Contents

Acknowledgements

Abstract

Abstract (Chinese)

Table of Contents

List of Tables

Chapter One Introduction

1.1 Background of the Study

1.2 Purpose and Significance of the Study

1.3 Thesis Arrangement

Chapter Two Literature Review

2.1 Rationale for Research on Syntactic Complexity

2.1.1 Syntactic Complexity and Measures

2.1.2 Definition of Terms

2.2 Previous Works on Syntactic Complexity

2.2.1 Western Studies

2.2.2 Domestic Studies

Chapter Three Methodology

3.1 Research Questions

3.2 Data Source

3.3 Data Collection

3.4 Instrument for Data Analysis

3.5 Research Procedure

Chapter Four Results and Discussion

4.1 Results and Discussion Related to Question 1

4.1.1 Length of Production Unit

4.1.2 Amount of Subordination

4.1.3 Amount of Coordination

4.1.4 Degree of Phrasal Sophistication

4.1.5 Overall Sentence Complexity

4.2 Results and Discussion Related to Question 2

4.2.1 Length of Production Unit

4.2.2 Amount of Subordination

4.2.3 Amount of Coordination

4.2.4 Degree of Phrasal Sophistication

4.2.5 Overall Sentence Complexity

Chapter Five Conclusions

5.1 Major Findings and Implications

5.2 Limitations of the Research and Suggestions for Further Studies

References

Appendix Results of the Bonferroni Test

List of Tables

Table 1: Summary of samples

Table 2: Syntactic complexity measures used

Table 3: Mean complexity values for the English major groups

Table 4: Mean values and standard deviations of 14 measures of the English major and NS groups

Table 5: Differences in mean values among the groups

Chapter One Introduction

1.1 Background of the Study

Research on syntactic complexity emerged in the 1960s. As Swain (1985) observed, language output should be part of the language learning process rather than a mere outcome of it. Studies exploring how sophisticated and varied the production units in second language learners’ output are have shown that syntactic complexity, the degree of sophistication of the syntactic structures surfacing in language production, is an essential measure of learners’ writing proficiency (Hunt 1965; Ortega 2003; Crossley & McNamara 2014).

Compared with the fruitful research abroad, domestic research emerged only in the late 1990s and remains relatively scarce. Most of it treats Chinese learners of English as a single group and does not divide them into subgroups. However, as the number of papers that English majors publish in international journals increases and the proportion of students going abroad for further study grows, there is a need to investigate the syntactic complexity of English majors’ writing and to adjust pedagogical plans accordingly.

1.2 Purpose and Significance of the Study

In general, the study aims to enrich second language acquisition research, both theoretically and pedagogically, by investigating the writing features of Chinese English majors and of native-speaker students and comparing the two.

On the one hand, it encourages a deeper understanding of Chinese English majors’ output and applies measures shown to be effective in previous research to evaluate that output. Moreover, since most previous domestic studies treat Chinese learners of English as a foreign language (EFL) as a single group, this study uses quantitative methods to investigate the syntactic complexity of Chinese English majors specifically. More significantly, since the native-speaker (NS) baseline is a neglected dimension in the assessment of writing by non-native speakers (NNS), this research fills the gap by comparing the performance of Chinese English majors with that of NS students, helping readers understand in what ways English majors’ performance approximates or deviates from that of NS students. On the other hand, the study provides information that teachers and course designers can use to devise effective teaching strategies and appropriate pedagogical interventions for existing problems. It also lays a foundation for further studies.

1.3 Thesis Arrangement

The thesis contains five chapters and is organized as follows.

Chapter 1, the general introduction, presents the background and significance of the study as well as the arrangement of the thesis.

Chapter 2 discusses syntactic complexity and its measures and explains some key terms used in the research. It also reviews relevant studies at home and abroad.

Chapter 3 introduces the research questions and the two corpora from which the samples are drawn. The research procedure and instrument are also described in this chapter.

Chapter 4 presents the statistical results and discusses the research findings.

Chapter 5 summarizes the findings and concludes with potential pedagogical implications. It also explains the limitations of the research and offers suggestions for further study.

Chapter Two Literature Review

2.1 Rationale for Research on Syntactic Complexity

Syntactic complexity, a significant construct in the study of second language (L2) writing, is used to gauge how sophisticated and varied production units are (Wolfe-Quintero et al. 1998). Research on it first flourished in Western countries in the 1960s, and researchers in the field have continued to deepen their understanding of syntactic complexity and its measurement.

2.1.1 Syntactic Complexity and Measures

Since the 1970s, syntactic complexity has been acknowledged as a standard criterion in L2 teaching and research, as the growth of a learner’s syntactic repertoire is an integral part of his or her development in the target language (Ortega 2003). Many scholars have offered definitions. Foster and Skehan (1996) defined grammatical complexity, which is primarily manifested in grammatical variation and sophistication, as “progressively more elaborate language” and “a greater variety of syntactic patterning” (1996: 303). Gaies (1980) regarded syntactic complexity as the ability to compress many ideas or pieces of information into concise expressions. Wolfe-Quintero et al. (1998) held that at a higher level of syntactic complexity a wide range of both basic and sophisticated structures is available and readily accessed, whereas at a lower level only a narrow range of simple structures is available. Drawing on these accounts, Lu (2010) concluded that syntactic complexity, as applied to L2 writing, describes how varied and sophisticated the grammatical structures or production units are.

With regard to measures, the absence of a suitable yardstick has long prompted calls for measures of learners’ syntactic complexity that reflect their language proficiency (Larsen-Freeman 1978). Ortega (2000) argued that a variety of metrics allows scholars to examine development in language acquisition more closely, to compare different groups of learners, and to measure the impact of an experimental intervention on linguistic output. Building on earlier findings, Norris and Ortega (2009) proposed analyzing syntactic complexity along several dimensions: overall complexity, complexity by subordination, and subclausal or phrasal elaboration (Norris and Ortega 2009: 561). Lu (2010) drew on the earlier research and put forward four domains of syntactic complexity: 1) length of production unit, 2) amount of subordination, 3) amount of coordination and 4) phrasal sophistication.

2.1.2 Definition of Terms

Some terms employed in this research require detailed explanations as follows:

T-unit

The T-unit was introduced by Hunt in 1965 as a new production unit. Its length is considered an effective index of syntactic complexity, since “the unit has the advantage of preserving all the subordination achieved and all the coordination between words and phrases and subordinate clauses” (Hunt 1965: 37). More specifically, Hunt defines the T-unit, or minimal terminable unit, as “one main clause plus the subordinate clauses attached to or embedded within it” (Hunt 1965: 49). For instance, “I know a girl and she has purple hair” can be segmented into two T-units, since the compound sentence has two independent clauses, whereas “I know a girl with purple hair” consists of only one T-unit.

Although the T-unit is regarded as a better measure than others, it has been criticized because it segments coordinated clauses into separate T-units in ways that may be inappropriate (Wolfe-Quintero et al. 1998). To avoid this kind of misjudgment, this research does not rely on the T-unit alone; T-unit-based measures form only part of the measurement.
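To make the segmentation concrete, the following minimal Python sketch computes the mean length of T-unit (words per T-unit) for the two example sentences given in the definition above. The T-unit boundaries are entered by hand and the function name `mean_length_of_t_unit` is hypothetical; automatic segmentation would require a syntactic parser and is not attempted here.

```python
def mean_length_of_t_unit(t_units):
    """Return the average number of words per T-unit."""
    word_counts = [len(unit.split()) for unit in t_units]
    return sum(word_counts) / len(word_counts)


# Hand-segmented T-units (an assumption: a parser would normally do this).
compound = ["I know a girl", "and she has purple hair"]  # two T-units
simple = ["I know a girl with purple hair"]              # one T-unit

mlt_compound = mean_length_of_t_unit(compound)
mlt_simple = mean_length_of_t_unit(simple)
print(mlt_compound, mlt_simple)  # 4.5 7.0
```

The compound sentence has more words overall but a shorter mean T-unit, whereas the second sentence packs the same information into a single, longer T-unit through a prepositional phrase.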

Complex nominals

Complex nominals can be classified into three types: 1) nouns plus an adjective, prepositional phrase, possessive, participle, relative (adjective) clause or appositive; 2) nominal clauses; 3) gerunds and infinitives in subject, but not object, position (Cooper 1976: 180; Lu 2011: 44-45). Examples of each type are listed below:

1) Noun plus adjective: e.g., the small dog

2) Noun plus prepositional phrase: e.g., the pig under the desk; the apple on the chair

3) Noun plus possessive: e.g., her coat; the woman’s coat

4) Noun plus participle: e.g., the crying baby

5) Noun plus relative clause: e.g., the woman who sang the songs

6) Noun plus appositive: e.g., Alexander the king

7) Nominal clause: e.g., she thought that she encountered a thief; he believes that he was innocent

8) Infinitives and gerunds in subject (not object) position: e.g., driving a car is good for the brain.

2.2 Previous Works on Syntactic Complexity

Over the past several decades, both Western and domestic scholars have conducted meaningful research on syntactic complexity from various perspectives.

2.2.1 Western Studies

Most Western studies fall into three categories: (1) seeking the most systematic and effective ways of evaluating syntactic complexity; (2) exploring the relationship between syntactic complexity and language proficiency; (3) exploring the effects of different task-, learner- and context-based factors on syntactic complexity.

In terms of the most effective ways of evaluating syntactic complexity, researchers have generally come to realize that measurement should be multidimensional, with each dimension covered by more than one index. Wolfe-Quintero et al. (1998) tested more than 100 measures of accuracy, fluency and complexity and found that the best measures of fluency were error-free T-unit length and clause length. A decade later, Norris and Ortega (2009) used specific models to analyze sentences, clauses and phrases. They pointed out that multivariate analysis of measures is necessary and that each measure corresponds to a different dimension of acquisition. However, their results lacked strong empirical support. Following the multidimensional model, Lu (2010) designed a computational system named the L2 Syntactic Complexity Analyzer (L2SCA), whose 14 measures are divided into five dimensions: length of production unit, amount of coordination, amount of subordination, phrasal sophistication and overall sentence complexity. With the L2SCA, Lu (2011, 2015) carried out several corpus-based empirical studies and identified the measures that best reflect language proficiency.
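Since each of the 14 measures is a ratio between two counts, the five dimensions can be illustrated with a short Python sketch. The abbreviations and their grouping follow the measure set published in Lu (2010) as commonly cited; the `Counts` class, the `complexity_measures` function and every number below are hypothetical and are not part of L2SCA itself, which obtains the counts from parsed text.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Frequencies of production units and structures in one text."""
    words: int
    sentences: int
    clauses: int
    t_units: int
    dependent_clauses: int
    complex_t_units: int
    coordinate_phrases: int
    complex_nominals: int
    verb_phrases: int


def complexity_measures(c: Counts) -> dict:
    """Compute the 14 ratio measures, grouped by dimension."""
    return {
        "length of production unit": {
            "MLC": c.words / c.clauses,
            "MLS": c.words / c.sentences,
            "MLT": c.words / c.t_units,
        },
        "overall sentence complexity": {
            "C/S": c.clauses / c.sentences,
        },
        "amount of subordination": {
            "C/T": c.clauses / c.t_units,
            "CT/T": c.complex_t_units / c.t_units,
            "DC/C": c.dependent_clauses / c.clauses,
            "DC/T": c.dependent_clauses / c.t_units,
        },
        "amount of coordination": {
            "CP/C": c.coordinate_phrases / c.clauses,
            "CP/T": c.coordinate_phrases / c.t_units,
            "T/S": c.t_units / c.sentences,
        },
        "phrasal sophistication": {
            "CN/C": c.complex_nominals / c.clauses,
            "CN/T": c.complex_nominals / c.t_units,
            "VP/T": c.verb_phrases / c.t_units,
        },
    }


# Made-up counts for a single essay, for illustration only.
essay = Counts(words=500, sentences=25, clauses=50, t_units=30,
               dependent_clauses=20, complex_t_units=15,
               coordinate_phrases=10, complex_nominals=60, verb_phrases=70)
measures = complexity_measures(essay)
for dimension, values in measures.items():
    print(dimension, {name: round(v, 2) for name, v in values.items()})
```

Grouping the ratios by dimension makes it easy to see that a text can score high on one dimension (for example, long clauses) while scoring low on another (for example, few dependent clauses per T-unit).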

As to the relationship between syntactic complexity and language proficiency, relevant studies are broadly either longitudinal or cross-sectional. In longitudinal studies, researchers usually analyze changes in the writing of the same participants over a period of time. With another computational tool, Coh-Metrix, Crossley and McNamara (2014) analyzed 57 essays written by L2 learners over a one-semester writing course. The results indicated that the syntactic complexity measures that showed growth were not exactly the same as those that predicted judgements of proficiency. A possible explanation is that a short period of learning, such as one semester, does not guarantee significant syntactic development. In cross-sectional studies, researchers often target participants at different proficiency levels. Larsen-Freeman (1978) used the T-unit to analyze 212 compositions written by ESL students. The average length and the percentage of error-free T-units turned out to be the most suitable measures in that research. However, her judgements lacked detailed and objective standards, which might have influenced the outcome. Lu (2011) used the 14 syntactic complexity measures in L2SCA to investigate students’ writing in terms of amount of coordination, amount of subordination, length of production unit and degree of phrasal sophistication. Among the 14 measures, complex nominals per clause, mean length of T-unit and complex nominals per T-unit revealed significant differences between levels. Compared with previous studies, Lu collected a large number of samples and set up a multidimensional model. However, he did not control variables when collecting samples, which left gaps.

Other scholars explored the possible effects of different task-, learner- and context-based factors on syntactic complexity. Hinkel (2003) carried out a large-scale quantitative analysis of 1,083 texts written by NS graduate and NNS undergraduate international students, finding that NS students used a wider range of advanced constructions than NNS students. Ellis and Yuan (2004) studied how planning conditions might influence Chinese learners’ writing complexity and, based on the results, highlighted the negative effect of a lack of pre-planning on syntactic complexity. Beers and Nagy (2009), on the other hand, explored the role gender plays in syntactic complexity through T-unit analyses of essays written by males and females and did not find any significant differences.

2.2.2 Domestic Studies

Compared with the abundant studies abroad, domestic studies are relatively few. Like studies abroad, domestic studies of syntactic complexity can be divided into three categories: (1) identifying effective measures of syntactic complexity; (2) exploring the relationship between syntactic complexity and language proficiency; (3) exploring the effects of different task- and context-based factors on syntactic complexity.

To start with the measures, some Chinese scholars devoted themselves to identifying effective measures of syntactic complexity. Chen Huiyuan (2010) compared the length of T-units (W/T) and S-nodes per T-unit as measures of learners’ writing quality in terms of complexity, accuracy and fluency. The results showed that T-unit length, as the more sensitive index, reflected English proficiency more accurately. Nevertheless, Chen’s study could not give a full picture because of the limited number of measures. Zhao and Chen (2012) improved on Chen (2010) by applying more measures. They collected 1,041 sample essays from the Chinese Learners’ English corpus and analyzed dependent clauses, non-finite verbs and coordinate phrases per T-unit. These measures were confirmed to be valid for evaluating the syntactic complexity of L2 learners’ writing. Although the research described above helped establish the validity of some measures, it lacked a systematic and comprehensive view.

In addition, some Chinese scholars conducted empirical studies of the relationship between syntactic complexity and learners’ proficiency. Ji (2009) adopted the number of clauses per T-unit as a measure. The results showed no relationship between learners’ language proficiency and their syntactic complexity. Bao (2009) divided syntactic complexity into unit length and clause density and found that the length indices of learners’ writing grew faster than the density indices. Xu et al. (2013) found that unit length and density in students’ writing grew linearly with grade and language proficiency. In terms of sentence patterns, the use of simple sentences, adverbial clauses and object clauses decreased as grade increased, while the use of passive sentences increased with learners’ proficiency. Overall, these results suggest that the relationship with language proficiency may differ from one dimension of syntactic complexity to another. However, since existing research does not cover all dimensions of syntactic complexity, the relationship between syntactic complexity and learners’ proficiency is still not fully understood.

Finally, some Chinese scholars have attempted to explore the effects of such variables. Wang (2013) used T-unit-based measures to study the impact of task complexity on syntactic complexity among 118 Chinese university students and found no obvious impact. In general, domestic research on these variables is still lacking.

In general, most domestic studies examined fewer than five measures and lacked multidimensional measurement, so they could hardly give a complete picture. Moreover, most domestic scholars focused on subordination and length-based measures, overlooking the fact that phrasal sophistication and coordination are also meaningful components of syntactic complexity.

Having reviewed the existing research, I notice that few scholars have investigated the syntactic complexity of Chinese EFL learners, especially English majors, who are considered more sophisticated and proficient than non-English majors, and that the native-speaker (NS) baseline has been largely neglected in assessment. Although Lu (2011) compared syntactic complexity in NS and NNS college-level students’ writing drawn from the Written English Corpus of Chinese Learners version 1.0 (WECCL 1.0) and the Louvain Corpus of Native English Essays (LOCNESS), the limitations of his research leave gaps to be filled. Since WECCL 1.0 is a collection of texts written in the 1990s, it can hardly reveal the present situation of Chinese ESL learners. Moreover, task-related variables that should have been controlled may, to some extent, have influenced the earlier results and analyses. In this research, I use the Ten-thousand English Compositions of Chinese Learners (TECCL) corpus, version 1.1, which was updated in 2015, as the sample pool and strictly control the related variables.

Chapter Three Methodology

3.1 Research Questions

Drawing on data from TECCL 1.1 and LOCNESS, both introduced below, the present research investigates the writing features of Chinese English majors and of native-speaker students and compares how they differ in syntactic complexity in terms of: 1) length of production unit, 2) amount of subordination, 3) amount of coordination, 4) degree of phrasal sophistication, and 5) overall sentence complexity.

Specifically, the research intends to give answers to the following two questions:

1. Are there any significant differences between the writing of Chinese English majors and NS students in syntactic complexity? If there are, in which dimensions do they differ?

2. More specifically, are there any significant differences among the writing of Chinese English majors at different proficiency levels and NS students in syntactic complexity? If there are, in which dimensions do they differ?

3.2 Data Source

The present research draws samples written by Chinese English majors and native-speaker students from TECCL and LOCNESS, released by Beijing Foreign Studies University and Université Catholique de Louvain respectively.

TECCL 1.1 comprises 9,864 texts written by EFL learners in China. Compared with other EFL corpora, TECCL is more up to date: all its materials were produced between 2011 and 2015, and it was compiled to represent EFL learners in China during that period. The numbers of contributors from 985/211 and non-985/211 universities largely correspond to the proportions of those universities in China. Moreover, as the geographically broadest corpus of Chinese EFL learners’ writing, it has so far collected materials from 32 provincial-level regions, including Hong Kong and Taiwan.

TECCL also covers more than 1,000 topics. Each essay is annotated with relevant information, including the genre, topic and timing condition of the essay and the gender and school level of the contributor. The essays written by English majors are stored in separate files with the same information.

LOCNESS consists of 436 essays produced by native English speakers. The contributors are 232 American university students, 90 British university students and 114 British A-level (General Certificate of Education Advanced Level) students. LOCNESS covers a range of topics, for example, “Feminists have done more harm to the cause of women than good”. A file in LOCNESS describes the contributors and the genre, topic and timing condition of each essay. This information is comparable to that in TECCL 1.1, which makes LOCNESS an appropriate control corpus.

3.3 Data Collection

When the samples were selected, certain variables were controlled to maximize the comparability of the groups. Specifically, to avoid the potential influence of genre, only argumentative essays were sampled. To avoid the potential influence of sample size, the same number of essays was sampled from each corpus. To avoid the potential influence of timing condition, only timed essays written in class were sampled. To avoid the potential influence of school level in the NS data, only essays produced by university students were sampled from LOCNESS, excluding those by A-level students. After filtering out the essays that did not meet these criteria, I obtained a final dataset of 240 essays: 120 timed argumentative essays written by NS students and 120 by NNS students. More specifically, the 120 NNS essays were produced by 30 freshmen, 30 sophomores, 30 juniors and 30 seniors majoring in English. To simplify the analysis, I put the freshmen’s and sophomores’ samples into one group (English Major-Low) and the juniors’ and seniors’ samples into another (English Major-High), assuming that Chinese English majors in their first two years are at a lower proficiency level than those in their last two years. This assumption is in line with current curricular expectations, since English majors are required to prepare for and pass TEM 4 towards the end of the sophomore year and to sit TEM 8 towards the end of the senior year. Table 1 summarizes the samples in the three groups.

Table 1: Summary of samples

Group                              English Major-Low    English Major-High    NS
Number of essays                   60                   60                    120
Average length of essay (words)    472.28               497.97                570.19
Total number of words              28,337               29,878                68,423
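The selection criteria described in this section (genre, timing condition and school level) can be sketched in Python as follows. The metadata records and field names are hypothetical and do not reproduce the actual annotation scheme of TECCL or LOCNESS; the sketch also leaves out the final step of drawing equal numbers of essays per group.

```python
from collections import Counter

# Hypothetical metadata records; field names and values are assumptions.
essays = [
    {"id": "cn001", "genre": "argumentative", "timed": True, "level": "freshman"},
    {"id": "cn002", "genre": "narrative", "timed": True, "level": "junior"},
    {"id": "cn003", "genre": "argumentative", "timed": True, "level": "senior"},
    {"id": "cn004", "genre": "argumentative", "timed": False, "level": "sophomore"},
    {"id": "ns001", "genre": "argumentative", "timed": True, "level": "university"},
    {"id": "ns002", "genre": "argumentative", "timed": True, "level": "a-level"},
]

# School levels that are kept, mapped to the three groups of Table 1.
GROUP_OF_LEVEL = {
    "freshman": "English Major-Low",
    "sophomore": "English Major-Low",
    "junior": "English Major-High",
    "senior": "English Major-High",
    "university": "NS",
}


def select(records):
    """Keep timed argumentative essays and label each with its group."""
    kept = []
    for record in records:
        if record["genre"] != "argumentative" or not record["timed"]:
            continue
        group = GROUP_OF_LEVEL.get(record["level"])
        if group is None:  # e.g. A-level writers are excluded
            continue
        kept.append({**record, "group": group})
    return kept


selected = select(essays)
group_sizes = Counter(record["group"] for record in selected)
print(group_sizes)
```

Of the six made-up records, only `cn001`, `cn003` and `ns001` satisfy all three criteria, so each group ends up with one essay.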

The NS essays are longer on average than the English majors’ essays. However, it is worth highlighting that differences in essay length do not affect the comparisons pursued here, because “the syntactic complexity measures considered are all computed as ratios of one syntactic structure to another in complete texts” (Lu 2010).
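The arithmetic behind Table 1 and the point about ratios can be checked with a few lines of Python. The essay counts and word totals are taken from Table 1; the word and T-unit counts in the second half are invented for illustration, and the helper `words_per_t_unit` is a hypothetical name.

```python
# Figures from Table 1: (number of essays, total number of words).
table_1 = {
    "English Major-Low": (60, 28337),
    "English Major-High": (60, 29878),
    "NS": (120, 68423),
}

# Average essay length = total words / number of essays.
average_length = {
    group: round(total / n, 2) for group, (n, total) in table_1.items()
}
print(average_length)


def words_per_t_unit(words, t_units):
    """A ratio measure: mean length of T-unit."""
    return words / t_units


# Invented counts: a text twice as long, written in the same style,
# has twice as many words and twice as many T-units.
short_text = words_per_t_unit(words=450, t_units=30)
long_text = words_per_t_unit(words=900, t_units=60)
print(short_text, long_text)  # 15.0 15.0
```

Because numerator and denominator both grow with text length, the ratio itself stays comparable across essays of different lengths, provided that the writing style is otherwise the same.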

3.4 Instrument for Data Analysis

In 2010, Lu, a professor of applied linguistics at Pennsylvania State University, designed a computational program named L2SCA (L2 Syntactic Complexity Analyzer) that analyzes syntactic complexity automatically. Its 14 measures, listed in Table 2, help researchers analyze syntactic complexity. Lu described the functions and principles of the tool in detail (Lu 2016) and showed in empirical studies that L2SCA achieves higher accuracy than human annotators. In general, L2SCA resolves a bottleneck that used to exist in L2 syntactic complexity research.
