Falcon Darío Restrepo Ramos
This study examines the linguistic complexity of Spanish as a second language (L2) in learners' essays across proficiency levels and at two points in time during a semester-long college composition class. Data come from 22 L2 learners of Spanish enrolled in two sections of a third-year college composition class, who were assigned nine compositions (150–250 words each) at different times throughout one semester. The data were prepared using a natural language processing (NLP) pipeline that included UDPipe, an NLP tool that performs annotation, part-of-speech (POS) tagging, lemmatization, and dependency parsing based on Universal Dependencies (UD) treebanks. In addition, the NLTK package and several Python scripts were applied to the annotated output to process and extract syntactic and lexical information from the datasets. Results showed that selected predictors of syntactic complexity increased at different stages in the semester, and the effect seems more robust in beginner learners. Moreover, lower-frequency vocabulary appears to be integrated throughout the semester in both groups of learners. These findings indicate the pedagogical benefits of Spanish composition courses and identify specific indices of L2 writing development over a semester of classes in two language proficiency groups.
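To illustrate the kind of processing described above, the following is a minimal sketch of how syntactic and lexical indices might be extracted from UDPipe output in CoNLL-U format. It is not the study's actual script: the function names (`read_conllu`, `complexity_indices`), the sample sentence, and the particular indices (words per sentence, subordinate clauses per sentence, mean dependency distance, lemma type-token ratio) are assumptions chosen for illustration, and only the Python standard library is used.

```python
# Illustrative sketch only: the indices below are assumed examples,
# not the predictors reported in the study.

# Hypothetical one-sentence sample in CoNLL-U format (10 tab-separated columns).
SAMPLE_CONLLU = (
    "# text = Creo que ella viene.\n"
    "1\tCreo\tcreer\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2\tque\tque\tSCONJ\t_\t_\t4\tmark\t_\t_\n"
    "3\tella\tél\tPRON\t_\t_\t4\tnsubj\t_\t_\n"
    "4\tviene\tvenir\tVERB\t_\t_\t1\tccomp\t_\t_\n"
    "5\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
)

# UD relations treated here as markers of clausal subordination (an assumption).
SUBORDINATE_DEPRELS = {"ccomp", "xcomp", "advcl", "acl", "csubj"}


def read_conllu(text):
    """Parse CoNLL-U text into a list of sentences (lists of token dicts)."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:  # a blank line ends the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):  # sentence-level comment
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        current.append({
            "id": int(cols[0]),
            "form": cols[1],
            "lemma": cols[2],
            "upos": cols[3],
            "head": int(cols[6]) if cols[6].isdigit() else 0,
            "deprel": cols[7],
        })
    if current:
        sentences.append(current)
    return sentences


def complexity_indices(sentences):
    """Compute a few simple syntactic and lexical indices, ignoring punctuation."""
    words = [t for s in sentences for t in s if t["upos"] != "PUNCT"]
    n_sent = len(sentences)
    # Count tokens heading a subordinate clause (subtypes like acl:relcl included).
    n_sub = sum(
        1 for t in words if t["deprel"].split(":")[0] in SUBORDINATE_DEPRELS
    )
    # Linear distance between each non-root word and its syntactic head.
    distances = [abs(t["id"] - t["head"]) for t in words if t["head"] != 0]
    lemmas = [t["lemma"].lower() for t in words]
    return {
        "mean_sentence_length": len(words) / n_sent if n_sent else 0.0,
        "subordinate_clauses_per_sentence": n_sub / n_sent if n_sent else 0.0,
        "mean_dependency_distance": (
            sum(distances) / len(distances) if distances else 0.0
        ),
        "lemma_ttr": len(set(lemmas)) / len(lemmas) if lemmas else 0.0,
    }


if __name__ == "__main__":
    print(complexity_indices(read_conllu(SAMPLE_CONLLU)))
```

In practice, each learner essay would be parsed separately and the resulting indices compared across proficiency groups and time points; lexical frequency measures would additionally require an external frequency list, which is not shown here.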