Creating a Korean Engineering Academic Vocabulary List (KEAVL): Computational Approach
Date
2020-04-24
ORCID
0000-0002-9375-421X
Type
Thesis
Degree Level
Masters
Abstract
With a growing number of international students in South Korea, the need for materials for studying Korean for academic purposes is becoming increasingly pressing. According to statistics, engineering colleges in Korea attract the largest number of international students (Korean National Institute for International Education, 2018). However, although technical vocabulary lists are available for some engineering sub-fields, a list of vocabulary common to the majority of engineering sub-fields has not yet been built. Therefore, this study aimed to create a list of Korean academic engineering vocabulary for non-native Korean speakers that may help prospective or first-year engineering students and engineers working in Korea.
To compile this list, I first built a corpus of Korean textbooks and research articles from 12 major engineering sub-fields, named the Corpus of Korean Engineering Academic Texts (CKEAT). To analyze the corpus and compile the preliminary list, I designed a Python-based tool called KWordList. KWordList lemmatizes all words in the corpus and excludes general Korean vocabulary included in the Korean Learner’s List (Jo, 2003). For the remaining words, KWordList calculates range, frequency, and dispersion (measured in this study as deviation of proportions, or DP (Gries, 2008)) and excludes words that do not meet the study’s criteria (range ≥ 6, frequency ≥ 100, DP ≤ 0.5).
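The range, frequency, and DP filter described above can be illustrated with a minimal sketch. This is not the KWordList source code: the function names `dispersion_stats` and `passes_criteria` and the input layout (one list of lemmas per sub-field) are assumptions made for illustration. DP is computed in the usual way following Gries (2008), as half the sum of absolute differences between the observed share of a lemma's occurrences in each corpus part and that part's share of the whole corpus.

```python
from collections import Counter


def dispersion_stats(parts):
    """Compute (range, frequency, DP) for every lemma.

    `parts` is a hypothetical input layout: a dict mapping a sub-field
    name to the list of lemmas found in that sub-corpus.
    """
    total_tokens = sum(len(tokens) for tokens in parts.values())
    part_counts = {name: Counter(tokens) for name, tokens in parts.items()}
    vocab = set().union(*(counts.keys() for counts in part_counts.values()))

    stats = {}
    for lemma in vocab:
        # Frequency: total occurrences across all sub-fields.
        freq = sum(counts[lemma] for counts in part_counts.values())
        # Range: number of sub-fields in which the lemma occurs at least once.
        rng = sum(1 for counts in part_counts.values() if counts[lemma] > 0)
        # DP: 0.5 * sum |observed share - expected share| over corpus parts.
        dp = 0.5 * sum(
            abs(counts[lemma] / freq - len(parts[name]) / total_tokens)
            for name, counts in part_counts.items()
        )
        stats[lemma] = (rng, freq, dp)
    return stats


def passes_criteria(stat, min_range=6, min_freq=100, max_dp=0.5):
    """Apply the thresholds stated in the abstract to one (range, freq, DP) tuple."""
    rng, freq, dp = stat
    return rng >= min_range and freq >= min_freq and dp <= max_dp
```

A DP close to 0 means the lemma is spread across sub-fields in proportion to their size, while a DP close to 1 means it is concentrated in a small part of the corpus, which is why the filter keeps only lemmas at or below the threshold.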
The final version of the list, called the Korean Engineering Academic Vocabulary List (KEAVL), includes 830 lemmas (318 at the intermediate level and 512 at the advanced level). For each word, the collocations that occur more than 30 times in the corpus are provided.
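One simple way to collect collocations above a frequency cutoff is sketched below. The abstract does not say how collocations were defined or extracted, so this sketch assumes adjacent lemma pairs within a sentence; the function name `frequent_collocations` and its parameters are hypothetical and do not come from KWordList.

```python
from collections import Counter, defaultdict


def frequent_collocations(sentences, targets, threshold=30):
    """Return adjacent lemma pairs that occur more than `threshold` times.

    Assumption: a collocation is an adjacent pair of lemmas in a sentence.
    `sentences` is a list of lemma lists; `targets` is a set of list words.
    """
    pair_counts = Counter()
    for sentence in sentences:
        for left, right in zip(sentence, sentence[1:]):
            if left in targets or right in targets:
                pair_counts[(left, right)] += 1

    result = defaultdict(list)
    for (left, right), count in pair_counts.items():
        if count > threshold:
            # Attach the pair to every target word it contains.
            for word in {left, right}:
                if word in targets:
                    result[word].append(((left, right), count))
    return dict(result)
```

With `threshold=30`, a pair seen exactly 30 times is dropped and a pair seen 31 times is kept, matching the "more than 30 times" wording.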
A comparison of the coverage of the Korean Academic Vocabulary List (Shin, 2004) and KEAVL on the Corpus of Korean Engineering Academic Texts showed that KEAVL covers more lemmas in the corpus. Moreover, only 313 lemmas from the Korean Academic Vocabulary List (Shin, 2004) passed the criteria of the study. Therefore, KEAVL may be more efficient than the Korean Academic Vocabulary List for engineering students’ vocabulary training and may be used in developing teaching materials and curricula for engineering Korean. In addition, the KWordList program written for the study is open access (https://github.com/HelgaKr/KWordList) and can be used by other researchers, teachers, and even students.
Keywords
Word Lists, Korean for Academic Purposes, Engineering Vocabulary of Korean
Degree
Master of Arts (M.A.)
Department
Linguistics
Program
Linguistics