Diachronic Corpora as a Tool for Tracing Etymological Information of Indonesian-Malay Lexicon

Kamal Yusuf, Dewi Puspita


The Indonesian lexicon comprises numerous loanwords, some of which have existed since the 7th century. Because of this large number of loanwords, many dictionaries of Indonesian etymology available today record merely the origin of words. Yet a word's etymology has several other aspects that can be studied and presented in a dictionary, such as changes in the word's form and meaning. This article seeks to demonstrate the use of diachronic corpora in identifying the etymological information of Malay words and to trace the semantic change that these words underwent over time until they became part of the Indonesian lexicon. More specifically, two selected Malay words were examined: bersiram and peraduan. Drawing on data from the Malay Concordance Project and the Leipzig Corpora, this study employs a corpus-based approach to collect etymological information on Indonesian words that originated from Malay. The findings show that the examined words have changed in meaning through generalization and metaphor. However, unlike bersiram, the change that peraduan underwent occurred only at the semantic level. This information can ultimately serve as data for a more comprehensive Indonesian etymological dictionary. The paper thus highlights the importance of diachronic corpora in tracing word origins.

Keywords: diachronic corpora, etymology, corpus analysis, semantic change, Malay-Indonesian

DOI: https://doi.org/10.18326/rgt.v13i1.153-182

Copyright (c) 2020 Kamal Yusuf, Dewi Puspita

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.