About Mini Buleku

Background  

Mini Buleku is a dictionary created for the language known as Sibe (IPA: [ɕivə], Sibe: sibe gisun), a Tungusic language spoken by the Sibe people of Northwestern Xinjiang, China. The main developer of the dictionary, Jacob Kodner, graduated with a BA in Linguistics from the University of California, Irvine in the United States. The editor got in touch with Meng Rong Lu (孟荣路) from Chabchal County via WeChat in April 2021. Meng Rong Lu, the primary linguistic correspondent for this project, is a native speaker of Sibe. He is also fluent in Mandarin Chinese, which was the language of communication between the collaborators.

Sibe is classified as an endangered language in the UNESCO Atlas of the World's Languages in Danger. A recent estimate puts the number of Sibe speakers in Xinjiang at around 20,000 individuals (Zikmundová 2013), though the number of those proficient in the language is estimated to be lower due to an ongoing linguistic shift to Mandarin Chinese (Meng Rong Lu, personal communication). Due to the current status of Sibe, this dictionary was created with three objectives in mind for different audiences, which are outlined in (1) through (3) below:

(1) For native speakers: to learn the written language
(2) For non-native speakers: to learn the spoken and written language
(3) For the academic community: to provide linguistic data for research purposes  

The specific structure of the dictionary's entries, which is discussed in detail in the Methodology section, was created in order to address these objectives. For instance, the traditional Manchu script versions of lexemes are provided to meet objective (1), and audio recordings and translations of lexemes are included to meet objective (2).

Methodology

The present version of the dictionary was worked on from June to December 2021, and the majority of the time spent on the project was dedicated to collecting and processing entries. Each entry is coded with a number based on the order in which the lexeme (spoken version) appears in the alphabet, and the data points collected and processed for each entry are listed below in (4)—with a sample entry outlined in (5):

(4) Entry #                                                       
a. Spoken (romanized) version
b. Audio recording of spoken version
c. IPA transcription of spoken version
d. Written (romanized) version
e. Written version in Manchu script
f. Mandarin translation
g. English translation
(5) Entry #1188                            
a. honcirem
b. 1188.wav
c. χɔntɕirɯm
d. honcarambi 
e. ᡥᠣᠨᠴᠠᡵᠠᠮᠪᡳ
f. 打呼噜
g. to snore
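
For readers who want to work with the data programmatically, the entry structure in (4) can be sketched as a small record type. This is only an illustration: the class name `Entry` and its field names are assumptions made for this sketch and are not the dictionary's actual storage format. The values are copied from sample entry (5).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One dictionary entry; fields mirror (4a)-(4g). Names are hypothetical."""
    number: int    # entry number
    spoken: str    # (4a) spoken form, romanized
    audio: str     # (4b) audio file name
    ipa: str       # (4c) IPA transcription of the spoken form
    written: str   # (4d) written form, romanized
    manchu: str    # (4e) written form in Manchu script
    mandarin: str  # (4f) Mandarin translation
    english: str   # (4g) English translation


# Sample entry (5), copied from the text above.
sample = Entry(
    number=1188,
    spoken="honcirem",
    audio="1188.wav",
    ipa="χɔntɕirɯm",
    written="honcarambi",
    manchu="ᡥᠣᠨᠴᠠᡵᠠᠮᠪᡳ",
    mandarin="打呼噜",
    english="to snore",
)
```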

The data collection process started with Meng Rong Lu creating a list of lexemes, which was done by consulting the written Manchu terms in Kengo Yamamoto’s A Classified Dictionary of Spoken Manchu and translating them into spoken form. The terms in spoken form (eventually becoming (4a)) and the written forms (4d) were grouped together as individual lexemes on a spreadsheet, and each lexeme received an individual number code (i.e., the entry number). Each lexeme was also given a translation in Mandarin Chinese (4f). Meng Rong Lu then recorded himself pronouncing the spoken version of each lexeme and converted the recordings into .wav files named after the code (as shown in (4b)). A spreadsheet with the components collected for each coded lexeme (i.e., (4a), (4b), (4d), and (4f)) was sent to the main developer.
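
The spreadsheet described above can be sanity-checked with a few lines of code. The following is a minimal sketch, assuming the spreadsheet is exported as CSV with the hypothetical column names `number`, `spoken`, `audio`, `written`, and `mandarin`; the project's real column layout is not documented here. It flags empty cells and audio file names that do not match the entry number.

```python
import csv
import io

# Hypothetical column names for (entry number), (4a), (4b), (4d), (4f).
REQUIRED = ("number", "spoken", "audio", "written", "mandarin")


def check_rows(csv_text):
    """Return a list of (row_number, problem) pairs found in the CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        # Every required cell must be present and non-empty.
        for column in REQUIRED:
            if not (row.get(column) or "").strip():
                problems.append((row_number, f"missing {column}"))
        # The audio file should be named after the entry number, e.g. 1188.wav.
        number = (row.get("number") or "").strip()
        audio = (row.get("audio") or "").strip()
        if number and audio and audio != f"{number}.wav":
            problems.append(
                (row_number, f"audio file {audio!r} does not match entry {number}")
            )
    return problems


# The one data row below is sample entry (5) from this page.
sample_csv = (
    "number,spoken,audio,written,mandarin\n"
    "1188,honcirem,1188.wav,honcarambi,打呼噜\n"
)
print(check_rows(sample_csv))  # prints []
```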

The spreadsheet received by the main editor from Meng Rong Lu was then uploaded to a cloud-based spreadsheet editor. The main editor proceeded to use an online tool he personally developed to convert the romanized forms of spoken lexemes into IPA. The aforementioned online tool, which can be found on this site under "Transcription System", takes a specific romanization system developed by Meng Rong Lu as input, and outputs IPA. In addition to IPA, the main editor included images of the written form of lexemes written in traditional Manchu script. There have been documented issues with rendering Manchu script on different browsers and software, so in order to remedy these issues, static images of the script were chosen over Unicode. Images were generated using the online tool Anakvu.
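
The general idea of a romanization-to-IPA converter can be illustrated with a table-driven sketch. This is not the actual tool: the mapping table below is a toy inferred solely from sample entry (5) (honcirem, χɔntɕirɯm), and the real romanization system very likely includes more symbols and context-dependent rules. The names `TOY_TABLE` and `to_ipa` are hypothetical.

```python
# Toy table inferred ONLY from sample entry (5): honcirem -> χɔntɕirɯm.
# It is an assumption for illustration, not the project's rule set.
TOY_TABLE = {
    "h": "χ",
    "o": "ɔ",
    "n": "n",
    "c": "tɕ",
    "i": "i",
    "r": "r",
    "e": "ɯ",
    "m": "m",
}


def to_ipa(romanized, table=TOY_TABLE):
    """Convert a romanized string to IPA by greedy longest-match lookup."""
    # Try longer keys first so that digraphs (if any are added) win over
    # their single-letter prefixes.
    keys = sorted(table, key=len, reverse=True)
    output = []
    position = 0
    while position < len(romanized):
        for key in keys:
            if romanized.startswith(key, position):
                output.append(table[key])
                position += len(key)
                break
        else:
            # No rule matched: fail loudly rather than guess a transcription.
            raise ValueError(
                f"no rule for {romanized[position]!r} at position {position}"
            )
    return "".join(output)


print(to_ipa("honcirem"))  # prints χɔntɕirɯm
```

Raising an error on unknown input, rather than passing the letter through unchanged, makes gaps in the table visible during processing.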

The main editor then translated all of the Mandarin translations from Meng Rong Lu into English. For accuracy purposes, both Chinese and English dictionaries were consulted, including Manchu.Work (满族空间) and Jerry Norman’s A Comprehensive Manchu-English Dictionary. Any words that the main editor and Mandarin speakers considered unclear or difficult to translate were referred to Meng Rong Lu for assistance. Upon completion of the translation process, the entire spreadsheet was checked for errors multiple times by both the main editor and Meng Rong Lu. An additional cross-checking process was performed by So Wai Lun, Tony (蘇偉倫). Finally, all content was uploaded to this website, where we continue to add dictionary entries and multimedia content.


Selected References

Gorelova, L. M. (2002). Manchu Grammar. Brill. 

Moseley, C. (2010). Atlas of the World’s Languages in Danger. 3rd ed. UNESCO Publishing.

Norman, J. (2013). A Comprehensive Manchu-English Dictionary. Harvard University Asia Center.

Yamamoto, K. (1969). A Classified Dictionary of Spoken Manchu: With Manchu, English and Japanese Indexes. Institute for the Study of Languages and Cultures of Asia and Africa.

Zikmundová, V. (2013). Spoken Sibe: Morphology of the Inflected Parts of Speech (Vol. 55060). Karolinum Press.

 

Acknowledgement of Funding

Mini Buleku was initially funded by the Undergraduate Research Opportunities Program at the University of California, Irvine for the 2021-2022 academic year. We additionally received funding through the 2022 Language Legacies grant from the Endangered Language Fund. We are very grateful to our funding partners, who made this project possible.




Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.