The ASEAN Economic Community has a population of over 600 million people, who speak many different languages. Consequently, natural language processing (NLP) for the region must cope with many languages.
State-of-the-art NLP technologies are based on treebanks. A treebank is a linguistic knowledge representation of natural language texts. The basic linguistic annotations in a treebank are word segmentation, part-of-speech (POS) tagging, and parsing annotations. In a broad sense, almost all NLP research and tools are based on treebanks.
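To make the three annotation layers concrete, the following is a minimal sketch in Python. It assumes a Penn-Treebank-style bracketed notation, which is only one common convention and is not stated to be the format used by ALT. The function names and the example sentence are hypothetical and chosen purely for illustration.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples.

    Assumption: Penn-Treebank-style brackets, e.g. "(NP (DT The) (NN cat))".
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "(", "a node must start with an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())  # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read_node()


def tagged_words(node):
    """Collect (word, POS tag) pairs from left to right."""
    label, children = node
    pairs = []
    for child in children:
        if isinstance(child, str):
            pairs.append((child, label))  # the parent label of a word is its POS tag
        else:
            pairs.extend(tagged_words(child))
    return pairs


# Hypothetical example sentence, not taken from any real treebank.
EXAMPLE = "(S (NP (DT The) (NN treebank)) (VP (VBZ covers) (NP (JJ many) (NNS languages))))"

tree = parse_bracketed(EXAMPLE)          # parsing annotation (the full tree)
pairs = tagged_words(tree)
words = [word for word, _ in pairs]      # word segmentation
tags = [tag for _, tag in pairs]         # POS tagging
```

In this sketch a single annotated tree carries all three layers: the leaves give the word segmentation, the labels directly above the leaves give the POS tags, and the nested structure gives the parse.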
The main problem in creating a treebank is that it requires extensive linguistic knowledge of the language. As a result, existing treebanks are limited in size, annotation types, and languages covered. In particular, there are no publicly available treebanks for most Asian languages.
Against this background, we propose this project to develop the Asian Language Treebank (ALT). The objective of ALT is to develop a parallel treebank for Asian languages. ASEAN IVO is an ideal organization for developing ALT because it consists of top-level NLP research institutes for Asian languages. Without ASEAN IVO, it would be impossible to cooperate and cover the main Asian languages in building treebanks.
The development of ALT has already started. NICT and UCSY started building Japanese, English and Myanmar treebanks in FY 2015. NICT has also finished translating 20,000 English sentences (from Wikinews) into Indonesian, Vietnamese, Thai, Khmer, Lao, Malay and Filipino.
In this project, BPPT, I2R, IOIT, NIPTICT, UCSY and NICT will develop ALT for the Indonesian, Malay, Vietnamese, Khmer, Myanmar and Japanese languages, respectively (NICT will also develop the English ALT). These treebanks will be built from the already translated Wikinews sentences. Once the development of ALT is finished, it will be used to develop NLP tools within this project.
The members of this project are as follows:
For more information: Asian Language Treebank (ALT) Project