To support the preservation and transmission of the Dongba culture of the Naxi people in Yunnan Province, this study treats the organization of Dongba classics as an automatic text classification problem in artificial intelligence. A Dongba text data set was built and labeled from the large body of text in the Dongba classics collection of the Lijiang Dongba Culture Research Institute, and the texts were classified by ceremony using the CatBoost machine learning algorithm. Experiments on the final data set of 300 texts across six ritual categories show that CatBoost achieves a classification accuracy of 87.5% and a recall of 86.7%. The data set is still incomplete: both the range of ritual types and the number of texts in the corpus can be further enriched. The results indicate that CatBoost is a reasonable and effective approach to classifying Dongba classics and that it has practical application value.
Inspec keywords: learning (artificial intelligence); pattern classification; text analysis
Subjects: Other topics in statistics; Data handling techniques; Computer vision and image processing techniques
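The abstract reports accuracy and recall for a six-class problem without stating how recall is aggregated over classes. The sketch below shows one common reading, overall accuracy together with macro-averaged recall (the unweighted mean of per-class recall). The averaging choice, the function names, and the toy labels are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of texts whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (an assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    support = Counter(y_true)  # number of true examples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    return sum(hits[c] / support[c] for c in support) / len(support)


# Hypothetical labels standing in for ritual categories.
y_true = ["a", "a", "b", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c", "c"]

acc = accuracy(y_true, y_pred)
rec = macro_recall(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_recall={rec:.3f}")
```

Macro averaging weights every ritual category equally, so a category with few texts affects the score as much as a large one; with a small corpus this can differ noticeably from overall accuracy.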