Phrase Based Amharic News Text Classification: a Comparative Analysis of Bigrams and Trigrams - Zeleke Abebaw - Books - LAP LAMBERT Academic Publishing - 9783659305955 - Friday, 23 November 2012

Phrase Based Amharic News Text Classification: a Comparative Analysis of Bigrams and Trigrams

Zeleke Abebaw

Price
€ 44,49

Ordered from a remote warehouse

Estimated delivery Wed - Thu, 17 - 25 Sep

The recent growth of ICT infrastructure in Ethiopia is producing an exponential increase in digital information in local languages, including Amharic. Large volumes of Amharic data are now available, as seen in the growing number of online newspapers, websites, and the digital storage of the Ethiopian News Agency. To tackle the agency's news text management problems, a number of studies have addressed automatic processing of Amharic news texts using a bag-of-words feature representation. However, using single words as features can lose the intended meaning when a concept is expressed by two or more sequential words. To preserve such concepts, this research proposes and implements a phrase-based approach using bigrams and trigrams. The results show that with bigram phrases the best accuracy (95.3%) was obtained for four news categories, followed by 81.3% for eight categories and 72.01% for twelve categories. For trigram phrases, the best accuracy was 72.9% for four news categories, followed by 69.7% for eight categories and 56.4% for twelve categories. Thus, bigrams show better accuracy than trigrams.
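
To illustrate what phrase features look like in practice, here is a minimal Python sketch of bigram and trigram extraction. It is not the author's implementation: the whitespace tokenizer, the function names `ngrams` and `phrase_features`, and the English stand-in sentence are all assumptions made for this example, and the classifier used in the book is not shown because the description does not name it.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, as tuples, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_features(text, n):
    """Count the n-word phrases in a text (hypothetical helper).

    Assumes plain whitespace tokenization; real Amharic preprocessing
    would likely need more than this.
    """
    return Counter(ngrams(text.split(), n))


# Hypothetical English stand-in for a short news text.
sample = "prime minister visits addis ababa prime minister speaks"

bigrams = phrase_features(sample, 2)   # two-word phrase counts
trigrams = phrase_features(sample, 3)  # three-word phrase counts
```

In this toy sample the bigram ("prime", "minister") occurs twice while every trigram occurs only once, which hints at how longer phrases tend to repeat less often in a small corpus.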

Media: Books, Paperback Book (book with soft cover and glued spine)
Release date: Friday, 23 November 2012
ISBN13 9783659305955
Publisher: LAP LAMBERT Academic Publishing
Number of pages: 112
Dimensions: 150 × 7 × 226 mm   ·   185 g
Language: German