A Corpus Based N-gram Hybrid Approach of Bengali to English Machine Translation

Mohammad Masudur Rahman, Md Faisal Kabir, Mohammad Nurul Huda

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    2 Scopus citations

    Abstract

    Machine translation is automatic translation performed by computer software. There are several approaches to machine translation; some need extensive linguistic knowledge, while others require enormous statistical calculations. This paper presents a hybrid method that integrates a corpus-based approach and a statistical approach to translate Bengali sentences into English with the help of an N-gram language model. The corpus-based method finds the target-language translation of sentence fragments, selecting the best-matching text from the bilingual corpus to acquire knowledge, while the N-gram model rearranges the sentence constituents to produce an accurate translation without employing external linguistic rules. A variety of Bengali sentences, covering various structures and verb tenses, are translated with the new system. The performance of the proposed system is evaluated in terms of adequacy, fluency, word error rate (WER), and BLEU score. The assessment scores are compared with those of other conventional approaches and with Google Translate, a well-known free machine translation service. The experimental results show that the proposed system scores higher than Google Translate and the other methods at a lower computational cost.
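The reordering step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the N-gram order, the smoothing, or the search procedure, so everything here is an assumption made for illustration: a bigram model with add-one smoothing, an exhaustive search over fragment permutations, and a four-sentence toy corpus. The function names (`train_bigram`, `log_prob`, `best_order`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import permutations


def train_bigram(corpus):
    """Count unigrams and bigrams over sentences padded with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a token sequence (assumed smoothing)."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )


def best_order(fragments, unigrams, bigrams):
    """Return the permutation of translated fragments with the highest score."""
    def score(order):
        return log_prob(" ".join(order).split(), unigrams, bigrams)
    return max(permutations(fragments), key=score)


# Toy target-language corpus (hypothetical data, not from the paper).
toy_corpus = ["i eat rice", "he eats rice", "i read a book", "she reads a book"]
unigrams, bigrams = train_bigram(toy_corpus)

# Bengali is typically subject-object-verb, so fragment-by-fragment lookup
# may yield English fragments in this order.
fragments = ["i", "rice", "eat"]
reordered = " ".join(best_order(fragments, unigrams, bigrams))
print(reordered)  # i eat rice
```

Exhaustive permutation search is only practical for a handful of fragments, since the number of orderings grows factorially; a real system would need a pruned or greedy search.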

    Original language: English (US)
    Title of host publication: 2018 21st International Conference of Computer and Information Technology, ICCIT 2018
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    ISBN (Electronic): 9781538692424
    State: Published - Jan 31 2019
    Event: 21st International Conference of Computer and Information Technology, ICCIT 2018 - Dhaka, Bangladesh
    Duration: Dec 21 2018 → Dec 23 2018

    Publication series

    Name: 2018 21st International Conference of Computer and Information Technology, ICCIT 2018

    Conference

    Conference: 21st International Conference of Computer and Information Technology, ICCIT 2018
    Country: Bangladesh
    City: Dhaka
    Period: 12/21/18 → 12/23/18

    All Science Journal Classification (ASJC) codes

    • Information Systems
    • Computer Networks and Communications
