Please use this identifier to cite or link to this item:
http://dspace.univ-mascara.dz:8080/jspui/handle/123456789/1073
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Baligh, BABAALI | - |
dc.date.accessioned | 2024-10-01T12:08:45Z | - |
dc.date.available | 2024-10-01T12:08:45Z | - |
dc.date.issued | 2024-10-01 | - |
dc.identifier.uri | http://dspace.univ-mascara.dz:8080/jspui/handle/123456789/1073 | - |
dc.description.abstract | Machine translation serves as a crucial tool for breaking down language barriers and facilitating communication and information access across diverse linguistic contexts. However, its efficacy heavily relies on the availability of sufficient and high-quality training data, a challenge often encountered in low-resource language settings. In this study, we explore methods to enhance Neural Machine Translation (NMT) systems by employing data augmentation techniques to address the challenges posed by such scenarios. Our experimentation involved various augmentation strategies, including Back Translation, Copied Corpus, and innovative methods like Right Rotation Augmentation, with the aim of enriching training data and improving translation quality. Through rigorous evaluation comparing augmented NMT models with the baseline, we observed significant enhancements in translation quality, as evidenced by improved BLEU scores. Our analysis underscores the effectiveness of different augmentation techniques in bolstering NMT systems, especially in low-resource language contexts. Furthermore, our comparative analysis between Seq2Seq NMT models and GPT-based models sheds light on their architectural intricacies and performance characteristics. Evaluating their performance across diverse translation tasks, we found that the ChatGPT model consistently outperformed the Seq2Seq model, exhibiting higher COMET, BLEU, and ChrF scores. Notably, the ChatGPT model demonstrated superior performance in translating from the Algerian Arabic dialect (DZDA) to Modern Standard Arabic (MSA). Moreover, transitioning from zero-shot to few-shot scenarios led to enhanced translation performance for ChatGPT models across both language pairs. These findings contribute to a deeper understanding of the interplay between Seq2Seq and GPT-based models in machine translation, offering valuable insights for future advancements in the field. | en_US |
dc.subject | Neural Machine Translation | en_US |
dc.subject | Large Language Model | en_US |
dc.subject | Data augmentation | en_US |
dc.subject | Low resource language | en_US |
dc.title | Arabic Machine Translation of Social Media Content | en_US |
dc.type | Thesis | en_US |
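
For illustration only: the abstract mentions Back Translation and Copied Corpus as data augmentation strategies for low-resource NMT. The sketch below shows, under assumed interfaces, how such synthetic parallel data could be produced; the `translate_tgt_to_src` callable is a placeholder for an existing target-to-source MT model, not the model actually used in the thesis.

```python
# Minimal sketch of two augmentation strategies named in the abstract.
# The translation model is an assumption (any target->source MT system).

def back_translate(monolingual_target, translate_tgt_to_src):
    """Build synthetic (source, target) pairs from target-side monolingual text.

    monolingual_target: list of sentences in the target language
    translate_tgt_to_src: callable mapping a target sentence to a source sentence
    """
    augmented_pairs = []
    for tgt_sentence in monolingual_target:
        synthetic_src = translate_tgt_to_src(tgt_sentence)   # machine-generated source side
        augmented_pairs.append((synthetic_src, tgt_sentence))  # keep the human-written target
    return augmented_pairs

def copied_corpus(monolingual_target):
    """Copied Corpus augmentation: pair each target sentence with itself on the source side."""
    return [(sentence, sentence) for sentence in monolingual_target]
```

In practice, the synthetic pairs produced this way are concatenated with the original parallel corpus before training, which is what allows the augmented NMT models to be compared against the baseline on BLEU.
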
Appears in Collections: Thèse de Doctorat
Files in This Item:
File | Description | Size | Format
---|---|---|---
Thesis.pdf | | 3.18 MB | Adobe PDF