Dialogue-aware translation
Source turns are provided with previous context, domain information, and speaker-addressee gender metadata.
ArabicNLP 2026 Shared Task
A shared task for building and diagnosing dialectal Arabic MT systems. Participants can build systems that use rich dialogue context, speaker metadata, persona attributes, and multiple dialectal varieties. They can also build systems that diagnose dialectal Arabic translations by identifying error spans and assigning an error type to each span.
Registration
Register by July 20, 2026, and join the Google Group for task announcements, data releases, and evaluation updates.
Motivation
Arabic dialectal machine translation remains difficult because of regional linguistic variation, limited high-quality parallel data, non-standardized orthography, and translation choices that depend on social and conversational context.
The AlexandriaX-2026 Shared Task evaluates systems that preserve meaning while adapting lexical, morphological, pragmatic, and sociolinguistic choices to the requested Arabic dialect. It also covers MT evaluation with interpretable error detection so teams can see not only which systems score well, but where and why they fail.
The MT subtasks span broad Arabic varieties instead of treating dialectal Arabic as one label.
Short financial-domain queries are translated from one Arabic variety into another.
Span-level LQM-inspired annotations support error detection and classification.
Shared Task Subtasks
Participants can build context-aware translation systems, cross-dialect Arabic MT systems, diagnostic error-analysis systems, or any combination of the three.
Participants need to build a machine translation system that translates selected English dialogue turns into a specified dialectal Arabic variety. The system must use the provided context, such as previous dialogue turns, target country/dialect label, domain, speaker/addressee gender, and persona information when available. The output should preserve the original meaning while sounding natural and appropriate in the target dialect, including correct lexical, morphological, pragmatic, and sociolinguistic choices.
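As a rough illustration of the inputs listed above, the sketch below bundles one source turn with its context and renders it as a plain-text prompt. Everything here is an assumption made for illustration: the class name, field names, prompt wording, and the dialect and domain values are hypothetical, and the released data may use a different schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DialogueTurn:
    """One English source turn plus its context (hypothetical field names)."""
    source_text: str
    target_dialect: str
    domain: Optional[str] = None
    speaker_gender: Optional[str] = None
    addressee_gender: Optional[str] = None
    persona: Optional[str] = None
    previous_turns: List[str] = field(default_factory=list)


def build_prompt(turn: DialogueTurn, max_context: int = 3) -> str:
    """Render a turn and whatever metadata is available as a text prompt."""
    lines = [f"Translate the final English turn into {turn.target_dialect} Arabic."]
    optional_fields = [
        ("Domain", turn.domain),
        ("Speaker gender", turn.speaker_gender),
        ("Addressee gender", turn.addressee_gender),
        ("Persona", turn.persona),
    ]
    # Metadata is included only when present, since persona information
    # is described as available for some turns but not all.
    for label, value in optional_fields:
        if value:
            lines.append(f"{label}: {value}")
    # Keep only the most recent turns; guard against max_context == 0,
    # because a [-0:] slice would return the whole list.
    context = turn.previous_turns[-max_context:] if max_context > 0 else []
    if context:
        lines.append("Previous turns:")
        lines.extend(f"- {t}" for t in context)
    lines.append(f"Turn to translate: {turn.source_text}")
    return "\n".join(lines)


# Illustrative values only; the labels are not taken from the task data.
example = DialogueTurn(
    source_text="You speak the truth, my daughter.",
    target_dialect="Moroccan",
    domain="business",
    previous_turns=["For me, Hajj, this is more than just paper."],
)
print(build_prompt(example))
```

Keeping the metadata in a typed record rather than a free-form string makes it easier to run ablations, for example dropping gender or persona fields to see how much each one contributes.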
Participants may enter either:
Participants need to build a system that translates text from one Arabic variety into another Arabic variety. The input is a short financial-domain query written in a source Arabic variety, plus the required target dialect or variety. The system must generate a translation that keeps the same meaning, uses appropriate financial terminology, and sounds fluent and natural in the target dialect.
The task will test both familiar dialect pairs and unseen dialect-pair settings to measure robustness across Arabic varieties.
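Teams that want to track robustness during development could report scores separately for dialect pairs they trained on and pairs they did not. The sketch below shows one way to bucket development examples, assuming each example is a dictionary with hypothetical `source_variety` and `target_variety` keys; the function name and the variety labels are also illustrative and do not reflect the official data format.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def split_by_pair_familiarity(
    train_pairs: Iterable[Pair],
    eval_examples: Iterable[dict],
) -> Dict[str, List[dict]]:
    """Group examples by whether their (source, target) pair was seen in training."""
    seen: Set[Pair] = set(train_pairs)
    buckets: Dict[str, List[dict]] = {"seen": [], "unseen": []}
    for example in eval_examples:
        pair = (example["source_variety"], example["target_variety"])
        # Direction matters: (A, B) in training does not make (B, A) seen.
        buckets["seen" if pair in seen else "unseen"].append(example)
    return buckets


# Illustrative labels only.
train = [("variety_a", "variety_b"), ("variety_a", "variety_c")]
dev = [
    {"source_variety": "variety_a", "target_variety": "variety_b", "text": "..."},
    {"source_variety": "variety_b", "target_variety": "variety_a", "text": "..."},
]
buckets = split_by_pair_familiarity(train, dev)
print(len(buckets["seen"]), len(buckets["unseen"]))
```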
Participants need to build an evaluation system that analyzes machine translation outputs and identifies translation errors. The system must do two things: first, predict the exact word-level span in the translated text where an error occurs; second, assign an error category to each span using the provided LQM-inspired typology. Error categories include issues such as accuracy, fluency, terminology, dialect/locale appropriateness, grammar, and meaning-related errors.
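The two required outputs, a word-level span and a category for each span, can be held in a small record type. The sketch below shows one possible representation together with a strict exact-match precision, recall, and F1 computation. The `ErrorSpan` class, the word-index convention, and the scoring function are assumptions for illustration only; they are not the official submission format or evaluation metric.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ErrorSpan:
    """A predicted or gold error span over whitespace-separated words."""
    start: int      # index of the first word in the span (0-based)
    end: int        # index one past the last word in the span
    category: str   # label from the error typology


def exact_match_prf(
    predicted: Iterable[ErrorSpan],
    gold: Iterable[ErrorSpan],
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 where a hit needs matching span and category."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


gold = [ErrorSpan(0, 1, "Sociolinguistics"), ErrorSpan(1, 2, "Semantics")]
pred = [ErrorSpan(0, 1, "Sociolinguistics"), ErrorSpan(1, 2, "Morphosyntax")]
print(exact_match_prf(pred, gold))
```

Exact matching is the strictest choice: the second prediction above finds the right word but the wrong category and earns no credit. A scorer that rewards partial overlap or scores span detection separately from classification would behave differently.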
Data Examples
For me, Hajj, this is more than just paper. I want to build a long-term partnership with you, one that is based on trust.
You speak the truth, my daughter. A contract is just ink, but trust is what matters. May God bless our work together.
بنسبة ليا، لحاج، هادشي كتر من غير ورقة. بغيت نبني معاك شراكة long-term، تكون مبنية على ثقة.
عندك لحق ا بنتي. ل contrat غير مداد، ولكن ثقة هي لي مهمة. الله يبارك فخدمتنا.
ماذا افعل إن لم أستلم بطاقتي الجديدة؟
شنو ندير إلا ماوصلتنيش لاكارط جديدة ديالي؟
وش أسوي إذا ما استلمت بطاقتي الجديدة؟
آش نعمل كان ما جاتنيش كارطتي الجديدة؟
Okay, I'll close my mouth and sew it up with needle and thread. I will not speak at all.
طيب | Sociolinguistics، هادف | Semantics فمي و خيطو | Morphosyntax بِالْإِزْرِ | Semantics وَ الخيط. ما غادي | Morphosyntax نكلمش | Morphosyntax بزاف.
Important Dates
Release task website, documentation, and registration form.
Release task training/development data, baseline code, and evaluation scripts.
Teams register and receive blind test inputs for final evaluation.
Submission deadline for final predictions.
Official rankings and diagnostic breakdowns shared with participants.
Camera-ready system descriptions are due.
Overview paper describing the task, data, evaluation, and official results is due.
Final camera-ready materials due for the conference proceedings.
Shared task overview and participant systems presented during the conference period.
Participation
Teams may participate in any subset of the three subtasks.
Contact
For task questions, registration issues, or release coordination, contact the organizing team.
alexandriax2026@gmail.com
Join the Google Group for task updates, clarifications, and evaluation announcements.
alexandriax-2026 Google Group
Organizing Committee