ArabicNLP 2026 Shared Task

AlexandriaX-2026

Context-Aware Dialectal Arabic MT and MT Evaluation

A shared task for building and diagnosing dialectal Arabic MT systems. Participants can build systems that use rich dialogue context, speaker metadata, persona attributes, and multiple dialectal varieties. They can also build systems that diagnose dialectal Arabic translations by identifying error spans and assigning an error type to each span.

Registration

Registration is open for AlexandriaX-2026.

Register by July 20, 2026, and join the Google Group for task announcements, data releases, and evaluation updates.

Motivation

Dialectal Arabic MT needs context, diagnosis, and broader coverage.

Arabic dialectal machine translation remains difficult because of regional linguistic variation, limited high-quality parallel data, non-standardized orthography, and translation choices that depend on social and conversational context.

The AlexandriaX-2026 Shared Task evaluates systems that preserve meaning while adapting lexical, morphological, pragmatic, and sociolinguistic choices to the requested Arabic dialect. It also covers MT evaluation with interpretable error detection, so teams can see not only which systems score well but also where and why they fail.

Dialogue-aware translation

Source turns are provided with previous context, domain information, and speaker-addressee gender metadata.

Broad dialectal coverage

The MT subtasks span a broad range of Arabic varieties instead of treating dialectal Arabic as a single label.

Cross-dialect MT

Short financial-domain queries are translated from one Arabic variety into another.

Diagnostic MT evaluation

Span-level LQM-inspired annotations support error detection and classification.

Shared Task Subtasks

Three complementary routes into dialectal Arabic MT.

Participants can build context-aware translation systems, cross-dialect Arabic MT systems, diagnostic error-analysis systems, or any combination of the three.

1

Context-Aware English-to-Dialectal Arabic Dialogue Translation

Constrained: provided data only, <=5B parameters · Unconstrained: external resources allowed, no restrictions on model usage

Objective

Participants need to build a machine translation system that translates selected English dialogue turns into a specified dialectal Arabic variety. The system must use the provided context, such as previous dialogue turns, target country/dialect label, domain, speaker/addressee gender, and persona information when available. The output should preserve the original meaning while sounding natural and appropriate in the target dialect, including correct lexical, morphological, pragmatic, and sociolinguistic choices.
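
The inputs described above can be pictured as one record per selected turn. The following is a minimal sketch, assuming a prompt-based MT system; every class, field name, and the prompt layout are hypothetical and are not the official data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    """One dialogue turn. All field names are hypothetical."""
    text: str
    speaker_role: str
    speaker_gender: str
    addressee_gender: str


@dataclass
class TranslationRequest:
    """A selected English turn plus the context the task describes."""
    source: Turn
    target_dialect: str
    domain: str
    previous_turns: List[Turn] = field(default_factory=list)
    persona: Optional[str] = None  # persona information is not always available


def build_prompt(req: TranslationRequest) -> str:
    """Flatten the request into one text prompt (layout is an assumption)."""
    lines = [
        f"Target dialect: {req.target_dialect}",
        f"Domain: {req.domain}",
    ]
    if req.persona:
        lines.append(f"Persona: {req.persona}")
    # Earlier turns are passed as context only; they are not translated.
    for turn in req.previous_turns:
        lines.append(
            f"[context] {turn.speaker_role} "
            f"({turn.speaker_gender} -> {turn.addressee_gender}): {turn.text}"
        )
    src = req.source
    lines.append(
        f"[translate] {src.speaker_role} "
        f"({src.speaker_gender} -> {src.addressee_gender}): {src.text}"
    )
    return "\n".join(lines)
```

Keeping the gender direction and speaker role next to each turn lets the model condition agreement and forms of address on who is speaking to whom.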

Participants may enter either:

  • Constrained track: use only the provided Alexandria training/development data, with models up to 5B parameters.
  • Unconstrained track: use external data, pretrained models, and larger models without a parameter limit.

Data

  • Training and development data for model building.
  • Private test set with hidden gold translations.

Evaluation

  • Primary automatic metrics: spBLEU and chrF++.
  • Overall score: average across all countries.
  • Constrained and unconstrained tracks will have separate leaderboards.
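
One plausible reading of the overall score is an unweighted mean of per-country scores for each metric. The sketch below assumes exactly that; the function name and the numbers in the usage note are placeholders, and the official evaluation scripts remain the reference for how scores are actually aggregated.

```python
from statistics import mean
from typing import Dict

METRICS = ("spBLEU", "chrF++")


def average_over_countries(per_country: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Unweighted mean of each metric over countries (an assumed aggregation)."""
    if not per_country:
        raise ValueError("need scores for at least one country")
    return {
        metric: mean(scores[metric] for scores in per_country.values())
        for metric in METRICS
    }
```

For example, placeholder scores of 20.0 and 30.0 spBLEU for two countries would average to 25.0, regardless of how many test segments each country contributes.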

2

Cross-Dialect Arabic Machine Translation

Arabic-to-Arabic MT · Financial-domain queries

Objective

Participants need to build a system that translates text from one Arabic variety into another Arabic variety. The input is a short financial-domain query written in a source Arabic variety, plus the required target dialect or variety. The system must generate a translation that keeps the same meaning, uses appropriate financial terminology, and sounds fluent and natural in the target dialect.

The task will test both familiar dialect pairs and unseen dialect-pair settings to measure robustness across Arabic varieties.
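
For local analysis, a team might bucket held-out items by whether their dialect pair occurs in the training data. This is a minimal sketch under two assumptions: the keys `source_variety` and `target_variety` are hypothetical, and a pair is treated as directional, so MSA to Moroccan and Moroccan to MSA count as different pairs.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (source variety, target variety)


def split_by_pair_exposure(
    train_pairs: Iterable[Pair],
    test_items: Iterable[Dict[str, str]],
) -> Dict[str, List[Dict[str, str]]]:
    """Group test items into 'seen' and 'unseen' dialect-pair buckets."""
    seen_pairs = set(train_pairs)
    buckets: Dict[str, List[Dict[str, str]]] = {"seen": [], "unseen": []}
    for item in test_items:
        pair = (item["source_variety"], item["target_variety"])
        buckets["seen" if pair in seen_pairs else "unseen"].append(item)
    return buckets
```

Scoring the two buckets separately gives a rough picture of how much quality drops on pairs the system never trained on.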

Data

  • Short financial-domain queries in source Arabic varieties.
  • Target dialect or variety labels for each input.

Evaluation

  • Target-side translation quality and dialectal naturalness.
  • Seen and unseen dialect-pair settings for robustness analysis.
  • Official metrics and ranking details will be released with the evaluation scripts.

3

Dialectal Arabic MT Error Detection and Classification

Span prediction · Error classification · LQM-inspired typology

Objective

Participants need to build an evaluation system that analyzes machine translation outputs and identifies translation errors. The system must do two things: first, predict the exact word-level span in the translated text where an error occurs; second, assign an error category to each span using the provided LQM-inspired typology. Error categories include issues such as accuracy, fluency, terminology, dialect/locale appropriateness, grammar, and meaning-related errors.
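
A system's prediction for one translation can be thought of as a list of word-level spans, each carrying one category. The sketch below assumes 0-based word indices with an exclusive end. The three labels are the ones that appear in the data example on this page and stand in for the full typology, which the organizers provide; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Placeholder label set taken from the data example, not the full typology.
CATEGORIES = frozenset({"Semantics", "Morphosyntax", "Sociolinguistics"})


@dataclass(frozen=True)
class ErrorSpan:
    start: int     # index of the first word in the span (0-based, inclusive)
    end: int       # index one past the last word in the span (exclusive)
    category: str  # one label from the typology


def check_spans(words: Sequence[str], spans: Sequence[ErrorSpan]) -> None:
    """Raise ValueError if a span is empty, out of range, or has an unknown label."""
    for span in spans:
        if not 0 <= span.start < span.end <= len(words):
            raise ValueError(f"span out of range: {span}")
        if span.category not in CATEGORIES:
            raise ValueError(f"unknown category: {span.category}")


def span_words(words: Sequence[str], span: ErrorSpan) -> List[str]:
    """Return the words covered by a span."""
    return list(words[span.start:span.end])
```

Validating spans before submission catches off-by-one boundaries early, which matters if scoring rewards only exact span matches.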

Data

  • Training and development data for model building.
  • Private test set with hidden gold error annotations.

Evaluation

  • Span detection F1.
  • Error classification Macro-F1.
  • Overall ranking by Labeled Span F1.
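
The three scores above can be approximated locally. The sketch below assumes exact-match scoring: a predicted span counts only if its segment, start, and end all equal a gold span, and for the labeled variant the category must match too. Macro-F1 is taken here as the mean of per-category labeled F1. These definitions are assumptions, and the official scorer may differ, for instance by crediting partial overlap.

```python
from statistics import mean
from typing import Set, Tuple

# (segment id, start word, end word, category); the layout is hypothetical.
Span = Tuple[int, int, int, str]


def f1(n_pred: int, n_gold: int, n_match: int) -> float:
    """F1 = 2 * matches / (predicted + gold); 0.0 when both sets are empty."""
    total = n_pred + n_gold
    return 2 * n_match / total if total else 0.0


def span_f1(pred: Set[Span], gold: Set[Span], labeled: bool) -> float:
    """Exact-match span F1, with or without the category in the match key."""
    def key(s: Span):
        return s if labeled else s[:3]
    p = {key(s) for s in pred}
    g = {key(s) for s in gold}
    return f1(len(p), len(g), len(p & g))


def macro_f1(pred: Set[Span], gold: Set[Span]) -> float:
    """Mean of per-category labeled F1 over categories seen in pred or gold."""
    categories = sorted({s[3] for s in pred | gold})
    if not categories:
        return 0.0
    scores = []
    for cat in categories:
        p = {s for s in pred if s[3] == cat}
        g = {s for s in gold if s[3] == cat}
        scores.append(f1(len(p), len(g), len(p & g)))
    return mean(scores)
```

Under these definitions, Labeled Span F1 can never exceed unlabeled Span detection F1, because every labeled match is also an unlabeled match.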

Data Examples

Data examples from the AlexandriaX-2026 subtasks.

Subtask 1

Context-Aware Dialogue Translation

Domain: Agriculture and farming · Dialect: Moroccan Standard Darija Dialect

English Conversation

Female -> Male · Wholesale Buyer · Turn 1

For me, Hajj, this is more than just paper. I want to build a long-term partnership with you, one that is based on trust.

Male -> Female · Farmer · Turn 2

You speak the truth, my daughter. A contract is just ink, but trust is what matters. May God bless our work together.

Dialectal Translation

Female -> Male · Wholesale Buyer · Turn 1

بنسبة ليا، لحاج، هادشي كتر من غير ورقة. بغيت نبني معاك شراكة long-term، تكون مبنية على ثقة.

Male -> Female · Farmer · Turn 2

عندك لحق ا بنتي. ل contrat غير مداد، ولكن ثقة هي لي مهمة. الله يبارك فخدمتنا.

Subtask 2

Cross-Dialect Arabic Machine Translation

Domain: Financial services · Intent: card arrival · Source variety: Modern Standard Arabic

MSA Source

ماذا افعل إن لم أستلم بطاقتي الجديدة؟

Moroccan Translation

شنو ندير إلا ماوصلتنيش لاكارط جديدة ديالي؟

Saudi Translation

وش أسوي إذا ما استلمت بطاقتي الجديدة؟

Tunisian Translation

آش نعمل كان ما جاتنيش كارطتي الجديدة؟

Subtask 3

MT Error Detection and Classification

Direction: English to Moroccan dialect · Output: highlighted span | error type

Source

Okay, I'll close my mouth and sew it up with needle and thread. I will not speak at all.

MT Prediction with Tagged Errors

طيب | Sociolinguistics، هادف | Semantics فمي و خيطو | Morphosyntax بِالْإِزْرِ | Semantics وَ الخيط. ما غادي | Morphosyntax نكلمش | Morphosyntax بزاف.

Important Dates

AlexandriaX-2026 shared-task timeline.

Task Launch

Release task website, documentation, and registration form.

Training and Development Release

Release task training/development data, baseline code, and evaluation scripts.

Registration Deadline and Blind Test Release

Teams register and receive blind test inputs for final evaluation.

Final System Output Deadline

Submission deadline for final predictions.

Final Results Released

Official rankings and diagnostic breakdowns shared with participants.

System Description Papers Due

Camera-ready system descriptions are due.

Shared Task Overview Paper Due

Overview paper describing the task, data, evaluation, and official results is due.

Conference Camera-ready Deadline

Final camera-ready materials due for the conference proceedings.

ArabicNLP / EMNLP Presentation Period

Shared task overview and participant systems presented during the conference period.

Participation

Multiple entry points for teams with different resources.

Teams may participate in any subset of the three subtasks.

Submission Requirements

  • Submit system outputs for the blind test set in the specified format.
  • Use only permitted data and model sizes for the constrained MT track.
  • Include enough system details for result verification.
  • Prepare a system description paper following ArabicNLP instructions.

Contact

Questions and announcements

Organizer Contact

For task questions, registration issues, or release coordination, contact the organizing team.

alexandriax2026@gmail.com

Organizing Committee

AlexandriaX-2026 organizers

  • The University of British Columbia
  • King Fahd University of Petroleum and Minerals
  • VinUniversity
  • Hamad Bin Khalifa University
  • King Abdullah University of Science and Technology