GEM Workshop at ACL 2021

The workshop will be held as part of ACL-IJCNLP 2021, August 1-6, 2021. It will take place on August 5 or 6.

It is endorsed by the ACL Special Interest Group on Natural Language Generation (SIGGEN).

Workshop Overview

Natural language generation is one of the most active research fields in NLP, with generation, summarization, and dialog among the most submitted-to tracks. As a result, the number of available datasets, metrics, models, and evaluation strategies is increasing rapidly. This has led to a situation in which new models are often evaluated on different Anglo-centric tasks with incompatible evaluation setups. With GEM, we aim to solve this problem by standardizing and improving the corpora on which NLG models are evaluated, and by supporting the development of better evaluation approaches. In our shared tasks, models will be applied to a wide set of NLG tasks. These tasks cover challenges that measure specific generation aspects, such as content selection and planning, surface realization, paraphrasing, simplification, and others. To avoid hill-climbing on automated metrics, a second part of the shared task focuses on an in-depth analysis of submitted model outputs across both human and automatic evaluation, with the aim of uncovering shortcomings and opportunities for progress.

Shared Tasks

The shared task is described in depth on this page. It includes two parts:

  1. In the first part, participants are encouraged to apply their model to as many of the included tasks as possible and submit their formatted outputs. We provide GEM-specific test sets that will be used to evaluate specific generation aspects.
  2. In the second part, all submitted and baseline outputs will be released for an evaluation shared task. Participants can submit analyses and evaluations of the model outputs.

During the GEM workshop, shared task participants will come together to discuss their findings, which will inform future iterations of GEM.

Call for Papers

All papers are allowed unlimited space for references and appendices. For papers associated with the shared task, we also strongly encourage publishing the code used to generate the results. We ask for papers in the following categories:

System Descriptions

Participants in the modeling shared task are invited to submit a system description of 4-8 pages.

System Evaluation Descriptions

Participants in the evaluation shared task are invited to submit a paper of 4-8 pages describing their analysis approach and findings.

Research Papers

We welcome papers discussing any of the following topics:

  • Automatic evaluation of NLG systems
  • Creating challenge sets for NLG corpora
  • Critiques of benchmarking efforts (including ours)
  • Crowdsourcing strategies to improve the inclusiveness of NLG research
  • Measuring progress in NLG / What should a GEM 2.0 look like
  • Modeling and data-augmentation strategies for training effective and/or efficient NLG systems that can be applied to a wide range of tasks
  • Standardizing human evaluation and making it more robust

We additionally invite every group that contributed to the creation and organization of GEM to submit a description of their considerations and contributions.

Please note that we are not looking for submissions that focus on specific modeling challenges or introduce new model architectures, which would fit better into conferences like ACL or INLG.

These submissions can take either of the following forms:

  • Archival Papers: Papers describing original and unpublished work can be submitted in either a short (4-page) or a long (8-page) format.
  • Non-Archival Abstracts: To discuss work already presented or under review at a peer-reviewed venue, we allow the submission of 2-page abstracts.


All submissions should conform to ACL 2021 style guidelines. Archival long and short paper submissions must be anonymized. Abstracts and shared task submission descriptions should include author information. Please submit your papers at the SoftConf link.

Important Dates


February 2 First Call for Shared Task Submissions and Papers, Release of the Training Data

May 3 Workshop Paper Due Date (excl. shared tasks) UPDATED

May 28 Notification of Acceptance (excl. shared tasks)

June 7 Camera-ready papers due (excl. shared tasks)

Shared Task Dates


Modeling

February 2 Release of the training data

March 29 Release of the test sets

May 14 Modeling submissions due

Evaluation

March 29 Release of the baseline outputs

May 17 Release of the submission outputs

System Descriptions and Analyses

June 11 System Descriptions and Analyses due

June 25 Notification of Acceptance (shared task)

July 9 Camera-ready papers and task descriptions due

August 5-6 Workshop Dates


The workshop is organized by

The shared task and the GEM environment are organized by a larger team, which is listed on this page.