We regularly publish papers on aspects of GEM that describe findings or resources we think are worth sharing. Have a look below:
GEMv1 Overview (GEM Workshop 2021)
This is our first overview paper, introducing GEM and the initial set of tasks and baselines.
Data Cards (GEM Workshop 2021)
In "Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards", we describe the approach for data documentation in GEMv1 and the similar approach used by HuggingFace datasets.
Evaluation Suites (NeurIPS 2021)
In the paper "Automatic Construction of Evaluation Suites for Natural Language Generation Datasets", we discuss how to build data collections that test the robustness of models, and we show that they are much more expressive than typical test splits.
NL-Augmenter 🦎 → 🐍 (GEM Workshop 2021)
This was a collaborative and participatory workshop that collected more than 117 different ways to transform text and more than 23 ways to filter out subpopulations of datasets.