The Sixteenth Linguistic Annotation Workshop

Co-located with LREC, June 24, 2022, Marseille, France

Call for Papers

Introduction

Linguistic annotation of natural language corpora is the backbone of supervised methods of statistical natural language processing, as well as other types of corpus-based research.

The Sixteenth LAW (LAW XVI) will provide a forum for presentation and discussion of innovative research on all aspects of linguistic annotation, including creation/evaluation of annotation schemes, methods for automatic and manual annotation, use and evaluation of annotation software and frameworks, representation of linguistic data and annotations, etc.

As in the past, the LAW will provide a forum for annotation researchers to work towards standardization, best practices, and interoperability of annotation information and software.  

Annotation Procedures

  • Innovative automated and manual strategies for annotation
  • Machine learning and knowledge-based methods for automation of corpus annotation
  • Creation, maintenance, and interactive exploration of annotation structures and annotated data

Annotation Evaluation

  • Inter-annotator agreement and other evaluation metrics and strategies
  • Qualitative evaluation of linguistic representation

Annotation Access and Use

  • Representation formats/structures for merged annotations of different phenomena, and means to explore/manipulate them
  • Linguistic considerations for merging annotations of distinct phenomena

Annotation Guidelines and Standards

  • Best practices for annotation procedures
  • Development and documentation of annotation schemes
  • Interoperability of annotation formats and/or frameworks among different systems as well as different tasks, frameworks, modalities, and languages

Annotation Software and Frameworks

  • Development, evaluation and/or innovative use of annotation software frameworks

Annotation Schemes

  • New and innovative annotation schemes
  • Comparison of annotation schemes

Special Theme—Annotating Multimodality 

The special theme for LAW XVI is "The Impact of Multimodal Language Understanding on Annotation Practices and Representations".

Recent years have seen rapid improvements in the performance of machine learning models across multiple modalities of communication, such as text, speech, images, video, and gestures. Improvements in unsupervised representation and learning have resulted in state-of-the-art models needing less manually annotated data for training.

However, the need for high-quality manual annotations for capturing multiple layers of information surrogates across various signals, including linguistic ones, is unlikely to go away. On the contrary, annotation practices, guidelines, and representations will need to be adapted and extended to address the challenges brought about by a richer landscape of phenomena.

Historically, the communities working on these different modalities have existed as separate islands and have crafted solutions that satisfy local research and application needs. The evolution of next-generation, situated language understanding systems is likely to create greater demand for the availability and ease of use of such multimodal annotations and frameworks. We solicit papers addressing the gamut of issues brought to light by this emerging area of research.

Articles can range from those analyzing the state of existing representations, approaches, methods, etc. to those providing ideas for, or full-fledged solutions to, tools and/or models that could facilitate the integration of, and search over, data and annotations spanning multiple modalities.

Identify, Describe and Share your LRs!

Describing your Language Resources (LRs) in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences).

To continue the efforts initiated at LREC-2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository.  This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.   

As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC-2022 and LAW-XVI endorse the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.

Important Dates*

(Extended) April 15, 2022 Papers Due
(New) April 18, 2022 Paper updates allowed
May 9, 2022 Notification of acceptance
May 23, 2022 Camera ready final version due
June 24, 2022 LAW Workshop, Marseille, France
*All deadlines are 11:59 pm AoE (Anywhere on Earth)

Submissions

Long paper submissions are limited to 8 pages in length plus references. Short papers, posters and demo descriptions are limited to 4 pages plus references. Format requirements, including guidelines and style files, are the same as for full papers of LREC 2022.

Submissions should be made at the LAW-XVI portal.

Reviewing

The reviewing of the papers will be double-blind. Papers should not include the authors' names and affiliations. Furthermore, self-citations and other references (e.g. to projects, corpora, or software) that could reveal the authors' identity should be avoided. For example, instead of "We previously showed (Smith, 1991) …", write "Smith previously showed (Smith, 1991) …".

Organization

Organizers

Sameer Pradhan (Linguistic Data Consortium, University of Pennsylvania and cemantix.org)

Sandra Kübler (Indiana University, USA) 

Ines Rehbein (University of Mannheim, Germany) 

Amir Zeldes (Georgetown University, USA)

Program Committee Members

Omri Abend (Hebrew University, Israel)
Ron Artstein (University of Southern California, USA)
Emmanuele Chersoni (Hong Kong Polytechnic University)
Jonathan Dunn (University of Canterbury, New Zealand)
Kilian Evang (Heinrich-Heine University Düsseldorf, Germany)
Annemarie Friedrich (Bosch, Germany)
Kim Gerdes (Université Paris-Saclay, France)
Chu-Ren Huang (Hong Kong Polytechnic University)
Jena D. Hwang (Allen Institute for AI, USA)
Nancy Ide (Vassar College, USA)
Mikel Iruskieta (University of the Basque Country)
John Lee (City University of Hong Kong)
Els Lefever (Ghent University, Belgium)
Lori Levin (Carnegie Mellon University, USA)
Adam Meyers (New York University, USA)
Jiří Mírovský (Charles University, Czech Republic)
Philippe Muller (Institut de Recherche en Informatique de Toulouse, France)
Kemal Oflazer (Carnegie Mellon University, Qatar)
Maciej Ogrodniczuk (Polish Academy of Sciences, Poland)
Lilja Øvrelid (University of Oslo, Norway)
Antonio Pareja-Lora (Universidad de Alcalá de Henares, Spain)
Miriam R.L. Petruck (ICSI, USA)
Massimo Poesio (Queen Mary University of London, UK)
Michael Roth (University of Stuttgart, Germany)
Nathan Schneider (Georgetown University, USA)
Djamé Seddah (University Paris Sorbonne, France)
Manfred Stede (University of Potsdam, Germany)
Katrin Tomanek (Google Research, USA)
Bonnie Webber (University of Edinburgh, UK)
Michael Wiegand (Alpen-Adria-Universität Klagenfurt, Austria)
Fei Xia (University of Washington, USA)
Nianwen Xue (Brandeis University, USA)
Deniz Zeyrek (Middle East Technical University, Turkey)
Heike Zinsmeister (University of Hamburg, Germany)