Linguistic annotation of natural language corpora is the backbone of supervised methods of statistical natural language processing, as well as other types of corpus-based research.
The Sixteenth LAW (LAW XVI) will provide a forum for presentation and discussion of innovative research on all aspects of linguistic annotation, including creation/evaluation of annotation schemes, methods for automatic and manual annotation, use and evaluation of annotation software and frameworks, representation of linguistic data and annotations, etc.
As in the past, the LAW will provide a forum for annotation researchers to work towards standardization, best practices, and interoperability of annotation information and software.
The special theme for LAW XVI is The Impact of Multimodal Language Understanding on Annotation Practices and Representations.
Recent years have seen rapid improvements in the performance of machine learning models across multiple modalities of communication, such as text, speech, images, video, and gestures. Improvements in unsupervised representation and learning have resulted in state-of-the-art models needing less manually annotated data for training.
However, the need for high-quality manual annotations that capture multiple layers of information surrogates across various signals, including linguistic ones, is unlikely to go away. On the contrary, annotation practices, guidelines, and representations will need to be adapted and extended to address the challenges brought about by a richer landscape of phenomena.
Historically, the communities working on these modalities have existed as separate islands and have crafted solutions that satisfy local research and application needs. The evolution of next-generation, situated language understanding systems is likely to place greater demands on the availability and ease of use of such multimodal annotations and frameworks. We solicit papers addressing the gamut of issues brought to light by this emerging area of research.
Articles can range from those analyzing the state of existing representations, approaches, and methods to those proposing ideas or full-fledged solutions, in the form of tools and/or models, that could facilitate integration and search over data and annotations spanning multiple modalities.
Describing your Language Resources (LRs) in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences).
To continue the efforts initiated at LREC-2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.
As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC-2022 and LAW-XVI endorse the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.
April 15, 2022 (extended) | Papers due
April 18, 2022 (new) | Paper updates allowed
May 9, 2022 | Notification of acceptance
May 23, 2022 | Camera-ready final version due
June 24, 2022 | LAW Workshop, Marseille, France
Long paper submissions are limited to 8 pages in length plus references. Short papers, posters, and demo descriptions are limited to 4 pages plus references. Format requirements, including guidelines and style files, are the same as for full papers at LREC 2022.
Submissions should be made at the LAW-XVI portal.
The reviewing of the papers will be double blind. The paper should not include the authors' names and affiliations. Furthermore, self-citations and other references (e.g. to projects, corpora, or software) that could reveal the authors' identity should be avoided. For example, instead of "We previously showed (Smith, 1991) …", write "Smith previously showed (Smith, 1991) …".
Sameer Pradhan (Linguistic Data Consortium, University of Pennsylvania and cemantix.org)
Sandra Kübler (Indiana University, USA)
Ines Rehbein (University of Mannheim, Germany)
Amir Zeldes (Georgetown University, USA)
Omri Abend (Hebrew University, Israel)
Ron Artstein (University of Southern California, USA)
Emmanuele Chersoni (Hong Kong Polytechnic University)
Jonathan Dunn (University of Canterbury, New Zealand)
Kilian Evang (Heinrich-Heine University Düsseldorf, Germany)
Annemarie Friedrich (Bosch, Germany)
Kim Gerdes (Université Paris-Saclay, France)
Chu-Ren Huang (Hong Kong Polytechnic University)
Jena D. Hwang (Allen Institute for AI, USA)
Nancy Ide (Vassar College, USA)
Mikel Iruskieta (University of the Basque Country)
John Lee (City University of Hong Kong)
Els Lefever (Ghent University, Belgium)
Lori Levin (Carnegie Mellon University, USA)
Adam Meyers (New York University, USA)
Jiří Mírovský (Charles University, Czech Republic)
Philippe Muller (Institut de Recherche en Informatique de Toulouse, France)
Kemal Oflazer (Carnegie Mellon University, Qatar)
Maciej Ogrodniczuk (Polish Academy of Sciences, Poland)
Lilja Øvrelid (University of Oslo, Norway)
Antonio Pareja-Lora (Universidad de Alcalá de Henares, Spain)
Miriam R.L. Petruck (ICSI, USA)
Massimo Poesio (Queen Mary University of London, UK)
Michael Roth (University of Stuttgart, Germany)
Nathan Schneider (Georgetown University, USA)
Djamé Seddah (University Paris Sorbonne, France)
Manfred Stede (University of Potsdam, Germany)
Katrin Tomanek (Google Research, USA)
Bonnie Webber (University of Edinburgh, UK)
Michael Wiegand (Alpen-Adria-Universität Klagenfurt, Austria)
Fei Xia (University of Washington, USA)
Nianwen Xue (Brandeis University, USA)
Deniz Zeyrek (Middle East Technical University, Turkey)
Heike Zinsmeister (University of Hamburg, Germany)