Search Results for author: Stephanie Strassel

Found 37 papers, 0 papers with code

Reflections on 30 Years of Language Resource Development and Sharing

no code implementations • LREC 2022 • Christopher Cieri, Mark Liberman, Sunghye Cho, Stephanie Strassel, James Fiumara, Jonathan Wright

The Linguistic Data Consortium was founded in 1992 to solve the problem that limitations in access to shareable data were impeding progress in Human Language Technology research and development.

Management Open-Ended Question Answering

A Study in Contradiction: Data and Annotation for AIDA Focusing on Informational Conflict in Russia-Ukraine Relations

no code implementations • LREC 2022 • Jennifer Tracey, Ann Bies, Jeremy Getman, Kira Griffitt, Stephanie Strassel

This paper describes data resources created for Phase 1 of the DARPA Active Interpretation of Disparate Alternatives (AIDA) program, which aims to develop language technology that can help humans manage large volumes of sometimes conflicting information to develop a comprehensive understanding of events around the world, even when such events are described in multiple media and languages.

CAMIO: A Corpus for OCR in Multiple Languages

no code implementations • LREC 2022 • Michael Arrigo, Stephanie Strassel, Nolan King, Thao Tran, Lisa Mason

CAMIO (Corpus of Annotated Multilingual Images for OCR) is a new corpus created by Linguistic Data Consortium to serve as a resource to support the development and evaluation of optical character recognition (OCR) and related technologies for 35 languages across 24 unique scripts.

Optical Character Recognition Optical Character Recognition (OCR)

WeCanTalk: A New Multi-language, Multi-modal Resource for Speaker Recognition

no code implementations • LREC 2022 • Karen Jones, Kevin Walker, Christopher Caruso, Jonathan Wright, Stephanie Strassel

The WeCanTalk (WCT) Corpus is a new multi-language, multi-modal resource for speaker recognition.

Speaker Recognition

The SAFE-T Corpus: A New Resource for Simulated Public Safety Communications

no code implementations • LREC 2020 • Dana Delgado, Kevin Walker, Stephanie Strassel, Karen Jones, Christopher Caruso, David Graff

We introduce a new resource, the SAFE-T (Speech Analysis for Emergency Response Technology) Corpus, designed to simulate first-responder communications by inducing high vocal effort and urgent speech with situational background noise in a game-based collection protocol.

Action Detection Activity Detection +3

Call My Net 2: A New Resource for Speaker Recognition

no code implementations • LREC 2020 • Karen Jones, Stephanie Strassel, Kevin Walker, Jonathan Wright

Speakers used a variety of handsets, including landline and mobile devices, and made VoIP calls from tablets or computers.

Speaker Recognition

Morphological Segmentation for Low Resource Languages

no code implementations • LREC 2020 • Justin Mott, Ann Bies, Stephanie Strassel, Jordan Kodner, Caitlin Richter, Hongzhi Xu, Mitchell Marcus

This paper describes a new morphology resource created by Linguistic Data Consortium and the University of Pennsylvania for the DARPA LORELEI Program.

Segmentation

A Progress Report on Activities at the Linguistic Data Consortium Benefitting the LREC Community

no code implementations • LREC 2020 • Christopher Cieri, James Fiumara, Stephanie Strassel, Jonathan Wright, Denise DiPersio, Mark Liberman

This latest in a series of Linguistic Data Consortium (LDC) progress reports to the LREC community does not describe any single language resource, evaluation campaign or technology but sketches the activities, since the last report, of a data center devoted to supporting the work of LREC attendees among other research communities.

Basic Language Resources for 31 Languages (Plus English): The LORELEI Representative and Incident Language Packs

no code implementations • LREC 2020 • Jennifer Tracey, Stephanie Strassel

This paper documents and describes the thirty-one basic language resource packs created for the DARPA LORELEI program for use in development and testing of systems capable of providing language-independent situational awareness in emerging scenarios in a low resource language context.

Corpus Building for Low Resource Languages in the DARPA LORELEI Program

no code implementations • WS 2019 • Jennifer Tracey, Stephanie Strassel, Ann Bies, Zhiyi Song, Michael Arrigo, Kira Griffitt, Dana Delgado, Dave Graff, Seth Kulick, Justin Mott, Neil Kuster

Laying the Groundwork for Knowledge Base Population: Nine Years of Linguistic Resources for TAC KBP

no code implementations • LREC 2018 • Jeremy Getman, Joe Ellis, Stephanie Strassel, Zhiyi Song, Jennifer Tracey

Knowledge Base Population

Simple Semantic Annotation and Situation Frames: Two Approaches to Basic Text Understanding in LORELEI

no code implementations • LREC 2018 • Kira Griffitt, Jennifer Tracey, Ann Bies, Stephanie Strassel

Transfer Learning

From 'Solved Problems' to New Challenges: A Report on LDC Activities

no code implementations • LREC 2018 • Christopher Cieri, Mark Liberman, Stephanie Strassel, Denise DiPersio, Jonathan Wright, Andrea Mazzucchi

Dialogue Management Language Identification +2

Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers

no code implementations • LREC 2018 • Zhiyi Song, Ann Bies, Justin Mott, Xuansong Li, Stephanie Strassel, Christopher Caruso

Knowledge Base Population

VAST: A Corpus of Video Annotation for Speech Technologies

no code implementations • LREC 2018 • Jennifer Tracey, Stephanie Strassel

Action Detection Language Identification +2

Event Nugget and Event Coreference Annotation

no code implementations • WS 2016 • Zhiyi Song, Ann Bies, Stephanie Strassel, Joe Ellis, Teruko Mitamura, Hoa Trang Dang, Yukari Yamakawa, Sue Holm

Knowledge Base Population

A Comparison of Event Representations in DEFT

no code implementations • WS 2016 • Ann Bies, Zhiyi Song, Jeremy Getman, Joe Ellis, Justin Mott, Stephanie Strassel, Martha Palmer, Teruko Mitamura, Marjorie Freedman, Heng Ji, Tim O'Gorman

Anomaly Detection

LORELEI Language Packs: Data, Tools, and Resources for Technology Development in Low Resource Languages

no code implementations • LREC 2016 • Stephanie Strassel, Jennifer Tracey

In this paper, we describe the textual linguistic resources in nearly 3 dozen languages being produced by Linguistic Data Consortium for DARPA's LORELEI (Low Resource Languages for Emergent Incidents) Program.

Uzbek-English and Turkish-English Morpheme Alignment Corpora

no code implementations • LREC 2016 • Xuansong Li, Jennifer Tracey, Stephen Grimes, Stephanie Strassel

Morphologically-rich languages pose problems for machine translation (MT) systems, including word-alignment errors, data sparsity and multiple affixes.

Machine Translation Translation +1

Multi-language Speech Collection for NIST LRE

no code implementations • LREC 2016 • Karen Jones, Stephanie Strassel, Kevin Walker, David Graff, Jonathan Wright

The Multi-language Speech (MLS) Corpus supports NIST's Language Recognition Evaluation series by providing new conversational telephone speech and broadcast narrowband data in 20 languages/dialects.

The Query of Everything: Developing Open-Domain, Natural-Language Queries for BOLT Information Retrieval

no code implementations • LREC 2016 • Kira Griffitt, Stephanie Strassel

The DARPA BOLT Information Retrieval evaluations target open-domain natural-language queries over a large corpus of informal text in English, Chinese and Egyptian Arabic.

Information Retrieval Natural Language Queries +1

Parallel Chinese-English Entities, Relations and Events Corpora

no code implementations • LREC 2016 • Justin Mott, Ann Bies, Zhiyi Song, Stephanie Strassel

This paper introduces the parallel Chinese-English Entities, Relations and Events (ERE) corpora developed by Linguistic Data Consortium under the DARPA Deep Exploration and Filtering of Text (DEFT) Program.

Knowledge Base Population Translation

Selection Criteria for Low Resource Language Programs

no code implementations • LREC 2016 • Christopher Cieri, Mike Maxwell, Stephanie Strassel, Jennifer Tracey

This paper documents and describes the criteria used to select languages for study within programs that include low resource languages whether given that label or another similar one.

Management

Large Multi-lingual, Multi-level and Multi-genre Annotation Corpus

no code implementations • LREC 2016 • Xuansong Li, Martha Palmer, Nianwen Xue, Lance Ramshaw, Mohamed Maamouri, Ann Bies, Kathryn Conger, Stephen Grimes, Stephanie Strassel

High accuracy for automated translation and information retrieval calls for linguistic annotations at various language levels.

Information Retrieval Retrieval +2

From Light to Rich ERE: Annotation of Entities, Relations, and Events

no code implementations • WS 2015 • Zhiyi Song, Ann Bies, Stephanie Strassel, Tom Riese, Justin Mott, Joe Ellis, Jonathan Wright, Seth Kulick, Neville Ryant, Xiaoyi Ma

Anomaly Detection Reading Comprehension

A New Dataset and Evaluation for Belief/Factuality

no code implementations • SEMEVAL 2015 • Vinodkumar Prabhakaran, Tomas By, Julia Hirschberg, Owen Rambow, Samira Shaikh, Tomek Strzalkowski, Jennifer Tracey, Michael Arrigo, Rupayan Basu, Micah Clark, Adam Dalton, Mona Diab, Louise Guthrie, Anna Prokofieva, Stephanie Strassel, Gregory Werner, Yorick Wilks, Janyce Wiebe

Knowledge Base Population

Event Nugget Annotation: Processes and Issues

no code implementations • WS 2015 • Teruko Mitamura, Yukari Yamakawa, Susan Holm, Zhiyi Song, Ann Bies, Seth Kulick, Stephanie Strassel

Transliteration of Arabizi into Arabic Orthography: Developing a Parallel Annotated Arabizi-Arabic Script SMS/Chat Corpus

no code implementations • WS 2014 • Ann Bies, Zhiyi Song, Mohamed Maamouri, Stephen Grimes, Haejoong Lee, Jonathan Wright, Stephanie Strassel, Nizar Habash, Ramy Eskander, Owen Rambow

Transliteration

A Comparison of the Events and Relations Across ACE, ERE, TAC-KBP, and FrameNet Annotation Standards

no code implementations • WS 2014 • Jacqueline Aguilar, Charley Beller, Paul McNamee, Benjamin Van Durme, Stephanie Strassel, Zhiyi Song, Joe Ellis

Relation Extraction Semantic Parsing +1

New Directions for Language Resource Development and Distribution

no code implementations • LREC 2014 • Christopher Cieri, Denise DiPersio, Mark Liberman, Andrea Mazzucchi, Stephanie Strassel, Jonathan Wright

Despite the growth in the number of linguistic data centers around the world, their accomplishments and expansions and the advances they have helped enable, the language resources that exist are a small fraction of those required to meet the goals of Human Language Technologies (HLT) for the world's languages and the promises they offer: broad access to knowledge, direct communication across language boundaries and engagement in a global community.

Transfer Learning

The RATS Collection: Supporting HLT Research with Degraded Audio Data

no code implementations • LREC 2014 • David Graff, Kevin Walker, Stephanie Strassel, Xiaoyi Ma, Karen Jones, Ann Sawyer

The DARPA RATS program was established to foster development of language technology systems that can perform well on speaker-to-speaker communications over radio channels that evince a wide range in the type and extent of signal variability and acoustic degradation.

Action Detection Activity Detection +3

Collecting Natural SMS and Chat Conversations in Multiple Languages: The BOLT Phase 2 Corpus

no code implementations • LREC 2014 • Zhiyi Song, Stephanie Strassel, Haejoong Lee, Kevin Walker, Jonathan Wright, Jennifer Garland, Dana Fore, Brian Gainor, Preston Cabe, Thomas Thomas, Brendan Callahan, Ann Sawyer

The DARPA BOLT Program develops systems capable of allowing English speakers to retrieve and understand information from informal foreign language genres.

Machine Translation Translation

Parallel Aligned Treebanks at LDC: New Challenges Interfacing Existing Infrastructures

no code implementations • LREC 2012 • Xuansong Li, Stephanie Strassel, Stephen Grimes, Safa Ismael, Mohamed Maamouri, Ann Bies, Nianwen Xue

Parallel aligned treebanks (PAT) are linguistic corpora annotated with morphological and syntactic structures that are aligned at sentence as well as sub-sentence levels.

Machine Translation Sentence +2

Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual

no code implementations • LREC 2012 • Xuansong Li, Stephanie Strassel, Heng Ji, Kira Griffitt, Joe Ellis

To advance information extraction and question answering technologies toward a more realistic path, the U.S. NIST (National Institute of Standards and Technology) initiated the KBP (Knowledge Base Population) task as one of the TAC (Text Analysis Conference) evaluation tracks.

Cross-Lingual Entity Linking Entity Linking +5

Creating HAVIC: Heterogeneous Audio Visual Internet Collection

no code implementations • LREC 2012 • Stephanie Strassel, Amanda Morris, Jonathan Fiscus, Christopher Caruso, Haejoong Lee, Paul Over, James Fiumara, Barbara Shaw, Brian Antonishek, Martial Michel

Linguistic Data Consortium and the National Institute of Standards and Technology are collaborating to create a large, heterogeneous annotated multimodal corpus to support research in multimodal event detection and related technologies.

Event Detection

Annotation Trees: LDC's customizable, extensible, scalable, annotation infrastructure

no code implementations • LREC 2012 • Jonathan Wright, Kira Griffitt, Joe Ellis, Stephanie Strassel, Brendan Callahan

In recent months, LDC has developed a web-based annotation infrastructure centered around a tree model of annotations and a Ruby on Rails application called the LDC User Interface (LUI).

Reading Comprehension

Linguistic Resources for Handwriting Recognition and Translation Evaluation

no code implementations • LREC 2012 • Zhiyi Song, Safa Ismael, Stephen Grimes, David Doermann, Stephanie Strassel

LDC has developed a stable pipeline and infrastructures for collecting and annotating handwriting linguistic resources to support the evaluation of MADCAT and OpenHaRT.

Document Classification Handwriting Recognition +1
