Better Extraction from Text Towards Enhanced Retrieval (BETTER)

The volume of unstructured text generated each day is growing rapidly. This makes it difficult for analysts of all kinds to classify, triage, and examine all of the information relevant to their specific problem areas, which may span multiple languages. To keep pace with this ever-increasing amount of information, new tools and methods are needed to enable personalized extraction of semantic information from text and the application of that information to triage and retrieval problems.

Current systems and methods often treat extraction problems as “one size fits all,” with ontologies defined a priori and used to address a wide range of analyst needs. Additionally, information retrieval systems lack deep integration with information extraction systems, especially where personalized information extraction is concerned. A final hurdle is that many tools apply to only a single language or problem domain.

The BETTER program aims to develop enhanced methods for personalized, multilingual semantic extraction and retrieval from text. The goal is to provide a user with a system that quickly and accurately extracts complex semantic information, targeted to that specific user, from text. The system then uses this extracted information to discover and triage relevant documents from a large corpus. Toward this end, BETTER will focus on three research areas: 1) information extraction, 2) information retrieval, and 3) human-in-the-loop interaction. BETTER will apply these three research areas across three phases composed of diverse problem areas and language sets. Increasingly fine-grained information needs will focus the information extraction and retrieval elements throughout each phase and provide a basis for human-in-the-loop interaction.


