Rusnachenko, N., Liang, H., Kalameyets, M. and Shi, L., 2024. ARElight: Context sampling of large texts for deep learning relation extraction. In: 46th European Conference on Information Retrieval, ECIR 2024, 24-28 March 2024, Glasgow, UK, 229-235.
Full text available as:
PDF: ECIR_2024_arekit_sampling_paper.pdf - Accepted Version (1MB)
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
DOI: 10.1007/978-3-031-56069-9_23
Abstract
The escalating volume of textual data necessitates adept and scalable Information Extraction (IE) systems in the field of Natural Language Processing (NLP) to analyse massive text collections in a detailed manner. While most deep learning systems are designed to handle textual information as it is, the interface between a document and the annotation of its parts remains poorly covered. Concurrently, one of the major limitations of most deep learning models is a constrained input size caused by architectural and computational specifics. To address this, we introduce ARElight<sup>1</sup>, a system designed to efficiently manage and extract information from sequences of large documents by dividing them into segments with mentioned object pairs. Through a pipeline comprising modules for text sampling, inference, optional graph operations, and visualisation, the proposed system transforms large volumes of text in a structured manner. Practical applications of ARElight are demonstrated across diverse use cases, including literature processing and social network analysis. (<sup>1</sup> https://github.com/nicolay-r/ARElight)
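The idea of dividing a large document into segments that mention object pairs can be illustrated with a minimal sketch. This is not ARElight's actual API: the function name `sample_contexts`, the naive sentence splitter, the substring-based entity matching, and the `max_chars` input budget are all assumptions made for illustration.

```python
import re
from itertools import combinations


def sample_contexts(text, entities, max_chars=200):
    """Hypothetical helper: return (segment, (e1, e2)) for every
    sentence-like segment that mentions at least two known entities."""
    # Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace.
    segments = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    samples = []
    for segment in segments:
        # Truncate so that each sample respects a fixed input-size budget.
        window = segment[:max_chars]
        # Plain substring matching stands in for a real entity recogniser.
        mentioned = [e for e in entities if e in window]
        # Emit one sample per unordered pair of entities found in the window.
        for pair in combinations(mentioned, 2):
            samples.append((window, pair))
    return samples


text = "Alice met Bob in Paris. The weather was nice. Bob later called Carol."
entities = ["Alice", "Bob", "Carol", "Paris"]
samples = sample_contexts(text, entities)
for segment, pair in samples:
    print(pair, "<-", segment)
```

Segments that mention fewer than two entities are skipped, so only text that could carry a relation between a pair of objects would be passed on to an inference step.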
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| ISSN: | 0302-9743 |
| Uncontrolled Keywords: | Data Processing Pipeline; Information Retrieval; Visualisation |
| Group: | Faculty of Media, Science and Technology |
| ID Code: | 41508 |
| Deposited By: | Symplectic RT2 |
| Deposited On: | 20 Mar 2026 16:05 |
| Last Modified: | 20 Mar 2026 16:05 |