Rusnachenko, N., Liu, X., Chang, J. and Zhang, J. J., 2025. Using decoder-based distillation for enhancing multilingual clinical case report summarization. Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2025), 4038, 544-553.
Full text available as:
PDF (paper_39.pdf), Published Version, 4MB. Available under License: Creative Commons Attribution.
Copyright to original material in this document is with the original owner(s). Access to this content through BURO is granted on condition that you use it only for research, scholarly or other non-commercial purposes. If you wish to use it for any other purposes, you must contact BU via BURO@bournemouth.ac.uk. Any third party copyright material in this document remains the property of its respective owner(s). BU grants no licence for further use of that third party material.
Official URL: https://ceur-ws.org/Vol-4038/
Abstract
Automatic summarization of clinical reports represents an important field of study that contributes to shortening long textual narratives written in various languages. Effective report summarization poses numerous challenges, including the density of medical term mentions and the semantic interdependency among mentioned entities. Recent advances in instruction-tuned models illustrate promising capabilities of models at various scales across numerous fields of Natural Language Processing, including text summarization. A hybrid teacher-student distillation process leverages the power of knowledge distillation by transferring knowledge from a large model (teacher) to a smaller model (student). To the best of our knowledge, numerous existing studies broadly exploit Seq2seq models. Despite their effectiveness for dialogues and the summarization of short texts, such techniques have not become common for supporting multilingual and long input contexts. To bridge the gap in exploring distillation tuning, this paper proposes an adaptation of the teacher-student framework for decoder-based systems. We experiment with a teacher-student framework for summarising clinical case reports, adopting the Qwen2.5 model family and evaluating our setup on the MultiClinSum<sup>small</sup> dataset. We demonstrate that fine-tuning the 0.5B model with knowledge transferred from the 72B model yields a 2.4%-4% performance increment in Rouge metrics compared to the conventional fine-tuning process, highlighting our model's practical benefits in clinical information processing. Our framework is publicly available: https://github.com/nicolay-r/distil-tuning-llm
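The teacher-student transfer described in the abstract is commonly realised as a distillation objective that matches the student's output distribution to the teacher's. The sketch below is a generic, dependency-free illustration of that idea (temperature-softened KL divergence, following Hinton et al., 2015) and is not taken from the paper's released code; the function names and the temperature value are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    so gradients keep the same magnitude as the hard-label loss."""
    p = softmax(teacher_logits, temperature)   # teacher distribution
    q = softmax(student_logits, temperature)   # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In practice this term is computed per token over the vocabulary and combined with the usual cross-entropy loss on the reference summary; when student and teacher agree exactly, the loss is zero.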
| Item Type: | Article |
|---|---|
| ISSN: | 1613-0073 |
| Uncontrolled Keywords: | Large Language Model; Hybrid Distillation; Clinical Report Summarization; Multilingual Summarization |
| Group: | Faculty of Media, Science and Technology |
| ID Code: | 41500 |
| Deposited By: | Symplectic RT2 |
| Deposited On: | 09 Mar 2026 16:46 |
| Last Modified: | 09 Mar 2026 16:46 |