Using Open Source LLM Model for Medical Transcription

Published in Metropolia University of Applied Sciences Journal, 2025

In modern healthcare, clinical documentation is paramount for patient safety, accurate diagnoses, and continuity of care. However, the growing documentation overhead of electronic health record (EHR) systems contributes to physician burnout and leaves less time for real human interaction. The challenge is even greater in less-resourced languages such as Finnish, for which natural language processing (NLP) tools are only beginning to emerge. This thesis investigates the fine-tuning of the open-source LLaMA 3.1 8B language model on simulated Finnish clinical conversations, that is, transcribed clinical dialogues created by Metropolia UAS students. The aim is to verify whether a domain-aligned large language model (LLM) can reliably convert spoken Finnish medical discourse into formal clinical reports. Under 7-fold cross-validation, the fine-tuned model achieved a BLEU score of 0.1242, a ROUGE-L score of 0.4982, and a BERTScore F1 of 0.8373. These results indicate satisfactory semantic performance given the small dataset and suggest that privacy-oriented NLP tools can scale to Finnish medical environments.
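The evaluation setup named in the abstract (7-fold cross-validation scored with an overlap metric such as ROUGE-L) can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the contiguous fold split, the whitespace tokenization, and the plain harmonic-mean form of ROUGE-L F1 are all assumptions made for illustration, and the example strings are invented.

```python
def k_fold_indices(n_items, k=7):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumption: folds are contiguous and unshuffled; the thesis may
    have split its dialogues differently.
    """
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 on whitespace tokens (unweighted harmonic mean).

    Assumption: no stemming or Finnish-specific tokenization is applied.
    """
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical usage: 14 dialogues split into 7 folds of 2 each.
    for fold, (train, test) in enumerate(k_fold_indices(14, k=7), start=1):
        print(f"fold {fold}: train={len(train)} test={test}")
    # Invented reference/candidate pair, for illustration only.
    print(rouge_l_f1("potilaalla on kuumetta ja yskää", "potilaalla on kuumetta"))
```

In a setup like this, the model would be fine-tuned on each fold's training indices, evaluated on the held-out indices, and the per-fold scores averaged to produce a single reported figure.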

Fig. 1. Keenious use flow.


Recommended citation: Mohammed Nowshad Ruhani Chowdhury. (2025). "Using Open Source LLM Model for Medical Transcription." Metropolia University of Applied Sciences Journal. 1(3). https://www.theseus.fi/bitstream/handle/10024/890628/Chowdhury_Mohammed_Nowshad_Ruhani.pdf?sequence=2