Presentation format:
|
Submission number - 980307 OTOP 3-4 |
ADVANCING TINNITUS THERAPEUTICS: GPT-2 DRIVEN CLUSTERING ANALYSIS OF
COGNITIVE BEHAVIORAL THERAPY SESSIONS AND GOOGLE T5-BASED PREDICTIVE
MODELING FOR THI SCORE ASSESSMENT |
ROWAN INC., SEOUL, REPUBLIC OF KOREA1, NEURIVE CO., LTD., GIMHAE, REPUBLIC OF KOREA2, DEPARTMENT OF OTORHINOLARYNGOLOGY - HEAD AND NECK SURGERY, KOREA UNIVERSITY MEDICAL CENTER, SEOUL, REPUBLIC OF KOREA2 |
YONGWOO JEONG,
YONGWOO JEONG1, JAE-JUN SONG2, JISEON YANG1, SUNGMIN KANG1
|
Purpose: This study aims to demonstrate the usability of large language models (LLMs)
for tinnitus Cognitive Behavioral Therapy (CBT) analysis and THI score
prediction. Methods: Cognitive Behavioral Therapy (CBT) for tinnitus alleviates
psychological discomfort caused by severe tinnitus symptoms. During
CBT, patients are given various homework assignments, including
writing daily diaries and self-monitoring. Most of these homework
assignments are handwritten textual data. This paper proposes that
tinnitus therapeutics can utilize Large Language Models (LLMs) to
analyze CBT and predict the outcomes of CBT treatments to manage high
caseloads. We anonymized patient data and examined it with GPT-2-based
embeddings, GPU-accelerated dimensionality reduction, and
clustering to observe how patients changed their
misconceptions and came to experience less unnecessary, excessive emotional
discomfort, and how their Tinnitus Handicap Inventory (THI) scores
improved after the CBT treatment. We also discussed clustering results
as a part of the demonstrations that LLMs can give us insights into
the CBT. Then, to overcome the limited amount of available data, we
augmented the textual patient data in three ways, each with a
corresponding penalty to minimize augmentation bias. The
augmentation algorithm we employed has three combinations, each with a
different penalty level, and we created three unique datasets, one per
combination. We trained the Google T5
Transformer with the augmented data to predict the THI score outcomes
at the end of the CBT sessions. We measured performance using the
ROUGE-L metric during training and validation. The THI scores
generated by Google T5 were converted from strings to floats to compute
the RMSE, which showed that the LLM could predict the outcome
of CBT treatment with CBT data. Results: As the complexity of the augmentation increases, the error
rates also increase in general. We trained the same Google
T5 LLM separately on each augmented dataset and compared the prediction outputs. As
text augmentation and typo complexity increase, the RMSE drops slightly
but remains within 0.0138 to 0.005, and severe numerical augmentation
also increases ROUGE-L losses slightly, from 0.7514 to 0.8600, which
indicates that the Google T5 LLM generalizes well and can
predict the outcome of the tinnitus CBT treatment based on CBT text
entries and partial patient information. Conclusion: The Google T5 LLM was able to generalize across variations of the augmented
dataset and still predict the correct CBT treatment outcome.
Even though data augmentation on a small dataset
carries a risk of overfitting, our approach suggests that LLM-based
tinnitus therapeutics can play a crucial role in managing a high caseload of
tinnitus patients. However, future work should include a
larger and more diverse dataset that reflects real-world settings. |
|
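
The Methods describe converting the THI scores generated by Google T5 from strings to floats before computing RMSE. The following is only a minimal sketch of that evaluation step, not the authors' implementation. The function names `parse_score` and `rmse`, the choice to skip outputs that cannot be parsed, and the sample values are all assumptions made for illustration; none of the numbers come from the study.

```python
import math

def parse_score(text):
    """Convert a generated string to a float; return None if it is not numeric.
    (Hypothetical helper; the abstract does not say how bad outputs were handled.)"""
    try:
        return float(text.strip())
    except ValueError:
        return None

def rmse(predicted_strings, reference_scores):
    """Root-mean-square error over the predictions that could be parsed.
    Skipping unparsable outputs is an assumption of this sketch."""
    pairs = [(parse_score(p), r) for p, r in zip(predicted_strings, reference_scores)]
    valid = [(p, r) for p, r in pairs if p is not None]
    if not valid:
        raise ValueError("no parsable predictions")
    return math.sqrt(sum((p - r) ** 2 for p, r in valid) / len(valid))

# Illustrative values only (not study data).
generated = ["0.42", " 0.18 ", "n/a", "0.60"]
targets = [0.40, 0.20, 0.35, 0.60]
score = rmse(generated, targets)
print(f"RMSE over parsable outputs: {score:.4f}")
```

Because a text-to-text model can emit non-numeric strings, an evaluation of this kind has to decide how to treat them. Dropping them, as done here, lowers the reported error relative to penalizing them, so the choice should be stated alongside the metric.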