ELECTRA-Based Clinical Text Modeling for Automated Medical Specialty Classification

Authors

  • B. Rajesh Reddy
  • G Navya
  • Jeedhula Sai jeevan
  • Mohammed Mujaheed
  • Bachali Ruthvik

DOI:

https://doi.org/10.64751/ajmimc.2026.v5.n2(1).291

Keywords:

Clinical Text Classification, Electronic Health Records (EHRs), Natural Language Processing (NLP), Machine Learning (ML), Text Mining, Flask Deployment

Abstract

The rapid expansion of digital healthcare data, particularly unstructured clinical text such as Electronic Health Records (EHRs) and medical transcriptions, has created significant opportunities for building intelligent healthcare systems. These data sources contain valuable insights that can support medical specialty identification and enhance clinical workflows. However, extracting meaningful information from such text remains challenging due to its complex structure, domain-specific terminology, and high dimensionality. This study addresses the problem of automatically classifying clinical text into appropriate medical specialties, a task essential for improving patient care, optimizing resource allocation, and enabling accurate clinical decision-making. Traditional methods, including manual annotation and rule-based systems, are often time-consuming, error-prone, and difficult to scale. Moreover, conventional Machine Learning (ML) approaches rely heavily on handcrafted features and fail to capture deep contextual and semantic relationships, especially in large-scale and imbalanced datasets. To overcome these limitations, the proposed system combines transformer-based embeddings with ML classifiers. Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA) is used to generate rich contextual representations of clinical text. To handle class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) is applied for data augmentation. Several classifiers, including Adaptive Boosting (AB), Random Forest (RF), Tree Alternating Optimization (TT), and Extra Trees (ET), are evaluated. Among them, ET demonstrates the best performance and is selected as the final model. The system is deployed using the Flask framework with authentication and real-time prediction capabilities, with the aim of improving accuracy, scalability, and robustness for intelligent healthcare analytics.
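
The abstract names SMOTE as the class-balancing step but does not show how it works. As a minimal sketch of the core idea (not the authors' implementation), the Python below generates synthetic minority samples by interpolating between a sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy 2-D vectors standing in for ELECTRA embeddings are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; the paper's own code is not shown.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-D vectors standing in for embeddings of an under-represented specialty.
rare_specialty = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(rare_specialty, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region those samples already occupy. In practice a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would likely be used, applied to the training split only so that synthetic samples do not leak into evaluation data.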

Published

2026-04-23

How to Cite

B. Rajesh Reddy, G Navya, Jeedhula Sai jeevan, Mohammed Mujaheed, & Bachali Ruthvik. (2026). ELECTRA-Based Clinical Text Modeling for Automated Medical Specialty Classification. American Journal of Management and IOT Medical Computing, 5(2), 548-559. https://doi.org/10.64751/ajmimc.2026.v5.n2(1).291