CITYU Theses & Dissertations
Thesis Details
花奕丰
魏中倫
Faculty of Data Science
Master's Degree Programme in Data Science (Chinese-medium)
Master
2024
A Study on Intelligent Q&A of Traditional Chinese Medicine Based on Large Language Model
Named entity recognition ; Knowledge graph ; Large language model ; Chengjiang Zhu's medical cases
Public release date: 26/5/2028
With the rapid advancement of information technology and the profound transformation of the internet industry, global data volume has expanded exponentially, while the disorderly and fragmented nature of massive heterogeneous data resources has become increasingly prominent. In this context, achieving precise knowledge localization, semantic organization, and efficient utilization within complex information ecosystems has emerged as a critical challenge requiring urgent breakthroughs in the digital transformation process.
To address this challenge, knowledge graph technology has established a core technical framework for intelligent knowledge services through the construction of structured semantic association networks, integrated with the deep semantic comprehension and generation capabilities of large-scale language models. This framework not only significantly enhances knowledge retrieval efficiency and cognitive decision-making accuracy in complex scenarios, but also provides methodological support for cross-domain knowledge integration and innovative applications. This research focuses on the construction of a traditional Chinese medicine (TCM) knowledge system and the realization of intelligent services. First, Named Entity Recognition (NER) was investigated, and a high-precision entity extraction model tailored to the characteristics of TCM corpora was developed, establishing the technical foundation for automated TCM knowledge graph construction. Subsequently, the Chengjiang Zhu Clinical Case Knowledge Graph was constructed, integrating the Zhu family's clinical records (unstructured text) with medical encyclopedia data (semi-structured information) through multi-source heterogeneous data fusion and conflict-resolution mechanisms, yielding a specialized, high-credibility TCM diagnosis and treatment knowledge base. Finally, a domain-adapted large language model was developed based on the Chinese LLaMA (Large Language Model Meta AI) architecture and fine-tuned on Chengjiang Zhu's medical case data, enabling the intelligent Q&A system to provide sound interpretations of TCM theory, formula compatibility, and syndrome differentiation logic, effectively addressing users' needs for everyday TCM consultation and health management. The primary research contribution is the Cascade-BERT-BiLSTM-CRF (CBBC) model, which optimizes entity annotation in stages through a multi-task cascading mechanism. Experimental results show that the CBBC model achieves an F1-score of 84.28% on the test set, a 1.2% improvement over the traditional BERT-BiLSTM-CRF model, with a marked improvement in resolving nested entities.
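To make the cascade idea concrete, the following minimal Python sketch shows one way a BERT encoder, a BiLSTM layer, and a CRF decoder can be cascaded with a second entity-type head: stage one tags entity boundaries, stage two assigns entity types over the shared features. The class name, label counts, and training details are illustrative assumptions rather than the thesis's actual implementation; it presumes the `transformers` and `pytorch-crf` packages.

# A minimal sketch of the cascade BERT-BiLSTM-CRF (CBBC) idea, assuming the
# `transformers` and `pytorch-crf` packages. Stage 1 tags entity boundaries
# (B/I/O) with a CRF; stage 2 classifies entity types over the shared
# features. All names and label counts here are illustrative assumptions.
import torch.nn as nn
from torchcrf import CRF
from transformers import BertModel

class CascadeBertBiLstmCrf(nn.Module):
    def __init__(self, bert_name="bert-base-chinese",
                 num_boundary_tags=3,   # B / I / O
                 num_entity_types=9,    # e.g. herb, symptom, formula (assumed)
                 lstm_hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        self.bilstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        # Stage 1: boundary emissions decoded by a CRF layer.
        self.boundary_head = nn.Linear(2 * lstm_hidden, num_boundary_tags)
        self.crf = CRF(num_boundary_tags, batch_first=True)
        # Stage 2 (the cascade step): per-token entity-type classification
        # over the same BiLSTM features.
        self.type_head = nn.Linear(2 * lstm_hidden, num_entity_types)

    def forward(self, input_ids, attention_mask,
                boundary_labels=None, type_labels=None):
        hidden = self.bert(input_ids,
                           attention_mask=attention_mask).last_hidden_state
        feats, _ = self.bilstm(hidden)
        boundary_emissions = self.boundary_head(feats)
        type_logits = self.type_head(feats)
        if boundary_labels is not None:
            # Training: CRF negative log-likelihood plus type cross-entropy.
            loss = -self.crf(boundary_emissions, boundary_labels,
                             mask=attention_mask.bool(), reduction="mean")
            loss = loss + nn.functional.cross_entropy(
                type_logits.transpose(1, 2), type_labels, ignore_index=-100)
            return loss
        # Inference: Viterbi-decode boundaries, then read off token types.
        boundaries = self.crf.decode(boundary_emissions,
                                     mask=attention_mask.bool())
        return boundaries, type_logits.argmax(-1)

Separating boundary detection from type classification is the intent of the cascade design: the two sub-tasks are optimized in stages, which is what the thesis credits for the improved resolution of nested entities over a single flat tag set.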
Concurrently, a medical Q&A large language model was built on the LLaMA2 architecture. By combining two core capabilities, knowledge acquisition and answer generation, the system retrieves a TCM knowledge subgraph from the graph database based on the user's query, then generates a response from the medical Q&A model together with the retrieved subgraph and returns it to the front-end interface. The implementation ultimately delivers intelligent Q&A, graph querying, and graph management functionality.
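The retrieve-then-generate flow can be pictured with the minimal sketch below, assuming a Neo4j graph store and a Hugging Face-format checkpoint for the fine-tuned model; the connection details, Cypher pattern, and the `path/to/tcm-llama2` model path are hypothetical placeholders, not the system's actual schema or weights.

# Illustrative sketch of the retrieve-then-generate flow described above,
# assuming a Neo4j graph store (`neo4j` driver) and a Hugging Face-format
# checkpoint for the fine-tuned model. The connection details, Cypher
# pattern, and model path are hypothetical placeholders.
from neo4j import GraphDatabase
from transformers import AutoModelForCausalLM, AutoTokenizer

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))        # assumed credentials
tokenizer = AutoTokenizer.from_pretrained("path/to/tcm-llama2")  # hypothetical checkpoint
model = AutoModelForCausalLM.from_pretrained("path/to/tcm-llama2")

def fetch_subgraph(entity: str, limit: int = 20) -> list[str]:
    """Pull triples around a recognized entity as plain-text facts."""
    query = ("MATCH (e {name: $name})-[r]-(n) "
             "RETURN e.name AS head, type(r) AS rel, n.name AS tail "
             "LIMIT $limit")
    with driver.session() as session:
        records = session.run(query, name=entity, limit=limit)
        return [f"{r['head']} -[{r['rel']}]-> {r['tail']}" for r in records]

def answer(question: str, entity: str) -> str:
    """Ground the model's generation in the retrieved knowledge subgraph."""
    facts = "\n".join(fetch_subgraph(entity))
    prompt = (f"Known TCM knowledge:\n{facts}\n\n"
              f"Question: {question}\nAnswer:")
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=256)
    # Strip the prompt tokens and return only the generated answer.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

In the deployed system, responses produced this way are returned to the front-end interface alongside the graph-query and graph-management views.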
2025
Chinese
53
Acknowledgements
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Questions
1.3 Research Approach
1.4 Research Significance
1.5 Thesis Organization
Chapter 2 Literature Review
2.1 Domestic and International Development
2.2 Natural Language Processing
2.2.1 Word Vector Representation Techniques
2.2.2 Cascade Structure
2.3 Knowledge Graphs
2.3.1 Overview of Knowledge Graphs
2.3.2 Named Entity Recognition
2.3.3 Entity Alignment
2.3.4 Medical Knowledge Graphs
2.3.5 Knowledge Storage
2.4 Large Language Models
2.4.1 Fine-Tuning of Large Models
2.4.2 Prompt Engineering
2.4.3 Model Performance Evaluation
2.5 Chapter Summary
Chapter 3 Research Methodology
3.1 Main Research Content
3.2 Model Construction
3.2.1 BERT-Based Word Vector Encoding
3.2.2 BiLSTM Layer
3.2.3 CRF Layer
3.2.4 Cascade Structure
3.3 Chapter Summary
Chapter 4 Results and Analysis
4.1 Named Entity Recognition Model Experiments
4.1.1 Dataset Annotation and Preprocessing
4.1.2 Experimental Environment and Parameter Settings
4.1.3 Experimental Results and Analysis
4.2 Construction of the TCM Knowledge Graph and Design of the Intelligent Q&A System
4.2.1 Knowledge Graph Construction Process
4.2.2 Data Collection and Preprocessing
4.2.3 Knowledge Extraction
4.2.4 Graph Storage and Visualization
4.3 LLaMA2-Based Medical Q&A Large Model
4.3.1 Construction of the Medical Q&A Large Model
4.3.2 Evaluation Methods for Large Language Models in the TCM Domain
4.3.3 Comparative Experimental Results of Large Language Models
4.4 Chapter Summary
Chapter 5 Conclusion and Outlook
5.1 Conclusion
5.2 Outlook
References
Author's Biography
Appendix