
Enhancing Language Models with Commonsense Knowledge for Multi-turn Response Selection

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

Dialogue systems, a prospering branch of artificial intelligence, rely on multi-turn response selection as a core research problem. With the assistance of background information and pre-trained language models, state-of-the-art methods on this problem have achieved impressive improvements. However, existing studies neglect external commonsense knowledge. Hence, we design a Siamese network in which a pre-trained Language model is merged with a Graph neural network (SinLG). SinLG takes advantage of a Pre-trained Language Model (PLM) to capture word correlations in the context and response candidates, and utilizes a Graph Neural Network (GNN) to reason over helpful common sense from an external knowledge graph. The GNN assists the PLM during fine-tuning, arousing its related memories to attain better performance. Specifically, for each sample we first extract related concepts from an external knowledge graph as nodes and construct a subgraph, with the context-response pair serving as a super node. Next, we learn two representations of the context-response pair, one from the PLM and one from the GNN. A similarity loss between the two representations transfers the commonsense knowledge from the GNN to the PLM. Only the PLM is then used for online inference, so that efficiency is guaranteed. Finally, extensive experiments on two variants of the PERSONA-CHAT dataset demonstrate that our solution not only improves the performance of the PLM but also achieves efficient inference.
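The training scheme the abstract describes can be sketched as follows. This is a minimal, illustrative PyTorch sketch, not the authors' implementation: toy encoders (`ToyPLM`, `ToyGNN`) stand in for the actual BERT-style PLM and the GNN over the concept subgraph, and the loss weight is an assumed value. It only shows the key idea: a similarity loss pulls the PLM's representation toward the GNN's, so the GNN branch can be dropped at inference time.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative stand-ins (names are hypothetical, not from the paper's code).
class ToyPLM(nn.Module):
    """Stand-in for a pre-trained LM: mean-pools token embeddings."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)

    def forward(self, token_ids):                # (batch, seq)
        return self.emb(token_ids).mean(dim=1)   # (batch, dim)

class ToyGNN(nn.Module):
    """Stand-in for a GNN: one round of mean aggregation over the
    concept subgraph, then readout of node 0 (the super node that
    represents the context-response pair)."""
    def __init__(self, dim=64):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, node_feats, adj):          # (nodes, dim), (nodes, nodes)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = F.relu(self.lin(adj @ node_feats / deg))
        return h[0]                              # super-node representation

torch.manual_seed(0)
plm, gnn = ToyPLM(), ToyGNN()
scorer = nn.Linear(64, 1)                        # response-selection head

tokens = torch.randint(0, 1000, (2, 16))         # 2 context-response pairs
node_feats = torch.randn(5, 64)                  # 5 concept nodes per subgraph
adj = (torch.rand(5, 5) > 0.5).float()           # toy subgraph adjacency

z_plm = plm(tokens)                                             # (2, 64)
z_gnn = torch.stack([gnn(node_feats, adj) for _ in range(2)])   # (2, 64)

# Similarity loss transfers commonsense from the GNN branch into the PLM,
# so only the PLM is needed online.
sim_loss = 1 - F.cosine_similarity(z_plm, z_gnn).mean()
match_loss = F.binary_cross_entropy_with_logits(
    scorer(z_plm).squeeze(-1), torch.tensor([1.0, 0.0]))
loss = match_loss + 0.1 * sim_loss               # 0.1: illustrative weight
loss.backward()                                  # both branches train jointly
```

At test time only `plm` and `scorer` are run, which is the efficiency argument made in the abstract: the GNN's cost is paid during fine-tuning, not during online inference.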



Data Availability

No datasets were generated or analysed during the current study.




Acknowledgements

The authors would like to thank Ming Ding and Yangli-ao Geng for their empirical suggestions on this work, and the anonymous reviewers for their insightful comments.

Funding

This work is supported by the National Natural Science Foundation of China under Grant Nos. 62206148, 62272322, and 62272323, and by the Beijing Nova Program (No. 20230484409).

Author information

Authors and Affiliations

Authors

Contributions

Yuandong Wang wrote the main manuscript and conducted most of the experiments. Xuhui Ren participated in the discussions of the framework and helped conduct some of the baseline model evaluations. Tong Chen participated in all discussions of the framework. Hongzhi Yin and Nguyen Quoc Viet Hung provided valuable polishing suggestions. All authors reviewed the manuscript.

Corresponding author

Correspondence to Yuandong Wang.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, Y., Ren, X., Chen, T. et al. Enhancing Language Models with Commonsense Knowledge for Multi-turn Response Selection. Int. J. Mach. Learn. & Cyber. (2025). https://doi.org/10.1007/s13042-025-02804-9

