Towards Knowledgeable Foundation Models
@ ACL 2025 Workshop
Aug 1, 2025 in Vienna, Austria
Knowledge has been an important prerequisite for a variety of AI applications, and is typically sourced from either structured knowledge sources such as knowledge bases and dictionaries or unstructured knowledge sources such as Wikipedia documents.
More recently, researchers have discovered that language models already acquire a significant amount of knowledge through pre-training: LLMs can be used to generate commonsense knowledge and factual knowledge as context for question answering. While the results are encouraging, many questions remain open.
This workshop examines the lifecycle of knowledge within language models.
This is the third edition of the Knowledgeable Foundation Models workshop. Previous editions were held as KnowFM@AAAI2025 and KnowLM@ACL2024.
Knowledge has been an important prerequisite for various NLP applications and is typically derived from either structured knowledge sources such as knowledge bases and dictionaries or unstructured knowledge sources such as Wikipedia documents and news articles.
It is known that language models already possess a significant amount of knowledge through pre-training: LLMs can be used to generate commonsense knowledge and factual knowledge when prompted to do so. However, beyond the surface, there are still many lingering questions, such as: “where does the knowledge come from”, “how do we quantify the amount of knowledge”, “is the knowledge reliable (and do LMs themselves know)”, “how can we augment LMs with domain-specific knowledge”, “how can we revise knowledge without hurting the reasoning abilities of LMs”, and “how can we leverage knowledge to assist the self-correction of LMs”.
In this workshop, we want to bring together researchers who focus on different stages and different aspects (structured knowledge, unstructured knowledge, and knowledge acquired from LMs themselves) of the knowledge lifecycle to discuss the role of knowledge in the era of large language models.
Submission Topics
We welcome submissions on all topics related to knowledgeable LMs.
We will also announce a Best Paper Award at our workshop.
Submission Instructions
We welcome two types of papers: regular workshop papers and non-archival submissions. Only regular workshop papers will be included in the workshop proceedings. The review process will be double-blind. All submissions should be in PDF format following the ACL template and made through the OpenReview submission portal (https://openreview.net/group?id=aclweb.org/ACL/2025/Workshop/KnowFM).
All deadlines are 23:59 UTC-12 (“Anywhere on Earth”, AoE).
Submission Deadline | Jun 6, 2025 (23:59 AoE) |
---|---|
Decision Notifications | Jun 18, 2025 (23:59 AoE) |
Camera-Ready Deadline | Jun 25, 2025 (23:59 AoE) |
Workshop Date | Aug 1, 2025 |
Time | Program |
---|---|
09:00-09:10 | Opening Remarks |
09:10-09:50 | Keynote Speech Preslav Nakov: Towards Truly Open, Language-Specific, Safe, Factual, and Specialized Large Language Models First, we will argue for the need for fully transparent open-source large language models (LLMs), and we will describe the efforts of MBZUAI's Institute on Foundation Models (IFM) towards that goal based on the LLM360 initiative. Second, we will argue for the need for language-specific LLMs, and we will share our experience from building Jais, the world's leading open Arabic-centric foundation and instruction-tuned large language model, Nanda, our open-weights Hindi LLM, Sherkala, our open-weights Kazakh LLM, and some other models. Third, we will argue for the need for safe LLMs, and we will present Do-Not-Answer, a dataset for evaluating the guardrails of LLMs, which is at the core of the safety mechanisms of our LLMs. Fourth, we will argue for the need for factual LLMs, and we will discuss the factuality challenges that LLMs pose. We will then present some recent relevant tools for addressing these challenges developed at MBZUAI: (i) OpenFactCheck, a framework for fact-checking LLM output, for building customized fact-checking systems, and for benchmarking LLMs for factuality; (ii) LM-Polygraph, a tool for predicting an LLM's uncertainty in its output using cheap and fast uncertainty quantification techniques; and (iii) LLM-DetectAIve, a tool for machine-generated text detection. Finally, we will argue for the need for specialized models, and we will present the zoo of LLMs currently being developed at MBZUAI's IFM. |
09:50-10:30 | Keynote Speech Yunyao Li: Declarative to Generative: Building and Querying Enterprise Knowledge Bases Over the last 25 years, innovations in search, knowledge graphs, and even large language models have been adopted by consumers well before enterprises. The delay in enterprise adoption of such technologies is largely due to two factors. First, enterprise knowledge bases vary widely across industry verticals, and even within a vertical, by organization-specific terminology and vocabulary. Second, querying such knowledge bases needs to account for the very low tolerance enterprise users have for mistakes and hallucination. In this talk, I will describe tools to build, maintain, and query such knowledge bases, and the evolution of these tools over two decades from declarative to generative systems. |
10:30-11:00 | Coffee Break |
11:00-11:15 | Oral Presentation: SIS-Fact: Towards Systematic, Interpretable and Scalable Factuality Evaluation for LLM |
11:15-11:30 | Oral Presentation: Atomic Calibration of LLMs in Long-Form Generations |
11:30-11:45 | Oral Presentation: Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning |
11:45-12:00 | Oral Presentation: Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models |
12:00-12:15 | Oral Presentation: The Mirage of Model Editing: Revisiting Evaluation in the Wild |
12:15-12:25 | Best Paper Award Announcement |
12:25-14:10 | Lunch Break |
14:10-14:50 | Keynote Speech Chengxiang Zhai: From Knowledgeable Foundation Models to Knowledgeable Agents: A Neurosymbolic Perspective on Knowledge Representation Foundation models acquire massive amounts of useful knowledge from both pre-training and fine-tuning, but the knowledge they encode in their parameter space is neither interpretable nor verifiable, and their behavior in applying the knowledge at inference time is unpredictable. These limitations cause concerns about their trustworthiness when they are directly used in real-world applications. While much work has attempted to address those limitations by improving a foundation model itself, we argue that they are better addressed by building an agent that can augment the foundation model with a memory mechanism, regulate its behavior using a symbolic representation module, and improve itself over time. In this talk, we will discuss how compression of deep neural networks enables foundation models to acquire generalizable knowledge in both pre-training and fine-tuning, why the behaviors of foundation models are inherently unpredictable, and why it is necessary to build a knowledgeable agent on top of a knowledgeable foundation model and use a neurosymbolic knowledge representation to enable both trustworthiness and lifelong learning of the agent. We will conclude with some promising directions for future research. |
14:50-15:30 | Panel Discussion: Ed Hovy, Chengxiang Zhai, Yunyao Li |
15:30-16:00 | Coffee Break |
16:00-17:30 | Poster Session |
Environment Free Coding Benchmarks: Evaluating Language Model Coding Capabilities without a Dedicated Environment [PDF]
Laurence Liang
How Many Parameters for Multi-Hop? An Information-Theoretic Capacity Law for Knowledge Retrieval in Large Language Models [PDF]
Thomas Chen
GeoEdit: Geometric Knowledge Editing for Large Language Models [PDF]
Yujie Feng, Li-Ming Zhan, ZEXIN LU, Yongxin Xu, Xu Chu, Yasha Wang, Jiannong Cao, Philip S. Yu, Xiao-Ming Wu
Superfluous Instruction: Vulnerabilities Stemming from Task-Specific Superficial Expressions in Instruction Templates [PDF] [Poster]
Toma Suzuki, Yusuke Sakai, Justin Vasselli, Hidetaka Kamigaito, Taro Watanabe
DEAL: Disentangling Transformer Head Activations for LLM Steering [PDF]
Li-Ming Zhan, Bo LIU, ZEXIN LU, Yujie Feng, Chengqiang Xie, Jiannong Cao, Xiao-Ming Wu
Reasoning or Memorization? Investigating LLMs’ Capability in Restoring Chinese Internet Homophones [PDF] [Poster]
Jianfei Ma, Zhaoxin Feng, Huacheng Song, Emmanuele Chersoni, Zheng Chen
Knowledge Mechanisms in Large Language Models: A Survey and Perspective [PDF] [Poster]
Mengru Wang, Yunzhi Yao, Shuofei Qiao, Shumin Deng, Jia-Chen Gu, Fei Huang, Huajun Chen, Ningyu Zhang
Structure-Aware Hyperbolic Representation for Coarse-to-Fine Emotion Classification in Lyrics [PDF]
Yutong Hu, Menglin Yang, Reza Mohammadi
Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models [PDF] [Poster]
Samir Abdaljalil, HASAN KURBAN, Khalid Qaraqe, Erchin Serpedin
IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector [PDF] [Poster]
Samir Abdaljalil, HASAN KURBAN, Khalid Qaraqe, Erchin Serpedin
Context-Efficient Retrieval with Factual Decomposition [PDF]
Yanhong Li, David Yunis, David McAllester, Jiawei Zhou
Meetalk: Retrieval-Augmented and Adaptively Personalized Meeting Summarization with Knowledge Learning from User Corrections [PDF]
Zheng CHEN, JIANG FUTIAN, Yue Deng, Changyang He, Bo Li
Can LLMs Recognize Their Own Analogical Hallucinations? Evaluating Uncertainty Estimation for Analogical Reasoning [PDF]
Zheng CHEN, Zhaoxin Feng, Jianfei Ma, Jiexi Xu, Bo Li
Democratizing LLM Benchmarking via Automated Dynamic Knowledge Evaluation [PDF]
Yanhong Li, Tianyang Xu, Kenan Tang, Karen Livescu, David McAllester, Jiawei Zhou
A Progressive Learning Strategy for Medical Natural Language Understanding [PDF]
ZHE YANG, Yi Huang, Mengfei Guo, Yaqin Chen, Xiaoting Wu, Junlan Feng, Chao Deng
Exploring Personalization Shifts in Representation Space of LLMs [PDF]
Jiahong Liu, Wenhao Yu, Quanyu Dai, Zhongyang Li, Jieming Zhu, Menglin Yang, Tat-Seng Chua, Irwin King
Semantics-Preserving Adversarial Attacks on Event-Driven Stock Prediction Models [PDF] [Poster]
Aofan Liu, haoxuan li, Hongjian Xing, Yuguo Yin, Zijun Li, Yiyan Qi
Beyond Function-Level Search: Repository-Aware Dual-Encoder Code Retrieval with Adversarial Verification [PDF] [Poster]
Aofan Liu, Shiyuan SONG, haoxuan li, Cehao Yang, Yiyan Qi
MD3R: Minimizing Data Distribution Discrepancies to Tackle Inconsistencies in Multilingual Query-Code Retrieval [PDF] [Poster]
Aofan Liu, Yuguo Yin, Hongjian Xing, Zhen Li, Yiyan Qi
ATEB: Rethinking Advanced NLP Tasks in an Information Retrieval Setting [PDF]
Simeng Han, Frank Palma Gomez, Tu Vu, Zefei Li, Daniel Cer, Hansi Zeng, Chris Tar, Arman Cohan, Gustavo Hernandez Abrego
When to Trust Context: Self-Reflective Debates for Context Reliability [PDF]
Zeqi Zhou, Fang Wu, Shayan Talaei, Haokai Zhao, Cheng Meixin, Tinson Xu, Amin Saberi, Yejin Choi
Truth Neurons [PDF] [Poster]
Haohang Li, Yupeng Cao, Yangyang Yu, Jordan W. Suchow, Zining Zhu
FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning [PDF]
Ziming Luo, Ruosen Li, Xinya Du
ToolReAGt: Tool Retrieval for LLM-based Complex Task Solution via Retrieval Augmented Generation [PDF]
Norbert Braunschweiler, Rama Doddipatla, TUDOR-CATALIN ZORILA
COSMIC: Generalized Refusal Direction Identification in LLM Activations [PDF]
Vincent Siu, Nicholas Crispino, Zihao Yu, Sam Pan, Zhun Wang, Yang Liu, Dawn Song, Chenguang Wang
Predicting Task Performance with Context-aware Scaling Laws [PDF]
Kyle Montgomery, David Park, Jianhong Tu, Michael Bendersky, Beliz Gunel, Dawn Song, Chenguang Wang
MLAN: Language-Based Instruction Tuning Preserves and Transfers Knowledge in Multimodal Language Models [PDF]
Jianhong Tu, Zhuohao Ni, Nicholas Crispino, Zihao Yu, Michael Bendersky, Beliz Gunel, Ruoxi Jia, Xin Liu, Lingjuan Lyu, Dawn Song, Chenguang Wang
Stress-Testing Multimodal Foundation Models for Crystallographic Reasoning [PDF] [Poster]
Can Polat, HASAN KURBAN, Erchin Serpedin, Mustafa Kurban
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models [PDF]
Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang
Evaluating RAG Robustness to Symbolic Perturbations [PDF]
Xinyun Zhou, Xinfeng Li, Kun Wang, Xuanwang Zhang, Ming Xu, Yinan Peng, Miao Yu, Yidong Wang, Xiaojun Jia, Qingsong Wen, XiaoFeng Wang, Wei Dong
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning [PDF]
Shuzheng Si, Haozhe Zhao, Cheng Gao, Yuzhuo Bai, Zhitong Wang, Bofei Gao, Kangyang Luo, Wenhao Li, Yufei Huang, Gang Chen, Fanchao Qi, Minjia Zhang, Baobao Chang, Maosong Sun
Knowledge-Grounded Detection of Cryptocurrency Scams with Retrieval-Augmented LMs [PDF]
Zichao Li
FIFA: Unified Faithfulness Evaluation Framework for Text-to-Video and Video-to-Text Generation [PDF]
Liqiang Jing, Viet Dac Lai, Seunghyun Yoon, Trung Bui, Xinya Du
A Comprehensive Analysis for Visual Object Hallucination in Large Vision-Language Models [PDF]
Liqiang Jing, Hardy Chen, Ehsan Aghazadeh, Xin Eric Wang, Xinya Du
Latent Knowledge Scalpel: Precise and Massive Knowledge Editing for Large Language Models [PDF]
Xin Liu, Qiyang Song, Shaowen Xu, Kerou Zhou, Wenbo Jiang, Xiaoqi Jia, Weijuan Zhang, Heqing Huang, Yakai Li
What makes Reasoning Models Different? Follow the Reasoning Leader for Efficient Decoding [PDF]
Ming Li, Zhengyuan Yang, Xiyao Wang, Dianqi Li, Linjie Li, Kevin Lin, Tianyi Zhou, Lijuan Wang
CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners [PDF]
Yunzhi Yao, Jizhan Fang, Jia-Chen Gu, Ningyu Zhang, Shumin Deng, Huajun Chen, Nanyun Peng
SIS-Fact: Towards Systematic, Interpretable and Scalable Factuality Evaluation for LLM [PDF]
Yuzhuo Bai, Kangyang Luo, Wenhao Li, Shuzheng Si, Gang Chen, Fanchao Qi, Maosong Sun
Shallow Focus, Deep Fixes: Enhancing Shallow Layers Vision Attention Sinks to Alleviate Hallucination in LVLMs [PDF]
Xiaofeng Zhang, Yihao Quan, Chen Shen, Chaochen Gu, Xiaosong Yuan, Shaotian Yan, Jiawei Cao, Hao Cheng, Kaijie Wu, Jieping Ye
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training [PDF] [Poster]
Yixin Ou, Yunzhi Yao, Ningyu Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zhenguo Li, Huajun Chen
Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals [PDF]
Lida Chen, Zujie Liang, Xintao Wang, Jiaqing Liang, Yanghua Xiao, Feng Wei, Jinglei Chen, ZHENGHONG HAO, Bing Han, Wei Wang
AttentionRAG: Attention-Guided Context Pruning in Retrieval-Augmented Generation [PDF] [Poster]
Yixiong Fang, Tianran Sun, Yuling Shi, Xiaodong Gu
Atomic Calibration of LLMs in Long-Form Generations [PDF]
Caiqi Zhang, Ruihan Yang, Zhisong Zhang, Xinting Huang, Sen Yang, Dong Yu, Nigel Collier
Transparent and Coherent Procedural Mistake Detection [PDF] [Poster]
Shane Storks, Itamar Bar-Yossef, Yayuan Li, Zheyuan Zhang, Jason J Corso, Joyce Chai
CoRE: Condition-based Reasoning for Identifying Outcome Variance in Complex Events [PDF] [Poster]
Sai P Vallurupalli, Francis Ferraro
EdTec-ItemGen: Enhancing Retrieval-Augmented Item Generation Through Key Point Extraction [PDF]
Alonso Palomino, David Buschhüter, Roland Roller, Niels Pinkwart, Benjamin Paassen
ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision [PDF] [Poster]
Dosung Lee, Wonjun Oh, Boyoung Kim, Minyoung Kim, Joonsuk Park, Paul Hongsuck Seo
Temporal Information Retrieval via Time-Specifier Model Merging [PDF]
SeungYoon Han, Taeho Hwang, Sukmin Cho, Soyeong Jeong, Hoyun Song, Huije Lee, Jong C. Park
The Mirage of Model Editing: Revisiting Evaluation in the Wild [PDF]
Wanli Yang, Fei Sun, Jiajun Tan, Xinyu Ma, Qi Cao, Dawei Yin, Huawei Shen, Xueqi Cheng
MT2ST: Adaptive Multi-Task to Single-Task Learning [PDF]
Dong Liu, Yanxuan Yu