
Here are the top 5 most relevant AI papers from arXiv week 48/2025, complete with analysis and insights.
Publications at a Glance
Failure Modes in LLM Systems: A System-Level Taxonomy for Reliable AI Applications
Efficient Multi-Hop Question Answering over Knowledge Graphs via LLM Planning and Embedding-Guided Search Manil Shrestha, Edward Kim | 11/24/2025
FastForward Pruning: Efficient LLM Pruning via Single-Step Reinforcement Learning Xin Yuan, Siqi Li, Jiateng Wei, Chengrui Zhu, Yanming Wu, Qingpeng Li, Jiajun Lv, Xiaoke Lan, Jun Chen, Yong Liu | 11/24/2025
MLPMoE: Zero-Shot Architectural Metamorphosis of Dense LLM MLPs into Static Mixture-of-Experts Ivan Novikov | 11/26/2025
Breaking the Safety-Capability Tradeoff: Reinforcement Learning with Verifiable Rewards Maintains Safety Guardrails in LLMs Dongkyu Derek Cho, Huan Song, Arijit Ghosh Chowdhury, Haotian An, Yawei Wang, Rohit Thekkanal, Negin Sokhandan, Sharlina Keshava, Hannah Marlowe | 11/26/2025
Failure Modes in LLM Systems: A System-Level Taxonomy for Reliable AI Applications
Key Insights
This research introduces a comprehensive taxonomy of fifteen hidden failure modes specific to large language models (LLMs), failure modes that differ from those seen in traditional machine learning systems deployed in real-world applications. By framing LLM reliability as a system-engineering problem, it argues that evaluation and monitoring practices must be tailored to the complexities of LLM deployment.
Potential Impact
The findings could significantly enhance the reliability and robustness of LLM applications in decision-support tools and automation workflows, ultimately leading to safer and more effective AI systems. This work may shift the focus of future research and development from solely model performance to a broader consideration of system-level integration and operational stability.
Efficient Multi-Hop Question Answering over Knowledge Graphs via LLM Planning and Embedding-Guided Search
Key Insights
This research introduces two innovative algorithms that enhance the efficiency and verifiability of multi-hop question answering over knowledge graphs, significantly reducing reliance on large language models. The proposed methods achieve high accuracy while ensuring that answers are grounded in structured knowledge, addressing key limitations of existing approaches.
Potential Impact
By demonstrating that effective multi-hop reasoning can be achieved with smaller models and reduced computational costs, this work could democratize access to advanced question-answering systems, making them more practical for various applications. The integration of symbolic structures with learned representations may also inspire new directions in AI research, particularly in knowledge representation and reasoning.
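The paper's two algorithms are not reproduced here, but the general idea of embedding-guided search over a knowledge graph can be sketched as a small beam search: a relation plan (which an LLM planner is assumed to have produced) guides each hop by embedding similarity. The toy triples and random relation embeddings below are stand-ins for a real graph and a trained encoder.

```python
import numpy as np

# Toy knowledge graph as (head, relation, tail) triples.
triples = [
    ("Ada_Lovelace", "collaborated_with", "Charles_Babbage"),
    ("Charles_Babbage", "designed", "Analytical_Engine"),
    ("Ada_Lovelace", "born_in", "London"),
]
graph = {}
for h, r, t in triples:
    graph.setdefault(h, []).append((r, t))

# Random stand-in embeddings; a real system would use a trained encoder.
rng = np.random.default_rng(2)
emb = {r: rng.normal(size=8) for r in {"collaborated_with", "designed", "born_in"}}

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def guided_search(start, plan, beam=2):
    # 'plan' is the relation sequence a planner might emit; at each hop,
    # keep the beam of edges whose relation embedding best matches the step.
    frontier = [(start, 0.0)]
    for step in plan:
        scored = [(t, s + cos(emb[r], emb[step]))
                  for node, s in frontier for r, t in graph.get(node, [])]
        frontier = sorted(scored, key=lambda p: -p[1])[:beam]
    return frontier[0][0] if frontier else None

ans = guided_search("Ada_Lovelace", ["collaborated_with", "designed"])
```

Because every hop traverses actual graph edges, the answer is grounded in the knowledge graph by construction, which is the verifiability property the summary highlights.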
FastForward Pruning: Efficient LLM Pruning via Single-Step Reinforcement Learning
Key Insights
FastForward Pruning introduces a single-step reinforcement learning framework that discovers effective non-uniform, layer-wise sparsity allocations for large language models, sidestepping the computational cost of traditional multi-step search. A curriculum-based strategy further reduces overhead while outperforming existing pruning heuristics.
Potential Impact
By enabling more efficient and effective pruning of large language models, FastForward Pruning could lead to wider adoption of model compression techniques, making advanced AI more accessible and resource-efficient. This advancement may catalyze the development of more powerful and optimized applications across various domains, from natural language processing to real-time AI systems.
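The paper's RL policy and curriculum are not reproduced here, but the core object it searches for is a per-layer sparsity vector applied via pruning. As a hedged illustration, the sketch below hard-codes a hypothetical non-uniform allocation (standing in for the policy's output) and applies standard magnitude pruning per layer; the layer sizes and allocation values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
layers = [rng.normal(size=(32, 32)) for _ in range(4)]  # toy weight matrices

def magnitude_prune(w, sparsity):
    # Zero out the smallest-magnitude weights to hit the target sparsity.
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

# Hypothetical non-uniform allocation (in the paper, an RL policy would
# propose this); later layers are assumed to tolerate more sparsity.
allocation = [0.3, 0.4, 0.6, 0.7]  # mean 0.5 = overall budget
pruned = [magnitude_prune(w, s) for w, s in zip(layers, allocation)]

overall = sum((p == 0).sum() for p in pruned) / sum(p.size for p in pruned)
```

A search-based pruner would score many such allocations against a quality metric; the single-step RL formulation collapses that loop into one decision per episode.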
MLPMoE: Zero-Shot Architectural Metamorphosis of Dense LLM MLPs into Static Mixture-of-Experts
Key Insights
MLPMoE is a training-free method that restructures the dense MLP blocks of large language models into static mixtures of experts, requiring no calibration data or learned routing. The transformation uses only tensor slicing and summation, improving computational efficiency while keeping perplexity close to that of the original dense architecture.
Potential Impact
By enabling a post hoc transformation of existing dense language models into more efficient architectures, MLPMoE could significantly reduce inference costs and resource consumption in practical applications. This shift could lead to broader adoption of large language models in resource-constrained environments, ultimately democratizing access to advanced AI technologies.
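The "tensor slicing and summation" idea admits a compact sketch: because the MLP's activation is applied elementwise per hidden unit, slicing the hidden dimension into expert shards and summing their outputs reproduces the dense result exactly. This toy example (tiny dimensions, ReLU, no biases; the paper's exact partitioning scheme is not reproduced) demonstrates that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, E = 8, 16, 4  # model dim, hidden dim, number of experts

W1 = rng.normal(size=(h, d))   # up-projection
W2 = rng.normal(size=(d, h))   # down-projection
relu = lambda z: np.maximum(z, 0.0)

def dense_mlp(x):
    return W2 @ relu(W1 @ x)

# Slice the hidden dimension into E contiguous expert shards:
# rows of W1 and the matching columns of W2.
experts = [(W1[i * h // E:(i + 1) * h // E],
            W2[:, i * h // E:(i + 1) * h // E]) for i in range(E)]

def moe_mlp(x):
    # Static MoE: every expert runs and the outputs are summed -- no router.
    return sum(w2 @ relu(w1 @ x) for w1, w2 in experts)

x = rng.normal(size=d)
```

The decomposition is exact, which is why no calibration data is needed; any efficiency gain then comes from how the static experts are scheduled or sparsified downstream, which this sketch does not model.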
Breaking the Safety-Capability Tradeoff: Reinforcement Learning with Verifiable Rewards Maintains Safety Guardrails in LLMs
Key Insights
This research introduces reinforcement learning with verifiable rewards (RLVR) as a novel approach that challenges the existing belief in a fundamental safety-capability tradeoff in large language models (LLMs). It provides both theoretical and empirical evidence that RLVR can enhance reasoning capabilities while either maintaining or improving safety guardrails.
Potential Impact
By demonstrating that safety and capability can be optimized simultaneously, this work could fundamentally alter the training methodologies for LLMs, allowing for safer and more effective applications in sensitive domains. It paves the way for the responsible deployment of advanced AI systems, ensuring they perform complex tasks without compromising safety.
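The defining feature of RLVR is that the reward is checkable by a program rather than estimated by a learned reward model. As a minimal sketch (the `Answer:` format is an assumed convention, not the paper's; how the RL loop consumes this reward is out of scope), a verifiable reward for exact-match answers might look like:

```python
import re

def verifiable_reward(completion: str, gold: str) -> float:
    # Reward 1.0 only if the stated final answer matches the ground truth
    # exactly -- verifiable without a learned reward model.
    m = re.search(r"answer:\s*(\S+)", completion.lower())
    return 1.0 if m and m.group(1) == gold else 0.0

r_hit = verifiable_reward("Reasoning... Answer: 42", "42")
r_miss = verifiable_reward("I think the Answer: 41", "42")
```

Because the signal cannot be gamed the way a learned reward model can, it targets only task correctness, which is consistent with the paper's claim that capability gains need not erode safety guardrails trained elsewhere.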
AiBrain