Here are the five most relevant AI papers on arXiv from week 32 of 2025, with analysis and insights for each.
Publications at a Glance
Refine-n-Judge: Curating High-Quality Preference Chains for LLM-Fine-Tuning (Derin Cayir, Renjie Tao, Rashi Rungta, Kai Sun, Sean Chen, Haidar Khan, Minseok Kim, Julia Reinspach, Yue Liu; 8/3/2025)
Compressing Chain-of-Thought in LLMs via Step Entropy (Zeju Li, Jianyuan Zhong, Ziyang Zheng, Xiangyu Wen, Zhijian Xu, Yingying Cheng, Fan Zhang, Qiang Xu; 8/5/2025)
InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities (Shuo Cai, Su Lu, Qi Zhou, Kejing Yang, Zhijie Sang, Congkai Xie, Hongxia Yang; 8/7/2025)
TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs (Amitava Das, Vinija Jain, Aman Chadha; 8/4/2025)
Enhancing Japanese Large Language Models with Reasoning Vectors
Enhancing Japanese Large Language Models with Reasoning Vectors
Key Insights
This research introduces a method that transfers reasoning vectors, extracted from reasoning-tuned LLMs, into Japanese LLMs, addressing the resource constraints that typically limit improvements to these models. The approach is simple yet effective and could substantially raise the reasoning capabilities of Japanese language models.
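The core idea can be sketched with weight arithmetic, assuming the method resembles task-vector extraction and addition; the paper's exact recipe may differ, and the toy dicts below stand in for real model state_dicts:

```python
# Sketch of transferring a "reasoning vector" between models that share a base.
def extract_reasoning_vector(reasoning_weights, base_weights):
    # vector = reasoning-tuned weights minus the shared base weights
    return {k: reasoning_weights[k] - base_weights[k] for k in base_weights}

def apply_reasoning_vector(target_weights, vector, alpha=1.0):
    # add the scaled vector to the target (e.g. Japanese-adapted) model
    return {k: w + alpha * vector[k] for k, w in target_weights.items()}

base = {"layer.w": 0.50}        # shared base model
reasoning = {"layer.w": 0.80}   # base model fine-tuned for reasoning
japanese = {"layer.w": 0.55}    # Japanese-adapted model from the same base

vector = extract_reasoning_vector(reasoning, base)
merged = apply_reasoning_vector(japanese, vector)
```

The scaling factor `alpha` is a common knob in weight-arithmetic methods for trading off the transferred skill against the target model's original behavior.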
Potential Impact
By demonstrating a viable method for enhancing Japanese LLMs, this work could lead to broader applications and improvements in natural language processing for less-resourced languages, potentially setting a precedent for similar strategies in other linguistic contexts. This advancement may foster greater inclusivity and accessibility in AI technologies, enabling better support for diverse languages in various applications.
Refine-n-Judge: Curating High-Quality Preference Chains for LLM-Fine-Tuning
Key Insights
The research introduces Refine-n-Judge, an automated method for improving fine-tuning datasets in which a single LLM both refines responses and judges whether each refinement is an improvement. This removes the reliance on costly human feedback and separate reward models, streamlining the generation of high-quality preference chains.
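The refine-then-judge loop can be sketched as follows; the stub model and prompt strings here are illustrative assumptions, not the paper's actual prompts or API:

```python
# Refine-n-Judge style loop: one model both refines and judges.
def build_preference_chain(llm, answer, max_rounds=3):
    chain = [answer]
    for _ in range(max_rounds):
        candidate = llm(f"Improve this answer: {chain[-1]}")
        verdict = llm(f"Is the new answer better? old={chain[-1]} new={candidate}")
        if not verdict.strip().lower().startswith("yes"):
            break  # judge rejected the refinement: the chain ends here
        chain.append(candidate)
    # adjacent pairs in the chain form preference examples (later > earlier)
    return chain

class StubLLM:
    """Deterministic stand-in: 'improves' by appending detail, approves twice."""
    def __init__(self):
        self.approvals = 2
    def __call__(self, prompt):
        if prompt.startswith("Improve"):
            return prompt.split(": ", 1)[1] + "+"
        if self.approvals > 0:
            self.approvals -= 1
            return "yes"
        return "no"

chain = build_preference_chain(StubLLM(), "draft")
```

Because the judge gates every step, only refinements the model itself endorses enter the chain, which is what lets the pipeline run without human annotators or a separate reward model.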
Potential Impact
By significantly improving the scalability and efficiency of dataset refinement, Refine-n-Judge could transform how LLMs are fine-tuned, leading to more robust and capable models across various applications. The ability to create high-quality datasets with minimal human intervention may accelerate advancements in AI, making it more accessible and effective in diverse fields such as coding, mathematics, and conversational agents.
Compressing Chain-of-Thought in LLMs via Step Entropy
Key Insights
This research presents a CoT compression framework that uses step entropy to identify and prune redundant reasoning steps in large language models, substantially reducing verbosity while preserving accuracy. A two-stage training strategy then teaches the models to optimize their own reasoning traces, improving both efficiency and our understanding of the underlying reasoning structure.
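A minimal sketch of entropy-based step pruning, assuming step entropy is computed from token probabilities within each step (the paper's exact definition and threshold may differ):

```python
import math

def step_entropy(token_probs):
    # proxy for a step's information content: mean -log p over its tokens
    return sum(-math.log(p) for p in token_probs) / len(token_probs)

def compress_cot(steps, probs_per_step, keep_ratio=0.5):
    # rank steps by entropy and drop the lowest-entropy (most redundant)
    # ones, preserving the original order of the surviving steps
    ranked = sorted(range(len(steps)),
                    key=lambda i: step_entropy(probs_per_step[i]),
                    reverse=True)
    keep = set(ranked[:max(1, int(len(steps) * keep_ratio))])
    return [s for i, s in enumerate(steps) if i in keep]

steps = ["Define x.", "Restate the problem.", "Derive the bound.", "Check units."]
probs = [[0.5, 0.5], [0.99, 0.99], [0.3, 0.4], [0.9, 0.95]]
compressed = compress_cot(steps, probs)
```

The intuition is that steps the model generates with near-certainty (for example the restatement above, with token probabilities near 0.99) carry little information and can be dropped with minimal impact on the final answer.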
Potential Impact
By improving inference efficiency without sacrificing performance, this work could make LLM deployment far more practical in resource-constrained environments, bringing complex reasoning tasks within reach of more systems. The insights from this compression method may also change how LLMs are trained and used, benefiting applications across fields that depend on sophisticated reasoning.
InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities
Key Insights
InfiAlign introduces a novel framework that combines supervised fine-tuning with Direct Preference Optimization, significantly enhancing the reasoning capabilities of large language models while drastically reducing data requirements. The innovative data selection pipeline curates high-quality alignment data using multidimensional quality metrics, allowing for scalable improvements in model performance.
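The DPO half of the recipe follows a well-known objective; a minimal sketch for one preference pair (variable names are illustrative, and InfiAlign's actual training setup is more involved):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # DPO loss for one preference pair: -log sigmoid(beta * margin), where
    # the margin compares the policy's log-prob gap to the reference model's
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

loss_better = dpo_loss(-1.0, -2.0, -1.5, -1.5)   # policy now prefers chosen
loss_neutral = dpo_loss(-1.5, -1.5, -1.5, -1.5)  # no preference shift yet
```

Minimizing this loss pushes the policy to assign relatively more probability to preferred responses than the reference model does, without needing an explicit reward model, which is part of what makes the framework sample-efficient.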
Potential Impact
This research could streamline the post-training process for large language models, making it more efficient and accessible for a wider range of reasoning applications. By lowering the data and computational costs associated with enhancing LLMs, it paves the way for broader adoption in industries that rely on advanced reasoning capabilities.
TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs
Key Insights
This research introduces TraceAlign, a novel framework that identifies the root causes of alignment failures in large language models by linking unsafe completions to their training-time belief sources through the Belief Conflict Index (BCI). The framework includes innovative interventions that significantly reduce alignment drift while maintaining model utility, marking a substantial advancement in understanding and addressing alignment issues in LLMs.
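The BCI formula itself is not reproduced here, but the attribution step such a system rests on can be sketched schematically: link a span of an unsafe completion back to the training document it most plausibly came from. The naive longest-substring search below is a stand-in for the indexed retrieval (e.g. a suffix array) a real system would use:

```python
def attribute_span(completion, training_docs, min_len=8):
    # Find the training document containing the longest span (>= min_len
    # chars) that also appears in the completion.
    best_doc, best_span = None, ""
    for doc_id, text in training_docs.items():
        for start in range(len(completion)):
            for end in range(len(completion), start + min_len - 1, -1):
                span = completion[start:end]
                if len(span) > len(best_span) and span in text:
                    best_doc, best_span = doc_id, span
                    break  # longest match for this start position found
    return best_doc, best_span

docs = {"doc-17": "the quick brown fox jumps over the lazy dog",
        "doc-42": "completely unrelated training text"}
doc_id, span = attribute_span("a quick brown fox ran off", docs)
```

Once a completion span is traced to a source document, a conflict score between that source's content and the model's alignment training can be computed; that traced link is what makes targeted interventions possible.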
Potential Impact
By providing a systematic approach to identify and mitigate alignment drift, TraceAlign has the potential to enhance the safety and reliability of LLMs in real-world applications, thereby improving user trust and compliance with ethical standards. The open-source nature of this toolkit encourages further research and development, which could lead to more robust alignment strategies across various AI applications.