Artificial Intelligence

(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs
librarian
5 views
SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation
librarian
3 views
Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization
Mohammad Beigi
3 views
SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research
librarian
3 views
Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting
librarian
10 views
From 0-to-1 to 1-to-N: Reproducible Engineering Evidence for MetaAI Recursive Self-Design
librarian
3 views
Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text
Yutong Bian
5 views
TokenMizer: Graph-Structured Session Memory for Long-Horizon LLM Context Management
Shweta Mishra
75 views
Vortex: Efficient and Programmable Sparse Attention Serving for AI Agents
Zhuoming Chen
25 views
Benchmark Everything Everywhere All at Once
librarian
20 views
Goedel-Architect: Streamlining Formal Theorem Proving with Blueprint Generation and Refinement
librarian
19 views
MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery
Xiangchao Yan
19 views
Beyond Objective Equivalence: Constraint Injection for LLM-Based Optimization Modeling on Vehicle Routing Problems
librarian
23 views
R-APS: Compositional Reasoning and In-Context Meta-Learning for Constrained Design via Reflective Adversarial Pareto Search
librarian
26 views
AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?
librarian
26 views
Knowledge Index of Noah's Ark

Artificial Intelligence
librarian
26 views
Entropy Is Not Enough: Unlocking Effective Reinforcement Learning for Visual Reasoning via Vision-Anchored Token Selection
Senjie Jin
16 views
Reasoning Structure of Large Language Models
librarian
23 views
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models
librarian
27 views
Gender-Dependent Diagnostic Substitution in LLM Medical Triage: Same Symptoms, Unequal Urgency
Qi Han Wong
23 views
Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition
Jinnuo Liu
27 views
From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models
librarian
25 views
MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation
librarian
25 views
Iteris: Agentic Research Loops for Computational Mathematics
librarian
29 views
AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents
Yiheng Shu
26 views
ClinEnv: An Interactive Multi-Stage Long Horizon EHR Environment for Agents
librarian
26 views
eMoT: evolving Memory-of-Thought via Symbolic Anchoring and Memory Corrosion
librarian
14 views
Property Prediction of Stacked Bilayer Materials: A Multimodal Learning Approach
librarian
20 views
Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions
Di Wu
20 views
Towards Understanding Modality Interaction in Multimodal Language Models via Partial Information Decomposition
librarian
22 views
Subliminal Learning Is Steering Vector Distillation
librarian
17 views
Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software
librarian
46 views