Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.CV) 2026-06-17

TextMesh4D: Zero-shot Text-to-4D Mesh Generation

Large-scale, high-quality dynamic 3D (4D) assets are essential for learning physically grounded representations, but remain costly to capture and annotate at scale. This limits the viability of supervised 4D learning and motivates zero-shot text-to-4D generation leveraging pretrained diffusion priors. To model complex dynamics, prior methods typically adopt implicit 3D representations (e.g., NeRFs or 3DGS) for their deformation capacity. However, their implicit nature provides limited control over surface topology, which hinders high-fidelity geometry and makes temporally coherent surface reconstruction challenging. To address these limitations, we explore zero-shot text-to-4D mesh generation. However, a structural mismatch arises when combining diffusion-based guidance with topology-constrained meshes: the guidance is noisy and spatially inconsistent, while meshes impose severe topological constraints, making direct vertex-level deformation unstable. In this paper, we introduce TextMesh4D, the first zero-shot framework for text-to-4D that directly generates dynamic meshes by addressing the above challenge at two complementary levels. Geometrically, we shift deformation modeling from vertices to faces via a Jacobian Deformation Field (JDF), enabling topology-aware surface reconstruction through an integrability-enforcing integration formulation. Semantically, we propose a Local-Global Semantic Regularizer (LGSR) that preserves identity over time by jointly constraining local deformation plausibility and global shape consistency. Extensive experiments demonstrate state-of-the-art temporal consistency, structural fidelity, and visual quality, while remaining efficient on a single 24GB GPU.

02.
arXiv (CS.CV) 2026-06-15

Fusion of Pervasive RF Data with Spatial Images via Vision Transformers for Enhanced Mapping in Smart Cities

In this paper, we present a deep learning-based approach that integrates the DINOv2 architecture to improve building mapping by combining (possibly erroneous) maps from open-source platforms with pervasive radio frequency (RF) data collected from multiple wireless user equipments and base stations. Unlike prior methods, our approach leverages a vision transformer-based architecture to jointly process both RF and map modalities within a unified framework, effectively capturing spatial dependencies and structural priors for enhanced mapping accuracy. For the evaluation purposes, we employ a synthetic dataset co-produced by Huawei. To address the challenges associated with real-world data imperfections, we introduce controlled noise to its RF data so as to simulate real-world conditions. Additionally, we develop and train a model that leverages only aggregated path loss information to tackle the mapping problem. We measure the results according to three performance metrics: the Jaccard index (intersection over union, IoU), the Hausdorff distance, and the Chamfer distance. Our design achieves a macro IoU of 65.3%, significantly surpassing (i) the erroneous maps baseline, which yields 40.1%, (ii) an RF-only method from the literature, which yields 37.3%, and (iii) a non-AI fusion baseline that we designed which yields 42.2%. The comparative evaluation highlights the limitations of relying solely on RF data or on spatial data, as well as the effectiveness that AI can have on fusing data towards enhancing smart city mapping accuracy. We further validate our method on real-world data from the Oslo region, complementing the synthetic evaluation with a real deployment setting, where our best fusion model reaches 64.9% macro IoU. We additionally outline a strategy for deploying the model over larger areas by tiling the region with overlapping windows.

03.
arXiv (CS.AI) 2026-06-19

Creativity Reconsidered: Generative AI and the Problem of Intentional Agency

arXiv:2601.15797v2 Announce Type: replace Abstract: Many theorists maintain that conscious intentional agency is a necessary condition of creativity. We argue that this requirement, which we call the Intentional Agency Condition (IAC), should be abandoned. We motivate this by highlighting the problems this criterion encounters in the face of recent advances in generative AI, which is ostensibly creative despite being incapable of intentional agency. We present two corpus analyses to illustrate the rapidly increasing tendency of people to predicate creativity to generative AI. In response to this predicament, theorists of creativity have proposed a range of conflicting solutions, which we critically evaluate. We find that none of these satisfyingly resolves the initial predicament, and we therefore propose a novel approach. Our claim is that ascriptions of creativity are dependent on what we call creative ability. This solution explains why intentional agency is important for judgements of creativity, without being a necessary condition. Our approach thereby accommodates AI creativity without dismissing the intuition that perceived intentions are of key importance for ascriptions of creativity.

04.
arXiv (CS.CL) 2026-06-11

An Ontology-Guided Multi-Anchor Graph Retrieval Framework for Traffic Legal Liability Determination

Traffic law liability determination is critical for assigning legal penalties, requiring the simultaneous identification of interdependent statutory provisions across multiple legal dimensions. However, existing retrieval-augmented generation methods suffer from a multi-dimensional retrieval bottleneck: single axis architectures compress complex legal queries into a single pathway, causing interdependent statutory dimensions to be overlooked. To address this, we propose OMAGR, an ontology-guided framework that decomposes queries into ontology-aligned anchors and executes parallel graph retrieval across each dimension, ensuring independent retrieval across dimensions before fusion. To evaluate the proposed method, we created the TrafficLaw-QA dataset, an expert-validated benchmark dataset containing 200 questions and 527 legal provisions. Results show that TrafficOmni-RAG outperforms baselines on Context Precision and Faithfulness metrics. The findings demonstrate that parallel multi-anchor retrieval effectively resolves the multi-dimensional retrieval bottleneck, offering a promising direction for traffic law liability determination research.

05.
arXiv (CS.AI) 2026-06-11

Embodied-R1.5: Evolving Physical Intelligence via Embodied Foundation Models

arXiv:2606.11324v1 Announce Type: cross Abstract: We introduce Embodied-R1.5, a unified Embodied Foundation Model (EFM) that integrates comprehensive embodied reasoning capabilities, spanning embodied cognition, task planning, correction, and pointing, within a single architecture toward general physical intelligence. Leveraging three automated data construction pipelines to significantly expand the data coverage of critical capabilities, we build a large-scale data system of over 15B tokens, and design a multi-task balanced RL recipe to alleviate heterogeneous task conflicts. We further introduce a Planner-Grounder-Corrector (PGC) closed-loop framework that enables a single model to autonomously execute and self-correct over long-horizon tasks. With only 8B parameters, Embodied-R1.5 achieves SOTA on 16 out of 24 embodied VLM benchmarks, surpassing leading models like Gemini-Robotics-ER-1.5 and GPT-5.4. Benefiting from the internalized embodied capabilities, Embodied-R1.5 can be fine-tuned into a VLA with only a small amount of data, outperforming leading VLA models like $\pi_{0.5}$ across 4 popular manipulation benchmark suites. We further conduct extensive zero-shot real-robot experiments, validating performance in instruction following, affordance grounding, articulated object manipulation, and long-horizon complex tasks, demonstrating strong generalization to the physical world. We open-source model weights, datasets, training code, and EmbodiedEvalKit, an evaluation framework tailored for embodied tasks, to facilitate future research in EFMs.

06.
Nature (Science) 2026-06-17

A mosaic of whole-body representations on the human precentral gyrus

Authors:

Understanding how the body is represented in the motor cortex is key to understanding how the brain controls movement. Although the motor cortex has been mapped in animal models at a fine scale1–10, characterization in humans remains primarily limited to low-resolution recording11–16 and stimulation techniques17–20. Here we created a comprehensive map of the human motor cortex at single-neuron resolution, spanning microelectrode array recordings from 20 arrays across 8 individuals with paralysis from spinal cord injury, amyotrophic lateral sclerosis or brainstem stroke, all enrolled in brain–computer interface clinical trials. These arrays broadly sample the crown of the precentral gyrus (PCG; thought to be composed largely of the premotor cortex (Brodmann area 6)). We found that body parts were highly intermixed, such that the entire body was represented in all sampled locations of the PCG, although the relative strength of body parts was roughly consistent with the motor homunculus17,18. We also found two speech-preferential areas with a broadly tuned, orofacial-dominant area in between them. Throughout the PCG, movement representations of the four limbs were interlinked, with homologous movements of different limbs (for example, toe curl and hand close) having correlated representations. These data provide evidence consistent with an intermixed, interrelated and behaviour-centred organization of the motor cortex3,21. The resulting map also provides important targeting information for brain–computer interfaces that seek to restore motor function. A comprehensive map of the human motor cortex at single-neuron resolution is described.

07.
Science (Express) 2026-05-21

Observation of quantum vortex core fractionalization and skyrmion formation in a superconductor | Science

Authors: Unknown Author

Magnetic fields can penetrate a superconductor in the form of quantum vortices, which consist of a core singularity with circulating currents. London’s quantization implies that there is one core singularity per quantum of magnetic flux in single-component superconductors. Here, we report signatures of quantum vortex core fractionalization on the potassium-terminated surface of a multiband superconductor KFe 2 As 2 . The observed splitting of single integer-flux vortices into several fractional vortices results in a disparity between the numbers of flux quanta and vortex cores. These fractional vortices often arrange in chains, which calculations show are characterized by a ℂP 2 skyrmionic topological invariant; this constitutes a different type of topological defect: the chiral skyrmion. The disparate natures of integer and fractional vortices comprising skyrmions lead to distinct spectroscopic signatures.

08.
arXiv (quant-ph) 2026-06-19

QPU-scale randomized benchmarking via Bell-pair injection

arXiv:2606.20123v1 Announce Type: new Abstract: Mirror randomized benchmarking (MRB) is an established technique that provides a global error metric at the scale of a whole QPU. To expand upon this we introduce Mirror Quantum Awesomeness (MQA), a hybrid protocol that adds a structured entangling layer to MRB circuits. This enables per-edge correlation dynamics to be tracked via mutual information while preserving the MRB infidelity estimate. The resulting analysis of the injected entangled pairs locates a critical circuit depth, beyond which rudimentary error mitigation techniques can be expected to fail. A topological variant, Topological MQA, supplies a second critical depth via a decoder based on the surface-code decoding problem. Both are validated in simulation and demonstrated on the 156-qubit \texttt{ibm\_fez} and \texttt{ibm\_kingston} processors, where MQA closely agrees with MRB on the entanglement infidelity and the critical depth for \texttt{ibm\_fez} is found to be $\sim 50$.

09.
arXiv (quant-ph) 2026-06-19

Efficient classical representation and quantum state preparation of complete active space wavefunctions

Authors:

arXiv:2606.19457v1 Announce Type: new Abstract: Quantum computers promise to solve the electronic structure problem for a large class of molecules. However, the performance of relevant quantum algorithms hinges on preparing initial states with substantial overlap with the target eigenvector. For classically challenging molecules with strong electron correlation, starting from multi-reference states, such as complete active space (CAS) wavefunctions is necessary. Unfortunately, the most advanced state preparation protocols applied to such states result in a gate complexity that scales exponentially with the active space size $d$. In fact, even encoding a CAS state classically is traditionally believed to be intractable for chemically relevant systems. Here, we draw insights from the recently introduced Quantum Paldus Transform (QPT) to show that there exists an efficient classical representation of CAS states and to design a new state preparation routine outperforming previous ones. The QPT represents a transformation from the Fock basis to a friendlier symmetry-adapted basis. Our main contribution consists in showing that CAS states expanded in this basis can efficiently be represented as a matrix product state (MPS) with a bond dimension scaling as $O(d^2)$. One can then efficiently load the MPS on a quantum computer and use the inverse QPT to transform the state to the Fock basis. Moreover, our method can easily be extended to the efficient preparation of CAS states in first quantisation with similar complexity. Crucially, we demonstrate that the complexity of both state preparation protocols only grows polynomially as $O(d^3)$ , which constitutes to the best of our knowledge an exponential improvement over the state of the art.

10.
arXiv (CS.AI) 2026-06-16

The algebra of Krom logic programs

arXiv:2606.15719v1 Announce Type: cross Abstract: This paper investigates the algebraic structure of Krom logic programs, consisting only of facts and rules with at most one body atom. We show that sequential composition endows the class of Krom programs with a natural monoid structure and that this structure admits rich algebraic extensions to Krom seminearrings, Krom quemirings, Krom-Conway seminearrings, and Krom-Conway omegaseminearrings. Furthermore, we establish explicit generating sets and canonical decompositions, study the associated ${}^\omega$-operator, characterize the Kleene star in graph-theoretic terms, and relate finite Krom monoids to transformation monoids and finite-state automata. These results provide new connections between logic programming, algebraic automata theory, and algebraic graph theory.

11.
arXiv (CS.LG) 2026-06-16

ReQAT: Achieving Full-Precision Reasoning Accuracy with 4-bit Floating-Point Quantization-Aware Training

arXiv:2606.15682v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) achieve strong problem-solving through long chain-of-thought, but their deployment is constrained by the high cost of full-precision inference and growing KV cache footprints. Microscaled FP4 formats enable efficient FP4 deployment; however, fully quantizing weights, activations, and KV caches (W4A4KV4) causes severe reasoning degradation that existing PTQ and QAT fail to recover. We identify that FP4 failures concentrate on low-entropy tokens–precise symbolic commitments such as digits and operators–where quantization noise inflates sampling errors that cascade through reasoning traces. Based on this insight, we propose ReQAT, a reasoning-centric FP4 training framework with three components: (i) Trace-Aligned QAT (TAQ), which revisits identical reasoning traces to focus updates on critical low-entropy decisions; (ii) Selective Entropy Minimization (SEM), which reinforces confidence at low-entropy positions; and (iii) Q-FIT, a quantization-friendly initialization that jointly calibrates RoPE-consistent KV cache transformations to stabilize QAT. Under the same training budget, ReQAT not only recovers but surpasses BF16 fine-tuning accuracy, while delivering up to 3.9x throughput speedup on NVIDIA DGX Spark and 3.1x on B200.

12.
arXiv (math.PR) 2026-06-19

A Cycle Walk for Sampling Measures on Spanning Forests for Redistricting

arXiv:2509.08629v2 Announce Type: replace-cross Abstract: We introduce the Cycle Walk, a new Markov chain Monte Carlo method for sampling distributions on balanced graph partitions, motivated by applications in political redistricting. The method operates on spanning forests and combines two types of updates: local "cycle" moves within districts and global moves that exchange population between adjacent districts while preserving balance constraints. This construction enables efficient Metropolis–Hastings correction while allowing proposals at multiple spatial scales. We show that the Cycle Walk naturally interpolates between existing approaches based on local updates and a class of global update methods derived from recombination (RECOM). Through a range of numerical experiments on synthetic graphs and real-world precinct data, we demonstrate that the Cycle Walk exhibits improved empirical convergence diagnostics for distributions that place weaker weight on spanning-tree counts, a regime that is challenging for existing methods. In particular, the algorithm remains effective when incorporating alternative compactness measures that more closely reflect policy-relevant criteria. These results suggest that the Cycle Walk provides a flexible and computationally efficient framework for sampling from a broader class of redistricting distributions than previously accessible with MCMC techniques.

13.
arXiv (quant-ph) 2026-06-19

Topological Quantum Interferometry

arXiv:2606.19730v1 Announce Type: new Abstract: Structured light provides high-dimensional Hilbert spaces holding tremendous potential for fundamental quantum optics and quantum technologies. However, existing characterization methods, like Hong-Ou-Mandel (HOM) interference, typically assume perfectly tuned conditions, overlooking the geometric physics governing spatial mode evolution. Here, we establish topological quantum interferometry driven by an interaction-based geometric phase, the exchange Berry phase (BPX). Our formalism generalizes $q$-plate state generation and characterization to arbitrary topological charges and (de)tuning conditions, demonstrating that BPX acts as a geometric marker governing spatial interference. We show BPX serves as a deterministic control parameter, decomposing two-photon spatial patterns into geometry-dictated fundamental modes. This mapping reveals topological invariants and phase singularities that function as a non-tomographic witness for state dimensionality estimation, circumventing full-state reconstruction. Being device-independent and highly scalable, this approach enables scalable high-dimensional characterization and topologically protected state selection, with direct applicability to quantum metrology and high-capacity quantum networks.

14.
arXiv (CS.AI) 2026-06-16

FastMix: Fast Data Mixture Optimization via Gradient Descent

arXiv:2606.14971v1 Announce Type: cross Abstract: While large and diverse datasets have driven recent advances in large models, identifying the optimal data mixture for pre-training and post-training remains a significant open problem. We address this challenge with FASTMIX, a novel framework that automates data mixture discovery while training only a single proxy model. Instead of relying on predefined heuristics or resource-intensive simulations, FASTMIX jointly optimizes mixture coefficients and model parameters, substantially improving efficiency and scalability over prior approaches. At the core of FASTMIX is a reformulation of mixture selection as a bilevel optimization problem. Under this reformulation, we show that optimizing mixture ratios is mathematically equivalent to assigning per-source loss weights under uniform source sampling. This embeds the mixture coefficients directly into the differentiable iterative optimization objective, enabling efficient, gradient-based optimization of both mixture and model. To solve the optimization problem, FASTMIX implements an approximate iterative optimization procedure, alternating between (i) updating model parameters on data sampled according to current mixture ratios (inner loop) and (ii) updating mixture ratios based on validation feedback (outer loop). Across pre- and post-training, FASTMIX outperforms baselines while drastically reducing search cost. Code (https://github.com/hrtan/fastmix)

15.
Nature (Science) 2026-06-22

Why heritage sites are at risk in a warming world — and how to save them

As rising seas and intensifying disasters threaten historic sites worldwide, new ways to understand, preserve and adapt these places are needed urgently. As rising seas and intensifying disasters threaten historic sites worldwide, new ways to understand, preserve and adapt these places are needed urgently.

16.
arXiv (CS.LG) 2026-06-18

RNN(p) for Power Consumption Forecasting

arXiv:2209.01378v3 Announce Type: replace Abstract: An elementary Recurrent Neural Network that operates on p time lags, called an RNN(p), is the natural generalisation of a linear autoregressive model ARX(p). It is a powerful forecasting tool for variables displaying inherent seasonal patterns across multiple time scales, as is often observed in energy, economic, and financial time series. The architecture of RNN(p) models, characterised by structured feedbacks across time lags, enables the design of efficient training strategies. We conduct a comparative study of learning algorithms for these models, providing a rigorous analysis of their computational complexity and training performance. We present two applications of RNN(p) models in power consumption forecasting, a key domain within the energy sector where accurate forecasts inform both operational and financial decisions. Experimental results show that RNN(p) models achieve excellent forecasting accuracy while maintaining a high degree of interpretability. These features make them well-suited for decision-making in energy markets and other fintech applications where reliable predictions play a significant economic role.

17.
medRxiv (Medicine) 2026-06-16

Validation of a Smartphone-Image-Based Computer-Vision Model for Lean Mass and Body Fat Estimation Against Dual-Energy X-ray Absorptiometry

Introduction Body composition, rather than body weight alone, is an increasingly important health metric, and preservation of lean mass has become a central concern in obesity treatment, aging, and chronic disease management. Dual-energy X-ray absorptiometry (DXA) provides accurate assessment of fat and lean tissue, but its cost and logistical requirements limit repeated measurement. Computer-vision approaches show promise for estimating adiposity from smartphone images, but lean-mass estimation remains less established. Methods We evaluated a computer-vision body composition model, applied to consumer-grade smartphone photographs, against DXA in a held-out validation sample of 195 adults from an ongoing cross-sectional study. Body fat percentage and total lean mass percentage were co-primary outcomes; for total lean mass percentage, an image-only configuration (no added covariates) was pre-specified as primary. Agreement was quantified using Lin's concordance correlation coefficient (CCC) as the lead statistic, with Pearson correlation, mean absolute error, root mean square error, mean bias, and Bland-Altman limits of agreement. In secondary analyses, appendicular lean mass and total lean mass percentage were each estimated with and without routine anthropometric and demographic inputs (body weight, height, age, and sex). Results Total lean mass percentage agreed with DXA from image features alone (CCC 0.916). Body fat percentage, estimated with routine inputs added, agreed at least as closely (CCC 0.930). Adding routine inputs barely changed agreement for total lean mass percentage but markedly improved it for appendicular lean mass, an absolute quantity that scales with body size. Conclusions A smartphone-image-based model estimated both body fat and lean mass with strong agreement to DXA, with lean mass percentage from image features alone. The approach needs no fixed equipment or ionizing radiation. Whether it can track change over time, including in incretin-based weight loss where lean mass preservation is a concern, was not assessed in this cross-sectional study.

18.
arXiv (CS.AI) 2026-06-16

VGPT-RSI for RH-Adjacent Formal Progress: Boundary Certificates, Verified Finite Lagarias Inequalities, and Explicit Failure Localization

arXiv:2606.15096v1 Announce Type: new Abstract: The Riemann Hypothesis remains one of the central unsolved problems in mathematics. Rather than claiming proof, we investigate whether a verifiable AI-assisted reasoning system can produce reliable, formally checked partial progress while explicitly identifying the remaining mathematical obstructions. We apply the Verifiable Growing Physical Transformer with Recursive Self-Improvement (VGPT-RSI) to two RH-adjacent certification tasks. First, we construct and verify a finite RH-boundary certificate for inequality on a parameterized safe lower curve over a region. The numerical boundary curve is converted into a certificate-backed lower curve, audited using outward-rounded interval arithmetic and Arb/FLINT ball arithmetic, and then checked in Rocq/CoqInterval for the parameterized theorem. Second, we initiate a formal Lagarias-route certificate. Lagarias criterion states that RH is equivalent to the global inequality. We formalize the finite quantity and produce a Coq-checked finite certificate. The final system identifies the exact unresolved mathematical bottlenecks: formalizing the Lagarias equivalence, proving the global tail theorem beyond any finite cutoff, and potentially reducing counterexamples to colossally abundant or related extremal integers. These results demonstrate that VGPT-RSI can produce certified RH-adjacent formal progress, organize proof dependencies, and avoid overclaiming when the remaining obstruction is genuinely mathematical.

19.
arXiv (CS.CV) 2026-06-17

RAIGen: Rare Attribute Identification in Text-to-Image Generative Models

Text-to-image diffusion models achieve impressive generation quality but inherit and amplify training-data biases, skewing coverage of semantic attributes. Prior work addresses this in two ways. Closed-set approaches mitigate biases in predefined fairness categories (e.g., gender, race), assuming socially salient minority attributes are known a priori. Open-set approaches frame the task as bias identification, highlighting majority attributes that dominate outputs. Both overlook a complementary task: uncovering rare or minority features underrepresented in the data distribution (social, cultural, or stylistic) yet still encoded in model representations. We introduce RAIGen, the first framework, to our knowledge, for label-free rare-attribute discovery in diffusion models, requiring no predefined minority categories. RAIGen leverages Matryoshka Sparse Autoencoders and a novel minority metric combining neuron activation frequency with semantic distinctiveness to identify interpretable neurons whose top-activating images reveal underrepresented attributes. Experiments show RAIGen discovers attributes beyond fixed fairness categories in Stable Diffusion, scales to larger models such as SDXL, supports systematic auditing across architectures, and enables targeted amplification of rare attributes during generation. The project page is available at https://vssilpa.github.io/RAIGen_webpage/ .

20.
arXiv (quant-ph) 2026-06-17

Variational Quantum Eigensolver-Based Quantum Bootstrap Embedding for Molecules

Authors:

arXiv:2606.17095v1 Announce Type: cross Abstract: Simulating strongly correlated molecular systems on near-term quantum hardware remains challenging due to modern hardware's limited quantum volume and moderate-fidelity qubits. One potential way to circumvent this challenge is through bootstrap embedding (BE). Bootstrap embedding breaks molecules into smaller fragments that are then embedded into the "bath" of other fragments in an iterative way. Bootstrap embedding is appealing for quantum simulation because fragmenting the system reduces the qubit requirements for any given fragment. In this work, we develop a quantum bootstrap embedding (QBE) workflow that uses variational quantum eigensolver (VQE) fragment solvers and study the algorithmic choices that determine the overall VQE-QBE algorithm's success. To improve efficiency, we introduce FastAdaptVQE, a sparse matrix-accelerated form of the adaptive variational quantum eigensolver (ADAPT-VQE) that replaces symbolic commutator evaluation with direct statevector linear algebra, and MatrixFreeAdaptVQE, a matrix-free extension that removes the sparse-matrix memory bottleneck that appears when treating larger fragments. We also modify the ADAPT-VQE operator selection step by replacing the purely greedy choice with a look-ahead strategy. Benchmarks on $H_4$ and $F_2$ reach chemical accuracy, within 1 kcal/mol of bootstrap embedding results using a full configuration interaction (FCI) solver. These results show that combining QBE with VQE can accurately calculate energies of molecular systems. This research lays the foundation for extending energy calculations to larger molecular systems and quantum materials on near-term quantum hardware.

21.
arXiv (CS.AI) 2026-06-15

FactoryLLM: A Safe and Open-Source AI Playground for Evaluating LLMs in Smart Factories

arXiv:2606.14119v1 Announce Type: new Abstract: Fault diagnostics and recovery in smart factories is challenging because critical information is dispersed across manuals of multiple machines which are interconnected through the manufacturing process. Large Language Models (LLMs) can provide a promising approach. In this paper, we propose FactoryLLM, a safe and open-source AI playground designed for evaluating different LLM-based retrieval-augmented generation (RAG) models by analysing documents from multiple machines across the manufacturing process. FactoryLLM enables the user to configure the LLM, and assess performance when reasoning over multiple documents, through a dual evaluation setup using both RAGAS and NVIDIA's LLM-as-a-Judge metrics. FactoryLLM is safe because it allows users to run local or open-source LLMs without sharing sensitive industrial data, providing a controlled environment for experimentation. We demonstrate the efficacy of FactoryLLM through a case study which involves an Autonomous Intelligent Vehicle and its Mobile Planner software, evaluating three LLMs across 30 maintenance queries derived from approximately 600 pages of cross-machine documentation. The results suggest that FactoryLLM is effective in cross-machine document reasoning: every model achieved a groundedness score above 0.88. The full code and documentation for community to test FactoryLLM with their manufacturing specific scenarios are publicly available.

22.
arXiv (CS.AI) 2026-06-19

VitalAgent: A Tool-Augmented Agent for Reactive and Proactive Physiological Monitoring over Wearable Health Data

arXiv:2605.29483v2 Announce Type: replace Abstract: Wearable devices enable continuous monitoring of physiological signals such as ECG and PPG, but existing mHealth systems are largely limited to task-specific prediction pipelines or reactive question answering over static summaries. They lack the ability to support temporal reasoning, persistent physiological context, and proactive monitoring over long-term signal streams. We propose VitalAgent, a tool-augmented agentic framework for ECG/PPG-based mHealth that supports both reactive question answering and proactive monitoring. VitalAgent is built on a longitudinal physiological memory and a tool-augmented reasoning interface that enables dynamic computation over raw signals. We further introduce VitalBench, a longitudinal physiological monitoring benchmark dataset comprising 1,862 QA pairs for reactive question answering and 90.2 hours of continuous ECG/PPG recordings for proactive monitoring, covering cardiac, physical activity, and stress-related tasks. Experiments demonstrate that VitalAgent achieves over 25% improvement over prompt-based and ReAct baselines in reactive evaluation and supports proactive alert monitoring over long-term physiological signals, highlighting the importance of dynamic tool use and long-term physiological monitoring.

23.
arXiv (quant-ph) 2026-06-16

Scalable Graph State Generation with O(1) Local Feedforward in Quantum Networks

arXiv:2606.16375v1 Announce Type: new Abstract: The development of quantum networks faces a key challenge: the contradiction between probabilistic long-range entanglement generation and finite coherence time. Existing routing protocols typically focus on global state computation or path optimization. As the network scales up, classical delays accumulate and exacerbate decoherence, leading to a decrease in entanglement fidelity. To reduce routing decision delays to levels far below the coherence time of qubits, we propose a protocol based on local measurement and classical feedforward. This protocol reduces the local decision complexity to amortized O(1) level, ensuring that the decision delay is always much smaller than the coherence time of qubits. We map this protocol onto a dual-species trapped-ion platform and perform hybrid simulations. The results show that the proposed protocol performs well in terms of both resource efficiency and time feasibility. Noise analysis indicates that readout fidelity is the main bottleneck of this protocol, but noise suppression can be achieved by employing an erasure transformation in the dual-species architecture, combined with spatial multiplexing and branch independence, thereby ensuring the generation of high-fidelity star subgraphs. This protocol provides a clear path to achieving high-fidelity star subgraphs. These subgraphs can serve as general modules, merging to construct arbitrary subgraphs, providing a feasible solution for future fault-tolerant distributed quantum computing.

24.
arXiv (CS.AI) 2026-06-19

RoboSSM: Scalable In-context Imitation Learning via State-Space Models

arXiv:2509.19658v2 Announce Type: replace-cross Abstract: In-context imitation learning (ICIL) enables robots to learn tasks from prompts consisting of just a handful of demonstrations. By eliminating the need for parameter updates at deployment time, this paradigm supports few-shot adaptation to novel tasks. However, recent ICIL methods rely on Transformers, which have computational limitations and tend to underperform when handling longer prompts than those seen during training. In this work, we introduce RoboSSM, a scalable recipe for in-context imitation learning based on state-space models (SSM). Specifically, RoboSSM replaces Transformers with Longhorn – a state-of-the-art SSM that provides linear-time inference and strong extrapolation capabilities, making it well-suited for long-context prompts. Through diverse experiments on the LIBERO benchmark, we demonstrate the effectiveness of applying SSMs to ICIL, achieving improved generalization to both unseen and long-horizon tasks than Transformer-based ICIL methods by handling longer contexts at test-time. These results show for the first time that SSMs are an efficient and scalable backbone for ICIL. Our code is available at https://github.com/youngjuY/RoboSSM.

25.
arXiv (math.PR) 2026-06-19

The central heat trace on large compact classical groups

arXiv:2511.08288v2 Announce Type: replace-cross Abstract: We study the large-$N$ asymptotics of the central trace of the heat kernel on compact classical groups. For every classical family $G_N\subset \mathrm{GL}_N(\C)$, we prove a full large-$N$ asymptotic expansion, using a highest weights/partitions correspondence adapted to the large-rank regime, under which the eigenvalues of the Laplace–Beltrami operator stabilize as observables in the algebra of shifted symmetric functions. Then, we prove a random surface representation of the trace in terms of ramified coverings of the torus. We provide two independent applications: an explicit large-rank counting law for the Casimir spectrum, with exponential Hardy–Ramanujan-type growth in contrast with the polynomial behavior of Weyl's law at fixed rank, and a rigorous probabilistic formulation of the Yang–Mills/Hurwitz duality on a two-dimensional torus initiated by Gross and Taylor, completing a previous work of the authors. We also extend this duality to a Yang–Mills/Gromov–Witten duality by expressing the coefficients of the central heat trace as explicit functionals of the generating function of Gromov–Witten invariants.