Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.AI) 2026-06-24

Attention in Motion: Secure Platooning via Transformer-based Misbehavior Detection

arXiv:2512.15503v3 Announce Type: replace-cross Abstract: Vehicular platooning promises transformative improvements in transportation efficiency and safety through the coordination of multi-vehicle formations enabled by Vehicle-to-Everything (V2X) communication. However, the distributed nature of platoon coordination creates security vulnerabilities, allowing authenticated vehicles to inject falsified kinematic data, compromise operational stability, and pose a threat to passenger safety. Traditional misbehaviour detection approaches, which rely on plausibility checks and statistical methods, suffer from high False Positive (FP) rates and cannot capture the complex temporal dependencies inherent in multi-vehicle coordination dynamics. We present Attention In Motion (AIMformer), a transformer-based framework specifically tailored for real-time misbehaviour detection in vehicular platoons with edge deployment capabilities. AIMformer leverages multi-head self-attention mechanisms to capture intra-vehicle temporal dynamics, with a spatio-temporal variant that further models inter-vehicle spatial correlations. It incorporates global positional encoding with vehicle-specific temporal offsets to handle join/exit maneuvers. We propose a Precision-Focused Binary Cross-Entropy (PFBCE) loss function that penalizes FPs to meet the requirements of safety-critical vehicular systems. Extensive evaluation across 4 platoon controllers, multiple attack vectors, and diverse mobility scenarios demonstrates superior performance ($\geq$ 0.93) compared to state-of-the-art baseline architectures. A comprehensive deployment analysis utilizing TensorFlow Lite (TFLite), Open Neural Network Exchange (ONNX), and TensorRT achieves sub-millisecond inference latency, making it suitable for real-time operation on resource-constrained edge platforms. This validates AIMformer as viable for both in-vehicle and roadside deployment.
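
As a rough illustration of what a loss that penalizes false positives can look like, the sketch below scales up the benign-class term of binary cross-entropy. The abstract does not spell out the PFBCE formula, so this is not the paper's loss; the function name `fp_weighted_bce` and the parameter `fp_weight` are hypothetical.

```python
import math

def fp_weighted_bce(y_true, p_pred, fp_weight=3.0, eps=1e-7):
    """Mean binary cross-entropy with the benign-class term scaled by fp_weight.

    y_true: 0/1 labels (1 = misbehaving vehicle).
    p_pred: predicted probability of misbehaviour.
    fp_weight > 1 makes a confident false alarm cost more than a missed attack.
    Hypothetical illustration only, not the paper's PFBCE.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + fp_weight * (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# A benign sample scored 0.9 (false alarm) versus an attack scored 0.1 (miss).
false_alarm = fp_weighted_bce([0], [0.9])
missed_attack = fp_weighted_bce([1], [0.1])
```

With `fp_weight=1.0` the function reduces to ordinary binary cross-entropy, so the weight is the only knob that shifts the trade-off toward precision.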

02.
arXiv (CS.CL) 2026-06-12

Getting Better at Working With You: Compiling User Corrections into Runtime Enforcement for Coding Agents

Interactive LLM agents are becoming part of daily work, but they do not reliably become easier to work with over time: a correction remembered in one session may still be violated in the next. We study this gap between preference access and preference compliance. In tasks derived from anonymized real-user friction cases, Mem0 memory still leaves 57.5% of applicable preference checks violated. We introduce Test-time Rule Acquisition and Compiled Enforcement (TRACE), a drop-in skill-layer pipeline for coding-agent runtimes that mines user corrections, rewrites them as atomic rules, and compiles them into runtime checks that must pass before an agent completes future tasks. Unlike runtime checks written ahead of time by developers, TRACE skills come from the user's own chat corrections. We evaluate TRACE with simulated user-in-the-loop experiments on ClawArena coding-agent tasks and MemoryArena-derived memory-intensive tasks. On ClawArena, TRACE reduces held-out preference violation from 100.0% to 37.6% on in-distribution tasks and from 100.0% to 2.0% on out-of-distribution tasks. On MemoryArena-derived tasks, TRACE reduces in-distribution violation from 100.0% to 60.5% while matching or exceeding the strongest memory baseline on task pass. These results suggest that compiling corrections into runtime enforcement can address a repeated-friction failure mode that memory alone does not reliably solve, reducing the need for users to restate the same correction across future sessions. Experiment code is available at https://github.com/YujunZhou/TRACE_exp, and the deployable skill is available at https://github.com/YujunZhou/tellonce.

03.
arXiv (quant-ph) 2026-06-12

Entropic order parameters and topological holography

arXiv:2512.24225v2 Announce Type: replace-cross Abstract: We show that the symmetry topological field theory (SymTFT) construction, also known as topological holography, provides a natural and intuitive framework for the entropic order parameter characterising phases with (partially) broken symmetries. Various examples of group and non-invertible symmetries are studied. In particular, the origin of the distinguishability of the vacua resulting from spontaneously broken non-invertible symmetries is made manifest from an information-theoretic perspective, where certain operators in the SymTFT are excluded from observation.

04.
arXiv (CS.LG) 2026-06-12

Feature-preserving Latent-EnKF for Data Assimilation of Flows with Shocks

arXiv:2606.12559v1 Announce Type: cross Abstract: The ensemble Kalman filter (EnKF) is widely adopted for sequential data assimilation, but fails for solutions with discontinuities, such as shocks in compressible flows. Uncertainty in shock location induces multimodal ensemble statistics that violate the Gaussian assumptions underlying the EnKF, producing large-scale spurious oscillations in the analysis state. We introduce a feature-preserving latent-EnKF that performs the ensemble update in a learned low-dimensional latent space, where shock and flow features admit a smooth manifold representation, thereby preserving sharp features during EnKF analysis. The updated latent state is mapped back to physical state through a shared decoder for all ensemble members. The algorithm eliminates the member-specific ordered training and positivity flooring used in prior approaches. Numerical experiments on a Sod shock tube and Mach 2 shock interaction with a 2D cylinder, using sparse and noisy observations, show accurate feature recovery of shocks and contact discontinuities without spurious oscillations.
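
A toy sketch of why updating in a latent space can keep shocks sharp. It assumes a one-parameter latent (the shock location), a hand-written `decode` function in place of the learned decoder, and a direct observation of the latent; none of these come from the paper, which uses a learned latent space and sparse physical-space observations. Averaging member profiles in physical space produces a staircase of intermediate values, while decoding the updated latent mean returns a single sharp step.

```python
import random

N = 50  # number of grid points (illustrative)

def decode(shock_pos):
    """Hypothetical decoder: latent shock location -> step profile on the grid."""
    return [1.0 if i < shock_pos else 0.125 for i in range(N)]

def enkf_update_scalar(ensemble, obs, obs_var, rng):
    """Stochastic (perturbed-observation) EnKF analysis for a directly observed scalar."""
    m = len(ensemble)
    mean = sum(ensemble) / m
    var = sum((x - mean) ** 2 for x in ensemble) / (m - 1)
    gain = var / (var + obs_var)  # Kalman gain with observation operator H = 1
    return [x + gain * (obs + rng.gauss(0.0, obs_var ** 0.5) - x) for x in ensemble]

rng = random.Random(0)
prior = [rng.gauss(25.0, 4.0) for _ in range(20)]  # uncertain shock location
posterior = enkf_update_scalar(prior, obs=30.0, obs_var=1.0, rng=rng)

prior_mean = sum(prior) / len(prior)
latent_mean = sum(posterior) / len(posterior)

# Decoding the latent mean yields one sharp step.
sharp = decode(latent_mean)

# Averaging the members' physical profiles smears the shock over several cells.
smeared = [sum(decode(s)[i] for s in prior) / len(prior) for i in range(N)]
```

The contrast between `sharp` and `smeared` is the point of the sketch; the scalar update itself is the textbook EnKF analysis step.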

05.
arXiv (CS.CV) 2026-06-15

Encoder Winners Do Not Reliably Transfer Across VLA Backbone Scale: A Frozen-Backbone Grafting Diagnostic

Vision-language-action (VLA) policies typically inherit their vision encoder from upstream VLM releases, but it is unclear whether an encoder choice validated on a small VLA transfers to a larger backbone. We introduce a frozen-backbone grafting diagnostic: the vision tower of a released VLA is replaced by a candidate encoder under a fixed protocol (adaptive average pooling, LayerNorm, and a single trainable linear projector), with the language model and action expert frozen. Across four encoders, two LIBERO suites, two backbones (SmolVLA-450M and $\pi_{0.5}$-3.3B), and two-to-three seeds per cell (40 main grafting runs plus native, LoRA, pooling, and zero-/shuffled-image controls, all scored by offline action MSE), the small-backbone winner does not reliably select the large-backbone top tier: SigLIP is best on SmolVLA across both suites, while on $\pi_{0.5}$ DINOv2-small leads the spatial suite and the object suite is a seed-sensitive near-tie band; three of the four backbone-suite comparisons (and 11 of 12 seed-level cells) support backbone-dependent rankings. The grafting wrapper is itself non-neutral with opposite sign across backbones (+45-56% MSE on the SmolVLA native tower, -50-52% on $\pi_{0.5}$), so all conclusions are conditional on the fixed grafting protocol. We position frozen grafting as a cheap target-backbone diagnostic to run before committing to an encoder at scale, not as a closed-loop deployment claim.
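
The fixed grafting protocol (adaptive average pooling, LayerNorm, one linear projector) can be sketched in plain Python as below. The tensor sizes, the pooling axis (over tokens), the affine-free LayerNorm, and all function names are assumptions made for illustration; the abstract does not give these details.

```python
import math
import random

def adaptive_avg_pool(tokens, out_len):
    """Average-pool a list of token vectors down to out_len tokens."""
    n, dim = len(tokens), len(tokens[0])
    pooled = []
    for i in range(out_len):
        start = (i * n) // out_len
        end = -((-(i + 1) * n) // out_len)  # ceil((i + 1) * n / out_len)
        window = tokens[start:end]
        pooled.append([sum(t[d] for t in window) / len(window) for d in range(dim)])
    return pooled

def layer_norm(vec, eps=1e-5):
    """Normalize one token vector to zero mean and unit variance (no affine terms)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def linear(vec, weight, bias):
    """Single linear projector: the only trainable part in this sketch."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]

def graft(tokens, out_len, weight, bias):
    """Candidate-encoder tokens -> tokens shaped for the frozen backbone."""
    return [linear(layer_norm(t), weight, bias)
            for t in adaptive_avg_pool(tokens, out_len)]

rng = random.Random(0)
enc_dim, backbone_dim = 8, 4  # hypothetical widths
tokens = [[rng.gauss(0.0, 1.0) for _ in range(enc_dim)] for _ in range(6)]
weight = [[rng.gauss(0.0, 0.1) for _ in range(enc_dim)] for _ in range(backbone_dim)]
bias = [0.0] * backbone_dim
out = graft(tokens, out_len=4, weight=weight, bias=bias)
```

Here 6 encoder tokens of width 8 become 4 tokens of width 4, which is the shape change a frozen backbone would require from a mismatched encoder.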

06.
Nature (Science) 2026-06-17

CHPO coordinates chilling recovery and nitrogen use in rice

Global rice production faces mounting challenges from abnormal temperature fluctuations and nitrogen-fertilizer-driven environmental pollution [1–7]. Developing varieties that balance chilling resilience and nitrogen-use efficiency (NUE) offers a promising solution, but the molecular networks coordinating these traits remain poorly understood. Here we identify CHILLING PHOENIX (CHPO), a major gene underlying the quantitative trait locus shared by both chilling tolerance and resilience. It encodes a MYB transcription factor that acts as a key regulator coordinating post-chilling recovery with nitrogen use in rice. Natural variation in a GCG-repeat-encoded polyalanine tract alters CHPO DNA-binding preference and redirects regulatory outputs between the japonica-type (CHPOjap) and indica-type (CHPOind), causing opposing effects on chilling tolerance and resilience. This allelic variation is shaped by domestication selection, with the CHPOjap allele probably derived from Chinese wild rice. CHPOjap directly targets OsTCP19 and OsNRT2.4 to fine-tune NUE, thereby enhancing chilling tolerance and resilience. These findings provide a mechanistic framework for a chilling-induced high-nitrogen-utilization module that alleviates the damage caused by chilling stress, and a potential molecular design strategy for breeding rice varieties with both chilling resilience and high NUE at the recovery stage. A rice gene, CHPO, links chilling resilience with nitrogen-use efficiency, revealing a domestication-shaped regulatory mechanism that could guide breeding of climate-resilient, sustainable rice varieties.

07.
arXiv (CS.LG) 2026-06-11

Learning Patterns and Abstractions from Perceptual Sequences

arXiv:2503.10973v2 Announce Type: replace Abstract: Cognition swiftly breaks high-dimensional sensory streams into familiar parts and uncovers their relations. Why do structures emerge, and how do they enable learning, generalization, and prediction? What computational principles underlie this core aspect of perception and intelligence? A sensory stream, simplified, is a one-dimensional sequence. In learning such sequences, we naturally segment them into parts – a process known as chunking. In the first project, I investigated factors influencing chunking in a serial reaction time task and showed that humans adapt to underlying chunks while balancing speed and accuracy. Building on this, I developed models that learn chunks and parse sequences chunk by chunk. Normatively, I proposed chunking as a rational strategy for discovering recurring patterns and nested hierarchies, enabling efficient sequence factorization. Learned chunks serve as reusable primitives for transfer, composition, and mental simulation – letting the model compose the new from the known. I demonstrated this model's ability to learn hierarchies in single and multi-dimensional sequences and highlighted its utility for unsupervised pattern discovery. The second part moves from concrete to abstract sequences. I taxonomized abstract motifs and examined their role in sequence memory. Behavioral evidence suggests that humans exploit pattern redundancies for compression and transfer. I proposed a non-parametric hierarchical variable model that learns both chunks and abstract variables, uncovering invariant symbolic patterns. I showed its similarity to human learning and compared it to large language models. Taken together, this thesis suggests that chunking and abstraction as simple computational principles enable structured knowledge acquisition in hierarchically organized sequences, from simple to complex, concrete to abstract.

08.
arXiv (CS.AI) 2026-06-15

SpheriCity: Designing Trustworthy Conversational AI for Sustainability Decision Support

arXiv:2606.13854v1 Announce Type: cross Abstract: We present SpheriCity, an expert-grounded conversational prototype designed to support trustworthy knowledge sensemaking from sustainability reports. City-level circularity assessment reports contain rich information about materials, infrastructure, and policy interventions, yet their length and heterogeneous structure make cross-document synthesis and comparison difficult for practitioners and researchers working on circular economy initiatives. While large language models (LLMs) promise faster knowledge access and synthesis, their opaque reasoning, hallucinations, and lack of source transparency introduce risks for trust and interpretability, and require verification in high-stakes sustainability contexts. SpheriCity addresses these challenges through a provenance-first conversational agent that foregrounds evidence traceability, structured synthesis, and interaction scaffolds to support exploratory querying and cross-document synthesis across sustainability reports. We conducted a formative expert review with six sustainability experts using representative queries spanning cross-city comparison, policy summarization, and recommendation-oriented tasks. Experts evaluated responses across dimensions and provided qualitative reflections on the system's usefulness for sustainability knowledge work. Our results reveal that transparent sourcing, contextual explanation, interpretability, and alignment with expert workflow strongly shape expert trust and judgments of system usefulness. This work contributes (1) a conversational prototype for sustainability knowledge sensemaking, (2) an expert-grounded evaluation framework for assessing AI responses in high-stakes knowledge domains, and (3) design insights into how provenance, uncertainty communication, and integration in workflow influence expert users' trust in AI assistance for sustainability decision support.

09.
arXiv (CS.AI) 2026-06-19

JustDiag!: A Diagnostic Justification Engine for Accountable Root Cause Analysis

arXiv:2606.19407v1 Announce Type: cross Abstract: Large language models can produce fluent root cause analyses, but fluent final answers alone are insufficient evidence for accountability in high-stakes operations. In real incident response, engineers need to know what evidence supported a diagnosis, which alternatives were considered, where contradictions remained, and whether the system resolved the case or preserved uncertainty. We address this gap with JustDiag, a diagnostic justification engine for RCA that maintains an explicit process state over evidence, findings, competing hypotheses, conflicts, and next checks. We evaluated the system on 66 real-world incidents using a two-layer protocol that separately scores final-answer quality and process quality. Relative to a matched control without diagnostic justification, JustDiag achieved stronger outcome and process scores, while accepting slightly lower terminal completion due to more calibrated non-closure. These results suggest that accountable RCA requires explicit diagnostic justification artifacts and process-aware evaluation, not only fluent final answers.

10.
arXiv (CS.CV) 2026-06-16

Post-Launch Capability Expansion of Vision-Language Models via Prompting for On-Orbit Spacecraft Inspection

Spaceborne inspection systems often deploy perception models prior to launch, after which updating model weights or expanding fixed label sets becomes operationally impractical. While supervised models can be integrated pre-flight, adding new semantic capabilities in orbit requires retraining and re-uploading parameters. We investigate whether prompt-driven vision–language models can enable post-launch semantic expansion, allowing new spacecraft components to be specified via natural-language prompts without modifying onboard weights. We evaluate zero-shot instance segmentation of spacecraft components under a strictly frozen, single-pass inference protocol on a test set of $129$ images of previously unseen satellites. Under fixed global thresholds and no post-processing, SAM3 achieves $0.385$ mAP@$0.5$ and $0.267$ mAP@$0.5{:}0.95$. Performance is strongly scale-dependent: large structural elements like spacecraft bodies ($0.639$ AP@$0.5$) and solar arrays ($0.598$ AP@$0.5$) localize reliably, while relatively small appendages like antennas ($0.221$ AP@$0.5$) and thrusters ($0.081$ AP@$0.5$) remain difficult. Prompt formulation influences performance, with structured prompts incorporating spatial and geometric descriptors yielding up to $82\%$ improvement over short category-name prompts. The model operates within the memory and compute envelope of contemporary embedded GPUs, suggesting prompt-driven grounding can provide a practical mechanism for post-launch semantic extension of dominant spacecraft structures while highlighting limitations of zero-shot localization for fine-scale components under orbital domain shift.

11.
arXiv (CS.CL) 2026-06-11

FlowBank: Query-Adaptive Agentic Workflows Optimization through Precompute-and-Reuse

Large Language Model (LLM)-based multi-agent systems are increasingly powerful, but current agentic workflow optimization paradigms make an unsatisfying trade-off. Task-level methods spend substantial offline compute yet deploy only a single workflow, leaving complementary candidates unused, while query-level methods synthesize a new workflow per query at substantial inference cost. Our motivating analysis shows these paradigms are more complementary than competing: workflows discovered during offline search often solve different subsets of queries, and many queries handled by expensive query-level generation can already be solved by cheaper precomputed workflows. This suggests a different objective: rather than searching for one universally best workflow or regenerating one per instance, we should build a compact bank of reusable, complementary workflows and select among them adaptively at inference time. Doing so requires solving three coupled problems: generating complementary rather than redundant candidates, compressing them into a small deployable portfolio, and assigning each query to the right workflow under a performance-cost trade-off. To this end, we present FlowBank, a three-stage framework for portfolio-based agentic workflow optimization. Diversifying proposes DiverseFlow to steer search toward under-covered queries and produce a high-coverage candidate pool. Curating proposes CuraFlow to compress this pool into a compact portfolio with minimal redundancy. Matching casts deployment as edge-value prediction on a query-workflow bipartite graph and routes each incoming query to the portfolio member with the best predicted utility. Across five benchmarks, FlowBank achieves the highest average score among the evaluated methods while remaining cost-competitive, improving over the strongest automated and handcrafted baselines by 4.26% and 14.92% relative, respectively.

12.
arXiv (CS.CL) 2026-06-11

NightFeats @ MMU-RAGent NeurIPS 2025: A Context-Optimized Multi-Agent RAG System for the Text-to-Text Track

We present NightFeats, a structured multi-agent retrieval-augmented generation (RAG) system submitted to the MMU-RAGent competition at NeurIPS 2025, where it was awarded Best Dynamic Evaluation in the text-to-text track. Rather than targeting benchmark maximization, this work proposes a principled pipeline that decomposes knowledge synthesis into three coordinated phases: retrieval, curation, and composition, each governed by explicit intermediate representations and handoff contracts. Inspired by Agentic Context Engineering (ACE), the system introduces temporal-semantic reranking, bounded contradiction reconciliation, and citation-preserving composition as core architectural primitives. Competition results show that NightFeats surpasses proprietary baselines including Claude-SonnetV2 and Nova-Pro on LLM-as-a-Judge and Human Likert evaluations, confirming that architectural transparency and verifiable evidence grounding are better aligned with human preferences than systems optimizing narrowly for automatic similarity metrics.

13.
arXiv (CS.LG) 2026-06-18

Pointwise is Pointless? A Multimodal Ablation Study for Precipitation Nowcasting with Graph Neural Networks

arXiv:2606.18436v1 Announce Type: cross Abstract: Sparse point observations are increasingly available for precipitation nowcasting, but it is unclear how much they improve dense radar-field forecasts. We partially address this question with a multimodal graph neural network nowcasting system over the Nordic radar domain. The model predicts rain rate every five minutes up to two hours ahead and is trained with different combinations of radar history, MEPS numerical weather prediction, Netatmo surface observations, MSG satellite channels, stochastic noise, and CRPS-based ensemble losses. The study is designed as an ablation of operationally relevant information sources and training objectives. We compare radar-only, NWP-informed, station-informed, satellite-informed, noise-augmented, and CRPS-based configurations using complementary diagnostics on the radar grid, at station locations, for rain onset, and through oracle, displacement, and amplitude scores. The results show that each source improves a different part of the forecast problem. MEPS stabilises radar-only extrapolation, Netatmo observations improve local station and onset diagnostics, and satellite predictors reduce some station-level biases but may activate rain too early when used deterministically. CRPS-based configurations provide the most consistent radar-grid gains, while the combined satellite and CRPS setup gives the best overall oracle/DAS score. These results do not support the conclusion that point observations are uninformative for nowcasting, but they show that local observational skill and spatially coherent radar-field skill are distinct targets. The practical implication is that sparse observations can provide useful local constraints, but their benefit for radar-like fields depends on the training loss, uncertainty representation, and how observation support is encoded in the model.
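
For readers unfamiliar with the continuous ranked probability score (CRPS), a minimal sketch of the standard ensemble estimator follows: mean absolute error of the members minus half their mean pairwise spread. The abstract does not say which CRPS variant or weighting the paper trains with, so `ensemble_crps` is an illustrative name and the rain-rate values are made up.

```python
def ensemble_crps(members, obs):
    """Ensemble CRPS estimate for one scalar observation.

    CRPS = mean|x_i - y| - (1 / (2 m^2)) * sum_ij |x_i - x_j|
    Lower is better; a single-member ensemble reduces to absolute error.
    """
    m = len(members)
    skill = sum(abs(x - obs) for x in members) / m
    spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return skill - spread

observed = 2.0  # hypothetical rain rate in mm/h

# Two members that bracket the observation.
bracketing = ensemble_crps([1.0, 3.0], observed)     # 0.5

# Two identical members that both miss by 1 mm/h.
overconfident = ensemble_crps([1.0, 1.0], observed)  # 1.0
```

The spread term is what rewards an ensemble for representing uncertainty instead of collapsing onto a single deterministic forecast.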

14.
arXiv (CS.LG) 2026-06-11

Annealed Entropic Allocation for Ranking and Selection

arXiv:2606.11347v1 Announce Type: cross Abstract: We propose Annealed Entropic Allocation, an annealed weighted soft-min framework for sequential budget allocation in ranking and selection. The central idea is to replace the non-smooth maximin large-deviation rate objective with a weighted log-sum-exp surrogate that aggregates challenger-specific pairwise scores through soft-min weights, mitigating hard switching when several challengers are nearly active. To improve finite-budget discrimination, we incorporate the saddlepoint approximation – a sub-exponential correction derived from refined pairwise tail asymptotics. Because these corrections are sub-exponential and the smoothing parameter is annealed to zero, the surrogate preserves the same first-order large-deviation target as the classical maximin formulation. We show that the surrogate converges uniformly to the hard minimum, that the soft-min weights concentrate on the active challengers, and that, under fixed weights, the induced target allocation map is continuous on the simplex interior. Numerical experiments on Gaussian and exponential instances demonstrate competitive performance, especially when multiple challengers are nearly tied.
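
A minimal numerical sketch of a weighted log-sum-exp soft-min and its annealing behaviour. The exact surrogate, the pairwise scores, and the saddlepoint correction used in the paper are not given in the abstract; the functions `soft_min` and `soft_min_weights`, the scores, and the uniform weights below are assumptions chosen only to show convergence to the hard minimum as the smoothing parameter shrinks.

```python
import math

def soft_min(scores, weights, eps):
    """Weighted soft-min: -eps * log(sum_i w_i * exp(-s_i / eps)), computed stably."""
    lo = min(scores)
    total = sum(w * math.exp(-(s - lo) / eps) for s, w in zip(scores, weights))
    return lo - eps * math.log(total)

def soft_min_weights(scores, weights, eps):
    """Normalized weights w_i * exp(-s_i / eps); they concentrate on the smallest scores."""
    lo = min(scores)
    raw = [w * math.exp(-(s - lo) / eps) for s, w in zip(scores, weights)]
    z = sum(raw)
    return [r / z for r in raw]

# Hypothetical pairwise scores: two nearly tied challengers and one clear loser.
scores = [1.00, 1.02, 2.50]
weights = [1.0 / 3.0] * 3

annealed = {eps: soft_min(scores, weights, eps) for eps in (1.0, 0.1, 0.001)}
near_tie = soft_min_weights(scores, weights, 0.1)
sharp = soft_min_weights(scores, weights, 0.001)
```

With weights that sum to one, the soft-min lies between `min(scores)` and `min(scores) + eps * log(1 / w_k)`, where `w_k` is the weight on a minimizing score, so it tends to the hard minimum as `eps` goes to zero. At `eps = 0.1` the two nearly tied challengers share most of the weight, which is the smooth behaviour the abstract contrasts with hard switching.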

15.
arXiv (CS.CL) 2026-06-12

Zero-source LLM Hallucination Detection with Human-like Criteria Probing

Large language models (LLMs) often hallucinate by generating factually incorrect or unfaithful content, posing significant risks to their safe use. Detecting such hallucinations is particularly challenging under the zero-source constraint, where no model internals or external references are available, and detection must rely solely on the textual query-answer pair. In this paper, we propose Human-like Criteria Probing for Hallucination Detection (HCPD), a paradigm that emulates the multi-faceted reasoning of human evaluators. Its core is a Human-like Criteria Probing (HCP) mechanism, in which an LLM agent adaptively decomposes its judgment into a weighted set of interpretable criteria and aggregates criterion-specific scores into a final truthfulness measure. To achieve this adaptive capability, we introduce a reward-based alignment scheme using only weak supervision from semantic consistency. At inference, we employ a multi-sampling aggregation strategy to ensure robust decisions while preserving full interpretability. We further provide theoretical analysis supporting the reliability of our approach. Extensive experiments show that HCPD consistently outperforms state-of-the-art baselines, offering an effective and explainable solution for zero-source hallucination detection. Code is available at https://github.com/TRISKEL10N/HCPD.

16.
arXiv (CS.CV) 2026-06-11

MedVeriSeg: Teaching LISA-Like Medical Segmentation Models to Verify Query Validity Without Extra Training

Despite recent progress in text-prompt-based medical image segmentation, existing LISA-like MLLM-based methods typically generate masks regardless of whether the target specified in the query is present, leading to hallucinated segmentation. In this work, we propose MedVeriSeg, a training-free query verification framework that enables LISA-like medical segmentation models to reject false segmentation queries. MedVeriSeg first quantifies the response quality between the [SEG] token and image features through a Similarity Response Quality Scoring Module. To further improve robustness, it employs a Lightweight Routed Multi-Agent Verification Module, which fuses quantitative score evidence with qualitative agent evidence to comprehensively verify the validity of the query. To support systematic evaluation, we construct MedVeriSeg-Bench, a benchmark designed for query verification in medical image segmentation. Experimental results demonstrate that MedVeriSeg effectively identifies false segmentation queries and reduces hallucinated segmentation, while maintaining a high acceptance rate for valid queries, thereby largely preserving the segmentation utility of LISA-like medical segmentation models.

17.
arXiv (CS.CV) 2026-06-16

PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory

Consistent video generation under editing operations requires persistence: when edits modify scene appearance or layout, subsequent generations should remain coherent across time and viewpoints. However, existing memory designs struggle to maintain long-term consistency after such modifications, as stored contexts may become outdated or invalid. To address this, we propose PermaVid, a novel framework built upon a multi-modal context memory that disentangles spatial context into semantic appearance and geometric structure, together with an edit-aware memory update and retrieval strategy that keeps memory evolution aligned with subsequent observations. Specifically, we develop two complementary memory banks: an RGB context memory that captures appearance-aware observations while implicitly encoding geometry, and a depth context memory that preserves geometry-only structure disentangled from semantics. Building on this design, we introduce a memory-guided video generation model that performs multi-modal feature fusion under reference conditions drawn from mixed-modality memory contexts. Experiments demonstrate that our method maintains strong long-term semantic and structural consistency after edits, significantly outperforming state-of-the-art methods.

18.
medRxiv (Medicine) 2026-06-10

A Heterogeneous Graph Neural Network Framework for Multi-Horizon Stroke Mortality Prediction

Background: Machine learning models for stroke mortality prediction typically treat each time horizon independently and use flat tabular features that ignore the relational structure of electronic health records (EHRs). In this pilot study, we leveraged graph-based machine learning models to predict post-stroke all-cause mortality across three different time horizons. Methods: We developed Stroke Temporal Heterogeneous Graph (StrokeTHG), a heterogeneous graph neural network model for simultaneous multi-horizon stroke mortality prediction (30-day, 90-day, 1-year) using EHR data from Penn State Health System. The model encodes various relations among EHR entities (e.g., patient, diagnosis, comorbidity) and temporal encoding of admission time to better predict stroke mortality. We compared our proposed approach against various baseline methods, including Logistic Regression, Random Forest, and XGBoost. We also performed ablation and subgroup analyses, evaluated the quality of learned graph embeddings, and assessed the importance of different edge types in the graph. Results: We included 4,144 stroke patients (mean age 69.2 years; 54.3% men), of whom 3,332 (80.4%) survived to one year after their stroke. 30-day, 90-day, and 1-year mortality rates were 9.7%, 13.7%, and 19.6%, respectively. Our proposed approach, StrokeTHG, achieved AUROCs of 0.872, 0.878, and 0.837 across horizons, outperforming all tabular baselines. At ≥75% specificity, the model identified 5-10 percentage points more mortality cases than the best baseline at each horizon. Subgroup analysis demonstrated consistent performance across sex subgroups and the largest discriminative gains in the Age 65-80 stratum. Edge-type ablation identified phenotype-patient and admission-patient edges in the constructed EHR graph as the most influential relational edges for mortality prediction. StrokeTHG embeddings outperformed all graph and matrix factorization baselines under an identical downstream classifier, confirming that performance gains stem from representation quality rather than classifier capacity. Conclusions: StrokeTHG demonstrates that heterogeneous graph representations of EHR data provide a consistent improvement over flat tabular models for multi-horizon stroke mortality prediction, with particular advantage at clinically actionable sensitivity thresholds and novel multi-horizon monotonic prediction capability. This methodological framework may be adaptable to other EHR-based clinical research studies seeking to leverage heterogeneous relational structures for predictive modeling.
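
The Results compare models by the share of mortality cases identified at a specificity of at least 75%. A minimal sketch of that kind of metric is below; the function name, the thresholding rule, and the toy labels and scores are illustrative assumptions and not the study's evaluation code or data.

```python
def sensitivity_at_specificity(labels, scores, min_specificity=0.75):
    """Best sensitivity over score thresholds whose specificity meets the floor.

    labels: 1 = died within the horizon, 0 = survived.
    scores: predicted risk; a case is flagged when score >= threshold.
    """
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    positives = [s for y, s in zip(labels, scores) if y == 1]
    best = 0.0
    for thr in sorted(set(scores)) + [float("inf")]:
        specificity = sum(s < thr for s in negatives) / len(negatives)
        if specificity >= min_specificity:
            sensitivity = sum(s >= thr for s in positives) / len(positives)
            best = max(best, sensitivity)
    return best

# Toy data: four survivors followed by four deaths.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.7, 0.25, 0.6, 0.8, 0.9]
result = sensitivity_at_specificity(labels, scores)  # 0.75
```

On the toy data the lowest admissible threshold is 0.6, which keeps three of four survivors unflagged and catches three of four deaths.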

19.
arXiv (math.PR) 2026-06-12

Diffusion approximations for interacting stochastic systems with reflection and control

arXiv:2601.05895v2 Announce Type: replace Abstract: We study diffusion approximations for a class of interacting stochastic systems with reflection and control. Motivated by interacting stochastic dynamics subject to feedback mechanisms and boundary constraints, we consider diffusion-scaled stochastic processes incorporating stochastic fluctuations, state-dependent interactions, and reflection. Under suitable assumptions, we establish convergence in distribution of the scaled processes to systems of interacting reflected stochastic differential equations of Ornstein-Uhlenbeck type. The limiting dynamics capture key features of constrained multi-agent systems, including mean-reverting behavior, interaction effects, and confinement within bounded domains through Skorokhod reflection. The analysis combines diffusion-scaling arguments, stability estimates, and continuity properties of the Skorokhod map to connect discrete stochastic systems with their reflected diffusion limits. To illustrate the framework, we present numerical examples motivated by crowd dynamics and neural population dynamics. The simulations demonstrate qualitative agreement between the finite stochastic systems and the corresponding reflected diffusion models and illustrate how diffusion approximations can provide tractable descriptions of interacting stochastic systems with constraints.
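
A minimal simulation sketch of an interacting Ornstein-Uhlenbeck system confined to an interval, using a projected Euler scheme as a common numerical stand-in for Skorokhod reflection. The drift form (mean reversion plus attraction to the empirical mean), every coefficient, and the function name are assumptions for illustration; the abstract does not specify the paper's model or its numerical examples.

```python
import math
import random

def simulate_reflected_ou(n_agents=20, steps=2000, dt=0.01, theta=1.0, mu=0.8,
                          kappa=0.5, sigma=0.6, lower=0.0, upper=1.0, seed=0):
    """Projected Euler scheme for interacting OU particles kept inside [lower, upper]."""
    rng = random.Random(seed)
    x = [rng.uniform(lower, upper) for _ in range(n_agents)]
    path_min, path_max = min(x), max(x)
    for _ in range(steps):
        mean_x = sum(x) / n_agents
        new_x = []
        for xi in x:
            # Mean reversion toward mu plus attraction to the empirical mean.
            drift = -theta * (xi - mu) + kappa * (mean_x - xi)
            proposal = xi + drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            # Projection onto the interval approximates reflection at the boundary.
            new_x.append(min(max(proposal, lower), upper))
        x = new_x
        path_min = min(path_min, min(x))
        path_max = max(path_max, max(x))
    return x, path_min, path_max

final_state, path_min, path_max = simulate_reflected_ou()
```

The projection step is where the boundary constraint enters: any proposal that leaves the interval is pushed back by the minimal amount, which mirrors the role of the regulator term in the Skorokhod problem.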

20.
arXiv (CS.CV) 2026-06-19

Mix-QVLA: Task-Evidence-Aware Mixed-Precision Quantization of Vision-Language-Action Models

We propose Mix-QVLA, a task-evidence-aware mixed-precision PTQ framework for VLA models. Mix-QVLA anchors each quantized variant to the full-precision action-token reference decision and evaluates whether quantization preserves task-relevant evidence across key VLA functional boundaries. It computes normalized gradient-weighted task-evidence maps from boundary activations and compares full-precision and quantized maps using evidence-mass and attribution-distribution distortion, capturing changes in both the strength and allocation of decision-supporting evidence. A soft-bottleneck objective aggregates boundary-level degradation into layer-wise sensitivity scores. Mix-QVLA further models sensitivity throughout task execution, capturing phase-dependent shifts in layer importance rather than assuming a fixed sensitivity profile. The resulting evidence- and time-aware scores guide mixed-precision bit allocation under model-size and BitOps budgets. Extensive evaluations on OpenVLA-style policies show that Mix-QVLA improves the accuracy-efficiency trade-off of low-bit VLA deployment. On LIBERO, Mix-QVLA reduces OpenVLA-OFT memory from 15.4 GB to 4.1 GB, retains 96.3 average success compared with 97.1 for the BF16 model, and achieves a 1.52x inference speedup.

21.
arXiv (CS.AI) 2026-06-17

ANEForge: Python for direct computation on the Apple Neural Engine

arXiv:2606.17090v1 Announce Type: cross Abstract: ANEForge is a Python package that programs the Apple Neural Engine (ANE), the fixed-function neural accelerator on every recent Apple device, directly and without CoreML. In production the engine is reachable only through CoreML, which treats it as a scheduling option: no configuration requires the ANE, and a model can silently run on the CPU or GPU instead. ANEForge compiles a lazy tensor graph, built from 58 fused operators and 19 native bridge operators, into a single ANE program. The program is dispatched through the same ANE daemon and kernel-driver stack as Apple's internal framework. Beyond inference, the package reaches the engine's native fused attention, streams int8, int4, and sparse weights, keeps decoder and optimizer state resident across steps, and runs the forward pass, backward pass, and optimizer update of training on the engine. A small fused program completes a call in about 90 µs, near the engine's 70 µs per-program dispatch floor, and a pretrained ResNet-18 forward runs end-to-end in 0.33 ms. ResNet-18, a sentence encoder, and a Vision Transformer run end-to-end against framework references, and a Stable Diffusion U-Net validates its forward pass. ANEForge targets Apple Silicon under macOS 14 and later. Each release is verified against a recorded macOS and ANE-compiler version.

22.
arXiv (CS.CL) 2026-06-16

Islamic Large Language Models: From Knowledge Acquisition to Trustworthy and Hallucination-Resistant AI

Large language models (LLMs) are increasingly used for knowledge-intensive question answering, including religious and legal questions. Islamic knowledge is a particularly demanding setting: answers are expected to be grounded in authoritative sources, citations must be exact, Arabic varieties differ substantially from the language of classical sources, and legitimate jurisprudential disagreement must be represented rather than collapsed into a single answer. This survey reviews the emerging field of Islamic LLMs and trustworthy Islamic AI. We organize the literature around Arabic NLP and Arabic-centric LLMs, Islamic NLP resources, Qur'anic question answering, Islamic knowledge benchmarks, retrieval-augmented generation, Islamic legal reasoning, inheritance reasoning, hallucination evaluation, and trustworthiness. We argue that fluency in Arabic is not sufficient for Islamic AI. Reliable systems require curated sources, retrieval and verification modules, citation-aware generation, madhhab-aware reasoning, human expert evaluation, and benchmarks that measure not only answer accuracy but also faithfulness, source validity, and reasoning quality. The survey concludes with a research agenda for hallucination-resistant Islamic AI systems.

23.
arXiv (quant-ph) 2026-06-16

A Gauge-Covariant Geometric Framework for Non-Hermitian Quantum Systems

arXiv:2606.15922v1 Announce Type: new Abstract: We develop a comprehensive, gauge-covariant geometric framework for non-Hermitian quantum systems in the quasi-Hermitian regime, that is, the region of parameter space where the non-Hermitian Hamiltonian admits a real spectrum and a positive-definite metric operator. We build this framework by elevating the Dyson map to a central geometric object. This map is the transformation that converts a non-Hermitian Hamiltonian into an equivalent Hermitian one. From it we construct the Dyson connection and decompose it into Hermitian and anti-Hermitian parts, identified respectively as stretching and rotation components. This decomposition cleanly separates the genuine physical metric deformations from the unitary gauge redundancies. Working with manifestly gauge-covariant states, we then derive the complex non-Hermitian Berry phase and the quantum geometric tensor (QGT), and show that the non-Hermitian geometric curvature originates from the non-commutativity of the stretching components at the operator level. We further analyse the geometric singularities near an exceptional point (EP) and uncover a distinct hierarchy of divergences. For a general two-level non-Hermitian model, the quantum metric tensor (QMT) exhibits a leading-order divergence $\sim |\epsilon_\mu|^{-2}$, while the Berry curvature shows a weaker, subleading divergence $\sim |\epsilon_\mu|^{-3/2}$, with $\epsilon_\mu$ denoting the parameter displacement from the EP along an individual parameter axis $\mu$. Finally, we examine physical realizations of this model, including the non-Hermitian Su–Schrieffer–Heeger (SSH) and Hatano–Nelson (HN) models, where exact analytical results confirm the predicted critical scaling laws and illustrate the metric-deformation-driven non-Hermitian geometries.
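
For orientation, the standard quasi-Hermitian relations behind the abstract can be written as follows. The symbols $S$ (Dyson map), $h$ (Hermitian counterpart), $\eta$ (metric operator), $g$ (quantum metric) and $\Omega$ (Berry curvature) are notation assumed here and may differ from the paper's conventions.

```latex
% Dyson map S turns the non-Hermitian H into a Hermitian h,
% which is equivalent to quasi-Hermiticity of H with metric \eta:
h = S H S^{-1} = h^\dagger,
\qquad
\eta = S^\dagger S > 0,
\qquad
H^\dagger \eta = \eta H .

% Unitary gauge redundancy: S and U S give the same metric.
S \to U S, \quad U^\dagger U = 1
\quad\Longrightarrow\quad
\eta \to S^\dagger U^\dagger U S = \eta .

% Hierarchy of divergences quoted in the abstract:
g \sim |\epsilon_\mu|^{-2},
\qquad
\Omega \sim |\epsilon_\mu|^{-3/2}
\quad\Longrightarrow\quad
\frac{\Omega}{g} \sim |\epsilon_\mu|^{1/2} \to 0
\quad (\epsilon_\mu \to 0).
```

The last line only restates the two quoted scaling laws as a ratio: near the EP the metric divergence dominates the curvature divergence.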

24.
arXiv (CS.LG) 2026-06-16

Fantastic Pretraining Optimizers and Where to Find Them II: Hyperball Optimization

arXiv:2606.16899v1 Announce Type: new Abstract: Matrix-based optimizers such as Muon can substantially speed up language model pretraining, but their gains over AdamW are observed to shrink as model size and data scale grow when using standard constant decoupled weight decay. We propose Hyperball, a simple optimizer wrapper that addresses this issue. Given a base optimizer such as Adam or Muon, Hyperball sets the Frobenius norms of weight matrices and their corresponding optimizer updates to fixed constants. On Qwen3-style models up to 1.2B parameters, Muon Hyperball achieves a 20–30% token-equivalent speedup over weight decay baselines. Hyperball also improves learning rate transfer across widths and depths compared to decoupled weight decay. This method is motivated by prior theory showing that training with weight decay leads to an equilibrium weight norm that depends only on the training hyperparameters. Through this mechanism, the weight decay then determines the angular learning rate, i.e., how fast the direction of the weight matrix changes.
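
The norm-fixing step described in the abstract can be sketched in a few lines. This is an interpretation rather than the authors' implementation: the function names, the order of operations (rescale the update, subtract it, then rescale the weights) and the use of plain nested lists are assumptions for illustration.

```python
import math


def frobenius_norm(matrix):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


def rescale(matrix, target_norm):
    """Return a copy of matrix scaled to the target Frobenius norm."""
    norm = frobenius_norm(matrix)
    if norm == 0.0:
        return [list(row) for row in matrix]  # nothing to rescale
    scale = target_norm / norm
    return [[v * scale for v in row] for row in matrix]


def hyperball_step(weight, base_update, weight_norm, update_norm):
    """One wrapped step (hypothetical form).

    base_update is whatever the base optimizer (e.g. Adam or Muon)
    proposes; only its direction is used, since its norm is fixed.
    """
    update = rescale(base_update, update_norm)
    stepped = [[w - u for w, u in zip(w_row, u_row)]
               for w_row, u_row in zip(weight, update)]
    # Put the weights back on the sphere of radius weight_norm.
    return rescale(stepped, weight_norm)
```

With both norms held constant, the ratio `update_norm / weight_norm` plays the role of an angular step size, which is the quantity the abstract says weight decay otherwise controls implicitly.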

25.
arXiv (CS.LG) 2026-06-16

Surrogate-Assisted Framework for SI-Compliant Interconnect Design Optimization Using the Earth Mover's Distance

arXiv:2606.15234v1 Announce Type: cross Abstract: This work presents a deterministic, machine-assisted framework for SI-compliant PCB design based on the Earth Mover's Distance (EMD). In contrast to conventional surrogate-based optimization methods that rely on iterative black-box search procedures, the proposed approach follows an interpretable, sequential evaluation strategy. Neural surrogate models are first used to efficiently predict waveform-describing features from topology-dependent design parameters. A decision tree then acts as a physically motivated quality gate that identifies SI-compliant waveforms according to predefined SI criteria. Within the resulting valid solution space, the Earth Mover's Distance is employed as a similarity metric to rank candidate designs according to their proximity to an ideal reference signal. This enables not only the deterministic identification of admissible parameter regions but also a transparent prioritization of physically superior solutions without inverse modeling or stochastic search procedures. The methodology is demonstrated using a large-scale set of simulated DDR3 fly-by waveforms. By combining surrogate prediction, interpretable classification, and EMD-based waveform evaluation, the framework provides an explainable and computationally efficient alternative to conventional optimization strategies for supporting PCB development with AI-based methods.
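
The ranking step described above can be illustrated with the one-dimensional Earth Mover's Distance, which for two distributions on the same grid equals the summed absolute difference of their cumulative sums. The sketch below assumes each waveform has already been converted into non-negative samples on a shared grid; that preprocessing, along with the names `emd_1d` and `rank_by_emd`, is an assumption and not taken from the paper.

```python
def emd_1d(p, q, bin_width=1.0):
    """1-D Earth Mover's Distance between two non-negative sequences
    sampled on the same grid, each normalized to unit mass."""
    if len(p) != len(q):
        raise ValueError("sequences must share the same grid")
    mass_p, mass_q = sum(p), sum(q)
    if mass_p <= 0 or mass_q <= 0:
        raise ValueError("sequences must have positive total mass")
    cumulative_gap = 0.0
    total = 0.0
    for a, b in zip(p, q):
        # Running difference of the two cumulative distributions.
        cumulative_gap += a / mass_p - b / mass_q
        total += abs(cumulative_gap)
    return total * bin_width


def rank_by_emd(reference, candidates):
    """Return candidate names ordered from closest to farthest."""
    return sorted(candidates,
                  key=lambda name: emd_1d(reference, candidates[name]))
```

Given a reference `[0, 1, 0, 0]`, a candidate shifted by one bin has distance 1 and one shifted by two bins has distance 2, so the ranking reflects how far mass must be moved rather than only whether samples match.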