Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-16

FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies

arXiv:2605.27284v2 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models are increasingly expected to not only complete robot tasks, but also follow human instructions about how those tasks should be executed. However, existing robot datasets usually pair trajectories with coarse goal-level language, leaving execution-critical details such as active arm, approach direction, and contact region unspecified. This limits steerable policy learning and robotic video understanding. We introduce FineVLA, an open framework for action-aligned fine-grained VLA supervision. The framework includes: (1) a data construction tool that unifies 972,247 trajectories across 85K tasks from 10 open-source robot datasets and builds FineVLA-Data, a human-verified dataset of 47,159 fine-grained trajectories; (2) a held-out benchmark with 500 videos, 11,631 atomic facts, and 1,030 VQA questions; (3) a robotics-specialized VLM annotator for scalable fine-grained annotation; and (4) a steerable VLA policy trained with controlled mixtures of fine-grained and raw goal-level instructions. Our experiments yield three findings. First, fine-grained supervision does not sacrifice goal-level success: FG-only improves over Raw-only by +1.4 to +8.1 success-rate points across settings. Second, fine-grained and raw instructions are complementary, following a consistent inverted-U trend peaking at FG:Raw = 1:2 to 1:1. The best mixed setting reaches 86.8%/82.5% in RoboTwin simulation and 62.7/100 in real-world dual-arm manipulation (vs. 49.9 Raw-only). Third, fine-grained supervision improves steerable control: the largest real-world gains appear on pose (+23), color (+18), and approach direction (+18)–factors where goal-level instructions provide no guidance. Overall, fine-grained language should augment goal-level instructions: specifying how to execute alongside what to achieve. Project page: https://finevla.xlang.ai/

02.
arXiv (quant-ph) 2026-06-19

Sparse positive maps on qutrits with exact nondecomposability thresholds and PPT-entanglement transitions

arXiv:2606.19765v1 Announce Type: new Abstract: We study a family of sparse positive maps on qutrits for which positivity, decomposability, and PPT entanglement can all be analysed explicitly. The block structure of the associated Choi matrices reduces positivity to a Hermitian biquadratic form and leads to exact positivity boundaries for three representative parametric families. For the same families we determine the exact transition between decomposable and non-decomposable maps and construct associated PPT states of two classes. The first consists of witness-adapted deformations naturally tied to the non-decomposability analysis. The second consists of analytically tractable families whose full PPT-entangled branch is detected by fixed positive maps, yielding exact thresholds between separability and bound entanglement. For the trace-preserving subclass, we further compare positivity with a recent eigenvalue bound for 2-positive maps, thereby making the gap between positivity and higher-order positivity fully explicit within this family.

03.
arXiv (CS.LG) 2026-06-19

Alternating Direction Method of Multipliers for Nonlinear Matrix Decompositions

arXiv:2512.17473v3 Announce Type: replace-cross Abstract: We present an algorithm based on the alternating direction method of multipliers (ADMM) for solving nonlinear matrix decompositions (NMD). Given an input matrix $X \in \mathbb{R}^{m \times n}$ and a factorization rank $r \ll \min(m, n)$, NMD seeks matrices $W \in \mathbb{R}^{m \times r}$ and $H \in \mathbb{R}^{r \times n}$ such that $X \approx f(WH)$, where $f$ is an element-wise nonlinear function. We evaluate our method on several representative nonlinear models: the rectified linear unit activation $f(x) = \max(0, x)$, suitable for nonnegative sparse data approximation, the component-wise square $f(x) = x^2$, applicable to probabilistic circuit representation, and the MinMax transform $f(x) = \min(b, \max(a, x))$, relevant for recommender systems. The proposed framework flexibly supports diverse loss functions, including least squares, $\ell_1$ norm, and the Kullback-Leibler divergence, and can be readily extended to other nonlinearities and metrics. We illustrate the applicability, efficiency, and adaptability of the approach on real-world datasets, highlighting its potential for a broad range of applications.

04.
medRxiv (Medicine) 2026-06-17

Efficacy of a Gamified Digital Platform for Substance Use Education and Overdose Prevention Among College Students: a Pilot and Feasibility Study

Background: For US young adults aged 18-25 in the 2018-2024 period, fentanyl was involved in 78.2% of the 44,020 unintentional or undetermined-intent overdose deaths, most often co-involving stimulants and other non-opioid substances. While fatal overdose rates in this age group have fallen to their lowest recorded level, emergency medical services-attended non-fatal overdose events have reached record highs, shifting the decisive variable toward bystander recognition and response. College students report near-universal alcohol education but minimal education on the substances actually driving overdose mortality. Methods: We conducted a single-group pre-post evaluation of the DopaGE Portal, a gamified, mastery-based digital platform covering cocaine, MDMA, benzodiazepines, and opioid overdose response, deployed at a public university (UNL) and a multi-campus volunteer network (TACO). Paired pre/post surveys (N=42) measured self-efficacy (7 items; primary), behavioral intentions, risk perception, and knowledge/attitudes on 5-point scales, plus four factual knowledge questions. Paired t-tests, exact McNemar tests, and Benjamini-Hochberg correction across eight primary tests were applied. Institutional naloxone distribution at UNL was tracked as an ecological behavioral outcome. A mandated high-school cohort (N=94) provided supplementary acceptability data. Results: Self-efficacy increased from 2.82 to 4.46 (d=2.00, 95% CI 1.46-2.55; adjusted p

05.
arXiv (CS.CL) 2026-06-18

Efficient Hallucination Detection for LLMs Using Uncertainty-Aware Attention Heads

While large language models (LLMs) have become highly capable, they remain prone to factual inaccuracies, commonly referred to as "hallucinations." Uncertainty quantification (UQ) offers a promising way to mitigate this issue, but most existing methods are computationally intensive and/or require supervision. In this work, we propose Recurrent Attention-based Uncertainty Quantification (RAUQ), an unsupervised and efficient framework for identifying hallucinations. The method leverages an observation about transformer attention behavior: when incorrect information is generated, certain "uncertainty-aware" attention heads tend to reduce their focus on preceding tokens. RAUQ automatically detects these attention heads and combines their activation patterns with token-level confidence measures in a recurrent scheme, producing a sequence-level uncertainty estimate in just a single forward pass. Through experiments on twelve datasets spanning question answering, summarization, and translation across nine different LLMs, we show that RAUQ consistently outperforms state-of-the-art UQ baselines. Importantly, it incurs minimal overhead, requiring less than 1\% additional computation. Since it requires neither labeled data nor extensive parameter tuning, RAUQ serves as a lightweight, plug-and-play solution for real-time hallucination detection in white-box LLMs.

06.
arXiv (CS.CV) 2026-06-17

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

Recent progress in latent world models (e.g., V-JEPA2) has shown promising capability in forecasting future world states from video observations. Nevertheless, dense prediction from a short observation window limits temporal context and can bias predictors toward local, low-level extrapolation, making it difficult to capture long-horizon semantics and reducing downstream utility. Vision–language models (VLMs), in contrast, provide strong semantic grounding and general knowledge by reasoning over uniformly sampled frames, but they are not ideal as standalone dense predictors due to compute-driven sparse sampling, a language-output bottleneck that compresses fine-grained interaction states into text-oriented representations, and a data-regime mismatch when adapting to small action-conditioned datasets. We propose a VLM-guided JEPA-style latent world modeling framework that combines dense-frame dynamics modeling with long-horizon semantic guidance via a dual-temporal pathway: a dense JEPA branch for fine-grained motion and interaction cues, and a uniformly sampled VLM thinker branch with a larger temporal stride for knowledge-rich guidance. To transfer the VLM's progressive reasoning signals effectively, we introduce a hierarchical pyramid representation extraction module that aggregates multi-layer VLM representations into guidance features compatible with latent prediction. Experiments on hand-manipulation trajectory prediction show that our method outperforms both a strong VLM-only baseline and a JEPA-predictor baseline, and yields more robust long-horizon rollout behavior.

07.
arXiv (CS.LG) 2026-06-15

Trust but Verify: Mitigating Medical Hallucinations via Post-Hoc Adversarial Auditing and Multi-Agent Feedback Loops

arXiv:2606.14149v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in healthcare settings, yet their tendency to hallucinate poses risks when clinical decisions are involved. This study examine whether LLMs recommend recently banned or withdrawn pharmaceuticals when answering clinical questions and tests an agent-based method for reducing such errors. We developed a five-agent "Trust but Verify" system using a single LLM backbone. To measure regulatory knowledge obsolescence, we created an adversarial dataset of 103 clinical MCQs where historically correct answers now refer to banned substances. This scale ensures statistical significance across various therapeutic classes. We evaluated three open-access model families (GPT-OSS, Llama-3, Falcon-3) under vanilla and agentic conditions. Performance was measured via pointwise score, label accuracy, Hallucination Error Rate (HER), and Component Fidelity (CF) score. We also observed clinical safety regression in proprietary models. In default configurations, all models showed high hallucination rates, consistently selecting banned drugs that matched training data patterns. Our proposed agentic architecture reduced HER by approximately 53% across models. Pointwise scores shifted from -0.25 (unsafe recommendation) toward 0.0 (appropriate refusal). The safety audit intercepted dangerous outputs even when models' parametric knowledge favored the banned substance. The proposed multi-agent framework offers a model-agnostic method for enforcing regulatory compliance that prioritizes patient safety over fluent text generation. Our work demonstrates a practical approach for deploying autonomous AI systems in safety-critical healthcare settings. It shows how real-time regulatory data can be integrated into LLM pipelines to support clinical decision-making.

08.
Nature (Science) 2026-06-10

Two-component exciton condensates in an electron–hole bilayer

作者:

Macroscopic quantum coherence emerges when bosons condense into a Bose–Einstein condensate (BEC)1–5. Excitons are a long-sought solid-state route to high-temperature BECs with strong interactions, electrical tunability and potentially multicomponent spinor order, but conclusive evidence for equilibrium condensation has remained elusive. Here we report evidence for two-component exciton BECs in MoSe2/hBN/WSe2 electron–hole bilayers6–9 by probing the spin–valley susceptibility of constituent electrons and holes. This heterostructure hosts equilibrium exciton fluids with four spin–valley flavours. Magneto-optical spectroscopy in a dilution refrigerator reveals three exciton condensate phases with distinct flavour polarizations. At zero magnetic field, the many-body ground state is a coherent superposition of two condensed intravalley exciton flavours. Under a magnetic field, the intravalley exciton condensate first switches to a two-component intervalley condensate through a first-order quantum phase transition at a weak critical field and then turns into a fully polarized single-component condensate at high fields. The condensate signatures form a dome in density–temperature space, persisting up to approximately 1.8 K. Our results establish van der Waals electron–hole bilayers as a versatile platform for strongly interacting, multicomponent exciton BECs. Macroscopic quantum coherence arises in two-component exciton Bose–Einstein condensates within MoSe2/hBN/WSe2 electron–hole bilayers, exhibiting distinct spin–valley polarized phases, quantum phase transitions under magnetic fields and stable condensate behaviour up to approximately 1.8 K.

09.
arXiv (CS.AI) 2026-06-16

An affordable hardware-aware neural architecture search for deploying convolutional neural networks on ultra-low-power computing platforms

arXiv:2606.16290v1 Announce Type: cross Abstract: Hardware-aware neural architecture search (HW-NAS) allows the integration of Convolutional Neural Networks (CNNs) in microcontrollers devices by automatically designing neural architectures that can fit prearranged hardware constraints. However, state-of-the-art HW-NAS target high-performance microcontrollers, whose power consumption does not meet sensing nodes requirements. This work presents a HW-NAS generating tiny CNNs that can run on ultra-low-power microcontrollers, featuring a lightweight search procedure enabling its execution even on embedded devices. Empirical results on three well-known benchmarks for tiny computer vision proved that the proposed HW-NAS was able to generate tiny CNNs while preserving state-of-the-art classification accuracy.

10.
arXiv (CS.LG) 2026-06-16

Latent space mapping of interpretable structural coordinates from stochastic single-molecule signals

arXiv:2606.16950v1 Announce Type: cross Abstract: Nanopores are versatile single-molecular sensors, but their utility is fundamentally constrained by stochastic translocation dynamics warping any encoded information. We resolve it by shifting from time-domain analysis to a learned latent-space mapping via a contrastive encoder trained exclusively on simulated signals from a physics-informed model. This encoder maps solid-state nanopore signals of engineered DNA barcodes into an interpretable molecular coordinate system. The learned representation is responsive to structural barcode parameters while remaining invariant to acquisition conditions and translocation conformation, allowing data pooling across devices. Molecule identification requires a single pass through the encoder, reducing computational cost by three orders of magnitude relative to alignment-based methods. We experimentally validate through mixture quantification, rare-variant detection, consensus barcode reconstruction, and real-time signal acquisition. This shift from temporal analysis to mapping structural coordinates into a latent space changes the paradigm behind analyzing stochastic sensor signals by linking classification to interpretable encoded molecular information.

11.
arXiv (CS.AI) 2026-06-17

MathVis-Fine: Aligning Visual Supervision with Necessity via Progressive Dependency-Guided Training for Multimodal Mathematical Reasoning

arXiv:2606.17888v1 Announce Type: new Abstract: Chain-of-Thought (CoT) reasoning has extended from purely linguistic domains to multimodal scenarios; however, existing approaches often treat visual inputs as homogeneous or auxiliary signals, failing to capture the intricate and sample-specific dependencies between text and images in mathematical problem-solving. This gives rise to two core issues: first, the supervisory signals for visual content are generalized and coarse-grained, lacking adaptation to the actual necessity of visual information in each sample; second, training feedback becomes inaccurate when visual rewards are uniformly applied without distinguishing the complementary relationships among inputs. These limitations hinder models from achieving precise multimodal reasoning. In this work, we propose a framework for modeling fine-grained visual dependencies in mathematical reasoning. We first construct the MathVis-Fine dataset, augmenting fine-grained visual annotations with visual dependency ratings. Building upon this dataset, we introduce a two-stage progressive visual enhancement training paradigm that balances answer correctness rewards and visual grounding rewards according to the intrinsic visual dependency level of each sample, thereby mitigating reward bias and improving supervision accuracy. Extensive experiments demonstrate that the MathVis-Fine framework effectively enhances visual perception progressively based on visual dependency, offering a more precise training framework for multimodal mathematical reasoning. We will release the dataset upon acceptance.

12.
arXiv (CS.CL) 2026-06-16

The Truth Stays in the Family: Enhancing Contextual Grounding via Inherited Truthful Heads in Model Lineages

Recent advances in large language models (LLMs) have produced many specialized multimodal LLMs (MLLMs) that share common foundational LLMs, forming distinct model lineages. It remains unclear whether a fundamental behavioral link exists between the foundational LLMs and downstream variants. We investigate this question by quantifying head-level context-truthfulness scores. Across diverse LLM and MLLM lineages, including Vicuna-, Qwen2.5-, LLaMA2-, and Mistral-based models, we find that Truth Scores are strongly preserved within model families, even after instruction tuning or multimodal adaptation. We further show that this inheritance is consistent with attention-head weight preservation, and that context-truthful heads attend to query-relevant evidence. Building on this finding, we propose TruthProbe, a soft-gating strategy that amplifies context-truthful heads while preserving other head contributions. TruthProbe improves contextual truthfulness on HaluEval and reduces multimodal hallucination on POPE and CHAIR, with base-LLM Truth Scores transferring effectively to their fine-tuned LLM and MLLM descendants. Code is available at https://github.com/miso-choi/TruthProbe.

13.
arXiv (CS.CL) 2026-06-11

Beyond representational alignment with brain-guided language models for robust reasoning

The correspondence between large language models (LLMs) and the neural mechanisms underlying human higher-order cognition remains insufficiently characterized. Given that language and reasoning in the human brain appear dissociable, an open question is whether LLMs align with neural signals from reasoning-related regions and whether such signals can improve them. Here, focusing on deductive reasoning, we show that LLM internal representations are not only partially aligned with task-fMRI activity but can also be directly enhanced by these signals. Using a neural-predictivity metric, we find that LLMs explain a substantial fraction of the explainable variance in reasoning-related regions at the aggregate level, whereas predictivity within specific reasoning types is lower, indicating both alignment and divergence. Building on this, we propose a brain-guided framework: we steer model representations along directions induced by the joint structure of model and brain representations, applying intervention at inference and fine-tuning during training. We demonstrate that task-evoked brain signals can directly enhance LLM reasoning, yielding gains orthogonal to language-only supervision across 10 LLMs (1.5B-72B), with transfer across reasoning types and up to 13\% absolute accuracy gain. Our results advance LLM-brain correspondences from correlation to guidance, establishing a brain-signal-driven pathway toward more robust and cognitively aligned AI.

14.
arXiv (math.PR) 2026-06-12

Conditional means, vector pricings, amenability and fixed points in cones

arXiv:2512.13829v4 Announce Type: replace Abstract: We develop a generalization of conditional probability for arbitrary ordered vector spaces. A related problem is that of assigning a numerical value to one vector relative to another. We characterize the groups for which these generalized probabilities can be stationary, respectively invariant. Our results deviate from the setting of classical probability and lead to a new criterion for amenability and for fixed points in cones.

15.
PLOS Computational Biology 2026-06-09

Evolution of phenocopying in a dynamical model of developmental trajectories

by Yuuki Matsushita, Archishman Raju Developmental trajectories are known to be canalized, or robust to both environmental and genetic perturbations. However, even when these trajectories are decanalized by an environmental perturbation outside the range of conditions to which they are robust, they often produce phenotypes similar to known mutants, called phenocopies. This correspondence between the effects of environmental and genetic perturbations has received little theoretical attention. Here, we study an abstract regulatory model that is evolved to follow a specific trajectory. We then study the effects of small and large perturbations to the trajectory, both by changing parameters and by perturbing the state at specific times. We find that the phenomenon of phenocopying emerges in evolved trajectories and is not present in a null model of randomly sampled trajectories. Our results suggest that, in this class of dynamic models, evolution can allow high-dimensional phenotypic landscapes to simultaneously exhibit robustness and phenocopying.

16.
arXiv (CS.AI) 2026-06-16

FORTIS: Benchmarking Over-Privilege in Agent Skills

arXiv:2605.09163v3 Announce Type: replace Abstract: Large language model agents increasingly operate through an intermediate skill layer that mediates between user intent and concrete task execution. This layer is widely treated as an organizational abstraction, but we argue it is also a privilege boundary that current models routinely exceed. We present FORTIS, a benchmark that evaluates over-privilege in agent skills across two stages: whether a model selects the minimally sufficient skill from a large overlapping library, and whether it executes that skill without expanding into broader tools or actions than the skill permits. Across ten frontier models and three domains, we find that over-privileged behavior is the norm rather than the exception. Models consistently reach for higher-privilege skills and tools than the task requires, failing at both stages at rates that remain high even for the strongest available models. Failure is especially severe under the ordinary conditions of real user interaction: incomplete specification, convenience framing, and proximity to skill boundaries. None of these requires adversarial construction. The results indicate that the skill layer, far from containing agent behavior, is itself a primary source of privilege escalation in current systems.

17.
arXiv (math.PR) 2026-06-18

A simple approach to the L{\o}kka-Zervos dichotomy for absolutely continuous dividend strategies

arXiv:2604.13302v3 Announce Type: replace-cross Abstract: We revisit the optimization problem solved in L{\o}kka & Zervos (2008), i.e., the maximization of dividends, in a Brownian risk model, with the possibility (not the obligation) of making capital injections. Following the approach introduced in Alvarez & Shepp (1998), Renaud & Simard (2021), Renaud et al. (2023), we consider instead absolutely continuous (AC) dividend strategies with an affine bound on the payment rates, while singular capital injections are still allowed. In addition, we incorporate a parameter for the cost of ruin or, said differently, a penalty at ruin in the performance function. We show that the solution is a so-called L{\o}kka-Zervos dichotomy: the surplus is never ruined by making bail-out payments, or no capital is injected and bankruptcy can occur; in either case, dividends are paid at full rate when the surplus is above a threshold. Our framework allows us to provide explicit conditions to express the dichotomy, either using the cost of capital injections or the cost of ruin as a criterion, which also exposes the underlying structure of the solution. In particular, for some values of the parameters, we show that it is optimal to liquidate. Moreover, we perform a numerical analysis highlighting the range of values generated under this AC affine-bound structure.

18.
arXiv (CS.LG) 2026-06-16

Amortized mean-shift interacting particles

arXiv:2606.15871v1 Announce Type: cross Abstract: Bayesian inference for inverse problems is run to evaluate integrals – posterior expectations, tail probabilities, and risks – across a stream of observations. The standard estimate averages the integrand over posterior samples, a Monte-Carlo average whose error decays only as the square root of the sample size, so accuracy demands many samples – prohibitive when each one calls a partial-differential-equation forward model. Mean-shift interacting particles need far fewer: they return a small set of signed-weight nodes – a deterministic quadrature whose weighted averages estimate those integrals. Finding the nodes, however, is a per-observation optimization that, in its most accurate form, reads the posterior score at every step – returning the cost it meant to save. We introduce amortized mean-shift interacting particles, a learned map that emits the weighted nodes from an observation and a few posterior samples in a single forward pass. Training asks only for joint parameter-observation samples and a posterior to draw from – a conditional normalizing flow, an empirical conditional, or any reference the user can sample – and the map learns to integrate that posterior from samples alone, evaluating neither its density nor its score. Once trained, it generalizes to unseen observations and integrands at any node budget and improves on independent samples in two ways: by reweighting them, provably no worse than the equal weights of Monte-Carlo; and by moving them, which empirically lowers it further. Across closed-form, sampled, learned, and physics-based posteriors – up to a thousand-coefficient groundwater field – it integrates more accurately than the same number of samples at every budget, and a posterior-whitened, dimension-aware kernel removes the high-dimensional wall. The result is a Pareto improvement on Monte-Carlo integration, not a competitor to drawing more samples.

19.
arXiv (CS.CL) 2026-06-11

FlowBank: Query-Adaptive Agentic Workflows Optimization through Precompute-and-Reuse

Large Language Model (LLM)-based multi-agent systems are increasingly powerful, but current agentic workflow optimization paradigms make an unsatisfying trade-off. Task-level methods spend substantial offline compute yet deploy only a single workflow, leaving complementary candidates unused, while query-level methods synthesize a new workflow per query at substantial inference cost. Our motivating analysis shows these paradigms are more complementary than competing: workflows discovered during offline search often solve different subsets of queries, and many queries handled by expensive query-level generation can already be solved by cheaper precomputed workflows. This suggests a different objective: rather than searching for one universally best workflow or regenerating one per instance, we should build a compact bank of reusable, complementary workflows and select among them adaptively at inference time. Doing so requires solving three coupled problems: generating complementary rather than redundant candidates, compressing them into a small deployable portfolio, and assigning each query to the right workflow under a performance-cost trade-off. To this end, we present FlowBank, a three-stage framework for portfolio-based agentic workflow optimization. Diversifying proposes DiverseFlow to steer search toward under-covered queries and produce a high-coverage candidate pool. Curating proposes CuraFlow to compress this pool into a compact portfolio with minimal redundancy. Matching casts deployment as edge-value prediction on a query-workflow bipartite graph and routes each incoming query to the portfolio member with the best predicted utility. Across five benchmarks, FlowBank achieves the highest average score among the evaluated methods while remaining cost-competitive, improving over the strongest automated and handcrafted baselines by 4.26% and 14.92% relative, respectively.

20.
arXiv (quant-ph) 2026-06-16

Interaction and non-Hermiticity controlled transmission in extended Su-Schrieffer-Heeger models

arXiv:2606.15245v1 Announce Type: cross Abstract: We study the transport characteristics of an extended version of the Su-Schrieffer-Heeger (SSH) model with next-nearest-neighbor (NNN) interactions and non-Hermitian onsite energies. We observed that transport in such a system is significantly modified by the NNN interaction and the non-Hermitian terms. The transmission coefficient exhibits oscillatory behavior as the strength of the NNN interaction varies in a fixed-length chain. Moreover, the transmission coefficient also shows oscillation with system size for a fixed strength of the NNN interaction. We find that novel oscillatory behavior of the transmission coefficient, arising form the NNN interaction, is a unique feature of such a model and has not been reported previously. The presence of the non-Hermitian terms also enhances/reduces the transmission coefficient depending on the values of the other system parameters like intra-, inter- and NNN hopping. It appears from our study that both the NNN interaction and the non-Hermiticity introduce significant changes in the transport properties of the extended SSH chain, which are not observed in the standard Hermitian nearest-neighbour variant of the SSH model.

21.
arXiv (CS.LG) 2026-06-17

HeteRo-Select: Informativeness as the Participation Driver in Heterogeneous Federated Learning

arXiv:2508.06692v2 Announce Type: replace Abstract: Federated learning systems typically allocate gradient compression by link speed. This is sensible when bandwidth and data informativeness align. However, under non-IID data, these signals often decorrelate or invert. A bandwidth-driven allocator then risks compressing the most informative gradients hardest. We propose HeteRo-Select, a framework that replaces bandwidth with a per-client informativeness score as the primary driver of compression. The score jointly governs three decisions per round: client selection, compression ratio, and server aggregation weight, with bandwidth retained only as a hard ceiling. Score-proportional selection provably reduces the effective heterogeneity of the chosen subset; score-proportional compression provably lowers aggregate top-$k$ error at fixed traffic. Under the exact FedCG simulation protocol, HeteRo-Select delivers a $1.78\times$ speedup and an $18.2\%$ reduction in traffic on CIFAR-10. The same configuration, unchanged, scales from a $7{,}850$-parameter logistic regression to an $11.27$M-parameter ResNet-18, hitting the accuracy target on three of four benchmarks. When bandwidth and informativeness are deliberately anti-correlated, the method still achieves the target accuracy with less traffic than the normal-bandwidth run.

22.
bioRxiv (Bioinfo) 2026-06-22

CellTosg2Sequence: A Unified Text-Omics-Signaling-Graph Large Language Model for Single-Cell Analysis

bioRxivLaTeXUnicodeabstract — In single-cell (sc)-based scientific discovery, text-formatted biomedical prior knowledge and signaling graphs are essential for annotating and interpreting numeric sc-omics data and for generating novel testable hypotheses. A major limitation of existing single-cell large language models (scLLMs) is that they rely on numeric expression data with gene names as the only textual signal, while comprehensive biomedical priors – cellular localization, gene function, disease associations, and signaling interaction patterns – remain absent from the model input. We introduce CellTosg2Sequence, a textual-prior- and signaling-graph-augmented cell-omics-sentence language model. A lightweight heterogeneous graph encoder maps a curated 62,507-node biomedical knowledge graph (KG) into compact virtual tokens that are prepended to each cell sentence, allowing the language model to condition on biological structure with minimal sequence-length overhead. We train CellTosg2Sequence with a three-stage objective: Stage I anchors the KG channel under autoregressive language-model pretraining, leveraging Qwen2.5-32B's own language reasoning for rapid KG alignment; Stage II aligns labels via supervised fine-tuning with KG-anchored InfoNCE; Stage III applies Group Relative Policy Optimization (GRPO) with an ontology-hierarchy reward, enabling free-generation cell-type prediction that generalizes beyond the closed training vocabulary. Across multiple benchmarks and ablation experiments, CellTosg2Sequence outperforms strong baselines. All results are achieved with lightweight LoRA training and a single unified checkpoint.

23.
arXiv (CS.LG) 2026-06-19

Federated Bilevel Performative Prediction

arXiv:2606.19734v1 Announce Type: new Abstract: Federated bilevel optimization is widely used for nested learning problems across distributed clients, such as federated hyperparameter tuning and meta-learning under privacy and communication constraints. Most existing formulations assume fixed client data distributions, which can be violated by performativity, where deployed decisions reshape client behavior and data collection, inducing client-specific, decision-dependent distribution shift. We study federated bilevel performative prediction, where both upper-level (UL) and lower-level (LL) objectives are evaluated under client-dependent, decision-dependent distributions. We formalize the federated bilevel performatively stable (FBPS) point under a decoupled-risk perspective and provide sufficient conditions for its existence and uniqueness. We then develop two federated methods to compute the FBPS solution: FBi-RRM, which converges linearly under a contraction condition, and FBi-SGD, a communication-efficient stochastic method based on federated hypergradient estimation with convergence guarantees under diminishing step sizes when sensitivities are sufficiently small. Experiments on strategic regression and meta strategic classification validate the predicted stability thresholds and demonstrate improved meta-generalization over non-performative baselines, and CNN-based classification further demonstrates the practical effectiveness of the proposed methods in nonconvex neural network settings.

24.
arXiv (CS.AI) 2026-06-16

SMEPilot: Characterizing and Optimizing LLM Inference with Scalable Matrix Extensions

arXiv:2606.16332v1 Announce Type: cross Abstract: Modern CPUs increasingly integrate matrix extensions, such as Arm Scalable Matrix Extension (SME), that provide high-throughput matrix execution within the CPU. For LLM inference, however, these units are not a universal replacement for conventional CPU cores: prefill, decode, attention, and KV-cache operations expose different arithmetic intensities, vector behavior, and layout requirements, while SME units and CPU cores still compete for shared memory bandwidth. This paper studies this mismatch through a roofline-based characterization of SME-enabled CPUs and uses the resulting model to guide operator-level execution choices. We present SMEPilot, an LLM inference engine that selects CPU-only, SME-only, or cooperative SME+CPU execution for each operator shape. SMEPilot partitions matrix work across SME and CPU cores at tile granularity, overlaps SME-suitable matrix stages with CPU-suitable vector stages in attention, and maintains layout state so packed tensor representations are reused rather than repeatedly rebuilt on critical paths. Across Llama-3.2-3B, Qwen3-4B, and Qwen3-30BA3B on phone, PC, and server platforms, SMEPilot improves end-to-end inference performance by up to 3.94$\times$.

25.
arXiv (quant-ph) 2026-06-19

Quantum Algebraic Diversity: Single-Copy Density Matrix Estimation via Group-Structured Measurements

arXiv:2604.03725v3 Announce Type: replace Abstract: We extend the algebraic diversity (AD) framework from classical signal processing to quantum measurement theory. The Quantum Algebraic Diversity (QAD) Theorem establishes that a group-structured positive operator-valued measure (POVM) applied to a single copy of a quantum state produces a full-rank, group-averaged density matrix estimator whose eigenbasis and eigenvalue ordering track those of the true density matrix, with a bias toward the symmetrized state, analogous to the classical recovery of covariance eigenstructure from a single observation. We establish a Classical-Quantum Duality Map connecting classical covariance estimation to quantum state tomography, and an Optimality Inheritance Theorem showing that classical group optimality transfers to quantum settings via the Born map within the group-averaged family. SIC-POVMs are identified as AD with the Heisenberg-Weyl group and mutually unbiased bases as AD with the Clifford group, revealing the hierarchy $\mathrm{HW}(d) \subseteq \mathcal{C}(d) \subseteq S_d$ that mirrors the classical $\mathbb{Z}_M \subseteq G_{\min} \subseteq S_M$. The double-commutator eigenvalue theorem gives polynomial-time adaptive POVM selection. A worked qubit example shows the group-averaged estimator from a single computational-basis measurement, averaged over a matched $\mathbb{Z}_2$ group, reaching fidelity 0.99 where standard single-basis tomography gives a rank-1 estimate of fidelity 0.80. Monte Carlo simulations for $d = 2$ to $13$ confirm fidelity above 0.90 from a single outcome while standard fidelity degrades as $\sim 1/d$. The growing ratio reflects collapse of the rank-1 standard estimator, not fewer copies per parameter: the biased single-copy estimator reduces the number of distinct measurement settings, not the per-parameter sampling cost, and a genuine copy reduction holds only under exact symmetry.